Definition of the self-preservation mechanism
When a Eureka Server node loses too many clients within a short period (possibly because of a network partition), it enters self-preservation mode. Once in this mode, the Eureka Server protects the information in its service registry and stops deleting entries from it (that is, it does not deregister any microservice). When the network fault recovers, the node automatically leaves self-preservation mode.
Why the self-preservation mechanism matters
Self-preservation mode is a safety measure against network anomalies. Its design philosophy is that it is better to keep every microservice, healthy and unhealthy alike, than to blindly deregister a healthy one. Using self-preservation mode makes a Eureka cluster more robust and stable.
Implementation of the self-preservation mechanism
- Related fields
/**
* Count of heartbeats sent by registered services per minute
*/
private final MeasuredRate renewsLastMin;
/**
* Minimum expected number of renewals (lease renewals, i.e. heartbeats) per minute
*/
protected volatile int numberOfRenewsPerMinThreshold;
/**
* Maximum expected number of renewals (lease renewals, i.e. heartbeats) per minute
*/
protected volatile int expectedNumberOfRenewsPerMin;
- Related configuration
- eureka.server.renewal-percent-threshold = 0.85 (default 0.85)
- eureka.server.enable-self-preservation = true (default true, meaning enabled)
- [x] How the configuration is read
/**
* Threshold setting that triggers the self-preservation mechanism
* eureka.server.renewal-percent-threshold = 0.85 (default 0.85)
* Used to compute the minimum expected number of renewals
* @return
*/
@Override
public double getRenewalPercentThreshold() {
return configInstance
.getDoubleProperty(namespace + "renewalPercentThreshold", 0.85)
.get();
}
/**
* Self-preservation switch setting
* eureka.server.enable-self-preservation = true (default true, meaning enabled)
* Used to decide whether self-preservation mode is enabled
* @return
*/
@Override
public boolean shouldEnableSelfPreservation() {
return configInstance
.getBooleanProperty(namespace + "enableSelfPreservation", true)
.get();
}
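- Trigger conditions
- eureka.server.enable-self-preservation = true
- numberOfRenewsPerMinThreshold > 0
- renewsLastMin <= numberOfRenewsPerMinThreshold
The conditions are combined with AND (1 && 2 && 3): when the self-preservation switch is on (eureka.server.enable-self-preservation = true), the minimum expected number of renewals per minute is greater than 0, and the number of heartbeats in the last minute (renewsLastMin) does not exceed numberOfRenewsPerMinThreshold, self-preservation is triggered and leases are no longer expired automatically. Strictly, per the code below, a threshold of 0 also suspends expiration while the switch is on, because the method returns true only when the threshold is positive and the heartbeat count exceeds it.
The check is implemented in the PeerAwareInstanceRegistryImpl class, as follows: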
/**
* Determines whether lease expiration is enabled. A return value of false means self-preservation mode is in effect and instances are no longer taken offline.
* @return
*/
@Override
public boolean isLeaseExpirationEnabled() {
// Switch check
if (!isSelfPreservationModeEnabled()) {
// The self preservation mode is disabled, hence allowing the instances to expire.
return true;
}
// Heartbeats-per-minute check
// Returns true only when the minimum expected number of renewals per minute is greater than 0 and the heartbeat count for the last minute (renewsLastMin) exceeds numberOfRenewsPerMinThreshold
return numberOfRenewsPerMinThreshold > 0 && getNumOfRenewsInLastMin() > numberOfRenewsPerMinThreshold;
}
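The decision above can be isolated into a small standalone sketch. The class name `SelfPreservationCheck` and the parameter names are made up for illustration; only the boolean logic is taken from the method shown above.

```java
// Illustrative sketch only: mirrors the boolean logic of isLeaseExpirationEnabled()
// with plain parameters instead of registry state. Names are hypothetical.
public class SelfPreservationCheck {

    // Returns true when leases may expire, false when self-preservation is in effect.
    static boolean isLeaseExpirationEnabled(boolean selfPreservationEnabled,
                                            int renewsPerMinThreshold,
                                            long renewsLastMin) {
        if (!selfPreservationEnabled) {
            // Switch is off: instances are always allowed to expire.
            return true;
        }
        // Expiration stays enabled only while heartbeats exceed a positive threshold.
        return renewsPerMinThreshold > 0 && renewsLastMin > renewsPerMinThreshold;
    }

    public static void main(String[] args) {
        // 10 instances, threshold 17: all 20 heartbeats arrived, expiration is allowed.
        System.out.println(isLeaseExpirationEnabled(true, 17, 20));  // true
        // Only 12 heartbeats arrived: self-preservation, nothing is evicted.
        System.out.println(isLeaseExpirationEnabled(true, 17, 12));  // false
        // Switch off: expiration is allowed regardless of the heartbeat count.
        System.out.println(isLeaseExpirationEnabled(false, 17, 12)); // true
        // Threshold of 0 with the switch on: expiration is suspended.
        System.out.println(isLeaseExpirationEnabled(true, 0, 0));    // false
    }
}
```

The last case shows a detail that is easy to miss: with the switch on, a threshold of 0 also makes the method return false.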
Computing the self-preservation parameters and their values
- renewsLastMin (number of heartbeats per minute)
renewsLastMin counts the heartbeats sent by all instances per minute. The figure is refreshed once a minute, and a read returns the count for the previous minute. By default each instance sends a heartbeat every 30s, i.e. 2 per minute, so under normal conditions the per-minute figure is: number of instances * 2.
renewsLastMin is incremented by 1 whenever an instance sends a heartbeat, in the AbstractInstanceRegistry.renew method:
// Heartbeat counter +1
renewsLastMin.increment();
// Refresh the lease
leaseToRenew.renew();
==Note:== MeasuredRate is a rate-measuring utility class; see its source for the implementation details.
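To make the "refreshed once a minute, read returns the previous minute" behaviour concrete, here is a minimal two-bucket counter. It is a sketch of one way such a counter could work, not the MeasuredRate source: the class name `MinuteRateSketch` and the manual `rollOver()` call are assumptions made for the example (a real implementation would presumably trigger the roll-over from a timer).

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical two-bucket rate counter; not the actual MeasuredRate class.
public class MinuteRateSketch {
    // Heartbeats counted in the interval that is still running.
    private final AtomicLong currentBucket = new AtomicLong();
    // Heartbeats counted in the interval that just finished.
    private final AtomicLong lastBucket = new AtomicLong();

    // Called once per heartbeat.
    public void increment() {
        currentBucket.incrementAndGet();
    }

    // Called at the end of each interval: publish the finished count and start over.
    // Invoked manually here so the example is deterministic.
    public void rollOver() {
        lastBucket.set(currentBucket.getAndSet(0));
    }

    // Returns the count of the previous, completed interval.
    public long getCount() {
        return lastBucket.get();
    }

    public static void main(String[] args) {
        MinuteRateSketch rate = new MinuteRateSketch();
        rate.increment();
        rate.increment();
        System.out.println(rate.getCount()); // 0: the first interval has not finished yet
        rate.rollOver();
        System.out.println(rate.getCount()); // 2: count of the finished interval
    }
}
```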
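- expectedNumberOfRenewsPerMin (maximum expected renewals per minute) and numberOfRenewsPerMinThreshold (minimum expected renewals per minute)
Because expectedNumberOfRenewsPerMin and numberOfRenewsPerMinThreshold are always computed and updated together, they are analyzed together here.
They are computed as follows:
- expectedNumberOfRenewsPerMin = number of currently registered application instances * 2
- numberOfRenewsPerMinThreshold = expectedNumberOfRenewsPerMin * renewal percentage (eureka.server.renewal-percent-threshold = 0.85)
- Why is the maximum expected number of renewals per minute the instance count * 2? As noted in the renewsLastMin part, under normal conditions each instance sends 2 heartbeats per minute (one every 30s), so the per-minute maximum is the instance count * 2. The expected maximum is simply the heartbeat count per minute when every service is healthy.
- The minimum expected number of renewals per minute is the expected maximum * the renewal percentage. When the heartbeat count per minute is not above this minimum (renewsLastMin <= numberOfRenewsPerMinThreshold), self-preservation kicks in and services are no longer taken offline.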
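As a worked example of the two formulas, the sketch below computes both values for 10 registered instances with the default percentage of 0.85. The class and method names are invented for illustration; the arithmetic, including the truncating cast to int, follows the source snippets in this article.

```java
// Worked example of the threshold arithmetic; names are hypothetical.
public class RenewThresholdExample {

    // Each instance sends one heartbeat every 30s, so 2 per minute.
    static int expectedRenewsPerMin(int instanceCount) {
        return instanceCount * 2;
    }

    // The cast truncates toward zero, as in the Eureka snippets shown in this article.
    static int renewsPerMinThreshold(int instanceCount, double renewalPercentThreshold) {
        return (int) (expectedRenewsPerMin(instanceCount) * renewalPercentThreshold);
    }

    public static void main(String[] args) {
        int instances = 10;
        System.out.println(expectedRenewsPerMin(instances));        // 20
        System.out.println(renewsPerMinThreshold(instances, 0.85)); // 17
    }
}
```

With 10 instances, self-preservation therefore takes effect once 17 or fewer heartbeats arrive in a minute, which means at least two instances have stopped renewing.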
When expectedNumberOfRenewsPerMin and numberOfRenewsPerMinThreshold are computed and updated
- At server initialization (PeerAwareInstanceRegistryImpl.openForTraffic)
public void openForTraffic(ApplicationInfoManager applicationInfoManager, int count) {
// Renewals happen every 30 seconds and for a minute it should be a factor of 2.
this.expectedNumberOfRenewsPerMin = count * 2;
this.numberOfRenewsPerMinThreshold =
(int) (this.expectedNumberOfRenewsPerMin * serverConfig.getRenewalPercentThreshold());
logger.info("Got " + count + " instances from neighboring DS node");
logger.info("Renew threshold is: " + numberOfRenewsPerMinThreshold);
this.startupTime = System.currentTimeMillis();
if (count > 0) {
this.peerInstancesTransferEmptyOnStartup = false;
}
DataCenterInfo.Name selfName = applicationInfoManager.getInfo().getDataCenterInfo().getName();
boolean isAws = Name.Amazon == selfName;
if (isAws && serverConfig.shouldPrimeAwsReplicaConnections()) {
logger.info("Priming AWS connections for all replicas..");
primeAwsReplicas(applicationInfoManager);
}
logger.info("Changing status to UP");
applicationInfoManager.setInstanceStatus(InstanceStatus.UP);
super.postInit();
}
- On instance registration (AbstractInstanceRegistry.register)
// The instance does not yet exist in the registry
// The lease does not exist and hence it is a new registration
synchronized (lock) {
if (this.expectedNumberOfRenewsPerMin > 0) {
// Since a new client is registering, increase the expected renewals
// (1 for 30 seconds, 2 for a minute)
this.expectedNumberOfRenewsPerMin = this.expectedNumberOfRenewsPerMin + 2;
this.numberOfRenewsPerMinThreshold =
(int) (this.expectedNumberOfRenewsPerMin * serverConfig.getRenewalPercentThreshold());
}
}
- Periodic reset (PeerAwareInstanceRegistryImpl.updateRenewalThreshold)
private void scheduleRenewalThresholdUpdateTask() {
timer.schedule(new TimerTask() {
@Override
public void run() {
updateRenewalThreshold();
}
}, serverConfig.getRenewalThresholdUpdateIntervalMs(),
serverConfig.getRenewalThresholdUpdateIntervalMs());
}
- Instance deregistration (PeerAwareInstanceRegistryImpl.cancel)
public boolean cancel(final String appName, final String id,
final boolean isReplication) {
if (super.cancel(appName, id, isReplication)) {
replicateToPeers(Action.Cancel, appName, id, null, null, isReplication);
synchronized (lock) {
if (this.expectedNumberOfRenewsPerMin > 0) {
// Since the client wants to cancel it, reduce the threshold (1 for 30 seconds, 2 for a minute)
this.expectedNumberOfRenewsPerMin = this.expectedNumberOfRenewsPerMin - 2;
this.numberOfRenewsPerMinThreshold =
(int) (this.expectedNumberOfRenewsPerMin * serverConfig.getRenewalPercentThreshold());
}
}
return true;
}
return false;
}
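The initialization, registration and deregistration paths all apply the same bookkeeping, which can be condensed into one small class. This is a simplified sketch under stated assumptions: `ThresholdBookkeepingSketch`, `onRegister` and `onCancel` are hypothetical names, replication and the periodic reset are left out, and only the +2 / -2 adjustment and the threshold recomputation are taken from the snippets above.

```java
// Simplified sketch of the threshold bookkeeping; not the Eureka classes themselves.
public class ThresholdBookkeepingSketch {
    private final Object lock = new Object();
    private final double renewalPercentThreshold;
    private int expectedNumberOfRenewsPerMin;
    private int numberOfRenewsPerMinThreshold;

    // Mirrors openForTraffic: start from the instance count known at startup.
    public ThresholdBookkeepingSketch(int initialInstanceCount, double renewalPercentThreshold) {
        this.renewalPercentThreshold = renewalPercentThreshold;
        this.expectedNumberOfRenewsPerMin = initialInstanceCount * 2;
        recomputeThreshold();
    }

    // Mirrors the register path: one more instance means 2 more heartbeats per minute.
    public void onRegister() {
        adjust(+2);
    }

    // Mirrors the cancel path: one fewer instance means 2 fewer heartbeats per minute.
    public void onCancel() {
        adjust(-2);
    }

    private void adjust(int delta) {
        synchronized (lock) {
            // Same guard as in the source: nothing changes while the expected value is 0.
            if (expectedNumberOfRenewsPerMin > 0) {
                expectedNumberOfRenewsPerMin += delta;
                recomputeThreshold();
            }
        }
    }

    private void recomputeThreshold() {
        numberOfRenewsPerMinThreshold = (int) (expectedNumberOfRenewsPerMin * renewalPercentThreshold);
    }

    public int getExpected() { synchronized (lock) { return expectedNumberOfRenewsPerMin; } }

    public int getThreshold() { synchronized (lock) { return numberOfRenewsPerMinThreshold; } }

    public static void main(String[] args) {
        ThresholdBookkeepingSketch s = new ThresholdBookkeepingSketch(10, 0.85);
        System.out.println(s.getExpected() + " / " + s.getThreshold()); // 20 / 17
        s.onRegister();
        System.out.println(s.getExpected() + " / " + s.getThreshold()); // 22 / 18
        s.onCancel();
        s.onCancel();
        System.out.println(s.getExpected() + " / " + s.getThreshold()); // 18 / 15
    }
}
```

One consequence of the `> 0` guard is visible in this sketch: if the starting count is 0, `onRegister()` leaves both values at 0, so some other path has to set them.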