Discovering the problem
After running for a while, one of the nodes of a service in our production environment would stop responding. The memory monitoring showed that the heap of the corresponding JVM was exhausted. Fortunately the service runs on multiple nodes, so I took a heap dump of the JVM on the live node and downloaded it for local analysis.
I opened the dump with MAT (the details of using MAT are omitted here). The analysis showed a large number of com.netflix.servo.monitor.BasicTimer instances that were never released, all held by org.springframework.cloud.netflix.metrics.servo.ServoMonitorCache.
Analyzing the problem
Searching the project for the ServoMonitorCache class shows that it lives in the spring-cloud-netflix-core package. I opened that jar and checked its spring.factories to see where the class is auto-configured, which led to org.springframework.cloud.netflix.metrics.servo.ServoMetricsAutoConfiguration. Searching for usages of the class then turned up org.springframework.cloud.netflix.metrics.MetricsInterceptorConfiguration, where the ServoMonitorCache object is used. The word "metrics" makes the purpose clear: these are monitoring objects for the service. The code is as follows:
@Configuration
@ConditionalOnProperty(value = "spring.cloud.netflix.metrics.enabled", havingValue = "true", matchIfMissing = true)
@ConditionalOnClass({ Monitors.class, MetricReader.class })
public class MetricsInterceptorConfiguration {
@Configuration
@ConditionalOnWebApplication
@ConditionalOnClass(WebMvcConfigurerAdapter.class)
static class MetricsWebResourceConfiguration extends WebMvcConfigurerAdapter {
@Bean
MetricsHandlerInterceptor servoMonitoringWebResourceInterceptor() {
return new MetricsHandlerInterceptor();
}
@Override
public void addInterceptors(InterceptorRegistry registry) {
registry.addInterceptor(servoMonitoringWebResourceInterceptor());
}
}
@Configuration
@ConditionalOnClass({ RestTemplate.class, JoinPoint.class })
@ConditionalOnProperty(value = "spring.aop.enabled", havingValue = "true", matchIfMissing = true)
static class MetricsRestTemplateAspectConfiguration {
@Bean
RestTemplateUrlTemplateCapturingAspect restTemplateUrlTemplateCapturingAspect() {
return new RestTemplateUrlTemplateCapturingAspect();
}
}
@Configuration
@ConditionalOnClass({ RestTemplate.class, HttpServletRequest.class }) // HttpServletRequest implicitly required by MetricsTagProvider
static class MetricsRestTemplateConfiguration {
@Value("${netflix.metrics.restClient.metricName:restclient}")
String metricName;
/*
* Key code here
* Marker 1
*/
@Bean
MetricsClientHttpRequestInterceptor spectatorLoggingClientHttpRequestInterceptor(
Collection<MetricsTagProvider> tagProviders,
ServoMonitorCache servoMonitorCache) {
return new MetricsClientHttpRequestInterceptor(tagProviders,
servoMonitorCache, this.metricName);
}
@Bean
BeanPostProcessor spectatorRestTemplateInterceptorPostProcessor() {
return new MetricsInterceptorPostProcessor();
}
// Marker 2
private static class MetricsInterceptorPostProcessor
implements BeanPostProcessor, ApplicationContextAware {
private ApplicationContext context;
private MetricsClientHttpRequestInterceptor interceptor;
@Override
public Object postProcessBeforeInitialization(Object bean, String beanName) {
return bean;
}
@Override
public Object postProcessAfterInitialization(Object bean, String beanName) {
if (bean instanceof RestTemplate) {
if (this.interceptor == null) {
this.interceptor = this.context
.getBean(MetricsClientHttpRequestInterceptor.class);
}
RestTemplate restTemplate = (RestTemplate) bean;
// create a new list as the old one may be unmodifiable (ie Arrays.asList())
ArrayList<ClientHttpRequestInterceptor> interceptors = new ArrayList<>();
interceptors.add(interceptor);
interceptors.addAll(restTemplate.getInterceptors());
restTemplate.setInterceptors(interceptors);
}
return bean;
}
@Override
public void setApplicationContext(ApplicationContext context)
throws BeansException {
this.context = context;
}
}
}
}
At marker 1 in the code above, the auto-configuration creates the MetricsClientHttpRequestInterceptor and passes ServoMonitorCache into it through constructor injection. At marker 2, postProcessAfterInitialization adds that interceptor to every RestTemplate bean. RestTemplate is a familiar object: it is Spring's REST client, and since our microservices expose RESTful interfaces, we use it as the client.
Next, look at MetricsClientHttpRequestInterceptor. Its core code is as follows:
@Override
public ClientHttpResponse intercept(HttpRequest request, byte[] body,
ClientHttpRequestExecution execution) throws IOException {
long startTime = System.nanoTime();
ClientHttpResponse response = null;
try {
response = execution.execute(request, body);
return response;
}
finally {
SmallTagMap.Builder builder = SmallTagMap.builder();
// Marker 3
for (MetricsTagProvider tagProvider : tagProviders) {
for (Map.Entry<String, String> tag : tagProvider
.clientHttpRequestTags(request, response).entrySet()) {
builder.add(Tags.newTag(tag.getKey(), tag.getValue()));
}
}
// Marker 4
MonitorConfig.Builder monitorConfigBuilder = MonitorConfig
.builder(metricName);
monitorConfigBuilder.withTags(builder);
servoMonitorCache.getTimer(monitorConfigBuilder.build())
.record(System.nanoTime() - startTime, TimeUnit.NANOSECONDS);
}
}
At marker 3 the code uses an object called tagProviders. Looking back, it is also passed in when the interceptor is constructed. Because it is constructor-injected, it must also be created by the Spring container, so I kept searching the auto-configuration classes and found it configured in org.springframework.cloud.netflix.metrics.servo.ServoMetricsAutoConfiguration:
@Configuration
@ConditionalOnClass(name = "javax.servlet.http.HttpServletRequest")
protected static class MetricsTagConfiguration {
@Bean
public MetricsTagProvider defaultMetricsTagProvider() {
return new DefaultMetricsTagProvider();
}
}
The core code of DefaultMetricsTagProvider is as follows:
public Map<String, String> clientHttpRequestTags(HttpRequest request,
ClientHttpResponse response) {
String urlTemplate = RestTemplateUrlTemplateHolder.getRestTemplateUrlTemplate();
if (urlTemplate == null) {
urlTemplate = "none";
}
String status;
try {
status = (response == null) ? "CLIENT_ERROR" : ((Integer) response
.getRawStatusCode()).toString();
}
catch (IOException e) {
status = "IO_ERROR";
}
String host = request.getURI().getHost();
if( host == null ) {
host = "none";
}
String strippedUrlTemplate = urlTemplate.replaceAll("^https?://[^/]+/", "");
Map<String, String> tags = new HashMap<>();
tags.put("method", request.getMethod().name());
tags.put("uri", sanitizeUrlTemplate(strippedUrlTemplate));
tags.put("status", status);
tags.put("clientName", host);
return Collections.unmodifiableMap(tags);
}
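It simply breaks the client HTTP request into tags. The important ones are method (the HTTP method, such as GET, POST, or DELETE), status (the response status), clientName (the host name of the service being called), and uri (the request path, including the query parameters).
Going back to marker 4, the code builds a com.netflix.servo.monitor.MonitorConfig object, which consists mainly of a name and tags. The name defaults to restclient (it can be changed in the properties file); the tags are the ones produced by DefaultMetricsTagProvider.
Next, look at the ServoMonitorCache.getTimer function: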
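To make the uri tag concrete, here is a minimal sketch that reuses only the host lookup and the regular expression from clientHttpRequestTags. The class name TagSketch, the method tagsFor, and the sample URL are hypothetical. The sketch assumes the caller passed a fully built URL string to RestTemplate, so the captured URL template is the whole URL, and it leaves out sanitizeUrlTemplate.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration, not part of spring-cloud-netflix.
final class TagSketch {

    // Builds the four tags the same way clientHttpRequestTags does,
    // minus the sanitizeUrlTemplate step.
    static Map<String, String> tagsFor(String method, String url, int status) {
        String host = URI.create(url).getHost();
        if (host == null) {
            host = "none";
        }
        Map<String, String> tags = new LinkedHashMap<>();
        tags.put("method", method);
        // Only the scheme and host are stripped; the query string stays in the tag.
        tags.put("uri", url.replaceAll("^https?://[^/]+/", ""));
        tags.put("status", Integer.toString(status));
        tags.put("clientName", host);
        return tags;
    }

    public static void main(String[] args) {
        // Assumed sample URL with a signature and a nonce in the query string.
        System.out.println(
                tagsFor("GET", "http://order-service/api/orders?sign=abc&nonce=42", 200));
        // prints {method=GET, uri=api/orders?sign=abc&nonce=42, status=200, clientName=order-service}
    }
}
```

The point to notice is that anything placed in the query string ends up inside the uri tag value.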
public synchronized BasicTimer getTimer(MonitorConfig config) {
BasicTimer t = this.timerCache.get(config);
if (t != null)
return t;
t = new BasicTimer(config);
this.timerCache.put(config, t);
if (this.timerCache.size() > this.config.getCacheWarningThreshold()) {
log.warn("timerCache is above the warning threshold of " + this.config.getCacheWarningThreshold() + " with size " + this.timerCache.size() + ".");
}
this.monitorRegistry.register(t);
return t;
}
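This part is simple. It first looks up the MonitorConfig in the cache. If there is no entry, it creates a new BasicTimer, caches it, and registers it; if there is one, it returns the existing BasicTimer, which the caller then updates with the new measurement. As an aside, BasicTimer stores statistics such as the maximum, minimum, and average response time of each interface.
At this point the cause is clear. Calls between our internal services are protected by signature verification: every call appends a signature parameter to the URL, the secret key is configured only on our own service nodes, and a request is accepted only if its signature verifies. This raises the security level of the interfaces a little. To strengthen the signature, a random number is mixed into the plaintext, which means the URL of every call is different. The uri parsed by DefaultMetricsTagProvider is therefore different every time, so the MonitorConfig object is different every time, and every single call creates a new BasicTimer that the cache never releases. Over time this exhausts the JVM heap.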
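The growth can be reproduced with a small stand-alone sketch. Everything in it is a hypothetical stand-in: TimerKey plays the role of MonitorConfig (a name plus tags, compared by value), TimerCacheSketch plays the role of ServoMonitorCache (a map that is never evicted), and the random number in the signature is simulated with a loop counter so the result is deterministic.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for ServoMonitorCache: entries are added but never removed.
final class TimerCacheSketch {

    // Hypothetical stand-in for MonitorConfig: equal only if name and all tags are equal.
    static final class TimerKey {
        final String name;
        final Map<String, String> tags;

        TimerKey(String name, Map<String, String> tags) {
            this.name = name;
            this.tags = new HashMap<>(tags);
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof TimerKey)) {
                return false;
            }
            TimerKey other = (TimerKey) o;
            return name.equals(other.name) && tags.equals(other.tags);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name, tags);
        }
    }

    // Value is {call count, total nanoseconds}, a crude substitute for BasicTimer.
    private final Map<TimerKey, long[]> timers = new HashMap<>();

    // Records one call and returns the number of cached timers afterwards.
    int record(String method, String uri, long nanos) {
        Map<String, String> tags = new HashMap<>();
        tags.put("method", method);
        tags.put("uri", uri);
        long[] stats = timers.computeIfAbsent(
                new TimerKey("restclient", tags), k -> new long[2]);
        stats[0]++;
        stats[1] += nanos;
        return timers.size();
    }

    // Simulates `calls` requests to one endpoint, with or without a per-call nonce.
    static int sizeAfter(int calls, boolean withNonce) {
        TimerCacheSketch cache = new TimerCacheSketch();
        int size = 0;
        for (int i = 0; i < calls; i++) {
            String uri = withNonce ? "api/orders?nonce=" + i : "api/orders";
            size = cache.record("GET", uri, 1_000L);
        }
        return size;
    }

    public static void main(String[] args) {
        System.out.println("stable uri: " + sizeAfter(1000, false)); // prints stable uri: 1
        System.out.println("nonce uri:  " + sizeAfter(1000, true));  // prints nonce uri:  1000
    }
}
```

With a stable URL the cache holds one timer no matter how many calls are made; with a per-call nonce it holds one timer per call, which is the pattern seen in the heap dump.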
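Solutions
- Change the signature mechanism and put the signature in the POST body
- Remove the interceptor
Because interface monitoring for our services is already handled by another third-party component, we do not need the monitoring in netflix-core, so I chose the second option.
Implementation
Going back to MetricsInterceptorConfiguration, we see the following code: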
@Configuration
@ConditionalOnProperty(value = "spring.cloud.netflix.metrics.enabled", havingValue = "true", matchIfMissing = true)
@ConditionalOnClass({ Monitors.class, MetricReader.class })
public class MetricsInterceptorConfiguration {
Anyone familiar with Spring Boot will see at once that setting the property spring.cloud.netflix.metrics.enabled to false is enough to switch off this auto-configuration class.
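As a config fragment, assuming the service keeps its settings in application.properties (the file name is an assumption; the property name comes from the @ConditionalOnProperty annotation above):

```properties
# Turns off MetricsInterceptorConfiguration, so the metrics interceptor
# is no longer added to RestTemplate beans.
spring.cloud.netflix.metrics.enabled=false
```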
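Finally
This was a fairly well-hidden crash. Spring Boot and Spring Cloud make development far more convenient, and I was the one who pushed hard to move our backend stack to Spring Cloud. But that convenience comes with less transparency, and with it all kinds of problems.
Only by continuing to build technical skill, learning the analysis tools thoroughly, and mastering the right way to read code can we deal with unknown problems.
Suggestions and discussion are welcome.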