Environment
- spring boot: 2.2.4.RELEASE
- spring-data-elasticsearch: 3.2.4.RELEASE
- Elasticsearch: 6.3.0
Problem description
I use spring data elasticsearch to connect to Elasticsearch, with the following configuration:
spring:
  data:
    elasticsearch:
      cluster-name: aliyun-es
      cluster-nodes: 114.xx.xx.xx:9300
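This had been running fine, and I had already confirmed that ports 9300 and 9200 on the Elasticsearch server were working. But today, after adding Actuator to the project to monitor the running system, the following error appeared: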
2020-05-21 19:03:47,183 WARN [http-nio-8080-exec-3] org.springframework.boot.actuate.elasticsearch.ElasticsearchRestHealthIndicator [AbstractHealthIndicator.java:87] Elasticsearch health check failed
java.net.ConnectException: Timeout connecting to [localhost/127.0.0.1:9200]
at org.elasticsearch.client.RestClient$SyncResponseListener.get(RestClient.java:959)
at org.elasticsearch.client.RestClient.performRequest(RestClient.java:233)
at org.springframework.boot.actuate.elasticsearch.ElasticsearchRestHealthIndicator.doHealthCheck(ElasticsearchRestHealthIndicator.java:60)
at org.springframework.boot.actuate.health.AbstractHealthIndicator.health(AbstractHealthIndicator.java:82)
at org.springframework.boot.actuate.health.HealthIndicator.getHealth(HealthIndicator.java:37)
at org.springframework.boot.actuate.health.HealthEndpointWebExtension.getHealth(HealthEndpointWebExtension.java:95)
at org.springframework.boot.actuate.health.HealthEndpointWebExtension.getHealth(HealthEndpointWebExtension.java:43)
at org.springframework.boot.actuate.health.HealthEndpointSupport.getContribution(HealthEndpointSupport.java:108)
at org.springframework.boot.actuate.health.HealthEndpointSupport.getAggregateHealth(HealthEndpointSupport.java:119)
at org.springframework.boot.actuate.health.HealthEndpointSupport.getContribution(HealthEndpointSupport.java:105)
at org.springframework.boot.actuate.health.HealthEndpointSupport.getHealth(HealthEndpointSupport.java:83)
at org.springframework.boot.actuate.health.HealthEndpointSupport.getHealth(HealthEndpointSupport.java:70)
at org.springframework.boot.actuate.health.HealthEndpointWebExtension.health(HealthEndpointWebExtension.java:81)
at org.springframework.boot.actuate.health.HealthEndpointWebExtension.health(HealthEndpointWebExtension.java:70)
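The trace shows the health check timing out against localhost:9200 rather than the configured server. One way to confirm which address actually answers on the REST port is a small stand-alone probe. The sketch below is only an illustration and is not part of Spring Boot: the class name ClusterHealthProbe and the method probe are hypothetical, and it relies solely on the JDK's HttpURLConnection.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class ClusterHealthProbe {

    /**
     * Hypothetical helper: sends GET {baseUri}/_cluster/health and returns the
     * HTTP status code, or -1 if the URI is invalid or the host does not
     * answer within timeoutMillis.
     */
    static int probe(String baseUri, int timeoutMillis) {
        try {
            URL url = new URL(baseUri + "/_cluster/health");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("GET");
            conn.setConnectTimeout(timeoutMillis);
            conn.setReadTimeout(timeoutMillis);
            try {
                return conn.getResponseCode();
            } finally {
                conn.disconnect();
            }
        } catch (IOException e) {
            // Covers malformed URIs, refused connections and timeouts.
            return -1;
        }
    }

    public static void main(String[] args) {
        // Pass the base URI to test, e.g. the default http://localhost:9200.
        String baseUri = args.length > 0 ? args[0] : "http://localhost:9200";
        System.out.println(baseUri + " -> " + probe(baseUri, 3000));
    }
}
```

Run against the real server address (masked as 114.xx.xx.xx in this post), a result of 200 would suggest the REST port is reachable, while -1 would point to a connection problem on that address.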
Solution
Look at the source of ElasticsearchRestHealthIndicator, where the error is reported:
@Override
protected void doHealthCheck(Health.Builder builder) throws Exception {
    Response response = this.client.performRequest(new Request("GET", "/_cluster/health/"));
    StatusLine statusLine = response.getStatusLine();
    if (statusLine.getStatusCode() != HttpStatus.SC_OK) {
        builder.down();
        builder.withDetail("statusCode", statusLine.getStatusCode());
        builder.withDetail("reasonPhrase", statusLine.getReasonPhrase());
        return;
    }
    try (InputStream inputStream = response.getEntity().getContent()) {
        doHealthCheck(builder, StreamUtils.copyToString(inputStream, StandardCharsets.UTF_8));
    }
}
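The first line of the method checks whether Elasticsearch is healthy by sending a GET request to the /_cluster/health path. But why is the target address localhost:9200? My guess was that it comes from a Spring Boot default, so I looked at the Elasticsearch auto-configuration classes in org.springframework.boot.autoconfigure.elasticsearch.
In RestClientProperties: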
@ConfigurationProperties(prefix = "spring.elasticsearch.rest")
public class RestClientProperties {
    /**
     * Comma-separated list of the Elasticsearch instances to use.
     */
    private List<String> uris = new ArrayList<>(Collections.singletonList("http://localhost:9200"));
}
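This uris property looks like the cause of the error: it defaults to http://localhost:9200. The spring.data.elasticsearch.* settings only configure the transport client on port 9300, whereas the REST client used by this health indicator is configured separately under spring.elasticsearch.rest. So add that configuration: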
spring:
  data:
    elasticsearch:
      cluster-name: aliyun-es
      cluster-nodes: 114.xx.xx.xx:9300
  elasticsearch:
    rest:
      uris: ["114.xx.xx.xx:9200"]
      connection-timeout: 10s
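Why the missing spring.elasticsearch.rest.uris entry led to localhost:9200 can be pictured with a small model of a property that carries a default value. This is a simplified sketch, not Spring Boot's actual binding code: the class RestUrisDefaultSketch and the method effectiveUris are hypothetical names, and only the default string is taken from RestClientProperties.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class RestUrisDefaultSketch {

    // Same default value as the uris field in RestClientProperties.
    static final String DEFAULT_URI = "http://localhost:9200";

    /**
     * Hypothetical helper: returns the configured URIs, or the default
     * when nothing was configured.
     */
    static List<String> effectiveUris(List<String> configured) {
        if (configured == null || configured.isEmpty()) {
            return new ArrayList<>(Collections.singletonList(DEFAULT_URI));
        }
        return new ArrayList<>(configured);
    }

    public static void main(String[] args) {
        // Nothing configured: the REST client would target the default.
        System.out.println(effectiveUris(null));
        // Configured as in the yml above (address masked as in this post).
        System.out.println(effectiveUris(Collections.singletonList("114.xx.xx.xx:9200")));
    }
}
```

In this model the transport settings play no part at all, which mirrors the situation here: setting cluster-nodes did not change where the REST health check connected.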
After restarting, it failed again:
2020-05-21 19:28:51,726 WARN [http-nio-8080-exec-8] org.springframework.boot.actuate.elasticsearch.ElasticsearchHealthIndicator [AbstractHealthIndicator.java:87] Elasticsearch health check failed
org.elasticsearch.ElasticsearchTimeoutException: java.util.concurrent.TimeoutException: Timeout waiting for task.
at org.elasticsearch.common.util.concurrent.FutureUtils.get(FutureUtils.java:79)
at org.elasticsearch.action.support.AdapterActionFuture.actionGet(AdapterActionFuture.java:54)
at org.elasticsearch.action.support.AdapterActionFuture.actionGet(AdapterActionFuture.java:44)
at org.springframework.boot.actuate.elasticsearch.ElasticsearchHealthIndicator.doHealthCheck(ElasticsearchHealthIndicator.java:79)
at org.springframework.boot.actuate.health.AbstractHealthIndicator.health(AbstractHealthIndicator.java:82)
at org.springframework.boot.actuate.health.HealthIndicator.getHealth(HealthIndicator.java:37)
at org.springframework.boot.actuate.health.HealthEndpointWebExtension.getHealth(HealthEndpointWebExtension.java:95)
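This time the failing class is ElasticsearchHealthIndicator rather than ElasticsearchRestHealthIndicator, and the error is a timeout. Debugging showed that the health-check request is given only 100 milliseconds to complete.
That is too short for my network environment, so the timeout needs to be increased. Since this health check is run by Actuator, I looked at Actuator's Elasticsearch auto-configuration classes and found this in ElasticsearchHealthIndicatorProperties: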
@ConfigurationProperties(
    prefix = "management.health.elasticsearch",
    ignoreUnknownFields = false
)
@Deprecated
public class ElasticsearchHealthIndicatorProperties {
    private List<String> indices = new ArrayList();
    private Duration responseTimeout = Duration.ofMillis(100L);
}
responseTimeout is 100 milliseconds, which matches the value seen while debugging, so this should be the setting to change. Update the yml:
# actuator
management:
  endpoints:
    web:
      exposure:
        include: ['*']
  health:
    elasticsearch:
      response-timeout: 3s
Ran it again: no errors.
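The effect of the response timeout can be reproduced without Elasticsearch. The sketch below is an illustration under stated assumptions, not Actuator's implementation: slowHealthCall is a hypothetical stand-in for a cluster health request that takes a given time to answer, and check waits for it with a bounded timeout, much as the stack trace above shows a bounded wait on a future.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class ResponseTimeoutSketch {

    /** Hypothetical stand-in for a health request that needs latencyMillis to answer. */
    static CompletableFuture<String> slowHealthCall(long latencyMillis) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(latencyMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "green";
        });
    }

    /** Waits at most responseTimeout for the answer; returns "TIMEOUT" if it arrives too late. */
    static String check(long latencyMillis, Duration responseTimeout) {
        try {
            return slowHealthCall(latencyMillis)
                    .get(responseTimeout.toMillis(), TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return "TIMEOUT";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "ERROR";
        } catch (ExecutionException e) {
            return "ERROR";
        }
    }

    public static void main(String[] args) {
        // An assumed 500 ms round trip exceeds the 100 ms default...
        System.out.println(check(500, Duration.ofMillis(100)));
        // ...but fits comfortably within the 3 s configured above.
        System.out.println(check(500, Duration.ofSeconds(3)));
    }
}
```

The 500 ms latency is an arbitrary figure chosen for the demonstration; the point is only that any round trip longer than 100 ms fails under the default and succeeds once the timeout is raised.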
There is another way to make the error go away, although it is not a good solution: disable Actuator's health check for Elasticsearch:
management:
  health:
    elasticsearch:
      enabled: false