A quick walkthrough of a problem on a 3-node Elasticsearch cluster, first noticed via the generic yellow/red cluster warning. The chain of events causing the problem looks like…
- The master node begins to exhaust its JVM heap and becomes unresponsive; as it is both a data node and the master, it has higher resource usage than the other nodes, to account for managing the cluster (a quick way to watch for this is sketched after this list).
- The non-master nodes can no longer contact the master and drop it from the cluster, leaving a two-node cluster.
- As the previous master is no longer part of the cluster, its resource usage drops and it becomes responsive again, so it rejoins the cluster.
- The cluster begins to rebalance shards now that it is back to 3 nodes, increasing resource usage.
- The problem cycles round to the start, ad infinitum.
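Watching per-node heap usage and which node currently holds the master role makes the cycle visible. A minimal sketch using the _cat/nodes API (the exact columns available vary slightly between Elasticsearch versions):

# heap.percent climbing towards 100 on the master node is the early warning here
while true; do sleep 5; curl -s 'localhost:9200/_cat/nodes?v&h=host,name,heap.percent,master'; done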
Symptoms
The Elasticsearch nodes are all containerised, and the first thing we can see is that some of the containers had been restarted recently. By the time I had ssh’d in, there were 3 of 3 nodes in the cluster and we had a master. CPU was high as the cluster was rebalancing shards.
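Assuming the containers are running under Docker, the restarts and CPU load are quick to confirm from the host; for example:

# container status shows which ones have restarted recently
docker ps --format 'table {{.Names}}\t{{.Status}}'
# one-off snapshot of CPU/memory usage per container
docker stats --no-stream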
Discovery
Looking at 2 of the nodes showed
curl localhost:9200/_cluster/health?pretty
{
"cluster_name" : "my-cluster",
"status" : "red",
"timed_out" : false,
"number_of_nodes" : 3,
"number_of_data_nodes" : 3,
"active_primary_shards" : 2843,
"active_shards" : 3162,
"relocating_shards" : 0,
"initializing_shards" : 12,
"unassigned_shards" : 9280,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 87,
"number_of_in_flight_fetch" : 8213
}
Then on checking the third node
curl localhost:9200/_cluster/health?pretty
{
"error" : "MasterNotDiscoveredException[waited for [30s]]",
"status" : 503
}
Looking at the Elasticsearch logs
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "elasticsearch[i-instance][[http_server_worker.default]][T#8]"
Exception in thread "elasticsearch[i-instance][fetch_shard_started][T#89]" java.lang.OutOfMemoryError: Java heap space
Exception in thread "elasticsearch[i-instance][[http_server_worker.default]][T#2]" java.lang.OutOfMemoryError: Java heap space
Actions
Recovering the cluster involves reducing JVM heap and CPU usage enough to allow rebalancing to occur without stressing the master node.
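It's also worth grabbing the current cluster settings first, so anything we override can be put back afterwards:

curl -s 'localhost:9200/_cluster/settings?pretty'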
Disable shard rebalancing until we can cope with it
curl -XPUT localhost:9200/_cluster/settings -d '{
"transient" : {
"cluster.routing.allocation.enable" : "none"
}
}'
Close old indices, dropping resource usage
curator close indices --older-than 30 --time-unit days --timestring '%Y.%m.%d'
Allow allocation of primary shards
curl -XPUT localhost:9200/_cluster/settings -d '{
"transient" : {
"cluster.routing.allocation.enable" : "primaries"
}
}'
Once we’re stable, allow all re-balancing
curl -XPUT localhost:9200/_cluster/settings -d '{
"transient" : {
"cluster.routing.allocation.enable" : "all"
}
}'
Hopefully there are now enough resources for the shards to be rebalanced across the cluster
while true; do sleep 3; curl localhost:9200/_cluster/health?pretty -s | grep \"unassigned_shards\"; done
"unassigned_shards" : 8420,
"unassigned_shards" : 8384,
"unassigned_shards" : 8339,
In reality I used a couple of tweaks to push things along.
The first was increasing the number of concurrent recoveries to speed up rebalancing; there was enough slack in the system to do this
curl -XPUT localhost:9200/_cluster/settings -d '{
"persistent" : {
"cluster.routing.allocation.node_concurrent_recoveries" : 20
}
}'
{"acknowledged":true,"persistent":{"cluster":{"routing":{"allocation":{"node_concurrent_recoveries":"20"}}}},"transient":{}}
There were also a couple of damaged indices that needed to be removed; these can be recovered later from backups with some work
curl -XGET -s 'localhost:9200/_cat/indices' | grep red
red open logstash-sensu-2015.12.29 5 1
red open logstash-myproject-2016.01.01 5 1
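Assuming those indices really can be rebuilt from backups later, removing them is just a delete per index:

curl -XDELETE 'localhost:9200/logstash-sensu-2015.12.29'
curl -XDELETE 'localhost:9200/logstash-myproject-2016.01.01'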