Monitoring Apache Zookeeper Servers
This article shows you how to monitor Apache Zookeeper servers using Netflix Exhibitor.
Apache Zookeeper is the go-to service for maintaining and managing data in a distributed environment. It is a highly popular coordination and synchronization service for distributed systems. Various systems use Apache Zookeeper servers for leader election, naming registries, node coordination, and data management. It is widely used in the industry by projects such as Apache Hadoop, Apache HBase, and Apache Kafka. Zookeeper makes it quite easy to implement advanced patterns in distributed systems.
Management of data is a complicated task in a distributed system. Apache Zookeeper handles coordination across multiple nodes so that application developers do not need to solve these common distributed-systems problems themselves. It is simple and easy to use, and it is itself distributed over multiple nodes so that there is no single point of failure.
In distributed systems, there is often a need to monitor Zookeeper servers. A common way of doing this is to send the four-letter “srvr” command to a server's client port.
root :~# echo srvr | nc localhost 2181
Zookeeper version: 3.4.12-9a32a0b3d8a6492fa18ed92f6d2408bbfc408912, built on 07/01/2019 19:04 GMT
Latency min/avg/max: 0/0/41
Received: 6165
Sent: 6326
Connections: 22
Outstanding: 0
Zxid: 0x2d55000000d2
Mode: follower
Node count: 2923
However, the command reports only on the server it is sent to, so it needs to be run against every node to check the status of the whole ensemble.
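One way to avoid logging in to every node is to send the command from a single machine. The following is a minimal Python sketch, not part of Zookeeper or Exhibitor. The helper names four_letter_word, parse_srvr, and survey are assumptions made for illustration, and the sketch assumes each server accepts the “srvr” command on its client port (2181 in the transcript above).

```python
import socket


def four_letter_word(host, port=2181, cmd="srvr", timeout=5.0):
    """Send a four-letter command to one server and return the raw reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_srvr(text):
    """Turn the 'Key: value' lines of srvr output into a dictionary."""
    result = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            result[key.strip()] = value.strip()
    return result


def survey(hosts, port=2181):
    """Query every host; map each hostname to its parsed reply or None."""
    report = {}
    for host in hosts:
        try:
            report[host] = parse_srvr(four_letter_word(host, port))
        except OSError:  # refused, timed out, or unresolvable
            report[host] = None
    return report
```

Calling survey with the list of ensemble members returns one entry per node. The Mode key of each entry shows whether that node is a leader or a follower, and a value of None marks a node that could not be reached.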
Exhibitor, provided by Netflix, is a supervisor system for Apache Zookeeper. It comes with a built-in REST API that provides more information on the instances in the Zookeeper ensemble.
The Exhibitor REST APIs are grouped into several categories. To monitor the entire Zookeeper cluster, look at the cluster category.
Some of the useful APIs for monitoring Zookeeper servers are as follows:
- Status
The best way to monitor the status of the entire cluster is to use the cluster status REST API from Exhibitor.
Method | GET |
URL | exhibitor/v1/cluster/status |
Argument | N/A |
Response | ServerStatus[] |
root :~# curl http://10.65.115.67:8180/exhibitor/v1/cluster/status
[{"code":3,"description":"serving","hostname":"10.120.187.29","isLeader":false},
{"code":3,"description":"serving","hostname":"10.65.115.70","isLeader":false},
{"code":3,"description":"serving","hostname":"10.65.115.69","isLeader":false},
{"code":3,"description":"serving","hostname":"10.65.115.68","isLeader":false},
{"code":3,"description":"serving","hostname":"10.65.115.67","isLeader":true}]
Exhibitor internally makes REST calls to each of the other servers in the ensemble and provides an integrated view of the cluster. This makes it possible to obtain the complete cluster status by invoking the REST API on a single node. The code integer in the response above is the state of the Zookeeper server. For example, in the above call, the code 3 denotes that the server is serving.
Code | State |
0 | LATENT |
1 | DOWN |
2 | NOT_SERVING |
3 | SERVING |
The isLeader boolean field helps you determine the current leader of the cluster.
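As an illustration of how the code and isLeader fields can be used together, here is a small Python sketch that summarizes a cluster status response. The function name summarize_cluster is hypothetical, and the sketch assumes the response body is a JSON array shaped like the sample above.

```python
import json

# State codes as listed in the table above.
STATE_NAMES = {0: "LATENT", 1: "DOWN", 2: "NOT_SERVING", 3: "SERVING"}


def summarize_cluster(status_json):
    """Return (leader hostname or None, list of (hostname, state) not serving)."""
    servers = json.loads(status_json)
    leader = None
    not_serving = []
    for server in servers:
        if server.get("isLeader"):
            leader = server["hostname"]
        code = server.get("code")
        if code != 3:
            not_serving.append((server["hostname"], STATE_NAMES.get(code, "UNKNOWN")))
    return leader, not_serving


if __name__ == "__main__":
    # Sample body shaped like the response above, shortened to two servers.
    sample = (
        '[{"code":3,"description":"serving","hostname":"10.65.115.68","isLeader":false},'
        '{"code":3,"description":"serving","hostname":"10.65.115.67","isLeader":true}]'
    )
    leader, not_serving = summarize_cluster(sample)
    print("leader:", leader)
    print("not serving:", not_serving)
```

Returning the non-serving hosts together with their state names keeps the summary readable in logs or alerts without a second lookup in the code table.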
- RemoteGetStatus
There is a remoteGetStatus API that returns the status of the provided instance. This can be used to check the status of local as well as remote servers.
Method | GET |
URL | exhibitor/v1/cluster/state/{hostname} |
Argument | N/A |
Response | ClusterState |
root :~# curl http://10.65.115.67:8180/exhibitor/v1/cluster/state/10.65.115.68
{"response":{"switches":{"restarts":true,"cleanup":true,"backups":true},"state":3,"description":"serving","isLeader":false},"errorMessage":"","success":true}
In the response of the remoteGetStatus API, the state (reported as code in the cluster status API), description, and isLeader fields carry the same information as in the cluster status API. The response additionally reports the switches described below.
ClusterSwitches
{
"restarts": boolean, // true if instance restarts are on
"unlistedRestarts": boolean, // true if unlisted instance restarts are on
"cleanup": boolean, // true if the cleanup task is on
"backups": boolean // true if the backup task is on
}
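The switches can be checked programmatically to spot instances where automatic restarts, cleanup, or backups have been turned off. The Python sketch below is an illustration under stated assumptions: the function names parse_cluster_state and disabled_switches are hypothetical, and the envelope handling assumes the response, errorMessage, and success fields seen in the remoteGetStatus sample above.

```python
import json


def parse_cluster_state(body):
    """Parse a ClusterState body, unwrapping a 'response' envelope if present."""
    data = json.loads(body)
    if "response" in data:
        if not data.get("success", True):
            raise RuntimeError(data.get("errorMessage") or "request failed")
        data = data["response"]
    return data


def disabled_switches(state):
    """Return the sorted names of the switches that are reported as off."""
    switches = state.get("switches", {})
    return sorted(name for name, enabled in switches.items() if not enabled)
```

Accepting both the wrapped and the unwrapped form lets the same helper process any ClusterState body, whether or not it arrives inside an envelope.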
- getStatus
The getStatus API returns the state of the local Zookeeper instance.
Method | GET |
URL | exhibitor/v1/cluster/state |
Argument | N/A |
Response | ClusterState |
root :~# curl http://10.65.115.67:8180/exhibitor/v1/cluster/state
{"switches":{"restarts":true,"cleanup":true,"backups":true},"state":3,"description":"serving","isLeader":true}
The response of getStatus has the same parameters as that of remoteGetStatus. You can refer to the Exhibitor REST Entities page for more details.
Using the REST APIs of Exhibitor is an excellent way to obtain the status of the entire cluster. The Exhibitor REST APIs can be easily incorporated into application workflows to get information about the cluster.
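As one possible way to wire this into a workflow, the Python sketch below fetches the cluster status and reduces it to a single yes/no answer that a deployment script or health check could act on. The function names are hypothetical, port 8180 is taken from the examples in this article, and the health rule (every server serving, exactly one leader) is an assumption rather than something Exhibitor defines.

```python
import json
import urllib.request


def cluster_status_url(host, port=8180):
    """Build the cluster status URL for one Exhibitor instance."""
    return f"http://{host}:{port}/exhibitor/v1/cluster/status"


def fetch_cluster_status(host, port=8180, timeout=5.0):
    """Call the cluster status API and return the decoded list of servers."""
    url = cluster_status_url(host, port)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def cluster_is_healthy(servers):
    """True when every server reports code 3 (serving) and exactly one leads."""
    if not servers:
        return False
    all_serving = all(s.get("code") == 3 for s in servers)
    leaders = sum(1 for s in servers if s.get("isLeader"))
    return all_serving and leaders == 1
```

A caller would pass the result of fetch_cluster_status to cluster_is_healthy and treat a network error (raised as OSError) or a malformed body (raised as ValueError) as an unhealthy result.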