How to Use MirrorMaker With Apache Kafka Clusters
In this article, see how to use MirrorMaker with Apache Kafka clusters.
MirrorMaker is a process in Apache Kafka that replicates, or mirrors, data between Kafka clusters. Don't confuse it with the replication of data among the Kafka nodes of a single cluster. One use case is to maintain a replica of a complete Kafka cluster in another data center, so that other workloads can be served without affecting the original cluster.
You can check out my other article on Kafka for a basic introduction to Apache Kafka setup and commands.
MirrorMaker consists of a consumer and a producer. The consumer reads data from topics in the source Kafka cluster, and the producer writes those events to the target Kafka cluster. The source and target clusters are independent of each other.
Let's understand this with a simple setup where both clusters exist on the same machine. We are using two Kafka clusters, each with two Kafka nodes and one ZooKeeper node. All processes run on the same host. One Kafka cluster is the source and the other is the target. This setup is for demonstration purposes only: with a single ZooKeeper node per cluster and everything on one host, it is not meant for production.
1. Create folders for zookeeper and Kafka logs.
$ pwd
/home/chandrashekhar/kafka_2.13-2.4.1/
mkdir -p data/zookeeper1
mkdir -p data/zookeeper2
mkdir -p data/kafka-logs-1-1
mkdir -p data/kafka-logs-1-2
mkdir -p data/kafka-logs-2-1
mkdir -p data/kafka-logs-2-2
2. Configuration for zookeeper nodes.
[chandrashekhar@localhost kafka_2.13-2.4.1]$ vi config/zookeeper1.properties
dataDir=/home/chandrashekhar/kafka_2.13-2.4.1/data/zookeeper1
clientPort=2181
maxClientCnxns=0
[chandrashekhar@localhost kafka_2.13-2.4.1]$ vi config/zookeeper2.properties
dataDir=/home/chandrashekhar/kafka_2.13-2.4.1/data/zookeeper2
clientPort=2182
maxClientCnxns=0
3. Configuration for Kafka nodes. There are four Kafka nodes in total: two connect to the ZooKeeper on port 2181 and the other two to the ZooKeeper on port 2182.
[chandrashekhar@localhost kafka_2.13-2.4.1]$ cp config/server.properties config/server1-1.properties
[chandrashekhar@localhost kafka_2.13-2.4.1]$ cp config/server.properties config/server1-2.properties
[chandrashekhar@localhost kafka_2.13-2.4.1]$ cp config/server.properties config/server2-1.properties
[chandrashekhar@localhost kafka_2.13-2.4.1]$ cp config/server.properties config/server2-2.properties
-----
vi ~/kafka_2.13-2.4.1/config/server1-1.properties
broker.id=0
port=9093
zookeeper.connect=localhost:2181
advertised.host.name = localhost
log.dirs=/home/chandrashekhar/kafka_2.13-2.4.1/data/kafka-logs-1-1
-----
vi ~/kafka_2.13-2.4.1/config/server1-2.properties
broker.id=1
port=9094
zookeeper.connect=localhost:2181
advertised.host.name = localhost
log.dirs=/home/chandrashekhar/kafka_2.13-2.4.1/data/kafka-logs-1-2
-----
vi ~/kafka_2.13-2.4.1/config/server2-1.properties
broker.id=2
port=9095
zookeeper.connect=localhost:2182
advertised.host.name = localhost
log.dirs=/home/chandrashekhar/kafka_2.13-2.4.1/data/kafka-logs-2-1
-----
vi ~/kafka_2.13-2.4.1/config/server2-2.properties
broker.id=3
port=9096
zookeeper.connect=localhost:2182
advertised.host.name = localhost
log.dirs=/home/chandrashekhar/kafka_2.13-2.4.1/data/kafka-logs-2-2
-----
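Editing four files by hand is error-prone, so the per-broker settings above can also be appended with a small loop. This is only a minimal sketch, not part of the original setup: `KAFKA_HOME` and `write_broker_config` are names I made up for illustration, and the scratch-directory default exists only so the script can run without a Kafka installation. It relies on a later key overriding an earlier one in a Java properties file, and it uses broker id 3 for the last node because that is the id shown in the `--describe` output of step 10.

```shell
#!/bin/sh
# Sketch: generate the four broker config files used in this walkthrough.
# KAFKA_HOME and write_broker_config are illustrative names (assumptions).

# Default to a scratch directory so the sketch runs without Kafka installed.
KAFKA_HOME="${KAFKA_HOME:-$(mktemp -d)}"
mkdir -p "$KAFKA_HOME/config"

# write_broker_config NAME BROKER_ID PORT ZK_PORT
write_broker_config() {
    name=$1; id=$2; port=$3; zk_port=$4
    target="$KAFKA_HOME/config/server$name.properties"

    # Start from the stock file when present; otherwise start empty.
    if [ -f "$KAFKA_HOME/config/server.properties" ]; then
        cp "$KAFKA_HOME/config/server.properties" "$target"
    else
        : > "$target"
    fi

    # Appended keys win over the same keys earlier in the file.
    # log.dirs is an absolute path: "~" is not expanded in properties files.
    cat >> "$target" <<EOF
broker.id=$id
port=$port
zookeeper.connect=localhost:$zk_port
advertised.host.name=localhost
log.dirs=$KAFKA_HOME/data/kafka-logs-$name
EOF
    mkdir -p "$KAFKA_HOME/data/kafka-logs-$name"
}

# Cluster 1 uses the ZooKeeper on 2181, cluster 2 the ZooKeeper on 2182.
write_broker_config 1-1 0 9093 2181
write_broker_config 1-2 1 9094 2181
write_broker_config 2-1 2 9095 2182
write_broker_config 2-2 3 9096 2182
```

To run it against a real installation, export `KAFKA_HOME` first, for example `KAFKA_HOME=/home/chandrashekhar/kafka_2.13-2.4.1`.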
4. Start zookeeper nodes and Kafka nodes.
chandrashekhar@chandrashekhar:~/kafka_2.13-2.4.1/bin$ ./zookeeper-server-start.sh ../config/zookeeper1.properties
chandrashekhar@chandrashekhar:~/kafka_2.13-2.4.1/bin$ ./zookeeper-server-start.sh ../config/zookeeper2.properties
chandrashekhar@chandrashekhar:~/kafka_2.13-2.4.1/bin$ ./kafka-server-start.sh ../config/server1-1.properties
chandrashekhar@chandrashekhar:~/kafka_2.13-2.4.1/bin$ ./kafka-server-start.sh ../config/server1-2.properties
chandrashekhar@chandrashekhar:~/kafka_2.13-2.4.1/bin$ ./kafka-server-start.sh ../config/server2-1.properties
chandrashekhar@chandrashekhar:~/kafka_2.13-2.4.1/bin$ ./kafka-server-start.sh ../config/server2-2.properties
5. Create the topic mirrormakerPOC on both Kafka clusters with the same number of partitions.
chandrashekhar@chandrashekhar:~/kafka_2.13-2.4.1/bin$ ./kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 2 --partitions 2 --topic mirrormakerPOC
Created topic mirrormakerPOC.
chandrashekhar@chandrashekhar:~/kafka_2.13-2.4.1/bin$ ./kafka-topics.sh --create --zookeeper localhost:2182 --replication-factor 2 --partitions 2 --topic mirrormakerPOC
Created topic mirrormakerPOC.
6. Create the consumer and producer configuration files for MirrorMaker.
chandrashekhar@chandrashekhar:~$ cat sourceCluster1Consumer.config
bootstrap.servers=localhost:9093,localhost:9094
exclude.internal.topics=true
client.id=mirror_maker_consumer
group.id=mirror_maker_consumer
chandrashekhar@chandrashekhar:~$ cat targetClusterProducer.config
bootstrap.servers=localhost:9095,localhost:9096
acks=1
batch.size=50
client.id=mirror_maker_test_producer
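For reference, here is a sketch that writes the same two files with a comment on each setting. `CONFIG_DIR` is a name I chose for illustration, and the commented-out `auto.offset.reset` line is an optional addition of mine, not part of the setup above. As far as I know, the consumer starts from the latest offset by default, so messages produced before MirrorMaker first starts are not mirrored unless that line is enabled.

```shell
#!/bin/sh
# Sketch: write the two MirrorMaker config files with explanatory comments.
# CONFIG_DIR is an illustrative variable; it defaults to a scratch directory.
CONFIG_DIR="${CONFIG_DIR:-$(mktemp -d)}"

# Note: comments in a properties file must sit on their own line.
cat > "$CONFIG_DIR/sourceCluster1Consumer.config" <<'EOF'
# Brokers of the source cluster (cluster 1).
bootstrap.servers=localhost:9093,localhost:9094
# Skip internal topics such as __consumer_offsets.
exclude.internal.topics=true
client.id=mirror_maker_consumer
# MirrorMaker's read position is tracked under this consumer group.
group.id=mirror_maker_consumer
# Optional (not in the original setup): also mirror messages that already
# exist when the group is first created. The consumer default is "latest".
#auto.offset.reset=earliest
EOF

cat > "$CONFIG_DIR/targetClusterProducer.config" <<'EOF'
# Brokers of the target cluster (cluster 2).
bootstrap.servers=localhost:9095,localhost:9096
# Wait for the partition leader only; acks=all trades latency for durability.
acks=1
# Batch size is in bytes; 50 is far below the producer default of 16384.
batch.size=50
client.id=mirror_maker_test_producer
EOF
```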
7. Now run the MirrorMaker process with the consumer and producer configuration files defined in the previous step.
chandrashekhar@chandrashekhar:~/kafka_2.13-2.4.1/bin$ ./kafka-mirror-maker.sh --consumer.config ../../sourceCluster1Consumer.config --num.streams 1 --producer.config ../../targetClusterProducer.config --whitelist=".*"
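The `--whitelist` value is a regular expression over topic names, and `".*"` selects every topic. Before narrowing it, it can help to preview which names a pattern would pick. The helper below is a rough sketch of that idea: `preview_whitelist` is a made-up name, `orders` is a hypothetical topic, and `grep -E` only approximates the Java regex syntax MirrorMaker uses, so treat the result as a hint for simple patterns. Note that `__consumer_offsets` shows up for `.*` here, while MirrorMaker itself skips it because of `exclude.internal.topics=true`.

```shell
#!/bin/sh
# Sketch: preview which topic names a whitelist pattern would select.
# preview_whitelist is an illustrative helper, not a Kafka tool.

# preview_whitelist PATTERN   (reads topic names from stdin, one per line)
preview_whitelist() {
    # -x requires the pattern to match the whole topic name.
    # "|| true" keeps an empty result from counting as a failure.
    grep -E -x -e "$1" || true
}

# Sample topic list; "orders" is hypothetical.
topics='__consumer_offsets
mirrormakerPOC
orders'

echo "Pattern .* selects:"
printf '%s\n' "$topics" | preview_whitelist '.*'

echo "Pattern mirrormaker.* selects:"
printf '%s\n' "$topics" | preview_whitelist 'mirrormaker.*'
```

With a running cluster, the topic list could come from `./kafka-topics.sh --list --zookeeper localhost:2181` piped into the helper instead of the sample list.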
8. Start sending messages to Kafka cluster 1 (brokers on ports 9093 and 9094, registered with the ZooKeeper on port 2181).
chandrashekhar@chandrashekhar:~/kafka_2.13-2.4.1/bin$ ./kafka-console-producer.sh --broker-list localhost:9093,localhost:9094 --topic mirrormakerPOC
>2134
>111
9. Start consuming from the Kafka nodes of both Kafka clusters.
- Consume from the Kafka nodes of the 2nd cluster.
./kafka-console-consumer.sh --bootstrap-server localhost:9095,localhost:9096 --topic mirrormakerPOC --group topic_group_2
2134
111
- Consume from the Kafka nodes of the 1st cluster.
./kafka-console-consumer.sh --bootstrap-server localhost:9093,localhost:9094 --topic mirrormakerPOC --group topic_group_1
2134
111
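Comparing the two consumers by eye works for two messages, but not for many. One possible way to automate the check is to dump each topic to a file and compare the dumps. The sketch below assumes that approach: `same_messages` is a made-up helper name, and the demo uses the two messages from step 8 in place of real dumps. It sorts before comparing because the order in which records are read across partitions need not be the same on both clusters.

```shell
#!/bin/sh
# Sketch: check that two message dumps hold the same records.
# same_messages is an illustrative helper name (an assumption).

# same_messages FILE_A FILE_B
# Returns 0 when both files contain the same lines, ignoring order.
same_messages() {
    a_sorted=$(mktemp); b_sorted=$(mktemp)
    sort "$1" > "$a_sorted"
    sort "$2" > "$b_sorted"
    if cmp -s "$a_sorted" "$b_sorted"; then status=0; else status=1; fi
    rm -f "$a_sorted" "$b_sorted"
    return "$status"
}

# Demo: the two messages sent in step 8, read back in different orders.
source_dump=$(mktemp); target_dump=$(mktemp)
printf '2134\n111\n' > "$source_dump"
printf '111\n2134\n' > "$target_dump"

if same_messages "$source_dump" "$target_dump"; then
    echo "clusters match"
else
    echo "clusters differ"
fi
```

Real dumps could be produced by adding `--from-beginning --timeout-ms 5000` to the console consumer commands above and redirecting the output to a file. A mismatch does not always mean a fault: messages produced before MirrorMaker started may be missing on the target.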
10. Monitor the list of topics, the details of a topic, and the offsets for a particular consumer group.
[chandrashekhar bin]$ ./kafka-topics.sh --list --zookeeper localhost:2182
__consumer_offsets
mirrormakerPOC
[chandrashekhar bin]$ ./kafka-topics.sh --list --zookeeper localhost:2181
__consumer_offsets
mirrormakerPOC
------------------------
[chandrashekhar bin]$ ./kafka-topics.sh --describe --zookeeper localhost:2182 --topic mirrormakerPOC
Topic: mirrormakerPOC PartitionCount: 2 ReplicationFactor: 2 Configs:
Topic: mirrormakerPOC Partition: 0 Leader: 3 Replicas: 3,2 Isr: 3,2
Topic: mirrormakerPOC Partition: 1 Leader: 2 Replicas: 2,3 Isr: 2,3
[chandrashekhar bin]$
[chandrashekhar bin]$ ./kafka-topics.sh --describe --zookeeper localhost:2181 --topic mirrormakerPOC
Topic: mirrormakerPOC PartitionCount: 2 ReplicationFactor: 2 Configs:
Topic: mirrormakerPOC Partition: 0 Leader: 0 Replicas: 0,1 Isr: 0,1
Topic: mirrormakerPOC Partition: 1 Leader: 1 Replicas: 1,0 Isr: 1,0
------------------------
[chandrashekhar bin]$ ./kafka-consumer-groups.sh --bootstrap-server localhost:9095,localhost:9096 --group topic_group_2 --describe
GROUP TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
topic_group_2 mirrormakerPOC 0 4 4 0 consumer-topic_group_2-1-846dfe1f-c487-410f-961d-5df50da2ea58 /127.0.0.1 consumer-topic_group_2-1
topic_group_2 mirrormakerPOC 1 4 4 0 consumer-topic_group_2-1-846dfe1f-c487-410f-961d-5df50da2ea58 /127.0.0.1 consumer-topic_group_2-1
[chandrashekhar bin]$
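The LAG column is the number to watch. A small filter can add it up across partitions so it is easy to check from a script. This is just a sketch: `total_lag` is a name I made up, and it assumes LAG is the sixth whitespace-separated column, as in the output above, which may not hold for other Kafka versions.

```shell
#!/bin/sh
# Sketch: sum the LAG column of `kafka-consumer-groups.sh --describe` output.
# total_lag is an illustrative helper; the column position (6th field) is
# taken from the output shown above and is an assumption for other versions.

total_lag() {
    # Skip the header row and short lines; ignore non-numeric lag values.
    awk '$1 != "GROUP" && NF >= 6 && $6 ~ /^[0-9]+$/ { sum += $6 }
         END { print sum + 0 }'
}

# Sample input copied from the describe output in this step.
total_lag <<'EOF'
GROUP TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
topic_group_2 mirrormakerPOC 0 4 4 0 consumer-topic_group_2-1-846dfe1f-c487-410f-961d-5df50da2ea58 /127.0.0.1 consumer-topic_group_2-1
topic_group_2 mirrormakerPOC 1 4 4 0 consumer-topic_group_2-1-846dfe1f-c487-410f-961d-5df50da2ea58 /127.0.0.1 consumer-topic_group_2-1
EOF
```

To see how far MirrorMaker itself is behind, the same filter could be applied to the `mirror_maker_consumer` group on the source cluster, for example `./kafka-consumer-groups.sh --bootstrap-server localhost:9093,localhost:9094 --group mirror_maker_consumer --describe | total_lag`.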
That's it. I hope this article gives you a basic idea of how to mirror, or replicate, data from one Kafka cluster to another.