A Quickstart Guide to Apache Kafka
This guide is a quick walkthrough of installing Apache Kafka on your machine as well as sending and receiving messages.
1. Download Apache Kafka
Go to https://kafka.apache.org/downloads and download the latest release of Apache Kafka.
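After downloading, you can optionally verify the archive's integrity against the checksum file published on the same downloads page. The filename below matches the release used in this guide; adjust it for the version you downloaded.

```shell
# Compare the archive's SHA-512 checksum against the .sha512 file from the
# Kafka downloads page (skips quietly if the archive isn't in this directory)
if [ -f kafka_2.13-3.1.0.tgz ]; then
  sha512sum kafka_2.13-3.1.0.tgz
else
  echo "archive not found in current directory"
fi
```

On macOS, `shasum -a 512` can be used in place of `sha512sum`.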
2. Install Apache Kafka
The main steps involved in installing Apache Kafka are:
Create a Directory for Apache Kafka Installation
Go to the download directory and move the file to a location in which you want to install the binaries.
In case the directory doesn't exist, first, create the directory structure.
Untar the File and Go to the Kafka Folder
Untar the file in the installation directory. Once the Kafka installation is unpacked, go to the Kafka folder.
bash-3.2$ mkdir kafka-3.1.0
bash-3.2$ mv kafka_2.13-3.1.0.tgz kafka-3.1.0/
bash-3.2$ cd kafka-3.1.0/
bash-3.2$ tar -xzf kafka_2.13-3.1.0.tgz
bash-3.2$ cd kafka_2.13-3.1.0
Important: Before proceeding to the next step, make sure you have Java 8+ installed on your machine. Without it, the Apache Kafka components won't start.
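A quick way to check this requirement (assuming java is on your PATH):

```shell
# Print the active Java version; Kafka needs Java 8 or newer
if command -v java >/dev/null 2>&1; then
  java -version 2>&1 | head -n 1
else
  echo "Java not found on PATH - install Java 8+ before continuing"
fi
```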
3. Start the Kafka Processes
To start Apache Kafka, first start the Zookeeper service and then start the main Kafka server.
A typical Kafka installation directory looks like this:
bash-3.2$ ls
LICENSE bin libs logs
NOTICE config licenses site-docs
All the binaries are located in the bin folder. To start services such as Zookeeper and the Kafka server, go to the bin directory and start them one at a time.
Start the Zookeeper Service
The Zookeeper startup script is located in the Apache Kafka bin folder.
bash-3.2$ bin/zookeeper-server-start.sh config/zookeeper.properties
The Zookeeper service starts in the foreground and writes its logs to the console.
You can also start the process using nohup if you are on a Unix-based OS. This runs the process in the background, so you can exit the shell without the process being aborted. To learn more about the nohup command, see https://en.wikipedia.org/wiki/Nohup.
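A background start with nohup might look like this (run from the Kafka installation directory; the log and PID file names here are just illustrative choices):

```shell
# Start Zookeeper in the background; output goes to zookeeper.out and the
# process keeps running after you log out of the shell
nohup bin/zookeeper-server-start.sh config/zookeeper.properties \
  > zookeeper.out 2>&1 &
echo $! > zookeeper.pid   # record the PID so you can stop the service later
```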
Start the Apache Kafka Broker Service
Start the Kafka service using this command.
bash-3.2$ bin/kafka-server-start.sh config/server.properties
Though the binaries and the startup scripts are located in the bin directory under the main Kafka installation location, the configuration files for starting the services are located within the config folder.
If you start Kafka from the top-level installation folder, you must specify the full or relative path to both the startup script (kafka-server-start.sh in the bin folder) and the configuration file (all Kafka server settings live in server.properties in the config folder).
Once started, the server outputs its logs to the terminal.
Basic Troubleshooting
If your server fails to start, check whether the port it is configured to run on is already occupied by another process. If it is, either change the server's port in the server.properties file or kill the process that is blocking the startup.
[2022-04-02 14:45:09,799] INFO [BrokerToControllerChannelManager broker=0 name=alterIsr]: Recorded new controller, from now on will use broker localhost:9092 (id: 0 rack: null) (kafka.server.BrokerToControllerRequestThread)
In this case, our server is running on the default port 9092.
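If you need to move the broker off the occupied port, you can change the listener line in config/server.properties. Port 9093 below is just an arbitrary example of a free port, not a required value:

```properties
# config/server.properties — move the broker to a different port
listeners=PLAINTEXT://:9093
```

Remember to use the same port in the --bootstrap-server argument of the client commands that follow.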
Sending and Receiving Messages
Before we go into the details of sending and receiving messages using Kafka, let's look at the five core APIs that Kafka exposes:
- Admin API for managing and inspecting Kafka
- Producer API to publish to Kafka topics
- Consumer API to connect to Kafka topics and consume messages
- Kafka Streams API for providing stream processing capabilities
- Kafka Connect API for developing connectors that stream events to and from external systems and applications.
See the official Apache Kafka documentation for more details about these APIs.
Create a Kafka Topic
We will create a simple producer and a consumer to publish and subscribe to messages from a Kafka topic. To do so, let's first create a Kafka topic.
bash-3.2$ bin/kafka-topics.sh --create --topic my-first-topic --bootstrap-server localhost:9092
Created topic my-first-topic.
bash-3.2$
Publish Messages to the Topic
Use the kafka-console-producer.sh script in the bin directory to publish messages to the Kafka topic created above. You will have to provide the topic to publish to and the address of the server on which the topic lives.
bash-3.2$ ./kafka-console-producer.sh --topic my-first-topic --bootstrap-server localhost:9092
>my first message
>my second message
>my third message
>
Here, our server is running on localhost and listening on TCP port 9092.
Once published, these messages sit in the topic waiting for consumers to pick them up.
Consume Messages From the Topic
Use the kafka-console-consumer.sh present in the bin directory to consume the messages from the Kafka topic. You will have to provide the server host and port to connect to and the topic to subscribe to. These are the same details you provided above. Once you start the script, you will see the messages you published appear on the terminal.
bash-3.2$ ./kafka-console-consumer.sh --topic my-first-topic --from-beginning --bootstrap-server localhost:9092
my first message
my second message
my third message
In this quickstart guide, we learned how to install Apache Kafka, start the Zookeeper service, start the Kafka server, create a topic, and publish messages to and consume messages from a Kafka topic.