Setting Up a Cassandra Cluster Through Ansible
Creating a cluster manually is a tedious task. Ansible can automate the task and handle the configuration management for us.
In this post, we will use Ansible to set up an Apache Cassandra database cluster, using AWS EC2 instances as the nodes. Creating a cluster manually is a tedious task: each node has to be configured by hand, and every node must be configured correctly before the cluster starts. With Ansible, we can automate the task and let Ansible handle the configuration management for us.
First, create a directory to hold the files and folders related to the playbook. This keeps the work organized and avoids the confusion that relative and absolute path references can cause when passing variables to the playbook. The following is the structure of my directory containing the playbook and the roles:
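A layout consistent with the files mentioned in this post and with Ansible's standard role conventions might look like the following sketch. The project directory name and the file names hosts and playbook.yml are assumptions, not taken from the original project:

```text
cassandra-ansible/                # hypothetical project directory name
├── ansible.cfg
├── hosts                         # inventory file (assumed name)
├── playbook.yml                  # the playbook (assumed name)
└── roles/
    └── installation/
        ├── files/
        │   ├── jdk-8_linux-x64_bin.rpm
        │   └── apache-cassandra-3.11.2-bin.tar.gz
        ├── tasks/
        │   └── main.yml
        └── templates/
            └── cassandra.yaml
```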
Steps to Follow While Using AWS
- Create two or three instances of AWS EC2 that will serve as the nodes in a cluster.
- Create a security group to allow all connections and add the nodes to that security group.
- Create an inventory that has the IP addresses of the nodes.
- Point the Ansible configuration file, e.g. ansible.cfg, at the inventory file.
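As a minimal sketch of the last two steps, the inventory could define a group whose name matches the hosts entry of the playbook (aws-webservers). The file name hosts, the key path, and the placeholder address are assumptions for illustration:

```ini
# hosts (inventory; the file name is an assumption)
[aws-webservers]
# One line per node, using the node's public IP.
13.xxx.xxx.xxx
```

```ini
# ansible.cfg
[defaults]
# Use the inventory file above by default.
inventory = ./hosts
# Hypothetical path to the EC2 key pair used for SSH.
private_key_file = ~/.ssh/my-ec2-key.pem
```

Recent Ansible releases may warn about the hyphen in the group name; the name is kept here only so that it matches the playbook.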
Now, we create a playbook to set up the nodes for us. Following is the playbook:
---
- hosts: aws-webservers
  gather_facts: yes
  remote_user: ec2-user
  become: yes
  vars:
    cluster_name: Test_Cluster
    seeds: 13.xxx.xxx.xxx
  roles:
    - installation
Next, we define the role the playbook uses. The installation role performs the following tasks:
- Installing Java (from a JDK 8 RPM).
- Copying and unpacking the Apache Cassandra tar.
- Replacing the default cassandra.yaml with a templated cassandra.yaml that holds our own configuration, described below.
- Ensuring Cassandra is started.
The following is the main.yml task file of the role:
---
- name: Copy Java RPM file
  copy:
    src: jdk-8_linux-x64_bin.rpm
    dest: /tmp/jdk-8_linux-x64_bin.rpm

- name: Install JDK via RPM file with yum
  yum:
    name: /tmp/jdk-8_linux-x64_bin.rpm
    state: present

- name: Copy Cassandra tar
  copy:
    src: apache-cassandra-3.11.2-bin.tar.gz
    dest: /tmp/apache-cassandra-3.11.2-bin.tar.gz

# Extract into the remote user's home directory so the later paths match.
- name: Extract Cassandra
  command: tar -xvf /tmp/apache-cassandra-3.11.2-bin.tar.gz
  args:
    chdir: /home/ec2-user
    creates: /home/ec2-user/apache-cassandra-3.11.2

- name: Override cassandra.yaml file
  template:
    src: cassandra.yaml
    dest: /home/ec2-user/apache-cassandra-3.11.2/conf/cassandra.yaml

# -f keeps Cassandra in the foreground; -R allows it to run as root.
- name: Run Cassandra from bin folder
  command: ./cassandra -fR
  args:
    chdir: /home/ec2-user/apache-cassandra-3.11.2/bin/
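As an alternative sketch, not the approach used in this role, Ansible's unarchive module can copy the archive from the role's files directory and unpack it in a single task. The destination directory is an assumption inferred from the chdir path used when starting Cassandra:

```yaml
# Possible replacement for the "Copy Cassandra tar" and "Extract Cassandra" tasks.
- name: Copy and unpack Cassandra in one step
  unarchive:
    # Looked up in the role's files directory on the control machine.
    src: apache-cassandra-3.11.2-bin.tar.gz
    # Assumed target: the remote user's home directory.
    dest: /home/ec2-user
    # Skip the task when the directory already exists, so reruns stay idempotent.
    creates: /home/ec2-user/apache-cassandra-3.11.2
```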
The cassandra.yaml contains most of the Cassandra configuration such as ports used, file locations, and seed node IP addresses. We need to edit this file on each node, so I have created a template for the file. The template cassandra.yaml uses the following variables:
- cluster_name: '{{ cluster_name }}' is the name of the cluster and can be anything you choose.
- seeds: "{{ seeds }}" are the IP addresses of the cluster's seed servers. Seed nodes are used as known places where cluster information (such as a list of nodes in the cluster) can be obtained.
- listen_address: {{ aws-webservers }} is the IP address that Cassandra listens on for internal (Cassandra to Cassandra) communication.
- rpc_address: {{ aws-webservers }} is the IP address that Cassandra listens on for client-based communication.
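The relevant part of such a template might look like the sketch below. The seed_provider block follows the layout of the stock cassandra.yaml. For the two address settings, the sketch substitutes the gathered fact ansible_default_ipv4.address as an assumption, because Jinja2 reads a hyphenated name such as aws-webservers as a subtraction rather than as a variable. On EC2 this fact is normally the instance's private address, so adjust it if your nodes must talk over public IPs:

```yaml
# templates/cassandra.yaml (excerpt; the rest of the file is unchanged)
cluster_name: '{{ cluster_name }}'

seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
      # Comma-separated list of seed node IPs, taken from the playbook vars.
      - seeds: "{{ seeds }}"

# Assumption: each node uses its own address from gathered facts
# (this requires gather_facts: yes, as set in the playbook).
listen_address: {{ ansible_default_ipv4.address }}
rpc_address: {{ ansible_default_ipv4.address }}
```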
Now, we can run the playbook and our cluster will be up and running. We can add more nodes to the list by simply adding them to the host list and Ansible will ensure that Cassandra is installed and the nodes are connected to the cluster and started.
Points to Remember
- The host IP should be the public IP of a node.
- Put the Java RPM package and the Cassandra tar file in the files directory of the role.
- Use Java 8, as Cassandra 3.11 does not run on newer Java versions. Starting it on a newer JVM fails with the following error:
[0.000s][warning][gc] -Xloggc is deprecated. Will use -Xlog:gc:/home/mmatak/monero/apache-cassandra-3.11.1/logs/gc.log instead.
intx ThreadPriorityPolicy=42 is outside the allowed range [ 0 ... 1 ]
Improperly specified VM option 'ThreadPriorityPolicy=42'
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
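To catch this before Cassandra is started, the role could check the Java version first. The two tasks below are a hedged sketch rather than part of the original role. They rely on java -version printing its version string to stderr and on Java 8 reporting itself as 1.8:

```yaml
- name: Check the installed Java version
  command: java -version
  register: java_version
  # This task only reads information, so never report it as a change.
  changed_when: false

- name: Fail early unless Java 8 is on the PATH
  assert:
    that:
      # Java 8 reports a version string such as "1.8.0_xxx".
      - "'1.8' in java_version.stderr"
    fail_msg: "Cassandra 3.11 needs Java 8"
```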
Thus, Ansible makes it very easy to install distributed systems like Cassandra, a job that is disheartening to even think about doing manually. The full source code, including the templates and directory structure, is here.
This article was first published on the Knoldus blog.
Published at DZone with permission of Sudeep James Tirkey, DZone MVB. See the original article here.