Step-by-Step Guide: Installing Apache Cassandra NoSQL Database on CentOS 8

Apache Cassandra is an open-source, high-performance NoSQL database management system designed to have no single point of failure. Unlike traditional relational databases such as MySQL and PostgreSQL, which usually run on a single primary server, Cassandra distributes data across a cluster of peer nodes, making it well suited to applications that cannot risk data loss. Its architecture replicates data automatically across multiple nodes, which provides fault tolerance and allows failed nodes to be replaced without downtime.

If your priorities include scalability, high availability, and performance, Apache Cassandra is a strong choice.

This comprehensive guide will walk you through the steps to install Apache Cassandra on CentOS 8.

Requirements

  • A CentOS 8 server with at least 2 GB of RAM.
  • Root access to the server, with a root password configured.
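The memory requirement can be checked from the shell before going further. The following is a minimal sketch, not part of the original guide: mem_total_mb is a hypothetical helper name, and it assumes the standard Linux /proc/meminfo format. The kernel reserves some memory, so a 2 GB server typically reports slightly less than 2048 MB.

```shell
# Print total memory in MB, reading /proc/meminfo-style text from stdin.
# (mem_total_mb is an illustrative helper, not a system command.)
mem_total_mb() {
    awk '/^MemTotal:/ { print int($2 / 1024) }'
}

# Usage on the server itself:
#   mem_total_mb < /proc/meminfo
```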

Getting Started

Start by updating the server’s packages to their latest versions. Execute the following command:

dnf update

After updating, restart your server to apply these updates.

Install Java

Cassandra runs on Java, and its cqlsh shell is written in Python. This guide uses OpenJDK 8 and Python 2. Install both packages using:

dnf install java-1.8.0-openjdk-devel python2

To confirm the installation and version of Java, use:

java -version

Your output should resemble:

openjdk version "1.8.0_232"
OpenJDK Runtime Environment (build 1.8.0_232-b09)
OpenJDK 64-Bit Server VM (build 25.232-b09, mixed mode)
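If the check needs to run inside a script, the version string can be extracted from that output. This is a sketch under stated assumptions: java_version_string and is_java8 are hypothetical helper names, and the parsing assumes the first line has the form shown above.

```shell
# Extract the quoted version string (for example 1.8.0_232) from
# `java -version` output supplied on stdin.
java_version_string() {
    sed -n 's/.* version "\([^"]*\)".*/\1/p' | head -n 1
}

# Succeed only if the version string belongs to the Java 8 (1.8.x) series.
is_java8() {
    case "$1" in
        1.8.*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Usage (java -version writes to stderr, hence 2>&1):
#   v=$(java -version 2>&1 | java_version_string)
#   is_java8 "$v" && echo "Java 8 detected: $v"
```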

Install Apache Cassandra

Cassandra isn’t available in the default CentOS 8 repositories, so you’ll need to add a repository for it. Create the file /etc/yum.repos.d/cassandra.repo:

nano /etc/yum.repos.d/cassandra.repo

Insert the following configuration:

[cassandra]
name = DataStax Repo for Apache Cassandra
baseurl = http://rpm.datastax.com/community
enabled = 1
gpgcheck = 0

Save and close the file. Then install Apache Cassandra through the dsc20 package (DataStax Community, Cassandra 2.0):

dnf install dsc20
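For scripted setups, the repository file can be written without opening an editor. The sketch below reproduces the repository definition from this guide with a here-document. The function name write_cassandra_repo is hypothetical, and the optional path argument is only there so the output can be previewed somewhere other than /etc.

```shell
# Write the repository definition from this guide to the given path.
# Defaults to the path used in this guide; pass another path to preview it.
write_cassandra_repo() {
    target="${1:-/etc/yum.repos.d/cassandra.repo}"
    cat > "$target" <<'EOF'
[cassandra]
name = DataStax Repo for Apache Cassandra
baseurl = http://rpm.datastax.com/community
enabled = 1
gpgcheck = 0
EOF
}

# Usage as root:
#   write_cassandra_repo
#   dnf install dsc20
```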

Create a Systemd Unit File for Cassandra

The Cassandra package installed above doesn’t ship a systemd unit file. Create one to manage the service:

nano /etc/systemd/system/cassandra.service

Write the following:

[Unit]
Description=Apache Cassandra
After=network.target
[Service]
PIDFile=/var/run/cassandra/cassandra.pid
User=cassandra
Group=cassandra
ExecStart=/usr/sbin/cassandra -f -p /var/run/cassandra/cassandra.pid
Restart=always
[Install]
WantedBy=multi-user.target

Save the changes, reload the systemd daemon, and then start and enable Cassandra:

systemctl daemon-reload
systemctl start cassandra
systemctl enable cassandra

Verify the service status with:

systemctl status cassandra

Expected output:

● cassandra.service - Apache Cassandra
   Loaded: loaded (/etc/systemd/system/cassandra.service; enabled; vendor preset: disabled)
   Active: active (running) since [Timestamp]
 Main PID: 1888 (java)
    Tasks: 53 (limit: 25044)
   Memory: 272.7M
   CGroup: /system.slice/cassandra.service
           └─1888 java -ea -javaagent:/usr/share/cassandra/lib/jamm-0.2.5.jar -XX:+CMSClassUnloadingEnabled
                   -XX:+UseThreadPriorities ...
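In provisioning scripts it can help to wait until the service reports as active before moving on. The following is a minimal sketch, not part of the original setup: wait_until is a hypothetical helper name, while systemctl is-active --quiet is a standard systemd command.

```shell
# Run a command repeatedly until it succeeds.
#   $1 = maximum number of attempts
#   $2 = seconds to sleep between attempts
#   remaining arguments = the command to run
# Note: POSIX sh has no local variables, so attempts, delay and i are global.
wait_until() {
    attempts="$1"; delay="$2"; shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        # Sleep only if another attempt will follow.
        [ "$i" -le "$attempts" ] && sleep "$delay"
    done
    return 1
}

# Usage: try up to 30 times, 2 seconds apart.
#   wait_until 30 2 systemctl is-active --quiet cassandra
```

Because the unit above runs Cassandra in the foreground, systemd marks it active as soon as the process starts, so an active state does not by itself mean that Cassandra is already accepting connections.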

Test Apache Cassandra Installation

Cassandra is now operational on your server. Check its status using:

nodetool status

This should return:

Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens  Owns (effective)  Host ID                               Rack
UN  127.0.0.1  46.11 KB   256     100.0%            2a680007-8c30-4bde-9a3f-9fa212b96d11  rack1
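The two-letter code at the start of each node line combines the status (U for Up, D for Down) with the state (N, L, J, or M), so UN means Up and Normal. As a rough sketch of how a script might check this, the hypothetical helper below lists any node that is not UN. It assumes the column layout shown above, with the address in the second column.

```shell
# Read `nodetool status` output on stdin and print the address of every
# node whose status/state code is not UN (Up/Normal).
# Exit status: 0 if at least one node is listed and all are UN, 1 otherwise.
nodes_not_up_normal() {
    awk '
        $1 ~ /^[UD][NLJM]$/ {
            seen = 1
            if ($1 != "UN") { print $2; bad = 1 }
        }
        END { exit (bad || !seen) }
    '
}

# Usage:
#   nodetool status | nodes_not_up_normal && echo "all nodes are Up/Normal"
```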

Configure Apache Cassandra

Cassandra accepts connections from localhost by default. To open the Cassandra Query Language (CQL) shell, enter:

cqlsh

You’ll see:

Connected to Test Cluster at localhost:9160.
[cqlsh 4.1.1 | Cassandra 2.0.17 | CQL spec 3.1.1 | Thrift protocol 19.39.0]
Use HELP for help.
cqlsh>

To change the cluster name from the default “Test Cluster”, update it from the CQL shell:

cqlsh
cqlsh> UPDATE system.local SET cluster_name = 'HowtoForge Cluster' WHERE KEY = 'local';

Exit the shell:

cqlsh> exit;
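The same UPDATE statement could also be generated by a script instead of being typed at the prompt. In the sketch below, rename_cluster_cql is a hypothetical helper that prints the statement used above, doubling any single quote in the name, which is how CQL escapes quotes inside a string literal. Piping the result into cqlsh assumes that cqlsh reads statements from standard input when it is not attached to a terminal; verify this on your version before relying on it.

```shell
# Print the CQL statement that renames the cluster in system.local.
# Single quotes in the name are doubled so the string literal stays valid.
rename_cluster_cql() {
    escaped=$(printf '%s' "$1" | sed "s/'/''/g")
    printf "UPDATE system.local SET cluster_name = '%s' WHERE KEY = 'local';\n" "$escaped"
}

# Usage (assumption: cqlsh accepts statements on stdin):
#   rename_cluster_cql 'HowtoForge Cluster' | cqlsh
```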

Edit the cassandra.yaml configuration and set the new cluster name:

nano /etc/cassandra/default.conf/cassandra.yaml

Change the cluster_name line to:

cluster_name: 'HowtoForge Cluster'

Save the file, flush the system keyspace to disk, and restart Cassandra:

nodetool flush system
systemctl restart cassandra
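The manual edit to cassandra.yaml could also be scripted and run before the flush and restart. This is a minimal sketch: set_cluster_name is a hypothetical helper, and it assumes the file contains a single top-level line that starts with cluster_name:.

```shell
# Rewrite the top-level cluster_name line of a cassandra.yaml file.
#   $1 = path to the file
#   $2 = new cluster name (must not contain ', | or &)
# The result is written back with cat so that the file keeps its
# original owner and permissions.
set_cluster_name() {
    tmp=$(mktemp) || return 1
    if sed "s|^cluster_name:.*|cluster_name: '$2'|" "$1" > "$tmp"; then
        cat "$tmp" > "$1"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}

# Usage as root:
#   set_cluster_name /etc/cassandra/default.conf/cassandra.yaml 'HowtoForge Cluster'
```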

Log back in to confirm the cluster name change:

cqlsh

You should see:

Connected to HowtoForge Cluster at localhost:9160.
[cqlsh 4.1.1 | Cassandra 2.0.17 | CQL spec 3.1.1 | Thrift protocol 19.39.0]
Use HELP for help.
cqlsh>

Conclusion

Well done! You have successfully installed and configured Apache Cassandra on CentOS 8.

Frequently Asked Questions (FAQ)

  • Is there an alternative CentOS version compatible with these instructions? While this guide targets CentOS 8, many of the instructions apply to other CentOS versions with similar package management and configuration directories.
  • Can I use a newer version of Java? Use a Java version that is compatible with your Cassandra release. Check Cassandra’s documentation for current version requirements.
  • How can I scale Cassandra for larger data workloads? Cassandra is designed to scale horizontally: adding nodes to the cluster distributes the data across more machines and increases the cluster’s capacity and fault tolerance.
  • Where can I find more about Cassandra’s configuration settings? The Apache Cassandra documentation covers the configuration settings and best practices in detail.