Apache Cassandra is an open-source, high-performance NoSQL database management system designed to have no single point of failure. Unlike traditional relational databases such as MySQL and PostgreSQL, Cassandra is built around a cluster of nodes rather than a single server, which makes it well suited to applications that cannot risk data loss. Its architecture replicates data automatically across multiple nodes, providing fault tolerance and allowing failed nodes to be replaced without downtime.
If your priorities include scalability, high availability, and performance, Apache Cassandra is a strong choice.
This comprehensive guide will walk you through the steps to install Apache Cassandra on CentOS 8.
Requirements
- A CentOS 8 server with at least 2 GB of RAM.
- Root user access with a set password.
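The two requirements above could be checked before starting with a small preflight script. This is only a sketch: the function names and the MIN_RAM_KB threshold are assumptions made for illustration, and the threshold sits a little below 2 GB because the kernel usually reports slightly less than the nominal RAM size.

```shell
# Preflight sketch for the requirements listed above.
# MIN_RAM_KB is an assumed threshold (roughly 2 GB expressed in kB).
MIN_RAM_KB=1800000

# Succeeds when the given MemTotal value (in kB) meets the threshold.
has_enough_ram() {
    [ "$1" -ge "$MIN_RAM_KB" ]
}

# Succeeds when the given numeric user ID is root's (0).
is_root_uid() {
    [ "$1" -eq 0 ]
}

# Prints the MemTotal value (in kB) from /proc/meminfo.
read_mem_total_kb() {
    awk '/^MemTotal:/ { print $2 }' /proc/meminfo
}

# Prints a warning for each requirement that is not met.
preflight() {
    has_enough_ram "$(read_mem_total_kb)" || echo "Warning: less than about 2 GB of RAM detected"
    is_root_uid "$(id -u)" || echo "Warning: not running as root"
}
```

Calling preflight prints nothing when both checks pass.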
Getting Started
It’s advisable to start by updating your server to the latest stable state. Execute the following command:
dnf update
After updating, restart your server to apply these updates.
Install Java
Apache Cassandra requires OpenJDK 8 and Python 2. Install these packages using:
dnf install java-1.8.0-openjdk-devel python2
To confirm the installation and version of Java, use:
java -version
Your output should resemble:
openjdk version "1.8.0_232"
OpenJDK Runtime Environment (build 1.8.0_232-b09)
OpenJDK 64-Bit Server VM (build 25.232-b09, mixed mode)
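In a setup script, the same version check could be automated rather than read by eye. The helper names below are hypothetical, and the sketch assumes the version banner has the format shown above, with Java 8 reported as 1.8.x.

```shell
# Extracts the quoted version string from the first line of
# `java -version` output passed as the first argument.
java_version_string() {
    printf '%s\n' "$1" | sed -n '1s/.*version "\([^"]*\)".*/\1/p'
}

# Succeeds when the version string belongs to Java 8 (reported as 1.8.x).
is_java_8() {
    case "$1" in
        1.8.*) return 0 ;;
        *)     return 1 ;;
    esac
}
```

Because java -version writes to standard error, a call would need to redirect it, for example: is_java_8 "$(java_version_string "$(java -version 2>&1)")" && echo "Java 8 detected"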
Install Apache Cassandra
Cassandra isn’t in the default CentOS 8 repository, so you’ll need to create a new repository file. Open and edit /etc/yum.repos.d/cassandra.repo:
nano /etc/yum.repos.d/cassandra.repo
Insert the following configuration:
[cassandra]
name = DataStax Repo for Apache Cassandra
baseurl = http://rpm.datastax.com/community
enabled = 1
gpgcheck = 0
Save and close the file. Then, install Apache Cassandra:
dnf install dsc20
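For scripted setups, the repository file from this section could also be written without opening an editor. The sketch below wraps a here-document in a hypothetical helper, write_cassandra_repo, that takes the target path as an optional argument. The file content is the same as shown above.

```shell
# Writes the repository definition to the given path.
# Defaults to /etc/yum.repos.d/cassandra.repo, which requires root.
write_cassandra_repo() {
    repo_path="${1:-/etc/yum.repos.d/cassandra.repo}"
    cat > "$repo_path" <<'EOF'
[cassandra]
name = DataStax Repo for Apache Cassandra
baseurl = http://rpm.datastax.com/community
enabled = 1
gpgcheck = 0
EOF
}
```

Running write_cassandra_repo with no argument before the dnf install step would replace the manual nano edit.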
Create a Systemd Unit File for Cassandra
Apache Cassandra doesn’t come with a service file by default. Create a systemd service file to manage it:
nano /etc/systemd/system/cassandra.service
Write the following:
[Unit]
Description=Apache Cassandra
After=network.target

[Service]
PIDFile=/var/run/cassandra/cassandra.pid
User=cassandra
Group=cassandra
ExecStart=/usr/sbin/cassandra -f -p /var/run/cassandra/cassandra.pid
Restart=always

[Install]
WantedBy=multi-user.target
Save the changes, reload the systemd daemon, and then start and enable Cassandra:
systemctl daemon-reload
systemctl start cassandra
systemctl enable cassandra
Verify the service status with:
systemctl status cassandra
Expected output:
● cassandra.service - Apache Cassandra
   Loaded: loaded (/etc/systemd/system/cassandra.service; disabled; vendor preset: disabled)
   Active: active (running) since [Timestamp]
 Main PID: 1888 (java)
    Tasks: 53 (limit: 25044)
   Memory: 272.7M
   CGroup: /system.slice/cassandra.service
           └─1888 java -ea -javaagent:/usr/share/cassandra/lib/jamm-0.2.5.jar -XX:+CMSClassUnloadingEnabled -XX:+UseThreadPriorities -XX:ThreadPriorities whatever
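Cassandra can take a little while to come up after systemctl start, so a script that continues straight away may find the service not yet ready. A generic retry helper is one way to wait for it. The function name retry_until_ok and its argument order are assumptions made for this sketch.

```shell
# Runs a command repeatedly until it succeeds or the attempts run out.
# Usage: retry_until_ok ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry_until_ok() {
    retry_attempts="$1"
    retry_delay="$2"
    shift 2
    retry_i=1
    while [ "$retry_i" -le "$retry_attempts" ]; do
        # Discard the command's output; only its exit status matters here.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        retry_i=$((retry_i + 1))
        # Sleep only when another attempt is still to come.
        if [ "$retry_i" -le "$retry_attempts" ]; then
            sleep "$retry_delay"
        fi
    done
    return 1
}
```

For example, retry_until_ok 30 2 systemctl is-active --quiet cassandra would poll the unit state up to 30 times, two seconds apart.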
Test Apache Cassandra Installation
Cassandra is now operational on your server. Check its status using:
nodetool status
This should return:
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load      Tokens  Owns (effective)  Host ID                               Rack
UN  127.0.0.1  46.11 KB  256     100.0%            2a680007-8c30-4bde-9a3f-9fa212b96d11  rack1
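In the output above, the leading UN means the node is Up and in the Normal state. A health check could count such lines instead of reading the table by hand. The helper name count_up_normal is hypothetical, and the sketch assumes the column layout shown above, with the status code in the first column.

```shell
# Counts nodes reported as "UN" (Up/Normal) in `nodetool status`
# output read from standard input.
count_up_normal() {
    awk '$1 == "UN" { n++ } END { print n + 0 }'
}
```

A typical call would be nodetool status | count_up_normal, which should print 1 on the single-node setup described here.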
Configure Apache Cassandra
Cassandra accepts connections from localhost by default. To log into Cassandra via the Cassandra Query Language (CQL), enter:
cqlsh
You’ll see:
Connected to Test Cluster at localhost:9160.
[cqlsh 4.1.1 | Cassandra 2.0.17 | CQL spec 3.1.1 | Thrift protocol 19.39.0]
Use HELP for help.
cqlsh>
To alter the default cluster name from “Test Cluster,” use the CQL shell:
cqlsh
cqlsh> UPDATE system.local SET cluster_name = 'HowtoForge Cluster' WHERE KEY = 'local';
Exit the shell:
cqlsh> exit;
Edit the cassandra.yaml configuration and set the new cluster name:
nano /etc/cassandra/default.conf/cassandra.yaml
Modify:
cluster_name: 'HowtoForge Cluster'
Save the file, flush the system keyspace to disk, and restart Cassandra:
nodetool flush system
systemctl restart cassandra
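For unattended setups, the cluster_name edit in cassandra.yaml could be scripted instead of made in nano. The sketch below uses a hypothetical helper, set_cluster_name. It assumes the file has a single top-level cluster_name line and that the new name contains no single quote, pipe, ampersand, or backslash characters.

```shell
# Rewrites the cluster_name line of a cassandra.yaml file in place.
# Usage: set_cluster_name "New Name" [path-to-cassandra.yaml]
set_cluster_name() {
    scn_name="$1"
    scn_path="${2:-/etc/cassandra/default.conf/cassandra.yaml}"
    scn_tmp="$(mktemp)"
    # Write the edited copy to a temp file, then copy it back with cat
    # so the original file keeps its owner and permissions.
    sed "s|^cluster_name:.*|cluster_name: '${scn_name}'|" "$scn_path" > "$scn_tmp" \
        && cat "$scn_tmp" > "$scn_path"
    scn_status=$?
    rm -f "$scn_tmp"
    return "$scn_status"
}
```

Calling set_cluster_name "HowtoForge Cluster" would make the same change as the manual edit. The flush and restart steps are still needed afterwards.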
Log back in to confirm the cluster name change:
cqlsh
You should see:
Connected to HowtoForge Cluster at localhost:9160.
[cqlsh 4.1.1 | Cassandra 2.0.17 | CQL spec 3.1.1 | Thrift protocol 19.39.0]
Use HELP for help.
cqlsh>
Conclusion
Well done! You have successfully installed and configured Apache Cassandra on CentOS 8. Should you have any questions, please feel free to ask for assistance.
Frequently Asked Questions (FAQ)
- Is there an alternative CentOS version compatible with these instructions? While this guide targets CentOS 8, many of the instructions apply to other CentOS versions with similar package management and configuration directories.
- Can I use a newer version of Java? It’s advised to use the version of Java compatible with Cassandra’s specified requirements. Check Cassandra’s documentation for any updates or version compatibility changes.
- How can I scale Cassandra for larger data workloads? Cassandra is designed to scale by adding more nodes to your cluster, which distributes the data and increases the cluster’s capacity and fault tolerance.
- Where can I find more about Cassandra’s configuration settings? The Apache Cassandra documentation is a comprehensive resource that covers configuration settings and best practices.