Apache Cassandra is an open-source, distributed NoSQL database originally developed at Facebook by Avinash Lakshman and Prashant Malik; it is now a project of the Apache Software Foundation. It was originally built to power Facebook’s inbox search feature. As a DBMS, it is designed to handle large amounts of data efficiently across many commodity servers. NoSQL databases of this kind are a common choice for scalable applications.
Packages are available for almost all the major Linux distributions. Apache Cassandra suits applications that need to process a lot of data, and it takes care of tedious tasks such as distributing and replicating data across nodes on its own.
Another feature of Apache Cassandra is that it has no single point of failure. Each node in the cluster holds a different portion of the data, but there is no master node, and any node in the cluster can service any request.
Cassandra is actively used at organizations such as Hulu, eBay, Netflix, Reddit, GoDaddy, and CERN. Because Apache Cassandra is decentralized, the failure of any one node does not take the cluster down.
It is also fault tolerant: data is replicated across several servers, so losing one of them does not lose the data. NoSQL database management systems are less exposed than traditional SQL servers to some common attacks such as SQL injection.
Different Versions of Apache Cassandra (right from its initial release)
The first public release of Apache Cassandra came out on April 12, 2010. It included support for Apache Hadoop MapReduce and integrated caching. The next version, 0.7, was released on January 8, 2011; it allowed online schema changes and added secondary indexes. The following update, 0.8, introduced the Cassandra Query Language (CQL), support for zero-downtime upgrades, and self-tuning memtables.
Version 1.0 brought improved read performance, integrated compression, and leveled compaction. Version 1.1, released on April 23, 2012, added row-level isolation and self-tuning caches. Version 1.2, released on January 2, 2013, added request tracing, changes to inter-node communication, and atomic batches.
Version 2.0 was released on September 4, 2013, with improved compaction, lightweight transactions, and triggers. The releases that followed (2.1, 2.2, and 3.0) made no comparably significant changes.
Releases 3.1 to 3.10 were monthly updates. The odd-numbered releases (3.1, 3.3, 3.5, 3.7, and 3.9) contained only bug fixes, while the even-numbered releases (3.2, 3.4, 3.6, 3.8, and 3.10) contained both bug fixes and new features.
The latest release of Apache Cassandra at the time of writing is in the 3.11 series, dated February 11, 2019. It brought critical bug fixes along with some new features.
In this article, we will learn how to install Apache Cassandra on Ubuntu 18.04.
Since Apache Cassandra is written in Java, the first thing to do is install a Java Development Kit. To install OpenJDK 8, enter the commands below in your Ubuntu 18.04 terminal:
sudo apt update
sudo apt install openjdk-8-jdk
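Once the package is installed, it is worth confirming that a Java runtime is actually on the PATH before moving on. The snippet below is a minimal sketch; check_java is a name made up for this article, not part of any package.

```shell
# Print the first line of "java -version", or fail if java is not on the PATH.
# check_java is an illustrative helper name, not a standard command.
check_java() {
  command -v java >/dev/null 2>&1 || return 1
  # java -version writes to stderr, so redirect it before filtering.
  java -version 2>&1 | head -n 1
}

check_java || echo "java not found; install openjdk-8-jdk first" >&2
```

With OpenJDK 8 installed, the printed version string should start with 1.8.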
Install Apache Cassandra
You can either install Cassandra from its .deb file or add Cassandra's official package repository, which makes installing and updating Apache Cassandra a lot easier.
Adding the Official package repository
To install Apache Cassandra easily, add the official Cassandra repository by entering the following commands in the terminal:
wget -q -O - https://www.apache.org/dist/cassandra/KEYS | sudo apt-key add -
sudo sh -c 'echo "deb http://www.apache.org/dist/cassandra/debian 311x main" > /etc/apt/sources.list.d/cassandra.sources.list'
The lines above import the repository's GPG key and add the Apache Cassandra repository to a file called cassandra.sources.list.
After running the commands above, run the commands below to install Apache Cassandra:
sudo apt update
sudo apt install cassandra
To confirm that everything is working, check the state of the Cassandra node from the terminal.
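One common way to do this, assuming the package has put nodetool on the PATH, is nodetool status; a node whose line starts with UN is up and in the normal state. The sketch below wraps that call in a hypothetical helper named check_node so that a missing tool gives a readable hint instead of a bare error.

```shell
# Ask the local node for its ring status; fail quietly if nodetool is absent.
# check_node is an illustrative helper name, not part of Cassandra.
check_node() {
  command -v nodetool >/dev/null 2>&1 || return 1
  nodetool status
}

check_node || echo "nodetool unavailable or node not reachable; try: sudo systemctl status cassandra" >&2
```

Cassandra can take a little while to start, so a failure right after installation may simply mean the node is not ready yet.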
Interacting with Cassandra
You interact with Cassandra using the Cassandra Query Language (CQL), through a command-line utility named cqlsh. To open it, type cqlsh at the command line.
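Run with no arguments, cqlsh opens an interactive prompt connected to 127.0.0.1 on port 9042. For a quick non-interactive check, its -e option executes a single statement and exits. The sketch below uses that option through a small wrapper; the function name cql is an assumption made for this example.

```shell
# Run one CQL statement non-interactively against the local node.
# cql is an illustrative wrapper name; cqlsh -e is the real option it relies on.
cql() {
  command -v cqlsh >/dev/null 2>&1 || return 1
  cqlsh -e "$1"
}

# system.local holds one row describing the node we are connected to.
cql "SELECT cluster_name, release_version FROM system.local;" \
  || echo "cqlsh unavailable or Cassandra not accepting connections yet" >&2
```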
Configuring Apache Cassandra
Configuring something means arranging several small components in a particular manner to form a complete system.
Cassandra comes with a default configuration that works well on a single system. If Cassandra is going to be used in a cluster, however, it is advisable to modify its configuration file accordingly.
The configuration file is located in /etc/cassandra/. To open it with the nano text editor, type the command below:
sudo nano /etc/cassandra/cassandra.yaml
Modifications to do in the Configuration file
Changing the Cluster name
You can change the cluster name to whatever name you want. To do so, look for the parameter below and make the desired change:
cluster_name: [some name]
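In the stock cassandra.yaml the key is spelled cluster_name in lowercase, and its default value is 'Test Cluster'. As a sketch of the same edit done from the shell rather than in nano, the two helpers below read and rewrite that key; get_cluster_name and set_cluster_name are hypothetical names, and the rewrite assumes the new name contains no slash, ampersand, or single quote.

```shell
# Print the current value of the top-level cluster_name key, without quotes.
get_cluster_name() {
  sed -n "s/^cluster_name:[[:space:]]*//p" "$1" | tr -d "'\""
}

# Rewrite cluster_name in FILE to NAME. Usage: set_cluster_name FILE NAME
# Editing the real file under /etc needs root; try it on a copy first.
set_cluster_name() {
  sed "s/^cluster_name:.*/cluster_name: '$2'/" "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

CONF=/etc/cassandra/cassandra.yaml
if [ -f "$CONF" ]; then
  get_cluster_name "$CONF"
fi
```

A node that has already started once under the old name may refuse to start after a rename, so this change is easiest to make before the first start.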
Changing the default data storage port
To change the default data storage port, look for the corresponding port parameter in the file and make the desired change.
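In the stock cassandra.yaml, the port that nodes use to talk to each other is set by storage_port (default 7000), which is presumably the parameter meant here; client connections use native_transport_port (default 9042). The sketch below simply lists the port-related keys so they can be reviewed before editing; show_ports is a name invented for this example.

```shell
# List the main port settings from a cassandra.yaml file.
# show_ports is an illustrative helper name.
show_ports() {
  grep -E '^(storage_port|ssl_storage_port|native_transport_port):' "$1"
}

CONF=/etc/cassandra/cassandra.yaml
if [ -f "$CONF" ]; then
  show_ports "$CONF"
fi
```

Every node in a cluster needs to agree on storage_port, so a change here has to be repeated on each node.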
Adding the IP of the nodes that make the cluster
To add the IP addresses of the nodes that make up the cluster, look for the seed_provider parameter and edit the seeds entry beneath it accordingly:
- seeds: "[node 1], [node 2], ..., [node n]"
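In the stock file the seeds entry is a single quoted, comma-separated string nested under seed_provider, with 127.0.0.1 as the default. The sketch below prints the configured seeds one per line so that a typo in the list is easy to spot; list_seeds is a hypothetical helper name and the addresses in the comment are made-up examples.

```shell
# Print each configured seed address on its own line.
# For a line such as:   - seeds: "10.0.0.1, 10.0.0.2"
# this prints 10.0.0.1 and 10.0.0.2 (example addresses, not real nodes).
list_seeds() {
  sed -n 's/^[[:space:]]*-[[:space:]]*seeds:[[:space:]]*//p' "$1" \
    | tr -d '" ' \
    | tr ',' '\n'
}

CONF=/etc/cassandra/cassandra.yaml
if [ -f "$CONF" ]; then
  list_seeds "$CONF"
fi
```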
Save & Reload the configuration file
After making all the necessary changes, save the configuration file and reload Cassandra by entering the command below. If the service does not support reload, restarting it with sudo systemctl restart cassandra applies the changes as well.
sudo systemctl reload cassandra
Apache Cassandra is a NoSQL database that is well suited to scalable applications that need to process large amounts of data. It delivers strong results without compromising performance. In this article, we have learned how to install and configure Apache Cassandra on Ubuntu 18.04.