What is Cassandra?

Apache Cassandra is an open-source, distributed NoSQL database management system (DBMS) that can quickly process large volumes of data spread across many servers.

Avinash Lakshman and Prashant Malik initially developed Cassandra at Facebook for Facebook's inbox search feature (when you search your inbox on Facebook and get results, thank these guys!).

Apache Cassandra's architecture. Image courtesy: Max McKittrick

The best way to describe Cassandra to a newcomer is that it is a KKV store. The two Ks comprise the primary key.
The first K is the partition key and is used to determine which node the data lives on and where it is found on disk. The partition contains multiple rows within it and a row within a partition is identified by the second K, which is the clustering key.
The clustering key acts as both a primary key within the partition and how the rows are sorted. You can think of a partition as an ordered dictionary.

Stanislav Vishnevskiy, CTO, Discord
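To make the "ordered dictionary" picture concrete, here is a minimal in-memory sketch in Python. It is only a toy model of the KKV idea, not how Cassandra is implemented: the class name `KKVStore`, its methods, and the channel/message sample data are all made up for illustration.

```python
from collections import defaultdict


class KKVStore:
    """Toy model of a KKV store (hypothetical; not Cassandra's real storage engine)."""

    def __init__(self):
        # First K (partition key) -> {second K (clustering key) -> value}
        self._partitions = defaultdict(dict)

    def put(self, partition_key, clustering_key, value):
        # Writing the same (partition key, clustering key) pair overwrites the row.
        self._partitions[partition_key][clustering_key] = value

    def get(self, partition_key, clustering_key):
        # Both keys together identify exactly one row; returns None if it is missing.
        return self._partitions.get(partition_key, {}).get(clustering_key)

    def scan(self, partition_key):
        # Rows in a partition come back sorted by clustering key.
        return sorted(self._partitions.get(partition_key, {}).items())


store = KKVStore()
store.put("channel-1", 3, "third")
store.put("channel-1", 1, "first")
store.put("channel-2", 2, "other channel")
store.put("channel-1", 2, "second")

print(store.scan("channel-1"))  # [(1, 'first'), (2, 'second'), (3, 'third')]
```

Rows are returned in clustering-key order no matter what order they were written in, and rows from other partitions never show up in the scan.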

Apache Cassandra uses the Cassandra Query Language (CQL), an alternative to SQL with a similar syntax, for querying and processing data.

Why should you use Cassandra?

Cassandra is one of the most popular databases because it's scalable, fault-tolerant and easy to learn and configure.

Since it is decentralized, every node in a cluster is identical. As a result, there is no single point of failure, that is, no critical component whose failure would stop the entire system from working.

A node is where the data is actually stored; it is the basic component of Apache Cassandra.
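The sketch below illustrates, in a very simplified way, why a cluster of identical nodes can keep answering when one of them fails. It assumes each partition is copied to several nodes, a detail not covered above. The node names, the `REPLICAS` value and both functions are hypothetical, and the simple hash-and-modulo placement stands in for Cassandra's actual partitioner.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical node names
REPLICAS = 3  # assumed number of copies kept for each partition


def replicas_for(partition_key, nodes=NODES, replicas=REPLICAS):
    """Hash the partition key to a starting node, then take the next nodes in order."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(nodes)
    count = min(replicas, len(nodes))
    return [nodes[(start + i) % len(nodes)] for i in range(count)]


def read(partition_key, down=frozenset()):
    """Return the first live node holding the partition, or None if all copies are down."""
    for node in replicas_for(partition_key):
        if node not in down:
            return node
    return None


owners = replicas_for("channel-1")
print(owners)                               # the nodes that hold this partition
print(read("channel-1"))                    # served by the first owner
print(read("channel-1", down={owners[0]}))  # still served after one node fails
```

Because no node plays a special role, losing any single one leaves other copies of the partition available. The read only fails if every node holding that partition is down at once.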

Who uses Cassandra?

Several major organizations worldwide, such as Apple, Netflix, Rackspace, SoundCloud and Uber, use Cassandra for their services.

