Difficulty: Intermediate
Estimated Time: 15 minutes

In this scenario, you will:

  • Understand how Apache Cassandra™ performs read-repairs on inconsistent data.

Consistency is a hard problem for distributed systems. Because distributed systems trade off consistency for performance, some of the nodes in a cluster may end up holding inconsistent data. When Cassandra notices these inconsistencies during a read, it takes steps to resolve them. This resolution is the role of read-repair.


Step 1 of 6

We've already started a three-node cluster for you and loaded some data into the videos_by_tag table.

We are going to bring down one of the nodes that holds a replica of the cassandra tag. As a review, the following command tells you which nodes these are:

ccm node1 nodetool getendpoints killrvideo videos_by_tag 'cassandra'

Be sure you make note of which nodes these are.
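
getendpoints prints IP addresses, while ccm commands take node names. The following is a minimal sketch for translating between the two. It assumes the default ccm addressing, in which nodeN listens on 127.0.0.N; check this against your own cluster. ip_to_node is a hypothetical helper written for this example, not part of ccm or nodetool.

```shell
# Hypothetical helper: map a loopback IP printed by getendpoints to a ccm node name.
# Assumes the default ccm addressing, where nodeN is bound to 127.0.0.N.
ip_to_node() {
  # Strip everything up to and including the last dot, leaving the final octet.
  echo "node${1##*.}"
}

ip_to_node 127.0.0.2   # prints: node2
```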

Choose one of the nodes to bring down. Before bringing the node down, flush its data by executing the following command (substitute the name of your chosen node for node1 if you picked a different one):

ccm node1 nodetool drain

Now bring down your chosen node, the one holding a replica of the cassandra tag (substitute its name for node1 if you picked a different one):

ccm node1 nodetool stopdaemon

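Wait for the node to terminate before continuing. You can check by running the status command on any node that is still up (change node2 if that is the node you stopped); the stopped node is listed with status DN.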

ccm node2 nodetool status
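
If you would rather not read the status table by eye, the check can be sketched as a small filter. is_down is a hypothetical helper written for this example; it assumes the usual nodetool status layout, where each node line starts with the two-letter status/state code followed by the node's address.

```shell
# Hypothetical helper: succeed if the given address is listed as DN (Down/Normal)
# in `nodetool status` output read from standard input.
is_down() {
  awk -v ip="$1" '$1 == "DN" && $2 == ip { found = 1 } END { exit !found }'
}

# Example use (assumes node1 at 127.0.0.1 was stopped and node2 is still up):
# ccm node2 nodetool status | is_down 127.0.0.1 && echo "node1 is down"
```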

Keep track of which node you brought down.

In the data folder of the downed node, find the directory that contains the table data for videos_by_tag. Substitute the name of the node you stopped for node2 in the path below.

cd node2/data/data/killrvideo
ls -l

Delete the entire directory.

rm -rf videos_by_tag-*