Difficulty: Intermediate
Estimated Time: 15 minutes

In this scenario, you will:

  • Understand basic Apache Cassandra™ compaction strategies

As memtables fill up, Cassandra flushes them to disk as SSTables. If this were the end of the story, the number of SSTable files would keep growing and read performance would suffer, because a single read might have to check many files. Cassandra therefore consolidates these files from time to time. This consolidation is called compaction.
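To make the idea concrete, here is a toy Python model of the write path described above. It is only a sketch under simplifying assumptions: the `ToyNode` class, its methods, and the integer clock are invented for illustration, and real Cassandra uses bloom filters, tombstones, and configurable compaction strategies that this model ignores.

```python
from typing import Dict, List, Optional, Tuple

# A cell is (write_timestamp, value). An SSTable is modeled as a plain dict
# from key to cell. This is a teaching model, not Cassandra's on-disk format.
Cell = Tuple[int, str]
SSTable = Dict[str, Cell]


class ToyNode:
    """Hypothetical single-node model: a memtable plus a list of SSTables."""

    def __init__(self) -> None:
        self.memtable: Dict[str, Cell] = {}
        self.sstables: List[SSTable] = []
        self.clock = 0  # stand-in for a write timestamp

    def write(self, key: str, value: str) -> None:
        # Writes go to the in-memory memtable first.
        self.clock += 1
        self.memtable[key] = (self.clock, value)

    def flush(self) -> None:
        # Flushing turns the current memtable into a new SSTable.
        if self.memtable:
            self.sstables.append(dict(self.memtable))
            self.memtable = {}

    def read(self, key: str) -> Tuple[Optional[str], int]:
        # Return the newest value and how many SSTables had to be checked.
        candidates: List[Cell] = []
        if key in self.memtable:
            candidates.append(self.memtable[key])
        checked = 0
        for table in self.sstables:
            checked += 1
            if key in table:
                candidates.append(table[key])
        if not candidates:
            return None, checked
        newest = max(candidates, key=lambda cell: cell[0])
        return newest[1], checked

    def compact(self) -> None:
        # Merge all SSTables into one, keeping the newest cell per key.
        merged: SSTable = {}
        for table in self.sstables:
            for key, cell in table.items():
                if key not in merged or cell[0] > merged[key][0]:
                    merged[key] = cell
        self.sstables = [merged] if merged else []


node = ToyNode()
for title in ["v1", "v2", "v3"]:
    node.write("cassandra", title)
    node.flush()

print(node.read("cassandra"))  # ('v3', 3): three SSTables checked
node.compact()
print(node.read("cassandra"))  # ('v3', 1): one SSTable checked
```

In this model, the read returns the same value before and after compaction, but it has fewer files to check afterward.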

In this exercise, we observe the effects of compaction.

Step 1

For this scenario, you need only a single node. We've already started it for you and launched cqlsh.

Create the killrvideo keyspace and the videos_by_tag table:

CREATE KEYSPACE killrvideo WITH replication = {'class':'SimpleStrategy', 'replication_factor': 1};

CREATE TABLE killrvideo.videos_by_tag ( tag TEXT, video_id UUID, added_date TIMESTAMP, title TEXT, PRIMARY KEY ((tag), video_id) );

Now, insert a single row into the table:

INSERT INTO killrvideo.videos_by_tag (tag, added_date, video_id, title) VALUES ('cassandra', dateof(now()), uuid(), 'Cassandra Master');

Next, exit cqlsh:

exit

At the shell, use nodetool to force Cassandra to flush the memtable to an SSTable:

nodetool flush

Let's look at the SSTable in the node's data directory. Note that the table's directory name ends with a unique generated ID, which is why the command below uses a wildcard:

ls -l /usr/share/cassandra/data/killrvideo/videos_by_tag-*

You will see several files whose names start with the same prefix. Together, these files make up the first SSTable.
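If you want to see at a glance which files belong to which SSTable, a small Python helper can group them by their shared prefix. This is a rough sketch: the function name `group_sstable_files` is made up, and it assumes that each file name has the form prefix, hyphen, component, with the component being everything after the last hyphen. The exact prefix depends on your Cassandra version.

```python
import glob
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def group_sstable_files(table_dir: str) -> Dict[str, List[str]]:
    """Group the files in a table directory by shared name prefix.

    Assumption: the component name is the part after the last hyphen.
    Subdirectories and files without a hyphen are skipped.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for path in sorted(Path(table_dir).iterdir()):
        if not path.is_file():
            continue
        prefix, sep, component = path.name.rpartition("-")
        if sep:
            groups[prefix].append(component)
    return dict(groups)


# Same wildcard as the ls command above. Prints nothing if no directory matches.
for table_dir in glob.glob("/usr/share/cassandra/data/killrvideo/videos_by_tag-*"):
    for prefix, components in group_sstable_files(table_dir).items():
        print(prefix, components)
```

Each printed line corresponds to one SSTable, so right after the first flush you would expect a single line.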