Difficulty: Intermediate
Estimated Time: 25 minutes

Cassandra clusters in production must perform anti-entropy (repair) operations to maintain data consistency. Reaper is the preferred solution for anti-entropy operations in Cassandra at scale.

What are anti-entropy (repair) operations in Cassandra?
Every Cassandra cluster needs repairs to maintain data consistency. Tombstones (deletions) and temporarily unavailable nodes are common causes of data inconsistency. The Cassandra repair process compares data between replicas and updates stale replicas with the latest data. See the Cassandra docs for more details on anti-entropy repair.
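For illustration, a repair can also be triggered manually with nodetool from inside a Cassandra pod. The pod, container, and keyspace names below are assumptions based on a typical cass-operator deployment, not values from this scenario:

```shell
# Run a full repair of one keyspace from inside a Cassandra pod.
# Pod/container/keyspace names are illustrative; adjust to your cluster.
kubectl exec -it cluster-dc1-default-sts-0 -c cassandra -- \
  nodetool repair my_keyspace
```

Running nodetool by hand like this is exactly what Reaper automates: it splits the work into token subranges, schedules them, and retries failures.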

This scenario adds Reaper to a Cassandra cluster running in Kubernetes with the Cass Operator.

In this scenario, we'll:

  • Set up a Cassandra cluster running on Kubernetes
  • Install and configure Reaper
  • Install the example Pet Clinic app
  • View and manage anti-entropy operations in the Cassandra cluster

Let's get started!

In this scenario, we learned how to:

  • Install Reaper in a Kubernetes cluster
  • Configure Reaper to use Cassandra as a backing store
  • Connect Reaper to Cassandra to control repairs
  • Perform immediate and scheduled repairs in Cassandra

The setup for Reaper was relatively complex. One of the advantages of using K8ssandra to run Cassandra in Kubernetes is that it takes care of configuring the most commonly used tools for managing Cassandra in production: Reaper, Prometheus, Grafana, and Medusa.

Automated Repair - Reaper

Step 1 of 17

Create a Cassandra Cluster

In this step you will create a three-node Cassandra cluster. In the terminal you will see Katacoda setting up the environment by installing kind and Helm, creating the Kubernetes cluster, configuring the Ingress, and installing the Cassandra Operator.

While you wait for the environment, click to open cassandra-cluster.yaml in the Katacoda editor.

Reaper uses Java Management Extensions (JMX) to interact with Cassandra clusters. In this scenario, Reaper will run separately from the Cassandra cluster in its own pod, so we need to enable remote JMX access in the Cassandra cluster. The configuration file sets the LOCAL_JMX environment variable to no and puts the JMX credentials in /config/jmxremote.password.
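The JMX-related part of the manifest might look like the sketch below. The field names assume cass-operator's CassandraDatacenter resource with a podTemplateSpec override; treat the version strings and container name as assumptions to verify against the actual cassandra-cluster.yaml in the editor:

```yaml
# Sketch of the JMX settings in cassandra-cluster.yaml (field names assume
# the cass-operator CassandraDatacenter API; verify against your version).
apiVersion: cassandra.datastax.com/v1beta1
kind: CassandraDatacenter
metadata:
  name: dc1
spec:
  clusterName: cluster
  serverType: cassandra
  serverVersion: "3.11.7"   # illustrative version
  size: 3
  podTemplateSpec:
    spec:
      containers:
        - name: cassandra
          env:
            # LOCAL_JMX=no makes Cassandra listen for remote JMX
            # connections, which Reaper needs from its own pod.
            - name: LOCAL_JMX
              value: "no"
```

With LOCAL_JMX left at its default, Cassandra binds JMX to localhost only, which would block a Reaper pod running elsewhere in the cluster.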

Wait until you see that the environment has been created.

Environment Created

Start the cluster creation.

kubectl apply -f cassandra-cluster.yaml

Use the watch command with kubectl to view pod info, including status. (The watch command refreshes this info every 2 seconds.)

watch kubectl get pods

The Cassandra cluster will be fully up and running when the READY column shows 2/2 for all three cluster-dc1 pods.

Pro Tip: It may take 4-5 minutes before all three Cassandra nodes are up and running.

Click to send a Ctrl-C to stop monitoring the pod state.