Difficulty: Intermediate
Estimated Time: 16 minutes

Materialized Views (MV) are a global index. When a new MV is declared, a new table is created and distributed to the different nodes using the standard table distribution mechanisms. It’s scalable, just like normal tables. It is populated by a query running against the base table. It’s not possible to directly update a MV; it’s updated when the base table is updated.

Each Materialized View is a set of rows that correspond to rows present in the underlying, or base, table specified in the materialized view’s SELECT statement. Reads from a Materialized View are just as fast as regular reads from a table and just as scalable. But as expected, updates to a table with Materialized Views are slower than regular updates since these updates need to update both the original table and the Materialized View and ensure the consistency of both updates. However, doing those in the application without server help would have been even slower.

Some common use cases for MV are Indexing with denormalization, different sort orders, and filtering (pre-computed queries). This session also goes over examples, the CQL syntax to use, and limitations.

In this lab, you’ll create a base table and different Materialized Views for that base table. You’ll execute updates to the base table and see what happens to the view. Additionally, you’ll learn how to query the MV. Start by creating a Docker container with Scylla.

This lab is part of Scylla University. You can learn more about MV in this lesson.

Congratulations! You have completed the first Materialized Views and Indexes Lab.

To summarize, a MV is a view of a “base table”. Under the hood, it’s created as a separate table that can not be modified directly. In a Materialized View, all the components of the original primary key of the table MUST also appear in the MV’s key.

It’s possible to create a view with only some of the columns from the base table as well as use different sorting orders.

Check out other courses on Scylla University to improve your skills.

Another helpful resource is the Scylla Documentation section on MV.

ScyllaDB: Materialized Views and Indexes Lab 1

Step 1 of 2

Setting up the Environment

In this step, you'll set up the environment by creating a three-node Scylla cluster. You’ll then connect to the cluster with the CQL Shell, create a keyspace, a table, insert some data into the newly created table, and read the data. These topics are explained in more detail in this course on Scylla University.

Setting Up the Environment

Start by creating a three-node Scylla cluster using Docker. Create one node, called Node_X:

docker run --name Node_X -d scylladb/scylla:4.3.0 --overprovisioned 1 --smp 1

You can learn more about best practices for running Scylla on Docker here.

Create two more nodes, Node_Y and Node_Z, and add them to the cluster of Node_X.

The command “$(docker inspect –format='{{ .NetworkSettings.IPAddress }}’ Node_X)” translates to the IP address of Node-X:

docker run --name Node_Y -d scylladb/scylla:4.3.0 --overprovisioned 1 --smp 1 --seeds="$(docker inspect --format='{{ .NetworkSettings.IPAddress }}' Node_X)"

docker run --name Node_Z -d scylladb/scylla:4.3.0 --overprovisioned 1 --smp 1 --seeds="$(docker inspect --format='{{ .NetworkSettings.IPAddress }}' Node_X)"

Wait a minute or so and check the node status:

docker exec -it Node_X nodetool status

You’ll see that eventually, all the nodes have UN for status. U means up, and N means normal. If you get a message "nodetool: Unable to connect to Scylla API server: java.net.ConnectException: Connection refused (Connection refused)", it means you have to wait a bit more for the node to be up and responding.

Once the nodes are up, and the cluster is set, we can use the CQL shell to create a table.

Run a CQL shell:

docker exec -it Node_X cqlsh

Next, create a keyspace called architecture, with a Replication Factor of three:

CREATE KEYSPACE architecture WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'replication_factor' : 3};

Let’s look at a table of buildings: the key is the building’s unique name, and additional columns are each building’s city, the year it was built, and height in meters:

use architecture;

CREATE TABLE buildings ( name text, city text, built_year smallint, height_meters smallint, PRIMARY KEY (name) );

Now let’s insert some data:

INSERT INTO buildings (name, city, built_year, height_meters) VALUES ('Burj Khalifa', 'Dubai', 2010, 828); INSERT INTO buildings (name, city, built_year, height_meters) VALUES ('Shanghai World Financial Center', 'Shanghai', 2008, 487); INSERT INTO buildings (name, city, built_year, height_meters) VALUES ('Taipei 101', 'Taipei', 2004, 449); INSERT INTO buildings (name, city, built_year, height_meters) VALUES ('Sears Tower', 'Chicago', 1974, 442); INSERT INTO buildings (name, city, built_year, height_meters) VALUES ('World Trade Center', 'New York City', 1972, 417); INSERT INTO buildings (name, city, built_year, height_meters) VALUES ('Empire State Building', 'New York City', 1931, 381); INSERT INTO buildings (name, city, built_year, height_meters) VALUES ('Chrysler Building', 'New York City', 1930, 283);

The table now looks like this:

SELECT * FROM buildings;