Welcome!
Digital Library Data Modeling Example for Cassandra
In this scenario, you will:
- Create tables for a digital library data use case
- Populate tables with sample digital library data
- Design and execute CQL queries over digital library data
This scenario is also available on our datastax.com/dev site, where you can find many more resources to help you succeed with Apache Cassandra™.
Congratulations!
You've completed the scenario!
Scenario Rating
In this scenario, you explored:
- Schema design for a digital library use case
- Sample digital library data
- CQL queries over digital library data
Your environment is currently being packaged as a Docker container and the download will begin shortly. To run the image locally, once Docker has been installed, use the commands
cat scrapbook_datastax_cassandra-data-modeling/music-data_container.tar | docker load
docker run -it /datastax_cassandra-data-modeling/music-data:
Oops!! Sorry, it looks like this scenario doesn't currently support downloads. We'll fix that shortly.

Steps
Digital Library Data Modeling Example for Cassandra
Create a keyspace
Create the music_data
keyspace:
cqlsh -e "
CREATE KEYSPACE music_data
WITH replication = {
'class': 'NetworkTopologyStrategy',
'DC-Houston': 1 };"
Create tables
Create tables performers
, albums_by_performer
, albums_by_title
,
albums_by_genre
, tracks_by_title
, tracks_by_album
, users
and tracks_by_user
:
cqlsh -e "
USE music_data;
CREATE TABLE performers (
name TEXT,
type TEXT,
country TEXT,
born INT,
died INT,
founded INT,
PRIMARY KEY ((name))
);
CREATE TABLE albums_by_performer (
performer TEXT,
year INT,
title TEXT,
genre TEXT,
PRIMARY KEY ((performer),year,title)
) WITH CLUSTERING ORDER BY (year DESC, title ASC);
CREATE TABLE albums_by_title (
title TEXT,
year INT,
performer TEXT,
genre TEXT,
PRIMARY KEY ((title),year)
) WITH CLUSTERING ORDER BY (year DESC);
CREATE TABLE albums_by_genre (
genre TEXT,
year INT,
title TEXT,
performer TEXT,
PRIMARY KEY ((genre),year,title)
) WITH CLUSTERING ORDER BY (year DESC, title ASC);
CREATE TABLE tracks_by_title (
title TEXT,
album_year INT,
album_title TEXT,
number INT,
length INT,
genre TEXT,
PRIMARY KEY ((title),album_year,album_title,number)
) WITH CLUSTERING ORDER BY (album_year DESC, album_title ASC, number ASC);
CREATE TABLE tracks_by_album (
album_title TEXT,
album_year INT,
number INT,
title TEXT,
length INT,
genre TEXT STATIC,
PRIMARY KEY ((album_title,album_year),number)
);
CREATE TABLE users (
id UUID,
name TEXT,
PRIMARY KEY ((id))
);
CREATE TABLE tracks_by_user (
id UUID,
month DATE,
timestamp TIMESTAMP,
album_title TEXT,
album_year INT,
number INT,
title TEXT,
length INT,
PRIMARY KEY ((id,month),timestamp,album_title,album_year,number)
) WITH CLUSTERING ORDER BY (timestamp DESC, album_title ASC, album_year ASC, number ASC);"
Populate tables using DSBulk
Load data into table performers
:
dsbulk load -url performers.csv \
-k music_data \
-t performers \
-header true \
-logDir /tmp/logs
Retrieve some rows from table performers
:
cqlsh -e "SELECT * FROM music_data.performers LIMIT 10;"
Load data into tables albums_by_performer
, albums_by_title
and albums_by_genre
:
dsbulk load -url albums.csv \
-k music_data \
-t albums_by_performer \
-header true \
-logDir /tmp/logs
dsbulk load -url albums.csv \
-k music_data \
-t albums_by_title \
-header true \
-logDir /tmp/logs
dsbulk load -url albums.csv \
-k music_data \
-t albums_by_genre \
-header true \
-logDir /tmp/logs
Retrieve some rows from tables albums_by_performer
, albums_by_title
and albums_by_genre
:
cqlsh -e "SELECT * FROM music_data.albums_by_performer LIMIT 5;"
cqlsh -e "SELECT * FROM music_data.albums_by_title LIMIT 5;"
cqlsh -e "SELECT * FROM music_data.albums_by_genre LIMIT 5;"
Load data into tables tracks_by_title
and tracks_by_album
:
dsbulk load -url tracks.csv \
-k music_data \
-t tracks_by_title \
-header true \
-m "0=album_title, \
1=album_year, \
2=genre, \
3=number, \
4=title" \
-logDir /tmp/logs
dsbulk load -url tracks.csv \
-k music_data \
-t tracks_by_album \
-header true \
-m "0=album_title, \
1=album_year, \
2=genre, \
3=number, \
4=title" \
-logDir /tmp/logs
Retrieve some rows from tables tracks_by_title
and tracks_by_album
:
cqlsh -e "SELECT * FROM music_data.tracks_by_title LIMIT 5;"
cqlsh -e "SELECT * FROM music_data.tracks_by_album LIMIT 5;"
Insert rows using the CQL shell
Start the CQL shell:
cqlsh -k music_data
Insert rows into table users
:
INSERT INTO users (id, name)
VALUES (12345678-aaaa-bbbb-cccc-123456789abc, 'Joe');
INSERT INTO users (id, name)
VALUES (UUID(), 'Jen');
INSERT INTO users (id, name)
VALUES (UUID(), 'Jim');
SELECT * FROM users;
Insert rows into table tracks_by_user
:
INSERT INTO tracks_by_user (id, month, timestamp, album_title, album_year, number, title)
VALUES (12345678-aaaa-bbbb-cccc-123456789abc, '2020-01-01', '2020-01-05T11:22:33', '20 Greatest Hits', 1982, 16, 'Hey Jude');
INSERT INTO tracks_by_user (id, month, timestamp, album_title, album_year, number, title)
VALUES (12345678-aaaa-bbbb-cccc-123456789abc, '2020-09-01', '2020-09-15T09:00:00', '20 Greatest Hits', 1982, 16, 'Hey Jude');
INSERT INTO tracks_by_user (id, month, timestamp, album_title, album_year, number, title)
VALUES (12345678-aaaa-bbbb-cccc-123456789abc, '2020-09-01', '2020-09-15T16:41:10', 'Legendary Concert Performances', 1978, 6, 'Johnny B. Goode');
INSERT INTO tracks_by_user (id, month, timestamp, album_title, album_year, number, title)
VALUES (12345678-aaaa-bbbb-cccc-123456789abc, '2020-09-01', '2020-09-15T16:44:56', 'The Beatles 1967-1970', 1973, 17, 'Come Together');
INSERT INTO tracks_by_user (id, month, timestamp, album_title, album_year, number, title)
VALUES (12345678-aaaa-bbbb-cccc-123456789abc, '2020-09-01', '2020-09-15T21:13:13', 'Dark Side Of The Moon', 1973, 3, 'Time');
SELECT * FROM tracks_by_user;
Design query Q1
Find a performer with name The Beatles
:
Solution
SELECT *
FROM performers
WHERE name = 'The Beatles';
Design query Q2
Find albums of performer The Beatles
; order by year (desc):
Solution
SELECT *
FROM albums_by_performer
WHERE performer = 'The Beatles';
Design query Q3
Find an album with title Magical Mystery Tour
and year 1967
:
Solution
SELECT *
FROM albums_by_title
WHERE title = 'Magical Mystery Tour'
AND year = 1967;
Design query Q4
Find albums with title 20 Greatest Hits
; order by year (desc):
Solution
SELECT *
FROM albums_by_title
WHERE title = '20 Greatest Hits';
Design query Q5
Find albums from genre Classical
; order by year (desc):
Solution
SELECT *
FROM albums_by_genre
WHERE genre = 'Classical';
Design query Q6
Find tracks with title Let It Be
:
Solution
SELECT *
FROM tracks_by_title
WHERE title = 'Let It Be';
Design query Q7
Find tracks from album Magical Mystery Tour
of 1967
; order by track number (asc):
Solution
SELECT *
FROM tracks_by_album
WHERE album_title = 'Magical Mystery Tour'
AND album_year = 1967;
Design query Q8
Find a user with id 12345678-aaaa-bbbb-cccc-123456789abc
:
Solution
SELECT *
FROM users
WHERE id = 12345678-aaaa-bbbb-cccc-123456789abc;
Design query Q9
Find all tracks played by a user in September 2020
; order by timestamp (desc):
Solution
SELECT timestamp, album_title, album_year, number, title
FROM tracks_by_user
WHERE id = 12345678-aaaa-bbbb-cccc-123456789abc
AND month = '2020-09-01';