Difficulty: Beginner
Estimated Time: 15 minutes

In this learning unit, you will:

  • Understand DSBulk use cases
  • Use DSBulk commands load, unload and count
  • Learn about DSBulk options -url, -k, -t, -m and more
  • Explore several examples of using DSBulk

This scenario is also available on our datastax.com/dev site, where you can find many more resources to help you succeed with Apache Cassandra™.

Bulk Loading Large Datasets into Apache Cassandra™

DataStax Bulk Loader

DataStax Bulk Loader (DSBulk) is an efficient, flexible, easy-to-use and free command-line utility for Apache Cassandra™ that excels at loading, unloading and counting data. You should use DSBulk to:

  • Load data from CSV or JSON files into the database
  • Unload data stored in the database into CSV or JSON files
  • Quickly count the number of rows in a given table
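As a rough sketch of what these three operations can look like on the command line, the script below builds one load, one unload and one count command using the -url, -k and -t options mentioned in the objectives. The keyspace demo_ks, the table users, the file users.csv and the directory users_export are hypothetical names chosen for illustration, and the run helper is not part of DSBulk: it only prints each command unless DSBULK_RUN=1 is set, so the script is safe to try without a running cluster.

```shell
#!/usr/bin/env bash
# Hypothetical names for illustration only; replace with your own.
KEYSPACE="demo_ks"
TABLE="users"
CSV_FILE="users.csv"
EXPORT_DIR="users_export"

# Helper (not part of DSBulk): print the command by default,
# execute it only when DSBULK_RUN=1 is set in the environment.
run() {
  if [ "${DSBULK_RUN:-0}" = "1" ]; then
    "$@"
  else
    echo "$@"
  fi
}

# Load rows from a CSV file whose first line holds the column names.
run dsbulk load -url "$CSV_FILE" -k "$KEYSPACE" -t "$TABLE" -header true

# Unload the table into CSV files written under the given directory.
run dsbulk unload -url "$EXPORT_DIR" -k "$KEYSPACE" -t "$TABLE"

# Count the rows in the table.
run dsbulk count -k "$KEYSPACE" -t "$TABLE"
```

Running the script with DSBULK_RUN=1 assumes that dsbulk is on the PATH, that a Cassandra cluster is reachable, and that the keyspace and table already exist.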

DSBulk is a good choice for small, medium and large datasets. It moves data in and out of the database significantly faster than individual INSERT statements, the cqlsh COPY command or other community tools. For very large datasets that reside in a distributed file system, loading the data with Apache Spark™ may be a faster alternative to DSBulk.