In this learning unit, you will:
- Understand DSBulk use cases
- Use DSBulk commands
- Learn about DSBulk options
- Explore several examples of using DSBulk
This scenario is also available on our datastax.com/dev site, where you can find many more resources to help you succeed with Apache Cassandra™.
In this scenario, you learned about:
- DSBulk use cases
- DSBulk commands
- DSBulk options
- Several examples of using DSBulk
Bulk Loading Large Datasets into Apache Cassandra™
DataStax Bulk Loader
DataStax Bulk Loader (DSBulk) is an efficient, flexible, easy-to-use and free command-line utility for Apache Cassandra™ that excels at loading, unloading and counting data. You should use DSBulk to:
- Load data from CSV or JSON files into the database
- Unload data stored in the database into CSV or JSON files
- Quickly count the number of rows in a given table
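The three use cases above map onto the `load`, `unload` and `count` subcommands. The following is a minimal sketch, not taken from this scenario: the keyspace `demo_ks`, the table `users`, the file `users.csv` and the directory `users_export` are hypothetical placeholders. The `run` helper is also an assumption added for illustration. By default it only prints each command, so the sketch can be tried without DSBulk or a running cluster.

```shell
#!/bin/sh
# Hypothetical names used only for illustration.
KEYSPACE="demo_ks"
TABLE="users"

# Dry-run helper (an assumption of this sketch, not part of DSBulk):
# prints the command unless DSBULK_DRY_RUN is set to 0, in which case it runs it.
run() {
  if [ "${DSBULK_DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Load: read rows from a CSV file whose first line is a header.
run dsbulk load -url users.csv -k "$KEYSPACE" -t "$TABLE" -header true

# Unload: write the table's rows as JSON files into a directory.
run dsbulk unload -c json -url users_export -k "$KEYSPACE" -t "$TABLE"

# Count: report the number of rows in the table.
run dsbulk count -k "$KEYSPACE" -t "$TABLE"
```

Setting `DSBULK_DRY_RUN=0` makes the same script execute the commands, which assumes that `dsbulk` is on the `PATH`, the cluster is reachable with default connection settings, and the target keyspace and table already exist.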
DSBulk is a good choice for small, medium and large datasets. It gets data in and out of the database significantly faster than the cqlsh COPY command or other community tools. Only for very large datasets that reside in a distributed file system might data loading with Apache Spark™ be a potentially faster alternative to DSBulk.