Difficulty: beginner
Estimated Time: 15-20 minutes

Unpacking a WARC with Warcat

In this tutorial, you will learn how to extract the web resources stored in a WARC (Web ARChive) file and save them as individual files.

We will use Warcat, a Python package that provides tools for managing WARC files.


Basic familiarity with the UNIX command line is recommended.


v0.4 Created by The UK Web Archive.

Step 1 - Install Warcat

First, we need to install Warcat.


You can install Warcat with the following command. As with every command in this tutorial, you can click it to run it in the Terminal view on the right:

pip3 install Warcat

Check it's working

Once it's installed, you can check that it works by displaying its command-line options:

python3 -m warcat -h
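
If the help command above fails, it can be useful to confirm whether Python can locate the package at all. The following is a minimal sketch of such a check using only the standard library. The helper name is_installed is made up for this illustration and is not part of Warcat; the sketch also assumes the importable module is named warcat, matching the name passed to python3 -m above.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if Python can locate the named top-level module."""
    # find_spec returns None when no top-level module of that name is found.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Assumption: "warcat" is the module name, as used by `python3 -m warcat`.
    if is_installed("warcat"):
        print("warcat is importable")
    else:
        print("warcat was not found; try running the pip3 install command again")
```

Save this to a file and run it with the same python3 interpreter you used for the install. If it reports that the module was not found, pip3 may have installed the package for a different Python installation.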