Difficulty: Beginner
Estimated Time: 25 minutes

Welcome to the lakeFS Playground!

In this tutorial we’ll work with a sample dataset to give you a sense of how lakeFS makes it easy to work with data.

We will learn how to work with lakeFS through lakectl and run basic commands to manage and explore repositories, commits and files.

As part of this exercise you will also create a new branch, run a Spark job, review the diff and merge the branch.

For more information about lakeFS or lakectl, see the lakeFS docs.

If you have questions along the way, don’t hesitate to ask on the Slack Channel.

Getting started with lakeFS

Step 1 of 5

lakectl and repositories

Once lakeFS has finished initializing, a ready-to-use environment will be available for you to play with.

lakectl is a CLI tool for exploring and manipulating a lakeFS environment.

To see the available commands, run: lakectl --help

Repository commands

In lakeFS, a repository is a logical namespace used to group together objects, branches and commits. It is the equivalent of a bucket in S3 or a repository in Git.
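
Because a repository is a namespace, its name is the first component of every lakeFS URI, which takes the form lakefs://<repository>/<ref>/<path>. The sketch below splits such a URI into its parts using plain shell parameter expansion. The function name parse_lakefs_uri and the object path data/file.csv are hypothetical and used only for illustration; nothing here calls lakeFS itself.

```shell
#!/bin/sh
# Illustrative only: split a URI of the form lakefs://<repository>/<ref>/<path>.
# parse_lakefs_uri is a hypothetical helper, not part of lakectl.
parse_lakefs_uri() {
    uri=$1
    case $uri in
        lakefs://*) rest=${uri#lakefs://} ;;
        *) echo "not a lakefs:// URI: $uri" >&2; return 1 ;;
    esac

    # Everything up to the first slash is the repository name.
    repo=${rest%%/*}
    case $rest in
        */*) rest=${rest#*/} ;;
        *)   rest= ;;
    esac

    # The next component is the ref (for example a branch name).
    ref=${rest%%/*}
    case $rest in
        */*) path=${rest#*/} ;;
        *)   path= ;;
    esac

    printf 'repository=%s\nref=%s\npath=%s\n' "$repo" "$ref" "$path"
}

# A repository-only URI, as passed to "lakectl repo create".
parse_lakefs_uri lakefs://my-repo

# A full URI pointing at a (hypothetical) object on the master branch.
parse_lakefs_uri lakefs://my-repo/master/data/file.csv
```

A URI that names only the repository leaves the ref and path empty, which is the form used when creating a repository.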

Let's start by creating a repository.

We will name the repository my-repo:
lakectl repo create lakefs://my-repo local://storage-location

The output confirms that the repository was created and that its default branch is master.

Now let's list your repositories: lakectl repo list

To see all available repository commands use --help: lakectl repo --help
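
The repository commands from this step can be collected into one small script. The sketch below is a minimal example under a few assumptions: the run helper and the DRY_RUN variable are hypothetical conveniences, not lakectl features. When lakectl is not on the PATH, or DRY_RUN is set to 1, each command is printed instead of executed.

```shell
#!/bin/sh
# Illustrative only: run the repository commands from this step in order.
# "run" and DRY_RUN are hypothetical helpers, not part of lakectl.
run() {
    if [ "${DRY_RUN:-0}" = "1" ] || ! command -v "$1" >/dev/null 2>&1; then
        # Print the command instead of executing it.
        echo "[dry-run] $*"
    else
        "$@"
    fi
}

# Create the repository, then list repositories and show the repo help.
run lakectl repo create lakefs://my-repo local://storage-location
run lakectl repo list
run lakectl repo --help
```

Setting DRY_RUN=1 before running the script is a simple way to preview the commands without touching the environment.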