Welcome to the lakeFS Playground!
In this tutorial we’ll work with a sample dataset to give you a sense of how lakeFS makes it easy to work with data.
You will learn to use lakeFS through lakectl and to run basic commands that manage and explore repositories, commits and files.
As part of this exercise you will also create a new branch, run a Spark job, review the resulting diff and merge the branch.
For more information about lakeFS or lakectl, go to the lakeFS docs.
If you have questions along the way, don’t hesitate to ask on the Slack channel.
Getting started with lakeFS
lakectl and repositories
Once lakeFS has finished initializing, you will have a ready environment to play with.
lakectl is a CLI tool for exploring and manipulating a lakeFS environment.
To see the available commands, run lakectl with --help:
lakectl --help
In lakeFS, a repository is a logical namespace used to group together objects, branches and commits. It is the equivalent of a bucket in S3 or a repository in Git.
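To make the namespace idea concrete: lakeFS addresses take the form lakefs://<repository>/<ref>/<path>, where the ref is typically a branch name or a commit. The sketch below splits such an address into its parts. The function parse_lakefs_uri and the address used here are hypothetical and are not part of lakectl; this is only an illustration of how a repository scopes branches and objects.

```shell
# Hypothetical helper (not part of lakectl): split a lakeFS address
# of the form lakefs://<repository>/<ref>/<path> into its parts.
parse_lakefs_uri() {
  uri=$1
  case "$uri" in
    lakefs://*) ;;
    *) echo "not a lakeFS URI: $uri" >&2; return 1 ;;
  esac
  rest=${uri#lakefs://}        # drop the scheme
  repo=${rest%%/*}             # everything up to the first slash
  rest=${rest#"$repo"}; rest=${rest#/}
  ref=${rest%%/*}              # branch name or commit, may be empty
  path=${rest#"$ref"}; path=${path#/}
  echo "repository=$repo ref=$ref path=$path"
}

# Example address; the repository, branch and path are made up.
parse_lakefs_uri "lakefs://example-repo/master/data/file.csv"
# prints: repository=example-repo ref=master path=data/file.csv
```

An address with only a repository part, such as the one passed to lakectl repo create, leaves ref and path empty.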
Let's start by creating a repository.
We will name the repository my-repo:
lakectl repo create lakefs://my-repo local://storage-location
The output shows that the repository was created and that its default branch is master.
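Note that the repository name uses a hyphen rather than an underscore. The sketch below checks a candidate name before you pass it to lakectl. It assumes the naming rule is 3 to 63 characters, made up of lowercase letters, digits and hyphens, and starting with a letter or digit; verify this against the lakeFS docs for your version. The function is_valid_repo_name is a hypothetical helper, not a lakectl command.

```shell
# Hypothetical helper (not part of lakectl).
# Assumed rule: 3 to 63 characters, lowercase letters, digits and
# hyphens only, starting with a letter or digit.
is_valid_repo_name() {
  printf '%s\n' "$1" | grep -Eq '^[a-z0-9][a-z0-9-]{2,62}$'
}

for name in my-repo my_repo; do
  if is_valid_repo_name "$name"; then
    echo "$name: ok"
  else
    echo "$name: rejected"
  fi
done
# prints:
# my-repo: ok
# my_repo: rejected
```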
Now let's list your repositories:
lakectl repo list
To see all available repository commands, use --help:
lakectl repo --help