Difficulty: Beginner
Estimated Time: 10 minutes

This example demonstrates how to use Kubeflow end to end to train and serve a Sequence-to-Sequence model on an existing Kubernetes cluster. The tutorial is based on the article "How To Create Data Products That Are Magical Using Sequence-to-Sequence Models".

You can find more details at https://github.com/kubeflow/examples/tree/master/github_issue_summarization

In this scenario you learned how to deploy different styles of ML workloads using Kubernetes and Kubeflow.

The aim of Kubeflow is to provide a set of simple manifests that give you an easy-to-use ML stack anywhere Kubernetes is already running, one that can configure itself based on the cluster it is deployed into.

More details can be found at https://github.com/kubeflow/kubeflow


Deploying GitHub Issue Summarization with Kubeflow

Step 1 of 4

Deploying Kubeflow

Because Kubeflow is an extension to Kubernetes, all of its components need to be deployed to the platform.

The Kubeflow team provides an installation script that uses Ksonnet to deploy Kubeflow to an existing Kubernetes cluster. Ksonnet requires a valid GitHub token. The token below can be used within Katacoda. Run the command to set the required environment variable.

export GITHUB_TOKEN=99510f2ccf40e496d1e97dbec9f31cb16770b884

With the token set, you can run the installation script:

export KUBEFLOW_VERSION=0.2.5
curl https://raw.githubusercontent.com/kubeflow/kubeflow/v${KUBEFLOW_VERSION}/scripts/deploy.sh | bash
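If you would rather read the script before it runs, one option is to download it to a file instead of piping it straight into bash. The sketch below assumes the same URL pattern as the command above; the helper names kubeflow_deploy_url and fetch_deploy_script are hypothetical and are not part of Kubeflow.

```shell
# Sketch only: the function names below are illustrative, not Kubeflow tooling.
KUBEFLOW_VERSION="${KUBEFLOW_VERSION:-0.2.5}"

# Print the deploy.sh URL for a given Kubeflow version (same pattern as above).
kubeflow_deploy_url() {
  printf 'https://raw.githubusercontent.com/kubeflow/kubeflow/v%s/scripts/deploy.sh\n' "$1"
}

# Download the script for version $1 into file $2 without executing it.
# -f fails on HTTP errors, -sS hides progress but shows errors, -L follows redirects.
fetch_deploy_script() {
  curl -fsSL "$(kubeflow_deploy_url "$1")" -o "$2"
}

# Example usage (commented out so nothing is downloaded or run by accident):
# fetch_deploy_script "$KUBEFLOW_VERSION" deploy.sh
# less deploy.sh
# bash deploy.sh
```

Running the downloaded file with bash should behave the same as the piped form, with the difference that you get a chance to inspect it first.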

You should see the Kubeflow pods starting.

kubectl get pods
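The pods can take a while to start. As a rough sketch, the check above could be repeated until every pod reports a Running or Completed status. The helper names all_pods_running and wait_for_pods are hypothetical, and the sketch assumes the default kubectl get pods column layout, where STATUS is the third column. Note that a Running status does not guarantee every container in the pod is ready.

```shell
# Sketch only: helper names are illustrative.
# Reads `kubectl get pods --no-headers` output on stdin and succeeds only if
# at least one pod is listed and every STATUS (column 3) is Running or Completed.
all_pods_running() {
  awk '
    { seen = 1 }
    $3 != "Running" && $3 != "Completed" { bad = 1 }
    END { exit ((seen && !bad) ? 0 : 1) }
  '
}

# Poll up to $1 times, 10 seconds apart, until all pods are running.
wait_for_pods() {
  tries="$1"
  while [ "$tries" -gt 0 ]; do
    if kubectl get pods --no-headers | all_pods_running; then
      return 0
    fi
    tries=$((tries - 1))
    sleep 10
  done
  return 1
}

# Example usage (commented out; requires access to a cluster):
# wait_for_pods 30 && echo "Kubeflow pods are running"
```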

Create Persistent Volume and Services for Katacoda

To ensure Kubeflow runs successfully on Katacoda, deploy the following extensions.

kubectl apply -f ~/kubeflow/katacoda.yaml

This creates the LoadBalancer and Persistent Volume required by Kubeflow. The resources you need will vary based on your environment.
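To confirm that these resources exist, you could list the persistent volumes and look for a service of type LoadBalancer. This is a minimal sketch: the helper names has_loadbalancer and check_katacoda_extras are hypothetical, and it assumes the default kubectl get svc column layout, where TYPE is the second column.

```shell
# Sketch only: helper names are illustrative.
# Succeeds if the `kubectl get svc --no-headers` output on stdin lists
# at least one service whose TYPE (column 2) is LoadBalancer.
has_loadbalancer() {
  awk '$2 == "LoadBalancer" { found = 1 } END { exit (found ? 0 : 1) }'
}

# List persistent volumes, then check for a LoadBalancer service
# in the current namespace.
check_katacoda_extras() {
  kubectl get pv
  kubectl get svc --no-headers | has_loadbalancer
}

# Example usage (commented out; requires access to a cluster):
# check_katacoda_extras && echo "LoadBalancer service found"
```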