Difficulty: intermediate
Estimated Time: 17 minutes
  • Scenario Contents
  • Educational Objective
  • What To Expect
  • What You Need To Know Before You Start
  • Understanding Why Kubernetes Horizontal Pod Autoscaler is Important

Scenario Contents

  • Step 1: Installing the Kubernetes Metrics Server
  • Step 2: Creating the Single Pod Deployment to be Autoscaled
  • Step 3: Stressing the Pod in the Deployment
  • Step 4: Applying Horizontal Pod Autoscaling to Alleviate CPU Stress
  • Step 5: Viewing the HPA in Action

Educational Objective

The purpose of this scenario is to help the learner understand the nature and use of the Kubernetes Horizontal Pod Autoscaler, which is also known by the shorthand term K8S HPA.

What To Expect

After taking this scenario you will:

  • Understand how K8S HPA ensures that your application or service always has the number of pods necessary to operate at the best efficiency possible.
  • Be able to create a simple autoscaling deployment using K8S HPA imperatively at the command line.
  • Be able to view the state of your deployment running under K8S HPA using standard kubectl commands.
  • Impose overwhelming load on a pod and then witness K8S HPA reduce the load burden by adding more pods to your Kubernetes cluster

What You Need To Know Before You Start

In order to get full benefit from this scenario, we expect that you have some introductory familiarity with Kubernetes. You need to have a working knowledge of clusters, pods, deployments and services. Also, you should be comfortable working with the kubectl command.

We'll show you exactly the commands you will be using, but having a conceptual understanding of the commands and objects you're working with will help you learn in a more meaningful way, well beyond simple rote entry of commands in a terminal window.

Understanding Why Kubernetes Horizontal Pod Autoscaler is Important

Kubernetes Horizontal Pod Autoscaler (HPA) addresses a basic problem in distributed, container-based architecture: scaling the computing environment up or down to meet load demand. When you apply HPA to a Kubernetes Deployment or ReplicaSet, HPA monitors the CPU utilization of the pods in force. When a particular pod starts to approach a usage threshold, HPA creates additional pods to alleviate the load burden of the pod(s) nearing the utilization limit.
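
For example, assuming a deployment named web-app already exists (the name and thresholds here are illustrative, not part of this scenario), you could attach an HPA to it imperatively like so:

kubectl autoscale deployment web-app --cpu-percent=50 --min=1 --max=10

This tells Kubernetes to keep average CPU utilization across the deployment's pods at or below 50%, scaling out to as many as 10 pods under load and back down to a single pod when the load subsides.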

The autoscaling capabilities of Kubernetes Horizontal Pod Autoscaler safeguard pods against runtime performance degradation due to capacity overloads.

However, please be advised that HPA will only create pods on the existing nodes of a cluster. It does not have the capability to create new nodes when the cluster becomes overloaded. To apply node autoscaling to the cluster, you need to use a tool such as Cluster Autoscaler.
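
Once an HPA is in place, you can inspect its state at any time with a standard kubectl command:

kubectl get hpa

The output lists each autoscaler along with its target and current utilization, its minimum and maximum pod counts, and the current number of replicas.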

You've crossed the finish line!

In this scenario you learned how to:

  • Install the Kubernetes Metrics Server so that we could monitor the behavior of pods running in a Kubernetes cluster
  • Create a single pod deployment of a web server application intended to be autoscaled by HPA
  • Install a test container from which to call the web server application from within the Kubernetes cluster
  • Apply HPA to monitor and scale up the number of pods as the first pod exceeded resource allocation
  • Monitor pods in order to observe HPA scale up more pods to meet usage demand

We hope you found this scenario informative, engaging and fun. We have more scenarios in the works that cover many different aspects of Kubernetes. We hope you return to take more of our offerings soon!

To learn more, visit the Horizontal Pod Autoscaling topic on the Kubernetes website.

Using Kubernetes Horizontal Pod Autoscaler

Step 1 of 5

Step 1 - Installing the Kubernetes Metrics Server

Time to complete step: 3 Minutes


IMPORTANT: You need to do the steps in sequence in order for the state of the lesson's learning environment to be consistent. Otherwise, you might see confusing behavior.


In order to perform all the activities required in this step, you need to know how to use the vi text editor to add text to a file.

The way that the Horizontal Pod Autoscaler (HPA) works is that it monitors utilization among all the pods belonging to a particular deployment. If a pod's utilization starts to exceed its allocation, HPA will spin up a new pod, provided that a node on the cluster has the space to accommodate an additional pod.

But in order for HPA to work, a metrics controller needs to be installed that reports pod utilization activity to HPA. In this scenario, the controller we will use is the Kubernetes Metrics Server.
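
Under the hood, HPA reads these utilization figures from the Kubernetes resource metrics API. Once the Metrics Server is installed, you can query that API directly to see the raw data HPA consumes; for example:

kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes

This returns a JSON listing of the current CPU and memory usage of each node in the cluster.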

The installation process has four steps:

  • Get the metrics server code from GitHub
  • Add a setting in the yaml manifest file, deploy/1.8+/metrics-server-deployment.yaml, to allow the metrics server to work in the scenario's interactive computing environment
  • Apply the manifest files to the Kubernetes cluster running in the scenario to install the metrics server
  • Verify the metrics server is running

The details are as follows.

Get the metrics server code from GitHub

git clone https://github.com/kubernetes-incubator/metrics-server.git

Add a setting in the yaml manifest file

After you've cloned the metrics server code from GitHub, change to the metrics server directory, like so:

cd metrics-server/

We need to add some information to the manifest yaml file, deploy/1.8+/metrics-server-deployment.yaml, in order to have the Metrics Server work properly in this interactive learning environment. We're going to open the yaml file in the terminal window using the vi text editor, make the necessary addition, and then save the file.

Open the yaml file, deploy/1.8+/metrics-server-deployment.yaml in vi using the following command:

vi deploy/1.8+/metrics-server-deployment.yaml

You should see this within the yaml file:

  containers:
  - name: metrics-server
    image: k8s.gcr.io/metrics-server-amd64:v0.3.1
    imagePullPolicy: Always
    volumeMounts:
    - name: tmp-dir
      mountPath: /tmp

Put the vi editor into insert mode and add the text between the comments below:

  containers:
  - name: metrics-server
    image: k8s.gcr.io/metrics-server-amd64:v0.3.1
    imagePullPolicy: Always
    #add text starting here...
    command:
    - /metrics-server
    - --metric-resolution=30s
    - --kubelet-insecure-tls
    - --kubelet-preferred-address-types=InternalIP
    #... ending here
    volumeMounts:
    - name: tmp-dir
      mountPath: /tmp

Save the contents of the yaml file and exit vi (press Esc, then type :wq and press Enter). The flags you added tell the metrics server to poll for new metrics every 30 seconds, to skip TLS verification when contacting each node's kubelet, and to reach each kubelet by its internal IP address, all of which are needed for the metrics server to run in this learning environment.

Apply the manifest files

Now we need to actually install the metrics server. Execute the command below.

kubectl apply -f deploy/1.8+/

Verify the metrics server is running

Execute this kubectl command to verify that the metrics server is installed:

kubectl get po -n kube-system |grep metrics
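
If the installation succeeded, the output should include a metrics-server pod in the Running state, similar to this (your pod's hash suffix and age will differ):

metrics-server-7dfcc96bd9-txz92   1/1     Running   0          45s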

Wait 60 seconds for the metrics server to warm up, then type this command to ensure that the metrics server is working:

kubectl top pod --all-namespaces

You should get some metrics info similar to, but not exactly like, the following (the columns are namespace, pod name, CPU usage in millicores, and memory usage):

kube-system   coredns-78fcdf6894-n48vg          2m           10Mi
kube-system   coredns-78fcdf6894-rkbgg          2m           9Mi
kube-system   etcd-master                       14m          85Mi
kube-system   kube-apiserver-master             27m          407Mi
kube-system   kube-controller-manager-master    21m          59Mi
kube-system   kube-proxy-bq9hs                  2m           19Mi
kube-system   kube-proxy-c7qnk                  2m           17Mi
kube-system   kube-scheduler-master             7m           14Mi
kube-system   metrics-server-7dfcc96bd9-txz92   2m           14Mi
kube-system   weave-net-5x6ns                   1m           57Mi
kube-system   weave-net-b9cm6                   1m           52Mi

The next step is to install an application that we'll use to drive up CPU utilization, creating the overload that HPA will remedy.