- Scenario Contents
- Educational Objective
- What To Expect
- What You Need To Know Before You Start
- Understanding Why Kubernetes Horizontal Pod Autoscaler is Important
- Step 1: Installing the Kubernetes Metrics Server
- Step 2: Creating the Single Pod Deployment to be Autoscaled
- Step 3: Stressing the Pod in the Deployment
- Step 4: Applying Horizontal Pod Autoscaling to Alleviate CPU Stress
- Step 5: Viewing the HPA in Action
The purpose of this scenario is to help the learner understand the nature and use
of the Kubernetes Horizontal Pod Autoscaler, which is also known by the acronym HPA.
What To Expect
After taking this scenario you will:
- Understand how K8S HPA ensures that your application or service always has the number of pods necessary to operate at the best efficiency possible.
- Be able to create a simple autoscaling deployment using K8S HPA imperatively at the command line.
- Be able to view the state of your deployment running under K8S HPA using standard kubectl commands.
- Impose overwhelming load on a pod and then witness K8S HPA reduce the load burden by adding more pods to your Kubernetes cluster.
What You Need To Know Before You Start
In order to get full benefit from this scenario, we expect that you have some introductory familiarity
with Kubernetes. You need to have a working knowledge of clusters, pods,
deployments and services. Also, you should be comfortable working at the command line.
We'll show you exactly the commands you will be using, but having a conceptual understanding of the commands and objects you're working with will help you learn in a more meaningful way, well beyond simple rote entry of commands in a terminal window.
Understanding Why Kubernetes Horizontal Pod Autoscaler is Important
Kubernetes Horizontal Pod Autoscaler (HPA) addresses a basic problem in distributed, container-based architecture: scaling the computing environment up or down to meet load demand. When you apply HPA to a Kubernetes Deployment or ReplicaSet, HPA monitors the CPU utilization of the pods in force. When a particular pod starts to approach a usage threshold, HPA creates additional pods to alleviate the load burden of the pod(s) nearing the utilization limit.
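To make this concrete, a declarative HPA object looks roughly like the sketch below. The names and thresholds here are illustrative only; they are not part of this scenario's environment.

```yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa               # illustrative name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web                 # hypothetical deployment to autoscale
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50
```

Roughly the same object can also be created imperatively, e.g. `kubectl autoscale deployment web --cpu-percent=50 --min=1 --max=10`, which is the command-line style this scenario uses.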
The autoscaling capabilities of Kubernetes Horizontal Pod Autoscaler safeguard pods against runtime performance degradation due to capacity overloads.
However, please be advised that HPA will only create pods in existing nodes of a cluster. It does not have the capability to create new nodes when the cluster becomes overloaded. To apply node autoscaling to the cluster, you need to use a tool such as Cluster Autoscaler.
You've crossed the finish line!
In this scenario you learned how to:
- Install the Kubernetes Metrics Server so that we could monitor the behavior of pods running in a Kubernetes cluster
- Create a single pod deployment of a web server application intended to be autoscaled by HPA
- Install a test container from which to call the web server application from within the Kubernetes cluster
- Apply HPA to monitor and scale up the number of pods as the first pod exceeded resource allocation
- Monitor pods in order to observe HPA scale up more pods to meet usage demand
We hope you found this scenario informative, engaging and fun. We have more scenarios in the works that cover many different aspects of Kubernetes. We hope you return to make more of our offerings soon!
To learn more about Kubernetes horizontal autoscaling, visit the Horizontal Pod Autoscaler topic in the Kubernetes documentation.
Using Kubernetes Horizontal Pod Autoscaler
Step 1 - Installing the Kubernetes Metrics Server
This preview video shows you exactly what you're going to do in this step.
Time to complete step: 3 Minutes
IMPORTANT: You need to do the steps in sequence in order for the state of the lesson's learning environment to be consistent. Otherwise, you'll get behaviors that might be confusing.
In order to perform all the activities required in this step, you need to know how to work with the vi text editor
to add data to a text file.
The way that the Horizontal Pod Autoscaler (HPA) works is that it monitors the utilization among all the pods relevant to a particular deployment. If a pod's utilization starts to exceed its allocation, HPA will spin up a new pod, provided that a node on the cluster has the space to accommodate an additional pod.
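The scaling decision HPA makes can be approximated by the rule described in the Kubernetes documentation: desiredReplicas = ceil(currentReplicas × currentUtilization / targetUtilization). A minimal sketch of that arithmetic (the function name and example numbers are ours, not part of the scenario):

```python
import math

def desired_replicas(current_replicas: int,
                     current_utilization: float,
                     target_utilization: float) -> int:
    """Approximate the HPA scaling rule:
    desired = ceil(current * currentUtilization / targetUtilization)."""
    return math.ceil(current_replicas * current_utilization / target_utilization)

# One pod running at 200% of its CPU request with a 50% target:
# HPA scales the deployment to 4 pods.
print(desired_replicas(1, 200, 50))   # -> 4

# Utilization of 40% across 4 pods against a 50% target:
# ceil(4 * 40 / 50) = ceil(3.2) = 4, so no scale-down yet.
print(desired_replicas(4, 40, 50))    # -> 4
```

The real controller adds tolerances and cooldown windows on top of this formula, but the core proportion is the same.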
But in order for this to work, there needs to be a metrics controller installed that reports pod utilization activity to HPA. In this scenario, the metrics controller we'll use is the Kubernetes Metrics Server, which we need to install.
The installation is a four-step process:
- Get the Metrics Server code from GitHub
- Add a setting in the yaml manifest file, deploy/1.8+/metrics-server-deployment.yaml, to allow the Metrics Server to work in the scenario's interactive computing environment
- Apply the manifest files to the Kubernetes cluster running in the scenario to install the Metrics Server
- Verify the Metrics Server is running
The details are as follows.
Get the metrics server code from GitHub
git clone https://github.com/kubernetes-incubator/metrics-server.git
Add a setting in the yaml manifest file
After you've cloned the Metrics Server code from GitHub, change to the metrics-server directory, like so:
cd metrics-server
We need to add some information to the manifest yaml file,
deploy/1.8+/metrics-server-deployment.yaml, in order to have the
Metrics Server work properly in this interactive learning environment. We're going to open the yaml file in the terminal
window using the vi text editor. Then, we're going to make the necessary addition and finally save the file.
Open the yaml file in vi using the following command:
vi deploy/1.8+/metrics-server-deployment.yaml
You should see this within the yaml file:
containers:
- name: metrics-server
  image: k8s.gcr.io/metrics-server-amd64:v0.3.1
  imagePullPolicy: Always
  volumeMounts:
  - name: tmp-dir
    mountPath: /tmp
Put the vi editor into insert mode and add the text between the comments below:
containers:
- name: metrics-server
  image: k8s.gcr.io/metrics-server-amd64:v0.3.1
  imagePullPolicy: Always
  #add text starting here...
  command:
  - /metrics-server
  - --metric-resolution=30s
  - --kubelet-insecure-tls
  - --kubelet-preferred-address-types=InternalIP
  #... ending here
  volumeMounts:
  - name: tmp-dir
    mountPath: /tmp
Save the contents of the yaml file.
Apply the manifest files
Now we need to actually install the metrics server. Execute the command below.
kubectl apply -f deploy/1.8+/
Verify the metrics server is running
Use the following kubectl command to verify the metrics server is installed:
kubectl get po -n kube-system | grep metrics
Wait 60 seconds for the metrics server to warm up, then type this command to ensure that the metrics server is working:
kubectl top pod --all-namespaces
You should get some metrics info, similar to, but not exactly like this:
kube-system   coredns-78fcdf6894-n48vg         2m    10Mi
kube-system   coredns-78fcdf6894-rkbgg         2m    9Mi
kube-system   etcd-master                      14m   85Mi
kube-system   kube-apiserver-master            27m   407Mi
kube-system   kube-controller-manager-master   21m   59Mi
kube-system   kube-proxy-bq9hs                 2m    19Mi
kube-system   kube-proxy-c7qnk                 2m    17Mi
kube-system   kube-scheduler-master            7m    14Mi
kube-system   metrics-server-7dfcc96bd9-txz92  2m    14Mi
kube-system   weave-net-5x6ns                  1m    57Mi
kube-system   weave-net-b9cm6                  1m    52Mi
The next step is to install an application that we'll use to drive up CPU utilization, creating the overload that HPA will remedy.