EKS: Horizontal Pod Auto-scaler (HPA)

An efficient application architecture is one that scales on its own, adding resources as load increases so that it keeps up with traffic. Kubernetes can do this at the pod level as well as at the node level; in this article we walk through the pod level, the way it is set up in production. So let's get started.

We will work on an existing EKS cluster with three worker nodes:

[linuxadvise@linuxadvise .kube]$ kubectl get nodes
NAME                                        STATUS   ROLES    AGE   VERSION
ip-10-0-0-212.ap-south-1.compute.internal   Ready    <none>   42m   v1.16.12-eks-904af05
ip-10-0-1-42.ap-south-1.compute.internal    Ready    <none>   42m   v1.16.12-eks-904af05
ip-10-0-1-77.ap-south-1.compute.internal    Ready    <none>   42m   v1.16.12-eks-904af05


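  • In production, the application must never go down, and it must scale automatically to meet growing demand for resources.

There are two ways to auto-scale in EKS: the Horizontal Pod Auto-scaler, which adds or removes pods, and the Cluster Autoscaler, which adds or removes worker nodes. This article covers the first.

Horizontal Pod Auto-scaling (HPA)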

The Horizontal Pod Auto-scaler automatically scales the number of pods in a replication controller, deployment, replica set or stateful set based on observed CPU utilization (or, with custom metrics support, on some other application-provided metrics). Note that Horizontal Pod Auto-scaling does not apply to objects that can't be scaled, for example, DaemonSets.
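To make the scaling decision concrete, here is a minimal sketch of the replica calculation described in the Kubernetes documentation: desired replicas = ceil(current replicas × current utilization / target utilization). The function name and the sample numbers are made up for illustration, and the real controller also applies a tolerance and the min/max bounds, which this sketch leaves out.

```shell
# Replica count the HPA aims for:
#   desired = ceil(current_replicas * current_utilization / target_utilization)
# Integer ceiling division is used because shell arithmetic has no floats.
desired_replicas() {
  current_replicas=$1
  current_util=$2   # observed average CPU utilization, in percent of the CPU request
  target_util=$3    # target average CPU utilization, in percent of the CPU request
  echo $(( (current_replicas * current_util + target_util - 1) / target_util ))
}

desired_replicas 1 250 50   # one pod at 250% of its request, target 50% -> prints 5
desired_replicas 4 30 50    # four pods at 30%, target 50% -> prints 3
```

So a single pod running at five times the target is scaled out to five pods, and four mostly idle pods are scaled in to three.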

  • Now we will see HPA in action: we will put load on an application in our cluster and watch how HPA responds.

  • For this we need the metrics server, which gathers resource metrics from the cluster.

  • Install the metrics server by applying its release manifest. It is deployed in the kube-system namespace.

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml
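Before moving on, it helps to confirm that the metrics server is actually running. The sketch below assumes the manifest created a Deployment named metrics-server in the kube-system namespace; the function name is our own. Node metrics can take a minute or so to appear after the rollout finishes, so a first failed attempt is not necessarily a problem.

```shell
# Wait for the metrics-server Deployment to finish rolling out,
# then ask it for node metrics as a quick sanity check.
check_metrics_server() {
  kubectl rollout status deployment/metrics-server -n kube-system --timeout=120s &&
    kubectl top nodes
}

# Usage: check_metrics_server
```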

  • Now we are going to deploy a PHP application that does CPU-intensive work on every request, so that traffic to it drives up CPU usage. We request 200m of CPU for its pod and expose it on port 80.

 kubectl run php-apache --image=k8s.gcr.io/hpa-example --requests=cpu=200m --expose --port=80
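The kubectl run form above creates a Deployment only with older kubectl clients; from kubectl v1.18 onward, kubectl run creates a bare Pod, which HPA cannot scale. As a version-independent alternative, the same application can be described in a manifest. This is a sketch, not the author's setup: the file name php-apache.yaml and the run: php-apache label are our own choices, while the image, port and CPU request come from the command above.

```shell
# Write a Deployment and a Service equivalent to the kubectl run command above.
cat > php-apache.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache
spec:
  replicas: 1
  selector:
    matchLabels:
      run: php-apache
  template:
    metadata:
      labels:
        run: php-apache
    spec:
      containers:
      - name: php-apache
        image: k8s.gcr.io/hpa-example
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 200m   # HPA computes CPU utilization relative to this request
---
apiVersion: v1
kind: Service
metadata:
  name: php-apache
spec:
  ports:
  - port: 80
  selector:
    run: php-apache
EOF
```

Apply it with kubectl apply -f php-apache.yaml. The CPU request matters: without it, HPA has nothing to compute a utilization percentage against.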

  • Now we are going to configure HPA to keep average CPU utilization at about 50% of the requested CPU, scaling between 1 and 10 pods.

kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
[linuxadvise@linuxadvise 3-1-Autoscaling]$ kubectl get hpa
NAME         REFERENCE               TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   <unknown>/50%   1         10        1          82s
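If you prefer to keep the auto-scaler in version control, the kubectl autoscale command can also be written as a manifest. The sketch below uses the autoscaling/v1 API, which is available on the v1.16 cluster shown earlier; the file name hpa.yaml is our own choice.

```shell
# Write an HPA equivalent to:
#   kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
cat > hpa.yaml <<'EOF'
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50
EOF
```

Apply it with kubectl apply -f hpa.yaml instead of running kubectl autoscale, not in addition to it.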

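The <unknown> target in the output above usually means that HPA has not received CPU metrics for the pod yet. Wait about 5 minutes for the metrics to appear, then generate some load on the application.

  • Open another terminal window and put load on the application manually: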

kubectl run -i --tty load-generator --image=busybox /bin/sh
## Once inside the container, run the command below
while true; do wget -q -O - http://php-apache; done

  • Wait about 5 minutes to see HPA in action: the CPU target rises above 50% and the number of replicas increases.

If you stop the infinite loop that generates the load, the pods scale back down on their own after a few minutes 😊
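To follow the scale-up and scale-down from the first terminal, you can poll the HPA status. The helper below is a small sketch with a name and defaults of our own; it prints the php-apache HPA line a fixed number of times.

```shell
# Print the HPA status line repeatedly so the TARGETS and REPLICAS columns can be followed.
# Arguments (both optional): number of samples, seconds between samples.
watch_hpa() {
  samples=${1:-20}
  interval=${2:-30}
  i=0
  while [ "$i" -lt "$samples" ]; do
    kubectl get hpa php-apache --no-headers
    i=$((i + 1))
    if [ "$i" -lt "$samples" ]; then
      sleep "$interval"
    fi
  done
}

# Usage: watch_hpa 20 30
```

For an open-ended view, kubectl get hpa php-apache --watch does the same job without a helper.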

So that is how horizontal pod auto-scaling works.



©2020 by Linux Advise