Mastering Autoscaling in Amazon EKS: Scaling Your Kubernetes Workloads Dynamically

@Harsh
7 min read · May 5, 2024

--

Introduction:

Autoscaling is a crucial aspect of managing workloads in Amazon Elastic Kubernetes Service (EKS), ensuring optimal resource utilization and performance.

In this comprehensive blog, we’ll delve into the theory and practical implementation of autoscaling at both the pod and node levels using Horizontal Pod Autoscaler (HPA) and Auto Scaling Groups (ASG) respectively.

Understanding Autoscaling:

There are two primary mechanisms for autoscaling in EKS: the Horizontal Pod Autoscaler (HPA), which scales pods, and Auto Scaling Groups (ASG), which scale worker nodes.

Horizontal Pod Autoscaler (HPA):

HPA is responsible for dynamically adjusting the number of pod replicas based on CPU or memory utilization metrics. As the workload demand fluctuates, HPA scales the pod replicas up or down to maintain desired performance levels. This ensures that your application can efficiently handle varying levels of traffic without manual intervention.
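Under the hood, the HPA follows the documented formula desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). A quick sketch of that arithmetic with made-up numbers (2 replicas averaging 90% CPU against a 50% target):

```shell
# HPA's documented formula: desired = ceil(current * metric / target).
# The numbers below are hypothetical, purely to illustrate the calculation.
current_replicas=2
current_cpu=90   # observed average utilization (%)
target_cpu=50    # targetCPUUtilizationPercentage
desired=$(awk -v r="$current_replicas" -v c="$current_cpu" -v t="$target_cpu" \
  'BEGIN { d = r * c / t; d = (d > int(d)) ? int(d) + 1 : int(d); print d }')
echo "$desired"  # ceil(2 * 90 / 50) = ceil(3.6) = 4
```

So a cluster running at nearly double its CPU target will roughly double its replica count on the next evaluation.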

Auto Scaling Groups (ASG):

ASG manages the scaling of worker nodes within your EKS cluster. By monitoring metrics such as CPU utilization or custom-defined thresholds, ASG automatically adds or removes nodes to match the workload demand. This ensures that your cluster can efficiently accommodate changes in application load while optimizing resource utilization and minimizing costs.

In summary, HPA and ASG work together to provide a comprehensive autoscaling solution for Amazon EKS, enabling you to achieve high availability, scalability, and cost-effectiveness for your Kubernetes workloads. Understanding the functionality and benefits of both mechanisms is essential for designing robust and resilient deployment architectures on AWS. 📈💡

Setting Up Autoscaling in Amazon EKS:

Auto Scaling Groups (ASG) with Cluster Autoscaler:

1. Launch the EKS cluster with ASG access.

  • While launching the EKS cluster, enable ASG access so that the worker nodes are launched by an Auto Scaling Group (ASG).
  • Run the following command to launch the EKS cluster with the required configuration.
 eksctl create cluster --name my-cluster-1 --asg-access --nodes-max 10 --nodes-min 2 --nodes 3 --node-type t2.small --nodegroup-name Node-group-A --ssh-access --enable-ssm --version 1.25 --region us-west-1

You can customize the name, region, and number of nodes as per your requirements.
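The same cluster can also be described declaratively. A roughly equivalent eksctl config file (field names per the eksctl ClusterConfig schema; treat this as a sketch and adjust for your account):

```yaml
# cluster.yaml -- create the cluster with: eksctl create cluster -f cluster.yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: my-cluster-1
  region: us-west-1
  version: "1.25"
nodeGroups:
  - name: Node-group-A
    instanceType: t2.small
    desiredCapacity: 3
    minSize: 2
    maxSize: 10
    ssh:
      allow: true
    iam:
      withAddonPolicies:
        autoScaler: true   # equivalent of the --asg-access flag
```

Keeping this file in version control makes the node group's min/max bounds reviewable alongside the rest of your infrastructure.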

2. Deploy cluster-autoscaler:

  • Deploy the cluster-autoscaler deployment, which watches for unschedulable pods and underutilized nodes and adjusts the Auto Scaling Group's desired capacity accordingly.
  • First, download the deployment manifest with this command.
curl -O https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml
  • Now open the file and make the following changes.
  • At line number 165, change the <cluster-name> placeholder to your cluster's name.
  • Save the file after making these changes.
  • Apply the file with kubectl command.
 kubectl apply -f cluster-autoscaler-autodiscover.yaml

3. Configure the deployment.

  • Add the safe-to-evict annotation to the deployment with the following command, so that the Cluster Autoscaler never evicts its own pod.
 kubectl patch deployment cluster-autoscaler -n kube-system -p '{"spec":{"template": {"metadata": {"annotations": {"cluster-autoscaler.kubernetes.io/safe-to-evict": "false"}}}}}'
  • Run the following command that will open cluster-autoscaler deployment file in editor.
 kubectl -n kube-system edit deployment.apps/cluster-autoscaler
  • Add the following two flags to the container's command section, at lines 166 and 167.
  • The first flag, --balance-similar-node-groups, keeps node groups with similar instance types balanced in size across Availability Zones.
  • The second flag, --skip-nodes-with-system-pods=false, lets the autoscaler scale down nodes that run kube-system pods, which would otherwise block scale-down.
  • As soon as you save and close the file, the changes are applied automatically.
  • We also have to update the deployment's image. The image version should be compatible with the Kubernetes version used in EKS.
  • Since we used Kubernetes 1.25 in EKS, we will use the v1.25.1 image of cluster-autoscaler.
kubectl set image deployment cluster-autoscaler -n kube-system cluster-autoscaler=registry.k8s.io/autoscaling/cluster-autoscaler:v1.25.1
  • Now check the logs to verify that the deployment is managing the nodes.
 kubectl logs -n kube-system deployment.apps/cluster-autoscaler
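After the edits, the container's command section in the manifest should look roughly like this (the surrounding flags come from the autodiscover example manifest; the auto-discovery tag will carry your own cluster name):

```yaml
command:
  - ./cluster-autoscaler
  - --v=4
  - --stderrthreshold=info
  - --cloud-provider=aws
  - --skip-nodes-with-local-storage=false
  - --expander=least-waste
  - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/my-cluster-1
  - --balance-similar-node-groups        # added at line 166
  - --skip-nodes-with-system-pods=false  # added at line 167
```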

4. Testing

  • Now we will validate ASG scaling by simulating load spikes and observing the addition or removal of nodes in the cluster.
  • We launched the EKS cluster with 3 worker nodes.
  • After autoscaling is set up, the Cluster Autoscaler detects the load and scales the nodes accordingly. In my case there is no load at all, so it automatically scales the nodes in.
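To simulate a load spike, one option is a deployment whose aggregate CPU requests exceed the cluster's current capacity, leaving pods unschedulable until new nodes arrive. A hypothetical example (the name scale-test and the replica count are illustrative):

```yaml
# 20 replicas x 200m = 4 CPUs requested, more than three t2.small nodes provide.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: scale-test
spec:
  replicas: 20
  selector:
    matchLabels:
      app: scale-test
  template:
    metadata:
      labels:
        app: scale-test
    spec:
      containers:
        - name: pause
          image: registry.k8s.io/pause:3.9
          resources:
            requests:
              cpu: 200m
```

Apply it, watch nodes get added, then delete it and watch the Cluster Autoscaler scale the nodes back in.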

Horizontal Pod AutoScaler (HPA):

1. Install Metrics Server:

Deploy Metrics Server to collect CPU and memory metrics from pods.

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

2. Deploying a demo App :

  • Deploy a demo app that will be autoscaled by the HorizontalPodAutoscaler (HPA).
  • Create a deployment manifest (for example, php-apache.yaml) and put the following YAML code inside it.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache
spec:
  selector:
    matchLabels:
      run: php-apache
  template:
    metadata:
      labels:
        run: php-apache
    spec:
      containers:
        - name: php-apache
          image: registry.k8s.io/hpa-example
          ports:
            - containerPort: 80
          resources:
            limits:
              cpu: 500m
            requests:
              cpu: 200m
---
apiVersion: v1
kind: Service
metadata:
  name: php-apache
  labels:
    run: php-apache
spec:
  ports:
    - port: 80
  selector:
    run: php-apache

3. Define HPA:

  • Create an HPA resource for your Deployment or StatefulSet, specifying metrics and scaling thresholds.
  • Create an HPA manifest with the following YAML code.
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
  namespace: default
spec:
  maxReplicas: 10
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  targetCPUUtilizationPercentage: 50
  • Alternatively, the same HPA can be created imperatively with a single command.
kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
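On newer clusters you may prefer the autoscaling/v2 API, which expresses the same CPU target through a metrics list. A roughly equivalent manifest (field names per the v2 API):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
```

The v2 form is worth knowing because it also supports memory, custom, and external metrics alongside CPU.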

4. Testing:

  • Validate HPA functionality by generating load and observing pod scaling behavior.
  • Run the following command to generate load on the server; you will see the HPA scale the pods automatically.
kubectl run -i --tty load-generator --rm --image=busybox:1.28 --restart=Never -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://php-apache; done"
  • Run the following command and observe the load generation and pod scaling.
kubectl get hpa -w

You will notice the load on the pods increasing, and the HPA scales the pods out as soon as they cross the 50% CPU utilization target.

Conclusion:

By mastering autoscaling with HPA and ASG in Amazon EKS, you can effectively manage workload fluctuations, improve resource utilization, and enhance application scalability. Understanding the theory and implementing practical steps outlined in this blog will empower you to build resilient and efficient Kubernetes deployments on AWS. Start optimizing your EKS clusters for autoscaling today! 🚀🔍

Written by @Harsh

A DevOps engineer from India.
