Autoscaling workloads with KEDA and the Prometheus Scaler
Walking through the process of learning how to use a KEDA scaler for autoscaling Kubernetes workloads
Published on: Jun 9, 2025
Last updated on: Jun 12, 2025
This blog is part of our KEDA series; we recommend reading the rest of the posts in the series:
- Introducing Kubernetes Event-driven Autoscaling (KEDA)
- Getting Started with Autoscaling in Kubernetes with KEDA
- Autoscaling workloads with KEDA and the Prometheus Scaler
KEDA’s Prometheus Scaler
In the list of KEDA’s built-in scalers we see that KEDA has a Prometheus Scaler, meaning KEDA has built-in support for using metrics/events from Prometheus as a source to trigger autoscaling operations. I want to create a proof of concept showing that an autoscaling setup built on KEDA can fetch metrics from Prometheus and automatically scale a workload.
In this blog post I will demonstrate using KEDA with Prometheus metrics to automatically scale a workload in a Kubernetes cluster. To do this, my proof of concept will scale a front-end workload based on the amount of traffic it is receiving.
To create this proof of concept, we are going to need:
- A Kubernetes cluster using Kubernetes version v1.29 or higher
- Prometheus
- KEDA
- An example workload that exposes Prometheus metrics
- Prometheus configured to collect the metrics exposed by that workload
Prerequisite Services: Installation
First, I use the kind (Kubernetes in Docker) tool to create a local Kubernetes cluster with the following command:
kind create cluster
To verify that kind created a Kubernetes cluster compatible with KEDA, I run kubectl version to check the version of the Kubernetes cluster.
Then, I add the following Helm repositories:
helm repo add kedacore https://kedacore.github.io/charts
helm repo add podinfo https://stefanprodan.github.io/podinfo
helm repo add prometheus-community \
  https://prometheus-community.github.io/helm-charts
helm repo update
Now, I install the kube-prometheus-stack Helm chart with the Prometheus Operator configured to use ServiceMonitors in any namespace with the following command:
helm install kube-prom-stack prometheus-community/kube-prometheus-stack \
  --create-namespace --namespace obs-system \
  --version 70.4.2 \
  --set prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues=false
Then, I install the KEDA Helm chart with the following command:
helm install keda kedacore/keda \
  --create-namespace --namespace keda \
  --version 2.17.1
Finally, I install the podinfo Helm chart with its ServiceMonitor enabled, so the Prometheus Operator can find it and configure Prometheus to scrape metrics from the podinfo app, with the following command:
helm install podinfo podinfo/podinfo \
  --create-namespace --namespace podinfo \
  --version 6.8.0 \
  --set serviceMonitor.enabled=true
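Before moving on, it's worth a quick sanity check that the chart actually created the ServiceMonitor and that the podinfo pods are running. A couple of commands like the following should be enough (the ServiceMonitor CRD comes from the kube-prometheus-stack install above):
# Confirm the ServiceMonitor exists for the Prometheus Operator to pick up
kubectl get servicemonitors -n podinfo
# Confirm the podinfo pods are up and exposing their metrics endpoint
kubectl get pods -n podinfo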
ScaledObject or ScaledJob
Now that I have all the required services installed, I need to create the configurations that KEDA will use to automatically scale my target workload, the podinfo application.
I first need to determine whether I want to use KEDA's ScaledObject [1] or ScaledJob [2]. Since the podinfo Helm chart I deployed came with a Kubernetes Deployment configured with 1 pod replica by default, the ScaledObject seems to be the appropriate choice.
If my Kubernetes application included a controller for running jobs, or its sole responsibility were processing queues, I would consider re-designing the application's Kubernetes architecture by removing that controller and having KEDA serve as the controller for running the queue-processing jobs [3].
Creating a ScaledObject
Now, I want to create the manifest file for a Kubernetes custom resource I'm not familiar with. To create a properly configured ScaledObject, I will need to refer to the ScaledObject specification [1] and the Kubernetes documentation for configuring scaling behaviour in HorizontalPodAutoscaler objects.
Defining the Target and Scaling Behaviour
After reviewing the documentation, I set my autoscaling target:
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: podinfo
…and configure the podinfo application’s scaling behaviour:
spec:
pollingInterval: 3
cooldownPeriod: 30
minReplicaCount: 1
maxReplicaCount: 5
advanced:
horizontalPodAutoscalerConfig:
behavior:
scaleDown:
stabilizationWindowSeconds: 30
policies:
- type: Pods
value: 1
periodSeconds: 3
scaleUp:
stabilizationWindowSeconds: 0
policies:
- type: Pods
value: 1
periodSeconds: 3
For the scaling strategy, I choose to define HPA configurations in the ScaledObject manifest. If I wanted to define a more advanced scaling strategy, I would make use of KEDA's scaling modifiers feature, sketched below.
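For illustration only (this proof of concept doesn't need it), here is a minimal sketch of what a scaling-modifiers configuration could look like; the trigger names trig_one and trig_two and the target value are hypothetical:
spec:
  advanced:
    scalingModifiers:
      # Combine two named triggers into a single composite metric
      formula: "trig_one + trig_two"
      target: "50"
      metricType: AverageValue
  triggers:
    - type: prometheus
      name: trig_one
      # serverAddress, query, threshold, etc. omitted for brevity
    - type: prometheus
      name: trig_two
      # serverAddress, query, threshold, etc. omitted for brevity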
Defining the Scale Trigger(s)
Now, to complete the manifest file for my ScaledObject, I need to configure how the scaling operations can be triggered with one or more scalers. To understand how I can configure Prometheus as a trigger, I go back to the Prometheus scaler documentation and see I need to provide the following information in my manifest file.
Address of the Prometheus server
Since my proof of concept has Prometheus and KEDA running in the same cluster, I can use Prometheus's internal DNS record in the Kubernetes cluster, which follows the format <serviceName>.<namespaceName>.svc.cluster.local.
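To fill in that format, I need the Prometheus Service name and namespace; one way to look them up is:
# List the Services created by the kube-prometheus-stack release in the obs-system namespace
kubectl get svc -n obs-system
# The Prometheus Service is kube-prom-stack-kube-prome-prometheus, listening on port 9090,
# which gives the serverAddress used in the trigger configuration below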
Prometheus query
The Prometheus query returns the value that will be evaluated against a threshold every time a scaling decision needs to be made. To determine which query I should use, I reviewed the metrics exposed by my application, ran a kubectl port-forward command to access the Prometheus UI via localhost, and executed Prometheus queries until I found one I was happy to use for scaling. In this proof of concept, I decided to have the scaling decision determined by how many HTTP requests the podinfo application received within a certain time period.
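For reference, that port-forward might look like this, assuming the Service name from the kube-prometheus-stack install above:
# Expose the Prometheus UI on localhost:9090 for ad-hoc query testing
kubectl -n obs-system port-forward svc/kube-prom-stack-kube-prome-prometheus 9090:9090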
Thresholds
The threshold or activationThreshold is the value the result of the Prometheus query must equal or exceed in order for the number of pod replicas to scale up automatically. Since my podinfo installation starts off with 1 replica and I don't plan on scaling my application to zero, I don't need to set the activationThreshold, so I only set the value of the threshold.
After gathering the information I need, I add the parameters for my Prometheus trigger to complete the manifest file for the ScaledObject:
spec:
triggers:
- type: prometheus
metadata:
serverAddress: http://kube-prom-stack-kube-prome-prometheus.obs-system.svc.cluster.local:9090
query: sum(increase(http_requests_total{namespace="podinfo", container="podinfo", status="200"}[1m]))
threshold: '30'
An example of the full manifest file can be found in LiveWyer’s Labs repository.
Deploying the ScaledObject
Now that I have a manifest file for my ScaledObject, I deploy it in the same namespace as the configured scale target with the following command:
kubectl apply -f scaledobject.yaml -n podinfo
KEDA’s Generated HPA Object
After deploying my ScaledObject, I check to see if I configured it correctly by checking the status of the ScaledObject.
$ kubectl get scaledobject -n podinfo
NAME SCALETARGETKIND SCALETARGETNAME MIN MAX READY ACTIVE FALLBACK PAUSED TRIGGERS AUTHENTICATIONS AGE
scale-podinfo apps/v1.Deployment podinfo 1 5 True True False Unknown prometheus 13s
Checking the logs of the KEDA operator pod, I can see it successfully detected the newly deployed ScaledObject…
2025-04-16T14:46:25Z INFO Reconciling ScaledObject {"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "ScaledObject": {"name":"scale-podinfo","namespace":"podinfo"}, "namespace": "podinfo", "name": "scale-podinfo", "reconcileID": "c9d25bd1-5de3-43d3-920f-0fee1f3a3793"}
2025-04-16T14:46:25Z INFO Adding Finalizer for the ScaledObject {"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "ScaledObject": {"name":"scale-podinfo","namespace":"podinfo"}, "namespace": "podinfo", "name": "scale-podinfo", "reconcileID": "c9d25bd1-5de3-43d3-920f-0fee1f3a3793"}
2025-04-16T14:46:25Z INFO KubeAPIWarningLogger metadata.finalizers: "finalizer.keda.sh": prefer a domain-qualified finalizer name including a path (/) to avoid accidental conflicts with other finalizer writers
2025-04-16T14:46:25Z INFO Detected resource targeted for scaling {"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "ScaledObject": {"name":"scale-podinfo","namespace":"podinfo"}, "namespace": "podinfo", "name": "scale-podinfo", "reconcileID": "c9d25bd1-5de3-43d3-920f-0fee1f3a3793", "resource": "apps/v1.Deployment", "name": "podinfo"}
2025-04-16T14:46:25Z INFO Creating a new HPA {"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "ScaledObject": {"name":"scale-podinfo","namespace":"podinfo"}, "namespace": "podinfo", "name": "scale-podinfo", "reconcileID": "c9d25bd1-5de3-43d3-920f-0fee1f3a3793", "HPA.Namespace": "podinfo", "HPA.Name": "keda-hpa-scale-podinfo"}
2025-04-16T14:46:25Z INFO Initializing Scaling logic according to ScaledObject Specification {"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "ScaledObject": {"name":"scale-podinfo","namespace":"podinfo"}, "namespace": "podinfo", "name": "scale-podinfo", "reconcileID": "c9d25bd1-5de3-43d3-920f-0fee1f3a3793"}
…and created an HPA object. If the ScaledObject is misconfigured, KEDA may not create an HPA object at all; for example, if I had misconfigured the target reference, an HPA object would not have been created.
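If the HPA object is missing, a reasonable place to start debugging is the ScaledObject's events and the KEDA operator logs (the deployment name below assumes the defaults from the KEDA Helm chart):
# Events and status conditions usually point at the misconfiguration
kubectl describe scaledobject scale-podinfo -n podinfo
# The KEDA operator logs report reconciliation errors
kubectl logs -n keda deploy/keda-operator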
With the kubectl tree plugin [4], I ran a command targeting the created ScaledObject to see its relationship with the created HPA object.
$ kubectl tree scaledobject scale-podinfo -n podinfo
NAMESPACE NAME READY REASON AGE
podinfo ScaledObject/scale-podinfo True ScaledObjectReady 28m
podinfo └─HorizontalPodAutoscaler/keda-hpa-scale-podinfo - 28m
Just in case, I also check the status of the HPA object created by KEDA.
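One way to read that status back is to fetch the generated HPA object directly; the name keda-hpa-scale-podinfo comes from the KEDA operator logs above:
kubectl get hpa keda-hpa-scale-podinfo -n podinfo -o yaml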
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
...
name: keda-hpa-scale-podinfo
status:
conditions:
- lastTransitionTime: "2025-04-16T14:46:40Z"
message: recommended size matches current size
reason: ReadyForNewScale
status: "True"
type: AbleToScale
- lastTransitionTime: "2025-04-16T14:46:40Z"
message: 'the HPA was able to successfully calculate a replica count from external
metric s0-prometheus(&LabelSelector{MatchLabels:map[string]string{scaledobject.keda.sh/name:
scale-podinfo,},MatchExpressions:[]LabelSelectorRequirement{},})'
reason: ValidMetricFound
status: "True"
type: ScalingActive
- lastTransitionTime: "2025-04-16T14:46:40Z"
message: the desired count is within the acceptable range
reason: DesiredWithinRange
status: "False"
type: ScalingLimited
The status message in the HPA object, "the HPA was able to successfully calculate a replica count from external metric", indicates that the Kubernetes HPA controller was able to fetch the metrics from the KEDA Metrics Server.
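To go one step further and confirm the KEDA metrics adapter is serving the metric, it can be queried through the external metrics API directly; the metric name s0-prometheus and the label selector come from the HPA status above:
# List the external metric APIs registered by KEDA
kubectl get --raw /apis/external.metrics.k8s.io/v1beta1
# Fetch the current value of the metric backing the HPA (note the URL-encoded label selector)
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/podinfo/s0-prometheus?labelSelector=scaledobject.keda.sh%2Fname%3Dscale-podinfo"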
Testing the Scale Triggers
Now that my application and autoscaling infrastructure are in place, I want to verify that my workload will be autoscaled as advertised based on the changing results returned by the Prometheus query I defined in my ScaledObject. To run the test with my setup:
I open a terminal and run a watch command:
kubectl get pods -n podinfo -w
I open another terminal and run the following command to allow me to access the podinfo application at localhost:8080:
kubectl -n podinfo port-forward deploy/podinfo 8080:9898
I open another terminal and run the following command, which will continuously access my podinfo application every second:
while :; do curl localhost:8080; sleep 1; done
While the looped command is running, I monitor my first terminal to see if the number of pod replicas increases and, if it does, I observe how quickly it scales the number of replicas from the minReplicaCount to the maxReplicaCount that I declared in my ScaledObject.
After waiting a while, I check the logs of the KEDA operator and I don't see any scale-up events in the logs, because KEDA has the Kubernetes HPA controller handle the autoscaling using the HPA object KEDA created.
After some time, I stop the curl loop and observe the scale-down behaviour.
Now that I have a working ScaledObject, I would personally want to move the ScaledObject into a Helm chart.
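As a rough sketch of what that could look like (the chart layout and values paths below are hypothetical, not part of this proof of concept), the ScaledObject would become a template whose tuning knobs live in values.yaml:
# templates/scaledobject.yaml (hypothetical Helm template)
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: {{ .Release.Name }}-scaledobject
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{ .Release.Name }}
  minReplicaCount: {{ .Values.autoscaling.minReplicaCount }}
  maxReplicaCount: {{ .Values.autoscaling.maxReplicaCount }}
  triggers:
    - type: prometheus
      metadata:
        serverAddress: {{ .Values.autoscaling.prometheus.serverAddress }}
        query: {{ .Values.autoscaling.prometheus.query | quote }}
        threshold: {{ .Values.autoscaling.prometheus.threshold | quote }}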
Wrapping Up
That wraps up the demo of the proof of concept I created to autoscale an example workload with Prometheus metrics via KEDA's Prometheus scaler.
You can find the manifest files and more specific instructions to re-create my proof of concept demo in LiveWyer’s Lab repository.
If you already have experience writing Kubernetes manifest files and deploying them with kubectl apply, then the jump to writing a Kubernetes manifest file for KEDA custom resources should be relatively straightforward. The trickiest part would be learning the specification for KEDA's custom resources, but the specifications can be found on the KEDA website.
In regards to this proof of concept with Prometheus, in practice or in production you should consider having KEDA securely access the Prometheus server. For example, a Prometheus server may:
- Be exposed with Kubernetes Ingress
- Have authentication in front of it
In which case, you will have to make use of one of the methods KEDA provides to manage authentication flows, such as a TriggerAuthentication resource (sketched below).
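As an illustration, if the Prometheus server sat behind basic authentication, one option would be a TriggerAuthentication backed by a Kubernetes Secret; the Secret name and keys below are hypothetical:
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: prometheus-basic-auth        # hypothetical name
spec:
  secretTargetRef:
    - parameter: username
      name: prometheus-credentials   # hypothetical Secret holding the credentials
      key: username
    - parameter: password
      name: prometheus-credentials
      key: password
The Prometheus trigger would then set authModes: "basic" in its metadata and reference this object through authenticationRef.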
In the next part of this blog series, I will be demonstrating how to have KEDA authenticate against a Redis server and scale jobs (starting from 0 jobs) based on the length of a Redis list.
Footnote
This blog is part of our KEDA series; we recommend reading the rest of the posts in the series:
- Introducing Kubernetes Event-driven Autoscaling (KEDA)
- Getting Started with Autoscaling in Kubernetes with KEDA
- Autoscaling workloads with KEDA and the Prometheus Scaler