Engineering • 21min read

Advanced VPA implementation: Production patterns, ecosystem tools, and real-world configuration

Master advanced VPA configurations, ecosystem integration, and production patterns using proven implementations.

Written by:

Louise Champ

Published on:

Oct 13, 2025

Last updated on:

Oct 13, 2025

This blog is part of our Vertical Pod Autoscaling series; we recommend reading the rest of the posts in the series:

  1. Understanding Kubernetes VPA: Why manual resource allocation is costing you thousands
  2. VPA component deep dive: How Kubernetes Vertical Pod Autoscaler works
  3. VPA production deployment: Avoiding pitfalls for safe and effective scaling
  4. Advanced VPA implementation: Production patterns, ecosystem tools, and real-world configuration

What does advanced VPA implementation actually look like in practice? With solid foundations in VPA architecture and proven production deployment strategies under our belt, we are now positioned to explore the sophisticated implementation patterns that transform VPA from basic resource optimisation into comprehensive infrastructure automation. The techniques we will examine in this final part focus on real VPA controller capabilities that exist today, proven ecosystem tools, and production patterns that deliver measurable results in complex environments.

Most organisations implement VPA with basic configurations and plateau after initial success with simple workloads. However, the difference between basic adoption and transformative infrastructure automation lies in understanding advanced VPA resource policies, integrating with mature ecosystem tools, and implementing enterprise patterns that work reliably across diverse operational scenarios.

Please note: The examples in this blog assume VPA installation via the Fairwinds Helm chart. If you installed VPA using the official installation scripts from the kubernetes/autoscaler repository, the pod labels will differ. For official installations, use -l app=vpa-* instead of the Fairwinds chart labels shown in these examples.
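
As a quick sanity check before applying the examples, you can confirm which installation method (and therefore which label scheme) your cluster uses. The selectors below are assumptions based on the two installation methods, with the Fairwinds chart using standard app.kubernetes.io labels; adjust the namespaces to wherever your VPA components run:

# Fairwinds Helm chart installation (labels assumed from the chart defaults)
kubectl get pods -n vpa -l app.kubernetes.io/name=vpa

# Official kubernetes/autoscaler installation scripts
kubectl get pods -n kube-system -l app=vpa-recommender
kubectl get pods -n kube-system -l app=vpa-updater
kubectl get pods -n kube-system -l app=vpa-admission-controller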

Advanced resource policy configuration

The VPA controller provides sophisticated resource policy controls that enable fine-grained management of how recommendations are generated and applied. Understanding these capabilities allows teams to implement VPA optimisation that respects operational boundaries whilst maximising efficiency gains through precise control over scaling behaviour.

What makes resource policies essential for production VPA deployments? They provide the control mechanisms that ensure VPA automation enhances rather than disrupts operational reliability, enabling teams to implement optimisation within carefully defined boundaries that align with their specific risk tolerance and performance requirements.

Granular container resource control

The VPA controller supports detailed per-container resource policies that enable different optimisation strategies within the same pod. This capability becomes essential for complex applications where different containers have varying resource characteristics and optimisation requirements:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: multi-container-vpa
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app-with-sidecar
  updatePolicy:
    updateMode: "InPlaceOrRecreate"
  resourcePolicy:
    containerPolicies:
    # Main application container - full optimisation
    - containerName: web-container
      minAllowed:
        cpu: 200m
        memory: 256Mi
      maxAllowed:
        cpu: 2000m
        memory: 4Gi
      controlledResources: ["cpu", "memory"]
      mode: "Auto"
    
    # Database sidecar - memory optimisation only
    - containerName: redis-cache
      minAllowed:
        memory: 128Mi
      maxAllowed:
        memory: 2Gi
      controlledResources: ["memory"]
      mode: "Auto"
    
    # Logging sidecar - stable resources
    - containerName: fluent-bit
      minAllowed:
        cpu: 50m
        memory: 64Mi
      maxAllowed:
        cpu: 200m
        memory: 256Mi
      controlledResources: ["cpu", "memory"]
      mode: "Initial"

This configuration demonstrates how the VPA controller enables different optimisation strategies for each container. The main application receives full CPU and memory optimisation, the Redis sidecar focuses on memory efficiency to maintain cache performance, and the logging sidecar uses Initial mode to set appropriate startup resources without ongoing changes.

What this means is that we can apply different levels of automation based on each container’s role and stability requirements. The logging sidecar, for example, typically has predictable resource requirements that do not benefit from ongoing adjustments, whilst the main application container can benefit significantly from continuous optimisation based on actual usage patterns.

Controlling resource requests vs limits

The VPA controller provides controlledValues settings that determine whether VPA manages just resource requests or both requests and limits. This control becomes crucial for applications where you want to maintain specific limit relationships:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: controlled-values-vpa
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: database-cluster
  updatePolicy:
    updateMode: "InPlaceOrRecreate"
  resourcePolicy:
    containerPolicies:
    - containerName: postgres
      minAllowed:
        cpu: 1000m
        memory: 2Gi
      maxAllowed:
        cpu: 8000m
        memory: 32Gi
      controlledResources: ["cpu", "memory"]
      controlledValues: RequestsOnly
      mode: "Auto"

When using RequestsOnly, VPA optimises resource requests based on actual usage whilst preserving your manually configured limits. This approach works well for databases where you want VPA to right-size requests for scheduling efficiency whilst maintaining generous limits for performance bursts.
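
One way to confirm this behaviour is to compare a running pod's requests and limits once VPA has applied a recommendation: the requests should track the VPA target whilst the limits remain exactly as defined in the StatefulSet. A minimal check, assuming the StatefulSet's first pod:

# Compare VPA-managed requests against the manually configured limits
kubectl get pod database-cluster-0 -n production \
  -o jsonpath='{.spec.containers[?(@.name=="postgres")].resources}'

# Cross-reference with the current VPA target
kubectl get vpa controlled-values-vpa -n production \
  -o jsonpath='{.status.recommendation.containerRecommendations[0].target}'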

Advanced in-place update configuration

As covered in Part 3, InPlaceOrRecreate mode provides non-disruptive resource updates when possible, falling back to pod recreation when necessary. In advanced scenarios, precise control over eviction behaviour becomes crucial for complex applications with specific operational requirements.

Eviction requirements for complex workloads

The evictionRequirements configuration enables fine-grained control over when VPA should recreate pods versus attempting in-place updates, particularly valuable for multi-container applications with different update tolerance:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: advanced-eviction-vpa
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: complex-workload
  updatePolicy:
    updateMode: "InPlaceOrRecreate"
    evictionRequirements:
    # Both requirements must hold before the updater will evict the pod
    - resources: ["memory"]
      changeRequirement: TargetHigherThanRequests  # memory target must be above the current request
    - resources: ["cpu"]
      changeRequirement: TargetLowerThanRequests   # CPU target must be below the current request
  resourcePolicy:
    containerPolicies:
    - containerName: main-app
      minAllowed:
        cpu: 500m
        memory: 1Gi
      maxAllowed:
        cpu: 8000m
        memory: 32Gi
      controlledResources: ["cpu", "memory"]
      mode: "Auto"

With these requirements in place, the updater may only evict the pod when the memory recommendation has risen above the current request and the CPU recommendation has dropped below it; changes that do not satisfy both conditions are applied in place where possible. Such granular control becomes essential for applications where certain resource changes require careful coordination with application state or external dependencies.

Advanced resource policy integration

Complex applications require sophisticated resource policy coordination that goes beyond basic container-level controls, particularly when in-place updates interact with multi-container dependencies:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: coordinated-update-vpa
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: distributed-database
  updatePolicy:
    updateMode: "InPlaceOrRecreate"
    evictionRequirements:
    # Evict only when memory recommendations rise above current requests;
    # all other changes are applied in place where possible
    - resources: ["memory"]
      changeRequirement: TargetHigherThanRequests
  resourcePolicy:
    containerPolicies:
    - containerName: primary-db
      minAllowed:
        cpu: 2000m
        memory: 8Gi
      maxAllowed:
        cpu: 16000m
        memory: 128Gi
      controlledResources: ["cpu", "memory"]
      controlledValues: RequestsOnly  # Preserve custom limit ratios
      mode: "Auto"
    
    - containerName: backup-agent
      minAllowed:
        cpu: 200m
        memory: 512Mi
      maxAllowed:
        cpu: 2000m
        memory: 8Gi
      controlledResources: ["cpu", "memory"]
      mode: "Initial"  # Avoid disrupting backup processes

This pattern combines a pod-level eviction requirement with container-specific resource policies: pod recreation is restricted to cases where memory recommendations have risen above current requests, the backup agent uses Initial mode so its resources are never adjusted on running pods, and the controlledValues: RequestsOnly setting on the primary database preserves manually configured limit ratios whilst enabling request optimisation.

Multi-container coordination patterns

Real-world applications often involve multiple containers that require coordinated resource management. The VPA controller provides capabilities for managing these scenarios through sophisticated resource policies and mode combinations that address different container roles and requirements.

Service mesh integration patterns

Service mesh deployments require careful coordination between application containers and proxy sidecars. The VPA controller enables different optimisation strategies for each component, which becomes essential when managing Istio or Linkerd deployments at scale:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: istio-mesh-vpa
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: payment-service
  updatePolicy:
    updateMode: "InPlaceOrRecreate"
  resourcePolicy:
    containerPolicies:
    # Main application container
    - containerName: payment-app
      minAllowed:
        cpu: 200m
        memory: 512Mi
      maxAllowed:
        cpu: 4000m
        memory: 8Gi
      controlledResources: ["cpu", "memory"]
      mode: "Auto"
    
    # Istio sidecar proxy
    - containerName: istio-proxy
      minAllowed:
        cpu: 50m
        memory: 128Mi
      maxAllowed:
        cpu: 1000m
        memory: 1Gi
      controlledResources: ["cpu", "memory"]
      mode: "Initial"
    
    # Exclude init containers from VPA management
    - containerName: istio-init
      mode: "Off"

This pattern enables the application container to receive ongoing optimisation whilst the Istio proxy gets appropriate initial sizing without disruptive updates that could affect mesh connectivity. Init containers are explicitly excluded since they run only during pod startup and do not have meaningful resource consumption patterns for VPA analysis.

Database clustering scenarios

Stateful applications like database clusters require careful VPA configuration to balance optimisation with operational stability. We have found that conservative approaches work best for these critical workloads:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: postgres-cluster-vpa
  namespace: data
spec:
  targetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: postgres-cluster
  updatePolicy:
    updateMode: "Initial"
  resourcePolicy:
    containerPolicies:
    - containerName: postgres
      minAllowed:
        cpu: 1000m
        memory: 4Gi
      maxAllowed:
        cpu: 8000m
        memory: 64Gi
      controlledResources: ["cpu", "memory"]
      controlledValues: RequestsOnly
      mode: "Auto"
    
    - containerName: pgbouncer
      minAllowed:
        cpu: 100m
        memory: 256Mi
      maxAllowed:
        cpu: 1000m
        memory: 2Gi
      controlledResources: ["cpu", "memory"]
      mode: "Initial"

Database configurations typically use Initial or RequestsOnly modes to avoid disruption whilst ensuring new pods start with appropriate resources. The connection pooler receives initial sizing but avoids ongoing changes that could affect connection handling during production operations.

Under the hood, this configuration provides optimisation benefits whilst respecting the stability requirements that database workloads demand. The Initial mode ensures that new pods benefit from VPA analysis without disrupting running database instances, which is particularly important during cluster maintenance or scaling operations.
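
Because Initial mode only applies recommendations when pods are created, it is worth reviewing the current recommendations before any planned restart so you know what new pods will request. A simple pre-flight check might look like this:

# Review current recommendations for the cluster
kubectl describe vpa postgres-cluster-vpa -n data

# Apply them to new pods via a controlled rolling restart during a maintenance window
kubectl rollout restart statefulset/postgres-cluster -n data
kubectl rollout status statefulset/postgres-cluster -n data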

Production-ready ecosystem integration

What ecosystem tools actually deliver value for advanced VPA implementations? The VPA ecosystem includes mature tools that extend basic VPA capabilities with enhanced visualisation, policy management, and operational integration. Understanding how to implement these tools enables sophisticated VPA management at scale.

Goldilocks dashboard implementation

Goldilocks provides comprehensive VPA visualisation through a web interface that simplifies VPA adoption and ongoing management.

Important prerequisite: Goldilocks requires VPA (specifically the recommender component) to be installed in your cluster. If you do not have VPA installed, you can either install it separately or use Goldilocks’ built-in VPA sub-chart.

Rather than managing individual manifests, Goldilocks is best deployed using the official Helm chart:

# Add the Fairwinds Helm repository
helm repo add fairwinds-stable https://charts.fairwinds.com/stable
helm repo update

# Install Goldilocks with production-ready configuration
helm upgrade --install goldilocks fairwinds-stable/goldilocks \
  --namespace goldilocks \
  --create-namespace \
  --version 8.0.0 \
  --set controller.enabled=true \
  --set dashboard.enabled=true \
  --set dashboard.excludeContainers="istio-proxy\,linkerd-proxy" \
  --set controller.onByDefault=false \
  --set dashboard.service.type=ClusterIP \
  --set controller.resources.requests.cpu=100m \
  --set controller.resources.requests.memory=128Mi \
  --set controller.resources.limits.cpu=500m \
  --set controller.resources.limits.memory=512Mi \
  --set dashboard.resources.requests.cpu=100m \
  --set dashboard.resources.requests.memory=256Mi \
  --set dashboard.resources.limits.cpu=500m \
  --set dashboard.resources.limits.memory=1Gi

The --set controller.onByDefault=false flag ensures that Goldilocks only creates VPA objects for explicitly labelled namespaces and deployments, providing controlled adoption rather than cluster-wide automatic VPA creation. If the VPA recommender is not already present in your cluster, adding --set vpa.enabled=true to the command installs it via the bundled VPA sub-chart.

Access the Goldilocks dashboard using port-forward:

# Forward dashboard port to local machine
kubectl -n goldilocks port-forward svc/goldilocks-dashboard 8080:80

# Access dashboard at http://localhost:8080

Goldilocks automatically creates VPA objects for namespaces and deployments marked with appropriate labels, providing a dashboard that displays recommendations alongside current resource allocation for easy comparison and decision-making. What makes Goldilocks particularly valuable is its ability to generate VPA configurations automatically whilst respecting namespace boundaries and security policies.

Understanding Goldilocks QoS recommendations

Goldilocks provides two distinct Quality of Service recommendations based on different operational strategies, using VPA’s statistical analysis to offer both conservative and aggressive resource allocation approaches.

  • Guaranteed QoS recommendations use VPA’s target value for both requests and limits, ensuring resources are always available to the container. This approach prioritises stability over efficiency and works particularly well for critical applications like databases, payment systems, and services that require predictable performance. The guaranteed approach also integrates effectively with Horizontal Pod Autoscaler for coordinated scaling strategies.
  • Burstable QoS recommendations use VPA’s lowerBound for requests and upperBound for limits, allowing applications to consume resources beyond their base allocation when available. This approach prioritises cost efficiency over predictability, enabling better cluster utilisation whilst providing flexibility for handling spiky workloads or applications with variable resource consumption patterns.

The choice between QoS classes represents different operational priorities; Guaranteed QoS trades efficiency for stability by reserving resources whether they are used or not, whilst Burstable QoS optimises for cluster utilisation at the cost of potential performance variability during resource contention.

Understanding these trade-offs will help teams select appropriate QoS strategies based on their specific application requirements, risk tolerance, and cost optimisation goals rather than applying uniform resource allocation strategies across diverse workload types.
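
You can see the raw values Goldilocks derives its recommendations from by inspecting the VPA status directly: the Guaranteed figures come from target, whilst the Burstable figures come from lowerBound and upperBound. A quick way to pull these out for a given VPA (the name here is illustrative):

# Guaranteed QoS would use the target for both requests and limits
kubectl get vpa web-app-vpa -n production \
  -o jsonpath='{.status.recommendation.containerRecommendations[0].target}'

# Burstable QoS would use lowerBound for requests and upperBound for limits
kubectl get vpa web-app-vpa -n production \
  -o jsonpath='{.status.recommendation.containerRecommendations[0].lowerBound}'
kubectl get vpa web-app-vpa -n production \
  -o jsonpath='{.status.recommendation.containerRecommendations[0].upperBound}'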

Enabling automatic VPA creation

Goldilocks can automatically create VPA objects for workloads in marked namespaces, simplifying adoption across multiple applications. This automation reduces the operational overhead of managing individual VPA configurations:

# Enable Goldilocks for a namespace
kubectl label namespace production goldilocks.fairwinds.com/enabled=true

# Enable for specific deployments
kubectl label deployment web-app goldilocks.fairwinds.com/enabled=true

# Exclude specific containers from VPA recommendations
kubectl annotate deployment web-app goldilocks.fairwinds.com/exclude-containers=istio-proxy,linkerd-proxy

These labels and annotations control automatic VPA creation, allowing teams to gradually adopt VPA across their infrastructure whilst excluding components that should not be optimised. The exclude-containers annotation is particularly valuable in service mesh environments where proxy containers should maintain stable resource allocations.

What this automation enables is systematic VPA adoption without the operational burden of manually creating and maintaining individual VPA configurations for every workload. Teams can enable optimisation at the namespace level and rely on Goldilocks to generate appropriate VPA policies based on best practices.
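
Once a namespace is labelled, you can confirm what Goldilocks has picked up by listing the generated objects; the created VPAs typically follow a goldilocks-<workload> naming convention, though it is worth verifying against your chart version:

# List the VPA objects Goldilocks created in the labelled namespace
kubectl get vpa -n production

# Inspect one of them (name assumed to follow the goldilocks-<workload> convention)
kubectl describe vpa goldilocks-web-app -n production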

Cost impact analysis with Kubecost

How do we measure the financial impact of VPA optimisation? Kubecost provides comprehensive cost analysis and resource monitoring that enables teams to understand the financial benefits of VPA-driven resource optimisation, though it operates independently rather than integrating directly with VPA recommendations.

Kubecost’s primary value lies in cost visibility and right-sizing recommendations that complement VPA analysis. Its container request right-sizing recommendations page shows which containers would benefit from changes to their resource requests, providing similar insights to VPA but with direct cost impact calculations and automated implementation capabilities.

Important Note: Kubecost installation requires platform-specific configuration to access cloud provider billing APIs and cluster metrics. Each cloud provider (AWS EKS, Google GKE, Azure AKS) and deployment type (managed vs self-hosted) requires different Helm values for proper cost attribution and billing integration. Consult the Kubecost installation documentation for your specific platform configuration.

# Install Kubecost for cost monitoring
# Note: This is a basic example - production installations require 
# platform-specific configuration for billing API access
helm repo add kubecost https://kubecost.github.io/cost-analyzer/
helm repo update

helm upgrade --install kubecost kubecost/cost-analyzer \
  --namespace kubecost \
  --create-namespace \
  --version 2.8.2
  # Add platform-specific values here based on your environment

Access the Kubecost dashboard using port-forward:

# Access Kubecost dashboard
kubectl port-forward -n kubecost svc/kubecost-cost-analyzer 9090:9090
# Visit http://localhost:9090 for cost analysis

Kubecost provides several capabilities that help measure VPA effectiveness:

  • Cost allocation by workload: Detailed cost breakdowns at namespace, deployment, and pod levels that help teams understand resource spend before and after VPA optimisation.
  • Right-sizing recommendations: The container request right-sizing recommendations table is sorted by default in descending order according to the estimated savings available.
  • Efficiency tracking: Current efficiency metrics that show usage-to-request ratios, helping validate VPA improvements.
  • Savings estimates: Monthly cost projections based on resource allocation changes.

The complementary approach works best when teams use VPA for technical resource optimisation whilst leveraging Kubecost for financial impact measurement and business justification. Kubecost’s insight into scaling strategies, whether containers are scaled with HPA or VPA, enables data-driven decisions about where to focus optimisation efforts based on both technical and financial impact.

This approach provides the cost visibility that makes VPA adoption sustainable by demonstrating concrete financial benefits rather than just technical efficiency improvements.

Multi-cluster VPA management

How do we manage VPA consistently across multiple clusters? Enterprise environments require systematic VPA management across different environments whilst accommodating regional considerations and varying operational requirements.

The approaches we will explore use standard GitOps patterns that teams should already understand, extending existing infrastructure-as-code practices to include VPA policy management without requiring new operational workflows.

GitOps-based VPA policy management

Infrastructure-as-code approaches enable systematic VPA deployment with version control and automated policy distribution. This approach treats VPA configurations as infrastructure components that follow standard development practices. In ArgoCD, this could be achieved as follows:

# argocd-vpa-application.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: vpa-policies
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: infrastructure
  source:
    repoURL: https://github.com/company/k8s-vpa-policies
    targetRevision: HEAD
    path: vpa-policies
  destination:
    server: https://kubernetes.default.svc
    namespace: kube-system
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
    - CreateNamespace=true
    retry:
      limit: 5
      backoff:
        duration: 5s
        factor: 2
        maxDuration: 3m

---

# argocd-vpa-applicationset.yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: vpa-policies-multi-cluster
  namespace: argocd
spec:
  generators:
  - clusters:
      selector:
        matchLabels:
          environment: production
  template:
    metadata:
      name: 'vpa-policies-{{name}}'
    spec:
      project: infrastructure
      source:
        repoURL: https://github.com/company/k8s-vpa-policies
        targetRevision: HEAD
        path: 'environments/{{metadata.labels.environment}}'
      destination:
        server: '{{server}}'
        namespace: kube-system
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
        syncOptions:
        - CreateNamespace=true

GitOps patterns enable teams to manage VPA configurations through standard development workflows whilst ensuring consistency across multiple clusters and environments. This approach provides the change management, approval processes, and audit trails that enterprise environments require for infrastructure modifications.

With this in mind, the key benefit of GitOps-based VPA management is that it integrates VPA policy changes with existing infrastructure workflows, reducing operational complexity whilst maintaining the visibility and control that teams need for production environments.

Environment-specific VPA configuration

Different environments often require different VPA configurations based on risk tolerance and operational requirements. Development environments might use aggressive optimisation whilst production environments require conservative approaches:

# argocd-environment-specific-vpa.yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: vpa-policies-by-environment
  namespace: argocd
spec:
  generators:
  - clusters:
      selector:
        matchLabels:
          argocd.argoproj.io/secret-type: cluster
  template:
    metadata:
      name: 'vpa-{{metadata.labels.environment}}-{{name}}'
    spec:
      project: infrastructure
      source:
        repoURL: https://github.com/company/k8s-vpa-policies
        targetRevision: HEAD
        path: 'overlays/{{metadata.labels.environment}}'
      destination:
        server: '{{server}}'
        namespace: kube-system
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
        syncOptions:
        - CreateNamespace=true

---

# Repository structure example:
# base/
#   kustomization.yaml
#   vpa-base.yaml
# overlays/
#   production/
#     kustomization.yaml
#     vpa-overrides.yaml
#   development/
#     kustomization.yaml
#     vpa-overrides.yaml

Example environment-specific override for production environments:

# overlays/production/vpa-overrides.yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-app-vpa
spec:
  updatePolicy:
    updateMode: "Initial"  # Conservative for production
  resourcePolicy:
    containerPolicies:
    - containerName: web-app
      minAllowed:
        cpu: 500m      # Higher minimums for production
        memory: 1Gi
      maxAllowed:
        cpu: 4000m     # Conservative maximums
        memory: 8Gi

---

# overlays/development/vpa-overrides.yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-app-vpa
spec:
  updatePolicy:
    updateMode: "InPlaceOrRecreate"  # Aggressive for development
  resourcePolicy:
    containerPolicies:
    - containerName: web-app
      minAllowed:
        cpu: 100m      # Lower minimums for development
        memory: 256Mi
      maxAllowed:
        cpu: 2000m     # More restrictive for cost control
        memory: 4Gi

This ArgoCD ApplicationSet pattern enables different VPA configurations per environment whilst maintaining consistent base policies. Production environments use conservative settings with higher minimum allocations and more restrictive update modes, whilst development environments can use more aggressive optimisation.

What this demonstrates is the importance of matching VPA configuration to operational requirements rather than using identical settings across all environments. Production workloads benefit from conservative approaches that prioritise stability, whilst development environments can accept more aggressive optimisation that maximises cost savings.
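
To complete the picture, each overlay directory in the repository structure above needs a kustomization.yaml that layers its overrides onto the shared base. A minimal sketch for the production overlay, assuming the file layout shown earlier:

# overlays/production/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

# Pull in the shared VPA policy definitions
resources:
- ../../base

# Layer the production-specific settings on top of the base policies
patches:
- path: vpa-overrides.yaml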

Advanced monitoring and troubleshooting

What monitoring approaches actually work for production VPA deployments? Understanding VPA effectiveness requires comprehensive monitoring that tracks recommendation accuracy, update success rates, and cost impact. The patterns we will explore provide practical visibility into VPA operations without overwhelming monitoring systems.

Production VPA deployments require monitoring approaches that address both technical performance and business impact, enabling teams to validate effectiveness whilst identifying issues before they affect operations.

VPA performance monitoring

Comprehensive monitoring provides visibility into VPA effectiveness and operational impact. The monitoring setup needs to track VPA component health, recommendation quality, and update success patterns.

Important Note: The monitoring resources available out of the box depend on your VPA installation method:

  • Fairwinds VPA Helm chart: Includes PodMonitor resources by default.
  • Steve Hipwell VPA chart: Includes optional ServiceMonitor support.
  • Official kubernetes/autoscaler: Requires manual monitoring configuration.
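
For official installations, a minimal setup means exposing the recommender's metrics endpoint and pointing a ServiceMonitor at it. The sketch below assumes the recommender's default metrics port (8942) and the app=vpa-recommender label used by the official manifests; both are worth verifying against your deployment:

# Hypothetical Service and ServiceMonitor for an official VPA installation
apiVersion: v1
kind: Service
metadata:
  name: vpa-recommender-metrics
  namespace: kube-system
  labels:
    app: vpa-recommender
spec:
  selector:
    app: vpa-recommender
  ports:
  - name: metrics
    port: 8942        # default recommender --address port
    targetPort: 8942
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: vpa-recommender
  namespace: monitoring
spec:
  namespaceSelector:
    matchNames: ["kube-system"]
  selector:
    matchLabels:
      app: vpa-recommender
  endpoints:
  - port: metrics
    interval: 30s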

Once the VPA components’ metrics are exposed to Prometheus, we can create alerts for these components:

# vpa-prometheus-rules.yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: vpa-monitoring
  namespace: monitoring
spec:
  groups:
  - name: vpa.rules
    rules:
    - alert: VPARecommenderDown
      expr: up{job=~".*vpa-recommender.*"} == 0
      for: 5m
      labels:
        severity: critical
      annotations:
        summary: "VPA Recommender is down"
        description: "VPA Recommender has been down for more than 5 minutes"
    
    - alert: VPAUpdaterDown
      expr: up{job=~".*vpa-updater.*"} == 0
      for: 5m
      labels:
        severity: critical
      annotations:
        summary: "VPA Updater is down"
        description: "VPA Updater has been down for more than 5 minutes"
    
    - alert: VPARecommendationLatencyHigh
      expr: histogram_quantile(0.95, rate(vpa_recommender_recommendation_latency_seconds_bucket[5m])) > 30
      for: 10m
      labels:
        severity: warning
      annotations:
        summary: "VPA recommendation latency is high"
        description: "VPA recommendation generation is taking longer than 30 seconds"

    # VPA object count tracking
    - record: vpa:objects_total
      expr: vpa_recommender_vpa_objects_count

VPA-specific metrics available include:

  • vpa_recommender_vpa_objects_count: Number of VPA objects being managed
  • vpa_recommender_recommendation_latency_seconds: Time taken to generate recommendations
  • vpa_recommender_execution_latency_seconds: VPA recommender execution time
  • vpa_recommender_aggregate_container_states_count: Container state tracking

For VPA recommendation metrics, you will also need to configure kube-state-metrics to expose VPA recommendation data:

kube-state-metrics:
  rbac:
    extraRules:
    - apiGroups: ["autoscaling.k8s.io"]
      resources: ["verticalpodautoscalers"]
      verbs: ["list", "watch"]
  prometheus:
    monitor:
      enabled: true
  customResourceState:
    enabled: true
    config:
      kind: CustomResourceStateMetrics
      spec:
        resources:
        - groupVersionKind:
            group: autoscaling.k8s.io
            kind: "VerticalPodAutoscaler"
            version: "v1"
          labelsFromPath:
            vpa_name: [metadata, name]
            vpa_namespace: [metadata, namespace]
          metrics:
          - name: "vpa_containerrecommendations_target"
            help: "VPA container target recommendations"
            each:
              type: Gauge
              gauge:
                path: [status, recommendation, containerRecommendations]
                valueFrom: [target, memory]

These monitoring configurations work differently depending on your installation method and provide comprehensive visibility into both VPA component health and recommendation effectiveness. Teams using the Fairwinds chart benefit from automatic monitoring setup, whilst those using official installations need manual ServiceMonitor configuration.
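
Once the custom resource state configuration is in place, the recommendation data becomes queryable in Prometheus. As a rough illustration, assuming the default kube_customresource_ metric prefix (the exact name depends on your kube-state-metrics configuration), the memory targets per VPA can be charted with:

# Memory target recommendations per VPA object (metric name assumes the default prefix)
kube_customresource_vpa_containerrecommendations_target{vpa_namespace="production"}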

Systematic advanced VPA adoption

How do we implement these advanced patterns systematically? Successful advanced VPA adoption requires structured approaches that build on basic VPA operation whilst addressing the complexity of multi-container applications, ecosystem integration, and enterprise operational requirements.

The adoption patterns we recommend focus on incremental capability expansion rather than attempting comprehensive advanced implementation immediately. This approach builds operational confidence whilst delivering measurable benefits at each stage.

Phased advanced implementation

Advanced VPA capabilities work best when introduced systematically, allowing teams to validate effectiveness at each level before increasing complexity:

# Phase 1: Advanced resource policies for critical applications
kubectl apply -f advanced-resource-policies/
sleep 300
kubectl get vpa --all-namespaces | grep "complex"

# Phase 2: Ecosystem tool integration
kubectl apply -f goldilocks/
kubectl apply -f monitoring/
sleep 600
curl http://goldilocks-dashboard/api/v1/summary

# Phase 3: Multi-cluster policy distribution
flux bootstrap github --owner=company --repository=k8s-vpa-policies
kubectl apply -f flux-vpa-policies.yaml

Each phase builds on the previous one, enabling teams to understand advanced VPA behaviour whilst maintaining operational control. This systematic approach reduces risk whilst maximising the benefits that advanced VPA capabilities provide.

What this systematic approach enables is confident adoption of sophisticated VPA patterns without overwhelming teams with complexity or introducing operational risks that could undermine VPA adoption across the organisation.

Looking forward: VPA ecosystem maturity

Where is VPA development heading? The VPA ecosystem continues evolving with enhanced capabilities for complex environments whilst maintaining focus on production reliability and operational simplicity. Understanding current development directions helps teams plan for continued VPA adoption and capability expansion.

The developments we are tracking focus on improving production readiness through enhanced reliability, better integration with existing Kubernetes features, and expanded ecosystem tool support that addresses enterprise operational requirements that we encounter regularly in client environments.

Emerging VPA capabilities

Current VPA development focuses on addressing the operational challenges that teams encounter with complex production deployments. The Multi-Dimensional Pod Autoscaler project aims to provide native coordination between VPA and HPA functionality, eliminating current complexities around using both scaling approaches simultaneously.

Enhanced custom resource support enables VPA integration with sophisticated workload types beyond standard Deployments and StatefulSets, including custom operators and application-specific controllers that manage complex distributed systems. This expansion addresses the reality that many production applications use custom controllers that require specialised scaling approaches.

However, these emerging capabilities are in development rather than production-ready today. Teams should focus on the mature VPA features we have covered rather than waiting for future developments that might change direction or timeline.

Enterprise integration reality

The VPA ecosystem now includes mature enterprise tools that provide comprehensive management capabilities for large-scale deployments. Goldilocks offers production-ready dashboard functionality with automatic VPA generation and policy management. Kubecost provides cost analysis that correlates VPA optimisation with financial impact.

GitOps integration patterns enable VPA management through standard infrastructure-as-code workflows, providing the version control, approval processes, and deployment automation that enterprise environments require for reliable operations at scale. These patterns work with existing operational workflows rather than requiring new toolchains.

What makes these ecosystem tools valuable is their focus on solving real operational challenges rather than adding complexity for theoretical benefits. They integrate with existing workflows whilst providing capabilities that address the practical needs of teams managing VPA at scale.

Production excellence with advanced VPA

The advanced implementation patterns we have explored demonstrate how VPA evolves from basic resource optimisation into sophisticated infrastructure automation when combined with mature ecosystem tools and enterprise-grade operational practices.

Successful advanced VPA adoption requires understanding not just the technical capabilities, but the operational patterns, ecosystem integration opportunities, and monitoring approaches that transform automated scaling from a tactical tool into a strategic infrastructure capability that adapts to changing requirements whilst delivering consistent optimisation benefits.

After all, the journey from basic VPA implementation to advanced enterprise deployment involves systematic progression through increasingly sophisticated configurations, ecosystem tool integration, and operational practices that build organisational confidence whilst delivering measurable benefits through improved resource efficiency and operational excellence.

The patterns we have examined provide the foundation for VPA implementations that deliver sustained value rather than one-time improvements, enabling organisations to build resource optimisation capabilities that adapt to changing requirements whilst maintaining operational reliability.

Frequently Asked Questions

How do I validate that in-place Pod resizing is working correctly in my cluster?

Verify Kubernetes version 1.27+ and VPA 1.4.0+, then check that the InPlacePodVerticalScaling feature gate is enabled (it is enabled by default in 1.33+). Monitor VPA updater logs for "Successfully updated resources in-place" events versus "Falling back to recreation" messages. Test with a simple deployment using InPlaceOrRecreate mode and observe whether CPU increases happen without pod restart.
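
A quick manual check, assuming a test deployment labelled app=web-app in the default namespace: capture the CPU request and restart count, wait for the updater to act, then re-run the same command. A changed request with an unchanged restart count indicates an in-place resize rather than a recreation.

# CPU request and restart count for the first matching pod
kubectl get pod -l app=web-app -o jsonpath='{.items[0].spec.containers[0].resources.requests.cpu} {.items[0].status.containerStatuses[0].restartCount}'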

What are the signs that my advanced VPA configuration is working properly?

Look for stable VPA recommendations over multiple days, successful update events in VPA component logs, appropriate resource utilisation improvements without performance degradation, and consistent cost savings through tools like Kubecost. Advanced configurations should show different optimisation patterns for different containers based on their resource policies.

How do I troubleshoot VPA issues in multi-container scenarios?

Check each container policy individually using kubectl describe vpa <name> to see per-container recommendations. Verify that controlledResources settings match your intent, review container logs for resource-related errors, and ensure Pod Disruption Budgets allow the evictions required by workloads using Recreate or InPlaceOrRecreate update modes.

When should I use Goldilocks versus implementing VPA configurations manually?

Use Goldilocks when managing VPA across multiple namespaces or applications, when you need dashboard visualisation for recommendation review, or when teams prefer web interfaces over YAML configuration. Manual implementation works better for sophisticated resource policies, custom automation workflows, or environments with specific compliance requirements around configuration management.

Additional Resources

This blog is part of our Vertical Pod Autoscaling series; we recommend reading the rest of the posts in the series:

  1. Understanding Kubernetes VPA: Why manual resource allocation is costing you thousands
  2. VPA component deep dive: How Kubernetes Vertical Pod Autoscaler works
  3. VPA production deployment: Avoiding pitfalls for safe and effective scaling
  4. Advanced VPA implementation: Production patterns, ecosystem tools, and real-world configuration