Kubernetes (k8s) deployments already have a max surge concept, and there's no reason this surge should apply only to new rollouts and not to node maintenance or other situations where PodDisruptionBudget (PDB)-protected pods need to be evicted. This project uses node cordons to signal eviction-autoscaler Custom Resources, each of which corresponds to a PodDisruptionBudget and targets a deployment. An eviction-autoscaler controller then attempts to scale up the targeted deployment (or StatefulSet, if you're feeling brave) when the PDB's allowed disruptions is zero, and scales it back down once evictions have stopped.
Overprovisioning isn't free. Sometimes it makes sense to run as cost-effectively as possible, but you still don't want to experience downtime due to a cluster upgrade or even a VM maintenance event.
Your app might also experience issues for unrelated reasons, and a maintenance event shouldn't result in downtime if adding extra replicas can save you.
- Node Controller: For every pod on a cordoned node that is selected by a PDB, signals the eviction-autoscaler resource that shares that PDB's name and namespace.
- Eviction-autoscaler Controller: Watches eviction-autoscaler resources. If there are recent eviction signals and the PDB's AllowedDisruptions is zero, it triggers a surge in the corresponding deployment. Once evictions have stopped for a cooldown period and allowed disruptions has risen above zero, it scales back down.
- PDB Controller (Optional): Automatically creates eviction-autoscaler Custom Resources for existing PDBs.
- Deployment Controller (Optional): Creates PDBs for deployments that don't already have them and keeps minAvailable matching the deployment's replicas (not counting any surged in by eviction-autoscaler).
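The surge and shrink rule described for the Eviction-autoscaler Controller can be sketched roughly as follows. This is only an illustration of the stated conditions, not the project's actual code: the `State` type, the `decide` function, the `surged` flag, and the 60-second cooldown are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class State:
    allowed_disruptions: int       # taken from the PDB status
    seconds_since_eviction: float  # time since the last eviction signal
    surged: bool                   # hypothetical flag: an extra replica is currently added


def decide(state: State, cooldown_seconds: float = 60.0) -> str:
    """Return "scale_up", "scale_down", or "noop" for one reconcile pass.

    The cooldown default is an assumed value, not the project's setting.
    """
    recent = state.seconds_since_eviction < cooldown_seconds
    # Surge only when evictions are being attempted and the PDB blocks them.
    if recent and state.allowed_disruptions == 0 and not state.surged:
        return "scale_up"
    # Shrink only after the cooldown has passed and the PDB has headroom again.
    if state.surged and not recent and state.allowed_disruptions > 0:
        return "scale_down"
    return "noop"
```

For instance, `decide(State(0, 5.0, False))` yields `"scale_up"`, while a surged deployment whose PDB still allows zero disruptions stays untouched.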
graph TD;
Cordon[Cordon]
NodeController[Cordoned Node Controller]
CRD[Eviction Autoscaler Custom Resource]
Controller[Eviction-Autoscaler Controller]
Deployment[Deployment or StatefulSet]
PDB[Pod Disruption Budget]
PDBController[Optional PDB creator]
Cordon -->|Triggers| NodeController
NodeController -->|writes spec| CRD
CRD -->|spec watched by| Controller
Controller -->|surges and shrinks| Deployment
Controller -->|Writes status| CRD
Controller -->|reads allowed disruptions | PDB
PDBController -->|watches | Deployment
PDBController -->|creates if not exist| PDB
- Docker
- kind for e2e tests.
- A sense of adventure
You can install Eviction-Autoscaler using the Azure Kubernetes Extension Resource Provider (RP) or via Helm.
- Add the eviction-autoscaler Helm repository:
helm repo add eviction-autoscaler https://azure.github.io/eviction-autoscaler/charts
helm repo update
- Install the chart into your cluster:
helm install eviction-autoscaler eviction-autoscaler/eviction-autoscaler \
--namespace eviction-autoscaler --create-namespace \
--set controllerConfig.pdb.create=true
Configuration Examples:
Enable PDB creation and specify target namespaces:
helm install eviction-autoscaler eviction-autoscaler/eviction-autoscaler \
--namespace eviction-autoscaler --create-namespace \
--set controllerConfig.pdb.create=true \
--set controllerConfig.namespaces.actionedNamespaces="{kube-system,default}"
Enable eviction-autoscaler for all namespaces by default:
helm install eviction-autoscaler eviction-autoscaler/eviction-autoscaler \
--namespace eviction-autoscaler --create-namespace \
--set controllerConfig.pdb.create=false \
--set controllerConfig.namespaces.enabledByDefault=true
Note: Setting pdb.create=true will automatically create a PodDisruptionBudget (PDB) for deployments that do not already have one, ensuring your workloads are protected and enabling eviction-autoscaler to manage disruptions effectively.
If a deployment already has a PDB whose label selector matches the deployment's pod template labels, eviction-autoscaler will not create a new PDB, even if pdb.create=true. This avoids duplicate PDBs and ensures existing disruption budgets are respected.
For example, if you deploy an app without a PDB:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: nginx
With pdb.create=true, eviction-autoscaler will automatically create a matching PDB:
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app
  namespace: default
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: my-app
If a matching PDB already exists, eviction-autoscaler will not create another. If you later disable pdb.create, eviction-autoscaler will not delete any existing PDBs; it will simply stop creating new ones.
- (Optional) Customize values by passing --values my-values.yaml or using --set key=value.
Refer to the Helm Values for configuration options.
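The "does a matching PDB already exist?" check described above for pdb.create can be pictured with a small sketch. It assumes selectors use only matchLabels (matchExpressions are not handled), and the function names `selector_matches` and `needs_new_pdb` are hypothetical, chosen for this example rather than taken from the project.

```python
def selector_matches(match_labels: dict, pod_labels: dict) -> bool:
    """True if every key/value in the PDB's matchLabels appears in the pod template labels."""
    return all(pod_labels.get(key) == value for key, value in match_labels.items())


def needs_new_pdb(pod_labels: dict, existing_pdb_selectors: list, pdb_create: bool) -> bool:
    """Decide whether a PDB would be created for a deployment.

    existing_pdb_selectors holds the matchLabels of each PDB in the namespace.
    """
    if not pdb_create:
        # Auto-creation is switched off at the controller level.
        return False
    # Create only when no existing PDB already selects the deployment's pods.
    return not any(selector_matches(s, pod_labels) for s in existing_pdb_selectors)
```

With the my-app deployment from the example, `needs_new_pdb({"app": "my-app"}, [], True)` is true, and it becomes false as soon as a PDB selecting `app: my-app` exists.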
Follow the steps below to register the required features and deploy the extension to your AKS cluster.
az feature register --namespace Microsoft.KubernetesConfiguration --name Extensions
Wait until the feature state is Registered:
az feature show --namespace Microsoft.KubernetesConfiguration --name Extensions
az provider register -n Microsoft.KubernetesConfiguration
az aks create \
--resource-group <your-resource-group> \
--name <your-aks-cluster-name> \
--node-count 2 \
--generate-ssh-keys
az k8s-extension create \
--cluster-name <your-cluster-name> \
--cluster-type managedClusters \
--extension-type microsoft.evictionautoscaler \
--name <your-extension-name> \
--resource-group <your-resource-group-name> \
--release-train stable \
--auto-upgrade-minor-version true
With namespace configuration:
az k8s-extension create \
--cluster-name <your-cluster-name> \
--cluster-type managedClusters \
--extension-type microsoft.evictionautoscaler \
--name eviction-autoscaler \
--resource-group <your-resource-group-name> \
--release-train stable \
--configuration-settings controllerConfig.namespaces.actionedNamespaces="{kube-system,default}" \
--auto-upgrade-minor-version true
Cluster-wide auto-protection (enable all namespaces and auto-create PDBs):
az k8s-extension create \
--cluster-name <your-cluster-name> \
--cluster-type managedClusters \
--extension-type microsoft.evictionautoscaler \
--name eviction-autoscaler \
--resource-group <your-resource-group-name> \
--release-train dev \
--configuration-settings controllerConfig.namespaces.enabledByDefault=true controllerConfig.pdb.create=true \
--config AgentTimeoutInMinutes=30 \
--subscription <your-subscription-id> \
--version 0.1.16 \
--auto-upgrade-minor-version false
Configuration Options:
- controllerConfig.pdb.create=true - Automatically creates PDBs for deployments (default: false)
- controllerConfig.namespaces.enabledByDefault=true - Enables all namespaces (default: false, opt-in mode)
- controllerConfig.namespaces.actionedNamespaces - List of namespaces to enable when using opt-in mode (default: [kube-system])
Common Configuration Combinations:
- Conservative (Manual Control) - pdb.create=false, enabledByDefault=false, actionedNamespaces=[kube-system]
  - Only watches specific namespaces, requires manual PDB creation
- Targeted Auto-Protection - pdb.create=true, enabledByDefault=false, actionedNamespaces=[production,staging]
  - Auto-creates PDBs only in specified namespaces
  - Most common production setup - balances automation with control
- Cluster-Wide Auto-Protection - pdb.create=true, enabledByDefault=true
  - Auto-creates PDBs for all deployments across all namespaces
  - Maximum automation and protection, namespaces can opt-out with annotation eviction-autoscaler.azure.com/enable: "false"
- Monitoring Only - pdb.create=false, enabledByDefault=true
  - Monitors all namespaces but doesn't create PDBs
  - The eviction-autoscaler.azure.com/pdb-create: "true" annotation is ignored when controller-level pdb.create=false
  - Good for migrating from manual PDB management
Note: The --configuration-settings controllerConfig.pdb.create=true option enables automatic creation of PodDisruptionBudgets (PDBs) for deployments that do not already have one, ensuring your workloads are protected and enabling eviction-autoscaler to manage disruptions effectively. Eviction-autoscaler determines whether a deployment already has a corresponding PDB by comparing the PDB's label selector with the deployment's pod template labels. This ensures that each deployment is protected from disruptions and avoids duplicate PDBs. If you later disable pdb.create, eviction-autoscaler will not delete any existing PDBs; it will simply stop creating new ones.
Note: The --auto-upgrade-minor-version false option is only required if you want to disable automatic minor version upgrades.
Note: The --release-train dev option specifies that the extension will use the "dev" release train, which typically includes the latest development builds and experimental features. Other available release train options include stable (recommended for production workloads) and preview (for pre-release features). Use dev for testing or development environments, preview for evaluating upcoming features, and stable for production deployments.
Refer to the extension documentation for configuration options.
Configuration options will be documented here in future updates. If you have suggestions, please open an issue or PR.
Important: When modifying controller configuration values (such as controllerConfig.pdb.create or controllerConfig.namespaces.*), you must delete and re-install the extension for the changes to take effect.
To update configuration:
Delete the existing extension:
az k8s-extension delete --resource-group <your-resource-group> \
--cluster-name <your-cluster-name> \
--cluster-type managedClusters \
--name <your-extension-name> --yes
Re-install the extension with your updated configuration settings using the az k8s-extension create command shown above.
If you want to exclude a specific deployment from automatic PodDisruptionBudget (PDB) creation, add the following annotation to its manifest:
metadata:
  annotations:
    eviction-autoscaler.azure.com/pdb-create: "false"
This annotation instructs eviction-autoscaler not to create a PDB for that deployment, regardless of whether you installed via Helm or the Azure Kubernetes Extension Resource Provider.
Eviction-autoscaler automatically skips PDB creation for deployments that have a maxUnavailable value other than 0 in their rolling update strategy. This is because such deployments already tolerate some level of downtime during updates or maintenance.
For example, the following deployment will not get an automatic PDB:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%        # This doesn't affect PDB creation
      maxUnavailable: 1    # Allows 1 pod to be unavailable - skips PDB creation
  # ... rest of spec
In this case, since maxUnavailable: 1, the deployment is explicitly designed to tolerate one pod being down. Creating a PDB would conflict with this configuration. Note that maxSurge does not affect PDB creation - only maxUnavailable matters.
If you want a PDB for such a deployment, you can either:
- Set maxUnavailable: 0 in the deployment strategy, or
- Manually create and manage the PDB yourself
This behavior applies to both integer values (maxUnavailable: 1) and percentage values (maxUnavailable: 25%). Only deployments with maxUnavailable: 0 or maxUnavailable: 0% will automatically get PDBs created.
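The two per-deployment opt-outs described above (the pdb-create annotation and a non-zero maxUnavailable) can be combined into one small eligibility check. The sketch below is illustrative only: the function names are hypothetical, and it assumes maxUnavailable is set explicitly as either an integer or a percentage string.

```python
PDB_CREATE_ANNOTATION = "eviction-autoscaler.azure.com/pdb-create"


def max_unavailable_is_zero(value) -> bool:
    """Accept an int (e.g. 1) or a percentage string (e.g. "25%"); True only for 0 or "0%"."""
    if isinstance(value, int):
        return value == 0
    text = str(value).strip()
    if text.endswith("%"):
        text = text[:-1]
    return int(text) == 0


def eligible_for_auto_pdb(annotations: dict, max_unavailable) -> bool:
    """True if a deployment would get an automatically created PDB."""
    # An explicit opt-out annotation always wins.
    if annotations.get(PDB_CREATE_ANNOTATION) == "false":
        return False
    # Any tolerance for unavailable pods means PDB creation is skipped.
    return max_unavailable_is_zero(max_unavailable)
```

Here `eligible_for_auto_pdb({}, "0%")` is true, whereas `eligible_for_auto_pdb({}, "25%")` and `eligible_for_auto_pdb({}, 1)` are both false.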
Eviction autoscaler provides flexible namespace-level control with two operational modes controlled by environment variables:
- ENABLED_BY_DEFAULT: Controls the operational mode (default: false)
  - false: Namespaces disabled by default - only specified namespaces enabled
  - true: Namespaces enabled by default - all namespaces enabled unless disabled
- ACTIONED_NAMESPACES: Comma-separated list of namespaces with special behavior
- PDB_CREATE: Enable automatic PDB creation for deployments (default: false)
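As a rough picture of how these three variables might be read, the sketch below parses them from a plain dictionary. It is an assumption-laden illustration: `parse_bool` and `load_config` are hypothetical helpers, treating only the string "true" as truthy is a guess, and an unset ACTIONED_NAMESPACES is treated here as an empty list, which may differ from the real default.

```python
def parse_bool(raw, default=False):
    """Interpret "true" (any case) as True; unset or empty falls back to the default."""
    if raw is None or raw.strip() == "":
        return default
    return raw.strip().lower() == "true"


def load_config(env: dict) -> dict:
    """Build a config dict from environment-style key/value pairs."""
    raw_namespaces = env.get("ACTIONED_NAMESPACES", "")
    # Split the comma-separated list and drop blanks and surrounding spaces.
    namespaces = [n.strip() for n in raw_namespaces.split(",") if n.strip()]
    return {
        "enabled_by_default": parse_bool(env.get("ENABLED_BY_DEFAULT")),
        "actioned_namespaces": namespaces,
        "pdb_create": parse_bool(env.get("PDB_CREATE")),
    }
```

In a real process the dictionary would typically be `os.environ`; passing a plain dict keeps the sketch easy to try out.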
When ENABLED_BY_DEFAULT=false (the default), eviction autoscaler operates as follows:
- All namespaces are disabled by default
- Namespaces listed in ACTIONED_NAMESPACES are automatically enabled
- Other namespaces can be enabled by adding the annotation eviction-autoscaler.azure.com/enable: "true"
- Namespaces in ACTIONED_NAMESPACES can be overridden with annotation eviction-autoscaler.azure.com/enable: "false"
Configuration via Helm:
helm install eviction-autoscaler eviction-autoscaler/eviction-autoscaler \
--namespace eviction-autoscaler --create-namespace \
--set controllerConfig.pdb.create=true \
--set controllerConfig.namespaces.enabledByDefault=false \
--set-json 'controllerConfig.namespaces.actionedNamespaces=["kube-system","production","staging"]'
Or via values.yaml:
controllerConfig:
  pdb:
    create: true
  namespaces:
    enabledByDefault: false  # Namespaces disabled by default (default)
    actionedNamespaces:
    - kube-system
    - production
    - staging
Configuration via environment variables:
export ENABLED_BY_DEFAULT=false
export ACTIONED_NAMESPACES="kube-system,production,staging"
Enabling a namespace when enabled_by_default=false:
apiVersion: v1
kind: Namespace
metadata:
  name: development
  annotations:
    eviction-autoscaler.azure.com/enable: "true"  # Explicitly enable
Or using kubectl:
kubectl annotate namespace development eviction-autoscaler.azure.com/enable=true
When ENABLED_BY_DEFAULT=true is set, eviction autoscaler operates as follows:
- All namespaces are enabled by default
- ACTIONED_NAMESPACES is ignored - only annotations control which namespaces are disabled
- Namespaces can be disabled by adding the annotation eviction-autoscaler.azure.com/enable: "false"
- Namespaces can be explicitly enabled with annotation eviction-autoscaler.azure.com/enable: "true" (though they're already enabled by default)
Configuration via Helm:
helm install eviction-autoscaler eviction-autoscaler/eviction-autoscaler \
--namespace eviction-autoscaler --create-namespace \
--set controllerConfig.pdb.create=true \
--set controllerConfig.namespaces.enabledByDefault=true
Or via values.yaml:
controllerConfig:
  pdb:
    create: true
  namespaces:
    enabledByDefault: true   # Namespaces enabled by default
    actionedNamespaces: []   # Ignored when enabled_by_default=true
Configuration via environment variables:
Set the environment variable ENABLED_BY_DEFAULT=true.
Disabling a namespace when enabled_by_default=true:
apiVersion: v1
kind: Namespace
metadata:
  name: development
  annotations:
    eviction-autoscaler.azure.com/enable: "false"  # Explicitly disable
Or using kubectl:
kubectl annotate namespace development eviction-autoscaler.azure.com/enable=false
Enabling a namespace not in the list:
apiVersion: v1
kind: Namespace
metadata:
  name: development
  annotations:
    eviction-autoscaler.azure.com/enable: "true"  # Explicitly enable

| Mode | ENABLED_BY_DEFAULT | Default Behavior | ACTIONED_NAMESPACES | Annotation Behavior |
|---|---|---|---|---|
| enabled_by_default=false (default) | false or unset | All disabled | These namespaces are enabled | Can enable others with enable: "true" or override with enable: "false" |
| enabled_by_default=true | true | All enabled | Ignored | Can disable with enable: "false" |
Important: Annotations always take precedence over the default behavior and the ACTIONED_NAMESPACES list.
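The precedence rules summarized above (annotation first, then the default mode, then the ACTIONED_NAMESPACES list) fit in a few lines. This is a minimal sketch of the documented behavior, not the controller's implementation; `namespace_enabled` and its parameters are names made up for the example.

```python
ENABLE_ANNOTATION = "eviction-autoscaler.azure.com/enable"


def namespace_enabled(name, annotations, enabled_by_default, actioned_namespaces):
    """Resolve whether eviction-autoscaler acts on a namespace."""
    value = annotations.get(ENABLE_ANNOTATION)
    # 1. An explicit annotation always takes precedence.
    if value == "true":
        return True
    if value == "false":
        return False
    # 2. With enabled-by-default, every unannotated namespace is enabled
    #    and the actioned list is ignored.
    if enabled_by_default:
        return True
    # 3. In opt-in mode, only listed namespaces are enabled.
    return name in actioned_namespaces
```

For example, in opt-in mode with `["production"]` listed, an unannotated `development` namespace resolves to disabled, and annotating it with `enable: "true"` flips it to enabled.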
When eviction-autoscaler is disabled for a namespace (either by annotation or configuration change), resources are automatically cleaned up based on their ownership:
Resources created by eviction-autoscaler with the ownedBy: EvictionAutoScaler annotation are fully managed by the controller:
- When a namespace is disabled:
  - The DeploymentToPDBReconciler detects the namespace is disabled
  - It deletes all controller-owned PDBs in that namespace
  - The EvictionAutoScaler CRs are automatically deleted by Kubernetes garbage collection (via OwnerReference)
- When a deployment is deleted:
  - The PDB is automatically deleted (via OwnerReference: PDB → Deployment)
  - The EvictionAutoScaler CR is automatically deleted (via OwnerReference: EvictionAutoScaler → PDB)
Example of controller-owned resources:
# PDB created by eviction-autoscaler
kubectl get pdb my-app -o yaml
# metadata:
#   annotations:
#     ownedBy: EvictionAutoScaler
#   ownerReferences:
#   - apiVersion: apps/v1
#     kind: Deployment
#     name: my-app
Resources created manually without the ownedBy: EvictionAutoScaler annotation are preserved:
- When a namespace is disabled:
  - The PDBToEvictionAutoScalerReconciler deletes only the EvictionAutoScaler CR
  - Your manually created PDB is left intact - eviction-autoscaler never deletes resources it doesn't own
- When a deployment is deleted:
  - If the PDB has no OwnerReference (user-owned), it remains untouched
  - Only the EvictionAutoScaler CR is deleted
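The difference between controller-owned and user-owned cleanup when a namespace is disabled can be modeled as below. This is a simplified sketch over plain dictionaries rather than real Kubernetes objects; the function name is hypothetical, and it assumes each EvictionAutoScaler CR shares the name of its PDB.

```python
OWNED_BY_KEY = "ownedBy"
OWNED_BY_VALUE = "EvictionAutoScaler"


def resources_to_delete_on_disable(pdbs):
    """Given PDBs as dicts with "name" and "annotations", return (pdb_names, cr_names) removed."""
    deleted_pdbs, deleted_crs = [], []
    for pdb in pdbs:
        owned = pdb.get("annotations", {}).get(OWNED_BY_KEY) == OWNED_BY_VALUE
        if owned:
            # Controller-owned PDBs are deleted along with the namespace opt-out.
            deleted_pdbs.append(pdb["name"])
        # The EvictionAutoScaler CR is removed in both cases: through garbage
        # collection for owned PDBs, or directly for user-owned ones.
        deleted_crs.append(pdb["name"])
    return deleted_pdbs, deleted_crs
```

Given one owned PDB `my-app` and one manual PDB `manual`, only `my-app` appears in the first list, while both names appear in the second.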
Example of user-owned PDB:
# User creates their own PDB
kubectl apply -f - <<EOF
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app
  namespace: default
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: my-app
EOF
# Eviction-autoscaler creates an EvictionAutoScaler CR but does NOT take ownership of the PDB
# If namespace is disabled, only the EvictionAutoScaler CR is deleted - the PDB remains
Namespace watches trigger reconciliation by listing all deployments/PDBs in that namespace. This is efficient because:
- The controller-runtime client uses an in-memory cache
- List operations read from cache, not the Kubernetes API server
- No API server round-trip overhead
- Fast local memory operations
Via Helm:
helm install eviction-autoscaler eviction-autoscaler/eviction-autoscaler \
--namespace eviction-autoscaler --create-namespace \
--set controllerConfig.pdb.create=true \
--set controllerConfig.namespaces.enabledByDefault=false \
--set-json 'controllerConfig.namespaces.actionedNamespaces=["kube-system","production"]'
Via environment variables:
Deploy with environment variables:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: eviction-autoscaler
  namespace: eviction-autoscaler
spec:
  template:
    spec:
      containers:
      - name: manager
        env:
        - name: ENABLED_BY_DEFAULT
          value: "false"  # opt-in mode, so ACTIONED_NAMESPACES takes effect
        - name: ACTIONED_NAMESPACES
          value: "kube-system,production,staging"
        - name: PDB_CREATE
          value: "true"
Deployments in the kube-system, production, and staging namespaces will be managed by eviction autoscaler. Deployments in other namespaces (e.g., development, testing) will be ignored.
When eviction-autoscaler creates a PodDisruptionBudget (PDB) for a deployment, it manages the PDB's lifecycle using both Kubernetes owner references and annotations:
- Owner Reference: Links the PDB to its deployment, ensuring the PDB is deleted when the deployment is deleted
- Annotation: ownedBy: EvictionAutoScaler marks the PDB as managed by eviction-autoscaler
If you want to take manual control of a PDB that was created by eviction-autoscaler, remove the ownedBy annotation:
kubectl annotate pdb <pdb-name> -n <namespace> ownedBy-
When the annotation is removed, eviction-autoscaler will:
- Detect the annotation removal (which triggers reconciliation)
- Remove the owner reference from the PDB
- Stop managing the PDB
After this, the PDB becomes user-managed and will not be deleted when the deployment is deleted. You take full responsibility for managing and cleaning up the PDB.
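One way to picture the ownership hand-off is as a reconcile step that keeps the owner reference in sync with the ownedBy annotation: present means the reference is ensured, absent means it is removed. The sketch below works on a plain metadata dictionary and is not the controller's code; `reconcile_ownership` is a hypothetical name, and real owner references also carry a `uid`, omitted here for brevity.

```python
import copy


def reconcile_ownership(pdb_metadata: dict, deployment_name: str) -> dict:
    """Return a copy of the PDB metadata with the owner reference matching the annotation."""
    meta = copy.deepcopy(pdb_metadata)
    owned = meta.get("annotations", {}).get("ownedBy") == "EvictionAutoScaler"
    # Drop any existing reference to this deployment, then re-add it only if owned.
    refs = [
        ref for ref in meta.get("ownerReferences", [])
        if not (ref.get("kind") == "Deployment" and ref.get("name") == deployment_name)
    ]
    if owned:
        refs.append({"apiVersion": "apps/v1", "kind": "Deployment", "name": deployment_name})
    meta["ownerReferences"] = refs
    return meta
```

Running it on metadata without the annotation returns an empty `ownerReferences` list, which is the state in which deleting the deployment no longer deletes the PDB.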
Example workflow:
# Check the current PDB annotations
kubectl get pdb my-app -n default -o jsonpath='{.metadata.annotations}'
# Remove the ownedBy annotation to take control
kubectl annotate pdb my-app -n default ownedBy-
# The PDB is now yours to manage
# Deleting the deployment will no longer delete the PDB
kubectl delete deployment my-app -n default
# You must manually delete the PDB when you're done with it
kubectl delete pdb my-app -n default
Re-establishing controller ownership:
If you want eviction-autoscaler to take back control of a PDB, simply add the annotation back:
# Add the annotation back to return control to eviction-autoscaler
kubectl annotate pdb my-app -n default ownedBy=EvictionAutoScaler
# The controller will re-establish the owner reference on the next reconciliation
# The PDB will now be deleted when the deployment is deleted
Use docker buildx through the Make target to build and push a manifest image for multiple architectures.
make docker-buildx \
IMG=<registry>/<repo>/eviction-autoscaler:<tag> \
PLATFORMS=linux/amd64,linux/arm64
Notes:
- docker-buildx pushes directly to the registry (--push), so IMG must be a pushable image reference.
- For the docker-buildx Makefile target, if PLATFORMS is omitted, it defaults to linux/arm64,linux/amd64,linux/s390x,linux/ppc64le.
- In hack/release.sh, if RELEASE_PLATFORMS is omitted, it defaults to linux/amd64,linux/arm64.
kubectl create ns laboratory
kubectl create deployment -n laboratory piggie --image nginx
# unless disabled there will now be a pdb and an evictionautoscaler that map to the deployment
# show a starting state
kubectl get pods -n laboratory
kubectl get poddisruptionbudget piggie -n laboratory -o yaml # should be allowed disruptions 0
kubectl get evictionautoscalers piggie -n laboratory -o yaml
# cordon
NODE=$(kubectl get pods -n laboratory -l app=piggie -o=jsonpath='{.items[*].spec.nodeName}')
kubectl cordon $NODE
# show we've scaled up
kubectl get pods -n laboratory
kubectl get poddisruptionbudget piggie -n laboratory -o yaml # should be allowed disruptions 1
kubectl get evictionautoscalers piggie -n laboratory -o yaml
# actually kick the node off now that pdb isn't at zero.
kubectl drain $NODE --delete-emptydir-data --ignore-daemonsets
Here's a drain of a node on a two-node cluster that is running the AKS store demo (4 deployments and 2 StatefulSets). You can see the drains being rejected, then going through, on the left and new pods being surged in on the right.
This project originated as an intern project and is still available at github.com/Javier090/k8s-pdb-autoscaler.
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.
