Migrate from cluster autoscaler to node auto-provisioning

Migrate your existing Azure Kubernetes Service (AKS) cluster from cluster autoscaler to node auto-provisioning using the steps in this guide.

Node auto-provisioning (NAP) uses pending pod resource requirements to decide the optimal virtual machine (VM) configuration to run those workloads in the most efficient and cost-effective manner.

Node auto-provisioning is based on the open-source Karpenter project and the AKS Karpenter provider. Node auto-provisioning automatically deploys, configures, and manages Karpenter on your AKS clusters.

Cluster autoscaler vs. node auto-provisioning

Why migrate from cluster autoscaler to node auto-provisioning

Node auto-provisioning improves bin-packing, automates node lifecycle management, and reduces operational overhead compared to cluster autoscaler.

| Reason to migrate | Cluster autoscaler (CAS) | Node auto-provisioning (NAP) |
| --- | --- | --- |
| VM size flexibility | Preexisting node pools with a single VM size per pool | Dynamic provisioning of mixed VM sizes for cost/performance balance |
| Cost optimization | Adds/removes nodes in pools; risk of underutilization | Intelligent bin-packing reduces fragmentation and lowers costs |
| Management overhead | Requires manual tuning of CAS profiles | Fully managed experience integrated with AKS |
| Lifecycle management | Basic scale-up/scale-down only | Advanced node lifecycle optimization; manages node updates, disruption, and more |
| Future feature development | Maintained with minimal feature enhancements | Continuous active development and new feature enhancements |

Cluster autoscaler profile settings vs. node auto-provisioning configuration settings

The following table maps cluster autoscaler profile settings to node auto-provisioning configuration settings for the NodePool CRD. This table also shows the cluster autoscaler Azure CLI command and its NAP CRD equivalent.

| CAS profile setting | Description | CAS CLI example | NAP setting | Description | NAP YAML example |
| --- | --- | --- | --- | --- | --- |
| `balance-similar-node-groups` | Balances node pools across zones | `az aks update --resource-group <rg> --name <cluster> --cluster-autoscaler-profile balance-similar-node-groups=true` | N/A | NAP uses Karpenter's provisioning logic; no direct equivalent | Not applicable |
| `expander` | Strategy for selecting the node pool to scale up | `az aks update --resource-group <rg> --name <cluster> --cluster-autoscaler-profile expander=least-waste` | N/A | NAP dynamically provisions optimal VM sizes; no expander concept | Not applicable |
| `scale-down-unneeded-time` | Time a node must be unneeded before it's eligible for scale down (default: 10m) | `az aks update --resource-group <rg> --name <cluster> --cluster-autoscaler-profile scale-down-unneeded-time=10m` | `consolidateAfter` | Time NAP waits after discovering a consolidation opportunity before disrupting the node | `disruption:`<br>`  consolidateAfter: 10m` |
| `scale-down-unready-time` | Time an unready node must be unneeded before it's eligible for scale down (default: 20m) | `az aks update --resource-group <rg> --name <cluster> --cluster-autoscaler-profile scale-down-unready-time=20m` | `terminationGracePeriod` | Grace period for pod termination before node removal | `disruption:`<br>`  terminationGracePeriod: 20m` |
| `scale-down-utilization-threshold` | Node utilization threshold for scale down (default: 0.5) | `az aks update --resource-group <rg> --name <cluster> --cluster-autoscaler-profile scale-down-utilization-threshold=0.5` | `consolidationPolicy` | Consolidation policy: `WhenEmpty` or `WhenEmptyOrUnderutilized` | `disruption:`<br>`  consolidationPolicy: WhenEmptyOrUnderutilized` |
| `scan-interval` | How often the autoscaler reevaluates the cluster (default: 10s) | `az aks update --resource-group <rg> --name <cluster> --cluster-autoscaler-profile scan-interval=10s` | N/A | NAP doesn't use periodic scans; decisions are event-driven | Not applicable |
| `skip-nodes-with-local-storage` | Prevents deleting nodes with local storage | `az aks update --resource-group <rg> --name <cluster> --cluster-autoscaler-profile skip-nodes-with-local-storage=true` | `karpenter.sh/do-not-disrupt` annotation | Blocks disruption for specific nodes or pods | `metadata:`<br>`  annotations:`<br>`    karpenter.sh/do-not-disrupt: "true"` |
| `skip-nodes-with-system-pods` | Prevents deleting nodes with system pods | `az aks update --resource-group <rg> --name <cluster> --cluster-autoscaler-profile skip-nodes-with-system-pods=true` | `karpenter.sh/do-not-disrupt` annotation | Same behavior for NAP | `metadata:`<br>`  annotations:`<br>`    karpenter.sh/do-not-disrupt: "true"` |
| `max-empty-bulk-delete` | Maximum empty nodes deleted at once (default: 10) | `az aks update --resource-group <rg> --name <cluster> --cluster-autoscaler-profile max-empty-bulk-delete=10` | `budgets` | Rate limits voluntary disruptions (percentage or absolute node count) | `disruption:`<br>`  budgets:`<br>`  - nodes: "10"` |
| `max-graceful-termination-sec` | Maximum seconds to wait for pod termination during scale down (default: 600) | `az aks update --resource-group <rg> --name <cluster> --cluster-autoscaler-profile max-graceful-termination-sec=600` | `terminationGracePeriod` | Explicitly sets the termination grace period for NAP nodes | `disruption:`<br>`  terminationGracePeriod: 600s` |
| `max-node-provision-time` | Maximum time to wait for node provisioning (default: 15m) | `az aks update --resource-group <rg> --name <cluster> --cluster-autoscaler-profile max-node-provision-time=15m` | N/A | NAP provisions nodes immediately based on pending pods | Not applicable |
| `ok-total-unready-count` / `max-total-unready-percentage` | Limits unready nodes during autoscaling | `az aks update --resource-group <rg> --name <cluster> --cluster-autoscaler-profile ok-total-unready-count=3` | `budgets` | Can enforce disruption limits during maintenance windows | `disruption:`<br>`  budgets:`<br>`  - nodes: "20%"` |
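The disruption settings mapped in the table combine into a single `disruption` block on a NodePool. The following is an illustrative sketch, not a recommended configuration; the name and values are examples only:

```yaml
# Illustrative NodePool disruption configuration; values are examples, not recommendations.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: example
spec:
  template:
    spec:
      nodeClassRef:
        name: default
        group: karpenter.azure.com
        kind: AKSNodeClass
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized  # when NAP may consolidate nodes
    consolidateAfter: 10m                          # analogous to scale-down-unneeded-time
    budgets:
      - nodes: "10%"                               # limit how many nodes are disrupted at once
```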

Note

Unlike cluster autoscaler, NAP doesn't use Azure CLI commands to manage node behavior; all decision making for NAP-managed nodes is determined by the CRDs. For more on configuring your cluster specifications for NAP, visit our NodePool documentation and AKSNodeClass documentation.

Before you begin

| Prerequisite | Notes |
| --- | --- |
| Azure subscription | If you don't have an Azure subscription, you can create a free trial account. |
| Azure CLI | Version 2.76.0 or later. To find your version, run `az --version`. For more information about installing or upgrading the Azure CLI, see Install Azure CLI. |

Limitations

See NAP limitations and unsupported features.

Disable cluster autoscaler

Pre-migration checklist

  • Confirm cluster eligibility for node auto-provisioning. For more on NAP requirements, see Overview of NAP documentation.
  • Right-size workloads for consolidation.
  • Verify your system node pool is active.
    • AKS requires a system node pool for system components (such as CoreDNS and Karpenter). When NAP is enabled, AKS is responsible for autoscaling the system pool.
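As a quick check that a system pool is active, you can list your node pools filtered by mode. This is a sketch; `<rg>` and `<cluster>` are placeholders for your resource group and cluster name:

```shell
# List node pools and confirm at least one pool has mode "System".
az aks nodepool list \
  --resource-group <rg> \
  --cluster-name <cluster> \
  --query "[?mode=='System'].{name:name, count:count, mode:mode}" \
  --output table
```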

Important

If your workloads depend on custom subnets or network policies, configure them in the AKSNodeClass before migrating workloads to avoid scheduling failures. See the AKSNodeClass documentation for details.

Disable cluster autoscaler safely

If cluster autoscaler is enabled cluster-wide, disable it at the cluster level using the --disable-cluster-autoscaler flag. Nodes aren’t removed when you disable cluster autoscaler, so your capacity stays steady.

az aks update --resource-group myResourceGroup --name myAKSCluster --disable-cluster-autoscaler

If cluster autoscaler is only enabled on select node pools, disable cluster autoscaler for specific node pools using the --disable-cluster-autoscaler flag.

# Disable CAS on a specific pool
az aks nodepool update \
  --resource-group myResourceGroup \
  --cluster-name myAKSCluster \
  --name mypool1 \
  --disable-cluster-autoscaler
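To find which pools still have cluster autoscaler enabled before disabling it pool by pool, you can query the node pool list. This is a sketch; `enableAutoScaling` is the field name assumed from the CLI output:

```shell
# Print the names of node pools that still have cluster autoscaler enabled.
az aks nodepool list \
  --resource-group myResourceGroup \
  --cluster-name myAKSCluster \
  --query "[?enableAutoScaling].name" \
  --output tsv
```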

You can also set the node count of your node pool to a pinned count as you begin the migration to node auto-provisioning. The following az aks nodepool scale command pins the node count of node pool mypool1 in cluster myAKSCluster to five (5).

# (Optional) Pin to a safe desired count before the switch
az aks nodepool scale \
  --resource-group myResourceGroup \
  --cluster-name myAKSCluster \
  --name mypool1 \
  --node-count 5

Enable node auto-provisioning

Enable node auto-provisioning on an existing cluster

Enable node auto-provisioning on an existing cluster using the az aks update command and set --node-provisioning-mode to Auto.

az aks update --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP_NAME --node-provisioning-mode Auto
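To confirm the change took effect, you can inspect the cluster's node provisioning profile. This is a sketch; the `nodeProvisioningProfile.mode` field name is an assumption based on the managed cluster API:

```shell
# Should print "Auto" once node auto-provisioning is enabled.
az aks show \
  --resource-group $RESOURCE_GROUP_NAME \
  --name $CLUSTER_NAME \
  --query nodeProvisioningProfile.mode \
  --output tsv
```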

Define your first NodePool and AKSNodeClass

After enabling node auto-provisioning on your cluster, create a basic NodePool and AKSNodeClass to start provisioning nodes. NAP uses these custom resources to define the types of nodes it provisions for your workloads.

This example creates a basic NodePool that:

  • Supports on-demand instances
  • Uses D series VMs
  • Sets a CPU limit of 100
  • Enables consolidation when nodes are empty or underutilized
# nodepool-default.yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    metadata:
      labels:
        intent: apps
    spec:
      nodeClassRef:
        name: default
        group: karpenter.azure.com
        kind: AKSNodeClass
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: [on-demand]
        - key: karpenter.azure.com/sku-family
          operator: In
          values: [D]
  limits:
    cpu: 100
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 0s
    expireAfter: Never
---
apiVersion: karpenter.azure.com/v1beta1
kind: AKSNodeClass
metadata:
  name: default
  annotations:
    kubernetes.io/description: "General purpose AKSNodeClass for running Ubuntu nodes"
spec:
  imageFamily: Ubuntu

You can now deploy the custom resources to your cluster with the following kubectl command:

kubectl apply -f nodepool-default.yaml
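To exercise the new NodePool, you can deploy a small test workload that selects the `intent: apps` label from the NodePool template. The Deployment name, image, and resource requests below are illustrative assumptions, not part of the guide's required setup:

```yaml
# test-workload.yaml - illustrative Deployment to trigger NAP provisioning.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nap-test
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nap-test
  template:
    metadata:
      labels:
        app: nap-test
    spec:
      nodeSelector:
        intent: apps        # matches the label set in the example NodePool template
      containers:
        - name: pause
          image: mcr.microsoft.com/oss/kubernetes/pause:3.6
          resources:
            requests:
              cpu: "1"      # request enough CPU so pending pods force new nodes
              memory: 1Gi
```

If the pods go Pending, NAP should create NodeClaims and provision matching nodes within a few minutes.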

Migrate workloads from fixed pools to node auto-provisioning managed nodes

Note

Consider setting node affinity that matches the specifications in NAP's NodePool and AKSNodeClass CRDs. This ensures your workloads tolerate the node types you defined NAP to provision and are scheduled onto NAP-managed nodes when desired. See the AKS node selector and affinity documentation for best practices.
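For example, a pod spec fragment that requires nodes carrying the `intent: apps` label from this guide's example NodePool might look like the following (the label key and value come from the earlier example, not from any required naming):

```yaml
# Pod spec fragment: schedule only onto nodes labeled by the example NodePool.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: intent
              operator: In
              values: ["apps"]
```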

Now scale down user pools gradually (keep the system pool):

# For each user pool, step down to 0 (this command should respect properly set PDBs)
az aks nodepool scale \
  --resource-group <RG> \
  --cluster-name <CLUSTER> \
  --name <USER_POOL> \
  --node-count 0

As pods are evicted, node auto-provisioning provisions replacement nodes per your NodePool and AKSNodeClass rules. Only user pools can be scaled to zero (not the system pool), and only while cluster autoscaler is disabled on them, which you did in an earlier step.

Note

We recommend scaling down gradually in waves, watching replica counts and PDBs to avoid dips in availability.
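During each wave, a couple of watch commands help confirm availability holds. This is a sketch using standard kubectl flags:

```shell
# Watch PodDisruptionBudgets across namespaces; allowed disruptions dropping to 0
# signals a workload that can't tolerate more evictions right now.
kubectl get pdb --all-namespaces --watch
```

In a second terminal, `kubectl get deployments --all-namespaces --watch` shows replica readiness while pods move onto NAP-managed nodes.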

To confirm that the scale down is working and workloads are being scheduled to NAP-managed nodes safely, check:

  • Custom resource definition files are active
  • Karpenter events detailing NAP decisions
  • Nodeclaims are created in response to pending pod pressure

Verify node auto-provisioning

Check CRDs and understand NAP fields

Check CRDs to confirm they are in use:

# Verify CRDs
kubectl get crd | grep karpenter

View field descriptions with the kubectl explain command:

# Describe the fields of the NodePool spec
kubectl explain nodepool.spec

Confirm new NAP-managed nodes are being created

To ensure that NAP is properly provisioning new nodes in response to pending pod pressure, verify that the new nodes are being created. Node auto-provisioning produces cluster events that you can use to monitor deployment and scheduling decisions. View events through the Kubernetes events stream.

kubectl get events -A --field-selector source=karpenter -w

Alternatively, view the NodeClaims that represent the nodes being created:

kubectl get nodeclaims

A populated list confirms NAP is responding to pending pod pressure.
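You can also list the nodes NAP created. Karpenter-provisioned nodes carry the `karpenter.sh/nodepool` label (label name per upstream Karpenter):

```shell
# List only NAP-managed nodes, with a column showing which NodePool created each.
kubectl get nodes \
  --selector karpenter.sh/nodepool \
  --label-columns karpenter.sh/nodepool
```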

Clean up old autoscaling

  • If you're using managed AKS cluster autoscaler only, cluster autoscaler is already disabled with the above steps.
  • If you're using a self-hosted cluster autoscaler installed in kube-system, scale its deployment to zero and then remove it.
kubectl -n kube-system scale deploy/cluster-autoscaler --replicas=0
kubectl -n kube-system delete deploy/cluster-autoscaler

Fine-tune node auto-provisioning post-migration

After you complete your migration, you can fine-tune your cluster with these capabilities.

  • Manage disruption behavior - Tune the `consolidationPolicy` and `consolidateAfter` disruption settings to balance cost against virtual machine churn. See the NAP Disruption documentation.
  • Multiple NodePools - Split by workload class (for example, Spot vs On-Demand, GPU vs CPU) and use requirements, weights, and taints to control placement. See the NAP NodePool documentation.
  • Networking - For more information on managing networking with custom virtual networks, see the NAP networking documentation.
  • Observability - Stream Karpenter events and expose NAP control-plane metrics via Azure Monitor managed Prometheus. See the NAP observability documentation.
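As an example of splitting NodePools by workload class, a second NodePool for spot capacity might look like the following sketch. The name, taint, and limits are illustrative assumptions:

```yaml
# nodepool-spot.yaml - illustrative spot-capacity NodePool.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: spot
spec:
  template:
    spec:
      nodeClassRef:
        name: default
        group: karpenter.azure.com
        kind: AKSNodeClass
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: [spot]
      taints:
        - key: workload-class
          value: interruptible
          effect: NoSchedule   # only pods tolerating spot interruptions land here
  limits:
    cpu: 50
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 30s
```

Pods that should run on spot capacity then add a matching toleration and, optionally, a `karpenter.sh/capacity-type` node affinity.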

Next steps

For more information on node auto-provisioning in AKS, see the following articles: