Migrate your existing Azure Kubernetes Service (AKS) cluster from cluster autoscaler to node auto-provisioning using the steps in this guide.
Node auto-provisioning (NAP) uses pending pod resource requirements to decide the optimal virtual machine (VM) configuration to run those workloads in the most efficient and cost-effective manner.
Node auto-provisioning is based on the open-source Karpenter project and the AKS Karpenter provider. Node auto-provisioning automatically deploys, configures, and manages Karpenter on your AKS clusters.
Cluster autoscaler vs. node auto-provisioning
Why migrate from cluster autoscaler to node auto-provisioning
Node auto-provisioning improves bin-packing, automates node lifecycle management, and reduces operational overhead compared to cluster autoscaler.
| Reason to Migrate | Cluster Autoscaler (CAS) | Node Auto-Provisioning (NAP) |
|---|---|---|
| VM Size Flexibility | Preexisting node pools with single VM size per pool | Dynamic provisioning of mixed VM sizes for cost/performance balance |
| Cost Optimization | Adds/removes nodes in pools; risk of underutilization | Intelligent bin-packing reduces fragmentation and lowers costs |
| Management Overhead | Requires manual tuning of CAS profiles | Fully managed experience integrated with AKS |
| Lifecycle Management | Basic scale-up/scale-down only | Advanced node lifecycle optimization; manage node updates, disruption + more |
| Future Feature Development | Cluster autoscaler is maintained, with minimal feature enhancements | Continuous active development and new feature enhancements |
Cluster autoscaler profile settings vs. node auto-provisioning configuration settings
The following table maps cluster autoscaler profile settings to the equivalent node auto-provisioning configuration settings on the NodePool CRD. It also shows each cluster autoscaler Azure CLI command alongside its NAP YAML equivalent.
| Cluster Autoscaler Profile Setting | Description | CAS CLI Example | NAP Disruption Setting | Description | NAP YAML Example |
|---|---|---|---|---|---|
| `balance-similar-node-groups` | Balances node pools across zones | `az aks update --resource-group <rg> --name <cluster> --cluster-autoscaler-profile balance-similar-node-groups=true` | N/A | NAP uses Karpenter's provisioning logic; no direct equivalent | `# Not applicable in NAP` |
| `expander` | Strategy for selecting the node pool for scale-up | `az aks update --resource-group <rg> --name <cluster> --cluster-autoscaler-profile expander=least-waste` | N/A | NAP dynamically provisions optimal VM sizes; no expander concept | `# Not applicable in NAP` |
| `scale-down-unneeded-time` | Time a node must be unneeded before it's eligible for scale down (default: 10m) | `az aks update --resource-group <rg> --name <cluster> --cluster-autoscaler-profile scale-down-unneeded-time=10m` | `consolidateAfter` | Time NAP waits after discovering a consolidation opportunity before disrupting the node | `disruption: consolidateAfter: 10m` |
| `scale-down-unready-time` | Time an unready node must be unneeded before it's eligible for scale down (default: 20m) | `az aks update --resource-group <rg> --name <cluster> --cluster-autoscaler-profile scale-down-unready-time=20m` | `terminationGracePeriod` | Grace period for pod termination before node removal | `disruption: terminationGracePeriod: 20m` |
| `scale-down-utilization-threshold` | Node utilization threshold for scale down (default: 0.5) | `az aks update --resource-group <rg> --name <cluster> --cluster-autoscaler-profile scale-down-utilization-threshold=0.5` | `consolidationPolicy` | Policy for consolidation: `WhenEmpty` or `WhenEmptyOrUnderutilized` | `disruption: consolidationPolicy: WhenEmptyOrUnderutilized` |
| `scan-interval` | How often the autoscaler reevaluates the cluster (default: 10s) | `az aks update --resource-group <rg> --name <cluster> --cluster-autoscaler-profile scan-interval=10s` | N/A | NAP doesn't use periodic scans; decisions are event-driven | `# Not applicable in NAP` |
| `skip-nodes-with-local-storage` | Prevents deleting nodes with local storage | `az aks update --resource-group <rg> --name <cluster> --cluster-autoscaler-profile skip-nodes-with-local-storage=true` | `karpenter.sh/do-not-disrupt` annotation | Blocks disruption for specific nodes or pods | `metadata: annotations: karpenter.sh/do-not-disrupt: "true"` |
| `skip-nodes-with-system-pods` | Prevents deleting nodes with system pods | `az aks update --resource-group <rg> --name <cluster> --cluster-autoscaler-profile skip-nodes-with-system-pods=true` | `karpenter.sh/do-not-disrupt` annotation | Same behavior for NAP | `metadata: annotations: karpenter.sh/do-not-disrupt: "true"` |
| `max-empty-bulk-delete` | Maximum number of empty nodes deleted at once (default: 10) | `az aks update --resource-group <rg> --name <cluster> --cluster-autoscaler-profile max-empty-bulk-delete=10` | `budgets` | Rate limits voluntary disruptions (percentage or absolute node count) | `disruption: budgets: - nodes: "10"` |
| `max-graceful-termination-sec` | Maximum seconds to wait for pod termination during scale down (default: 600) | `az aks update --resource-group <rg> --name <cluster> --cluster-autoscaler-profile max-graceful-termination-sec=600` | `terminationGracePeriod` | Explicitly sets the termination grace period for NAP nodes | `disruption: terminationGracePeriod: 600s` |
| `max-node-provision-time` | Maximum time to wait for node provisioning (default: 15m) | `az aks update --resource-group <rg> --name <cluster> --cluster-autoscaler-profile max-node-provision-time=15m` | N/A | NAP provisions nodes immediately based on pending pods | `# Not applicable in NAP` |
| `ok-total-unready-count` / `max-total-unready-percentage` | Limits unready nodes during autoscaling | `az aks update --resource-group <rg> --name <cluster> --cluster-autoscaler-profile ok-total-unready-count=3` | `budgets` | Can enforce disruption limits during maintenance windows | `disruption: budgets: - nodes: "20%"` |
Note
Unlike cluster autoscaler, NAP doesn't use Azure CLI commands to manage node behavior, so all decision-making for NAP-managed nodes is determined by the CRDs. For more on configuring your cluster specifications for NAP, see the NodePool documentation and AKSNodeClass documentation.
Before you begin
| Prerequisite | Notes |
|---|---|
| Azure Subscription | If you don't have an Azure subscription, you can create a free trial account. |
| Azure CLI | 2.76.0 or later. To find the version, run az --version. For more information about installing or upgrading the Azure CLI, see Install Azure CLI. |
Limitations
See NAP limitations and unsupported features.
Disable cluster autoscaler
Pre-migration checklist
- Confirm cluster eligibility for node auto-provisioning. For more on NAP requirements, see Overview of NAP documentation.
- Right-size workloads for consolidation.
- Set proper resource requests/limits, replicas, and pod disruption budgets (PDBs) to allow for a gradual migration. This migration method requires properly set PDBs to ensure well-managed disruption of your workloads; a minimal PDB example follows this list.
- Verify your system node pool is active.
- AKS requires a system node pool for system components (such as CoreDNS and Karpenter). When NAP is enabled, AKS is responsible for autoscaling the system pool.
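For example, here's a minimal PodDisruptionBudget that keeps at least two replicas available during voluntary disruptions such as NAP consolidation. The PDB name and the app: my-app label are placeholders for your own workload:

# pdb-example.yaml - minimal PDB for a hypothetical workload labeled app: my-app
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  minAvailable: 2          # always keep at least two replicas running
  selector:
    matchLabels:
      app: my-app          # placeholder; match your Deployment's pod labels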
Important
If your workloads depend on custom subnets or network policies, configure custom subnets or network policies in the AKSNodeClass before migrating workloads to avoid scheduling failures. See the AKSNodeClass documentation for details.
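As a sketch, an AKSNodeClass pinned to a custom subnet might look like the following. This assumes the vnetSubnetID field supported by the Azure Karpenter provider's AKSNodeClass; the subnet resource ID is a placeholder:

# aksnodeclass-custom-subnet.yaml - sketch: provision nodes into a custom subnet
apiVersion: karpenter.azure.com/v1beta1
kind: AKSNodeClass
metadata:
  name: custom-subnet
spec:
  imageFamily: Ubuntu
  # Placeholder subnet resource ID; replace with your own
  vnetSubnetID: /subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Network/virtualNetworks/<vnet>/subnets/<subnet>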
Disable cluster autoscaler safely
If cluster autoscaler is enabled cluster-wide, disable it at the cluster level using the --disable-cluster-autoscaler flag. Nodes aren’t removed when you disable cluster autoscaler, so your capacity stays steady.
az aks update --resource-group myResourceGroup --name myAKSCluster --disable-cluster-autoscaler
If cluster autoscaler is only enabled on select node pools, disable cluster autoscaler for specific node pools using the --disable-cluster-autoscaler flag.
# Disable CAS on a specific pool
az aks nodepool update \
--resource-group myResourceGroup \
--cluster-name myAKSCluster \
--name mypool1 \
--disable-cluster-autoscaler
You can also pin the node count of your node pool to a fixed value as you begin the migration to node auto-provisioning. The following az aks nodepool scale command pins the node count of node pool mypool1 in cluster myAKSCluster at five nodes.
# (Optional) Pin to a safe desired count before the switch
az aks nodepool scale \
--resource-group myResourceGroup \
--cluster-name myAKSCluster \
--name mypool1 \
--node-count 5
Enable node auto-provisioning
Enable node auto-provisioning on an existing cluster
Enable node auto-provisioning on an existing cluster using the az aks update command and set --node-provisioning-mode to Auto.
az aks update --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP_NAME --node-provisioning-mode Auto
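To confirm the change took effect, you can query the cluster's provisioning mode; this assumes the nodeProvisioningProfile property returned by recent AKS API versions:

# Verify the node provisioning mode is Auto
az aks show --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP_NAME --query "nodeProvisioningProfile.mode" --output tsv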
Define your first NodePool and AKSNodeClass
After enabling node auto-provisioning on your cluster, create a basic NodePool and AKSNodeClass to start provisioning nodes. These custom resource definition (CRD) files are used by NAP to define the types of nodes provisioned for your workloads.
This example creates a basic NodePool that:
- Supports on-demand instances
- Uses D series VMs
- Sets a CPU limit of 100
- Enables consolidation when nodes are empty or underutilized
# nodepool-default.yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    metadata:
      labels:
        intent: apps
    spec:
      nodeClassRef:
        name: default
        group: karpenter.azure.com
        kind: AKSNodeClass
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: [on-demand]
        - key: karpenter.azure.com/sku-family
          operator: In
          values: [D]
  limits:
    cpu: 100
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 0s
    expireAfter: Never
---
apiVersion: karpenter.azure.com/v1beta1
kind: AKSNodeClass
metadata:
  name: default
  annotations:
    kubernetes.io/description: "General purpose AKSNodeClass for running Ubuntu nodes"
spec:
  imageFamily: Ubuntu
You can now deploy the custom resources to your cluster with the following kubectl command:
kubectl apply -f nodepool-default.yaml
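To confirm the resources were created, list them by kind; the plural resource names below are those registered by the Karpenter CRDs:

# Confirm the NodePool and AKSNodeClass resources exist
kubectl get nodepools,aksnodeclasses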
Migrate workloads from fixed pools to node auto-provisioning managed nodes
Note
Consider setting node affinity that matches the specifications in NAP's NodePool and AKSNodeClass CRDs. This ensures your workloads tolerate the types of nodes you configured NAP to provision and are scheduled onto NAP-managed nodes when desired. See the AKS node selector and affinity documentation for best practices.
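For example, the following Deployment sketch targets the intent: apps label applied by the NodePool template defined earlier; the Deployment name and image are placeholders:

# deployment-affinity-example.yaml - sketch: require nodes carrying the
# intent: apps label set by the default NodePool above
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-app            # placeholder name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: sample-app
  template:
    metadata:
      labels:
        app: sample-app
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: intent
                    operator: In
                    values: [apps]
      containers:
        - name: sample-app
          image: mcr.microsoft.com/oss/kubernetes/pause:3.6   # placeholder image
          resources:
            requests:           # set real requests so NAP can size nodes correctly
              cpu: 100m
              memory: 128Mi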
Now scale down user pools gradually (keep the system pool):
# For each user pool, step down to 0 (this command should respect properly set PDBs)
az aks nodepool scale \
--resource-group <RG> \
--cluster-name <CLUSTER> \
--name <USER_POOL> \
--node-count 0
As pods are evicted, node auto-provisioning provisions replacement nodes per your NodePool and AKSNodeClass rules. Remember that only user pools (not the system pool) can be scaled to zero, and only with cluster autoscaler disabled, which you did in an earlier step.
Note
We recommend scaling down gradually in waves, watching replicas and PDBs to avoid dips in availability.
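As an illustration, one wave might look like the following; the intermediate count is arbitrary, and the middle step is where you confirm pods reschedule before continuing:

# Illustrative wave: step down, confirm pods reschedule onto NAP nodes, then continue
az aks nodepool scale --resource-group <RG> --cluster-name <CLUSTER> --name <USER_POOL> --node-count 3
kubectl get pods -A --field-selector status.phase=Pending
az aks nodepool scale --resource-group <RG> --cluster-name <CLUSTER> --name <USER_POOL> --node-count 0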
To confirm that the scale down is working and workloads are being scheduled to NAP-managed nodes safely, check:
- Custom resource definition files are active
- Karpenter events detailing NAP decisions
- NodeClaims are created in response to pending pod pressure
Verify node auto-provisioning
Check CRDs and understand NAP fields
Check CRDs to confirm they are in use:
# Verify CRDs
kubectl get crd | grep karpenter
View field descriptions with the kubectl explain command:
# Describe the fields of the NodePool spec
kubectl explain nodepool.spec
Confirm new NAP-managed nodes are being created
To ensure that NAP is properly provisioning new nodes in response to pending pod pressure, verify that the new nodes are being created. Node auto-provisioning produces cluster events that you can use to monitor deployment and scheduling decisions. View events through the Kubernetes events stream.
kubectl get events -A --field-selector source=karpenter -w
Alternatively, view the NodeClaims that represent the nodes being created:
kubectl get nodeclaims
A populated list confirms NAP is responding to pending pod pressure.
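You can also list the provisioned nodes directly; this assumes the karpenter.sh/nodepool label that Karpenter applies to the nodes it creates:

# List only NAP-managed nodes by filtering on the Karpenter NodePool label
kubectl get nodes -l karpenter.sh/nodepool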
Clean up old autoscaling
- If you're using only the managed AKS cluster autoscaler, it's already disabled by the preceding steps.
- If you're running a self-hosted cluster autoscaler in kube-system, scale its deployment to zero and remove it:
kubectl -n kube-system scale deploy/cluster-autoscaler --replicas=0
kubectl -n kube-system delete deploy/cluster-autoscaler
Fine-tune node auto-provisioning post-migration
After you complete your migration, you can fine-tune your cluster with these capabilities.
- Manage disruption behavior - Tune the disruption `consolidationPolicy` and `consolidateAfter` windows to balance cost against virtual machine churn; a sketch follows this list. See the NAP disruption documentation.
- Multiple NodePools - Split by workload class (for example, Spot vs. On-Demand, GPU vs. CPU) and use requirements, weights, and taints to control placement. See the NAP NodePool documentation.
- Networking - For more information on managing networking with custom virtual networks, see the NAP networking documentation.
- Observability - Stream Karpenter events and expose NAP control-plane metrics via Azure Monitor managed Prometheus. See the NAP observability documentation.
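As a starting point, here's a sketch of a tuned NodePool disruption block; the values are illustrative, not recommendations:

# Sketch: NodePool disruption settings tuned for lower churn
disruption:
  consolidationPolicy: WhenEmptyOrUnderutilized
  consolidateAfter: 5m        # wait five minutes before consolidating underutilized nodes
  budgets:
    - nodes: "20%"            # disrupt at most 20% of this NodePool's nodes at a time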
Next steps
For more information on node auto-provisioning in AKS, see the following articles:
- Use node auto-provisioning in a custom virtual network
- Configure networking for node auto-provisioning on AKS
- Configure node pools for node auto-provisioning on AKS
- Configure disruption policies for node auto-provisioning on AKS
- Upgrade node images for node auto-provisioning on AKS
- Enable/Disable node auto-provisioning on AKS