Stop and start an Azure Kubernetes Service (AKS) cluster

You may not need to continuously run your Azure Kubernetes Service (AKS) workloads. For example, you may have a development cluster that you only use during business hours. This means there are times where your cluster might be idle, running nothing more than the system components. You can reduce the cluster footprint by scaling all User node pools to 0, but your System pool is still required to run the system components while the cluster is running.

To better optimize your costs during these periods, you can turn off, or stop, your cluster. This action stops your control plane and agent nodes, allowing you to save on all the compute costs, while maintaining all objects except standalone pods. The cluster state is stored for when you start it again, allowing you to pick up where you left off.

Caution

Stopping your cluster deallocates the control plane and releases the capacity. In regions experiencing capacity constraints, customers may be unable to start a stopped cluster. We do not recommend stopping mission critical workloads for this reason.

Note

AKS start operations will restore all objects from ETCD with the exception of standalone pods with the same names and ages. meaning that a pod's age will continue to be calculated from its original creation time. This count will keep increasing over time, regardless of whether the cluster is in a stopped state.

Before you begin

This article assumes you have an existing AKS cluster. If you need an AKS cluster, you can create one using Azure CLI, Azure PowerShell, or the Azure portal.

About the cluster stop/start feature

When using the cluster stop/start feature, the following conditions apply:

  • This feature is only supported for Virtual Machine Scale Set backed clusters.
  • You can't stop clusters which use the Node Autoprovisioning (NAP) feature.
  • The cluster state of a stopped AKS cluster is preserved for up to 12 months. If your cluster is stopped for more than 12 months, you can't recover the state. For more information, see the AKS support policies.
  • You can only perform start or delete operations on a stopped AKS cluster. To perform other operations, like scaling or upgrading, you need to start your cluster first.
  • If you provisioned PrivateEndpoints linked to private clusters, they need to be deleted and recreated again when starting a stopped AKS cluster.
  • Because the stop process drains all nodes, any standalone pods (i.e. pods not managed by a Deployment, StatefulSet, DaemonSet, Job, etc.) will be deleted.
  • When you start your cluster back up, the following behavior is expected:
    • The IP address of your API server may change.
    • If you're using cluster autoscaler, when you start your cluster, your current node count may not be between the min and max range values you set. The cluster starts with the number of nodes it needs to run its workloads, which isn't impacted by your autoscaler settings. When your cluster performs scaling operations, the min and max values will impact your current node count, and your cluster will eventually enter and remain in that desired range until you stop your cluster.

Stop an AKS cluster

  1. Use the az aks stop command to stop a running AKS cluster, including the nodes and control plane. The following example stops a cluster named myAKSCluster:

    az aks stop --name myAKSCluster --resource-group myResourceGroup
    
  2. Verify your cluster has stopped using the az aks show command and confirming the powerState shows as Stopped.

    az aks show --name myAKSCluster --resource-group myResourceGroup
    

    Your output should look similar to the following condensed example output:

    {
    [...]
      "nodeResourceGroup": "MC_myResourceGroup_myAKSCluster_chinaeast2",
      "powerState":{
        "code":"Stopped"
      },
      "privateFqdn": null,
      "provisioningState": "Succeeded",
      "resourceGroup": "myResourceGroup",
    [...]
    }
    

    If the provisioningState shows Stopping, your cluster hasn't fully stopped yet.

Important

If you're using pod disruption budgets, the stop operation can take longer, as the drain process will take more time to complete.

Start an AKS cluster

Caution

After utilizing the start/stop feature on AKS, it is essential to wait 15-30 minutes before restarting your AKS cluster. This waiting period is necessary because it takes several minutes for the relevant services to fully stop. Attempting to restart your cluster during this process can disrupt the shutdown process and potentially cause issues with the cluster or its workloads.

  1. Use the az aks start command to start a stopped AKS cluster. The cluster restarts with the previous control plane state and number of agent nodes. The following example starts a cluster named myAKSCluster:

    az aks start --name myAKSCluster --resource-group myResourceGroup
    
  2. Verify your cluster has started using the az aks show command and confirming the powerState shows Running.

    az aks show --name myAKSCluster --resource-group myResourceGroup
    

    Your output should look similar to the following condensed example output:

    {
    [...]
      "nodeResourceGroup": "MC_myResourceGroup_myAKSCluster_chinaeast2",
      "powerState":{
        "code":"Running"
     },
     "privateFqdn": null,
     "provisioningState": "Succeeded",
     "resourceGroup": "myResourceGroup",
    [...]
    }
    

    If the provisioningState shows Starting, your cluster hasn't fully started yet.

Next steps