Understand Azure Policy for Kubernetes clusters

Azure Policy extends Gatekeeper v3, an admission controller webhook for Open Policy Agent (OPA), to apply at-scale enforcements and safeguards on your cluster components in a centralized, consistent manner. Cluster components include pods, containers, and namespaces.

Azure Policy makes it possible to manage and report on the compliance state of your Kubernetes cluster components from one place. By using Azure Policy's Add-on or Extension, governing your cluster components is enhanced with Azure Policy features, like the ability to use selectors and overrides for safe policy rollout and rollback.

Azure Policy for Kubernetes supports the following cluster environments:

Important

The Azure Policy Add-on Helm model and the add-on for AKS Engine have been deprecated. Follow the instructions to remove the add-ons.

Overview

By installing Azure Policy's add-on or extension on your Kubernetes clusters, Azure Policy enacts the following functions:

  • Checks with Azure Policy service for policy assignments to the cluster.
  • Deploys policy definitions into the cluster as constraint template and constraint custom resources or as a mutation template resource (depending on policy definition content).
  • Reports auditing and compliance details back to Azure Policy service.

To enable and use Azure Policy with your Kubernetes cluster, take the following actions:

  1. Configure your Kubernetes cluster and install the Azure Kubernetes Service (AKS) add-on or Azure Policy's Extension for Arc-enabled Kubernetes clusters (depending on your cluster type).

    Note

    For common issues with installation, see Troubleshoot - Azure Policy Add-on.

  2. Create or use a sample Azure Policy definition for Kubernetes

  3. Assign a definition to your Kubernetes cluster

  4. Wait for validation

  5. Logging and troubleshooting

  6. Review limitations and recommendations in our FAQ section

Install Azure Policy Add-on for AKS

The Azure Policy Add-on for AKS is part of Kubernetes version 1.27 with long term support (LTS).

Prerequisites

  1. Register the resource providers and preview features.

    • Azure portal:

      Register the Microsoft.PolicyInsights resource providers. For steps, see Resource providers and types.

    • Azure CLI:

      # Log in first with az login
      az cloud set -n AzureChinaCloud
      az login
      
      # Provider register: Register the Azure Policy provider
      az provider register --namespace Microsoft.PolicyInsights
      
  2. You need the Azure CLI version 2.12.0 or later installed and configured. To find the version, run the az --version command. If you need to install or upgrade, see How to install the Azure CLI.

  3. The AKS cluster must be a supported Kubernetes version in Azure Kubernetes Service (AKS). Use the following script to validate your AKS cluster version:

    # Log in first with az login
    az cloud set -n AzureChinaCloud
    az login
    
    # Look for the value in kubernetesVersion
    az aks list
    
  4. Open ports for the Azure Policy extension. The Azure Policy extension uses these domains and ports to fetch policy definitions and assignments and report compliance of the cluster back to Azure Policy.

    Domain Port
    data.policy.core.chinacloudapi.cn 443
    store.policy.core.chinacloudapi.cn 443
    login.chinacloudapi.cn 443
    dc.services.visualstudio.com 443

After the prerequisites are completed, install the Azure Policy Add-on in the AKS cluster you want to manage.

  • Azure portal

    1. Launch the AKS service in the Azure portal by selecting All services, then searching for and selecting Kubernetes services.

    2. Select one of your AKS clusters.

    3. Select Policies on the left side of the Kubernetes service page.

    4. In the main page, select the Enable add-on button.

  • Azure CLI

    # Log in first with az login
    az cloud set -n AzureChinaCloud
    az login
    
    az aks enable-addons --addons azure-policy --name MyAKSCluster --resource-group MyResourceGroup
    

To validate that the add-on installation was successful and that the azure-policy and gatekeeper pods are running, run the following command:

# azure-policy pod is installed in kube-system namespace
kubectl get pods -n kube-system

# gatekeeper pod is installed in gatekeeper-system namespace
kubectl get pods -n gatekeeper-system

Lastly, verify that the latest add-on is installed by running this Azure CLI command, replacing <rg> with your resource group name and <cluster-name> with the name of your AKS cluster: az aks show --query addonProfiles.azurepolicy -g <rg> -n <cluster-name>. The result should look similar to the following output for clusters using service principals:

{
  "config": null,
  "enabled": true,
  "identity": null
}

And the following output for clusters using managed identity:

 {
   "config": null,
   "enabled": true,
   "identity": {
     "clientId": "########-####-####-####-############",
     "objectId": "########-####-####-####-############",
     "resourceId": "<resource-id>"
   }
 }

Install Azure Policy Extension for Azure Arc enabled Kubernetes

Azure Policy for Kubernetes makes it possible to manage and report on the compliance state of your Kubernetes clusters from one place. With Azure Policy's Extension for Arc-enabled Kubernetes clusters, you can govern your Arc-enabled Kubernetes cluster components, like pods and containers.

This article describes how to create, show extension status, and delete the Azure Policy for Kubernetes extension.

For an overview of the extensions platform, see Azure Arc cluster extensions.

Prerequisites

If you already deployed Azure Policy for Kubernetes on an Azure Arc cluster using Helm directly without extensions, follow the instructions to delete the Helm chart. After the deletion is done, you can then proceed.

  1. Ensure your Kubernetes cluster is a supported distribution.

    Note

    Azure Policy for Arc extension is supported on the following Kubernetes distributions.

  2. Ensure you met all the common prerequisites for Kubernetes extensions listed here including connecting your cluster to Azure Arc.

    Note

    Azure Policy extension is supported for Arc enabled Kubernetes clusters in these regions.

  3. Open ports for the Azure Policy extension. The Azure Policy extension uses these domains and ports to fetch policy definitions and assignments and report compliance of the cluster back to Azure Policy.

    Domain Port
    data.policy.core.chinacloudapi.cn 443
    store.policy.core.chinacloudapi.cn 443
    login.chinacloudapi.cn 443
    dc.services.visualstudio.com 443
  4. Before you install the Azure Policy extension or enabling any of the service features, your subscription must enable the Microsoft.PolicyInsights resource providers.

    Note

    To enable the resource provider, follow the steps in Resource providers and types or run either the Azure CLI or Azure PowerShell command.

    • Azure CLI

      # Log in first with az login
      # Provider register: Register the Azure Policy provider
      az provider register --namespace 'Microsoft.PolicyInsights'
      
    • Azure PowerShell

      # Log in first with Connect-AzAccount -Environment AzureChinaCloud
      
      # Provider register: Register the Azure Policy provider
      Register-AzResourceProvider -ProviderNamespace 'Microsoft.PolicyInsights'
      

Create Azure Policy extension

Note

Note the following for Azure Policy extension creation:

  • Auto-upgrade is enabled by default which will update Azure Policy extension minor version if any new changes are deployed.
  • Any proxy variables passed as parameters to connectedk8s will be propagated to the Azure Policy extension to support outbound proxy.

To create an extension instance, for your Arc enabled cluster, run the following command substituting <> with your values:

az k8s-extension create --cluster-type connectedClusters --cluster-name <CLUSTER_NAME> --resource-group <RESOURCE_GROUP> --extension-type Microsoft.PolicyInsights --name <EXTENSION_INSTANCE_NAME>

Example:

az k8s-extension create --cluster-type connectedClusters --cluster-name my-test-cluster --resource-group my-test-rg --extension-type Microsoft.PolicyInsights --name azurepolicy

Example Output:

{
  "aksAssignedIdentity": null,
  "autoUpgradeMinorVersion": true,
  "configurationProtectedSettings": {},
  "configurationSettings": {},
  "customLocationSettings": null,
  "errorInfo": null,
  "extensionType": "microsoft.policyinsights",
  "id": "/subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/my-test-rg/providers/Microsoft.Kubernetes/connectedClusters/my-test-cluster/providers/Microsoft.KubernetesConfiguration/extensions/azurepolicy",
 "identity": {
    "principalId": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    "tenantId": null,
    "type": "SystemAssigned"
  },
  "location": null,
  "name": "azurepolicy",
  "packageUri": null,
  "provisioningState": "Succeeded",
  "releaseTrain": "Stable",
  "resourceGroup": "my-test-rg",
  "scope": {
    "cluster": {
      "releaseNamespace": "kube-system"
    },
    "namespace": null
  },
  "statuses": [],
  "systemData": {
    "createdAt": "2021-10-27T01:20:06.834236+00:00",
    "createdBy": null,
    "createdByType": null,
    "lastModifiedAt": "2021-10-27T01:20:06.834236+00:00",
    "lastModifiedBy": null,
    "lastModifiedByType": null
  },
  "type": "Microsoft.KubernetesConfiguration/extensions",
  "version": "1.1.0"
}

Show Azure Policy extension

To check the extension instance creation was successful, and inspect extension metadata, run the following command substituting <> with your values:

az k8s-extension show --cluster-type connectedClusters --cluster-name <CLUSTER_NAME> --resource-group <RESOURCE_GROUP> --name <EXTENSION_INSTANCE_NAME>

Example:

az k8s-extension show --cluster-type connectedClusters --cluster-name my-test-cluster --resource-group my-test-rg --name azurepolicy

To validate that the extension installation was successful and that the azure-policy and gatekeeper pods are running, run the following command:

# azure-policy pod is installed in kube-system namespace
kubectl get pods -n kube-system

# gatekeeper pod is installed in gatekeeper-system namespace
kubectl get pods -n gatekeeper-system

Delete Azure Policy extension

To delete the extension instance, run the following command substituting <> with your values:

az k8s-extension delete --cluster-type connectedClusters --cluster-name <CLUSTER_NAME> --resource-group <RESOURCE_GROUP> --name <EXTENSION_INSTANCE_NAME>

Create a policy definition

The Azure Policy language structure for managing Kubernetes follows that of existing policy definitions. There are sample definition files available to assign in Azure Policy's built-in policy library that can be used to govern your cluster components.

Azure Policy for Kubernetes also support custom definition creation at the component-level for both Azure Kubernetes Service clusters and Azure Arc-enabled Kubernetes clusters. Constraint template and mutation template samples are available in the Gatekeeper community library.

With a Resource Provider mode of Microsoft.Kubernetes.Data, the effects audit, deny, disabled, and mutate are used to manage your Kubernetes clusters.

Audit and deny must provide details properties specific to working with OPA Constraint Framework and Gatekeeper v3.

As part of the details.templateInfo or details.constraintInfo properties in the policy definition, Azure Policy passes the URI or Base64Encoded value of these CustomResourceDefinitions(CRD) to the add-on. Rego is the language that OPA and Gatekeeper support to validate a request to the Kubernetes cluster. By supporting an existing standard for Kubernetes management, Azure Policy makes it possible to reuse existing rules and pair them with Azure Policy for a unified cloud compliance reporting experience. For more information, see What is Rego?.

Assign a policy definition

To assign a policy definition to your Kubernetes cluster, you must be assigned the appropriate Azure role-based access control (Azure RBAC) policy assignment operations. The Azure built-in roles Resource Policy Contributor and Owner have these operations. To learn more, see Azure RBAC permissions in Azure Policy.

Find the built-in policy definitions for managing your cluster using the Azure portal with the following steps. If using a custom policy definition, search for it by name or the category that you created it with.

  1. Start the Azure Policy service in the Azure portal. Select All services in the left pane and then search for and select Policy.

  2. In the left pane of the Azure Policy page, select Definitions.

  3. From the Category dropdown list box, use Select all to clear the filter and then select Kubernetes.

  4. Select the policy definition, then select the Assign button.

  5. Set the Scope to the management group, subscription, or resource group of the Kubernetes cluster where the policy assignment applies.

    Note

    When assigning the Azure Policy for Kubernetes definition, the Scope must include the cluster resource.

  6. Give the policy assignment a Name and Description that you can use to identify it easily.

  7. Set the Policy enforcement to one of the following values:

    • Enabled - Enforce the policy on the cluster. Kubernetes admission requests with violations are denied.

    • Disabled - Don't enforce the policy on the cluster. Kubernetes admission requests with violations aren't denied. Compliance assessment results are still available. When you roll out new policy definitions to running clusters, Disabled option is helpful for testing the policy definition as admission requests with violations aren't denied.

  8. Select Next.

  9. Set parameter values

    • To exclude Kubernetes namespaces from policy evaluation, specify the list of namespaces in parameter Namespace exclusions. The recommendation is to exclude: kube-system, gatekeeper-system, and azure-arc.
  10. Select Review + create.

Alternately, use the Assign a policy - Portal quickstart to find and assign a Kubernetes policy. Search for a Kubernetes policy definition instead of the sample audit vms.

Important

Built-in policy definitions are available for Kubernetes clusters in category Kubernetes. For a list of built-in policy definitions, see Kubernetes samples.

Policy evaluation

The add-on checks in with Azure Policy service for changes in policy assignments every 15 minutes. During this refresh cycle, the add-on checks for changes. These changes trigger creates, updates, or deletes of the constraint templates and constraints.

In a Kubernetes cluster, if a namespace has the cluster-appropriate label, the admission requests with violations aren't denied. Compliance assessment results are still available.

  • Azure Arc-enabled Kubernetes cluster: admission.policy.azure.com/ignore

Note

While a cluster admin might have permission to create and update constraint templates and constraints resources install by the Azure Policy Add-on, these aren't supported scenarios as manual updates are overwritten. Gatekeeper continues to evaluate policies that existed prior to installing the add-on and assigning Azure Policy policy definitions.

Every 15 minutes, the add-on calls for a full scan of the cluster. After gathering details of the full scan and any real-time evaluations by Gatekeeper of attempted changes to the cluster, the add-on reports the results back to Azure Policy for inclusion in compliance details like any Azure Policy assignment. Only results for active policy assignments are returned during the audit cycle. Audit results can also be seen as violations listed in the status field of the failed constraint. For details on Non-compliant resources, see Component details for Resource Provider modes.

Note

Each compliance report in Azure Policy for your Kubernetes clusters include all violations within the last 45 minutes. The timestamp indicates when a violation occurred.

Some other considerations:

  • If the cluster subscription is registered with Microsoft Defender for Cloud, then Microsoft Defender for Cloud Kubernetes policies are applied on the cluster automatically.

  • When a deny policy is applied on cluster with existing Kubernetes resources, any preexisting resource that isn't compliant with the new policy continues to run. When the non-compliant resource gets rescheduled on a different node the Gatekeeper blocks the resource creation.

  • When a cluster has a deny policy that validates resources, the user doesn't get a rejection message when creating a deployment. For example, consider a Kubernetes deployment that contains replicasets and pods. When a user executes kubectl describe deployment $MY_DEPLOYMENT, it doesn't return a rejection message as part of events. However, kubectl describe replicasets.apps $MY_DEPLOYMENT returns the events associated with rejection.

Note

Init containers might be included during policy evaluation. To see if init containers are included, review the CRD for the following or a similar declaration:

input_containers[c] {
   c := input.review.object.spec.initContainers[_]
}

Constraint template conflicts

If constraint templates have the same resource metadata name, but the policy definition references the source at different locations, the policy definitions are considered to be in conflict. Example: Two policy definitions reference the same template.yaml file stored at different source locations like the Azure Policy template store (store.policy.core.chinacloudapi.cn) and GitHub.

When policy definitions and their constraint templates are assigned but aren't already installed on the cluster and are in conflict, they're reported as a conflict and aren't installed into the cluster until the conflict is resolved. Likewise, any existing policy definitions and their constraint templates that are already on the cluster that conflicts with newly assigned policy definitions continue to function normally. If an existing assignment is updated and there's a failure to sync the constraint template, the cluster is also marked as a conflict. For all conflict messages, see AKS Resource Provider mode compliance reasons

Logging

As a Kubernetes controller/container, both the azure-policy and gatekeeper pods keep logs in the Kubernetes cluster. In general, azure-policy logs can be used to troubleshoot issues with policy ingestion onto the cluster and compliance reporting. The gatekeeper-controller-manager pod logs can be used to troubleshoot runtime denies. The gatekeeper-audit pod logs can be used to troubleshoot audits of existing resources. The logs can be exposed in the Insights page of the Kubernetes cluster. For more information, see Monitor your Kubernetes cluster performance with Azure Monitor for containers.

To view the add-on logs, use kubectl:

# Get the azure-policy pod name installed in kube-system namespace
kubectl logs <azure-policy pod name> -n kube-system

# Get the gatekeeper pod name installed in gatekeeper-system namespace
kubectl logs <gatekeeper pod name> -n gatekeeper-system

If you're attempting to troubleshoot a particular ComplianceReasonCode that is appearing in your compliance results, you can search the azure-policy pod logs for that code to see the full accompanying error.

For more information, see Debugging Gatekeeper in the Gatekeeper documentation.

View Gatekeeper artifacts

After the add-on downloads the policy assignments and installs the constraint templates and constraints on the cluster, it annotates both with Azure Policy information like the policy assignment ID and the policy definition ID. To configure your client to view the add-on related artifacts, use the following steps:

  1. Set up kubeconfig for the cluster.

    For an Azure Kubernetes Service cluster, use the following Azure CLI:

    # Set context to the subscription
    az account set --subscription <YOUR-SUBSCRIPTION>
    
    # Save credentials for kubeconfig into .kube in your home folder
    az aks get-credentials --resource-group <RESOURCE-GROUP> --name <CLUSTER-NAME>
    
  2. Test the cluster connection.

    Run the kubectl cluster-info command. A successful run has each service responding with a URL of where it's running.

View the add-on constraint templates

To view constraint templates downloaded by the add-on, run kubectl get constrainttemplates. Constraint templates that start with k8sazure are the ones installed by the add-on.

View the add-on mutation templates

To view mutation templates downloaded by the add-on, run kubectl get assign, kubectl get assignmetadata, and kubectl get modifyset.

Get Azure Policy mappings

To identify the mapping between a constraint template downloaded to the cluster and the policy definition, use kubectl get constrainttemplates <TEMPLATE> -o yaml. The results look similar to the following output:

apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
    annotations:
    azure-policy-definition-id: /subscriptions/<SUBID>/providers/Microsoft.Authorization/policyDefinitions/<GUID>
    constraint-template-installed-by: azure-policy-addon
    constraint-template: <URL-OF-YAML>
    creationTimestamp: "2021-09-01T13:20:55Z"
    generation: 1
    managedFields:
    - apiVersion: templates.gatekeeper.sh/v1beta1
    fieldsType: FieldsV1
...

<SUBID> is the subscription ID and <GUID> is the ID of the mapped policy definition. <URL-OF-YAML> is the source location of the constraint template that the add-on downloaded to install on the cluster.

Once you have the names of the add-on downloaded constraint templates, you can use the name to see the related constraints. Use kubectl get <constraintTemplateName> to get the list. Constraints installed by the add-on start with azurepolicy-.

View constraint details

The constraint has details about violations and mappings to the policy definition and assignment. To see the details, use kubectl get <CONSTRAINT-TEMPLATE> <CONSTRAINT> -o yaml. The results look similar to the following output:

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sAzureContainerAllowedImages
metadata:
  annotations:
    azure-policy-assignment-id: /subscriptions/<SUB-ID>/resourceGroups/<RG-NAME>/providers/Microsoft.Authorization/policyAssignments/<ASSIGNMENT-GUID>
    azure-policy-definition-id: /providers/Microsoft.Authorization/policyDefinitions/<DEFINITION-GUID>
    azure-policy-definition-reference-id: ""
    azure-policy-setdefinition-id: ""
    constraint-installed-by: azure-policy-addon
    constraint-url: <URL-OF-YAML>
  creationTimestamp: "2021-09-01T13:20:55Z"
spec:
  enforcementAction: deny
  match:
    excludedNamespaces:
    - kube-system
    - gatekeeper-system
    - azure-arc
  parameters:
    imageRegex: ^.+azurecr.io/.+$
status:
  auditTimestamp: "2021-09-01T13:48:16Z"
  totalViolations: 32
  violations:
  - enforcementAction: deny
    kind: Pod
    message: Container image nginx for container hello-world has not been allowed.
    name: hello-world-78f7bfd5b8-lmc5b
    namespace: default
  - enforcementAction: deny
    kind: Pod
    message: Container image nginx for container hello-world has not been allowed.
    name: hellow-world-89f8bfd6b9-zkggg

Troubleshooting the add-on

For more information about troubleshooting the Add-on for Kubernetes, see the Kubernetes section of the Azure Policy troubleshooting article.

For Azure Policy extension for Arc extension related issues, go to:

For Azure Policy related issues, go to:

Azure Policy Add-on for AKS Changelog

Azure Policy's Add-on for AKS has a version number that indicates the image version of add-on. As feature support is newly introduced on the Add-on, the version number is increased.

This section helps you identify which Add-on version is installed on your cluster and also share a historical table of the Azure Policy Add-on version installed per AKS cluster.

Identify which Add-on version is installed on your cluster

The Azure Policy Add-on uses the standard Semantic Versioning schema for each version. To identify the Azure Policy Add-on version being used, you can run the following command: kubectl get pod azure-policy-<unique-pod-identifier> -n kube-system -o json | jq '.spec.containers[0].image'

To identify the Gatekeeper version that your Azure Policy Add-on is using, you can run the following command: kubectl get pod gatekeeper-controller-<unique-pod-identifier> -n gatekeeper-system -o json | jq '.spec.containers[0].image'

Finally, to identify the AKS cluster version that you're using, follow the linked AKS guidance.

Add-on versions available per each AKS cluster version

1.8.0

Policy can now be used to evaluate CONNECT operations, for instance, to deny execs. Note that there is no brownfield compliance available for noncompliant CONNECT operations, so a policy with Audit effect that targets CONNECTs is a no op.

Security improvements.

  • Released November 2024
  • Kubernetes 1.27+
  • Gatekeeper 3.17.1

1.7.1

Introducing CEL and VAP. Common Expression Language (CEL) is a Kubernetes-native expression language that can be used to declare validation rules of a policy. Validating Admission Policy (VAP) feature provides in-tree policy evaluation, reduces admission request latency, and improves reliability and availability. The supported validation actions include Deny, Warn, and Audit. Custom policy authoring for CEL/VAP is allowed, and existing users won't need to convert their Rego to CEL as they will both be supported and be used to enforce policies. To use CEL and VAP, users need to enroll in the feature flag AKS-AzurePolicyK8sNativeValidation in the Microsoft.ContainerService namespace. For more information, view the Gatekeeper Documentation.

Security improvements.

  • Released September 2024
  • Kubernetes 1.27+ (VAP generation is only supported on 1.30+)
  • Gatekeeper 3.17.1

1.7.0

Introducing expansion, a shift left feature that lets you know up front whether your workload resources (Deployments, ReplicaSets, Jobs, etc.) will produce admissible pods. Expansion shouldn't change the behavior of your policies; rather, it just shifts Gatekeeper's evaluation of pod-scoped policies to occur at workload admission time rather than pod admission time. However, to perform this evaluation it must generate and evaluate a what-if pod that is based on the pod spec defined in the workload, which might have incomplete metadata. For instance, the what-if pod won't contain the proper owner references. Because of this small risk of policy behavior changing, we're introducing expansion as disabled by default. To enable expansion for a given policy definition, set .policyRule.then.details.source to All. Built-ins will be updated soon to enable parameterization of this field. If you test your policy definition and find that the what-if pod being generated for evaluation purposes is incomplete, you can also use a mutation with source Generated to mutate the what-if pods. For more information on this option, view the Gatekeeper documentation.

Expansion is currently only available on AKS clusters, not Arc clusters.

Security improvements.

  • Released July 2024
  • Kubernetes 1.27+
  • Gatekeeper 3.16.3

1.6.1

Security improvements.

  • Released May 2024
  • Gatekeeper 3.14.2

1.5.0

Security improvements.

  • Released May 2024
  • Kubernetes 1.27+
  • Gatekeeper 3.16.3

1.4.0

Enables mutation and external data by default. The additional mutating webhook and increased validating webhook timeout cap might add latency to calls in the worst case. Also introduces support for viewing policy definition and set definition version in compliance results.

  • Released May 2024
  • Kubernetes 1.25+
  • Gatekeeper 3.14.0

1.3.0

Introduces error state for policies in error, enabling them to be distinguished from policies in noncompliant states. Adds support for v1 constraint templates and use of the excludedNamespaces parameter in mutation policies. Adds an error status check on constraint templates post-installation.

  • Released February 2024
  • Kubernetes 1.25+
  • Gatekeeper 3.14.0

1.2.1

  • Released October 2023
  • Kubernetes 1.25+
  • Gatekeeper 3.13.3

1.1.0

  • Released July 2023
  • Kubernetes 1.27+
  • Gatekeeper 3.11.1

1.0.1

  • Released June 2023
  • Kubernetes 1.24+
  • Gatekeeper 3.11.1

1.0.0

Azure Policy for Kubernetes now supports mutation to remediate AKS clusters at-scale!

Remove the add-on

Remove the add-on from AKS

To remove the Azure Policy Add-on from your AKS cluster, use either the Azure portal or Azure CLI:

  • Azure portal

    1. Launch the AKS service in the Azure portal by selecting All services, then searching for and selecting Kubernetes services.

    2. Select your AKS cluster where you want to disable the Azure Policy Add-on.

    3. Select Policies on the left side of the Kubernetes service page.

    4. In the main page, select the Disable add-on button.

  • Azure CLI

    # Log in first with az login
    az cloud set -n AzureChinaCloud
    az login
    
    az aks disable-addons --addons azure-policy --name MyAKSCluster --resource-group MyResourceGroup
    

Remove the add-on from Azure Arc enabled Kubernetes

Note

Azure Policy Add-on Helm model is now deprecated. You should opt for the Azure Policy Extension for Azure Arc enabled Kubernetes instead.

To remove the Azure Policy Add-on and Gatekeeper from your Azure Arc enabled Kubernetes cluster, run the following Helm command:

helm uninstall azure-policy-addon

Remove the add-on from AKS Engine

Note

The AKS Engine product is now deprecated for Azure public cloud customers. Consider using Azure Kubernetes Service (AKS) for managed Kubernetes or Cluster API Provider Azure for self-managed Kubernetes. There are no new features planned; this project will only be updated for CVEs & similar, with Kubernetes 1.24 as the final version to receive updates.

To remove the Azure Policy Add-on and Gatekeeper from your AKS Engine cluster, use the method that aligns with how the add-on was installed:

  • If installed by setting the addons property in the cluster definition for AKS Engine:

    Redeploy the cluster definition to AKS Engine after changing the addons property for azure-policy to false:

    "addons": [
      {
        "name": "azure-policy",
        "enabled": false
      }
    ]
    

    For more information, see AKS Engine - Disable Azure Policy Add-on.

  • If installed with Helm Charts, run the following Helm command:

    helm uninstall azure-policy-addon
    

Limitations

  • For general Azure Policy definitions and assignment limits, review Azure Policy's documented limits
  • Azure Policy Add-on for Kubernetes can only be deployed to Linux node pools.
  • Maximum number of pods supported by the Azure Policy Add-on per cluster: 10,000
  • Maximum number of Non-compliant records per policy per cluster: 500
  • Maximum number of Non-compliant records per subscription: 1 million
  • Installations of Gatekeeper outside of the Azure Policy Add-on aren't supported. Uninstall any components installed by a previous Gatekeeper installation before enabling the Azure Policy Add-on.
  • Reasons for non-compliance aren't available for the Microsoft.Kubernetes.Data Resource Provider mode. Use Component details.
  • Component-level exemptions aren't supported for Resource Provider modes. Parameters support is available in Azure Policy definitions to exclude and include particular namespaces.
  • Using the metadata.gatekeeper.sh/requires-sync-data annotation in a constraint template to configure the replication of data from your cluster into the OPA cache is currently only allowed for built-in policies. The reason is because it can dramatically increase the Gatekeeper pods resource usage if not used carefully.

Configuring the Gatekeeper Config

Changing the Gatekeeper config is unsupported, as it contains critical security settings. Edits to the config are reconciled.

Using data.inventory in constraint templates

Currently, several built-in policies make use of data replication, which enables users to sync existing on-cluster resources to the OPA cache and reference them during evaluation of an AdmissionReview request. Data replication policies can be differentiated by the presence of data.inventory in the Rego, and the presence of the metadata.gatekeeper.sh/requires-sync-data annotation, which informs the Azure Policy addon which resources need to be cached for policy evaluation to work properly. This process differs from standalone Gatekeeper, where this annotation is descriptive, not prescriptive.

Data replication is currently blocked for use in custom policy definitions, because replicating resources with high instance counts can dramatically increase the Gatekeeper pods' resource usage if not used carefully. You'll see a ConstraintTemplateInstallFailed error when attempting to create a custom policy definition containing a constraint template with this annotation.

Removing the annotation may appear to mitigate the error you see, but then the policy addon will not sync any required resources for that constraint template into the cache. Thus, your policies will be evaluated against an empty data.inventory (assuming that no built-in is assigned that replicates the requisite resources). This will lead to misleading compliance results. As noted previously, manually editing the config to cache the required resources is also not permitted.

The following limitations apply only to the Azure Policy Add-on for AKS:

  • Namespaces automatically excluded by Azure Policy Add-on for evaluation: kube-system and gatekeeper-system.

Frequently asked questions

What does the Azure Policy Add-on / Azure Policy Extension deploy on my cluster upon installation?

The Azure Policy Add-on requires three Gatekeeper components to run: One audit pod and two webhook pod replicas. One Azure Policy pod and one Azure Policy webhook pod is also installed.

How much resource consumption should I expect the Azure Policy Add-on / Extension to use on each cluster?

The Azure Policy for Kubernetes components that run on your cluster consume more resources as the count of Kubernetes resources and policy assignments increases in the cluster, which requires audit and enforcement operations.

The following are estimates to help you plan:

  • For fewer than 500 pods in a single cluster with a max of 20 constraints: two vCPUs and 350 MB of memory per component.
  • For more than 500 pods in a single cluster with a max of 40 constraints: three vCPUs and 600 MB of memory per component.

Can Azure Policy for Kubernetes definitions be applied on Windows pods?

Windows pods don't support security contexts. Thus, some of the Azure Policy definitions, like disallowing root privileges, can't be escalated in Windows pods and only apply to Linux pods.

What type of diagnostic data gets collected by Azure Policy Add-on?

The Azure Policy Add-on for Kubernetes collects limited cluster diagnostic data. This diagnostic data is vital technical data related to software and performance. The data is used in the following ways:

  • Keep Azure Policy Add-on up to date.
  • Keep Azure Policy Add-on secure, reliable, performant.
  • Improve Azure Policy Add-on - through the aggregate analysis of the use of the add-on.

The information collected by the add-on isn't personal data. The following details are currently collected:

  • Azure Policy Add-on agent version

  • Cluster type

  • Cluster region

  • Cluster resource group

  • Cluster resource ID

  • Cluster subscription ID

  • Cluster OS (Example: Linux)

  • Cluster city

  • Cluster state or province

  • Cluster country or region

  • Exceptions/errors encountered by Azure Policy Add-on during agent installation on policy evaluation

  • Number of Gatekeeper policy definitions not installed by Azure Policy Add-on

What are general best practices to keep in mind when installing the Azure Policy Add-on?

  • Use system node pool with CriticalAddonsOnly taint to schedule Gatekeeper pods. For more information, see Using system node pools.
  • Secure outbound traffic from your AKS clusters. For more information, see Control egress traffic for cluster nodes.
  • If the cluster has aad-pod-identity enabled, Node Managed Identity (NMI) pods modify the nodes iptables to intercept calls to the Azure Instance Metadata endpoint. This configuration means any request made to the Metadata endpoint is intercepted by NMI even if the pod doesn't use aad-pod-identity.
  • AzurePodIdentityException CRD can be configured to inform aad-pod-identity that any requests to the Metadata endpoint originating from a pod that matches labels defined in CRD should be proxied without any processing in NMI. The system pods with kubernetes.azure.com/managedby: aks label in kube-system namespace should be excluded in aad-pod-identity by configuring the AzurePodIdentityException CRD. For more information, see Disable aad-pod-identity for a specific pod or application. To configure an exception, install the mic-exception YAML.

Next steps