Configure and deploy a Valkey cluster on Azure Kubernetes Service (AKS)

This article shows you how to configure and deploy a Valkey cluster on Azure Kubernetes Service (AKS), including creating the Valkey cluster ConfigMap, the primary and secondary pods for redundancy and zone replication, and a Pod Disruption Budget (PDB) to ensure high availability.

Note

This article contains references to the terms master and slave, which Microsoft no longer uses. When these terms are removed from the Valkey software, we'll remove them from this article.

Connect to the AKS cluster

  • Configure kubectl to connect to your AKS cluster using the az aks get-credentials command.

    az aks get-credentials --resource-group $MY_RESOURCE_GROUP_NAME --name $MY_CLUSTER_NAME --overwrite-existing --output table
    
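
    Optionally, verify that kubectl can reach the cluster by listing the nodes as a quick sanity check before creating any resources:

    kubectl get nodes --output wide
    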

Create a namespace

  1. Create a namespace for the Valkey cluster using the kubectl create namespace command.

    kubectl create namespace valkey --dry-run=client --output yaml | kubectl apply -f -
    

    Example output:

    namespace/valkey created
    
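
    Optionally, confirm the namespace exists before continuing:

    kubectl get namespace valkey
    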

Create secrets

  1. Get the client ID and object ID of the identity created by the Azure Key Vault secrets provider add-on using the az aks show command.

    export userAssignedIdentityID=$(az aks show --resource-group $MY_RESOURCE_GROUP_NAME --name $MY_CLUSTER_NAME --query addonProfiles.azureKeyvaultSecretsProvider.identity.clientId --output tsv)
    export userAssignedObjectID=$(az aks show --resource-group $MY_RESOURCE_GROUP_NAME --name $MY_CLUSTER_NAME --query addonProfiles.azureKeyvaultSecretsProvider.identity.objectId --output tsv)
    

  2. Generate a random password for the Valkey cluster with OpenSSL and store it in your Azure key vault using the az keyvault secret set command. Then set a policy that allows the user-assigned identity to get the secret using the az keyvault set-policy command.

    SECRET=$(openssl rand -base64 32)
    echo requirepass $SECRET > /tmp/valkey-password-file.conf
    echo primaryauth $SECRET >> /tmp/valkey-password-file.conf
    az keyvault secret set --vault-name $MY_KEYVAULT_NAME --name valkey-password-file --file /tmp/valkey-password-file.conf --output none
    rm /tmp/valkey-password-file.conf
    az keyvault set-policy --name $MY_KEYVAULT_NAME --object-id $userAssignedObjectID --secret-permissions get --output table
    
  3. Create a SecretProviderClass resource to access the Valkey password stored in your key vault using the kubectl apply command.

    export TENANT_ID=$(az account show --query tenantId --output tsv)
    kubectl apply -f - <<EOF
    ---
    apiVersion: secrets-store.csi.x-k8s.io/v1
    kind: SecretProviderClass
    metadata:
      name: valkey-password
      namespace: valkey
    spec:
      provider: azure
      parameters:
        usePodIdentity: "false"
        useVMManagedIdentity: "true"
        userAssignedIdentityID: "${userAssignedIdentityID}"
        keyvaultName: ${MY_KEYVAULT_NAME}              # the name of the AKV instance
        objects: |
          array:
            - |
              objectName: valkey-password-file
              objectAlias: valkey-password-file.conf
              objectType: secret
        tenantId: "${TENANT_ID}" # the tenant ID of the AKV instance
    EOF
    
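
    Optionally, confirm the SecretProviderClass was created in the valkey namespace. Note that the secrets-store CSI driver mounts the secret only when a pod that references this class starts, so no Kubernetes Secret object appears yet.

    kubectl get secretproviderclass valkey-password -n valkey
    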

Create the Valkey configuration file

  1. Create a ConfigMap resource to store the Valkey configuration file using the kubectl apply command.

    kubectl apply -f - <<EOF
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: valkey-cluster
      namespace: valkey
    data:
      valkey.conf:  |+
        cluster-enabled yes
        cluster-node-timeout 15000
        cluster-config-file /data/nodes.conf
        appendonly yes
        protected-mode yes
        dir /data
        port 6379
        include /etc/valkey-password/valkey-password-file.conf
    EOF
    

    Example output:

    configmap/valkey-cluster created
    
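
    Optionally, inspect the rendered configuration to confirm the include path and cluster settings are what you expect:

    kubectl get configmap valkey-cluster -n valkey -o yaml
    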

Create the Valkey masters pods

  1. Create a StatefulSet resource with a spec.affinity that keeps all the masters in zone 1 and zone 2, preferably on different nodes, using the kubectl apply command.

    kubectl apply -f - <<EOF
    ---
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: valkey-masters
      namespace: valkey
    spec:
      serviceName: "valkey-masters"
      replicas: 3
      selector:
        matchLabels:
          app: valkey
      template:
        metadata:
          labels:
            app: valkey
            appCluster: valkey-masters
        spec:
          terminationGracePeriodSeconds: 20
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: agentpool
                    operator: In
                    values:
                    - valkey
                  - key: topology.kubernetes.io/zone
                    operator: In
                    values:
                    - ${MY_LOCATION}-1
                - matchExpressions:
                  - key: agentpool
                    operator: In
                    values:
                    - valkey
                  - key: topology.kubernetes.io/zone
                    operator: In
                    values:
                    - ${MY_LOCATION}-2
            podAntiAffinity:
              preferredDuringSchedulingIgnoredDuringExecution:
              - weight: 100
                podAffinityTerm:
                  labelSelector:
                    matchExpressions:
                    - key: app
                      operator: In
                      values:
                      - valkey
                  topologyKey: topology.kubernetes.io/zone
              - weight: 90
                podAffinityTerm:
                  labelSelector:
                    matchExpressions:
                    - key: app
                      operator: In
                      values:
                      - valkey
                  topologyKey: kubernetes.io/hostname
          containers:
          - name: role-master-checker
            image: "${MY_ACR_REGISTRY}.azurecr.cn/valkey:latest"
            command:
              - "/bin/bash"
              - "-c"
            args:
              [
                "while true; do role=\$(valkey-cli --pass \$(cat /etc/valkey-password/valkey-password-file.conf | awk '{print \$2; exit}') role | awk '{print \$1; exit}');     if [ \"\$role\" = \"slave\" ]; then valkey-cli --pass \$(cat /etc/valkey-password/valkey-password-file.conf | awk '{print \$2; exit}') cluster failover; fi; sleep 30; done"
              ]
            volumeMounts:
            - name: valkey-password
              mountPath: /etc/valkey-password
              readOnly: true
          - name: valkey
            image: "${MY_ACR_REGISTRY}.azurecr.cn/valkey:latest"
            env:
            - name: VALKEY_PASSWORD_FILE
              value: "/etc/valkey-password/valkey-password-file.conf"
            - name: MY_POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            command:
              - "valkey-server"
            args:
              - "/conf/valkey.conf"
              - "--cluster-announce-ip"
              - "\$(MY_POD_IP)"
            resources:
              requests:
                cpu: "100m"
                memory: "100Mi"
            ports:
                - name: valkey
                  containerPort: 6379
                  protocol: "TCP"
                - name: cluster
                  containerPort: 16379
                  protocol: "TCP"
            volumeMounts:
            - name: conf
              mountPath: /conf
              readOnly: false
            - name: data
              mountPath: /data
              readOnly: false
            - name: valkey-password
              mountPath: /etc/valkey-password
              readOnly: true
          volumes:
          - name: valkey-password
            csi:
              driver: secrets-store.csi.k8s.io
              readOnly: true
              volumeAttributes:
                secretProviderClass: valkey-password
          - name: conf
            configMap:
              name: valkey-cluster
              defaultMode: 0755
      volumeClaimTemplates:
      - metadata:
          name: data
        spec:
          accessModes: [ "ReadWriteOnce" ]
          storageClassName: managed-csi-premium
          resources:
            requests:
              storage: 20Gi
    EOF
    

    Example output:

    statefulset.apps/valkey-masters created
    
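
    Optionally, watch the rollout until all three master pods report ready:

    kubectl rollout status statefulset/valkey-masters -n valkey --timeout=5m
    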

Create the Valkey replicas pods

  1. Create a second StatefulSet resource for the Valkey replicas with a spec.affinity that keeps all the replicas in zone 3, preferably on different nodes, using the kubectl apply command.

    kubectl apply -f - <<EOF
    ---
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: valkey-replicas
      namespace: valkey
    spec:
      serviceName: "valkey-replicas"
      replicas: 3
      selector:
        matchLabels:
          app: valkey
      template:
        metadata:
          labels:
            app: valkey
            appCluster: valkey-replicas
        spec:
          terminationGracePeriodSeconds: 20
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: agentpool
                    operator: In
                    values:
                    - valkey
                  - key: topology.kubernetes.io/zone
                    operator: In
                    values:
                    - ${MY_LOCATION}-3
            podAntiAffinity:
              preferredDuringSchedulingIgnoredDuringExecution:
              - weight: 90
                podAffinityTerm:
                  labelSelector:
                    matchExpressions:
                    - key: app
                      operator: In
                      values:
                      - valkey
                  topologyKey: kubernetes.io/hostname
          containers:
          - name: valkey
            image: "${MY_ACR_REGISTRY}.azurecr.cn/valkey:latest"
            env:
            - name: VALKEY_PASSWORD_FILE
              value: "/etc/valkey-password/valkey-password-file.conf"
            - name: MY_POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            command:
              - "valkey-server"
            args:
              - "/conf/valkey.conf"
              - "--cluster-announce-ip"
              - "\$(MY_POD_IP)"
            resources:
              requests:
                cpu: "100m"
                memory: "100Mi"
            ports:
                - name: valkey
                  containerPort: 6379
                  protocol: "TCP"
                - name: cluster
                  containerPort: 16379
                  protocol: "TCP"
            volumeMounts:
            - name: conf
              mountPath: /conf
              readOnly: false
            - name: data
              mountPath: /data
              readOnly: false
            - name: valkey-password
              mountPath: /etc/valkey-password
              readOnly: true
          volumes:
          - name: valkey-password
            csi:
              driver: secrets-store.csi.k8s.io
              readOnly: true
              volumeAttributes:
                secretProviderClass: valkey-password
          - name: conf
            configMap:
              name: valkey-cluster
              defaultMode: 0755
      volumeClaimTemplates:
      - metadata:
          name: data
        spec:
          accessModes: [ "ReadWriteOnce" ]
          storageClassName: managed-csi-premium
          resources:
            requests:
              storage: 20Gi
    EOF
    

    Example output:

    statefulset.apps/valkey-replicas created
    
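
    Optionally, watch the rollout of the replicas as well:

    kubectl rollout status statefulset/valkey-replicas -n valkey --timeout=5m
    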

Verify pod and node distribution

  1. Verify that the valkey-masters-N and valkey-replicas-N pods are running on different nodes and zones using the kubectl get pods and kubectl get nodes commands.

    kubectl get pods -n valkey -o wide
    kubectl get node -o custom-columns=Name:.metadata.name,Zone:".metadata.labels.topology\.kubernetes\.io/zone"
    

    Example output:

    NAME                READY   STATUS    RESTARTS   AGE     IP             NODE                             NOMINATED NODE   READINESS GATES
    valkey-masters-0    1/1     Running   0          2m55s   10.224.0.4     aks-valkey-18693609-vmss000004   <none>           <none>
    valkey-masters-1    1/1     Running   0          2m31s   10.224.0.137   aks-valkey-18693609-vmss000000   <none>           <none>
    valkey-masters-2    1/1     Running   0          2m7s    10.224.0.222   aks-valkey-18693609-vmss000001   <none>           <none>
    valkey-replicas-0   1/1     Running   0          88s     10.224.0.237   aks-valkey-18693609-vmss000005   <none>           <none>
    valkey-replicas-1   1/1     Running   0          70s     10.224.0.18    aks-valkey-18693609-vmss000002   <none>           <none>
    valkey-replicas-2   1/1     Running   0          48s     10.224.0.242   aks-valkey-18693609-vmss000005   <none>           <none>
    Name                                Zone
    aks-nodepool1-17621399-vmss000000   chinanorth3-1
    aks-nodepool1-17621399-vmss000001   chinanorth3-2
    aks-nodepool1-17621399-vmss000003   chinanorth3-3
    aks-valkey-18693609-vmss000000      chinanorth3-1
    aks-valkey-18693609-vmss000001      chinanorth3-2
    aks-valkey-18693609-vmss000002      chinanorth3-3
    aks-valkey-18693609-vmss000003      chinanorth3-1
    aks-valkey-18693609-vmss000004      chinanorth3-2
    aks-valkey-18693609-vmss000005      chinanorth3-3
    

    Wait until all pods are running before moving on to the next step.
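
    If you prefer to block until all Valkey pods are ready instead of checking manually, a kubectl wait command such as the following works:

    kubectl wait --namespace valkey --for=condition=Ready pod --selector app=valkey --timeout=300s
    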

Create the headless services

  1. Create three headless Service resources (the first for the entire cluster, the second for the masters, and the third for the replicas) to use for getting the IP addresses of the Valkey pods, using the kubectl apply command.

    kubectl apply -f - <<EOF
    apiVersion: v1
    kind: Service
    metadata:
      name: valkey-cluster
      namespace: valkey
    spec:
      clusterIP: None
      ports:
      - name: valkey-port
        port: 6379
        protocol: TCP
        targetPort: 6379
      selector:
        app: valkey
      sessionAffinity: None
      type: ClusterIP
    EOF
    
    kubectl apply -f - <<EOF
    apiVersion: v1
    kind: Service
    metadata:
      name: valkey-masters
      namespace: valkey
    spec:
      clusterIP: None
      ports:
      - name: valkey-port
        port: 6379
        protocol: TCP
        targetPort: 6379
      selector:
        app: valkey
        appCluster: valkey-masters
      sessionAffinity: None
      type: ClusterIP
    EOF
    
    kubectl apply -f - <<EOF
    apiVersion: v1
    kind: Service
    metadata:
      name: valkey-replicas
      namespace: valkey
    spec:
      clusterIP: None
      ports:
      - name: valkey-port
        port: 6379
        protocol: TCP
        targetPort: 6379
      selector:
        app: valkey
        appCluster: valkey-replicas
      sessionAffinity: None
      type: ClusterIP
    EOF
    

    Example output:

    service/valkey-cluster created
    service/valkey-masters created
    service/valkey-replicas created
    
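
    Optionally, confirm that each headless service resolves to the expected pod IPs by listing its endpoints:

    kubectl get endpoints -n valkey
    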

Create a Pod Disruption Budget (PDB)

  1. Create a Pod Disruption Budget (PDB) to ensure that no more than one pod is unavailable during voluntary disruptions, such as upgrades or maintenance. This helps maintain the stability and availability of the Valkey application in the Kubernetes cluster.

    kubectl apply -f - <<EOF
    apiVersion: policy/v1
    kind: PodDisruptionBudget
    metadata:
      name: valkey
      namespace: valkey
    spec:
      maxUnavailable: 1
      selector:
        matchLabels:
          app: valkey
    EOF
    

    Example output:

    poddisruptionbudget.policy/valkey created
    
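
    Optionally, confirm the budget is tracking the six Valkey pods:

    kubectl get pdb valkey -n valkey
    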

Run the Valkey cluster

  1. Add the Valkey masters in zone 1 and 2 to the cluster using the kubectl exec command.

    kubectl exec -it -n valkey valkey-masters-0 -- valkey-cli --cluster create --cluster-yes --cluster-replicas 0 \
                        valkey-masters-0.valkey-masters.valkey.svc.cluster.local:6379 \
                        valkey-masters-1.valkey-masters.valkey.svc.cluster.local:6379 \
                        valkey-masters-2.valkey-masters.valkey.svc.cluster.local:6379 \
                        --pass ${SECRET}
    

    Example output:

    >>> Performing hash slots allocation on 3 nodes...
    Master[0] -> Slots 0 - 5460
    Master[1] -> Slots 5461 - 10922
    Master[2] -> Slots 10923 - 16383
    M: ee6ac1d00d3f016b6f46c7ce11199bc1a7809a35 valkey-masters-0.valkey-masters.valkey.svc.cluster.local:6379
       slots:[0-5460] (5461 slots) master
    M: fd1fb98db83976478e05edd3d2a02f9a13badd80 valkey-masters-1.valkey-masters.valkey.svc.cluster.local:6379
       slots:[5461-10922] (5462 slots) master
    M: ea47bf57ae7080ef03164a4d48b662c7b4c8770e valkey-masters-2.valkey-masters.valkey.svc.cluster.local:6379
       slots:[10923-16383] (5461 slots) master
    >>> Nodes configuration updated
    >>> Assign a different config epoch to each node
    >>> Sending CLUSTER MEET messages to join the cluster
    Waiting for the cluster to join
    ...
    >>> Performing Cluster Check (using node valkey-masters-0.valkey-masters.valkey.svc.cluster.local:6379)
    M: ee6ac1d00d3f016b6f46c7ce11199bc1a7809a35 valkey-masters-0.valkey-masters.valkey.svc.cluster.local:6379
       slots:[0-5460] (5461 slots) master
    M: ea47bf57ae7080ef03164a4d48b662c7b4c8770e 10.224.0.176:6379
       slots:[10923-16383] (5461 slots) master
    M: fd1fb98db83976478e05edd3d2a02f9a13badd80 10.224.0.247:6379
       slots:[5461-10922] (5462 slots) master
    [OK] All nodes agree about slots configuration.
    >>> Check for open slots...
    >>> Check slots coverage...
    [OK] All 16384 slots covered.
    
  2. Add the Valkey replicas in zone 3 to the cluster using the kubectl exec command.

    kubectl exec -ti -n valkey valkey-masters-0 -- valkey-cli --cluster add-node \
                        valkey-replicas-0.valkey-replicas.valkey.svc.cluster.local:6379 \
                        valkey-masters-0.valkey-masters.valkey.svc.cluster.local:6379  --cluster-slave \
                        --pass ${SECRET}
    
    kubectl exec -ti -n valkey valkey-masters-0 -- valkey-cli --cluster add-node \
                        valkey-replicas-1.valkey-replicas.valkey.svc.cluster.local:6379 \
                        valkey-masters-1.valkey-masters.valkey.svc.cluster.local:6379  --cluster-slave \
                        --pass ${SECRET}
    
    kubectl exec -ti -n valkey valkey-masters-0 -- valkey-cli --cluster add-node \
                        valkey-replicas-2.valkey-replicas.valkey.svc.cluster.local:6379 \
                        valkey-masters-2.valkey-masters.valkey.svc.cluster.local:6379  --cluster-slave \
                        --pass ${SECRET}
    

    Example output:

    >>> Adding node valkey-replicas-0.valkey-replicas.valkey.svc.cluster.local:6379 to cluster valkey-masters-0.valkey-masters.valkey.svc.cluster.local:6379
    >>> Performing Cluster Check (using node valkey-masters-0.valkey-masters.valkey.svc.cluster.local:6379)
    M: ee6ac1d00d3f016b6f46c7ce11199bc1a7809a35 valkey-masters-0.valkey-masters.valkey.svc.cluster.local:6379
       slots:[0-5460] (5461 slots) master
    M: ea47bf57ae7080ef03164a4d48b662c7b4c8770e 10.224.0.176:6379
       slots:[10923-16383] (5461 slots) master
    M: fd1fb98db83976478e05edd3d2a02f9a13badd80 10.224.0.247:6379
       slots:[5461-10922] (5462 slots) master
    [OK] All nodes agree about slots configuration.
    >>> Check for open slots...
    >>> Check slots coverage...
    [OK] All 16384 slots covered.
    Automatically selected master valkey-masters-0.valkey-masters.valkey.svc.cluster.local:6379
    >>> Send CLUSTER MEET to node valkey-replicas-0.valkey-replicas.valkey.svc.cluster.local:6379 to make it join the cluster.
    Waiting for the cluster to join
    
    >>> Configure node as replica of valkey-masters-0.valkey-masters.valkey.svc.cluster.local:6379.
    [OK] New node added correctly.
    >>> Adding node valkey-replicas-1.valkey-replicas.valkey.svc.cluster.local:6379 to cluster valkey-masters-1.valkey-masters.valkey.svc.cluster.local:6379
    >>> Performing Cluster Check (using node valkey-masters-1.valkey-masters.valkey.svc.cluster.local:6379)
    M: fd1fb98db83976478e05edd3d2a02f9a13badd80 valkey-masters-1.valkey-masters.valkey.svc.cluster.local:6379
       slots:[5461-10922] (5462 slots) master
    S: 0ebceb60cbcc31da9040159440a1f4856b992907 10.224.0.224:6379
       slots: (0 slots) slave
       replicates ee6ac1d00d3f016b6f46c7ce11199bc1a7809a35
    M: ea47bf57ae7080ef03164a4d48b662c7b4c8770e 10.224.0.176:6379
       slots:[10923-16383] (5461 slots) master
    M: ee6ac1d00d3f016b6f46c7ce11199bc1a7809a35 10.224.0.14:6379
       slots:[0-5460] (5461 slots) master
       1 additional replica(s)
    [OK] All nodes agree about slots configuration.
    >>> Check for open slots...
    >>> Check slots coverage...
    [OK] All 16384 slots covered.
    Automatically selected master valkey-masters-1.valkey-masters.valkey.svc.cluster.local:6379
    >>> Send CLUSTER MEET to node valkey-replicas-1.valkey-replicas.valkey.svc.cluster.local:6379 to make it join the cluster.
    Waiting for the cluster to join
    
    >>> Configure node as replica of valkey-masters-1.valkey-masters.valkey.svc.cluster.local:6379.
    [OK] New node added correctly.
    >>> Adding node valkey-replicas-2.valkey-replicas.valkey.svc.cluster.local:6379 to cluster valkey-masters-2.valkey-masters.valkey.svc.cluster.local:6379
    >>> Performing Cluster Check (using node valkey-masters-2.valkey-masters.valkey.svc.cluster.local:6379)
    M: ea47bf57ae7080ef03164a4d48b662c7b4c8770e valkey-masters-2.valkey-masters.valkey.svc.cluster.local:6379
       slots:[10923-16383] (5461 slots) master
    S: 0ebceb60cbcc31da9040159440a1f4856b992907 10.224.0.224:6379
       slots: (0 slots) slave
       replicates ee6ac1d00d3f016b6f46c7ce11199bc1a7809a35
    S: fa44edff683e2e01ee5c87233f9f3bc35c205dce 10.224.0.103:6379
       slots: (0 slots) slave
       replicates fd1fb98db83976478e05edd3d2a02f9a13badd80
    M: ee6ac1d00d3f016b6f46c7ce11199bc1a7809a35 10.224.0.14:6379
       slots:[0-5460] (5461 slots) master
       1 additional replica(s)
    M: fd1fb98db83976478e05edd3d2a02f9a13badd80 10.224.0.247:6379
       slots:[5461-10922] (5462 slots) master
       1 additional replica(s)
    [OK] All nodes agree about slots configuration.
    >>> Check for open slots...
    >>> Check slots coverage...
    [OK] All 16384 slots covered.
    Automatically selected master valkey-masters-2.valkey-masters.valkey.svc.cluster.local:6379
    >>> Send CLUSTER MEET to node valkey-replicas-2.valkey-replicas.valkey.svc.cluster.local:6379 to make it join the cluster.
    Waiting for the cluster to join
    
    >>> Configure node as replica of valkey-masters-2.valkey-masters.valkey.svc.cluster.local:6379.
    [OK] New node added correctly.
    
  3. Verify the roles of the pods using the following commands:

    for x in $(seq 0 2); do echo "valkey-masters-$x"; kubectl exec -n valkey valkey-masters-$x  -- valkey-cli --pass ${SECRET} role; echo; done
    for x in $(seq 0 2); do echo "valkey-replicas-$x"; kubectl exec -n valkey valkey-replicas-$x -- valkey-cli --pass ${SECRET} role; echo; done
    

    Example output:

    valkey-masters-0
    master
    84
    10.224.0.224
    6379
    84
    
    valkey-masters-1
    master
    84
    10.224.0.103
    6379
    84
    
    valkey-masters-2
    master
    70
    10.224.0.200
    6379
    70
    
    valkey-replicas-0
    slave
    10.224.0.14
    6379
    connected
    98
    
    valkey-replicas-1
    slave
    10.224.0.247
    6379
    connected
    98
    
    valkey-replicas-2
    slave
    10.224.0.176
    6379
    connected
    84
    
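
    As an additional sanity check, you can ask any node for the overall cluster state and topology. For a healthy deployment, expect cluster_state:ok, all 16384 slots assigned, and six known nodes:

    kubectl exec -n valkey valkey-masters-0 -- valkey-cli --pass ${SECRET} cluster info
    kubectl exec -n valkey valkey-masters-0 -- valkey-cli --pass ${SECRET} cluster nodes
    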

Next steps

To learn more about deploying open-source software on Azure Kubernetes Service (AKS), see the following articles:

Contributors

Microsoft maintains this article. The following contributors originally wrote it:

  • Nelly Kiboi | Service Engineer
  • Saverio Proto | Principal Customer Experience Engineer
  • Don High | Principal Customer Engineer
  • LaBrina Loving | Principal Service Engineer
  • Ken Kilty | Principal TPM
  • Russell de Pina | Principal TPM
  • Colin Mixon | Product Manager
  • Ketan Chawda | Senior Customer Engineer
  • Naveed Kharadi | Customer Experience Engineer
  • Erin Schaffer | Content Developer 2