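When you deploy the Open Service Mesh (OSM) add-on for Azure Kubernetes Service (AKS), you might encounter problems related to the service mesh configuration. This article covers common troubleshooting errors and how to resolve them.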
Use the `kubectl get deployment,pod,service` command to check the health of the OSM controller deployment, pods, and services.

    kubectl get deployment,pod,service -n kube-system --selector app=osm-controller

A healthy OSM controller produces output similar to the following example:

    NAME                             READY   UP-TO-DATE   AVAILABLE   AGE
    deployment.apps/osm-controller   2/2     2            2           3m4s

    NAME                                  READY   STATUS    RESTARTS   AGE
    pod/osm-controller-65bd8c445c-zszp4   1/1     Running   0          2m
    pod/osm-controller-65bd8c445c-xqhmk   1/1     Running   0          16s

    NAME                     TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                       AGE
    service/osm-controller   ClusterIP   10.96.185.178   <none>        15128/TCP,9092/TCP,9091/TCP   3m4s
    service/osm-validator    ClusterIP   10.96.11.78     <none>        9093/TCP                      3m4s
Note
The CLUSTER-IP values of the `osm-controller` services will differ in your cluster. The service NAME and PORT(S) values must match the example output.
Use the `kubectl get deployment,pod,service` command to check the health of the OSM injector deployment, pods, and service.

    kubectl get deployment,pod,service -n kube-system --selector app=osm-injector

A healthy OSM injector produces output similar to the following example:

    NAME                           READY   UP-TO-DATE   AVAILABLE   AGE
    deployment.apps/osm-injector   2/2     2            2           4m37s

    NAME                                READY   STATUS    RESTARTS   AGE
    pod/osm-injector-5c49bd8d7c-b6cx6   1/1     Running   0          4m21s
    pod/osm-injector-5c49bd8d7c-dx587   1/1     Running   0          4m37s

    NAME                   TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
    service/osm-injector   ClusterIP   10.96.236.108   <none>        9090/TCP   4m37s
Use the `kubectl get deployment,pod,service` command to check the health of the OSM bootstrap deployment, pod, and service.

    kubectl get deployment,pod,service -n kube-system --selector app=osm-bootstrap

A healthy OSM bootstrap produces output similar to the following example:

    NAME                            READY   UP-TO-DATE   AVAILABLE   AGE
    deployment.apps/osm-bootstrap   1/1     1            1           5m25s

    NAME                                 READY   STATUS    RESTARTS   AGE
    pod/osm-bootstrap-594ffc6cb7-jc7bs   1/1     Running   0          5m25s

    NAME                    TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
    service/osm-bootstrap   ClusterIP   10.96.250.208   <none>        9443/TCP,9095/TCP   5m25s
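The controller, injector, and bootstrap checks above follow the same pattern, so they can be combined into one loop. The following is a minimal sketch, not part of the OSM or AKS tooling. The helper names `is_fully_ready` and `check_osm_components` are hypothetical, and the sketch assumes that each `app` selector matches exactly one deployment in `kube-system`, as in the example outputs.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not part of OSM or AKS.

# is_fully_ready READY
# Succeeds when a READY column value such as "2/2" shows every replica ready.
is_fully_ready() (
  ready="${1%%/*}"     # text before the slash
  desired="${1##*/}"   # text after the slash
  [ "$desired" -gt 0 ] && [ "$ready" -eq "$desired" ]
)

# check_osm_components
# Prints one status line per OSM control plane deployment.
# Requires kubectl access to the cluster.
check_osm_components() (
  for app in osm-controller osm-injector osm-bootstrap; do
    # Column 2 of `kubectl get deployment` output is READY (for example, 2/2).
    ready="$(kubectl get deployment -n kube-system --selector "app=$app" --no-headers | awk '{print $2}')"
    if is_fully_ready "${ready:-0/0}"; then
      echo "$app: healthy ($ready)"
    else
      echo "$app: NOT healthy (${ready:-not found})"
    fi
  done
)
```

Run `check_osm_components` after sourcing the block. Any `NOT healthy` line points to the component whose pods and services are worth inspecting with the corresponding command above.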
Use the `kubectl get ValidatingWebhookConfiguration` command to check the OSM validating webhook.

    kubectl get ValidatingWebhookConfiguration --selector app=osm-controller

A healthy OSM validating webhook produces output similar to the following example:

    NAME                         WEBHOOKS   AGE
    aks-osm-validator-mesh-osm   1          81m
Use the `kubectl get MutatingWebhookConfiguration` command to check the OSM mutating webhook.

    kubectl get MutatingWebhookConfiguration --selector app=osm-injector

A healthy OSM mutating webhook produces output similar to the following example:

    NAME                  WEBHOOKS   AGE
    aks-osm-webhook-osm   1          102m
Use the `kubectl get ValidatingWebhookConfiguration` command with `aks-osm-validator-mesh-osm` and `jq '.webhooks[0].clientConfig.service'` to check the service and CA bundle of the OSM validating webhook.

    kubectl get ValidatingWebhookConfiguration aks-osm-validator-mesh-osm -o json | jq '.webhooks[0].clientConfig.service'

A correctly configured validating webhook configuration resembles the following example JSON output:

    {
      "name": "osm-config-validator",
      "namespace": "kube-system",
      "path": "/validate-webhook",
      "port": 9093
    }
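Use the `kubectl get MutatingWebhookConfiguration` command with `aks-osm-webhook-osm` and `jq '.webhooks[0].clientConfig.service'` to check the service and CA bundle of the OSM mutating webhook.

    kubectl get MutatingWebhookConfiguration aks-osm-webhook-osm -o json | jq '.webhooks[0].clientConfig.service'

A correctly configured mutating webhook configuration resembles the following example JSON output:

    {
      "name": "osm-injector",
      "namespace": "kube-system",
      "path": "/mutate-pod-creation",
      "port": 9090
    }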
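The `jq` filters above print only the `service` block of each webhook. To also confirm that a CA bundle is present, you can read the `caBundle` field that sits next to `service` under `clientConfig`. The sketch below uses the `-o jsonpath` output option of `kubectl` instead of `jq`. The function name `webhook_ca_bundle_length` is hypothetical, and treating a length of 0 as "no CA bundle" is an assumption made for illustration.

```shell
#!/bin/sh
# Hypothetical helper; the name is illustrative, not part of OSM or AKS.

# webhook_ca_bundle_length KIND NAME
# Prints the number of characters in the base64-encoded CA bundle of the
# first webhook in the given configuration. 0 means the field is empty.
webhook_ca_bundle_length() (
  bundle="$(kubectl get "$1" "$2" -o jsonpath='{.webhooks[0].clientConfig.caBundle}')"
  echo "${#bundle}"
)
```

For example, `webhook_ca_bundle_length ValidatingWebhookConfiguration aks-osm-validator-mesh-osm` and `webhook_ca_bundle_length MutatingWebhookConfiguration aks-osm-webhook-osm` should both print a number greater than 0 on a working installation.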
Use the `kubectl get meshconfig` command to check that the OSM MeshConfig resource exists.

    kubectl get meshconfig osm-mesh-config -n kube-system
Use the `kubectl get meshconfig` command with `-o yaml` to check the contents of the OSM MeshConfig resource.

    kubectl get meshconfig osm-mesh-config -n kube-system -o yaml

The output resembles the following example:

    apiVersion: config.openservicemesh.io/v1alpha1
    kind: MeshConfig
    metadata:
      creationTimestamp: "0000-00-00A00:00:00A"
      generation: 1
      name: osm-mesh-config
      namespace: kube-system
      resourceVersion: "2494"
      uid: 6c4d67f3-c241-4aeb-bf4f-b029b08faa31
    spec:
      certificate:
        serviceCertValidityDuration: 24h
      featureFlags:
        enableEgressPolicy: true
        enableMulticlusterMode: false
        enableWASMStats: true
      observability:
        enableDebugServer: true
        osmLogLevel: info
        tracing:
          address: jaeger.kube-system.svc.cluster.local
          enable: false
          endpoint: /api/v2/spans
          port: 9411
      sidecar:
        configResyncInterval: 0s
        enablePrivilegedInitContainer: false
        envoyImage: mcr.azk8s.cn/oss/envoyproxy/envoy:v1.18.3
        initContainerImage: mcr.azk8s.cn/oss/openservicemesh/init:v0.9.1
        logLevel: error
        maxDataPlaneConnections: 0
        resources: {}
      traffic:
        enableEgress: true
        enablePermissiveTrafficPolicyMode: true
        inboundExternalAuthorization:
          enable: false
          failureModeAllow: false
          statPrefix: inboundExtAuthz
          timeout: 1s
        useHTTPSIngress: false
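When only one setting matters, reading a single field is often easier than scanning the full YAML. The following sketch wraps the same `kubectl get meshconfig` call with the `-o jsonpath` output option. The function name `meshconfig_field` is hypothetical, and the sketch assumes the resource name and namespace shown above.

```shell
#!/bin/sh
# Hypothetical helper; the name is illustrative, not part of OSM or AKS.

# meshconfig_field PATH
# Prints one field of the osm-mesh-config resource. PATH is a dotted path
# below the resource root, for example: spec.traffic.enableEgress
meshconfig_field() {
  kubectl get meshconfig osm-mesh-config -n kube-system -o jsonpath="{.$1}"
}
```

For example, `meshconfig_field spec.sidecar.envoyImage` prints the Envoy image that the mesh is currently configured to inject.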
Key | Type | Default value | Kubectl patch command example |
---|---|---|---|
spec.traffic.enableEgress | bool | true | kubectl patch meshconfig osm-mesh-config -n kube-system -p '{"spec":{"traffic":{"enableEgress":true}}}' --type=merge |
spec.traffic.enablePermissiveTrafficPolicyMode | bool | true | kubectl patch meshconfig osm-mesh-config -n kube-system -p '{"spec":{"traffic":{"enablePermissiveTrafficPolicyMode":true}}}' --type=merge |
spec.traffic.useHTTPSIngress | bool | false | kubectl patch meshconfig osm-mesh-config -n kube-system -p '{"spec":{"traffic":{"useHTTPSIngress":true}}}' --type=merge |
spec.traffic.outboundPortExclusionList | array | [] | kubectl patch meshconfig osm-mesh-config -n kube-system -p '{"spec":{"traffic":{"outboundPortExclusionList":[6379,8080]}}}' --type=merge |
spec.traffic.outboundIPRangeExclusionList | array | [] | kubectl patch meshconfig osm-mesh-config -n kube-system -p '{"spec":{"traffic":{"outboundIPRangeExclusionList":["10.0.0.0/32","1.1.1.1/24"]}}}' --type=merge |
spec.traffic.inboundPortExclusionList | array | [] | kubectl patch meshconfig osm-mesh-config -n kube-system -p '{"spec":{"traffic":{"inboundPortExclusionList":[6379,8080]}}}' --type=merge |
spec.certificate.serviceCertValidityDuration | string | "24h" | kubectl patch meshconfig osm-mesh-config -n kube-system -p '{"spec":{"certificate":{"serviceCertValidityDuration":"24h"}}}' --type=merge |
spec.observability.enableDebugServer | bool | true | kubectl patch meshconfig osm-mesh-config -n kube-system -p '{"spec":{"observability":{"enableDebugServer":true}}}' --type=merge |
spec.observability.tracing.enable | bool | false | kubectl patch meshconfig osm-mesh-config -n kube-system -p '{"spec":{"observability":{"tracing":{"enable":true}}}}' --type=merge |
spec.observability.tracing.address | string | "jaeger.kube-system.svc.cluster.local" | kubectl patch meshconfig osm-mesh-config -n kube-system -p '{"spec":{"observability":{"tracing":{"address":"jaeger.kube-system.svc.cluster.local"}}}}' --type=merge |
spec.observability.tracing.endpoint | string | "/api/v2/spans" | kubectl patch meshconfig osm-mesh-config -n kube-system -p '{"spec":{"observability":{"tracing":{"endpoint":"/api/v2/spans"}}}}' --type=merge |
spec.observability.tracing.port | int | 9411 | kubectl patch meshconfig osm-mesh-config -n kube-system -p '{"spec":{"observability":{"tracing":{"port":9411}}}}' --type=merge |
spec.observability.osmLogLevel | string | "info" | kubectl patch meshconfig osm-mesh-config -n kube-system -p '{"spec":{"observability":{"osmLogLevel":"info"}}}' --type=merge |
spec.sidecar.enablePrivilegedInitContainer | bool | false | kubectl patch meshconfig osm-mesh-config -n kube-system -p '{"spec":{"sidecar":{"enablePrivilegedInitContainer":true}}}' --type=merge |
spec.sidecar.logLevel | string | "error" | kubectl patch meshconfig osm-mesh-config -n kube-system -p '{"spec":{"sidecar":{"logLevel":"error"}}}' --type=merge |
spec.sidecar.maxDataPlaneConnections | int | 0 | kubectl patch meshconfig osm-mesh-config -n kube-system -p '{"spec":{"sidecar":{"maxDataPlaneConnections":0}}}' --type=merge |
spec.sidecar.envoyImage | string | "mcr.azk8s.cn/oss/envoyproxy/envoy:v1.19.1" | kubectl patch meshconfig osm-mesh-config -n kube-system -p '{"spec":{"sidecar":{"envoyImage":"mcr.azk8s.cn/oss/envoyproxy/envoy:v1.19.1"}}}' --type=merge |
spec.sidecar.initContainerImage | string | "mcr.azk8s.cn/oss/openservicemesh/init:v0.11.1" | kubectl patch meshconfig osm-mesh-config -n kube-system -p '{"spec":{"sidecar":{"initContainerImage":"mcr.azk8s.cn/oss/openservicemesh/init:v0.11.1"}}}' --type=merge |
spec.sidecar.configResyncInterval | string | "0s" | kubectl patch meshconfig osm-mesh-config -n kube-system -p '{"spec":{"sidecar":{"configResyncInterval":"30s"}}}' --type=merge |
spec.featureFlags.enableWASMStats | bool | true | kubectl patch meshconfig osm-mesh-config -n kube-system -p '{"spec":{"featureFlags":{"enableWASMStats":true}}}' --type=merge |
spec.featureFlags.enableEgressPolicy | bool | true | kubectl patch meshconfig osm-mesh-config -n kube-system -p '{"spec":{"featureFlags":{"enableEgressPolicy":true}}}' --type=merge |
spec.featureFlags.enableMulticlusterMode | bool | false | kubectl patch meshconfig osm-mesh-config -n kube-system -p '{"spec":{"featureFlags":{"enableMulticlusterMode":false}}}' --type=merge |
spec.featureFlags.enableSnapshotCacheMode | bool | false | kubectl patch meshconfig osm-mesh-config -n kube-system -p '{"spec":{"featureFlags":{"enableSnapshotCacheMode":false}}}' --type=merge |
spec.featureFlags.enableAsyncProxyServiceMapping | bool | false | kubectl patch meshconfig osm-mesh-config -n kube-system -p '{"spec":{"featureFlags":{"enableAsyncProxyServiceMapping":false}}}' --type=merge |
spec.featureFlags.enableIngressBackendPolicy | bool | true | kubectl patch meshconfig osm-mesh-config -n kube-system -p '{"spec":{"featureFlags":{"enableIngressBackendPolicy":true}}}' --type=merge |
spec.featureFlags.enableEnvoyActiveHealthChecks | bool | false | kubectl patch meshconfig osm-mesh-config -n kube-system -p '{"spec":{"featureFlags":{"enableEnvoyActiveHealthChecks":false}}}' --type=merge |
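Every patch command in the table has the same shape: the dotted key is expanded into nested JSON objects and passed to `kubectl patch` as a merge patch. The following sketch generates that JSON from a key and a value so the nesting does not have to be typed by hand. The function names `meshconfig_patch_json` and `patch_meshconfig` are hypothetical, and the value must already be valid JSON (the sketch does not validate it).

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not part of OSM or AKS.

# meshconfig_patch_json KEY VALUE
# Expands a dotted key such as spec.traffic.enableEgress into the nested
# JSON merge patch used in the table. VALUE is inserted as-is, so strings
# must carry their own JSON quotes, for example '"24h"'.
meshconfig_patch_json() (
  key="$1"
  json="$2"
  # Wrap the value in one JSON object per path segment, innermost first.
  while [ -n "$key" ]; do
    last="${key##*.}"
    json="{\"$last\":$json}"
    case "$key" in
      *.*) key="${key%.*}" ;;
      *)   key="" ;;
    esac
  done
  printf '%s\n' "$json"
)

# patch_meshconfig KEY VALUE
# Applies the generated merge patch to osm-mesh-config in kube-system.
patch_meshconfig() {
  kubectl patch meshconfig osm-mesh-config -n kube-system --type=merge \
    -p "$(meshconfig_patch_json "$1" "$2")"
}
```

For example, `patch_meshconfig spec.traffic.enableEgress true` sends the same patch as the first row of the table, and `patch_meshconfig spec.sidecar.logLevel '"error"'` shows how a string value is quoted.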
Note
The `kube-system` namespace never participates in a service mesh and is never labeled or annotated with the following key/value pairs.
The `osm namespace add` command joins a namespace to a given service mesh. For a Kubernetes namespace to be part of the mesh, it must have the following annotation and label.
Use the `kubectl get namespace` command with `jq '.metadata.annotations'` to view the annotations.

    kubectl get namespace bookbuyer -o json | jq '.metadata.annotations'

The output must include the following annotation:

    {
      "openservicemesh.io/sidecar-injection": "enabled"
    }
Use the `kubectl get namespace` command with `jq '.metadata.labels'` to view the labels.

    kubectl get namespace bookbuyer -o json | jq '.metadata.labels'

The output must include the following label:

    {
      "openservicemesh.io/monitored-by": "osm"
    }
If a namespace doesn't have the `"openservicemesh.io/sidecar-injection": "enabled"` annotation or the `"openservicemesh.io/monitored-by": "osm"` label, the OSM injector doesn't add Envoy sidecars.
Note
After `osm namespace add` is called, only new pods are injected with an Envoy sidecar. Existing pods must be restarted by using the `kubectl rollout restart deployment ...` command.
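The annotation and label checks can be combined into a single pass or fail test for a namespace. The following is a sketch under stated assumptions: the function name `namespace_mesh_status` is hypothetical, the expected label value `osm` is the mesh name from the example above, and the dots inside the annotation and label keys are escaped with backslashes so that the `kubectl` JSONPath parser treats each key as one field.

```shell
#!/bin/sh
# Hypothetical helper; the name is illustrative, not part of OSM or AKS.

# namespace_mesh_status NAMESPACE
# Reports whether the namespace carries both the sidecar-injection
# annotation and the monitored-by label that the OSM injector requires.
namespace_mesh_status() (
  ns="$1"
  injection="$(kubectl get namespace "$ns" \
    -o jsonpath='{.metadata.annotations.openservicemesh\.io/sidecar-injection}')"
  monitored="$(kubectl get namespace "$ns" \
    -o jsonpath='{.metadata.labels.openservicemesh\.io/monitored-by}')"
  # "osm" is the mesh name used in this article; adjust if yours differs.
  if [ "$injection" = "enabled" ] && [ "$monitored" = "osm" ]; then
    echo "$ns: in mesh"
  else
    echo "$ns: NOT in mesh (sidecar-injection='$injection', monitored-by='$monitored')"
  fi
)
```

Calling `namespace_mesh_status bookbuyer` prints which of the two values is missing when the namespace is not part of the mesh.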
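Use the `kubectl get crds` command to check whether the cluster has the required custom resource definitions (CRDs).

    kubectl get crds

The following CRDs must be installed on the cluster: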
- egresses.policy.openservicemesh.io
- httproutegroups.specs.smi-spec.io
- ingressbackends.policy.openservicemesh.io
- meshconfigs.config.openservicemesh.io
- multiclusterservices.config.openservicemesh.io
- tcproutes.specs.smi-spec.io
- trafficsplits.split.smi-spec.io
- traffictargets.access.smi-spec.io
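Comparing the `kubectl get crds` output against this list by eye is error-prone on clusters with many CRDs. The following sketch prints only the required CRDs that are missing. The variable `REQUIRED_OSM_CRDS` and the function `missing_osm_crds` are hypothetical names, and the function reads installed CRD names from standard input so that it can be fed from `kubectl` or from a saved file.

```shell
#!/bin/sh
# Hypothetical helper; the names are illustrative, not part of OSM or AKS.

# CRDs that the list above says must be installed on the cluster.
REQUIRED_OSM_CRDS="
egresses.policy.openservicemesh.io
httproutegroups.specs.smi-spec.io
ingressbackends.policy.openservicemesh.io
meshconfigs.config.openservicemesh.io
multiclusterservices.config.openservicemesh.io
tcproutes.specs.smi-spec.io
trafficsplits.split.smi-spec.io
traffictargets.access.smi-spec.io
"

# missing_osm_crds
# Reads installed CRD names from stdin (one per line) and prints every
# required CRD that is not among them. Prints nothing when all are present.
missing_osm_crds() (
  installed="$(cat)"
  for crd in $REQUIRED_OSM_CRDS; do
    # -x matches whole lines only; -F treats the name as a literal string.
    printf '%s\n' "$installed" | grep -qxF "$crd" || printf '%s\n' "$crd"
  done
)
```

A possible invocation is `kubectl get crds -o name | sed 's|.*/||' | missing_osm_crds`, where `sed` strips the resource-type prefix that `-o name` adds before each CRD name.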
Use the `osm mesh list` command to get the versions of the installed SMI CRDs.

    osm mesh list

The output should resemble the following example:

    MESH NAME   MESH NAMESPACE   VERSION   ADDED NAMESPACES
    osm         kube-system      v0.11.1

    MESH NAME   MESH NAMESPACE   SMI SUPPORTED
    osm         kube-system      HTTPRouteGroup:v1alpha4,TCPRoute:v1alpha4,TrafficSplit:v1alpha2,TrafficTarget:v1alpha3

    To list the OSM controller pods for a mesh, please run the following command passing in the mesh's namespace
        kubectl get pods -n <osm-mesh-namespace> -l app=osm-controller
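OSM controller v0.11.1 requires the SMI versions shown in the SMI SUPPORTED column of the example output: HTTPRouteGroup v1alpha4, TCPRoute v1alpha4, TrafficSplit v1alpha2, and TrafficTarget v1alpha3.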
To learn more about how OSM issues and manages certificates for the Envoy proxies that run on application pods, see the OSM certificate guide.
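When a new pod is created in a namespace that the add-on monitors, OSM injects an Envoy proxy sidecar into that pod. For more information about how to update the Envoy version, see the OSM upgrade guide.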