Sustainable software engineering principles in Azure Kubernetes Service (AKS)

The sustainable software engineering principles are a set of competencies that help you define, build, and run sustainable applications. The overall goal is to reduce the carbon footprint of every aspect of your application. The Principles.Green project has an overview of the principles of sustainable software engineering.

An important idea to understand about sustainable software engineering is that it's a shift in priorities and focus. In many cases, software is designed and run in a way that focuses on fast performance and low latency. Sustainable software engineering focuses instead on reducing carbon emissions as much as possible. In some cases, applying sustainable software engineering principles can also give you faster performance or lower latency, such as by lowering total network travel. Before applying sustainable software engineering principles to your application, review the priorities, needs, and trade-offs of your application.

Measure and optimize

To lower the carbon footprint of your AKS clusters, you need to understand how your cluster's resources are being used. Azure Monitor provides details on your cluster's resource usage, such as memory and CPU usage. This data can inform your decisions about reducing the carbon footprint of your cluster and lets you observe the effect of your changes. You can also install the Microsoft Sustainability Calculator to see the carbon footprint of all your Azure resources.

Increase resource utilization

One approach to lowering your carbon footprint is to reduce the amount of time your compute resources sit idle. Reducing idle time means increasing the utilization of your compute resources. For example, if you had four nodes in your cluster, each running at 50% capacity, each of the four nodes would have 50% of its capacity sitting idle. If you reduced your cluster to three nodes, the same workload would cause the three nodes to run at 67% capacity, reducing the unused capacity to 33% on each node and increasing your utilization.
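The arithmetic in the example above can be checked with a short sketch. It assumes identical nodes and a perfectly even spread of load, which real clusters only approximate; the function name `per_node_utilization` is hypothetical.

```python
def per_node_utilization(node_count: int, utilization: float, new_node_count: int) -> float:
    """Return per-node utilization after moving the same total load onto new_node_count nodes.

    Assumption: all nodes are the same size and load is spread evenly.
    """
    if node_count <= 0 or new_node_count <= 0:
        raise ValueError("node counts must be positive")
    # Total load expressed in units of "whole nodes" of capacity.
    total_load = node_count * utilization
    return total_load / new_node_count


before = per_node_utilization(4, 0.50, 4)
after = per_node_utilization(4, 0.50, 3)
print(f"4 nodes: {before:.0%} used, {1 - before:.0%} idle")
print(f"3 nodes: {after:.0%} used, {1 - after:.0%} idle")
```

A result above 100% would mean the smaller cluster cannot carry the workload at all, so a check like this is only a first filter before looking at real utilization data.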

Important

When considering changes to the resources in your cluster, verify that your system pools have enough resources to keep the core system components of your cluster stable. Never reduce your cluster's resources to the point where your cluster may become unstable.

After reviewing your cluster's utilization, consider using the features offered by multiple node pools. You can use node sizing to define node pools with specific CPU and memory profiles, allowing you to tailor your nodes to your workload needs. Sizing your nodes to your workload needs can help you run fewer nodes at higher utilization. You can also configure how your cluster scales, and use the horizontal pod autoscaler and the cluster autoscaler to scale your cluster automatically based on your configuration. Controlling how your cluster scales can help you keep all your nodes running at high utilization while keeping up with changes to your cluster's workload.
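To give a feel for how the horizontal pod autoscaler reacts to load, the sketch below applies the basic scaling rule described in the Kubernetes documentation: the desired replica count is the current count multiplied by the ratio of the observed metric to the target metric, rounded up. This is a simplification; the real controller also applies a tolerance band, stabilization windows, and other safeguards. The function name and the default bounds are hypothetical.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float, target_metric: float,
                     min_replicas: int = 1, max_replicas: int = 10) -> int:
    """Simplified horizontal pod autoscaler rule, clamped to the configured bounds."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# Four replicas averaging 90% CPU against a 60% target: scale out.
print(desired_replicas(4, 90, 60))
# Four replicas averaging 30% CPU against a 60% target: scale in.
print(desired_replicas(4, 30, 60))
```

A higher utilization target keeps fewer replicas busy at a higher level, which is the behavior the paragraph above describes, but it also leaves less headroom for sudden spikes.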

Increasing utilization can also reduce excess nodes, which reduces the energy consumed by resource reservations on each node.

Also review the CPU and memory requests and limits in the Kubernetes manifests of your applications. As you lower those values, more memory and CPU become available to the cluster to run other workloads. As you run more workloads with lower CPU and memory, your cluster becomes more densely allocated, which increases your utilization. However, if you set these values too low, the behavior of your applications may become degraded or unstable. Before changing the CPU and memory requests and limits, consider running some benchmarking tests to understand whether these values are set appropriately. Never reduce these values to the point where your application becomes unstable.
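As a rough illustration of why lower requests lead to denser allocation, the sketch below counts how many copies of a pod fit on one node for a given pair of requests. The node capacity and request figures are hypothetical, and the calculation ignores system pods, per-node resource reservations, and every other scheduling constraint.

```python
def pods_per_node(node_cpu_m: int, node_mem_mib: int,
                  request_cpu_m: int, request_mem_mib: int) -> int:
    """Count identical pods that fit on one node, judged by CPU and memory requests alone.

    CPU is in millicores and memory in MiB. Hypothetical simplification:
    nothing else is running on the node.
    """
    if request_cpu_m <= 0 or request_mem_mib <= 0:
        raise ValueError("requests must be positive")
    # Whichever resource runs out first limits the pod count.
    return min(node_cpu_m // request_cpu_m, node_mem_mib // request_mem_mib)


# Hypothetical node with 4000m CPU and 16384 MiB of allocatable memory.
print(pods_per_node(4000, 16384, 500, 1024))  # original requests
print(pods_per_node(4000, 16384, 250, 512))   # halved requests
```

Halving the requests doubles the count only on paper; whether the application still behaves correctly at the lower values is what the benchmarking step is meant to establish.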

Reduce network travel

Reducing the distance that requests and responses must travel to and from your cluster usually reduces the electricity consumed by networking devices, which lowers carbon emissions. After reviewing your network traffic, consider creating clusters in regions closer to the source of that traffic. You can also use Azure Traffic Manager to route traffic to the closest cluster and reduce the distance between Azure resources.
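One way to reason about region placement is to weight each candidate region by where your traffic comes from. The sketch below is a minimal illustration, assuming latency is an acceptable proxy for network distance; the traffic shares and latency figures are invented for the example, and `closest_region` is a hypothetical helper rather than part of any Azure tooling.

```python
def closest_region(traffic_share: dict, latency_ms: dict) -> str:
    """Return the region with the lowest traffic-weighted latency.

    traffic_share maps a traffic source to its fraction of total requests.
    latency_ms maps a region to {source: latency in milliseconds}.
    """
    def weighted_latency(region: str) -> float:
        return sum(share * latency_ms[region][source]
                   for source, share in traffic_share.items())

    return min(latency_ms, key=weighted_latency)


# Hypothetical measurements, for illustration only.
traffic_share = {"europe": 0.7, "north_america": 0.2, "asia": 0.1}
latency_ms = {
    "westeurope": {"europe": 20, "north_america": 90, "asia": 180},
    "eastus": {"europe": 90, "north_america": 25, "asia": 210},
    "southeastasia": {"europe": 170, "north_america": 220, "asia": 30},
}
print(closest_region(traffic_share, latency_ms))
```

With real data, the traffic shares would come from your own network traffic review rather than fixed constants.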

Important

When considering changes to your cluster's networking, never reduce network travel at the expense of meeting workload requirements. For example, using [availability zones][availability-zones] causes more network travel on your cluster, but that feature may be necessary to meet workload requirements.

Demand shaping

Where possible, consider shifting demand for your cluster's resources to times or regions where you can use excess capacity. For example, consider changing the time or region in which a batch job runs. Also consider refactoring your application to use a queue to defer workloads that don't need immediate processing.
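The queue-based deferral described above can be sketched in a few lines. This is an in-process stand-in, assuming a fixed off-peak window of 01:00 to 05:00; in a real application the queue would more likely be a durable messaging service, and the window would be chosen from your own capacity data. All names here (`DeferredRunner`, `OFF_PEAK_START`, `OFF_PEAK_END`) are hypothetical.

```python
from collections import deque
from datetime import time

# Hypothetical off-peak window when spare capacity is assumed to be available.
OFF_PEAK_START = time(1, 0)
OFF_PEAK_END = time(5, 0)


def in_off_peak(now: time) -> bool:
    return OFF_PEAK_START <= now < OFF_PEAK_END


class DeferredRunner:
    """Runs urgent work immediately and holds deferrable work until off-peak."""

    def __init__(self) -> None:
        self.pending = deque()
        self.completed = []

    def submit(self, name: str, urgent: bool, now: time) -> None:
        if urgent or in_off_peak(now):
            self.completed.append(name)  # stand-in for actually running the job
        else:
            self.pending.append(name)

    def drain(self, now: time) -> int:
        """Run all queued work if the off-peak window is open; return the count run."""
        if not in_off_peak(now):
            return 0
        ran = 0
        while self.pending:
            self.completed.append(self.pending.popleft())
            ran += 1
        return ran


runner = DeferredRunner()
runner.submit("checkout-request", urgent=True, now=time(14, 0))
runner.submit("nightly-report", urgent=False, now=time(14, 0))
print(runner.completed, list(runner.pending))
runner.drain(now=time(2, 0))
print(runner.completed, list(runner.pending))
```

The design choice is the split at submission time: only work that is explicitly marked as non-urgent is ever delayed, so user-facing requests are unaffected.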

Next steps

Learn more about the features of AKS mentioned in this article: