Performance tips for Azure Cosmos DB Java SDK v4

Important

The performance tips in this article are for Azure Cosmos DB Java SDK v4 only. See the Azure Cosmos DB Java SDK v4 Release notes, Maven repository, and Azure Cosmos DB Java SDK v4 troubleshooting guide for more information. If you are currently using a version older than v4, see the Migrate to Azure Cosmos DB Java SDK v4 guide for help upgrading to v4.

Azure Cosmos DB is a fast and flexible distributed database that scales seamlessly with guaranteed latency and throughput. You do not have to make major architecture changes or write complex code to scale your database with Azure Cosmos DB. Scaling up and down is as easy as making a single API call or SDK method call. However, because Azure Cosmos DB is accessed via network calls, there are client-side optimizations you can make to achieve peak performance when using Azure Cosmos DB Java SDK v4.

So if you're asking "How can I improve my database performance?", consider the following options:

Networking

  • Connection mode: Use Direct mode

    How a client connects to Azure Cosmos DB has important implications for performance, especially in terms of client-side latency. The connection mode is a key configuration setting available for configuring the client. For Azure Cosmos DB Java SDK v4, the two available connection modes are:

    • Direct mode (default)
    • Gateway mode

    These connection modes determine the route that data plane requests (document reads and writes) take from your client machine to partitions in the Azure Cosmos DB back-end. Generally, Direct mode is the preferred option for best performance: it allows your client to open TCP connections directly to partitions in the Azure Cosmos DB back-end and send requests directly with no intermediary. By contrast, in Gateway mode, requests made by your client are routed to a so-called "Gateway" server in the Azure Cosmos DB front-end, which in turn fans out your requests to the appropriate partition(s) in the Azure Cosmos DB back-end. If your application runs within a corporate network with strict firewall restrictions, Gateway mode is the best choice since it uses the standard HTTPS port and a single endpoint. The performance tradeoff, however, is that Gateway mode involves an additional network hop (client to Gateway plus Gateway to partition) every time data is read from or written to Azure Cosmos DB. Because of this, Direct mode offers better performance due to fewer network hops.

    The connection mode for data plane requests is configured in the Azure Cosmos DB client builder using the directMode() or gatewayMode() methods, as shown below. To configure either mode with default settings, call either method without arguments. Otherwise, pass a configuration settings class instance as the argument (DirectConnectionConfig for directMode(), GatewayConnectionConfig for gatewayMode()).

    Java SDK V4 (Maven com.azure::azure-cosmos) Async API

    
    /* Direct mode, default settings */
    CosmosAsyncClient clientDirectDefault = new CosmosClientBuilder()
            .endpoint(HOSTNAME)
            .key(MASTERKEY)
            .consistencyLevel(CONSISTENCY)
            .directMode()
            .buildAsyncClient();
    
    /* Direct mode, custom settings */
    DirectConnectionConfig directConnectionConfig = DirectConnectionConfig.getDefaultConfig();
    
    // Example config, do not use these settings as defaults
    directConnectionConfig.setMaxConnectionsPerEndpoint(120);
    directConnectionConfig.setIdleConnectionTimeout(Duration.ofMillis(100));
    
    CosmosAsyncClient clientDirectCustom = new CosmosClientBuilder()
            .endpoint(HOSTNAME)
            .key(MASTERKEY)
            .consistencyLevel(CONSISTENCY)
            .directMode(directConnectionConfig)
            .buildAsyncClient();
    
    /* Gateway mode, default settings */
    CosmosAsyncClient clientGatewayDefault = new CosmosClientBuilder()
            .endpoint(HOSTNAME)
            .key(MASTERKEY)
            .consistencyLevel(CONSISTENCY)
            .gatewayMode()
            .buildAsyncClient();
    
    /* Gateway mode, custom settings */
    GatewayConnectionConfig gatewayConnectionConfig = GatewayConnectionConfig.getDefaultConfig();
    
    // Example config, do not use these settings as defaults
    gatewayConnectionConfig.setProxy(new ProxyOptions(ProxyOptions.Type.HTTP, InetSocketAddress.createUnresolved("your.proxy.addr",80)));
    gatewayConnectionConfig.setMaxConnectionPoolSize(150);
    
    CosmosAsyncClient clientGatewayCustom = new CosmosClientBuilder()
            .endpoint(HOSTNAME)
            .key(MASTERKEY)
            .consistencyLevel(CONSISTENCY)
            .gatewayMode(gatewayConnectionConfig)
            .buildAsyncClient();
    
    /* No connection mode, defaults to Direct mode with default settings */
    CosmosAsyncClient clientDefault = new CosmosClientBuilder()
            .endpoint(HOSTNAME)
            .key(MASTERKEY)
            .consistencyLevel(CONSISTENCY)
            .buildAsyncClient();
    
    

    The directMode() method has an additional overload, for the following reason. Control plane operations such as database and container CRUD always use Gateway mode; when the user has configured Direct mode for data plane operations, control plane operations use default Gateway mode settings. This suits most users. However, users who want Direct mode for data plane operations as well as tunability of control plane Gateway mode parameters can use the following directMode() overload:

    Java SDK V4 (Maven com.azure::azure-cosmos) Async API

    
    /* Independent customization of Direct mode data plane and Gateway mode control plane */
    DirectConnectionConfig directConnectionConfig = DirectConnectionConfig.getDefaultConfig();
    
    // Example config, do not use these settings as defaults
    directConnectionConfig.setMaxConnectionsPerEndpoint(120);
    directConnectionConfig.setIdleConnectionTimeout(Duration.ofMillis(100));
    
    GatewayConnectionConfig gatewayConnectionConfig = GatewayConnectionConfig.getDefaultConfig();
    
    // Example config, do not use these settings as defaults
    gatewayConnectionConfig.setProxy(new ProxyOptions(ProxyOptions.Type.HTTP, InetSocketAddress.createUnresolved("your.proxy.addr",80)));
    gatewayConnectionConfig.setMaxConnectionPoolSize(150);
    
    CosmosAsyncClient clientDirectCustom = new CosmosClientBuilder()
            .endpoint(HOSTNAME)
            .key(MASTERKEY)
            .consistencyLevel(CONSISTENCY)
            .directMode(directConnectionConfig,gatewayConnectionConfig)
            .buildAsyncClient();
    
    

  • Collocate clients in same Azure region for performance

    When possible, place any applications calling Azure Cosmos DB in the same region as the Azure Cosmos database. Latency can vary from request to request depending on the route taken by the request as it passes from the client to the Azure datacenter boundary. The lowest possible latency is achieved by ensuring the calling application is located within the same Azure region as the provisioned Azure Cosmos DB endpoint. For a list of available regions, see Azure Regions.

    [Figure: Illustration of the Azure Cosmos DB connection policy]

    An app that interacts with a multi-region Azure Cosmos DB account needs to configure preferred locations to ensure that requests are going to a collocated region.
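    As a minimal sketch of configuring preferred locations, the client builder's preferredRegions method accepts an ordered list of region names. The class name, method name, and the two regions below are illustrative assumptions; substitute the regions your own account is replicated to, listing the region collocated with your application first.

```java
import com.azure.cosmos.CosmosAsyncClient;
import com.azure.cosmos.CosmosClientBuilder;

import java.util.Arrays;

// Hypothetical helper class, for illustration only.
final class PreferredRegionsSketch {

    // Builds an async client that prefers the listed regions, in order.
    static CosmosAsyncClient build(String endpoint, String key) {
        return new CosmosClientBuilder()
                .endpoint(endpoint)
                .key(key)
                // Assumption: the application runs in West US and the account
                // is also replicated to East US.
                .preferredRegions(Arrays.asList("West US", "East US"))
                .buildAsyncClient();
    }
}
```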

  • Enable Accelerated Networking on your Azure VM for lower latency

    It is recommended that you follow the instructions to enable Accelerated Networking in your Windows or Linux Azure VM, in order to maximize performance.

    Without accelerated networking, IO that transits between your Azure VM and other Azure resources may be unnecessarily routed through a host and virtual switch situated between the VM and its network card. Having the host and virtual switch inline in the datapath not only increases latency and jitter in the communication channel, it also steals CPU cycles from the VM. With accelerated networking, the VM interfaces directly with the NIC without intermediaries; any network policy details which were being handled by the host and virtual switch are now handled in hardware at the NIC; the host and virtual switch are bypassed. Generally you can expect lower latency and higher throughput, as well as more consistent latency and decreased CPU utilization when you enable accelerated networking.

    Limitations: accelerated networking must be supported on the VM OS, and can only be enabled when the VM is stopped and deallocated. The VM cannot be deployed with Azure Resource Manager.

    See the Windows and Linux instructions for more details.

SDK usage

  • Install the most recent SDK

    The Azure Cosmos DB SDKs are constantly being improved to provide the best performance. See the Azure Cosmos DB SDK pages to determine the most recent SDK and review improvements.

  • Use a singleton Azure Cosmos DB client for the lifetime of your application

    Each Azure Cosmos DB client instance is thread-safe and performs efficient connection management and address caching. To allow efficient connection management and better performance by the Azure Cosmos DB client, it is recommended to use a single instance of the Azure Cosmos DB client per AppDomain for the lifetime of the application.

  • Use the lowest consistency level required for your application

    When you create a CosmosClient, the default consistency used if not explicitly set is Session. If Session consistency is not required by your application logic, set the consistency level to Eventual. Note: it is recommended to use at least Session consistency in applications employing the Azure Cosmos DB Change Feed processor.

  • Use Async API to max out provisioned throughput

    Azure Cosmos DB Java SDK v4 bundles two APIs, Sync and Async. Roughly speaking, the Async API implements SDK functionality, whereas the Sync API is a thin wrapper that makes blocking calls to the Async API. This stands in contrast to the older Azure Cosmos DB Async Java SDK v2, which was Async-only, and to the older Azure Cosmos DB Sync Java SDK v2, which was Sync-only and had a completely separate implementation.

    The choice of API is determined during client initialization; a CosmosAsyncClient supports the Async API while a CosmosClient supports the Sync API.

    The Async API implements non-blocking IO and is the optimal choice if your goal is to max out throughput when issuing requests to Azure Cosmos DB.

    Using the Sync API can be the right choice if you want or need an API which blocks on the response to each request, or if synchronous operation is the dominant paradigm in your application. For example, you might want the Sync API when you are persisting data to Azure Cosmos DB in a microservices application, provided throughput is not critical.

    Just be aware that Sync API throughput degrades with increasing request response time, whereas the Async API can saturate the full bandwidth capabilities of your hardware.

    Geographic collocation can give you higher and more consistent throughput when using the Sync API (see Collocate clients in same Azure region for performance) but still is not expected to exceed the throughput attainable with the Async API.
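    To make the relationship between response time and blocking throughput concrete, here is a rough back-of-the-envelope model. It is not SDK code. It assumes each application thread issues one blocking request at a time and ignores all other overhead, so it gives only an upper bound. The class and method names are hypothetical.

```java
// Rough model: a thread that blocks on every request can complete at most
// 1000 / latencyMillis requests per second.
final class BlockingThroughput {

    // Upper bound on requests per second for the given number of blocking threads.
    static double maxRequestsPerSecond(int threads, double latencyMillis) {
        return threads * 1000.0 / latencyMillis;
    }

    public static void main(String[] args) {
        System.out.println(maxRequestsPerSecond(1, 10.0)); // 100.0
        System.out.println(maxRequestsPerSecond(1, 50.0)); // 20.0
    }
}
```

    Under this model, a fivefold increase in response time cuts the per-thread ceiling by a factor of five, which is why collocation matters so much for Sync API workloads.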

    Some users may also be unfamiliar with Project Reactor, the Reactive Streams framework used to implement the Azure Cosmos DB Java SDK v4 Async API. If this is a concern, we recommend you read our introductory Reactor Pattern Guide and then take a look at this Introduction to Reactive Programming in order to familiarize yourself. If you have already used Azure Cosmos DB with an Async interface, and the SDK you used was Azure Cosmos DB Async Java SDK v2, then you may be familiar with ReactiveX/RxJava but be unsure what has changed in Project Reactor. In that case, take a look at our Reactor vs. RxJava Guide.

    The following code snippets show how to initialize your Azure Cosmos DB client for Async API or Sync API operation, respectively:

    Java SDK V4 (Maven com.azure::azure-cosmos) Async API

    
    CosmosAsyncClient client = new CosmosClientBuilder()
            .endpoint(HOSTNAME)
            .key(MASTERKEY)
            .consistencyLevel(CONSISTENCY)
            .buildAsyncClient();
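
    The snippet above covers the Async API. A corresponding Sync API initialization might look like the following sketch, which calls buildClient() instead of buildAsyncClient(). The class name and the two environment variable names are assumptions made for illustration, and Session consistency is chosen only as an example.

```java
import com.azure.cosmos.ConsistencyLevel;
import com.azure.cosmos.CosmosClient;
import com.azure.cosmos.CosmosClientBuilder;

// Hypothetical class name, for illustration only.
final class SyncClientSketch {

    public static void main(String[] args) {
        // Assumption: endpoint and key are supplied through these environment variables.
        String hostname = System.getenv("COSMOS_ENDPOINT");
        String masterKey = System.getenv("COSMOS_KEY");

        // buildClient() returns the blocking CosmosClient (Sync API).
        CosmosClient syncClient = new CosmosClientBuilder()
                .endpoint(hostname)
                .key(masterKey)
                .consistencyLevel(ConsistencyLevel.SESSION)
                .buildClient();

        // Release connections when the application shuts down.
        syncClient.close();
    }
}
```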
    
    
  • Tuning ConnectionPolicy

    By default, Direct mode Cosmos DB requests are made over TCP when using Azure Cosmos DB Java SDK v4. Internally, Direct mode uses a special architecture to dynamically manage network resources and get the best performance.

    In Azure Cosmos DB Java SDK v4, Direct mode is the best choice to improve database performance with most workloads.

    • Overview of Direct mode

      [Figure: Illustration of the Direct mode architecture]

      The client-side architecture employed in Direct mode enables predictable network utilization and multiplexed access to Azure Cosmos DB replicas. The diagram above shows how Direct mode routes client requests to replicas in the Cosmos DB backend. The Direct mode architecture allocates up to 10 Channels on the client side per DB replica. A Channel is a TCP connection preceded by a request buffer, which is 30 requests deep. The Channels belonging to a replica are dynamically allocated as needed by the replica's Service Endpoint. When the user issues a request in Direct mode, the TransportClient routes the request to the proper Service Endpoint based on the partition key. The Request Queue buffers requests before the Service Endpoint.
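
    The figures quoted above (up to 10 Channels per replica, each with a 30-request buffer) imply a ceiling on how many requests the Channels can hold. The following worked arithmetic is only an illustration of those two numbers, not SDK code. The class name is hypothetical and the replica count of 4 is an example value.

```java
// Worked arithmetic for the Direct mode limits described above.
final class DirectModeCapacity {

    // Figures quoted in the text.
    static final int MAX_CHANNELS_PER_REPLICA = 10;
    static final int REQUESTS_PER_CHANNEL = 30;

    // Upper bound on requests buffered across all Channels of the given replicas.
    static int maxBufferedRequests(int replicas) {
        return replicas * MAX_CHANNELS_PER_REPLICA * REQUESTS_PER_CHANNEL;
    }

    public static void main(String[] args) {
        System.out.println(maxBufferedRequests(1)); // 300
        System.out.println(maxBufferedRequests(4)); // 1200
    }
}
```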

    • Configuration options for Direct mode

      If non-default Direct mode behavior is desired, create a DirectConnectionConfig instance and customize its properties, then pass the customized instance to the directMode() method in the Azure Cosmos DB client builder.

      These configuration settings control the behavior of the underlying Direct mode architecture discussed above.

      As a first step, use the recommended configuration settings below. These DirectConnectionConfig options are advanced configuration settings which can affect SDK performance in unexpected ways; we recommend users avoid modifying them unless they fully understand the tradeoffs and it is absolutely necessary. Please contact the Azure Cosmos DB team if you run into issues on this particular topic.

      Configuration option          Default
      idleConnectionTimeout         "PT1M"
      maxConnectionsPerEndpoint     "PT0S"
      connectTimeout                "PT1M10S"
      idleEndpointTimeout           8388608
      maxRequestsPerConnection      10

  • Tuning parallel queries for partitioned collections

    Azure Cosmos DB Java SDK v4 supports parallel queries, which enable you to query a partitioned collection in parallel. For more information, see code samples related to working with Azure Cosmos DB Java SDK v4. Parallel queries are designed to improve query latency and throughput over their serial counterpart.

    • Tuning setMaxDegreeOfParallelism:

      Parallel queries work by querying multiple partitions in parallel. However, data from an individual partition is fetched serially with respect to the query. So, setting setMaxDegreeOfParallelism to the number of partitions gives the best chance of achieving the most performant query, provided all other system conditions remain the same. If you don't know the number of partitions, you can use setMaxDegreeOfParallelism to set a high number, and the system chooses the minimum (number of partitions, user-provided input) as the maximum degree of parallelism.

      It is important to note that parallel queries produce the best benefits if the data is evenly distributed across all partitions with respect to the query. If the collection is partitioned in such a way that all or a majority of the data returned by a query is concentrated in a few partitions (one partition in the worst case), then the performance of the query will be bottlenecked by those partitions.

    • Tuning setMaxBufferedItemCount:

      Parallel query is designed to pre-fetch results while the current batch of results is being processed by the client. Pre-fetching helps improve the overall latency of a query. setMaxBufferedItemCount limits the number of pre-fetched results. Setting setMaxBufferedItemCount to the expected number of results returned (or a higher number) enables the query to receive maximum benefit from pre-fetching.

      Pre-fetching works the same way irrespective of the MaxDegreeOfParallelism, and there is a single buffer for the data from all partitions.
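
    As a minimal sketch of the two settings above, both can be set on a CosmosQueryRequestOptions instance that is passed to queryItems. The class and method names, the query text, and the values 1000 and 100 are illustrative assumptions, not recommended defaults.

```java
import com.azure.cosmos.CosmosAsyncContainer;
import com.azure.cosmos.models.CosmosQueryRequestOptions;
import com.azure.cosmos.util.CosmosPagedFlux;
import com.fasterxml.jackson.databind.JsonNode;

// Hypothetical helper class, for illustration only.
final class ParallelQuerySketch {

    static CosmosPagedFlux<JsonNode> queryAll(CosmosAsyncContainer container) {
        CosmosQueryRequestOptions options = new CosmosQueryRequestOptions();

        // A deliberately high value: per the text, the effective degree of
        // parallelism is min(number of partitions, this value).
        options.setMaxDegreeOfParallelism(1000);

        // Example value: roughly the number of results the query is expected to return.
        options.setMaxBufferedItemCount(100);

        return container.queryItems("SELECT * FROM c", options, JsonNode.class);
    }
}
```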

  • Scale out your client workload

    If you are testing at high throughput levels, the client application may become the bottleneck due to the machine capping out on CPU or network utilization. If you reach this point, you can continue to push the Azure Cosmos DB account further by scaling out your client applications across multiple servers.

    A good rule of thumb is not to exceed 50% CPU utilization on any given server, to keep latency low.

  • Tune the page size for queries/read feeds for better performance

    When performing a bulk read of documents by using read feed functionality (for example, readItems) or when issuing a SQL query (queryItems), the results are returned in a segmented fashion if the result set is too large. By default, results are returned in chunks of 100 items or 1 MB, whichever limit is hit first.

    Suppose that your application issues a query to Azure Cosmos DB, and suppose that your application requires the full set of query results in order to complete its task. To reduce the number of network round trips required to retrieve all applicable results, you can increase the page size by adjusting the x-ms-max-item-count request header field.

    • For single-partition queries, adjusting the x-ms-max-item-count field value to -1 (no limit on page size) minimizes latency by minimizing the number of query response pages: either the full result set is returned in a single page, or, if the query takes longer than the timeout interval, the full result set is returned in the minimum number of pages possible. This saves multiples of the request round-trip time.

    • For cross-partition queries, setting the x-ms-max-item-count field to -1 and removing the page size limit risks overwhelming the SDK with unmanageable page sizes. In the cross-partition case we recommend raising the page size limit to some large but finite value, perhaps 1000, to reduce latency.
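
    The effect of page size on round trips can be estimated with simple arithmetic. The sketch below is not SDK code. It ignores the 1 MB limit and the fact that the service may return pages smaller than requested, so it gives only the minimum number of round trips. The class name and the result-set size of 10,000 are hypothetical.

```java
// Minimum number of response pages (network round trips) needed to
// retrieve totalItems results at a given page size.
final class PageMath {

    static long minRoundTrips(long totalItems, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        // Ceiling division without floating point.
        return (totalItems + pageSize - 1) / pageSize;
    }

    public static void main(String[] args) {
        System.out.println(minRoundTrips(10_000, 100));  // 100 (default page size)
        System.out.println(minRoundTrips(10_000, 1000)); // 10
    }
}
```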

    In some applications, you may not require the full set of query results. In cases where you need to display only a few results, for example, if your user interface or application API returns only 10 results at a time, you can also decrease the page size to 10 to reduce the throughput consumed for reads and queries.

    You may also set the preferred page size argument of the byPage method, rather than modifying the REST header field directly. Keep in mind that x-ms-max-item-count and the preferred page size argument of byPage set only an upper limit on page size, not an absolute requirement; for a variety of reasons you may see Azure Cosmos DB return pages which are smaller than your preferred page size.
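
    A minimal sketch of the byPage approach follows. The class and method names and the query text are assumptions for illustration; the caller chooses the preferred page size, for example 1000 for a cross-partition query or 10 for a paged user interface.

```java
import com.azure.cosmos.CosmosAsyncContainer;
import com.azure.cosmos.models.CosmosQueryRequestOptions;
import com.azure.cosmos.models.FeedResponse;
import com.fasterxml.jackson.databind.JsonNode;
import reactor.core.publisher.Flux;

// Hypothetical helper class, for illustration only.
final class PageSizeSketch {

    // Returns the query results page by page; preferredPageSize is an upper
    // limit on the number of items per page, not a guarantee.
    static Flux<FeedResponse<JsonNode>> queryInPages(CosmosAsyncContainer container,
                                                     int preferredPageSize) {
        return container
                .queryItems("SELECT * FROM c", new CosmosQueryRequestOptions(), JsonNode.class)
                .byPage(preferredPageSize);
    }
}
```

    Each FeedResponse exposes the items of one page through getResults(), so a subscriber can process results one page at a time.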

  • Use Appropriate Scheduler (Avoid stealing Event loop IO Netty threads)

    The asynchronous functionality of Azure Cosmos DB Java SDK is based on netty non-blocking IO. The SDK uses a fixed number of IO netty event loop threads (as many as your machine has CPU cores) for executing IO operations. The Flux returned by the API emits the result on one of the shared IO event loop netty threads. So it is important not to block the shared IO event loop netty threads. Doing CPU-intensive work or a blocking operation on the IO event loop netty thread may cause deadlock or significantly reduce SDK throughput.

    For example, the following code executes CPU-intensive work on the event loop IO netty thread:

    Java SDK V4 (Maven com.azure::azure-cosmos) Async API

    
    Mono<CosmosItemResponse<CustomPOJO>> createItemPub = asyncContainer.createItem(item);
    createItemPub.subscribe(
            itemResponse -> {
                //this is executed on eventloop IO netty thread.
                //the eventloop thread is shared and is meant to return back quickly.
                //
                // DON'T do this on eventloop IO netty thread.
                veryCpuIntensiveWork();
            });
    
    

    After the result is received, if you want to do CPU-intensive work on it, avoid doing so on the event loop IO netty thread. You can instead provide your own Scheduler to supply a thread for running your work, as shown below (requires import reactor.core.scheduler.Schedulers).

    Java SDK V4 (Maven com.azure::azure-cosmos) Async API

    
    Mono<CosmosItemResponse<CustomPOJO>> createItemPub = asyncContainer.createItem(item);
    createItemPub
        // publishOn moves everything downstream onto the given Scheduler.
        .publishOn(Schedulers.parallel())
        .subscribe(
        itemResponse -> {
            // This now runs on a Reactor parallel Scheduler thread,
            // not on the shared event loop IO netty thread,
            // so CPU-intensive work is acceptable here.
            veryCpuIntensiveWork();
        });
    
    

    Based on the type of your work, you should use the appropriate existing Reactor Scheduler. For details, see the Reactor Schedulers documentation.

    For more information on Azure Cosmos DB Java SDK v4, see the Cosmos DB directory of the Azure SDK for Java monorepo on GitHub.

  • Optimize logging settings in your application

    For a variety of reasons, you may want or need to add logging in a thread which is generating high request throughput. If your goal is to fully saturate a container's provisioned throughput with requests generated by this thread, logging optimizations can greatly improve performance.

    • Configure an async logger

      The latency of a synchronous logger necessarily factors into the overall latency calculation of your request-generating thread. An async logger such as log4j2 is recommended to decouple logging overhead from your high-performance application threads.

    • Disable netty's logging

      Netty library logging is chatty and needs to be turned off (suppressing logging in the configuration may not be enough) to avoid additional CPU costs. If you are not in debugging mode, disable netty's logging altogether. If you are using log4j, add the following line to your codebase to remove the additional CPU costs incurred by org.apache.log4j.Category.callAppenders() from netty:

      org.apache.log4j.Logger.getLogger("io.netty").setLevel(org.apache.log4j.Level.OFF);
      
  • OS Open files Resource Limit

    Some Linux systems (like CentOS) have an upper limit on the number of open files and therefore on the total number of connections. Run the following to view the current limits:

    ulimit -a
    

    The number of open files (nofile) needs to be large enough to leave room for your configured connection pool size and for other files opened by the OS. It can be modified to allow for a larger connection pool size.

    Open the limits.conf file:

    vim /etc/security/limits.conf
    

    Add or modify the following line:

    * - nofile 100000
    
  • Specify partition key in point writes

    To improve the performance of point writes, specify the item partition key in the point write API call, as shown below:

    Java SDK V4 (Maven com.azure::azure-cosmos) Async API

    asyncContainer.createItem(item, new PartitionKey(pk), new CosmosItemRequestOptions()).block();
    
    

    rather than providing only the item instance, as shown below:

    Java SDK V4 (Maven com.azure::azure-cosmos) Async API

    asyncContainer.createItem(item).block();
    
    

    The latter is supported but will add latency to your application; the SDK must parse the item and extract the partition key.

索引编制策略Indexing policy

  • 从索引中排除未使用的路径以加快写入速度Exclude unused paths from indexing for faster writes

    Azure Cosmos DB 的索引策略允许使用索引路径(setIncludedPaths 和 setExcludedPaths)指定要在索引中包括或排除的文档路径。Azure Cosmos DB's indexing policy allows you to specify which document paths to include or exclude from indexing by leveraging Indexing Paths (setIncludedPaths and setExcludedPaths). 在事先知道查询模式的方案中,使用索引路径可改善写入性能并降低索引存储空间,因为索引成本与索引的唯一路径数目直接相关。The use of indexing paths can offer improved write performance and lower index storage for scenarios in which the query patterns are known beforehand, as indexing costs are directly correlated to the number of unique paths indexed. 例如,以下代码演示如何使用“*”通配符从索引编制中纳入和排除文档的整个部分(也称为子树)。For example, the following code shows how to include and exclude entire sections of the documents (also known as a subtree) from indexing using the "*" wildcard.

    Java SDK V4 (Maven com.azure::azure-cosmos)

    
    CosmosContainerProperties containerProperties = new CosmosContainerProperties(containerName, "/lastName");
    
    // Custom indexing policy
    IndexingPolicy indexingPolicy = new IndexingPolicy();
    indexingPolicy.setIndexingMode(IndexingMode.CONSISTENT);
    
    // Included paths
    List<IncludedPath> includedPaths = new ArrayList<>();
    includedPaths.add(new IncludedPath("/*"));
    indexingPolicy.setIncludedPaths(includedPaths);
    
    // Excluded paths
    List<ExcludedPath> excludedPaths = new ArrayList<>();
    excludedPaths.add(new ExcludedPath("/name/*"));
    indexingPolicy.setExcludedPaths(excludedPaths);
    
    containerProperties.setIndexingPolicy(indexingPolicy);
    
    ThroughputProperties throughputProperties = ThroughputProperties.createManualThroughput(400);
    
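    // Create the container if it does not already exist; block() waits for the async call to complete
    database.createContainerIfNotExists(containerProperties, throughputProperties).block();
    CosmosAsyncContainer containerIfNotExists = database.getContainer(containerName);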
    
    

    For more information, see Azure Cosmos DB indexing policies.
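
    To reason about which properties such a policy leaves indexed, the following simplified sketch applies the rule that the more precise (longer) matching path wins when an included and an excluded path both match. It is an approximation for illustration only: the class IndexPathCheck is hypothetical, and it handles only patterns that end in the "*" wildcard.

```java
import java.util.List;

final class IndexPathCheck {
    // Length of the matching prefix of a "/.../*" pattern, or -1 when the pattern does not match the path.
    static int matchLength(String pattern, String path) {
        String prefix = pattern.endsWith("*") ? pattern.substring(0, pattern.length() - 1) : pattern;
        return path.startsWith(prefix) ? prefix.length() : -1;
    }

    // A path is treated as indexed when its best included match is more precise than its best excluded match.
    static boolean isIndexed(List<String> included, List<String> excluded, String path) {
        int bestIncluded = -1;
        for (String pattern : included) {
            bestIncluded = Math.max(bestIncluded, matchLength(pattern, path));
        }
        int bestExcluded = -1;
        for (String pattern : excluded) {
            bestExcluded = Math.max(bestExcluded, matchLength(pattern, path));
        }
        return bestIncluded > bestExcluded;
    }

    public static void main(String[] args) {
        List<String> included = List.of("/*");
        List<String> excluded = List.of("/name/*");
        System.out.println("/name/first indexed: " + isIndexed(included, excluded, "/name/first"));
        System.out.println("/age indexed: " + isIndexed(included, excluded, "/age"));
    }
}
```

    With the policy shown earlier, properties under /name are skipped on write while every other path remains queryable through the index.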

Throughput

  • Measure and tune for lower request units/second usage

    Azure Cosmos DB offers a rich set of database operations, including relational and hierarchical queries with UDFs, stored procedures, and triggers, all operating on the documents within a database container. The cost associated with each of these operations varies based on the CPU, IO, and memory required to complete the operation. Instead of thinking about and managing hardware resources, you can think of a request unit (RU) as a single measure for the resources required to perform various database operations and service an application request.

    Throughput is provisioned based on the number of request units set for each container. Request unit consumption is evaluated as a rate per second. Applications that exceed the provisioned request unit rate for their container are rate limited until the rate drops below the provisioned level for the container. If your application requires a higher level of throughput, you can increase your throughput by provisioning additional request units.

    The complexity of a query affects how many request units an operation consumes. The number of predicates, the nature of the predicates, the number of UDFs, and the size of the source data set all influence the cost of query operations.

    To measure the cost of any operation (create, update, or delete), inspect the x-ms-request-charge header, which reports the number of request units the operation consumed. You can also call the equivalent getRequestCharge() method on CosmosItemResponse<T> or FeedResponse<T>.

    Java SDK V4 (Maven com.azure::azure-cosmos) Async API

    CosmosItemResponse<CustomPOJO> response = asyncContainer.createItem(item).block();

    // Request units (RUs) consumed by this create operation
    double requestCharge = response.getRequestCharge();
    
    

    The request charge returned in this header counts against your provisioned throughput. For example, if you have 2000 RU/s provisioned and a query returns 1000 1-KB documents, the cost of the operation is 1000 RUs. As such, within one second, the server honors only two such requests before rate limiting subsequent requests. For more information, see Request units and the request unit calculator.
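
    The arithmetic in this example can be sketched as a small helper. The class RuBudget and its method name are hypothetical; the inputs are the provisioned rate and the charge reported for one operation.

```java
final class RuBudget {
    // Number of whole operations of a given RU charge that fit into one second of provisioned throughput.
    static long maxOperationsPerSecond(double provisionedRuPerSecond, double chargePerOperation) {
        if (chargePerOperation <= 0) {
            throw new IllegalArgumentException("chargePerOperation must be positive");
        }
        return (long) Math.floor(provisionedRuPerSecond / chargePerOperation);
    }

    public static void main(String[] args) {
        // 2000 RU/s provisioned, 1000 RUs per query, as in the example above
        System.out.println(maxOperationsPerSecond(2000, 1000)); // prints 2
    }
}
```

    Feeding measured request charges into a calculation like this gives a quick estimate of how much throughput a workload needs.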

  • Handle rate limiting/request rate too large

    When a client attempts to exceed the reserved throughput for an account, there is no performance degradation at the server and no use of throughput capacity beyond the reserved level. The server preemptively ends the request with RequestRateTooLarge (HTTP status code 429) and returns the x-ms-retry-after-ms header, which indicates how long, in milliseconds, the client must wait before retrying the request.

        HTTP Status 429,
        Status Line: RequestRateTooLarge
        x-ms-retry-after-ms: 100
    

    The SDKs all implicitly catch this response, respect the server-specified retry-after header, and retry the request. Unless your account is being accessed concurrently by multiple clients, the next retry will succeed.

    If more than one client is cumulatively and consistently operating above the request rate, the default retry count, currently set to 9 internally by the client, may not suffice; in that case, the client throws a CosmosException with status code 429 to the application. You can change the default retry count by passing a ThrottlingRetryOptions instance to throttlingRetryOptions on the CosmosClientBuilder. By default, the CosmosException with status code 429 is returned after a cumulative wait time of 30 seconds if the request continues to operate above the request rate. This occurs even when the current retry count is less than the maximum retry count, whether that is the default of 9 or a user-defined value.
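
    The interaction between the retry count and the cumulative wait limit can be sketched with a small simulation. This is not the SDK's retry code: the class ThrottleRetrySketch is hypothetical, the wait is only accumulated rather than slept, and the assumption that a retry is abandoned when its wait would push the total past 30 seconds is a guess at the exact boundary behavior.

```java
import java.util.function.LongSupplier;

final class ThrottleRetrySketch {
    static final int MAX_RETRIES = 9;             // default retry count described above
    static final long MAX_TOTAL_WAIT_MS = 30_000; // cumulative wait limit described above

    // Outcome of a simulated call: success flag, retries used, and total wait in milliseconds.
    static final class Result {
        final boolean succeeded;
        final int retries;
        final long waitedMs;

        Result(boolean succeeded, int retries, long waitedMs) {
            this.succeeded = succeeded;
            this.retries = retries;
            this.waitedMs = waitedMs;
        }
    }

    // attempt returns 0 on success, or a positive retry-after value in ms when throttled (429).
    static Result run(LongSupplier attempt) {
        int retries = 0;
        long waitedMs = 0;
        while (true) {
            long retryAfterMs = attempt.getAsLong();
            if (retryAfterMs <= 0) {
                return new Result(true, retries, waitedMs);
            }
            // Give up when either limit would be exceeded, whichever comes first.
            if (retries >= MAX_RETRIES || waitedMs + retryAfterMs > MAX_TOTAL_WAIT_MS) {
                return new Result(false, retries, waitedMs);
            }
            waitedMs += retryAfterMs; // a real client would sleep here
            retries++;
        }
    }

    public static void main(String[] args) {
        Result shortWaits = run(() -> 100);    // always throttled, 100 ms retry-after
        Result longWaits = run(() -> 10_000);  // always throttled, 10 s retry-after
        System.out.println("100 ms: retries=" + shortWaits.retries + ", waited=" + shortWaits.waitedMs);
        System.out.println("10 s: retries=" + longWaits.retries + ", waited=" + longWaits.waitedMs);
    }
}
```

    With short retry-after values the retry count is exhausted first; with long values the cumulative wait limit ends the attempt well before nine retries.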

    While the automated retry behavior helps to improve resiliency and usability for most applications, it can be at odds with performance benchmarking, especially when measuring latency. The client-observed latency spikes if the experiment hits the server throttle and causes the client SDK to silently retry. To avoid latency spikes during performance experiments, measure the charge returned by each operation and ensure that requests are operating below the reserved request rate. For more information, see Request units.
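
    One way to keep a benchmark below the reserved rate is to space operations according to their measured charge. The sketch below computes that spacing; the class RequestPacer and the figures used (10 RUs per operation against a 400 RU/s target) are assumptions for illustration.

```java
final class RequestPacer {
    // Minimum spacing between operations, in milliseconds, that keeps the sustained RU rate at or below the target.
    static double minIntervalMillis(double chargePerOperation, double targetRuPerSecond) {
        if (targetRuPerSecond <= 0) {
            throw new IllegalArgumentException("targetRuPerSecond must be positive");
        }
        return 1000.0 * chargePerOperation / targetRuPerSecond;
    }

    public static void main(String[] args) {
        // Assumed figures: each operation costs 10 RUs and the benchmark should stay under 400 RU/s.
        System.out.println(minIntervalMillis(10, 400) + " ms between operations");
    }
}
```

    Setting the target somewhat below the provisioned rate leaves room for variation in per-operation charges.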

  • Design for smaller documents for higher throughput

    The request charge (the request processing cost) of a given operation is directly correlated to the size of the document. Operations on large documents cost more than operations on small documents. Ideally, architect your application and workflows so that your item size is about 1 KB, or of a similar order of magnitude. For latency-sensitive applications, avoid large items, because multi-MB documents will slow down your application.
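
    A simple way to keep an eye on item size during development is to measure the UTF-8 length of the serialized JSON. The sketch below does only that; the class ItemSizeCheck and the 1024-byte target are hypothetical, and the serialized length is a proxy rather than the exact size the service accounts for.

```java
import java.nio.charset.StandardCharsets;

final class ItemSizeCheck {
    // Size in bytes of the JSON text when encoded as UTF-8.
    static int utf8Bytes(String json) {
        return json.getBytes(StandardCharsets.UTF_8).length;
    }

    // True when the serialized item is at or below the target size.
    static boolean withinTarget(String json, int targetBytes) {
        return utf8Bytes(json) <= targetBytes;
    }

    public static void main(String[] args) {
        String json = "{\"id\":\"1\",\"lastName\":\"Andersen\"}";
        System.out.println(utf8Bytes(json) + " bytes, within 1 KB: " + withinTarget(json, 1024));
    }
}
```

    A check like this can run in unit tests to flag items that grow far beyond the intended size.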

Next steps

To learn more about designing your application for scale and high performance, see Partitioning and scaling in Azure Cosmos DB.