Estimate and manage capacity of an Azure Cognitive Search service

Before provisioning a search service and locking in a specific pricing tier, take a few minutes to understand how capacity works and how you might adjust replicas and partitions to accommodate workload fluctuation.

Capacity is a function of the service tier, which establishes the maximum storage per service and per partition, and the maximum number of objects you can create. The Basic tier is designed for apps with modest storage requirements (one partition only) that still need the ability to run in a high availability configuration (3 replicas). Other tiers are designed for specific workloads or patterns, such as multitenancy. Internally, services created on those tiers benefit from hardware that helps those scenarios.

The scalability architecture in Azure Cognitive Search is based on flexible combinations of replicas and partitions so that you can vary capacity depending on whether you need more query or indexing power. Once a service is created, you can increase or decrease the number of replicas or partitions independently. Costs go up with each additional physical resource, but once large workloads are finished, you can reduce scale to lower your bill. Depending on the tier and the size of the adjustment, adding or reducing capacity can take anywhere from 15 minutes to several hours.

When modifying the allocation of replicas and partitions, we recommend using the Azure portal. The portal enforces limits on allowable combinations so that they stay below the maximum limits of a tier. However, if you require a script-based or code-based provisioning approach, Azure PowerShell and the Management REST API are alternative solutions.
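
For the code-based route, the following is a minimal sketch of how a scaling request to the Management REST API might be assembled. It only builds the request and does not send it. The API version, the subscription ID, the resource group, and the service name are placeholders or assumptions, and the `replicaCount` and `partitionCount` property names should be checked against the current Management REST API reference before use.

```python
import json

# Assumed API version; confirm the current one in the Management REST API reference.
API_VERSION = "2020-08-01"


def build_scale_request(subscription_id, resource_group, service_name,
                        replicas, partitions):
    """Return (method, url, body) for a request that sets replica and partition counts.

    This only builds the request. Sending it requires an Azure AD bearer token
    in the Authorization header, which is outside the scope of this sketch.
    """
    url = (
        "https://management.azure.com"
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Search/searchServices/{service_name}"
        f"?api-version={API_VERSION}"
    )
    body = json.dumps(
        {"properties": {"replicaCount": replicas, "partitionCount": partitions}}
    )
    return "PATCH", url, body


method, url, body = build_scale_request(
    "00000000-0000-0000-0000-000000000000",  # placeholder subscription ID
    "my-resource-group",                      # hypothetical resource group
    "my-search-service",                      # hypothetical service name
    replicas=2,
    partitions=2,
)
print(method, url)
print(body)
```

Unlike the portal, a script does not stop you from requesting a combination that exceeds the tier's limits, so validate the counts before sending the request.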

Concepts: search units, replicas, partitions, shards

Capacity is expressed in search units that can be allocated in combinations of partitions and replicas, using an underlying sharding mechanism to support flexible configurations:

| Concept | Definition |
|---|---|
| Search unit | A single increment of total available capacity (36 units). It is also the billing unit for an Azure Cognitive Search service. A minimum of one unit is required to run the service. |
| Replica | Instances of the search service, used primarily to load balance query operations. Each replica hosts one copy of an index. If you allocate three replicas, you'll have three copies of an index available for servicing query requests. |
| Partition | Physical storage and I/O for read/write operations (for example, when rebuilding or refreshing an index). Each partition has a slice of the total index. If you allocate three partitions, your index is divided into thirds. |
| Shard | A chunk of an index. Azure Cognitive Search divides each index into shards to make the process of adding partitions faster (by moving shards to new search units). |

The following diagram shows the relationship between replicas, partitions, shards, and search units. It shows an example of how a single index spans four search units in a service with two replicas and two partitions. Each of the four search units stores only half of the shards of the index. The search units in the left column store the first half of the shards, comprising the first partition, while those in the right column store the second half of the shards, comprising the second partition. Since there are two replicas, there are two copies of each index shard. The search units in the top row store one copy, comprising the first replica, while those in the bottom row store another copy, comprising the second replica.


The diagram above is only one example. Many combinations of partitions and replicas are possible, up to a maximum of 36 total search units.

In Cognitive Search, shard management is an implementation detail and is not configurable, but knowing that an index is sharded helps explain occasional anomalies in ranking and autocomplete behaviors:

  • Ranking anomalies: Search scores are computed at the shard level first, and then aggregated into a single result set. Depending on the characteristics of shard content, matches from one shard might be ranked higher than matches in another one. If you notice counterintuitive rankings in search results, it is most likely due to the effects of sharding, especially if indexes are small. You can avoid these ranking anomalies by choosing to compute scores globally across the entire index, but doing so will incur a performance penalty.

  • Autocomplete anomalies: Autocomplete queries, where matches are made on the first several characters of a partially entered term, accept a fuzzy parameter that forgives small deviations in spelling. For autocomplete, fuzzy matching is constrained to terms within the current shard. For example, if a shard contains "Microsoft" and a partial term of "micor" is entered, the search engine will match on "Microsoft" in that shard, but not in other shards that hold the remaining parts of the index.
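
To see why shard-level scoring can produce ranking anomalies, consider a toy illustration. The sketch below computes a simplified inverse document frequency (IDF) for one term, first per shard and then across all documents. The formula, the shard contents, and the function name are assumptions chosen for illustration; this is not the scoring formula the service uses.

```python
import math


def idf(term, docs):
    """Toy inverse document frequency: log(1 + N / (1 + df))."""
    df = sum(1 for d in docs if term in d.split())
    return math.log(1 + len(docs) / (1 + df))


# Hypothetical shard contents: the term "azure" is common in one shard, absent in the other.
shard_a = ["azure search service", "azure storage", "azure portal"]
shard_b = ["search index", "partition replica", "query latency"]

term = "azure"
print("shard A:", round(idf(term, shard_a), 3))
print("shard B:", round(idf(term, shard_b), 3))
print("global :", round(idf(term, shard_a + shard_b), 3))
```

The same term gets a different weight depending on which shard evaluates it, so two equally relevant documents can receive different scores. Computing statistics globally removes the discrepancy at the cost of extra work per query.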

How to evaluate capacity requirements

Capacity and the costs of running the service go hand in hand. Tiers impose limits on two levels: storage and content (a count of indexes on a service, for example). It's important to consider both because whichever limit you reach first is the effective limit.

Quantities of indexes and other objects are typically dictated by business and engineering requirements. For example, you might have multiple versions of the same index for active development, testing, and production.

Storage needs are determined by the size of the indexes you expect to build. There are no solid heuristics or generalities that help with estimates. The only way to determine the size of an index is to build one. Its size will be based on imported data, text analysis, and index configuration such as whether you enable suggesters, filtering, and sorting.

For full text search, the primary data structure is an inverted index, which has different characteristics than the source data. For an inverted index, size and complexity are determined by content, not necessarily by the amount of data that you feed into it. A large data source with high redundancy could result in a smaller index than a smaller dataset that contains highly variable content. So it's rarely possible to infer index size based on the size of the original dataset.
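
The point about redundancy can be illustrated with a toy inverted index. The sketch below is an assumption-laden simplification (whitespace tokenization, made-up documents, a hypothetical `build_inverted_index` helper) and counts only distinct terms. A real index also stores posting lists and other structures, so treat it as one factor in index size, not a size estimator.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each distinct term to the sorted list of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


# A large but highly repetitive source: 1,000 identical documents.
redundant = ["red shoes on sale"] * 1000
# A much smaller source in which every term is unique.
varied = [f"item{i} color{i} size{i}" for i in range(100)]

print(len(build_inverted_index(redundant)))  # 4 distinct terms
print(len(build_inverted_index(varied)))     # 300 distinct terms
```

The dataset with ten times as many documents produces a term dictionary that is a fraction of the size, which is why source data volume alone is a poor predictor.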


Even though estimating future needs for indexes and storage can feel like guesswork, it's worth doing. If a tier's capacity turns out to be too low, you'll need to provision a new service at a higher tier and then reload your indexes. There's no in-place upgrade of a service from one tier to another.

Estimate with the Free tier

One approach for estimating capacity is to start with the Free tier. Remember that the Free service offers up to three indexes, 50 MB of storage, and 2 minutes of indexing time. It can be challenging to estimate a projected index size with these constraints, but these are the steps:

  • Create a free service.

  • Prepare a small, representative dataset.

  • Create an index and load your data. If the dataset can be hosted in an Azure data source supported by indexers, you can use the Import data wizard in the portal to both create and load the index. Otherwise, you should use REST and Postman or Visual Studio Code to create the index and push the data. The push model requires data to be in the form of JSON documents, where fields in the document correspond to fields in the index.

  • Collect information about the index, such as size. Features and attributes have an impact on storage. For example, adding suggesters (search-as-you-type queries) will increase storage requirements.

    Using the same dataset, you might try creating multiple versions of an index, with different attributes on each field, to see how storage requirements vary. For more information, see "Storage implications" in Create a basic index.

With a rough estimate in hand, you might double that amount to budget for two indexes (development and production) and then choose your tier accordingly.

Estimate with a billable tier

Dedicated resources can accommodate larger sampling and processing times for more realistic estimates of index quantity, size, and query volumes during development. Some customers jump right in with a billable tier and then re-evaluate as the development project matures.

  1. Review service limits at each tier to determine whether lower tiers can support the number of indexes you need. Across the Basic, S1, and S2 tiers, index limits are 15, 50, and 200, respectively.

  2. Create a service at a billable tier:

    • Start low, at Basic or S1, if you're not sure about the projected load.
    • Start high, at S2 or even S3, if testing includes large-scale indexing and query loads.
  3. Build an initial index to determine how source data translates to an index. This is the only way to estimate index size.

  4. Monitor storage, service limits, query volume, and latency in the portal. The portal shows you queries per second, throttled queries, and search latency. All of these values can help you decide if you selected the right tier.

  5. Add replicas if you need high availability or if you experience slow query performance.

    There are no guidelines on how many replicas are needed to accommodate query loads. Query performance depends on the complexity of the query and competing workloads. Although adding replicas clearly results in better performance, the result is not strictly linear: adding three replicas does not guarantee triple throughput. For guidance in estimating QPS for your solution, see Scale for performance and Monitor queries.
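
The index-count check in step 1 can be sketched as a small lookup. The limits below are the three stated in step 1; the dictionary and function names are hypothetical, and other tiers are omitted because their limits are not given here.

```python
# Index limits stated in step 1 (Basic, S1, S2). Other tiers are not covered.
INDEX_LIMITS = {"Basic": 15, "S1": 50, "S2": 200}


def lowest_tier_for(index_count):
    """Return the first listed tier whose index limit covers index_count, or None."""
    for tier, limit in INDEX_LIMITS.items():
        if index_count <= limit:
            return tier
    return None


print(lowest_tier_for(12))   # Basic
print(lowest_tier_for(40))   # S1
print(lowest_tier_for(500))  # None: exceeds the limits listed here
```

This covers only the index-count limit. Storage is the other limit, and whichever one you reach first is the effective limit.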


Storage requirements can be inflated if you include data that will never be searched. Ideally, documents contain only the data that you need for the search experience. Binary data isn't searchable and should be stored separately (for example, in an Azure table or blob storage). A field should then be added in the index to hold a URL reference to the external data. The maximum size of an individual search document is 16 MB (or less if you're bulk uploading multiple documents in one request). For more information, see Service limits in Azure Cognitive Search.

Query volume considerations

Queries per second (QPS) is an important metric during performance tuning, but it's generally only a tier consideration if you expect high query volume at the outset.

The Standard tiers can provide a balance of replicas and partitions. You can improve query turnaround by adding replicas for load balancing or by adding partitions for parallel processing. You can then tune for performance after the service is provisioned.

If you expect high sustained query volumes from the outset, you should consider higher Standard tiers, backed by more powerful hardware. You can then take partitions and replicas offline, or even switch to a lower-tier service, if those query volumes don't occur. For more information on how to calculate query throughput, see Azure Cognitive Search performance and optimization.

Service-level agreements

The Free tier and preview features don't provide service-level agreements (SLAs). For all billable tiers, SLAs take effect when you provision sufficient redundancy for your service. You need to have two or more replicas for query (read) SLAs. You need to have three or more replicas for query and indexing (read-write) SLAs. The number of partitions doesn't affect SLAs.
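
The replica thresholds above reduce to a short rule. The following is a minimal sketch with a hypothetical function name and return strings. It ignores preview features, which are not covered by an SLA regardless of replica count.

```python
def sla_coverage(tier, replicas):
    """Describe which SLA applies for a tier and replica count, per the thresholds above."""
    if tier.lower() == "free":
        return "none"  # the Free tier has no SLA
    if replicas >= 3:
        return "query and indexing (read-write)"
    if replicas >= 2:
        return "query (read)"
    return "none"  # a single replica provides no redundancy


print(sla_coverage("Basic", 1))  # none
print(sla_coverage("S1", 2))     # query (read)
print(sla_coverage("S1", 3))     # query and indexing (read-write)
```

Partition count is deliberately not a parameter, because it has no effect on SLA coverage.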

Tips for capacity planning

  • Allow metrics to build around queries, and collect data around usage patterns (queries during business hours, indexing during off-peak hours). Use this data to inform service provisioning decisions. Though it's not practical at an hourly or daily cadence, you can dynamically adjust partitions and resources to accommodate planned changes in query volumes. You can also accommodate unplanned but sustained changes if levels hold long enough to warrant taking action.

  • Remember that the only downside of underprovisioning is that you might have to tear down a service if actual requirements are greater than your predictions. To avoid service disruption, you would create a new service at a higher tier and run it side by side until all apps and requests target the new endpoint.

When to add partitions and replicas

Initially, a service is allocated a minimal level of resources consisting of one partition and one replica.

A single service must have sufficient resources to handle all workloads (indexing and queries). Neither workload runs in the background. You can schedule indexing for times when query requests are naturally less frequent, but the service will not otherwise prioritize one task over another. Additionally, a certain amount of redundancy smooths out query performance when services or nodes are updated internally.

As a general rule, search applications tend to need more replicas than partitions, particularly when the service operations are biased toward query workloads. The section on high availability explains why.

The tier you choose determines partition size and speed, and each tier is optimized around a set of characteristics that fit various scenarios. If you choose a higher-end tier, you might need fewer partitions than if you go with S1. One of the questions you'll need to answer through self-directed testing is whether a larger and more expensive partition yields better performance than two cheaper partitions on a service provisioned at a lower tier.

Search applications that require near real-time data refresh will need proportionally more partitions than replicas. Adding partitions spreads read/write operations across a larger number of compute resources. It also gives you more disk space for storing additional indexes and documents.

Larger indexes take longer to query. As such, you might find that every incremental increase in partitions requires a smaller but proportional increase in replicas. The complexity of your queries and query volume will factor into how quickly query execution is turned around.


Adding more replicas or partitions increases the cost of running the service, and can introduce slight variations in how results are ordered. Be sure to check the pricing calculator to understand the billing implications of adding more nodes. The chart below can help you cross-reference the number of search units required for a specific configuration. For more information on how additional replicas impact query processing, see Ordering results.

How to allocate replicas and partitions

  1. Sign in to the Azure portal and select the search service.

  2. Under Settings, open the Scale page to modify replicas and partitions.

    The following screenshot shows a Standard service provisioned with one replica and partition. The formula at the bottom indicates how many search units are being used (1). If the unit price was $100 (not a real price), the monthly cost of running this service would be $100 on average.


  3. Use the slider to increase or decrease the number of partitions. The formula at the bottom indicates how many search units are being used. Select Save.

    This example adds a second replica and partition. Notice the search unit count; it is now four because the billing formula is replicas multiplied by partitions (2 x 2). Doubling both replicas and partitions quadruples the search unit count, so it more than doubles the cost of running the service. If the search unit cost was $100, the new monthly bill would now be $400.

    For the current per unit costs of each tier, visit the Pricing page.


  4. After saving, you can check notifications to confirm the action succeeded.


    Changes in capacity can take up to several hours to complete. You cannot cancel once the process has started and there is no real-time monitoring for replica and partition adjustments. However, the following message remains visible while changes are underway.
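
The billing arithmetic from step 3 can be written out as a short calculation. The function name is hypothetical, and the $100 unit price is the same illustrative figure used in the steps, not a real price.

```python
def monthly_cost(replicas, partitions, unit_price):
    """Search units are replicas multiplied by partitions; each unit is billed at unit_price."""
    search_units = replicas * partitions
    return search_units * unit_price


UNIT_PRICE = 100  # illustrative only, not a real price

before = monthly_cost(1, 1, UNIT_PRICE)
after = monthly_cost(2, 2, UNIT_PRICE)
print(before)  # 100
print(after)   # 400
```

Because the formula multiplies the two counts, doubling both at once quadruples the bill, while doubling only one of them doubles it.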



After a service is provisioned, it cannot be upgraded to a higher tier. You must create a search service at the new tier and reload your indexes. See Create an Azure Cognitive Search service in the portal for help with service provisioning.

Additionally, partitions and replicas are managed exclusively and internally by the service. There is no concept of processor affinity, or assigning a workload to a specific node.

Partition and replica combinations

A Basic service can have exactly one partition and up to three replicas, for a maximum limit of three SUs. The only adjustable resource is replicas. You need a minimum of two replicas for high availability on queries.

| | 1 partition | 2 partitions | 3 partitions | 4 partitions | 6 partitions | 12 partitions |
|---|---|---|---|---|---|---|
| 1 replica | 1 SU | 2 SU | 3 SU | 4 SU | 6 SU | 12 SU |
| 2 replicas | 2 SU | 4 SU | 6 SU | 8 SU | 12 SU | 24 SU |
| 3 replicas | 3 SU | 6 SU | 9 SU | 12 SU | 18 SU | 36 SU |
| 4 replicas | 4 SU | 8 SU | 12 SU | 16 SU | 24 SU | N/A |
| 5 replicas | 5 SU | 10 SU | 15 SU | 20 SU | 30 SU | N/A |
| 6 replicas | 6 SU | 12 SU | 18 SU | 24 SU | 36 SU | N/A |
| 12 replicas | 12 SU | 24 SU | 36 SU | N/A | N/A | N/A |

SUs, pricing, and capacity are explained in detail on the Azure website. For more information, see Pricing Details.
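
The combinations table follows directly from two rules stated in this article: search units equal replicas multiplied by partitions, and the total cannot exceed 36. A minimal sketch that regenerates the table is shown below. The function name is hypothetical, and the replica and partition values are simply the row and column headings of the table, not an exhaustive list of what the service accepts.

```python
MAX_SEARCH_UNITS = 36
REPLICAS = [1, 2, 3, 4, 5, 6, 12]   # row headings from the table
PARTITIONS = [1, 2, 3, 4, 6, 12]    # column headings from the table


def su_table():
    """Return {replicas: [search units per partition count]}, None where the cap is exceeded."""
    rows = {}
    for r in REPLICAS:
        rows[r] = [r * p if r * p <= MAX_SEARCH_UNITS else None for p in PARTITIONS]
    return rows


for r, cells in su_table().items():
    print(r, ["N/A" if c is None else f"{c} SU" for c in cells])
```

Every N/A cell in the table corresponds to a product greater than 36.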


The number of replicas and partitions divides evenly into 12 (specifically, 1, 2, 3, 4, 6, 12). This is because Azure Cognitive Search pre-divides each index into 12 shards so that it can be spread in equal portions across all partitions. For example, if your service has three partitions and you create an index, each partition will contain four shards of the index. How Azure Cognitive Search shards an index is an implementation detail, subject to change in future releases. Although the number is 12 today, you shouldn't expect that number to always be 12 in the future.
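
The shard arithmetic can be checked with a few lines. The constant reflects the shard count of 12 given above, which is described as subject to change, so treat it as an assumption. The function name is hypothetical.

```python
SHARDS_PER_INDEX = 12  # current value per the text; may change in future releases


def shards_per_partition(partitions):
    """Return how many shards of one index each partition holds."""
    if partitions < 1 or SHARDS_PER_INDEX % partitions != 0:
        raise ValueError(f"{partitions} does not divide evenly into {SHARDS_PER_INDEX}")
    return SHARDS_PER_INDEX // partitions


# Partition counts that split 12 shards into equal portions.
valid_partition_counts = [p for p in range(1, SHARDS_PER_INDEX + 1)
                          if SHARDS_PER_INDEX % p == 0]

print(valid_partition_counts)   # [1, 2, 3, 4, 6, 12]
print(shards_per_partition(3))  # 4
```

The divisors of 12 are exactly the partition counts that appear as column headings in the combinations table.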

High availability

Because it's easy and relatively fast to scale up, we generally recommend that you start with one partition and one or two replicas, and then scale up as query volumes build. Query workloads run primarily on replicas. If you need more throughput or high availability, you will probably require additional replicas.

General recommendations for high availability are:

  • Two replicas for high availability of read-only workloads (queries)

  • Three or more replicas for high availability of read/write workloads (queries plus indexing as individual documents are added, updated, or deleted)

Service-level agreements (SLAs) for Azure Cognitive Search are targeted at query operations and at index updates that consist of adding, updating, or deleting documents.

The Basic tier tops out at one partition and three replicas. If you want the flexibility to immediately respond to fluctuations in demand for both indexing and query throughput, consider one of the Standard tiers.

Next steps