Indexing in Azure Cosmos DB - Overview

Applies to: SQL API

Azure Cosmos DB is a schema-agnostic database that allows you to iterate on your application without having to deal with schema or index management. By default, Azure Cosmos DB automatically indexes every property for all items in your container, without requiring you to define any schema or configure secondary indexes.

The goal of this article is to explain how Azure Cosmos DB indexes data and how it uses indexes to improve query performance. We recommend reading this article before exploring how to customize indexing policies.

From items to trees

Every time an item is stored in a container, its content is projected as a JSON document and then converted into a tree representation. This means that every property of the item is represented as a node in a tree. A pseudo root node is created as the parent of all first-level properties of the item. The leaf nodes contain the actual scalar values carried by the item.

As an example, consider this item:

{
    "locations": [
        { "country": "Germany", "city": "Berlin" },
        { "country": "France", "city": "Paris" }
    ],
    "headquarters": { "country": "Belgium", "employees": 250 },
    "exports": [
        { "city": "Moscow" },
        { "city": "Athens" }
    ]
}

It would be represented by the following tree:

[Figure: the previous item represented as a tree]

Note how arrays are encoded in the tree: every entry in an array gets an intermediate node labeled with the index of that entry within the array (0, 1, and so on).

From trees to property paths

Azure Cosmos DB transforms items into trees because this allows properties to be referenced by their paths within those trees. To get the path for a property, traverse the tree from the root node to that property and concatenate the labels of each traversed node.

Here are the paths for each property from the example item described above:

  • /locations/0/country: "Germany"
  • /locations/0/city: "Berlin"
  • /locations/1/country: "France"
  • /locations/1/city: "Paris"
  • /headquarters/country: "Belgium"
  • /headquarters/employees: 250
  • /exports/0/city: "Moscow"
  • /exports/1/city: "Athens"

When an item is written, Azure Cosmos DB effectively indexes each property's path and its corresponding value.
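The traversal described above can be sketched in a few lines of Python. This is only an illustration of how paths are derived from an item, not how Azure Cosmos DB is implemented; the function name `property_paths` is a hypothetical helper introduced for this example.

```python
from typing import Any, Iterator


def property_paths(node: Any, prefix: str = "") -> Iterator[tuple[str, Any]]:
    """Yield (path, value) pairs for every scalar leaf of a JSON-like item."""
    if isinstance(node, dict):
        for label, child in node.items():
            yield from property_paths(child, f"{prefix}/{label}")
    elif isinstance(node, list):
        # Array entries get an intermediate node labeled with their position.
        for position, child in enumerate(node):
            yield from property_paths(child, f"{prefix}/{position}")
    else:
        # Leaf node: the actual scalar value carried by the item.
        yield prefix, node


item = {
    "locations": [
        {"country": "Germany", "city": "Berlin"},
        {"country": "France", "city": "Paris"},
    ],
    "headquarters": {"country": "Belgium", "employees": 250},
    "exports": [{"city": "Moscow"}, {"city": "Athens"}],
}

paths = list(property_paths(item))
for path, value in paths:
    print(f"{path}: {value!r}")
```

Running the sketch prints one line per leaf, in the same order as the list of paths above.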

Types of indexes

Azure Cosmos DB currently supports three types of indexes. You can configure these index types when defining the indexing policy.

Range index

The range index is based on an ordered tree-like structure. The range index type is used for:

  • Equality queries:

    SELECT * FROM container c WHERE c.property = 'value'
    
    SELECT * FROM c WHERE c.property IN ("value1", "value2", "value3")
    

    Equality match on an array element:

    SELECT * FROM c WHERE ARRAY_CONTAINS(c.tags, "tag1")
    
  • Range queries:

    SELECT * FROM container c WHERE c.property > 'value'
    

    (works for >, <, >=, <=, !=)

  • Checking for the presence of a property:

    SELECT * FROM c WHERE IS_DEFINED(c.property)
    
  • String system functions:

    SELECT * FROM c WHERE CONTAINS(c.property, "value")
    
    SELECT * FROM c WHERE STRINGEQUALS(c.property, "value")
    
  • ORDER BY queries:

    SELECT * FROM container c ORDER BY c.property
    
  • JOIN queries:

    SELECT child FROM container c JOIN child IN c.properties WHERE child = 'value'
    

Range indexes can be used on scalar values (string or number). The default indexing policy for newly created containers enforces range indexes for any string or number. To learn how to configure range indexes, see Range indexing policy examples.

Note

An ORDER BY clause that orders by a single property always needs a range index and will fail if the path it references doesn't have one. Similarly, an ORDER BY query that orders by multiple properties always needs a composite index.

Spatial index

Spatial indexes enable efficient queries on geospatial objects such as points, lines, polygons, and multipolygons. These queries use the ST_DISTANCE, ST_WITHIN, and ST_INTERSECTS keywords. The following are some examples that use the spatial index type:

  • Geospatial distance queries:

    SELECT * FROM container c WHERE ST_DISTANCE(c.property, { "type": "Point", "coordinates": [0.0, 10.0] }) < 40
    
  • Geospatial within queries:

    SELECT * FROM container c WHERE ST_WITHIN(c.property, {"type": "Point", "coordinates": [0.0, 10.0] })
    
  • Geospatial intersect queries:

    SELECT * FROM c WHERE ST_INTERSECTS(c.property, { 'type':'Polygon', 'coordinates': [[ [31.8, -5], [32, -5], [32, -4.7], [31.8, -4.7], [31.8, -5] ]] })
    

Spatial indexes can be used on correctly formatted GeoJSON objects. Points, LineStrings, Polygons, and MultiPolygons are currently supported. To use this index type, set the "kind": "Range" property when configuring the indexing policy. To learn how to configure spatial indexes, see Spatial indexing policy examples.

Composite indexes

Composite indexes increase efficiency when you perform operations on multiple fields. The composite index type is used for:

  • ORDER BY queries on multiple properties:

     SELECT * FROM container c ORDER BY c.property1, c.property2
    
  • Queries with a filter and ORDER BY. These queries can utilize a composite index if the filter property is added to the ORDER BY clause.

     SELECT * FROM container c WHERE c.property1 = 'value' ORDER BY c.property1, c.property2
    
  • Queries with a filter on two or more properties where at least one property is an equality filter:

     SELECT * FROM container c WHERE c.property1 = 'value' AND c.property2 > 'value'
    

As long as one filter predicate uses one of the index types, the query engine evaluates that predicate first before scanning the rest. For example, consider a SQL query such as SELECT * FROM c WHERE c.firstName = "Andrew" and CONTAINS(c.lastName, "Liu"):

  • The query first uses the index to filter for entries where firstName = "Andrew". It then passes all of the firstName = "Andrew" entries through a subsequent pipeline to evaluate the CONTAINS filter predicate.

  • You can speed up queries and avoid full container scans when using functions that don't use the index efficiently (for example, CONTAINS) by adding filter predicates that do. The order of filter clauses isn't important. The query engine determines which predicates are more selective and runs the query accordingly.

To learn how to configure composite indexes, see Composite indexing policy examples.

Index usage

There are five ways that the query engine can evaluate query filters, sorted from most efficient to least efficient:

  • Index seek
  • Precise index scan
  • Expanded index scan
  • Full index scan
  • Full scan

When you index property paths, the query engine automatically uses the index as efficiently as possible. Aside from indexing new property paths, you don't need to configure anything to optimize how queries use the index. A query's request unit (RU) charge is the sum of the RU charge from index usage and the RU charge from loading items.

The following table summarizes the different ways indexes are used in Azure Cosmos DB:

| Index lookup type | Description | Common examples | RU charge from index usage | RU charge from loading items from transactional data store |
| --- | --- | --- | --- | --- |
| Index seek | Read only required indexed values and load only matching items from the transactional data store | Equality filters, IN | Constant per equality filter | Increases based on number of items in query results |
| Precise index scan | Binary search of indexed values and load only matching items from the transactional data store | Range comparisons (>, <, <=, or >=), StartsWith | Comparable to index seek, increases slightly based on the cardinality of indexed properties | Increases based on number of items in query results |
| Expanded index scan | Optimized search (but less efficient than a binary search) of indexed values and load only matching items from the transactional data store | StartsWith (case-insensitive), StringEquals (case-insensitive) | Increases slightly based on the cardinality of indexed properties | Increases based on number of items in query results |
| Full index scan | Read distinct set of indexed values and load only matching items from the transactional data store | Contains, EndsWith, RegexMatch, LIKE | Increases linearly based on the cardinality of indexed properties | Increases based on number of items in query results |
| Full scan | Load all items from the transactional data store | Upper, Lower | N/A | Increases based on number of items in container |

When writing queries, use filter predicates that use the index as efficiently as possible. For example, if either StartsWith or Contains would work for your use case, opt for StartsWith, since it does a precise index scan instead of a full index scan.

Index usage details

This section covers how queries use indexes in more detail. You don't need this material to get started with Azure Cosmos DB, but it is documented in detail for curious users. We'll reference the example item shared earlier in this document, together with a second item:

Example items:

    {
        "id": 1,
        "locations": [
            { "country": "Germany", "city": "Berlin" },
            { "country": "France", "city": "Paris" }
        ],
        "headquarters": { "country": "Belgium", "employees": 250 },
        "exports": [
            { "city": "Moscow" },
            { "city": "Athens" }
        ]
    }
    {
        "id": 2,
        "locations": [
            { "country": "Ireland", "city": "Dublin" }
        ],
        "headquarters": { "country": "Belgium", "employees": 200 },
        "exports": [
            { "city": "Moscow" },
            { "city": "Athens" },
            { "city": "London" }
        ]
    }

Azure Cosmos DB uses an inverted index. The index works by mapping each JSON path and value to the set of items that contain that value at that path. The item ID mapping is represented across many different index pages for the container. Here is a sample diagram of an inverted index for a container that includes the two example items:

| Path | Value | List of item IDs |
| --- | --- | --- |
| /locations/0/country | Germany | 1 |
| /locations/0/country | Ireland | 2 |
| /locations/0/city | Berlin | 1 |
| /locations/0/city | Dublin | 2 |
| /locations/1/country | France | 1 |
| /locations/1/city | Paris | 1 |
| /headquarters/country | Belgium | 1, 2 |
| /headquarters/employees | 200 | 2 |
| /headquarters/employees | 250 | 1 |
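As a rough illustration of the mapping in this table, the following Python sketch builds a small in-memory inverted index from the two example items. It is a conceptual model only: the real index is spread across index pages, and the names `leaf_paths` and `build_inverted_index` are hypothetical helpers introduced for this example. Unlike the table, the sketch also indexes the id and exports paths.

```python
from collections import defaultdict


def leaf_paths(node, prefix=""):
    """Yield (path, value) pairs for every scalar leaf of a JSON-like item."""
    if isinstance(node, dict):
        for label, child in node.items():
            yield from leaf_paths(child, f"{prefix}/{label}")
    elif isinstance(node, list):
        for position, child in enumerate(node):
            yield from leaf_paths(child, f"{prefix}/{position}")
    else:
        yield prefix, node


def build_inverted_index(items):
    """Map path -> value -> set of IDs of the items holding that value."""
    index = defaultdict(lambda: defaultdict(set))
    for item in items:
        for path, value in leaf_paths(item):
            index[path][value].add(item["id"])
    return index


items = [
    {
        "id": 1,
        "locations": [
            {"country": "Germany", "city": "Berlin"},
            {"country": "France", "city": "Paris"},
        ],
        "headquarters": {"country": "Belgium", "employees": 250},
        "exports": [{"city": "Moscow"}, {"city": "Athens"}],
    },
    {
        "id": 2,
        "locations": [{"country": "Ireland", "city": "Dublin"}],
        "headquarters": {"country": "Belgium", "employees": 200},
        "exports": [{"city": "Moscow"}, {"city": "Athens"}, {"city": "London"}],
    },
]

index = build_inverted_index(items)
print(sorted(index["/headquarters/country"]["Belgium"]))  # [1, 2]
print(sorted(index["/headquarters/employees"]))           # [200, 250]
```

Looking up a path and a value returns the item IDs directly, and sorting the values stored under one path gives the ascending order that the index relies on.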

The inverted index has two important attributes:

  • For a given path, values are sorted in ascending order. Therefore, the query engine can easily serve ORDER BY from the index.
  • For a given path, the query engine can scan through the distinct set of possible values to identify the index pages where there are results.

The query engine can utilize the inverted index in four different ways:

Index seek

Consider the following query:

SELECT location
FROM location IN company.locations
WHERE location.country = 'France'

The query predicate (filtering on items where any location has "France" as its country/region) would match the path highlighted in red below:

[Figure: matching a specific path within a tree]

Since this query has an equality filter, after traversing this tree, the query engine can quickly identify the index pages that contain the query results. In this case, the query engine would read the index pages that contain Item 1. An index seek is the most efficient way to use the index. With an index seek, only the necessary index pages are read and only the items in the query results are loaded. Therefore, the index lookup time and the RU charge from index lookup are very low, regardless of the total data volume.

Precise index scan

Consider the following query:

SELECT *
FROM company
WHERE company.headquarters.employees > 200

The query predicate (filtering on items that have more than 200 employees) can be evaluated with a precise index scan of the headquarters/employees path. When doing a precise index scan, the query engine starts with a binary search of the distinct set of possible values to find the location of the value 200 for the headquarters/employees path. Since the values for each path are sorted in ascending order, the query engine can do this binary search easily. After the query engine finds the value 200, it reads all remaining index pages (going in the ascending direction).

Because the query engine can do a binary search to avoid scanning unnecessary index pages, precise index scans tend to have latency and RU charges comparable to index seek operations.
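The binary search followed by an ascending read can be sketched with Python's standard bisect module. The values and item IDs below are hypothetical (only 200 and 250 come from the example items), and the flat list stands in for index pages; treat it as a model of the idea rather than of the engine.

```python
from bisect import bisect_right

# Hypothetical slice of the index for /headquarters/employees: distinct
# values in ascending order, each with the IDs of the items that hold it.
sorted_values = [150, 200, 250, 300]
item_ids = {150: [3], 200: [2], 250: [1], 300: [4, 5]}


def items_greater_than(threshold):
    """Binary-search for the threshold, then read everything after it."""
    start = bisect_right(sorted_values, threshold)  # first value > threshold
    matches = []
    for value in sorted_values[start:]:
        matches.extend(item_ids[value])
    return matches


print(items_greater_than(200))  # [1, 4, 5]
```

Values at or below the threshold are never visited, which is why the cost stays close to that of an index seek.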

Expanded index scan

Consider the following query:

SELECT *
FROM company
WHERE STARTSWITH(company.headquarters.country, "United", true)

The query predicate (filtering on items that have headquarters in a country that starts with a case-insensitive "United") can be evaluated with an expanded index scan of the headquarters/country path. Operations that do an expanded index scan have optimizations that can help avoid the need to scan every index page, but they are slightly more expensive than a precise index scan's binary search.

For example, when evaluating a case-insensitive StartsWith, the query engine checks the index for different possible combinations of uppercase and lowercase values. This optimization allows the query engine to avoid reading the majority of index pages. Different system functions have different optimizations that they can use to avoid reading every index page, so they are broadly categorized as expanded index scans.
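One way to picture the uppercase/lowercase idea is the Python sketch below: it enumerates the case variants of the prefix and does a binary search for each one in a sorted list of values. This is an assumption-laden illustration, not the engine's actual algorithm; the helper names and the sample country values are hypothetical.

```python
from bisect import bisect_left
from itertools import product


def case_variants(prefix):
    """Return every uppercase/lowercase spelling of the prefix."""
    choices = [(ch.lower(), ch.upper()) if ch.isalpha() else (ch,) for ch in prefix]
    return {"".join(combo) for combo in product(*choices)}


def starts_with_ignore_case(sorted_values, prefix):
    """Binary-search once per case variant instead of reading every value."""
    matches = []
    for variant in case_variants(prefix):
        i = bisect_left(sorted_values, variant)
        while i < len(sorted_values) and sorted_values[i].startswith(variant):
            matches.append(sorted_values[i])
            i += 1
    return sorted(matches)


# Hypothetical distinct values for /headquarters/country, kept in sorted order.
values = sorted(
    ["Belgium", "UNITED KINGDOM", "United States", "Uruguay", "united arab emirates"]
)
print(starts_with_ignore_case(values, "united"))
```

Each variant costs one binary search plus the matching run, so the work grows with the number of variants and matches rather than with the total number of distinct values.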

Full index scan

Consider the following query:

SELECT *
FROM company
WHERE CONTAINS(company.headquarters.country, "United")

The query predicate (filtering on items that have headquarters in a country that contains "United") can be evaluated with a full index scan of the headquarters/country path. Unlike a precise index scan, a full index scan always scans through the distinct set of possible values to identify the index pages where there are results. In this case, Contains is run on the index. The index lookup time and RU charge for full index scans increase as the cardinality of the path increases. In other words, the more possible distinct values that the query engine needs to scan, the higher the latency and RU charge involved in doing a full index scan.

For example, consider two properties: town and country. The cardinality of town is 5,000 and the cardinality of country is 200. Here are two example queries that each use a Contains system function that does a full index scan, the first on the town property and the second on the country property. The first query will use more RUs than the second query because the cardinality of town is higher than that of country.

SELECT *
FROM c
WHERE CONTAINS(c.town, "Red", false)

SELECT *
FROM c
WHERE CONTAINS(c.country, "States", false)
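The effect of cardinality on these two queries can be modeled with a short Python sketch that counts how many distinct values a scan has to examine. The town and country values are synthetic placeholders sized to the cardinalities mentioned above, and the examined count is only a stand-in for index lookup cost, not an RU figure.

```python
def full_index_scan(distinct_values, substring):
    """Check every distinct value; return the matches and the number examined."""
    examined = 0
    matches = []
    for value in distinct_values:
        examined += 1
        if substring in value:  # case-sensitive, like CONTAINS(..., false)
            matches.append(value)
    return matches, examined


# Synthetic data: 5,000 distinct towns and 200 distinct countries.
towns = ["Redmond"] + [f"Town {n}" for n in range(4999)]
countries = ["United States"] + [f"Country {n}" for n in range(199)]

town_matches, town_examined = full_index_scan(towns, "Red")
country_matches, country_examined = full_index_scan(countries, "States")
print(town_examined, country_examined)  # 5000 200
```

Both scans return a single match, but the scan over town examines 25 times as many distinct values.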

Full scan

In some cases, the query engine may not be able to evaluate a query filter using the index. In this case, the query engine needs to load all items from the transactional store in order to evaluate the query filter. Full scans do not use the index and have an RU charge that increases linearly with the total data size. Luckily, operations that require full scans are rare.

Queries with complex filter expressions

The earlier examples only considered queries that had simple filter expressions (for example, queries with just a single equality or range filter). In reality, most queries have much more complex filter expressions.

Consider the following query:

SELECT *
FROM company
WHERE company.headquarters.employees = 200 AND CONTAINS(company.headquarters.country, "United")

To execute this query, the query engine must do an index seek on headquarters/employees and a full index scan on headquarters/country. The query engine has internal heuristics that it uses to evaluate the query filter expression as efficiently as possible. In this case, the query engine avoids reading unnecessary index pages by doing the index seek first. If, for example, only 50 items matched the equality filter, the query engine would only need to evaluate Contains on the index pages that contained those 50 items. A full index scan of the entire container wouldn't be necessary.
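The ordering described here, seek first and then evaluate Contains only on the survivors, can be sketched as follows. This is a simplified model: the data is hypothetical, and a per-item dictionary stands in for the index pages that hold the candidate items.

```python
# Hypothetical index for /headquarters/employees: value -> item IDs.
employees_index = {200: {2, 7, 9}, 250: {1}}

# Hypothetical /headquarters/country value for each item.
country_by_item = {
    1: "Belgium",
    2: "Belgium",
    7: "United Kingdom",
    9: "United States",
}


def evaluate(employees, substring):
    """Apply the equality filter first, then Contains on the candidates only."""
    candidates = employees_index.get(employees, set())  # index seek
    return sorted(
        item_id for item_id in candidates if substring in country_by_item[item_id]
    )


print(evaluate(200, "United"))  # [7, 9]
```

Item 1 is never checked against the Contains predicate because the equality filter already excluded it.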

Index utilization for scalar aggregate functions

Queries with aggregate functions must rely exclusively on the index in order to use it.

In some cases, the index can return false positives. For example, when evaluating Contains on the index, the number of matches in the index may exceed the number of query results. The query engine loads all index matches, evaluates the filter on the loaded items, and returns only the correct results.

For the majority of queries, loading false positive index matches does not have any noticeable impact on index utilization.

For example, consider the following query:

SELECT *
FROM company
WHERE CONTAINS(company.headquarters.country, "United")

The Contains system function may return some false positive matches, so the query engine needs to verify whether each loaded item matches the filter expression. In this example, the query engine may only need to load a few extra items, so the impact on index utilization and RU charge is minimal.

However, queries with aggregate functions must rely exclusively on the index in order to use it. For example, consider the following query with a Count aggregate:

SELECT COUNT(1)
FROM company
WHERE CONTAINS(company.headquarters.country, "United")

As in the first example, the Contains system function may return some false positive matches. Unlike the SELECT * query, however, the Count query can't evaluate the filter expression on the loaded items to verify all index matches. The Count query must rely exclusively on the index, so if there's a chance that a filter expression will return false positive matches, the query engine resorts to a full scan.

Queries with the following aggregate functions must rely exclusively on the index, so evaluating some system functions requires a full scan.

Next steps

Read more about indexing in the following articles: