OData language overview for $filter, $orderby, and $select in Azure Cognitive Search

Azure Cognitive Search supports a subset of the OData expression syntax for $filter, $orderby, and $select expressions. Filter expressions are evaluated during query parsing, constraining search to specific fields or adding match criteria used during index scans. Order-by expressions are applied as a post-processing step over a result set to sort the documents that are returned. Select expressions determine which document fields are included in the result set. The syntax of these expressions is distinct from the simple or full query syntax that is used in the search parameter, although there's some overlap in the syntax for referencing fields.

This article provides an overview of the OData expression language used in filter, order-by, and select expressions. The language is presented "bottom-up", starting with the most basic elements and building on them. The top-level syntax for each parameter is described in a separate article.

OData expressions range from simple to highly complex, but they all share common elements. The most basic parts of an OData expression in Azure Cognitive Search are:

  • Field paths, which refer to specific fields of your index.
  • Constants, which are literal values of a certain data type.

Note

Terminology in Azure Cognitive Search differs from the OData standard in a few ways. What we call a field in Azure Cognitive Search is called a property in OData, and a field path is likewise called a property path. An index containing documents in Azure Cognitive Search is referred to more generally in OData as an entity set containing entities. The Azure Cognitive Search terminology is used throughout this reference.

Field paths

The following EBNF (Extended Backus-Naur Form) defines the grammar of field paths.

field_path ::= identifier('/'identifier)*

identifier ::= [a-zA-Z_][a-zA-Z_0-9]*


A field path is composed of one or more identifiers separated by slashes. Each identifier is a sequence of characters that must start with an ASCII letter or underscore, and contain only ASCII letters, digits, or underscores. The letters can be upper- or lower-case.
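The grammar above is small enough to express as a regular expression. The following Python sketch is one way to check a candidate field path on the client side; the function name `is_valid_field_path` is hypothetical, and the check covers syntax only. It doesn't verify that the field exists in any index.

```python
import re

# Mirrors the EBNF: identifier ::= [a-zA-Z_][a-zA-Z_0-9]*
_IDENTIFIER = r"[a-zA-Z_][a-zA-Z_0-9]*"
# Mirrors the EBNF: field_path ::= identifier('/'identifier)*
_FIELD_PATH = re.compile(rf"{_IDENTIFIER}(?:/{_IDENTIFIER})*")


def is_valid_field_path(path: str) -> bool:
    """Return True if `path` matches the field_path grammar (syntax only)."""
    return _FIELD_PATH.fullmatch(path) is not None
```

Under this sketch, `HotelName` and `Stores/Address/Country` are accepted, while `1Room` (leading digit) and `Address//City` (empty identifier) are rejected.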

An identifier can refer either to the name of a field, or to a range variable in the context of a collection expression (any or all) in a filter. A range variable is like a loop variable that represents the current element of the collection. For complex collections, that variable represents an object, which is why you can use field paths to refer to sub-fields of the variable. This is analogous to dot notation in many programming languages.

Examples of field paths are shown in the following table:

Field path | Description
HotelName | Refers to a top-level field of the index
Address/City | Refers to the City sub-field of a complex field in the index; Address is of type Edm.ComplexType in this example
Rooms/Type | Refers to the Type sub-field of a complex collection field in the index; Rooms is of type Collection(Edm.ComplexType) in this example
Stores/Address/Country | Refers to the Country sub-field of the Address sub-field of a complex collection field in the index; Stores is of type Collection(Edm.ComplexType) and Address is of type Edm.ComplexType in this example
room/Type | Refers to the Type sub-field of the room range variable, for example in the filter expression Rooms/any(room: room/Type eq 'deluxe')
store/Address/Country | Refers to the Country sub-field of the Address sub-field of the store range variable, for example in the filter expression Stores/any(store: store/Address/Country eq 'Canada')

The meaning of a field path differs depending on the context. In filters, a field path refers to the value of a single instance of a field in the current document. In other contexts, such as $orderby, $select, or in fielded search in the full Lucene syntax, a field path refers to the field itself. This difference has some consequences for how you use field paths in filters.

Consider the field path Address/City. In a filter, this refers to a single city for the current document, like "San Francisco". In contrast, Rooms/Type refers to the Type sub-field for many rooms (like "standard" for the first room, "deluxe" for the second room, and so on). Since Rooms/Type doesn't refer to a single instance of the sub-field Type, it can't be used directly in a filter. Instead, to filter on room type, you would use a lambda expression with a range variable, like this:

Rooms/any(room: room/Type eq 'deluxe')

In this example, the range variable room appears in the room/Type field path. That way, room/Type refers to the type of the current room in the current document. This is a single instance of the Type sub-field, so it can be used directly in the filter.
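The loop-variable analogy can be made concrete with a small Python sketch. It only imitates the semantics of the lambda expression on an in-memory dictionary and isn't how the service evaluates filters. The document contents and the function name `has_room_of_type` are made up for illustration.

```python
# Hypothetical document shaped like the hotel examples in this article.
hotel = {
    "HotelName": "Example Hotel",
    "Address": {"City": "San Francisco"},
    "Rooms": [{"Type": "standard"}, {"Type": "deluxe"}],
}


def has_room_of_type(document: dict, room_type: str) -> bool:
    """Imitate Rooms/any(room: room/Type eq '<room_type>') in plain Python."""
    # `room` plays the role of the range variable: it is bound to one
    # element of the Rooms collection at a time, so room["Type"] is a
    # single value that can be compared directly.
    return any(room["Type"] == room_type for room in document["Rooms"])
```

Here `hotel["Rooms"]` on its own is a list of rooms, much as Rooms/Type refers to many values, whereas `room["Type"]` inside the generator expression is a single value, much as room/Type is in the filter.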

Using field paths

Field paths are used in many parameters of the Azure Cognitive Search REST APIs. The following table lists all the places where they can be used, plus any restrictions on their usage:

API | Parameter name | Restrictions
Create or Update Index | suggesters/sourceFields | None
Create or Update Index | scoringProfiles/text/weights | Can only refer to searchable fields
Create or Update Index | scoringProfiles/functions/fieldName | Can only refer to filterable fields
Search | search when queryType is full | Can only refer to searchable fields
Search | facet | Can only refer to facetable fields
Search | highlight | Can only refer to searchable fields
Search | searchFields | Can only refer to searchable fields
Suggest and Autocomplete | searchFields | Can only refer to fields that are part of a suggester
Search, Suggest, and Autocomplete | $filter | Can only refer to filterable fields
Search and Suggest | $orderby | Can only refer to sortable fields
Search, Suggest, and Lookup | $select | Can only refer to retrievable fields

Constants

Constants in OData are literal values of a given Entity Data Model (EDM) type. See Supported data types for a list of supported types in Azure Cognitive Search. Constants of collection types aren't supported.

The following table shows examples of constants for each of the data types supported by Azure Cognitive Search:

Data type | Example constants
Edm.Boolean | true, false
Edm.DateTimeOffset | 2019-05-06T12:30:05.451Z
Edm.Double | 3.14159, -1.2e7, NaN, INF, -INF
Edm.GeographyPoint | geography'POINT(-122.131577 47.678581)'
Edm.GeographyPolygon | geography'POLYGON((-122.031577 47.578581, -122.031577 47.678581, -122.131577 47.678581, -122.031577 47.578581))'
Edm.Int32 | 123, -456
Edm.Int64 | 283032927235
Edm.String | 'hello'

Escaping special characters in string constants

String constants in OData are delimited by single quotes. If you need to construct a query with a string constant that might itself contain single quotes, you can escape the embedded quotes by doubling them.

For example, a phrase with an unformatted apostrophe like "Alice's car" would be represented in OData as the string constant 'Alice''s car'.

Important

When constructing filters programmatically, it's important to remember to escape string constants that come from user input. This is to mitigate the possibility of injection attacks, especially when using filters to implement security trimming.
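The quote-doubling rule is straightforward to apply in code. The following Python sketch shows one way to do it; the helper names `odata_string_literal` and `city_filter` are hypothetical, and the example assumes that Address/City is a filterable field in your index.

```python
def odata_string_literal(value: str) -> str:
    """Wrap `value` in single quotes, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def city_filter(city: str) -> str:
    """Build a filter on Address/City from untrusted input.

    Assumes Address/City is filterable; the field is illustrative only.
    """
    return f"Address/City eq {odata_string_literal(city)}"
```

With this helper, the input `Alice's car` becomes `'Alice''s car'`. An input such as `x' or true or 'a' eq 'a` stays inside a single string constant rather than changing the structure of the filter.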

Constants syntax

The following EBNF (Extended Backus-Naur Form) defines the grammar for most of the constants shown in the above table. The grammar for geo-spatial types can be found in OData geo-spatial functions in Azure Cognitive Search.

constant ::=
    string_literal
    | date_time_offset_literal
    | integer_literal
    | float_literal
    | boolean_literal
    | 'null'

string_literal ::= "'"([^'] | "''")*"'"

date_time_offset_literal ::= date_part'T'time_part time_zone

date_part ::= year'-'month'-'day

time_part ::= hour':'minute(':'second('.'fractional_seconds)?)?

zero_to_fifty_nine ::= [0-5]digit

digit ::= [0-9]

year ::= digit digit digit digit

month ::= '0'[1-9] | '1'[0-2]

day ::= '0'[1-9] | [1-2]digit | '3'[0-1]

hour ::= [0-1]digit | '2'[0-3]

minute ::= zero_to_fifty_nine

second ::= zero_to_fifty_nine

fractional_seconds ::= integer_literal

time_zone ::= 'Z' | sign hour':'minute

sign ::= '+' | '-'

/* In practice integer literals are limited in length to the precision of
the corresponding EDM data type. */
integer_literal ::= digit+

float_literal ::=
    sign? whole_part fractional_part? exponent?
    | 'NaN'
    | '-INF'
    | 'INF'

whole_part ::= integer_literal

fractional_part ::= '.'integer_literal

exponent ::= 'e' sign? integer_literal

boolean_literal ::= 'true' | 'false'
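As an illustration of how these literals look in practice, the following Python sketch formats native Python values as OData constants that fit the grammar above. The function name `to_odata_constant` is hypothetical, and the choices it makes (converting datetimes to UTC, emitting millisecond precision, rejecting naive datetimes) are assumptions of this sketch rather than requirements of the service. Geo-spatial types are not covered.

```python
import math
from datetime import datetime, timezone


def to_odata_constant(value) -> str:
    """Format a Python value as an OData constant (sketch, not exhaustive)."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # must precede int: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return str(value)
    if isinstance(value, float):
        if math.isnan(value):
            return "NaN"
        if math.isinf(value):
            return "INF" if value > 0 else "-INF"
        return repr(value)
    if isinstance(value, datetime):
        if value.tzinfo is None:
            raise ValueError("naive datetime: attach a time zone first")
        utc = value.astimezone(timezone.utc)  # assumption: normalize to 'Z'
        millis = utc.microsecond // 1000
        return (
            f"{utc.year:04d}-{utc.month:02d}-{utc.day:02d}"
            f"T{utc.hour:02d}:{utc.minute:02d}:{utc.second:02d}.{millis:03d}Z"
        )
    if isinstance(value, str):
        # Double embedded single quotes, per string_literal in the grammar.
        return "'" + value.replace("'", "''") + "'"
    raise TypeError(f"unsupported type: {type(value).__name__}")
```

Checking `bool` before `int` matters because Python treats `True` as an integer; without that ordering, `True` would be emitted as `1` rather than `true`.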


Building expressions from field paths and constants

Field paths and constants are the most basic parts of an OData expression, but they're already full expressions themselves. In fact, the $select parameter in Azure Cognitive Search is nothing but a comma-separated list of field paths, and $orderby isn't much more complicated than $select. If you happen to have a field of type Edm.Boolean in your index, you can even write a filter that is nothing but the path of that field. The constants true and false are likewise valid filters.

However, most of the time you'll need more complex expressions that refer to more than one field and constant. These expressions are built in different ways depending on the parameter.

The following EBNF (Extended Backus-Naur Form) defines the grammar for the $filter, $orderby, and $select parameters. These are built up from simpler expressions that refer to field paths and constants:

filter_expression ::= boolean_expression

order_by_expression ::= order_by_clause(',' order_by_clause)*

select_expression ::= '*' | field_path(',' field_path)*


The $orderby and $select parameters are both comma-separated lists of simpler expressions. The $filter parameter is a Boolean expression that is composed of simpler sub-expressions. These sub-expressions are combined using logical operators such as and, or, and not, comparison operators such as eq, lt, gt, and so on, and collection operators such as any and all.
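To show how the three parameters fit together, the following Python sketch assembles them into a set of query options. The function name `build_query_options` is hypothetical, the field names come from the examples in this article, and each order-by clause here is just a bare field path. The sketch only builds strings. It doesn't call the service, and it assumes the fields are filterable, sortable, and retrievable as required.

```python
from urllib.parse import urlencode


def build_query_options(filter_expr: str, order_by: list, select: list) -> dict:
    """Assemble $filter, $orderby, and $select values (sketch only)."""
    return {
        # A single Boolean expression built from sub-expressions.
        "$filter": filter_expr,
        # Comma-separated lists of simpler expressions.
        "$orderby": ",".join(order_by),
        "$select": ",".join(select),
    }


options = build_query_options(
    filter_expr=(
        "Rooms/any(room: room/Type eq 'deluxe') "
        "and Address/City eq 'San Francisco'"
    ),
    order_by=["Address/City", "HotelName"],
    select=["HotelName", "Address/City"],
)

# Percent-encode the options for use in a URL query string.
query_string = urlencode(options)
```

The filter combines a collection operator (any), a comparison operator (eq), and a logical operator (and), while $orderby and $select stay simple comma-separated lists.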

The $filter, $orderby, and $select parameters are each explored in more detail in a separate article.

See also