OData $filter syntax in Azure Cognitive Search

Azure Cognitive Search uses OData filter expressions to apply additional criteria to a search query besides full-text search terms. This article describes the syntax of filters in detail. For more general information about what filters are and how to use them to realize specific query scenarios, see Filters in Azure Cognitive Search.

Syntax

A filter in the OData language is a Boolean expression, which in turn can be one of several types of expression, as shown by the following EBNF (Extended Backus-Naur Form):

boolean_expression ::=
    collection_filter_expression
    | logical_expression
    | comparison_expression
    | boolean_literal
    | boolean_function_call
    | '(' boolean_expression ')'
    | variable

/* This can be a range variable in the case of a lambda, or a field path. */
variable ::= identifier | field_path

An interactive syntax diagram is also available.

The types of Boolean expressions include:

  • Collection filter expressions using any or all. These apply filter criteria to collection fields. For more information, see OData collection operators in Azure Cognitive Search.
  • Logical expressions that combine other Boolean expressions using the operators and, or, and not. For more information, see OData logical operators in Azure Cognitive Search.
  • Comparison expressions, which compare fields or range variables to constant values using the operators eq, ne, gt, lt, ge, and le. For more information, see OData comparison operators in Azure Cognitive Search. Comparison expressions are also used to compare distances between geo-spatial coordinates using the geo.distance function. For more information, see OData geo-spatial functions in Azure Cognitive Search.
  • The Boolean literals true and false. These constants can be useful when generating filters programmatically, but are otherwise rarely used in practice.
  • Calls to Boolean functions, including geo.intersects, search.in, search.ismatch, and search.ismatchscoring, each of which appears in the examples later in this article.
  • Field paths or range variables of type Edm.Boolean. For example, if your index has a Boolean field called IsEnabled and you want to return all documents where this field is true, your filter expression can just be the name IsEnabled.
  • Boolean expressions in parentheses. Using parentheses can help to explicitly determine the order of operations in a filter. For more information on the default precedence of the OData operators, see the next section.
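The remark above that the literals true and false are mainly useful when filters are generated programmatically can be illustrated with a minimal Python sketch. The helper names all_of and any_of are hypothetical and not part of any Azure SDK. They only assemble filter text, falling back to a Boolean literal when there are no clauses to combine.

```python
def all_of(clauses):
    """Combine clauses with 'and'. An empty list yields the literal 'true',
    which matches every document (the identity for 'and')."""
    clauses = list(clauses)
    if not clauses:
        return "true"
    # Parenthesize each clause so its internal operators cannot regroup.
    return " and ".join(f"({clause})" for clause in clauses)


def any_of(clauses):
    """Combine clauses with 'or'. An empty list yields the literal 'false',
    which matches no document (the identity for 'or')."""
    clauses = list(clauses)
    if not clauses:
        return "false"
    return " or ".join(f"({clause})" for clause in clauses)


if __name__ == "__main__":
    print(all_of(["Rating ge 4", "ParkingIncluded"]))
    print(all_of([]))
```

Wrapping every clause in parentheses is a deliberate choice in this sketch: it makes the generated filter independent of the default operator precedence.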

Operator precedence in filters

If you write a filter expression with no parentheses around its sub-expressions, Azure Cognitive Search will evaluate it according to a set of operator precedence rules. These rules are based on which operators are used to combine sub-expressions. The following table lists groups of operators in order from highest to lowest precedence:

Group                   Operator(s)
Logical operators       not
Comparison operators    eq, ne, gt, lt, ge, le
Logical operators       and
Logical operators       or

An operator that is higher in the above table will "bind more tightly" to its operands than other operators. For example, and is of higher precedence than or, and comparison operators are of higher precedence than either of them, so the following two expressions are equivalent:

    Rating gt 0 and Rating lt 3 or Rating gt 7 and Rating lt 10
    ((Rating gt 0) and (Rating lt 3)) or ((Rating gt 7) and (Rating lt 10))
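The grouping shown above can be reproduced mechanically for simple cases. The following Python sketch uses a hypothetical helper, parenthesize, that is not part of any library. It assumes a flat filter that contains only comparisons joined by and and or, with no existing parentheses, no not operator, and no string literals that contain " and " or " or ".

```python
def parenthesize(flat_filter):
    """Make the default grouping of a flat and/or filter explicit.

    'or' binds most loosely, so split on it first; each resulting group
    is a chain of comparisons joined by 'and'.
    """
    rendered_groups = []
    for group in flat_filter.split(" or "):
        comparisons = group.split(" and ")
        inner = " and ".join(f"({comparison})" for comparison in comparisons)
        rendered_groups.append(f"({inner})")
    return " or ".join(rendered_groups)


if __name__ == "__main__":
    print(parenthesize("Rating gt 0 and Rating lt 3 or Rating gt 7 and Rating lt 10"))
    # prints ((Rating gt 0) and (Rating lt 3)) or ((Rating gt 7) and (Rating lt 10))
```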

The not operator has the highest precedence of all -- even higher than the comparison operators. That's why if you try to write a filter like this:

    not Rating gt 5

You'll get this error message:

    Invalid expression: A unary operator with an incompatible type was detected. Found operand type 'Edm.Int32' for operator kind 'Not'.

This error happens because the operator is associated with just the Rating field, which is of type Edm.Int32, and not with the entire comparison expression. The fix is to put the operand of not in parentheses:

    not (Rating gt 5)

Filter size limitations

There are limits to the size and complexity of filter expressions that you can send to Azure Cognitive Search. The limits are based roughly on the number of clauses in your filter expression. A good guideline is that if you have hundreds of clauses, you are at risk of exceeding the limit. We recommend designing your application in such a way that it doesn't generate filters of unbounded size.

Tip

Using the search.in function instead of long disjunctions of equality comparisons can help avoid the filter clause limit, since a function call counts as a single clause.
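As a rough illustration of this tip, the Python sketch below builds both forms from the same list of values. The helper names (quote_literal, equality_disjunction, search_in) and the candidate delimiter list are assumptions made for this example and are not part of any Azure SDK. Doubling embedded single quotes follows the usual OData string-literal convention.

```python
def quote_literal(text):
    """Render text as an OData string literal, doubling embedded single quotes."""
    return "'" + text.replace("'", "''") + "'"


def equality_disjunction(field, values):
    """One 'eq' clause per value: the clause count grows with the list."""
    return " or ".join(f"{field} eq {quote_literal(value)}" for value in values)


def search_in(field, values, candidate_delimiters=",|;"):
    """A single search.in call, however many values are supplied.

    Picks the first candidate delimiter that occurs in none of the values.
    """
    values = list(values)
    for delimiter in candidate_delimiters:
        if all(delimiter not in value for value in values):
            joined = quote_literal(delimiter.join(values))
            return f"search.in({field}, {joined}, '{delimiter}')"
    raise ValueError("every candidate delimiter occurs in at least one value")


if __name__ == "__main__":
    names = ["Sea View motel", "Budget hotel"]
    print(equality_disjunction("HotelName", names))
    print(search_in("HotelName", names))
```

The disjunction contributes one clause per value, while the search.in form stays a single function call.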

Examples

Find all hotels with at least one room with a base rate less than $200 that are rated at or above 4:

    $filter=Rooms/any(room: room/BaseRate lt 200.0) and Rating ge 4

Find all hotels other than "Sea View Motel" that have been renovated since 2010:

    $filter=HotelName ne 'Sea View Motel' and LastRenovationDate ge 2010-01-01T00:00:00Z

Find all hotels that were renovated in 2010 or later. The datetime literal includes time zone information for Pacific Standard Time:

    $filter=LastRenovationDate ge 2010-01-01T00:00:00-08:00
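The two datetime literals above differ only in how the offset is written. As a minimal sketch, the hypothetical Python helper below (not part of any Azure SDK) renders a timezone-aware datetime in the same shape, using Z for UTC and an explicit offset otherwise. Rejecting naive datetimes is a design choice of this example.

```python
from datetime import datetime, timedelta, timezone


def datetime_literal(value):
    """Format an aware datetime like the literals in the examples above."""
    if value.tzinfo is None:
        raise ValueError("a time zone is required to build an unambiguous literal")
    text = value.isoformat(timespec="seconds")
    # isoformat() writes UTC as +00:00; the examples use the shorter Z form.
    if text.endswith("+00:00"):
        text = text[:-6] + "Z"
    return text


if __name__ == "__main__":
    pacific_standard = timezone(timedelta(hours=-8))
    renovated = datetime(2010, 1, 1, tzinfo=pacific_standard)
    print(f"$filter=LastRenovationDate ge {datetime_literal(renovated)}")
    # prints $filter=LastRenovationDate ge 2010-01-01T00:00:00-08:00
```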

Find all hotels that have parking included and where all rooms are non-smoking:

    $filter=ParkingIncluded and Rooms/all(room: not room/SmokingAllowed)

- OR -

    $filter=ParkingIncluded eq true and Rooms/all(room: room/SmokingAllowed eq false)

Find all hotels that are Luxury or include parking and have a rating of 5:

    $filter=(Category eq 'Luxury' or ParkingIncluded eq true) and Rating eq 5

Find all hotels with the tag "wifi" in at least one room (where each room has tags stored in a Collection(Edm.String) field):

    $filter=Rooms/any(room: room/Tags/any(tag: tag eq 'wifi'))

Find all hotels with any rooms:

    $filter=Rooms/any()

Find all hotels that don't have rooms:

    $filter=not Rooms/any()

Find all hotels within 10 kilometers of a given reference point (where Location is a field of type Edm.GeographyPoint):

    $filter=geo.distance(Location, geography'POINT(-122.131577 47.678581)') le 10

Find all hotels within a given viewport described as a polygon (where Location is a field of type Edm.GeographyPoint). The polygon must be closed, meaning the first and last points must be the same. Also, the points must be listed in counterclockwise order.

    $filter=geo.intersects(Location, geography'POLYGON((-122.031577 47.578581, -122.031577 47.678581, -122.131577 47.678581, -122.031577 47.578581))')
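The two polygon requirements stated above (closed ring, counterclockwise order) can be checked before a filter is sent. The Python sketch below uses a hypothetical helper, polygon_literal, that is not part of any Azure SDK. It tests winding order with the planar shoelace formula on longitude/latitude pairs, which is an approximation assumed to be adequate for small viewports that do not cross the antimeridian.

```python
def polygon_literal(points):
    """Build a geography'POLYGON((...))' literal from (longitude, latitude) pairs.

    Raises ValueError if the ring is not closed or not counterclockwise.
    """
    points = list(points)
    if len(points) < 4 or points[0] != points[-1]:
        raise ValueError("the ring must be closed: first and last points must match")
    # Shoelace formula: twice the signed area is positive for a
    # counterclockwise ring when x is longitude and y is latitude.
    twice_area = sum(
        x1 * y2 - x2 * y1
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    )
    if twice_area <= 0:
        raise ValueError("points must be listed in counterclockwise order")
    coordinates = ", ".join(f"{lon} {lat}" for lon, lat in points)
    return f"geography'POLYGON(({coordinates}))'"


if __name__ == "__main__":
    ring = [
        (-122.031577, 47.578581),
        (-122.031577, 47.678581),
        (-122.131577, 47.678581),
        (-122.031577, 47.578581),
    ]
    print(f"$filter=geo.intersects(Location, {polygon_literal(ring)})")
```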

Find all hotels where the "Description" field is null. The field will be null if it was never set, or if it was explicitly set to null:

    $filter=Description eq null

Find all hotels with a name equal to either 'Sea View motel' or 'Budget hotel'. These phrases contain spaces, and space is a default delimiter. You can specify an alternative delimiter in single quotes as the third string parameter:

    $filter=search.in(HotelName, 'Sea View motel,Budget hotel', ',')

Find all hotels with a name equal to either 'Sea View motel' or 'Budget hotel', using '|' as the delimiter:

    $filter=search.in(HotelName, 'Sea View motel|Budget hotel', '|')

Find all hotels where at least one room has the tag 'wifi' or 'tub':

    $filter=Rooms/any(room: room/Tags/any(tag: search.in(tag, 'wifi, tub')))

Find a match on phrases within a collection, such as 'heated towel racks' or 'hairdryer included' in tags:

    $filter=Rooms/any(room: room/Tags/any(tag: search.in(tag, 'heated towel racks,hairdryer included', ',')))

Find documents with the word "waterfront". This filter query is identical to a search request with search=waterfront.

    $filter=search.ismatchscoring('waterfront')

Find documents with the word "hostel" and a rating greater than or equal to 4, or documents with the word "motel" and a rating equal to 5. This request couldn't be expressed without the search.ismatchscoring function, since it combines full-text search with filter operations using or.

    $filter=search.ismatchscoring('hostel') and Rating ge 4 or search.ismatchscoring('motel') and Rating eq 5

Find documents without the word "luxury".

    $filter=not search.ismatch('luxury')

Find documents with the phrase "ocean view" or a rating equal to 5. The search.ismatchscoring query will be executed only against the fields HotelName and Description. Documents that matched only the second clause of the disjunction will be returned too -- hotels with Rating equal to 5. Those documents will be returned with a score equal to zero to make it clear that they didn't match any of the scored parts of the expression.

    $filter=search.ismatchscoring('"ocean view"', 'Description,HotelName') or Rating eq 5

Find hotels where the terms "hotel" and "airport" are no more than five words apart in the description, and where all rooms are non-smoking. This query uses the full Lucene query language.

    $filter=search.ismatch('"hotel airport"~5', 'Description', 'full', 'any') and not Rooms/any(room: room/SmokingAllowed)
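The examples above are written unencoded for readability. If a filter is placed in the query string of a URL, characters such as spaces, quotes, slashes, and parentheses generally need percent-encoding. The Python sketch below shows one way to do that with the standard library. The helper name filter_query_parameter is hypothetical, and how the resulting parameter is attached to a request depends on the client, which this article does not cover.

```python
from urllib.parse import quote, unquote


def filter_query_parameter(expression):
    """Return a '$filter=...' query-string fragment with the expression
    percent-encoded (safe='' also encodes '/', which quote() keeps by default)."""
    return "$filter=" + quote(expression, safe="")


if __name__ == "__main__":
    encoded = filter_query_parameter("Rooms/any(room: room/BaseRate lt 200.0) and Rating ge 4")
    print(encoded)
    # Decoding recovers the original expression unchanged.
    print(unquote(encoded))
```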

Next steps