OData search.in function in Azure Cognitive Search

A common scenario in OData filter expressions is to check whether a single field in each document is equal to one of many possible values. For example, some applications implement security trimming this way: they check a field containing one or more principal IDs against a list of principal IDs representing the user issuing the query. One way to write such a query is to use the eq and or operators:

    group_ids/any(g: g eq '123' or g eq '456' or g eq '789')

However, there is a shorter way to write this, using the search.in function:

    group_ids/any(g: search.in(g, '123, 456, 789'))
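When the list of principal IDs comes from application code, the filter string has to be assembled at run time. The following Python sketch shows one way this might be done. The helper names (odata_quote, build_search_in, build_group_filter) and the candidate delimiter set are hypothetical, and the sketch assumes the usual OData convention of doubling a single quote inside a string literal.

```python
def odata_quote(text):
    """Wrap text in single quotes, doubling any embedded single quote."""
    return "'" + text.replace("'", "''") + "'"


def build_search_in(variable, values, candidate_delimiters=",|;"):
    """Build a search.in(...) call with an explicit delimiter.

    The first candidate delimiter that appears in none of the values is used,
    so values containing spaces or commas are not split by accident.
    """
    values = list(values)
    if not values:
        raise ValueError("at least one value is required")
    for delimiter in candidate_delimiters:
        if all(delimiter not in value for value in values):
            break
    else:
        raise ValueError("every candidate delimiter occurs in some value")
    value_list = delimiter.join(values)
    return f"search.in({variable}, {odata_quote(value_list)}, {odata_quote(delimiter)})"


def build_group_filter(group_ids):
    """Security-trimming filter over the group_ids collection field."""
    return f"group_ids/any(g: {build_search_in('g', group_ids)})"


print(build_group_filter(["123", "456", "789"]))
# group_ids/any(g: search.in(g, '123,456,789', ','))
```

Passing the delimiter explicitly, rather than relying on the default of space and comma, keeps the generated filter correct even if an ID later contains a space.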

Important

Besides being shorter and easier to read, search.in also performs better and avoids certain filter size limitations when the filter must include hundreds or even thousands of values. For this reason, we strongly recommend using search.in instead of a more complex disjunction of equality expressions.

Note

Version 4.01 of the OData standard introduced the in operator, which behaves similarly to the search.in function in Azure Cognitive Search. However, Azure Cognitive Search does not support this operator, so you must use the search.in function instead.

Syntax

The following EBNF (Extended Backus-Naur Form) defines the grammar of the search.in function:

search_in_call ::=
    'search.in(' variable ',' string_literal(',' string_literal)? ')'

An interactive syntax diagram is also available.

The search.in function tests whether a given string field or range variable is equal to one of a given list of values. Equality between the variable and each value in the list is determined in a case-sensitive fashion, the same way as for the eq operator. Therefore, an expression like search.in(myfield, 'a, b, c') is equivalent to myfield eq 'a' or myfield eq 'b' or myfield eq 'c', except that search.in yields much better performance.

There are two overloads of the search.in function:

  • search.in(variable, valueList)
  • search.in(variable, valueList, delimiters)

The parameters are defined in the following table:

Parameter name    Type    Description
variable    Edm.String    A string field reference (or a range variable over a string collection field when search.in is used inside an any or all expression).
valueList    Edm.String    A string containing a delimited list of values to match against the variable parameter. If the delimiters parameter is not specified, the default delimiters are space and comma.
delimiters    Edm.String    A string in which each character is treated as a separator when parsing the valueList parameter. The default value of this parameter is ' ,', which means that any values with spaces and/or commas between them are separated. If your values themselves contain spaces or commas, specify an alternative delimiter such as '|' in this parameter.
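The way valueList and delimiters interact can be illustrated with a small client-side model in Python. This is only a sketch of the semantics described above, not the service's implementation: the function names are hypothetical, and the decision to discard empty tokens (such as the one between the comma and the space in 'a, b') is an assumption.

```python
def split_value_list(value_list, delimiters=" ,"):
    """Split value_list, treating every character of delimiters as a separator.

    Assumption: empty tokens between adjacent delimiters are discarded.
    """
    tokens = []
    current = []
    for ch in value_list:
        if ch in delimiters:
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


def search_in(variable_value, value_list, delimiters=" ,"):
    """Case-sensitive membership test, mirroring a chain of eq ... or ... comparisons."""
    return variable_value in split_value_list(value_list, delimiters)


print(split_value_list("a, b, c"))                                           # ['a', 'b', 'c']
print(search_in("B", "a, b, c"))                                             # False (case-sensitive)
print(search_in("Sea View motel", "Sea View motel,Budget hotel"))            # False (split on spaces)
print(search_in("Sea View motel", "Sea View motel|Budget hotel", "|"))       # True
```

The third call shows why the delimiters parameter matters: with the default delimiters, the phrase is broken into individual words and no longer matches the full hotel name.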

Performance of search.in

If you use search.in, you can expect sub-second response time when the second parameter contains a list of hundreds or thousands of values. There is no explicit limit on the number of items you can pass to search.in, although you are still limited by the maximum request size. However, latency grows as the number of values grows.
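Because the maximum request size still applies, it can help to compare how much of the request each form of the filter consumes. The Python sketch below builds both forms for the same made-up list of 1,000 IDs and prints their sizes in bytes. The helper names and the sample IDs are hypothetical, and the sketch measures only expression length, not query latency.

```python
def eq_or_filter(variable, values):
    """Disjunction of equality comparisons: v eq 'a' or v eq 'b' ..."""
    return " or ".join(f"{variable} eq '{value}'" for value in values)


def search_in_filter(variable, values):
    """Equivalent search.in call with a comma as the explicit delimiter."""
    return f"search.in({variable}, '{','.join(values)}', ',')"


# Sample IDs made up for illustration; none contain quotes or commas.
ids = [str(n) for n in range(1000, 2000)]

long_form = eq_or_filter("g", ids)
short_form = search_in_filter("g", ids)

print("eq/or bytes:    ", len(long_form.encode("utf-8")))
print("search.in bytes:", len(short_form.encode("utf-8")))
```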

Examples

Find all hotels with name equal to either 'Sea View motel' or 'Budget hotel'. The phrases contain spaces, which are a default delimiter, so specify an alternative delimiter in single quotes as the third string parameter:

    search.in(HotelName, 'Sea View motel,Budget hotel', ',')

Find all hotels with name equal to either 'Sea View motel' or 'Budget hotel', using '|' as the delimiter:

    search.in(HotelName, 'Sea View motel|Budget hotel', '|')

Find all hotels with rooms that have the tag 'wifi' or 'tub':

    Rooms/any(room: room/Tags/any(tag: search.in(tag, 'wifi, tub')))

Find a match on phrases within a collection, such as 'heated towel racks' or 'hairdryer included' in tags:

    Rooms/any(room: room/Tags/any(tag: search.in(tag, 'heated towel racks,hairdryer included', ',')))

Find all hotels without the tag 'motel' or 'cabin':

    Tags/all(tag: not search.in(tag, 'motel, cabin'))
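To use any of the filters above in a query, the expression is passed as the filter of a search request. The Python sketch below only assembles a JSON request body. It assumes the Search Documents REST API accepts a POST body with search and filter fields; the endpoint, API version, and authentication are deliberately left out, and the helper name is hypothetical.

```python
import json


def search_request_body(filter_expression, search_text="*"):
    """Serialize a minimal search request body (assumed fields: search, filter)."""
    return json.dumps({"search": search_text, "filter": filter_expression})


wifi_or_tub = "Rooms/any(room: room/Tags/any(tag: search.in(tag, 'wifi, tub')))"
print(search_request_body(wifi_or_tub))
```

Serializing with json.dumps takes care of JSON escaping, so the single quotes inside the OData expression need no additional handling at this layer.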

Next steps