top-hitters operator

Returns an approximation of the top N results, assuming the input has a skewed distribution.

T | top-hitters 25 of Page by Views 


top-hitters is an approximation algorithm and should be used when running on large data sets. The approximation is based on the Count-Min-Sketch algorithm.
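To make the Count-Min-Sketch idea concrete, here is a minimal Python sketch of the general technique. It is not the engine's implementation: the class name `CountMinSketch`, the helper `approx_top_hitters`, the default `width` and `depth`, and the choice of BLAKE2b as the hash are all assumptions made for illustration. Each key is hashed into one counter per row, and the estimate is the minimum of those counters, so for non-negative weights it can overcount but never undercount.

```python
import hashlib
import heapq


class CountMinSketch:
    """Fixed-size table of counters (hypothetical illustration)."""

    def __init__(self, width=2048, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # One hash function per row, derived by salting with the row number.
        digest = hashlib.blake2b(
            key.encode("utf-8"), digest_size=8, salt=row.to_bytes(8, "little")
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, key, weight=1):
        for row in range(self.depth):
            self.rows[row][self._index(row, key)] += weight

    def estimate(self, key):
        # Collisions can only inflate a counter, so the minimum is the tightest bound.
        return min(self.rows[row][self._index(row, key)] for row in range(self.depth))


def approx_top_hitters(pairs, n, width=2048, depth=4):
    """Return the n keys with the largest estimated total weight."""
    sketch = CountMinSketch(width, depth)
    keys = set()  # simplification: this sketch remembers every distinct key
    for key, weight in pairs:
        sketch.add(key, weight)
        keys.add(key)
    return heapq.nlargest(n, ((k, sketch.estimate(k)) for k in keys), key=lambda kv: kv[1])


sample = [("en", 1)] * 1000 + [("zh", 1)] * 200 + [("de", 1)] * 150 + [("ru", 1)] * 20
print(approx_top_hitters(sample, 2))
```

Keeping the full set of keys is a shortcut for readability. A memory-bounded variant would track only a small candidate set alongside the sketch, which works well precisely when the input is skewed.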


T | top-hitters NumberOfRows of sort_key [ by expression ]


  • NumberOfRows: The number of rows of T to return. You can specify any numeric expression.
  • sort_key: The name of the column by which to sort the rows.
  • expression: (optional) An expression used for the top-hitters estimation.
    • If expression is specified, top-hitters returns NumberOfRows rows that have the approximate maximum of sum(expression). The expression can be a column or any other expression that evaluates to a number.
    • If expression is omitted, top-hitters counts the occurrences of sort_key.
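The two modes described above can be illustrated with an exact reference computation in Python. This is only a sketch of the semantics that top-hitters approximates, not of how the operator works internally. The function name `exact_top` and the dictionary-based row format are assumptions made for this example.

```python
from collections import Counter


def exact_top(rows, n, sort_key, expression=None):
    """Exact counterpart of the result that top-hitters approximates.

    rows: iterable of dicts (hypothetical row format).
    expression: optional function mapping a row to a number.
    """
    totals = Counter()
    for row in rows:
        # Without an expression, count occurrences; otherwise sum the expression.
        totals[row[sort_key]] += 1 if expression is None else expression(row)
    return totals.most_common(n)


rows = [
    {"Page": "A", "Views": 5},
    {"Page": "B", "Views": 1},
    {"Page": "B", "Views": 1},
    {"Page": "B", "Views": 1},
]
print(exact_top(rows, 1, "Page"))                       # by occurrence count
print(exact_top(rows, 1, "Page", lambda r: r["Views"]))  # by sum of Views
```

With these rows, page B wins by occurrence count (3 rows) while page A wins by summed Views (5), which shows why the optional expression can change the result.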


Get most frequent items

The next example shows how to find the top 5 languages with the most pages in Wikipedia (accessed during April 2016).

T
| where Timestamp > datetime(2016-04-01) and Timestamp < datetime(2016-05-01)
| top-hitters 5 of Language
Language  approximate_count_Language
en        1539954127
zh        339827659
de        262197491
ru        227003107
fr        207943448

Get top hitters based on column value

The next example shows how to find the most viewed English pages of Wikipedia in 2016. The query uses 'Views' (an integer) to calculate page popularity (number of views).

T
| where Timestamp > datetime(2016-01-01)
| where Language == "en"
| where Page !has 'Special'
| top-hitters 10 of Page by Views
Page                          approximate_sum_Views
Main_Page                     1325856754
Web_scraping                  43979153
Java_(programming_language)   16489491
United_States                 13928841
Wikipedia                     13584915
Donald_Trump                  12376448
YouTube                       11917252
The_Revenant_(2015_film)      10714263
Star_Wars:_The_Force_Awakens  9770653
Portal:Current_events         9578000