active_users_count 插件active_users_count plugin

计算值的非重复计数,其中每个值都至少已在回看期内最低数量的时间段中出现。Calculates distinct count of values, where each value has appeared in at least a minimum number of periods in a lookback period.

仅适用于计算“粉丝”的非重复计数,而不包括“非粉丝”的出现。Useful for calculating distinct counts of "fans" only, while not including appearances of "non-fans". 仅当用户在回看期中处于活跃状态时,该用户才会被计为“粉丝”。A user is counted as a "fan" only if it was active during the lookback period. 回看期仅用于确定用户是否被视为 active(“粉丝”)。The lookback period is only used to determine whether a user is considered active ("fan") or not. 聚合本身不包括来自回看窗口的用户。The aggregation itself doesn't include users from the lookback window. 相比之下,sliding_window_counts 聚合针对回看期的滑动窗口执行。In comparison, the sliding_window_counts aggregation is performed over a sliding window of the lookback period.

T | evaluate active_users_count(id, datetime_column, startofday(ago(30d)), startofday(now()), 7d, 1d, 2, 7d, dim1, dim2, dim3)

语法Syntax

T | evaluate active_users_count(IdColumn, TimelineColumn, Start, End, LookbackWindow, Period, ActivePeriodsCount, Bin , [ dim1, dim2, ...])T | evaluate active_users_count(IdColumn, TimelineColumn, Start, End, LookbackWindow, Period, ActivePeriodsCount, Bin , [ dim1, dim2, ...])

参数Arguments

  • T :输入表格表达式。T : The input tabular expression.
  • IdColumn :列的名称,其 ID 值表示用户活跃度。IdColumn : The name of the column with ID values that represent user activity.
  • TimelineColumn :表示时间线的列的名称。TimelineColumn : The name of the column that represents timeline.
  • Start :(可选)带有分析开始时间段值的标量。Start : (optional) Scalar with value of the analysis start period.
  • End :(可选)带有分析结束时间段值的标量。End : (optional) Scalar with value of the analysis end period.
  • LookbackWindow :滑动时间窗口,限定检查用户出现情况的时间段。LookbackWindow : A sliding time window defining a period where user appearance is checked. 回看期开始于([当前出现] - [回看窗口]),结束于([当前出现])。Lookback period starts at ([current appearance] - [lookback window]) and ends on ([current appearance]).
  • Period :要计为单次出现的标量常数时间跨度(如果用户在此时间跨度的至少非重复的 ActivePeriodsCount 个时间段中出现,该用户将会被计为活跃)。Period : Scalar constant timespan to count as single appearance (a user will be counted as active if it appears in at least distinct ActivePeriodsCount of this timespan.
  • ActivePeriodsCount :非重复的活跃时间段的最小数目,用于确定用户是否处于活跃状态。ActivePeriodsCount : Minimal number of distinct active periods to decide if user is active. 活跃用户是指那些至少在(等于或大于)最低数量的活跃时间段中出现的用户。Active users are those users who appeared in at least (equal or greater than) active periods count.
  • Bin :分析步骤时间段的标量常数值。Bin : Scalar constant value of the analysis step period. 可以是数字/日期/时间/时间戳值,也可以是 week/month/year 字符串。Can be a numeric/datetime/timestamp value, or a string that is week/month/year. 所有时间段都将会是相应的 startofweek/startofmonth/startofyear 函数。All periods will be the corresponding startofweek/startofmonth/startofyear functions.
  • dim1 , dim2 , ...:(可选)维度列的列表,用于切分活跃度指标计算。dim1 , dim2 , ...: (optional) list of the dimensions columns that slice the activity metrics calculation.

返回Returns

返回一个表,该表包含 ID 的非重复计数值,这些 ID 已在以下时间段中的 ActivePeriodCounts 个时间段中出现:回看期、每个时间线时间段,以及每个现有的维度组合。Returns a table that has distinct count values for IDs that have appeared in ActivePeriodCounts in the following periods: the lookback period, each timeline period, and each existing dimensions combination.

输出表架构如下:Output table schema is:

TimelineColumnTimelineColumn dim1dim1 .... dim_ndim_n dcount_valuesdcount_values
类型:自 TimelineColumn 起type: as of TimelineColumn .... .... .... longlong

示例Examples

计算每周的非重复用户数,这些用户在前八天的时间段内有至少不同三天都出现过。Calculate weekly number of distinct users that appeared in at least three different days over a period of prior eight days. 分析的时间段:2018 年 7 月。Period of analysis: July 2018.

let Start = datetime(2018-07-01);
let End = datetime(2018-07-31);
let LookbackWindow = 8d;
let Period = 1d;
let ActivePeriods = 3;
let Bin = 7d; 
let T =  datatable(User:string, Timestamp:datetime)
[
    "B",      datetime(2018-06-29),
    "B",      datetime(2018-06-30),
    "A",      datetime(2018-07-02),
    "B",      datetime(2018-07-04),
    "B",      datetime(2018-07-08),
    "A",      datetime(2018-07-10),
    "A",      datetime(2018-07-14),
    "A",      datetime(2018-07-17),
    "A",      datetime(2018-07-20),
    "B",      datetime(2018-07-24)
]; 
T | evaluate active_users_count(User, Timestamp, Start, End, LookbackWindow, Period, ActivePeriods, Bin)



TimestampTimestamp dcount
2018-07-01 00:00:00.00000002018-07-01 00:00:00.0000000 11
2018-07-15 00:00:00.00000002018-07-15 00:00:00.0000000 11

如果以下两个条件都满足,则用户会被视为处于活跃状态:A user is considered active if it fulfills both of the following criteria:

  • 在至少三个不同日期(Period = 1d,ActivePeriods = 3)中发现该用户。The user was seen in at least three distinct days (Period = 1d, ActivePeriods=3).
  • 在 8 天的回看窗口(早于并包括当前这次出现)中发现该用户。The user was seen in a lookback window of 8d before and including their current appearance.

在下图中,符合此活跃状态条件的出现只有以下实例:用户 A 在 7 月 20 日出现以及用户 B 在 7 月 4 日出现(请参阅上述插件结果)。In the illustration below, the only appearances that are active by this criteria are the following instances: User A on 7/20 and User B on 7/4 (see plugin results above). 用户 B 的出现包括在 7 月 4 日的回看窗口中,但不包括在 6 月 29 日至 30 日的 Start-End 时间范围内。The appearances of User B are included for the lookback window on 7/4, but not for the Start-End time range of 6/29-30.

活跃用户计数示例