Cohorts

A cohort analysis examines the outcomes of predetermined groups, called cohorts, as they progress through a set of stages. The signature characteristic of a cohort chart is that it compares the change in a variable across two different time series. For example, a common cohort definition groups users by sign-up period and tracks their usage pattern by day. Other examples include:

  • Monthly hard drive failure statistics by month
  • Weekly supplier delivery performance by week
  • Monthly average class GPAs by month

While there are many ways to define the stages of a cohort analysis, SQL Analytics supports cohort visualizations with daily, weekly, or monthly stages. SQL Analytics cohort charts also compare a cohort's measurements in a given period against that group's initial population size.

Data format

SQL Analytics expects your input samples to have the following fields:

  • Cohort Date: the date that uniquely identifies a cohort. Suppose you are visualizing monthly user activity by sign-up date. The cohort date for all users who signed up in January 2018 would be January 1st, 2018. The cohort date for any user who signed up in February would be February 1st, 2018.
  • Period: a count of how many periods have transpired since the cohort date as of this sample. If you are grouping users by sign-up month, then your period is the count of months since these users signed up. In the above example, a measurement of activity in July for users who signed up in January would yield a period value of 7 because seven periods have transpired between January and July.
  • Count Satisfying Target: your actual measurement of this cohort's performance in the given period. In the above example, if thirty users who signed up in January showed activity in July, then the Count Satisfying Target would be 30.
  • Total Cohort Size: the denominator that SQL Analytics uses to calculate the percentage of a cohort that satisfied the target in a given period. Continuing the example above, if seventy-two users signed up in January, then the Total Cohort Size would be 72. When the visualization is rendered, SQL Analytics would display the value as 41.67% (30 ÷ 72).
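
The four fields and the percentage arithmetic above can be sketched in Python. This is only an illustration of the calculation, not how SQL Analytics implements it: the `CohortRow` class and `satisfaction_percent` function are hypothetical names, and rounding to two decimal places is an assumption based on the 41.67% figure in the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class CohortRow:
    """One input sample with the four fields described above (hypothetical type)."""
    cohort_date: date              # full date identifying the cohort
    period: int                    # periods elapsed since the cohort date
    count_satisfying_target: int   # measurement for this cohort in this period
    total_cohort_size: int         # initial population of the cohort


def satisfaction_percent(row: CohortRow) -> float:
    """Return count / total as a percentage, rounded to two decimals (assumed)."""
    if row.total_cohort_size == 0:
        return 0.0  # avoid dividing by zero for an empty cohort
    return round(100.0 * row.count_satisfying_target / row.total_cohort_size, 2)


# The January 2018 cohort measured in July, using the numbers from the example.
january_in_july = CohortRow(
    cohort_date=date(2018, 1, 1),
    period=7,
    count_satisfying_target=30,
    total_cohort_size=72,
)
print(satisfaction_percent(january_in_july))  # 41.67
```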

Cohort date notes

Even if you define your cohorts by month or week, SQL Analytics expects the values in your Cohort Date column to be full date values. If you are grouping by month, 2018-01-18 should be shortened to 2018-01-01 or any other full date in January, not 2018-01.
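
One way to produce full-date cohort values is to truncate each date to the start of its month or week before it reaches the visualization. The sketch below uses hypothetical helper names, and treating Monday as the first day of the week is an assumption rather than something the text above specifies.

```python
from datetime import date, timedelta


def month_cohort_date(d: date) -> date:
    """Truncate a date to the first day of its month (still a full date)."""
    return d.replace(day=1)


def week_cohort_date(d: date) -> date:
    """Truncate a date to the Monday of its week (Monday start is an assumption)."""
    return d - timedelta(days=d.weekday())


signup = date(2018, 1, 18)
print(month_cohort_date(signup).isoformat())  # 2018-01-01
print(week_cohort_date(signup).isoformat())   # 2018-01-15
```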

The cohort visualizer converts all date and time values to GMT before rendering. To avoid rendering issues, you should adjust the date-time values returned from your database by your local UTC offset.
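
To see why the offset matters, the sketch below converts a naive local timestamp to UTC. It assumes the database returns timestamps without time zone information and that the local offset is a fixed number of hours; the `to_utc` helper is hypothetical, and which direction to adjust in depends on how your data is stored.

```python
from datetime import datetime, timedelta, timezone


def to_utc(naive_local: datetime, utc_offset_hours: float) -> datetime:
    """Interpret a naive timestamp at the given fixed UTC offset and convert to UTC."""
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    return naive_local.replace(tzinfo=local_tz).astimezone(timezone.utc)


# A sign-up late on January 31st at UTC-8 falls on February 1st in UTC,
# so it could land in a different monthly cohort if left unadjusted.
local_signup = datetime(2018, 1, 31, 20, 0)
print(to_utc(local_signup, -8).isoformat())  # 2018-02-01T04:00:00+00:00
```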