$fill

$fill 阶段用于在聚合管道内的文档中填充缺失值或 null 值。 它提供了各种方法来填充缺失的数据,其中包括使用静态值、线性内插或上一/下一文档中的值。

语法

{
  $fill: {
    sortBy: <sort specification>,
    partitionBy: <partition fields>,
    partitionByFields: <array of partition field names>,
    output: {
      <field1>: { value: <expression> },
      <field2>: { method: <string> }
    }
  }
}

参数

参数 DESCRIPTION
sortBy 应用依赖于文档顺序的填充方法时,指定文档的排序顺序。
partitionBy 可选。 将文档分组到分区中。 填充操作分别在每个分区中应用。
partitionByFields 可选。 使用字段名称数组的 partitionBy 的替代语法。
output 指定要填充的字段以及用于填充缺失数据的方法和值。

填充方法

方法 DESCRIPTION
value 使用指定的静态值或表达式结果进行填充。
linear 使用已知值之间的线性内插进行填充(仅限数值字段)。
locf 末次观测值结转 - 使用上次的已知值。
linear 周围值之间的线性内插。

例子

请考虑商店集合中的这个示例文档。

{
  "_id": "2cf3f885-9962-4b67-a172-aa9039e9ae2f",
  "name": "First Up Consultants | Bed and Bath Center - South Amir",
  "location": {
    "lat": 60.7954,
    "lon": -142.0012
  },
  "staff": {
    "totalStaff": {
      "fullTime": 18,
      "partTime": 17
    }
  },
  "sales": {
    "totalSales": 37701,
    "salesByCategory": [
      {
        "categoryName": "Mattress Toppers",
        "totalSales": 37701
      }
    ]
  },
  "promotionEvents": [
    {
      "eventName": "Price Drop Palooza",
      "promotionalDates": {
        "startDate": {
          "Year": 2024,
          "Month": 9,
          "Day": 21
        },
        "endDate": {
          "Year": 2024,
          "Month": 9,
          "Day": 30
        }
      },
      "discounts": [
        {
          "categoryName": "Bath Accessories",
          "discountPercentage": 18
        },
        {
          "categoryName": "Pillow Top Mattresses",
          "discountPercentage": 17
        }
      ]
    }
  ]
}

示例 1:使用静态值填充缺失值

此查询用默认值 0 填充数组中的totalSales缺失salesByCategory值。

db.stores.aggregate([{
    $match: {
        company: {
            $in: ["First Up Consultants"]
        }
    }
}, {
    $unwind: "$sales.salesByCategory"
}, {
    $fill: {
        output: {
            "sales.salesByCategory.totalSales": {
                value: 0
            }
        }
    }
}, {
    $group: {
        _id: "$_id",
        name: {
            $first: "$name"
        },
        salesByCategory: {
            $push: "$sales.salesByCategory"
        }
    }
}])

此查询返回的前两个结果为:

[
    {
        "_id": "affdc09c-7356-4fff-a857-e8301f57159c",
        "name": "First Up Consultants | Sports Gear Pantry - Wildermanhaven",
        "salesByCategory": [
            {
                "categoryName": "Baseball Gear",
                "totalSales": 33878
            },
            {
                "categoryName": "Volleyball Gear",
                "totalSales": 34031
            }
        ]
    },
    {
        "_id": "1cf667b4-d8ce-4f1a-bad1-a1f0bbce26c2",
        "name": "First Up Consultants | Picture Frame Variety - New Abrahamborough",
        "salesByCategory": [
            {
                "categoryName": "Picture Hanging Supplies",
                "totalSales": 7229
            },
            {
                "categoryName": "Collage Frames",
                "totalSales": 40014
            }
        ]
    }
]

示例 2:使用末次观测值结转填充缺失的员工数据

此查询使用每个商店组中的最后一个已知值填充缺少的兼职员工数据。

db.stores.aggregate([{
    $fill: {
        sortBy: {
            "_id": 1
        },
        output: {
            "staff.totalStaff.partTime": {
                method: "locf"
            }
        }
    }
}, {
    $project: {
        name: 1,
        "staff.totalStaff": 1
    }
}])

此查询返回的前两个结果为:

[
    {
        "_id": "00003278-4226-4ca7-871d-e80d8f414431",
        "name": "Wide World Importers | Camera Depot - Lake Luramouth",
        "staff": {
            "totalStaff": {
                "fullTime": 20,
                "partTime": 6
            }
        }
    },
    {
        "_id": "00009bd0-c44e-4cc8-ab03-347076d74a1a",
        "name": "Wide World Importers | Music Stop - Rebeccaside",
        "staff": {
            "totalStaff": {
                "fullTime": 9,
                "partTime": 0
            }
        }
    }
]

示例 3:使用平均值填充缺失的折扣百分比

此查询将填充缺少的折扣百分比,并且所有商店的平均折扣百分比。

db.stores.aggregate([
  { $unwind: "$promotionEvents" },
  { $unwind: "$promotionEvents.discounts" },
  {
    $fill: {
      partitionBy: "$promotionEvents.eventName",
      sortBy: { "promotionEvents.discounts.categoryName": 1 },
      output: {
        "promotionEvents.discounts.discountPercentage": { 
          value: { $avg: "$promotionEvents.discounts.discountPercentage" } 
        }
      }
    }
  },
  {
    $group: {
      _id: { storeId: "$_id", eventName: "$promotionEvents.eventName" },
      storeName: { $first: "$name" },
      eventName: { $first: "$promotionEvents.eventName" },
      discounts: { $push: "$promotionEvents.discounts" }
    }
  }
])

此查询返回的前两个结果为:

[
    {
        "_id": {
            "storeId": "70d4cc90-23b1-46e3-8f59-630648e311a4",
            "eventName": "Price Slash Spectacular"
        },
        "storeName": "Wide World Importers | Music Bazaar - West Johnpaulhaven",
        "eventName": "Price Slash Spectacular",
        "discounts": [
            {
                "categoryName": "CDs",
                "discountPercentage": 22
            },
            {
                "categoryName": "Vinyl Records",
                "discountPercentage": 21
            }
        ]
    },
    {
        "_id": {
            "storeId": "24873ac4-b2d1-4216-a425-3375a384b23d",
            "eventName": "Massive Deal Mania"
        },
        "storeName": "Northwind Traders | Furniture Pantry - Farrellchester",
        "eventName": "Massive Deal Mania",
        "discounts": [
            {
                "categoryName": "Bookcases",
                "discountPercentage": 22
            },
            {
                "categoryName": "Cabinets",
                "discountPercentage": 8
            }
        ]
    }
]

用例

  • 数据清理:填充导入的数据集中的缺失值
  • 时序数据:使用内插处理顺序数据中的间隙
  • 默认值:将默认值分配给可选字段
  • 数据规范化:确保跨文档的数据结构一致