Azure Cosmos DB for MongoDB vCore 中基于方案的索引编制指南
适用对象: MongoDB vCore
索引是提高数据检索速度的结构,可用于快速访问集合中的特定字段。 本文介绍如何在不同的嵌套级别执行索引编制,并复习如何有效地审查这些索引的利用率。
索引编制方案
我们将根据定义的示例 json 的上下文来研究示例方案。
{
"_id": "e79b564e-48b1-4f75-990f-e62de2449239",
"car_id":"AZ-9874532",
"car_info": {
"make": "Mustang",
"model": "GT Fastback",
"year": 2024,
"registration": {
"license_plate": "LJX386",
"state": "WV",
"registration_datetime": {
"$date": "2024-01-10T01:16:44.000Z"
},
"expiration_datetime": {
"$date": "2034-01-10T01:16:44.000Z"
}
}
},
"rental_history": [
{
"rental_id": "RT63857499825952",
"customer_id": "CX8716",
"start_date": {
"$date": "2024-02-29T01:16:44.000Z"
},
"end_date": {
"$date": "2024-03-04T16:54:44.000Z"
},
"pickup_location": { "type": "Point", "coordinates": [ -73.97, 40.77 ]
},
"drop_location": { "type": "Point", "coordinates": [ -73.96, 40.78 ]
},
"total_price": 232.56944444444443,
"daily_rent": 50,
"complains": [
{
"complain_id": "CMP638574998259520",
"issue": "Strange odor inside the car.",
"reported_datetime": {
"$date": "2024-03-03T20:11:44.000Z"
},
"reported_medium": "Website",
"resolutions": [
{
"resolution_datetime": {
"$date": "2024-03-03T20:20:44.000Z"
},
"solution": "Inspect for any leftover food, spills, or trash that might be causing the odor. Contact the rental agency.",
"resolved": true
}
]
}
],
"accidents": [
{
"accident_id": "ACC376184",
"date": {
"$date": "2024-03-03T01:47:44.000Z"
},
"description": "Collisions with Soft Barriers: Accidents involving hitting bushes, shrubs, or other soft barriers.",
"repair_cost": 147
}
]
},
{
"rental_id": "RT63857499825954",
"customer_id": "CX1412",
"start_date": {
"$date": "2033-11-18T01:16:44.000Z"
},
"end_date": {
"$date": "2033-11-25T21:11:44.000Z"
},
"pickup_location": { "type": "Point", "coordinates": [ 40, 5 ]
},
"drop_location": { "type": "Point", "coordinates": [ 41, 11 ]
},
"total_price": 305.3645833333333,
"daily_rent": 39,
"complains": [
{
"complain_id": "CMP638574998259540",
"issue": "Unresponsive infotainment system.",
"reported_datetime": {
"$date": "2033-11-19T17:55:44.000Z"
},
"reported_medium": "Agency",
"resolutions": []
}
],
"accidents": null
}
],
"junk": null
}
为根字段编制索引
Azure Cosmos DB for MongoDB vCore 允许对根属性编制索引。 该示例允许按照 car_id
搜索 sampleColl
。
CarData> db.sampleColl.createIndex({"car_id":1})
执行计划允许使用 explain
审查对 car_id
字段创建的索引的利用率。
CarData> db.sampleColl.find({"car_id":"ZA-XWB804"}).explain()
{
explainVersion: 2,
command: "db.runCommand({explain: { 'find' : 'sampleColl', 'filter' : { 'car_id' : 'ZA-XWB804' } }})",
explainCommandPlanningTimeMillis: 0.156,
explainCommandExecTimeMillis: 37.956,
dataSize: '32 kB',
queryPlanner: {
namespace: 'CarData.sampleColl',
winningPlan: {
stage: 'FETCH',
estimatedTotalKeysExamined: 8700,
inputStage: {
stage: 'IXSCAN',
indexName: 'car_id_1',
isBitmap: true,
indexFilterSet: [ { '$eq': { car_id: 'ZA-XWB804' } } ],
estimatedTotalKeysExamined: 174
}
}
},
ok: 1
}
为嵌套属性编制索引
Azure Cosmos DB for MongoDB vCore 允许为嵌入的文档属性编制索引。 该示例对嵌套文档 registration
中的字段 registration_datetime
创建索引。
CarData> db.sampleColl.createIndex({"car_info.registration.registration_datetime":1})
使用 explain
的审查执行计划可以提供索引扫描的见解。
CarData> db.sampleColl.find({"car_info.registration.registration_datetime":
{ $gte : new ISODate("2024-05-01")
,$lt: ISODate("2024-05-07")
}
}).explain()
{
explainVersion: 2,
command: "db.runCommand({explain: { 'find' : 'sampleColl', 'filter' : { 'car_info.registration.registration_datetime' : { '$gte' : ISODate('2024-05-01T00:00:00Z'), '$lt' : ISODate('2024-05-07T00:00:00Z') } } }})",
explainCommandPlanningTimeMillis: 0.095,
explainCommandExecTimeMillis: 42.703,
dataSize: '4087 kB',
queryPlanner: {
namespace: 'CarData.sampleColl',
winningPlan: {
stage: 'FETCH',
estimatedTotalKeysExamined: 4350,
inputStage: {
stage: 'IXSCAN',
indexName: 'car_info.registration.registration_datetime_1',
isBitmap: true,
indexFilterSet: [
{
'$range': {
'car_info.registration.registration_datetime': {
min: ISODate("2024-05-01T00:00:00.000Z"),
max: ISODate("2024-05-07T00:00:00.000Z"),
minInclusive: true,
maxInclusive: false
}
}
}
],
estimatedTotalKeysExamined: 2
}
}
},
ok: 1
}
为根处的数组编制索引
Azure Cosmos DB for MongoDB vCore 允许为定义为数组的根属性编制索引。 让我们考虑以下 json 示例。
{
"_id": ObjectId("58f56170ee9d4bd5e610d644"),
"id": 1,
"num": 001,
"name": "Bulbasaur",
"img": "http://www.serebii.net/pokemongo/pokemon/001.png",
"type": [ 'Grass', 'Poison' ],
"height": '0.71 m',
"weight": '6.9 kg',
"avg_spawns": 69,
"spawn_time": "20:00",
"multipliers": [ 1.58 ],
"weaknesses": [ "Fire", "Ice", "Flying", "Psychic"],
"next_evolution": [ { "num": "002", "name": "Ivysaur" }, { "num": "003", "name": "Venusaur" }]
}
在本示例中,我们对 weaknesses
数组字段创建索引,并检查数组中的所有三个值是否存在:Ground
、Water
和 Fire
。
Cosmicworks> db.Pokemon.createIndex({'weaknesses':1})
Cosmicworks> db.Pokemon.find({"weaknesses":
{$all:["Ground","Water","Fire"]}
}
).explain()
{
explainVersion: 2,
command: "db.runCommand({explain: { 'find' : 'Pokemon', 'filter' : { 'weaknesses' : { '$all' : ['Ground', 'Water', 'Fire'] } } }})",
explainCommandPlanningTimeMillis: 10.161,
explainCommandExecTimeMillis: 21.64,
dataSize: '906 bytes',
queryPlanner: {
namespace: 'Cosmicworks.Pokemon',
winningPlan: {
stage: 'FETCH',
estimatedTotalKeysExamined: 50,
inputStage: {
stage: 'IXSCAN',
indexName: 'weaknesses_1',
isBitmap: true,
indexFilterSet: [
{ '$all': { weaknesses: [ 'Ground', 'Water', 'Fire' ] } }
],
estimatedTotalKeysExamined: 2
}
}
},
ok: 1
}
注意
对于 MongoServerError:索引键太大。
请创建支持请求以启用后台索引编制,然后设置 enableLargeIndexKeys
db.runCommand({ createIndexes: "collectionName", indexes: [{ {"index_spec"}], enableLargeIndexKeys: true });
为嵌套数组编制索引
Azure Cosmos DB for MongoDB vCore 允许为嵌套数组编制索引。 该示例对 complains
数组中的 resolutions
字段创建索引。
CarData> db.sampleColl.createIndex({"rental_history.complains.resolutions":1})
我们使用 explain
审查计划,以识别所有租赁情况,但不向客户提供解决方案。
CarData> db.sampleColl.find({"rental_history.complains.resolutions":{ $exists: false, $ne: []}}).explain()
{
explainVersion: 2,
command: "db.runCommand({explain: { 'find' : 'sampleColl', 'filter' : { 'rental_history.complains.resolutions' : { '$exists' : false, '$ne' : [] } } }})",
explainCommandPlanningTimeMillis: 0.12,
explainCommandExecTimeMillis: 48.721000000000004,
dataSize: '1747 kB',
queryPlanner: {
namespace: 'CarData.sampleColl',
winningPlan: {
stage: 'FETCH',
estimatedTotalKeysExamined: 1933,
inputStage: {
stage: 'IXSCAN',
indexName: 'rental_history.complains.resolutions_1',
isBitmap: true,
indexFilterSet: [
{
'$exists': { 'rental_history.complains.resolutions': false }
},
{ '$ne': { 'rental_history.complains.resolutions': [] } }
],
estimatedTotalKeysExamined: 2
}
}
},
ok: 1
}
为数组中的特定字段编制索引
Azure Cosmos DB for MongoDB vCore 允许为数组中的字段编制索引。 该示例对 accidents
数组中的 date
字段创建索引。
CarData> db.sampleColl.createIndex({"rental_history.accidents.date":1})
示例查询评估某个时间范围内的事件,并显示正在使用的、对 date
属性创建的索引。
CarData> db.sampleColl.find({"rental_history.accidents.date":
{ $gte : ISODate("2024-05-01")
, $lt : ISODate("2024-05-07")
}
}).explain()
{
explainVersion: 2,
command: "db.runCommand({explain: { 'find' : 'sampleColl', 'filter' : { 'rental_history.accidents.date' : { '$gte' : ISODate('2024-05-01T00:00:00Z'), '$lt' : ISODate('2024-05-07T00:00:00Z') } } }})",
explainCommandPlanningTimeMillis: 19.816,
explainCommandExecTimeMillis: 48.359,
dataSize: '12 MB',
queryPlanner: {
namespace: 'CarData.sampleColl',
winningPlan: {
stage: 'FETCH',
estimatedTotalKeysExamined: 4350,
inputStage: {
stage: 'IXSCAN',
indexName: 'rental_history.accidents.date_1',
isBitmap: true,
indexFilterSet: [
{
'$range': {
'rental_history.accidents.date': {
min: ISODate("2024-05-01T00:00:00.000Z"),
max: ISODate("2024-05-07T00:00:00.000Z"),
minInclusive: true,
maxInclusive: false
}
}
}
],
estimatedTotalKeysExamined: 2
}
}
},
ok: 1
}
注意
我们目前正在增强对嵌套数组的支持。 在某些极端情况下,特定的索引编制操作可能会导致错误。
在排除嵌套字段的同时进行通配符索引编制
Azure Cosmos DB for MongoDB vCore 支持通配符索引。 该示例允许我们排除对文档 car_info
中所有嵌套字段的索引编制。
// Excludes all the nested sub-document property
CarData> db.sampleColl.createIndex( {"$**":1}
,{"wildcardProjection":
{ "car_info.make":0
,"car_info.model":0
,"car_info.registration":0
,"car_info.year":0
,"rental_history":0
}
}
)
执行计划会显示不支持对 model
字段执行的查询,该字段在创建通配符索引时已被排除。
CarData> db.sampleColl.find({"car_info.model":"GT Fastback"}).explain()
{
explainVersion: 2,
command: "db.runCommand({explain: { 'find' : 'sampleColl', 'filter' : { 'car_info.model' : 'GT Fastback' } }})",
explainCommandPlanningTimeMillis: 10.879,
explainCommandExecTimeMillis: 374.25100000000003,
dataSize: '0 bytes',
queryPlanner: {
namespace: 'CarData.sampleColl',
winningPlan: {
stage: 'COLLSCAN',
runtimeFilterSet: [ { '$eq': { 'car_info.model': 'GT Fastback' } } ],
estimatedTotalKeysExamined: 8700
}
},
ok: 1
}
在排除嵌套对象的同时进行通配符索引编制
Azure Cosmos DB for MongoDB vCore 支持通配符索引。 该示例允许我们从文档中排除嵌套对象。
// Wildcard index excluding nested object
[mongos] CarData> db.sampleColl.createIndex( {"$**":1},
{"wildcardProjection":
{ "car_info":0
,"rental_history":0
}
}
)
执行计划显示不支持对 car_info
文档中嵌套字段 make
执行的查询。
CarData> db.sampleColl.find({"car_info.make":"Mustang"}).explain()
{
explainVersion: 2,
command: "db.runCommand({explain: { 'find' : 'sampleColl', 'filter' : { 'car_info.make' : 'Mustang' } }})",
explainCommandPlanningTimeMillis: 21.271,
explainCommandExecTimeMillis: 337.475,
dataSize: '0 bytes',
queryPlanner: {
namespace: 'CarData.sampleColl',
winningPlan: {
stage: 'COLLSCAN',
runtimeFilterSet: [ { '$eq': { 'car_info.make': 'Mustang' } } ],
estimatedTotalKeysExamined: 8700
}
},
ok: 1
}
在排除包含嵌套数组的字段的同时进行通配符索引编制
该通配符索引示例允许从嵌套数组中排除字段。 我们使用 pokemon
集合,其中突出显示了 json 格式。
{
{
"_id": ObjectId("58f56170ee9d4bd5e610d644"),
"id": 1,
"num": 001,
"name": "Bulbasaur",
"img": "http://www.serebii.net/pokemongo/pokemon/001.png",
"type": [ 'Grass', 'Poison' ],
"height": '0.71 m',
"weight": '6.9 kg',
"avg_spawns": 69,
"spawn_time": "20:00",
"multipliers": [ 1.58 ],
"weaknesses": [ "Fire", "Ice", "Flying", "Psychic"],
"next_evolution": [ { "num": "002", "name": "Ivysaur" }, { "num": "003", "name": "Venusaur" }]
}
}
我们将针对 json 中的所有字段创建索引,同时从数组中排除 num
和 name
字段。
Cosmicworks> db.Pokemon.createIndex( {"$**":1},
{"wildcardProjection":
{ "id":0
,"name":0
,"multipliers":0
,"next_evolution.num":0
,"next_evolution.name":0
}
}
)
解释计划在查询 next_evolution
数组中的 name
字段时不会显示索引利用率。
Cosmicworks> db.Pokemon.find({"next_evolution.name":"Venusaur"}).explain()
{
explainVersion: 2,
command: "db.runCommand({explain: { 'find' : 'Pokemon', 'filter' : { 'next_evolution.name' : 'Venusaur' } }})",
explainCommandPlanningTimeMillis: 0.799,
explainCommandExecTimeMillis: 0.869,
dataSize: '1090 bytes',
queryPlanner: {
namespace: 'Cosmicworks.Pokemon',
winningPlan: {
stage: 'COLLSCAN',
runtimeFilterSet: [ { '$eq': { 'next_evolution.name': 'Venusaur' } } ],
estimatedTotalKeysExamined: 76
}
},
ok: 1
}