Hi, could you help find out why the projection stage doesn’t use an index that contains all the fields to be included? And any advice on how to avoid full scan to get the projection.
Objects have the following schema:
test> db.events.find({}).limit(3)
[
{
_id: ObjectId("63a086c116b1985489f8ccaf"),
data: { x: 'X' },
meta: {
key: 'nk0',
schema: { tenant: 't1', name: 'type#0', version: '2.0.22' }
}
},
{
_id: ObjectId("63a086c116b1985489f8ccb0"),
data: { x: 'X' },
meta: {
key: 'nk1',
schema: { tenant: 't1', name: 'type#1', version: '2.0.22' }
}
},
{
_id: ObjectId("63a086c116b1985489f8ccb1"),
data: { x: 'X' },
meta: {
key: 'nk2',
schema: { tenant: 't1', name: 'type#4', version: '2.0.22' }
}
}
]
I’ve created a unique compound index that includes ‘meta.key’, ‘meta.schema.name’, ‘meta.schema.tenant’
test> db.events.getIndices()
[
{ v: 2, key: { _id: 1 }, name: '_id_' },
{
v: 2,
key: { 'meta.key': 1, 'meta.schema.name': 1, 'meta.schema.tenant': 1 },
name: 'meta.key_1_meta.schema.name_1_meta.schema.tenant_1',
unique: true
}
]
I want to extract these fields into a separate collection. I expect that aggregation pipeline will use the index to retrieve data, but it doesn’t.
test> db.events.explain().aggregate([{$project: {'_id':0, 'meta.key': 1, 'meta.schema.name': 1, 'meta.schema.tenant': 1}}, {$out: 'out_test'}])
{
explainVersion: '1',
stages: [
{
'$cursor': {
queryPlanner: {
namespace: 'test.events',
indexFilterSet: false,
parsedQuery: {},
queryHash: 'E7D9C058',
planCacheKey: 'E7D9C058',
maxIndexedOrSolutionsReached: false,
maxIndexedAndSolutionsReached: false,
maxScansToExplodeReached: false,
winningPlan: {
stage: 'PROJECTION_DEFAULT',
transformBy: {
meta: { key: true, schema: { tenant: true, name: true } },
_id: false
},
inputStage: { stage: 'COLLSCAN', direction: 'forward' }
},
rejectedPlans: []
}
}
},
{ '$out': { db: 'test', coll: 'out_test' } }
],
serverInfo: {
host: '02295ffc4e95',
port: 27017,
version: '6.0.2',
gitVersion: '94fb7dfc8b974f1f5343e7ea394d0d9deedba50e'
},
serverParameters: {
internalQueryFacetBufferSizeBytes: 104857600,
internalQueryFacetMaxOutputDocSizeBytes: 104857600,
internalLookupStageIntermediateDocumentMaxSizeBytes: 104857600,
internalDocumentSourceGroupMaxMemoryBytes: 104857600,
internalQueryMaxBlockingSortMemoryUsageBytes: 104857600,
internalQueryProhibitBlockingMergeOnMongoS: 0,
internalQueryMaxAddToSetBytes: 104857600,
internalDocumentSourceSetWindowFieldsMaxMemoryBytes: 104857600
},
command: {
aggregate: 'events',
pipeline: [
{
'$project': {
_id: 0,
'meta.key': 1,
'meta.schema.name': 1,
'meta.schema.tenant': 1
}
},
{ '$out': 'out_test' }
],
cursor: {},
'$db': 'test'
},
ok: 1
}
What is wrong?
Using MongoDB: 6.0.2
Using Mongosh: 1.5.0