Hello everyone,
I’m hitting a problem when running distinct() on a time-series collection that has an index on the key.
On a regular collection, the planner finds the DISTINCT_SCAN plan:
test> db.test_index_skipping.getIndexKeys()
[ { _id: 1 }, { attributes: 1 } ]
test> db.test_index_skipping.explain().distinct('attributes')
{
  explainVersion: '1',
  queryPlanner: {
    namespace: 'test.test_index_skipping',
    indexFilterSet: false,
    parsedQuery: {},
    queryHash: '2211C6E0',
    planCacheKey: 'E10B2080',
    maxIndexedOrSolutionsReached: false,
    maxIndexedAndSolutionsReached: false,
    maxScansToExplodeReached: false,
    winningPlan: {
      stage: 'PROJECTION_COVERED',
      transformBy: {},
      inputStage: {
        stage: 'DISTINCT_SCAN',
        keyPattern: { attributes: 1 },
        indexName: 'attributes_1',
        ...
But on a time-series collection the index is not used, and the planner seems to fall back to COLLSCAN:
test> db.createCollection('timeseries_index_skipping', { timeseries: { timeField: "timestamp", metaField: "metadata" } })
{ ok: 1 }
... insert some stuff ...
test> db.timeseries_index_skipping.createIndex({ 'metadata.attributes': 1 })
test> db.timeseries_index_skipping.explain().distinct('metadata.attributes')
{
  explainVersion: '1',
  stages: [
    {
      '$cursor': {
        queryPlanner: {
          namespace: 'test.system.buckets.timeseries_index_skipping',
          indexFilterSet: false,
          parsedQuery: {},
          queryHash: '8B3D4AB8',
          planCacheKey: 'D542626C',
          maxIndexedOrSolutionsReached: false,
          maxIndexedAndSolutionsReached: false,
          maxScansToExplodeReached: false,
          winningPlan: { stage: 'COLLSCAN', direction: 'forward' },
          rejectedPlans: []
        }
      }
    },
    {
      '$_internalUnpackBucket': {
        include: [ 'metadata' ],
        timeField: 'timestamp',
        metaField: 'metadata',
        bucketMaxSpanSeconds: 3600
      }
    },
    ...
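In case it helps anyone compare the two explain outputs without reading them by eye, here is a minimal sketch of a helper that lists the stage names of the winning plan. The function name collectStages is my own (it is not part of mongosh), and it assumes the explainVersion '1' shapes shown above: either a top-level queryPlanner.winningPlan, or a stages array whose '$cursor' entries carry their own queryPlanner.

```javascript
// Hypothetical helper (not a mongosh built-in): returns the names of all
// plan stages found in the winning plan(s) of an explain() result.
// Assumes the explainVersion '1' layout shown in this post.
function collectStages(explainOutput) {
  const found = [];

  // Walk a plan tree depth-first, following inputStage / inputStages links.
  const walkPlan = (plan) => {
    if (!plan || typeof plan !== 'object') return;
    if (typeof plan.stage === 'string') found.push(plan.stage);
    if (plan.inputStage) walkPlan(plan.inputStage);
    if (Array.isArray(plan.inputStages)) plan.inputStages.forEach(walkPlan);
  };

  // Shape 1: regular collection, plan sits directly under queryPlanner.
  if (explainOutput.queryPlanner) {
    walkPlan(explainOutput.queryPlanner.winningPlan);
  }

  // Shape 2: pipeline-style explain, plan sits under stages[i].$cursor.
  for (const stage of explainOutput.stages || []) {
    const cursor = stage.$cursor;
    if (cursor && cursor.queryPlanner) {
      walkPlan(cursor.queryPlanner.winningPlan);
    }
  }

  return found;
}
```

Pasting the function into mongosh and calling collectStages(db.timeseries_index_skipping.explain().distinct('metadata.attributes')) should make it easy to check whether DISTINCT_SCAN or COLLSCAN appears for each collection.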
I searched JIRA for an existing bug report but couldn’t find anything. Am I doing something wrong?
Thanks!