How do indexes find the main document in MongoDB?

Gilcemir_Filho · December 22, 2023, 10:15am

I have a question about indexes in MongoDB in general.

I read somewhere that every MongoDB index stores the _id field inside the index. This allows it to find the current document if the query satisfies the index. Consequently, I assumed that if I have a query that satisfies the index and projects the _id field at the end, it would result in a covered query. This is because the _id field is already present in the index document.

However, during testing, I discovered that this assumption was not true. MongoDB was fetching the document to complete the query. Let me provide you with an example.

I created the following index:

db.ManualLiquidationItems.createIndex({
  "ReturnId": 1,
  "Status": 1,
  "IsDeleted": 1,
  "CreationTime": 1
});

I then used the following query against the collection:


db.ManualLiquidationItems.aggregate([
    {
        "$match": {
            "IsDeleted": false,
            "Status": "Unprocessed",
            "ReturnId": UUID("3704f58c-ba38-4edd-9986-f37a0eaf619c")
        }
    },
    {
        "$sort": {
            "CreationTime": 1
        }
    },
    {
        "$project": {
            "_id": 1
        }
    }
])

I observed that this query was using the index, but it was also fetching the document from the disk.

Interestingly, when I projected the Status field to test, the query did not fetch the document, resulting in a covered query.
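For anyone who wants to reproduce the check: one way to tell whether a plan fetches documents is to look for a FETCH stage in the explain output. The helper below is a minimal sketch, not an official API. The name hasStage and the two sample plan objects are made up for illustration and are simplified by hand; real explain output has more fields and its layout varies between find(), aggregate() and server versions, which is why the helper walks every nested object instead of following a fixed path.

```javascript
// Recursively search an explain() result for a plan stage with the given name.
// Walking every nested value avoids depending on the exact explain layout.
function hasStage(node, stageName) {
  if (node === null || typeof node !== "object") return false;
  if (node.stage === stageName) return true;
  return Object.values(node).some((child) => hasStage(child, stageName));
}

// Hand-written, simplified plan shapes for illustration only (not real server output).
const fetchingPlan = {
  queryPlanner: {
    winningPlan: {
      stage: "PROJECTION_SIMPLE",
      inputStage: { stage: "FETCH", inputStage: { stage: "IXSCAN" } },
    },
  },
};
const coveredPlan = {
  queryPlanner: {
    winningPlan: { stage: "PROJECTION_COVERED", inputStage: { stage: "IXSCAN" } },
  },
};

console.log(hasStage(fetchingPlan, "FETCH")); // true
console.log(hasStage(coveredPlan, "FETCH"));  // false
```

In mongosh the same helper could be applied to a real plan, for example hasStage(db.ManualLiquidationItems.explain().aggregate(pipeline), "FETCH"), where pipeline is the array shown above.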

So, my question is: How do indexes find the main document in MongoDB if they are not using the main index _id?

Spencer_Brown · December 22, 2023, 11:00pm

You are correct that the index entry does not include the _id field by default. Indexes that expressly include _id as a key (which includes, of course, the always-present _id index) do have the _id field. Perhaps you are thinking of the recordID, which is an internal field in indexes used to find the actual document(s).

> So, my question is: How do indexes find the main document in MongoDB if they are not using the main index _id?

Indexes, including the always-present _id index, use the internal recordID to find the document(s) that the index entry points to.
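To make that concrete, here is a toy model in plain JavaScript. It is only a sketch of the idea and not how the storage engine is implemented: real indexes are B-trees, and the names records, buildIndex and project, as well as the sample documents, are invented for this example. Each index entry holds the indexed key values plus a recordId, and a projection can be answered from the index alone only when every projected field is one of the index keys.

```javascript
// Toy model only: record store maps an internal recordId to the full document.
const records = new Map([
  [1, { _id: "a", ReturnId: "r1", Status: "Unprocessed", IsDeleted: false, CreationTime: 10 }],
  [2, { _id: "b", ReturnId: "r1", Status: "Processed", IsDeleted: false, CreationTime: 20 }],
]);

// Build index entries: the indexed key values plus the recordId, nothing else.
function buildIndex(fields) {
  return [...records].map(([recordId, doc]) => ({
    key: Object.fromEntries(fields.map((f) => [f, doc[f]])),
    recordId,
  }));
}

// Answer a projection from the index keys when possible; otherwise follow the
// recordId into the record store (the equivalent of a FETCH) and count it.
function project(index, projectedFields) {
  let fetched = 0;
  const rows = index.map(({ key, recordId }) => {
    let source = key;
    if (!projectedFields.every((f) => f in key)) {
      source = records.get(recordId);
      fetched += 1;
    }
    return Object.fromEntries(projectedFields.map((f) => [f, source[f]]));
  });
  return { rows, fetched };
}

const withoutId = buildIndex(["ReturnId", "Status", "IsDeleted", "CreationTime"]);
const withId = buildIndex(["ReturnId", "Status", "IsDeleted", "CreationTime", "_id"]);

console.log(project(withoutId, ["_id"]).fetched);    // 2: _id is not in the index entry
console.log(project(withoutId, ["Status"]).fetched); // 0: answered from the index keys
console.log(project(withId, ["_id"]).fetched);       // 0: _id is now an index key
```

The last line mirrors the practical fix for the original query: if _id is added as a trailing key of the compound index, a projection of _id can be served from the index without fetching the document.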

Gilcemir_Filho · December 24, 2023, 8:59pm

Hi Spencer, thanks for your response.
Now I understand. I didn't know about the recordId; I thought that an internal ID would not be needed because of the clustered index structure that MongoDB uses to store documents.
Thank you!

Spencer_Brown · December 26, 2023, 4:34pm

Now that you mention clustered indexes, there is a new feature as of MongoDB 5.3/6.0 that allows you to create a “clustered collection”, where the documents in the collection are referenced by _id rather than the internal recordID. See the MongoDB documentation on clustered collections for details.
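As a rough sketch of what creating one looks like: a clustered collection has to be declared when the collection is created, by passing a clusteredIndex option to db.createCollection. The wrapper function createClusteredCollection, the index name and the collection name in the usage note are hypothetical and only there for illustration; db is assumed to be the usual mongosh database handle.

```javascript
// Options for a clustered collection (MongoDB 5.3+). The clustered index key
// must be { _id: 1 } and unique must be true; the name is optional.
const clusteredOptions = {
  clusteredIndex: { key: { _id: 1 }, unique: true, name: "clustered by _id" },
};

// `db` is the mongosh database handle; the collection name is supplied by the caller.
function createClusteredCollection(db, name) {
  return db.createCollection(name, clusteredOptions);
}
```

In mongosh this would be called as createClusteredCollection(db, "ManualLiquidationItemsClustered"). An existing regular collection is not converted by this call; the data would have to be copied into the new collection.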