Mongo Aggregation Projections not working for nested fields

Problem: Nested Projections before a sort stage do not carry to a sort stage. In fact they seem to ruin the rest of the projections. When run in Compass, it seems the projection takes out the fields as intended but I am still getting a out of memory error (did not enable allow disk use). As you can see below, just excluding a non nested field seems to work perfectly. One thing to add is that metadata does hold dynamic fields.
Sample Query that works:

 [{ "$match" : { "$and" : [{ "creator.identity_id" : 1}, { "memlevel" : "MEMBLOCK"}, { "$or" : [{ "is_session" : true}, { "is_draft" : true}]}]}}, { "$project" : { "memlabel" : 0}}, { "$sort" : { "modified_at_utc" : -1}}, { "$limit" : 25}]

Sample Query with Nested Projection that results in sort out of memory Error:

[{ "$match" : { "$and" : [{ "creator.identity_id" : 1}, { "memlevel" : "MEMBLOCK"}, { "$or" : [{ "is_session" : true}, { "is_draft" : true}]}]}}, { "$project" : { "memlabel" : 0, "metadata" : { "feed_text" : 0}}}, { "$sort" : { "modified_at_utc" : -1}}, { "$limit" : 25}] 

Sample Document:

{
  "_id": {
    "$oid": "62b73c651f9bc7269439d9d4"
  },
  "memlevel": "MEMBLOCK",
  "type": "MEMORY",
  "event": "luther.text",
  "memlabel": "This is where her is, Claire. That's why what you can found is that a government funded now and like like a Obama's comments are coming from or going, oh, and that makes sense. So you go No. One in time on your MySpace or but although I also know the like, I go by what I saw and and the those reps on this one is I said, well, that's why we found the government on this country. where her name is.",
  "created_at_utc": {
    "$date": {
      "$numberLong": "1656175717001"
    }
  },
  "modified_at_utc": {
    "$date": {
      "$numberLong": "1656175717001"
    }
  },
  "start_time_utc": {
    "$date": {
      "$numberLong": "1656175710069"
    }
  },
  "end_time_utc": {
    "$date": {
      "$numberLong": "1656175710069"
    }
  },
  "creator": {
    "identity_id": 1,
    "name": "person",
    "domain_id": 0,
    "lead_id": 0,
    "propertyBag": {
      "email": {
        "key": "email",
        "value": "email@personal.ai
      },
      "picture": {
        "key": "picture",
        "value": "asdfasfd.jpg"
      }
    }
  },
  "owner": {
    "identity_id": 1,
    "name": "person",
    "domain_id": 1,
    "lead_id": 0,
    "propertyBag": {
      "email": {
        "key": "email",
        "value": "email.personal.ai"
      },
      "picture": {
        "key": "picture",
        "value": "asdff.jpg"
      }
    }
  },
  "source": {
    "name": "WebApp",
    "type": "text",
    "device": "desktop",
    "os": ""
  },
  "feed_event": [],
  "visibility": "PRIVATE",
  "scope": "PERSONAL",
  "metadata": {
    "title": "",
    "feed_text": "<p>This is where her is, Claire. That's why what you can found is that a government funded now and like like a Obama's comments are coming from or going, oh, and that makes sense. So you go No. One in time on your MySpace or but although I also know the like, I go by what I saw and and the those reps on this one is I said, well, that's why we found the government on this country. where her name is. </p>",
    "is_draft_delete": "true"
  },
  "status": "CREATED",
  "is_draft": true,
  "_class": "com.personalai.MemoryAPI.Common.Model.FeedMemory"
}

You are sorting on 1 of 2 fields.

  1. A field that is present in the original document.

  2. A computed field, not present in the original document, that is the result of $project, $set/$addFields.

If you $sort as point 1, sort as soon as possible and before doing any alteration to the original document.

If you $sort as point 2, it will always be a memory sort since you cannot have an index on computed fields. If you cannot sort your computed field in memory for any reason, store the computed value and create an index, then it will be $sort as point 1.

In the document and pipelines you shared you are sorting on a field that is present in the original document, so simply move the $sort at the right place right after the $match. The same apply to your $limit, move it after the $sort and before the $project. Why $project, that is altering, all matching documents if your are only interested in only 25.

When altering a top-level field, the optimizer is smart enough to not alter the document right away. When done on a nested field it will be much harder to keep track of the excluded vs included sub-fields, so the optimizer probably do not optimize.

But the right thing to do is to $sort sooner than what you did.