How do I search all subdocuments?

Hello,

I have the following structure:
{
month {
day1 { name: SOMETHING },
day2{ name: SOMETHING },
day3{ name: SOMETHING}
}

the question is how do I search by name ?
e.g. ..name that matches a regex

Momchil

You sample document is not valid JSON. Please correct because it is unusable in this form.

Best way to share is with https://mongoplayground.net/

Sure: Mongo playground
thanks for responding

Your current data structure is a prime candidate for using the Attribute Pattern. That makes it easier to query and makes it possible to index. Otherwise, the fields don’t have known/predictable names.

For example, your document can be transformed from:

{
  "03": {
    "17": {
      "id": "xxx",
      "name": "myname",
      "lastSeen": "2025-03-17",
      "someotherkey": "value"
    },
    "18": {
      "id": "xxx",
      "name": "myname",
      "lastSeen": "2025-03-18",
      "someotherkey": "value"
    }
  }
}

to:

{
  "month": "03",
  "days": [
    {
      "day": "17",
      "id": "xxx",
      "name": "myname",
      "lastSeen": "2025-03-17",
      "someotherkey": "value"
    },
    {
      "day": "18",
      "id": "xxx",
      "name": "myname",
      "lastSeen": "2025-03-18",
      "someotherkey": "value"
    }
  ]
}

Then a search for ā€˜name’ could be simply:

db.collection.find({ "days.name": { $regex: "yna" } })

Mongo Playground 1A

or

db.collection.find({ "days.name": { $regex: "her" } })

Mongo Playground 1B

Some assumptions, questions, and suggestions:

  1. Does each document represent just one month? If there can be more than one month in a doc, then the example above will need to be nested under a months array field.
  2. When you search the name in subdocuments, do you want the whole document or just the matching subdocument in the array?
  3. If your document has month and day, it should probably also have year. Or directly use a datetime field with the hour-minute-second set to zero. Like ISODate("2025-03-17T00:00:00Z").
  4. And then it would be appropriate as a Time Series collection (optional).

Without using the Attribute Pattern and the changes above, you would need to use $objectToArray multiple times so that each field has a known/predictable name. And since no index would exist* (see below), it can’t be optimised and has to execute on every document until the $match stage after the transformations:

db.collection.aggregate([
  {
    $replaceWith: {
      _id: "$_id",
      doc: {
        // assumes one month per doc (one top-level field)
        $first: {
          $filter: {
            input: { $objectToArray: "$$ROOT" },
            cond: {
              // skip _id since it's separate
              $ne: ["$$this.k", "_id"]
            }
          }
        }
      }
    }
  },
  {
    $set: {
      doc: {
        v: { $objectToArray: "$doc.v" }
      }
    }
  },
  {
    $match: {
      // SEARCH TERM HERE
      "doc.v.v.name": { $regex: "yna" }
    }
  },
  {
    // undo the transformations
    $set: {
      doc: [
        {
          k: "$doc.k",
          v: { $arrayToObject: "$doc.v" }
        }
      ]
    }
  },
  {
    $replaceWith: {
      $mergeObjects: [
        { _id: "$_id" },
        { $arrayToObject: "$doc" }
      ]
    }
  }
])

Mongo Playground 2A (and Mongo Playground 2B)


* You could have a Wildcard Index but the query would need to specify the field names (month and day).

2 Likes

I see nothing to add to this wonderful reply.

The links to documentation, the playgrounds, all is good.

Thank you aneroid.

2 Likes

Not in a million year would I have figured that out. Awesome!
Changing the schema is not a option.

The code does exactly what I need it to do in the playground but my compass does not like it.
Problem is around here:
// skip _id since it’s separate
I have another ID in the base on the document.
Tryied $or after he $ne … but I m way out of my depth here.

The issue occurs because you have an "id": "SOMEID" field. Instead of "_id" - but _id will always exist in a document. So that needs to be excluded at the part where I put the comment ā€œ// skip _id since it’s separateā€.

So, change that part to an index check to ensure that we skip the _id and id fields - by checking that $indexOfArray is -1. Make sure to do this for all other non-month top-level fields if you have them. So that $filter part becomes:

{
  $filter: {
    input: { $objectToArray: "$$ROOT" },
    cond: {
      // skip _id & id since they are separate
      $eq: [
        {
          $indexOfArray: [
            ["_id", "id"],
            "$$this.k"
          ]
        },
        -1
      ]
    }
  }
}

Updated Mongo Playground

FYI, if we skip the checks for ā€œtop-level fields which are not monthsā€, then we need to add the is-array or is-object check every time we do a transform or undo a transform. This is a working playground for that transformation approach but it’s quite ugly and hard to maintain.

3 Likes

That did it!
Thank you so much

This topic was automatically closed 5 days after the last reply. New replies are no longer allowed.