How can I query documents by comparing two fields in the same document where one field is in a nested array?

My data is like this (I have taken out unnecessary fields and only left the dates in):

[
  {
    "date": "2022-09-25T12:35:51.833Z",
    "scans": [
      {
        "date": "2022-09-01T05:00:00.000Z",
      },
      {
        "date": "2022-08-04T05:00:00.000Z",
      },
      {
        "date": "2022-09-01T05:00:00.000Z",
      },
      {
        "date": "2022-09-06T05:00:00.000Z",
      }
    ],
  },
  {
    "date": "2022-09-25T12:55:12.018Z",
    "scans": [
      {
        "date": "1919-11-30T07:00:00.000Z",
      },
      {
        "date": "1919-11-30T07:00:00.000Z",
      },
      {
        "date": "1926-11-30T07:00:00.000Z",
      },
      {
        "date": "1919-11-30T07:00:00.000Z",
      },
      {
        "date": "1919-11-30T07:00:00.000Z",
      },
      {
        "date": "1919-11-30T07:00:00.000Z",
      },
      {
        "date": "1919-11-30T07:00:00.000Z",
      },
      {
        "date": "1919-11-30T07:00:00.000Z",
      }
    ],
  },
  {
    "date": "2022-09-25T13:49:20.639Z",
    "scans": [
      {
        "date": "2022-09-15T05:00:00.000Z",
      },
      {
        "date": "2022-09-12T05:00:00.000Z",
      }
    ],
  },
  {
    "date": "2022-09-25T13:58:02.755Z",
    "scans": [
      {
        "date": "2022-09-13T05:00:00.000Z",
      },
      {
        "date": "2022-08-20T05:00:00.000Z",
      },
      {}
    ],
  },
  {
    "date": "2022-09-25T14:17:04.947Z",
    "scans": [
      {
        "date": "2022-09-12T05:00:00.000Z",
      }
    ],
  },
  {
    "date": "2022-09-25T14:17:49.489Z",
    "scans": [
      {
        "date": "2022-09-13T05:00:00.000Z",
      }
    ],
  },
  {
    "date": "2022-09-25T14:19:26.068Z",
    "scans": [{}],
  },
  {
    "date": "2022-09-25T14:20:07.569Z",
    "scans": [
      {
        "date": "2022-09-12T05:00:00.000Z",
      }
    ],
  },
  {
    "date": "2022-09-25T14:33:17.783Z",
    "scans": [
      {
        "date": "2022-08-15T07:00:00.000Z",
      }
    ],
  },
  {
    "date": "2022-09-25T14:33:41.050Z",
    "scans": [
      {
        "date": "2022-08-19T07:00:00.000Z",
      }
    ],
  },
  {
    "date": "2022-09-25T14:34:03.172Z",
    "scans": [
      {
        "date": "2022-09-07T07:00:00.000Z",
      }
    ],
  },
  {
    "date": "2022-09-25T15:28:23.723Z",
    "scans": [
      {
        "date": "2022-08-19T05:00:00.000Z",
      }
    ],
  },
  {
    "date": "2022-09-25T15:28:49.211Z",
    "scans": [
      {
        "date": "2022-09-09T05:00:00.000Z",
      }
    ],
  }
]

I would like to get back the same list of documents but filtered according to these:

  1. Include documents that have at least one scan where there is no date field.
  2. Include documents that have scans where the date is more than 90 days before the date field of the document.
  3. Include documents that have scans where the date is after the date field of the document.

The bottom line is that I am trying to find documents with “incorrect” scans. How can this be achieved?
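
To make the three conditions concrete, here is a minimal plain-JavaScript sketch of the check, useful for working out which test documents should come back. The helper name `hasIncorrectScan` is made up for illustration, the last sample document is invented to cover condition 3, and the 90-day limit is measured in elapsed milliseconds, which is an assumption about how the boundary should be treated.

```javascript
// Hypothetical helper: mirrors the three conditions outside the database.
const DAY_MS = 24 * 60 * 60 * 1000;

function hasIncorrectScan(doc, maxAgeDays = 90) {
  const docTime = new Date(doc.date).getTime();
  return doc.scans.some((scan) => {
    // Condition 1: the scan has no date field at all.
    if (!("date" in scan)) return true;
    const scanTime = new Date(scan.date).getTime();
    // Condition 2: the scan is more than maxAgeDays older than the document.
    if (docTime - scanTime > maxAgeDays * DAY_MS) return true;
    // Condition 3: the scan is dated after the document.
    return scanTime > docTime;
  });
}

const samples = [
  // All scans within 90 days before the document date: not flagged.
  { date: "2022-09-25T12:35:51.833Z",
    scans: [{ date: "2022-09-01T05:00:00.000Z" }, { date: "2022-08-04T05:00:00.000Z" }] },
  // Scan far older than 90 days: flagged.
  { date: "2022-09-25T12:55:12.018Z", scans: [{ date: "1919-11-30T07:00:00.000Z" }] },
  // Scan without a date: flagged.
  { date: "2022-09-25T14:19:26.068Z", scans: [{}] },
  // Invented document with a scan after the document date: flagged.
  { date: "2022-09-25T14:17:04.947Z", scans: [{ date: "2022-09-26T05:00:00.000Z" }] },
];

const flagged = samples.map((doc) => hasIncorrectScan(doc));
console.log(flagged); // [ false, true, true, true ]
```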

Can you provide some documents for which

For

you might try with

"scans" : { "$elemMatch" : { "date" : { "$exists" : false } } }

The sample data I provided had this one document:

{
    "date": "2022-09-25T13:58:02.755Z",
    "scans": [
      {
        "date": "2022-09-13T05:00:00.000Z",
      },
      {
        "date": "2022-08-20T05:00:00.000Z",
      },
      {}
    ],
  },

You can see an empty nested document. What that means is that other fields may be there but no date field will be inside such a nested document. Other fields are irrelevant.

And as for the solution that you propose, would this check every single nested document or just the first one as you have not specified it?

Yes, it will check every nested document. You should test it. The documentation is also useful, as it contains examples.
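
As a rough illustration of why no index is needed: `$elemMatch` matches a document when at least one element of the array satisfies the condition, which is similar to `Array.prototype.some` in plain JavaScript. The helper name `anyScanMissingDate` below is made up for this sketch; it is an analogue, not what the server runs.

```javascript
// Hypothetical plain-JavaScript analogue of
// { scans: { $elemMatch: { date: { $exists: false } } } }
function anyScanMissingDate(doc) {
  // True as soon as any nested document lacks its own "date" field.
  return doc.scans.some(
    (scan) => !Object.prototype.hasOwnProperty.call(scan, "date")
  );
}

// The document from the sample data: the empty scan is the third element.
const withEmptyScan = {
  date: "2022-09-25T13:58:02.755Z",
  scans: [
    { date: "2022-09-13T05:00:00.000Z" },
    { date: "2022-08-20T05:00:00.000Z" },
    {},
  ],
};

// A document where every scan has a date.
const allDated = {
  date: "2022-09-25T14:17:04.947Z",
  scans: [{ date: "2022-09-12T05:00:00.000Z" }],
};

console.log(anyScanMissingDate(withEmptyScan)); // true
console.log(anyScanMissingDate(allDated)); // false
```

In mongosh the filter itself would be passed to a query, for example `db.collection.find({ "scans": { "$elemMatch": { "date": { "$exists": false } } } })`.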

I misread your reply earlier and gave an example of a document with a missing date instead. As for the case where the scan date is after the date of the document, it just happens that no such document is in the sample data, although one may exist in my actual database. For testing purposes, you can create one by changing one or two dates.

For testing purposes of your use-case, you can supply test documents by just changing one or two dates from your documents.


@Abdul_Muezz_Ejaz Here’s one way to return documents that match your conditions.

db.collection.aggregate([
  {
    "$match": {
      "$expr": {
        "$reduce": {
          "input": "$scans",
          "initialValue": false,
          "in": {
            "$or": [
              "$$value",
              { // is scans date missing?
                "$eq": [{"$type": "$$this.date"}, "missing"]
              },
              { // is scans date older than 90 days?
                "$gt": [
                  {
                    "$dateDiff": {
                      "startDate": "$$this.date",
                      "endDate": "$date",
                      "unit": "day"
                    }
                  },
                  90
                ]
              },
              { // is scans date after doc date?
                "$gt": ["$$this.date", "$date"]
              }
            ]
          }
        }
      }
    }
  }
])

Try it on mongoplayground.net.

For the use case

you may try this untested code:

{ $match : { "$expr" : { "$gt" : [ "$date" , "$scans.date" ] }}}
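
One thing to check when testing this: inside `$expr`, `"$scans.date"` resolves to an array of all the scan dates, so `$gt` compares a single value with an array by BSON type order rather than element by element. A per-element variant might look like the sketch below. It assumes the intended use case is "a scan date is after the document date" and that both fields are stored as BSON dates; the variable name `scan` and the constant name `matchScanAfterDoc` are arbitrary.

```javascript
// Sketch only: a $match stage that tests each scan individually.
const matchScanAfterDoc = {
  $match: {
    $expr: {
      // True if the mapped array contains at least one true value.
      $anyElementTrue: [
        {
          // Produce one boolean per scan: is this scan dated after the document?
          $map: {
            input: "$scans",
            as: "scan",
            in: { $gt: ["$$scan.date", "$date"] },
          },
        },
      ],
    },
  },
};

// Usage in mongosh (untested here): db.collection.aggregate([matchScanAfterDoc])
console.log(JSON.stringify(matchScanAfterDoc, null, 2));
```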