Aggregate search near a date

Hello all, I’m trying out different Atlas Search options using the aggregate API. The text search has been working great, and now I’ve started querying my date fields. The range operator seems to work as I expect, but the near operator doesn’t seem to make a difference in my results.

I’m using the Node driver, and the savedAt field in my documents is stored as the Date type.

An aggregate pipeline I’m trying:

[
  {
    $search: {
      compound: {
        must: [
          {
            text: {
              query: ['shopping'],
              path: 'tags',
            },
          },
          {
            near: {
              path: 'savedAt',
              origin: new Date(2022, 0, 1),
              pivot: 2629800000,
            },
          },
        ],
      },
    },
  },
];

My understanding is that this query will look for documents with “shopping” in their tags field that also have a savedAt date within ~1 month of January 1st, 2022. I have 19 documents that have shopping in their “tags” field. Only one of them has a savedAt date in 2022 (Jan 5th). My aggregate result, however, returns all 19 documents.

What am I missing? I would only expect 1 result since a month before and after Jan 1 2022 should only result in the 1 document present? Using the range operator and defining a “greater than” date of new Date(2022, 0, 1) returns just that one document.

Hi @zacharykane,

My aggregate result however returns all 19 documents.

My understanding is that the near operator utilises the pivot to calculate scores but will still return documents outside of the pivot “range” as you have stated. The below is also from the pivot section of the near documentation:

Results have a score equal to 1/2 (or 0.5 ) when their indexed field value is pivot units away from origin.
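
To make the quoted sentence concrete, here is a minimal sketch of the formula given in the near documentation, score = pivot / (pivot + distance), where distance for dates is the absolute difference from origin in milliseconds. The helper name nearScore is hypothetical and not part of any driver API; it only reproduces the arithmetic.

```javascript
// Hypothetical helper: mirrors the documented near formula for dates,
// score = pivot / (pivot + |value - origin|), with everything in milliseconds.
function nearScore(value, origin, pivotMs) {
  const distanceMs = Math.abs(value.getTime() - origin.getTime());
  return pivotMs / (pivotMs + distanceMs);
}

const DAY_MS = 24 * 60 * 60 * 1000;
const origin = new Date(Date.UTC(2022, 0, 1));
const pivotMs = 30 * DAY_MS; // 2592000000, i.e. 30 days

// Exactly one pivot away from origin: 0.5
console.log(nearScore(new Date(origin.getTime() + pivotMs), origin, pivotMs));
// Four days away (2022-01-05): 30 / 34
console.log(nearScore(new Date(Date.UTC(2022, 0, 5)), origin, pivotMs));
// A full year away (2021-01-01): small, but still greater than 0
console.log(nearScore(new Date(Date.UTC(2021, 0, 1)), origin, pivotMs));
```

The score shrinks as the distance grows but never reaches 0, which is consistent with near ranking documents rather than excluding them.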

What am I missing? I would only expect 1 result since a month before and after Jan 1 2022 should only result in the 1 document present? Using the range operator and defining a “greater than” date of new Date(2022, 0, 1) returns just that one document.

It sounds like range works for you, but please correct me if I’m wrong here. I’m curious whether you need near to filter results to ±1 month from a specified date, or whether it is for scoring / sorting purposes (nearest to furthest).

If you could explain further how you intend to use near, or in what way range doesn’t meet your requirements, that would help greatly.

That said, please see the following example, which uses both near and range (~±1 month from 2022-01-01):

DB> var f =
[
  {
    '$search': {
      compound: {
        must: [
          { text: { query: [ 'shopping' ], path: 'tags' } },
          {
            range: {
              path: 'savedAt',
              gte: ISODate("2021-12-01T00:00:00.000Z"),
              lte: ISODate("2022-02-01T00:00:00.000Z")
            }
          },
          {
            near: {
              path: 'savedAt',
              origin: ISODate("2022-01-01T00:00:00.000Z"),
              pivot: 2592000000
            }
          }
        ]
      }
    }
  },
  { '$project': { savedAt: 1, score: { '$meta': 'searchScore' } } }
]

Output:

DB> db.collection.aggregate(f)
[
  {
    _id: ObjectId("63030ce08ed303b32008b26f"),
    savedAt: ISODate("2022-01-05T00:00:00.000Z"),
    score: 1.9016982316970825
  },
  {
    _id: ObjectId("63030f768ed303b32008b278"),
    savedAt: ISODate("2021-12-01T00:00:00.000Z"),
    score: 1.5111485719680786
  },
  {
    _id: ObjectId("63030f768ed303b32008b275"),
    savedAt: ISODate("2022-02-01T00:00:00.000Z"),
    score: 1.5111485719680786
  }
]

Without near, the scores of the documents in the output above were all the same in my test environment.
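
If the goal is both to restrict results to a window around a date and to rank the closest ones first, a small helper that builds the pipeline may be convenient. The sketch below is a variation on the query above, not the exact one that was run: it assumes the range clause can live in the compound filter clause (which restricts matches without contributing to the score) while near stays in must. The function name buildNearPipeline and its parameters are hypothetical.

```javascript
// Hypothetical helper: builds a $search pipeline that keeps only documents whose
// date field lies within windowMs of origin and scores them by closeness to origin.
function buildNearPipeline(tag, origin, windowMs, path = 'savedAt') {
  const start = new Date(origin.getTime() - windowMs);
  const end = new Date(origin.getTime() + windowMs);
  return [
    {
      $search: {
        compound: {
          must: [
            { text: { query: [tag], path: 'tags' } },
            // near only influences the score; it does not exclude documents
            { near: { path, origin, pivot: windowMs } },
          ],
          // filter clauses restrict matches but do not affect the score
          filter: [{ range: { path, gte: start, lte: end } }],
        },
      },
    },
    { $project: { [path]: 1, score: { $meta: 'searchScore' } } },
  ];
}

const THIRTY_DAYS_MS = 30 * 24 * 60 * 60 * 1000;
const pipeline = buildNearPipeline('shopping', new Date(Date.UTC(2022, 0, 1)), THIRTY_DAYS_MS);
console.log(JSON.stringify(pipeline, null, 2));
// With the Node driver this could then be run as:
//   const docs = await collection.aggregate(pipeline).toArray();
```

Note that a strict ±30 day window around 2022-01-01 runs from 2021-12-02 to 2022-01-31, which is slightly narrower than the 2021-12-01 to 2022-02-01 bounds used in the example above.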

Hope this helps in some manner.

Regards,
Jason
