Documents with same search score

Greetings,
Documents returned from the $search stage are sorted by their search score in descending order.
If multiple documents have the same score, is there a secondary sort by their object IDs?

For example, if I repeatedly run a query on a static collection that returns multiple documents with the same search score, will the results always be sorted in the same order?

Thanks a lot,
Ofer.

@Ofer_Chacham unfortunately we cannot guarantee the order of the results of a static collection. We have not tested for this use case because of the probabilistic nature of the underlying subsystem, Lucene.

You can enforce a deterministic sort order in a few ways if need be. The simplest is to add a $sort stage after $search; other features can also help.
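As a rough illustration of the $sort approach, the sketch below builds an aggregation pipeline as plain Python dictionaries. It assumes the tiebreaker is `_id` and that the score is exposed through `{"$meta": "searchScore"}`; the helper name `with_deterministic_sort` and the placeholder search stage are hypothetical and not taken from the thread.

```python
from typing import Any


def with_deterministic_sort(search_stage: dict[str, Any]) -> list[dict[str, Any]]:
    """Wrap a $search stage so that ties in score are broken by _id.

    Hypothetical helper: it only builds the pipeline, it does not run it.
    """
    return [
        {"$search": search_stage},
        # Expose the search score as a regular field so $sort can use it.
        {"$addFields": {"score": {"$meta": "searchScore"}}},
        # Primary key: score descending. Secondary key: _id ascending.
        {"$sort": {"score": -1, "_id": 1}},
    ]


# Placeholder search stage; the query text and path are made up for illustration.
pipeline = with_deterministic_sort({"text": {"query": "example", "path": "title"}})
```

Because `_id` is unique, the combined sort key is unique too, so the order no longer depends on how equal-score documents happen to be returned. The cost is the blocking $sort stage itself.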

Is the collection sharded?

Thanks for the reply @Marcus.
Currently we are using a sort stage after the search stage, but the queries run very slowly.

We wanted to leverage the near operator to sort on date and numeric fields, but we must have consistency between queries: we also paginate with skip and limit, so we don’t want to display the same document on two different pages.

Is there something else we can use to achieve the consistency without a sort stage?

Currently we don’t use a sharded collection.

Thanks,
Ofer

If you create a field that holds the sort order for all the fields, you can use function score with the path option. It should return consistent ordering and be fast. Since it is a static collection, this should not be a problem. You can add this field with Atlas Triggers and an $addFields operation. Does that make sense?
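A minimal sketch of what such a stage might look like, written as a Python dictionary. The field name `sort_rank`, the helper `rank_by_field`, and the placeholder query are assumptions made for illustration; the `score.function.path` shape follows the Atlas Search function score option as I understand it and should be checked against the current documentation.

```python
from typing import Any


def rank_by_field(query: str, text_path: str, rank_field: str) -> dict[str, Any]:
    """Build a text operator whose score is replaced by a numeric field.

    rank_field is a hypothetical precomputed field that holds the desired
    sort position; it would need to be indexed as a number.
    """
    return {
        "text": {
            "query": query,
            "path": text_path,
            # Use the stored numeric value as the score; documents that
            # lack the field fall back to 0.
            "score": {"function": {"path": {"value": rank_field, "undefined": 0}}},
        }
    }


search_stage = rank_by_field("example", "title", "sort_rank")
```

If every document carries a distinct `sort_rank`, no two documents share a score, which is what makes the ordering repeatable.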


Thanks @Marcus - if I use the function score with a field as you describe it means I always sort the documents according to this field and this is not what I want to achieve.

Maybe I’ll explain my problem with an example:

Let’s assume I have 5 documents in a collection and the collection is not changing:

object_id, numeric_field, string_field
A, 1, ‘abc’
B, 2, ‘abc’
C, 2, ‘abc’
D, 3, ‘abc’
E, 3, ‘def’

I want to display the documents that contain ‘abc’ in their string_field and sort them by their numeric_field, so I’ll use this search stage:

    "compound": {
        "filter": [
            {
                "text": {
                    "query": [
                        "abc"
                    ],
                    "path": "string_field"
                }
            }
        ],
        "should": [
            {
                "near": {
                    "path": "numeric_field",
                    "origin": 0,
                    "pivot": 1
                }
            }
        ]
    }

I also want to display the documents in pages of 2.
So, to get the first page, a $limit: 2 stage is added to the pipeline, and to get the second page, two stages, $skip: 2 and $limit: 2, are added to the pipeline.

If I perform the two queries for getting the two pages over and over I want to always get A,B in first page and C,D in second page.

Because B and C have the same numeric_field value, they will both get the same search score, and as far as I understand it is possible that I will get a first page of A,C or a second page of B,D.
So is there something I can do to guarantee the consistency?
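To make the tie concrete, here is a small pure-Python simulation of the example above. It does not talk to Atlas Search. It assumes the documented near scoring formula, score = pivot / (pivot + distance), approximates the text filter with a simple equality check, and breaks ties by `_id`; the function names are hypothetical.

```python
docs = [
    {"_id": "A", "numeric_field": 1, "string_field": "abc"},
    {"_id": "B", "numeric_field": 2, "string_field": "abc"},
    {"_id": "C", "numeric_field": 2, "string_field": "abc"},
    {"_id": "D", "numeric_field": 3, "string_field": "abc"},
    {"_id": "E", "numeric_field": 3, "string_field": "def"},
]


def near_score(value: float, origin: float = 0, pivot: float = 1) -> float:
    """Assumed near scoring: pivot / (pivot + distance from origin)."""
    return pivot / (pivot + abs(value - origin))


def page(documents: list[dict], skip: int, limit: int) -> list[str]:
    """Filter, rank by score descending with _id as tiebreaker, then paginate."""
    matches = [d for d in documents if d["string_field"] == "abc"]
    ranked = sorted(
        matches,
        key=lambda d: (-near_score(d["numeric_field"]), d["_id"]),
    )
    return [d["_id"] for d in ranked[skip : skip + limit]]


print(page(docs, 0, 2))  # ['A', 'B']
print(page(docs, 2, 2))  # ['C', 'D']
```

B and C both score 1/3 here, so without the `_id` component in the sort key nothing in the score alone decides which of them lands on the first page.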

BTW, I tried to check the consistency of the same query with real data, and it does seem like I always get the same order of results, even when I have documents with the same search score.
Might it work this way because the collection is not sharded, so mongot always scans the documents in the same order and therefore outputs the results in the same order?

Thanks,
Ofer.

The only way to guarantee the sort order at this time is to have unique values for all the numeric fields or to use the blocking $sort stage after the $search stage. Some customers use storedSource to speed up results for this query pattern.
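The storedSource pattern might look roughly like the sketch below, again as plain Python dictionaries. It assumes an index definition with a `storedSource` section and the `returnStoredSource` option on $search, using the field names from the example earlier in the thread; treat the exact option names and the index layout as assumptions to verify against the Atlas Search documentation.

```python
# Hypothetical index definition: store numeric_field on the search side so
# $search can return it without a lookup in the collection.
index_definition = {
    "mappings": {"dynamic": True},
    "storedSource": {"include": ["numeric_field"]},
}

# Second page (documents 3 and 4) with a deterministic tiebreaker on _id.
pipeline = [
    {
        "$search": {
            "compound": {
                "filter": [{"text": {"query": ["abc"], "path": "string_field"}}],
                "should": [
                    {"near": {"path": "numeric_field", "origin": 0, "pivot": 1}}
                ],
            },
            # Return the stored fields directly from the search index.
            "returnStoredSource": True,
        }
    },
    {"$addFields": {"score": {"$meta": "searchScore"}}},
    {"$sort": {"score": -1, "_id": 1}},
    {"$skip": 2},
    {"$limit": 2},
]
```

The $sort stage is still blocking, but it then works on the smaller stored documents rather than on full documents fetched from the collection.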
