When $in can use sort_merge

hi, recently , i do some performance about $or and $in

$or always uses sort_merge ,but $in not ;

this is my test data

generate data script:

people = ["CFF0FC9CCB","4093FC8D87","63407DCB5E","FF14E5FA11","9B30FB0595","8C71FB8B73","D39B586686","F20C4D636F","E3AB4638CA","748A6D0C29"];
for(var i=0; i<100000; i++){
     name = people[Math.floor(Math.random()*people.length)];
     user_id = i;
     boolean = [true, false][Math.floor(Math.random()*2)];

     d = new Date();
     added_at = d.getFullYear() +"-" +(d.getMonth()+1) + "-" + d.getDate() +  "T" + d.getHours() + ":" + d.getMinutes()+ ":" + d.getSeconds()+ "." + d.getMilliseconds() +"Z";
     
     number = Math.floor(Math.random()*10001);
     db.test_collection5.save({"_id":number+ObjectId().str.substring(14),"name":name, "user_id":user_id, "boolean": boolean, "added_at":added_at, "number":number });
}

db.test_collection5.createIndex({name:1,added_at:-1},{name:"index"});

last sql

db.test_collection5.find({name:{$in:[“CFF0FC9CCB”,“4093FC8D87”,“63407DCB5E”,“FF14E5FA11”,“9B30FB0595”,“8C71FB8B73”,“D39B586686”,“F20C4D636F”,“E3AB4638CA”,“748A6D0C29”]},added_at:{$gt:“2022-10-22T07:28:00.782Z”}}).sort({added_at:-1}).limit(1000).explain(“executionStats”);

the sql’s explain always can get sort_merge ,this is test data,

but in my business data
like this

{
        "_id" : "0000008626",
        "companyId" : "7B4B691836",
        "distributionId" : "9B30F343erB0595",
        "code" : "sdfdsf23",
        "roomTypeCode" : "K1",
        "deleted" : false,
        "state" : "Activated",
        "version" : NumberLong(0),
        "createdDate" : "2022-06-04T10:03:53.382Z",
        "lastModifiedDate" : "2022-06-04T10:03:53.382Z"
}

when execute $in , then not use sort_merge

so I want know what is the influence factors.

thank you for your response .

Hi @Huang_Huang ,

A $in is in fact a range operator when it gets multiple values.

If we consider the ESR rule to have sort fields before range you should consider:

added_at : -1 ,  name : 1

As a better order, this should do an index sort and not in memory.

Thanks
Pavel

Thank for your response.

if I use the index.

added_at : -1 ,  name : 1

when i execute the below sql , then return full table , and then match rang name .

db.test_collection5.find({name:{$in:[“CFF0FC9CCB”,“4093FC8D87”,“63407DCB5E”,“FF14E5FA11”,“9B30FB0595”,“8C71FB8B73”,“D39B586686”,“F20C4D636F”,“E3AB4638CA”,“748A6D0C29”]},added_at:{$gt:“1970-01-01T00:00:00.000Z”}}).sort({added_at:-1}).limit(1000).explain(“executionStats”);

Not sure I understand, can you show me the explain plan?