Indexes on sharded collection

Our collections are sharded based on tenantId.

Is it good practice to create another single index on a particular field instead of creating compound index containing tenantId and that particular field

Hi @Nithin_Kumar ,

Well it depands on the nature of the queries and what is the optimal index to support them.

However, if you don’t include the tenantId prefix in your queries and your compound index prefix you will probably end up with a scatter gather (distrebuted) query that will essentially send the query to all of the shards and wait until all of them respond :

This can be a performance problem and you should avoid it if possible…

Thanks
Pavel

1 Like

iterable = db.collection.find(’{field:value}’, limit;1000);
iterable.foreach{ doc →
doc.update( ‘{field:differentValue}’
}
I did a perf test on the query for 500K documents in a 2 sharded cluster.
time to fetch the records every 1000 documents is ~1second.

@Nithin_Kumar ,

That doesn’t sound very good…

Wat are the indexes and shard keys?

_id : { _tenantId: “blah”, _id: }
fieldOfInterest: someVal

shardkey: [ _id._tenantId ]
index: [ _id._tenantId, _id._id ]
index:[ fieldOfInterest ]

@Nithin_Kumar ,

I don’t think I fully understand. You have an _id that is not an objectId but a complex object?

{ _id : { __tenantId : ... , _id : ... } ...

Is that correct?

Now when you query on this non sharding field the query is routed and gathered through all shards on a mongos… Therefore not super perfomant…

You can’t see a logic of using a shard key prefix before this query :

db.collection.find ({_id.__tenantid : "blah", valueOfIntrest : ... })

Index : {_id.__tenantId : 1 , valueOfIntrest: 1}

Ty

yes the _id is complex object.

This type of query is not done often.
iterable = db.collection.find(’{fieldOfInterest:value}’, limit;1000);

so probably we will live with it. No, we dont have a way to get all the tenantIds.