Proper way to avoid unbounded arrays

Hi experts,

I have a collection with an array field that might grow without bound. For performance reasons, I'm going to apply the outlier pattern to avoid unbounded arrays. Which do you think is better performance-wise: creating a new collection to hold the overflow (extra) data, or creating a new document in the same collection?

Thanks

Hi @Rami_Khal ,

It's an interesting question. I would say that if you fetch the overflow documents with a lookup, it does not matter much which collection they live in.

But if you are able to cluster those documents on the same index, fetching them can turn into a range query.

{
  _id : "doc1",
  parent : "xxx",
  array : [ { "id" : "embedded1" } ... { "id" : "embeddedN" } ],
  overFlowIndex : 1,
  hasOverflow : true
}
...
{
  _id : "doc2",
  parent : "xxx",
  array : [ { "id" : "embedded1" } ... { "id" : "embeddedN" } ],
  overFlowIndex : 2,
  hasOverflow : false
}
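
One way to produce documents of this shape is to split the array into fixed-size chunks on the application side. The sketch below is plain JavaScript and runs without a database. The function name `splitIntoOverflowDocs` and the `maxItems` cap are assumptions for illustration, and `_id` is left out so that MongoDB can generate it on insert.

```javascript
// Hypothetical helper: split one large array into overflow documents
// shaped like the examples above. maxItems is an assumed per-document cap.
function splitIntoOverflowDocs(parent, items, maxItems) {
  if (!Number.isInteger(maxItems) || maxItems < 1) {
    throw new RangeError("maxItems must be a positive integer");
  }
  const docs = [];
  for (let start = 0; start < items.length; start += maxItems) {
    docs.push({
      parent,
      array: items.slice(start, start + maxItems),
      overFlowIndex: docs.length + 1, // 1-based, in insertion order
      hasOverflow: start + maxItems < items.length, // true if another document follows
    });
  }
  return docs;
}

// Five embedded items with a cap of two per document.
const items = Array.from({ length: 5 }, (_, i) => ({ id: `embedded${i + 1}` }));
const docs = splitIntoOverflowDocs("xxx", items, 2);
console.log(docs.map((d) => [d.overFlowIndex, d.array.length, d.hasOverflow]));
```

With five items and a cap of two, this yields three documents with overFlowIndex 1, 2 and 3, and only the last one has hasOverflow set to false.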

Now to get all the documents of parent xxx I need to query:

db.collection.find({ "parent" : "xxx", overFlowIndex : { $gt : 0 } })

If you need to sort the documents based on insert order:

db.collection.find({ "parent" : "xxx", overFlowIndex : { $gt : 0 } }).sort({ overFlowIndex : 1 })
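
After the query returns, the application typically stitches the chunks back into one array. A minimal sketch in plain JavaScript, assuming the result has already been read into an array of documents (for example with `.toArray()`). `mergeOverflowDocs` is a hypothetical helper name. It sorts defensively by overFlowIndex, so it also works if the query was issued without `.sort()`.

```javascript
// Hypothetical helper: rebuild the full logical array from overflow documents.
function mergeOverflowDocs(docs) {
  return [...docs] // copy so the caller's array is not reordered
    .sort((a, b) => a.overFlowIndex - b.overFlowIndex)
    .flatMap((doc) => doc.array);
}

// Documents deliberately listed out of order to show the sort taking effect.
const fetched = [
  { parent: "xxx", array: [{ id: "c" }], overFlowIndex: 2, hasOverflow: false },
  { parent: "xxx", array: [{ id: "a" }, { id: "b" }], overFlowIndex: 1, hasOverflow: true },
];
console.log(mergeOverflowDocs(fetched).map((e) => e.id));
```

The elements come back in insertion order (a, b, c), regardless of the order in which the documents were passed in.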

Now, with an index on {"parent" : 1, "overFlowIndex" : 1}, the query above is served from the index and returns all overflow documents. This should perform much better than doing a lookup of overflow documents from another collection.
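
To see why the compound index helps, it can be useful to think of it as a list of (parent, overFlowIndex) keys kept in sorted order: all entries for one parent sit next to each other, so the query becomes one seek plus a contiguous scan, and the results already come out ordered by overFlowIndex. The sketch below is a simplified plain-JavaScript model of that idea, not MongoDB's actual index implementation, and all function names are hypothetical.

```javascript
// Simplified model of a compound index on { parent: 1, overFlowIndex: 1 }:
// an array of [parent, overFlowIndex, docId] entries kept in key order.
function compareKeys(a, b) {
  if (a[0] !== b[0]) return a[0] < b[0] ? -1 : 1;
  return a[1] - b[1];
}

function buildIndex(docs) {
  return docs.map((d) => [d.parent, d.overFlowIndex, d._id]).sort(compareKeys);
}

// Binary search for the first entry whose key is greater than
// [parent, minExclusive], then scan forward while the parent still matches.
function rangeScan(index, parent, minExclusive) {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (compareKeys(index[mid], [parent, minExclusive]) <= 0) lo = mid + 1;
    else hi = mid;
  }
  const ids = [];
  for (let i = lo; i < index.length && index[i][0] === parent; i++) {
    ids.push(index[i][2]);
  }
  return ids;
}

const index = buildIndex([
  { _id: "doc2", parent: "xxx", overFlowIndex: 2 },
  { _id: "other", parent: "yyy", overFlowIndex: 1 },
  { _id: "doc1", parent: "xxx", overFlowIndex: 1 },
]);
// Models find({ parent: "xxx", overFlowIndex: { $gt: 0 } })
console.log(rangeScan(index, "xxx", 0));
```

In this model the scan touches only the entries for parent xxx and stops at the first entry for a different parent, which is the behaviour a range query on the index is meant to give.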

Thanks
Pavel

Thanks a lot for the reply, @Pavel_Duchovny .
I'm going to start designing and implementing this approach.