Pros and cons of map vs array

steevej · April 20, 2023, 1:52pm

I do not like to tag people directly but in this case I felt it was important.

So @Asya_Kamsky, @Tarun_Gaur, @Jan_de_Wilde, @Hasan_Kumar, @Takis, @MaBeuLux88_xxx, @Daniel_Coupal, @kevinadi, @Jason_Tran please forgive me for tagging you. I am forgetting a lot of other people I enjoy reading so do not feel less appreciated if you are not in the list.

Please read the following 2 threads and provide your always insightful input.

For the record, I have been preaching the array approach.

Let the battle begin.

kevinadi · April 27, 2023, 12:45am

Hi @steevej

Thanks for all the contributions you've made to the forum! I see that there are no takers for this yet, so let me start with a small thought of my own.

It’s very difficult to say that one method is superior to another when we’re talking about schema design. As I’m sure you’re aware, in MongoDB the schema design generally follows the use case, and not the other way around as in most SQL data design. There, you first decide how your data will be stored, and later figure out how to query all those connected tables into a single entity that you can then use in your app.

In contrast, MongoDB allows you to store an entity in a single document, which speeds up the query process, and the flexible schema model allows you to store differently shaped documents in a single collection. This gives you the flexibility to change course if the earlier design doesn’t work, or if your requirements change in the future.

What I would advocate, in the most general terms, is to ask: how will the schema help simplify your workflow and, at the same time, allow you to create indexes that make those workloads run faster?

In some cases, arrays are the obvious choice, and in other cases, embedded documents are the way to go. One may find that using arrays simplifies one use case but complicates others (and vice versa), and thus the onus is on the user to determine the right balance that will satisfy all workloads, while still being able to perform well using indexes.

One example is using the attribute pattern when you have varying field names. Since normal indexes in MongoDB require a static field name (not counting wildcard indexes, which are another discussion), it’s difficult to create a good index if your documents vary like that. Thus, using the attribute pattern in an array can be considered. Note that I’m not saying it’s the best thing to do, because it depends on the use case.
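
To make the attribute pattern a bit more concrete, here is a minimal sketch in plain Python (no driver needed) that reshapes a map-style document into the array-of-key/value shape. The helper name `to_attribute_pattern`, the sample product document, and the field names `specs`, `attrs`, `k` and `v` are all made up for illustration; they are not from the threads above.

```python
from typing import Any, Dict


def to_attribute_pattern(
    doc: Dict[str, Any], map_field: str, array_field: str = "attrs"
) -> Dict[str, Any]:
    """Return a copy of doc where doc[map_field] (a map) becomes an array of
    {"k": name, "v": value} sub-documents stored under array_field.

    Hypothetical helper for illustration only.
    """
    # Copy every top-level field except the map we are reshaping.
    converted = {key: value for key, value in doc.items() if key != map_field}
    # Each map entry becomes one array element with fixed field names k and v.
    converted[array_field] = [
        {"k": name, "v": value} for name, value in doc.get(map_field, {}).items()
    ]
    return converted


# Map (embedded document) shape: every attribute name is its own field path
# (specs.color, specs.weight_kg, ...), so a regular index has to name each one.
product_as_map = {
    "_id": 1,
    "name": "kettle",
    "specs": {"color": "red", "weight_kg": 1.2},
}

# Array (attribute pattern) shape: the field paths are always attrs.k and
# attrs.v, so one compound index on those two paths can serve any attribute.
product_as_array = to_attribute_pattern(product_as_map, "specs")
print(product_as_array)
```

With the array shape, a single compound index on `attrs.k` and `attrs.v` can be considered, and a lookup on one attribute would typically match on both `k` and `v` inside the same array element using `$elemMatch`. The trade-off is that reading a single attribute by name becomes less direct than `specs.color` in the map shape, which is the kind of balance described above.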

There are other patterns as well, listed in the "Building with Patterns" series of articles, that may be interesting for some use cases. Ultimately, I feel that MongoDB gives you the flexibility to model your data as you see fit, so your database design can be highly specialized and customized for the workload.

Best regards
Kevin