Please bear with me,
I forgot to mention a fifth layer, whose name is easily confused: the server itself, by which I mean the host PC. So there is a real or virtual host machine, and there are database server programs running on it. We can run just one mongod (the program) instance on a single host PC, or as many as we need. The same holds if we zoom out: a data center can have a single powerful data-hosting PC, or as many as needed.
The reason I bring up this layered architecture is to help decide where exactly a single document should live. If your design of keeping all client data in a single collection proves hard to implement, consider having a collection for each user, which sidesteps the indexing problem. If a single client’s data grows beyond a certain size, consider giving that client a database of their own (a database, not the server; the naming can be confusing here too). And if you need to serve even more data, you get the idea: go one layer up.
With any of these options, if you design carefully, your clients will not notice which one you chose. They would not even notice if you switched from MongoDB to some other SQL or NoSQL database server. In fact, you can use several of them together so that each covers the others' weak spots. None of this will be noticeable to clients if your design is good.
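To make the "clients would not notice" point concrete, here is a minimal sketch. Everything in it is hypothetical: `DocumentStore`, `SharedCollectionStore`, `PerClientStore` and `demo` are names I made up, and plain Python lists stand in for real collections. The idea is only that the rest of the application talks to one small interface, so the layout behind it can change.

```python
from typing import Protocol


class DocumentStore(Protocol):
    """Hypothetical interface the rest of the application talks to."""

    def save(self, client_id: str, doc: dict) -> None: ...

    def find(self, client_id: str, field: str, value: object) -> list[dict]: ...


class SharedCollectionStore:
    """Layout A: every client's documents in one list, tagged with client_id."""

    def __init__(self) -> None:
        self._docs: list[dict] = []

    def save(self, client_id: str, doc: dict) -> None:
        self._docs.append({"client_id": client_id, **doc})

    def find(self, client_id: str, field: str, value: object) -> list[dict]:
        # Filter by owner first, then hide the internal tag from the caller.
        return [
            {k: v for k, v in d.items() if k != "client_id"}
            for d in self._docs
            if d["client_id"] == client_id and d.get(field) == value
        ]


class PerClientStore:
    """Layout B: one list per client (think "a collection per client")."""

    def __init__(self) -> None:
        self._by_client: dict[str, list[dict]] = {}

    def save(self, client_id: str, doc: dict) -> None:
        self._by_client.setdefault(client_id, []).append(dict(doc))

    def find(self, client_id: str, field: str, value: object) -> list[dict]:
        return [dict(d) for d in self._by_client.get(client_id, []) if d.get(field) == value]


def demo(store: DocumentStore) -> list[dict]:
    """Calling code: identical no matter which layout is plugged in."""
    store.save("acme", {"name": "Rex", "kind": "dog"})
    store.save("globex", {"name": "Tom", "kind": "cat"})
    return store.find("acme", "kind", "dog")
```

Both `demo(SharedCollectionStore())` and `demo(PerClientStore())` give the caller the same answer, which is the property you want to keep when you later move a client one layer up.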
Now back to your “document”-focused design. If you take an anything-goes approach, the indexes will become bloated and performance will degrade. I think you need to put some thought into field names and types to guide your clients. For example, try to prevent them from naming a field “my_age” and then entering their pet’s name (a string) as the value.
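One way to guide clients like this is MongoDB's schema validation with `$jsonSchema`. Below is a rough sketch, not a finished design: the field `pet_name` and the collection name `client_docs` are made up for illustration, and `type_errors` is only a simplified local stand-in I wrote so the idea can be tried without a server (it checks required fields and types, nothing else).

```python
# Validator document in MongoDB's $jsonSchema format.
# "my_age" comes from the example above; "pet_name" is a made-up second field.
validator = {
    "$jsonSchema": {
        "bsonType": "object",
        "required": ["my_age"],
        "properties": {
            "my_age": {"bsonType": "int"},
            "pet_name": {"bsonType": "string"},
        },
    }
}

# With PyMongo and a running server, this would be applied roughly as:
#   db.create_collection("client_docs", validator=validator)
# ("client_docs" is a hypothetical collection name.)

# Simplified local stand-in for the server-side check, for illustration only.
_PY_TYPES = {"int": int, "string": str}


def type_errors(doc: dict, schema: dict = validator["$jsonSchema"]) -> list[str]:
    """Return a list of problems: missing required fields and wrong types."""
    errors = [f"missing {name}" for name in schema.get("required", []) if name not in doc]
    for name, rule in schema["properties"].items():
        if name in doc:
            value = doc[name]
            expected = _PY_TYPES[rule["bsonType"]]
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, expected):
                errors.append(f"{name} must be {rule['bsonType']}")
    return errors
```

Here `type_errors({"my_age": "Fluffy"})` reports that `my_age` must be an int, while `type_errors({"my_age": 30, "pet_name": "Fluffy"})` returns an empty list.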
The second approach from your first post has an advantage over the other: you can have a “search index” over the key field. But you still have to deal with the array structure.
The “third option” you asked about, along with the first and second, needs more time to discuss than we have here, because this will be the heart of your design. Considering “100s of thousands” of documents per client, I would go up the layers and settle on a collection or database per client. Think of it as a folder tree structure: a folder per client. You will need to implement functions to switch between “folders” for each client, but the logic you end up with for the “document” does not need to change, and each client will be much easier to administer, with better performance as well.
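The "switch between folders" function can be very small. This is a sketch under my own assumptions: the `client_` prefix, the function names, and the rule of replacing anything outside letters, digits and underscore are all made up, chosen conservatively so the result stays a valid collection name.

```python
import re


def collection_name_for(client_id: str, prefix: str = "client_") -> str:
    """Map a client id to a collection name that uses only safe characters."""
    safe = re.sub(r"[^A-Za-z0-9_]", "_", client_id)
    if not safe:
        raise ValueError("client_id must not be empty")
    return prefix + safe


def collection_for(db, client_id: str):
    """Return the client's "folder".

    `db` is anything that can be indexed by name; a PyMongo Database works
    this way (db[name] gives a Collection), and so does a plain dict in tests.
    """
    return db[collection_name_for(client_id)]
```

The document logic then just receives whatever `collection_for(db, client_id)` returns and never needs to know which folder it is in. One caveat with this naive mapping: two different ids such as `a.b` and `a_b` end up with the same name, so it is safer to use ids that are already plain letters and digits. For a database per client, the same idea applies one layer up, indexing the client object instead of the database.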
By the way, sorry for the long read. Model design sounds easy, but it might be the hardest part because there are so many things to consider. But again, as long as you keep backups of your data, you can craft a whole new model and apply it without clients ever noticing. So decide on one model and start developing, so you can actually test things along the way if you prefer hands-on experience.