I'm new to Mongo and have a question about structuring data. Our organization is trying to decide how to partition data into collections. Each piece of data we insert has a property called the “symbol”, and our original approach was to put each “symbol” into its own collection. However, after reading a previous thread on the topic (Maximum number of Collections) and an article (https://www.mongodb.com/article/schema-design-anti-pattern-massive-number-collections/), the sense I got was that this isn’t a great idea, even though Mongo no longer has a hard limit on the number of collections. There’s no limit to the number of “symbols” we may have, so the number of collections could grow without bound, which seems likely to cause performance problems eventually (if my understanding of those links is accurate).
Given that, I have two questions:
- Do the performance issues mentioned in the above links affect real-time queries as well? My understanding from the aforementioned article is that having too many collections slows down startup, but not individual queries (once the Mongo instance has finished its boot process).
- If query performance can be affected by the number of collections, is there any real downside to using only a single collection to store all of our data? We already have sharding configured (based on a different key), so we would essentially just add this “symbol” as a field on each document, make it part of the compound index, and move to using only a single collection. Specifically, is using a single collection likely to cause performance issues due to synchronization/locking if there are multiple writers and readers accessing the collection at once (versus the writers/readers accessing different collections)?
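
For concreteness, here is a rough sketch of the single-collection layout I have in mind. It is only an illustration: the collection name `market_data`, the `ts` field, and the helper functions are placeholders I made up for this post, not our real schema, and the second index field stands in for whatever else is already in our compound index. The `collection` argument is assumed to be something like a pymongo `Collection` (anything exposing `create_index` and `insert_many`).

```python
# Hypothetical names throughout: "market_data", "symbol", and "ts" are
# placeholders for this post, not our actual schema.
COLLECTION_NAME = "market_data"

# Compound index key spec in the (field, direction) list form that
# pymongo's Collection.create_index accepts; 1 means ascending.
INDEX_KEYS = [("symbol", 1), ("ts", 1)]


def make_document(symbol, ts, payload):
    """Tag a record with its symbol instead of routing it to a per-symbol collection."""
    doc = dict(payload)  # copy so the caller's dict is not mutated
    doc["symbol"] = symbol
    doc["ts"] = ts
    return doc


def symbol_range_filter(symbol, start, end):
    """Query filter shaped to match the index: equality on symbol, range on ts."""
    return {"symbol": symbol, "ts": {"$gte": start, "$lt": end}}


def setup_and_insert(collection, docs):
    """Create the compound index, then insert all documents into the one collection."""
    collection.create_index(INDEX_KEYS)
    collection.insert_many(docs)
```

So instead of `db["AAPL"].find(...)` and `db["MSFT"].find(...)`, every reader and writer would go through the same collection with a filter like `symbol_range_filter("AAPL", start, end)`, which is where my locking concern comes from.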