Will many (10,000+) databases in a MongoDB replica set impact performance in the long term?

I am developing a POS (Point of Sale) system, which currently utilizes numerous databases in a replica set, with each database representing a unique customer (1 customer per database). My question concerns any potential limitations of MongoDB Community Edition regarding the creation of a large number of databases. Specifically, I am looking at scenarios where there could be over 10,000 databases in a single replica set. While I am considering distributing customers across different replica sets using zoning or package separation solutions to prevent having too many databases in one replica set, it is crucial for me to understand the maximum number of databases that MongoDB Community Edition can support. Any feedback or suggestions would be greatly appreciated.
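To get a feel for the scale involved, here is a rough back-of-envelope sketch of how many on-disk files WiredTiger would keep open, since it stores one file per collection and one per index. The per-customer figures below (collections per database, indexes per collection) are hypothetical assumptions, not numbers from my actual schema:

```python
def estimate_wiredtiger_files(num_databases: int,
                              collections_per_db: int,
                              indexes_per_collection: int) -> int:
    """Rough estimate: WiredTiger keeps one file per collection
    and one file per index, so each collection contributes
    (1 + indexes_per_collection) files."""
    files_per_db = collections_per_db * (1 + indexes_per_collection)
    return num_databases * files_per_db

# Hypothetical example: 10,000 customer databases, 20 collections each,
# 3 indexes per collection.
print(estimate_wiredtiger_files(10_000, 20, 3))  # 800000
```

Even with modest per-customer schemas, the file count grows linearly with the number of databases, which is why I am worried about the long-term impact.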



Thank you for sharing the post. It highlights the importance of avoiding the creation of unnecessary collections and indexes, which can lead to excessive RAM consumption. In my scenario, I have essential columns and indexes and am prepared to upgrade my hardware to manage the load. However, I’m interested in understanding the practical limitations, particularly if managing over 10,000 databases in a single replica set is feasible. This information is crucial for planning when to expand my database to an additional replica set. Thanks.

Verbatim from the link I posted:

Additionally, the WiredTiger storage engine (MongoDB’s default storage engine) stores a file for each collection and a file for each index. WiredTiger will open all files upon startup, so performance will decrease when an excessive number of collections and indexes exist.

In general, we recommend limiting collections to 10,000 per replica set. When users begin exceeding 10,000 collections, they typically see decreases in performance.

I cannot say more.


Thank you very much for the insights. I’m now considering the distribution of my replica sets, although it presents a challenge due to the large number of customers (over 5,000) in my system, each mapped to an individual database. This approach results in a high number of collections within a single replica set. I’m contemplating dividing this single replica set into multiple ones, potentially more than 100. However, I’m uncertain about the effectiveness of this design strategy. Do you have any recommendations or advice on this matter?

What do you mean by this? I don’t know how a replica set can be “divided”. Are you talking about sharding?

I intend to continue using the replica set model, but with a distribution strategy where my databases are split across multiple replica sets. For instance, replica set A (PSS) will contain only 1,000 databases, and similarly, set B will also hold 1,000 databases, and so on. Alongside these, there will be a main replica set dedicated exclusively to storing core data, such as customer information.
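As a minimal sketch of what I mean, the application tier could map each customer to a fixed replica set at connection time. Everything here is hypothetical: the URIs, the replica set names (rsA, rsB), and the 1,000-databases-per-replica-set cap are placeholder assumptions, not my real topology:

```python
# Assumption: each replica set holds at most 1,000 customer databases,
# and customers are assigned sequentially.
DATABASES_PER_REPLICA_SET = 1_000

# Placeholder connection strings -- substitute real hosts and set names.
REPLICA_SET_URIS = [
    "mongodb://a1,a2,a3/?replicaSet=rsA",
    "mongodb://b1,b2,b3/?replicaSet=rsB",
]

def replica_set_for_customer(customer_index: int) -> str:
    """Map a sequential customer index to its replica set URI."""
    shard = customer_index // DATABASES_PER_REPLICA_SET
    if shard >= len(REPLICA_SET_URIS):
        raise ValueError("no replica set provisioned for this customer yet")
    return REPLICA_SET_URIS[shard]

# Customers 0..999 land on rsA, 1000..1999 on rsB.
print(replica_set_for_customer(1_500))  # mongodb://b1,b2,b3/?replicaSet=rsB
```

The lookup table of customer-to-replica-set assignments would itself live in the main replica set alongside the core customer data.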

The issue at hand is that, regardless of the current arrangement, the number of databases per replica set will continue to increase over time. However, I don’t know which distribution strategy is considered best practice.
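One idea for handling that growth is a simple capacity check at customer onboarding, based on the 10,000-collections-per-replica-set guideline quoted earlier in the thread. The collections-per-customer figure is a hypothetical assumption:

```python
# Guideline quoted earlier in this thread: keep a replica set
# under roughly 10,000 collections.
COLLECTION_LIMIT_PER_REPLICA_SET = 10_000

def needs_new_replica_set(current_collections: int,
                          collections_per_new_customer: int) -> bool:
    """True if onboarding one more customer would push the replica set
    past the collection guideline, i.e. it is time to provision a new one."""
    projected = current_collections + collections_per_new_customer
    return projected > COLLECTION_LIMIT_PER_REPLICA_SET

# Hypothetical example: 9,990 collections already, 20 per new customer.
print(needs_new_replica_set(9_990, 20))  # True
```

This keeps the provisioning decision explicit instead of discovering the limit after performance has already degraded.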

You worry too much about having the best strategy at this point. Continuous improvement is better than delayed perfection - Mark Twain.


Thank you very much. I will keep improving.