The Best Structure to Have in Your Application


We’re building a software tracking app that users can use on their landers to monitor traffic sources and performance for their funnels.

We discussed whether to have a collection for each user of the app or a single collection that contains the data of all users. Our main concerns are the following:

  1. If we make a collection for each user, the data will be better organized, since each collection will hold that user's statistics for each day of the month. The user can query the statistics for a range, a specific day, or a year, but that way we need to create many collections if we have a large number of users.

  2. If we make a single collection that contains the user statistics, then all the users’ data will be in one collection. To query something for a specific user, we have to run heavier queries each time, and the data will not be organized per user.

Accordingly, what is the best model for this type of functionality, especially since we need to filter data by time range for a specific user and domain, and we need to store user information, tokens, domains, etc.?

I was thinking that having a collection for each user is the best option we have, so we can have a user info document, and a document for the page statistics on each day of the month.

Please let us know what you think and what recommendations or special ideas you have for us as beginners.

Thanks a lot!

Hi @Hazem_Alkhatib ,

The problem with a collection per user is that you will probably hit a known anti-pattern called “Too many collections”.

The problem is that with so many collections, the MongoDB server uses more resources to manage all the on-disk files, which means more indexes and more storage consumption.

If you store the data in a single collection, you can use indexes to filter your queries efficiently, which reduces the effort of accessing a specific user's data. For example, indexing:

{ domain : 1, userName : 1, date : 1}

will allow you to perform time-based queries efficiently if the query is:

db.userData.find({ domain: "xxx", userName: "yyy", date: { $gt: ISODate(...), $lt: ISODate(...) } })

This query will use the index and scan only the matching entries, as if the data were in “its own collection”.
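As a rough sketch of the query above, the filter can be built by a small helper so that every query hits the same index fields in the same order. The helper name `buildRangeFilter`, the sample values, and the half-open range (`$gte`/`$lt` instead of `$gt`/`$lt`) are my assumptions for illustration, not something your app has to use:

```javascript
// Hypothetical helper: builds the filter for one user's data on one domain
// between two dates. Field names follow the index { domain: 1, userName: 1, date: 1 }.
function buildRangeFilter(domain, userName, from, to) {
  if (!(from instanceof Date) || !(to instanceof Date) || from >= to) {
    throw new RangeError("from and to must be Dates with from < to");
  }
  return {
    domain,                        // equality match on the first index field
    userName,                      // equality match on the second index field
    date: { $gte: from, $lt: to }  // range on the last index field
  };
}

// Sample values are placeholders.
const filter = buildRangeFilter(
  "example.com",
  "alice",
  new Date("2024-01-01T00:00:00Z"),
  new Date("2024-02-01T00:00:00Z")
);

// In mongosh (collection name userData taken from the query above):
//   db.userData.createIndex({ domain: 1, userName: 1, date: 1 })
//   db.userData.find(filter)
//   db.userData.find(filter).explain("executionStats")  // look for an IXSCAN stage
```

Putting the two equality fields first and the range field last is what lets the server narrow the scan to one user's slice of the index.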

If that still won’t work for you in terms of data management you can always:

  1. Use sharding to scale the collection with a shard key like: {domain : 1, userName : 1}
  2. Partition the collection by date, for example by creating monthly or yearly collections.
  3. Consider using a time series collection if the data is clustered by time and mostly analysed by time window.
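Here is a minimal sketch of what these options could look like. The database name `tracking`, the collection names, the `source` meta field, and the helper `monthlyCollectionName` are all hypothetical, so adjust them to your own schema:

```javascript
// Option 2 sketch: pick a monthly collection name from a document's date (UTC).
function monthlyCollectionName(base, date) {
  const year = date.getUTCFullYear();
  const month = String(date.getUTCMonth() + 1).padStart(2, "0");
  return `${base}_${year}_${month}`;
}

// Option 3 sketch: options for a time series collection (MongoDB 5.0 or later).
const timeSeriesOptions = {
  timeseries: {
    timeField: "date",     // the timestamp field of each measurement
    metaField: "source",   // hypothetical sub-document, e.g. { domain, userName }
    granularity: "hours"
  }
};

// In mongosh (names are placeholders):
//   sh.shardCollection("tracking.userData", { domain: 1, userName: 1 })            // option 1
//   db.getCollection(monthlyCollectionName("userData", new Date())).insertOne(doc) // option 2
//   db.createCollection("userStats", timeSeriesOptions)                            // option 3
```

With monthly collections, a query that spans several months has to read from more than one collection, so that option fits best when most queries stay inside one period.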

Let me know if that answers the question.

