I’ve been making a game replay system that saves player actions every 100 ms as a JSON file. I’m using Java for it so I can convert the JSON to a map or text without delay. The average size of a JSON file is 8 MB and the maximum is 15 MB. I used to upload the JSON file to a remote server and download it again whenever a player wanted to watch the replay, but the traffic and bandwidth cost is bad for me. So I’ve decided to use a database instead. I tried Redis, but it isn’t the right fit for this. At the end of the day, the only database I have remaining is MongoDB.
Is 15 MB per document bad for MongoDB?
I often push and pull 15 MB documents to and from MongoDB. Could that cause a slowdown on an average system? As an example, 10 documents pushed/pulled every 10 minutes.
After a while there will be a lot of documents that are over 10 MB each. Could that be a problem for me in the future?
I’d like to use a MongoDB timer for documents. I want to kill each document after 48 hours. Is it a bad idea to use a MongoDB timer on each document?
NOTE: The dedicated server and the Mongo server are in the same location, on the same machine. You can think of the dedicated server’s specs as an average system.
The maximum document size helps ensure that a single document cannot use an excessive amount of RAM or, during transmission, an excessive amount of bandwidth. To store documents larger than the maximum size, MongoDB provides the GridFS API. See mongofiles and the documentation for your driver for more information about GridFS.
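To make the GridFS option concrete, here is a minimal sketch using the sync Java driver (mongodb-driver-sync). The connection string, database name, and bucket name are placeholders, and the payload is a stand-in for a real replay file; GridFS transparently splits the file into 255 KB chunk documents, so the 16 MB document limit no longer applies:

```java
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoDatabase;
import com.mongodb.client.gridfs.GridFSBucket;
import com.mongodb.client.gridfs.GridFSBuckets;
import org.bson.types.ObjectId;

import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class ReplayGridFs {
    public static void main(String[] args) throws Exception {
        // Placeholder connection string; adjust for your deployment.
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoDatabase db = client.getDatabase("game");
            // "replays" is a hypothetical bucket name for this sketch.
            GridFSBucket bucket = GridFSBuckets.create(db, "replays");

            // Stand-in for an 8–15 MB replay JSON payload.
            byte[] json = "{\"actions\":[]}".getBytes(StandardCharsets.UTF_8);
            try (InputStream in = new ByteArrayInputStream(json)) {
                // GridFS stores the stream as chunk documents plus a files entry.
                ObjectId fileId = bucket.uploadFromStream("match-123.json", in);
                System.out.println("Stored replay with id " + fileId);
            }
        }
    }
}
```

Reading the replay back is the mirror image via `bucket.downloadToStream(fileId, outputStream)`.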
So no, 15 MB is fine, but you are close to the 16 MB limit.
This depends on the amount of memory in your cluster/system. You want to make sure that your working set and indexes fit into the amount of RAM you have. But since you said the database is on the same machine, you would also need to deduct the amount of memory your application is using, I think.
Do you mean having a large database? This can be solved by sharding with MongoDB, which lets you split a collection (or collections) across multiple replica sets in the cluster.
You can use a TTL index on any collection where you would like Mongo to expire documents. So Mongo has you covered there!
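A TTL index for the 48-hour expiry might look like the following sketch with the sync Java driver; the collection and field names (`replays`, `createdAt`) are assumptions for illustration. The TTL monitor runs in the background roughly once a minute, so documents are removed shortly after, not exactly at, the 48-hour mark:

```java
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.IndexOptions;
import com.mongodb.client.model.Indexes;
import org.bson.Document;

import java.util.Date;
import java.util.concurrent.TimeUnit;

public class ReplayTtl {
    public static void main(String[] args) {
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoCollection<Document> replays =
                client.getDatabase("game").getCollection("replays");

            // TTL index: expire each document ~48 hours after its createdAt value.
            replays.createIndex(Indexes.ascending("createdAt"),
                new IndexOptions().expireAfter(48L, TimeUnit.HOURS));

            // Every replay document needs a BSON date field for the TTL to apply.
            replays.insertOne(new Document("createdAt", new Date())
                .append("actions", "..."));
        }
    }
}
```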
As @Natac13 noted, you can store up to 16 MB per document. However, I’d advise against doing so without considering the practical impact. Large documents occupy more memory in your WiredTiger cache, and any update to a large document still involves loading the full document into the cache.
If you frequently access or update only a small portion of a large document, I would reconsider your data model, as you may be able to make more efficient use of RAM and I/O.
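One hypothetical way to remodel a replay along those lines is to store one small document per time slice instead of a single 15 MB blob, so playback only loads the segments it needs. The collection and field names (`replaySegments`, `replayId`, `segment`, `actions`) are illustrative, not a prescribed schema:

```java
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.Sorts;
import org.bson.Document;

import java.util.List;

public class ChunkedReplay {
    public static void main(String[] args) {
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoCollection<Document> segments =
                client.getDatabase("game").getCollection("replaySegments");

            // Write one small document per time slice of the replay.
            segments.insertOne(new Document("replayId", "match-123")
                .append("segment", 0)
                .append("actions", List.of("move", "jump")));

            // Playback streams segments in order instead of loading one huge doc.
            for (Document d : segments.find(Filters.eq("replayId", "match-123"))
                                      .sort(Sorts.ascending("segment"))) {
                System.out.println(d.toJson());
            }
        }
    }
}
```

An index on `{replayId: 1, segment: 1}` would keep those reads cheap.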
You’ll have to test this with your own deployment, but you can provision appropriately for your workload. 150 MB of data every 10 minutes isn’t much, but if that is 150 MB for each client of your game, the usage will scale up quickly.
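The back-of-the-envelope math behind that warning can be sketched as plain arithmetic (these are estimates from the numbers in this thread, not benchmarks): 10 documents of up to 15 MB every 10 minutes is 150 MB per window, and with a 48-hour TTL each client keeps roughly 42 GB of replay data alive at once in the worst case:

```java
public class ReplayLoadEstimate {
    // Bytes written per 10-minute window, from the figures in this thread.
    static long bytesPerWindow(int docsPerWindow, long docSizeMb) {
        return docsPerWindow * docSizeMb * 1024L * 1024L;
    }

    public static void main(String[] args) {
        long perClient = bytesPerWindow(10, 15);          // 150 MB per 10 minutes
        long perHour = perClient * 6;                     // 900 MB per hour
        long per48h = perHour * 48;                       // data alive under a 48h TTL
        System.out.println(perClient / (1024 * 1024) + " MB per 10 min");
        System.out.println(per48h / (1024L * 1024 * 1024) + " GB live per client over 48h");
    }
}
```

Multiply that by the number of concurrent clients to size RAM and disk.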
I think this is similar to the previous two questions: large documents may require provisioning more server resources, but if this is the best fit for your use case, you can scale up server resources and eventually grow into a sharded deployment if appropriate.