Hello devs, we have a use case where we insert more than 20,000 documents (each small, under 1 kB) into the same collection within a single transaction. The setup is a single-node replica set.
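Roughly, the insert path looks like the sketch below (Python/pymongo; the URI, database/collection names, and document contents are placeholders for illustration):

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")
coll = client["appdb"]["events"]  # hypothetical collection

# ~20,000 small documents, each well under 1 kB
docs = [{"seq": i, "payload": "x" * 500} for i in range(20_000)]

def insert_all(session):
    # A single insert_many bound to the session puts all
    # documents into one transaction
    coll.insert_many(docs, session=session)

with client.start_session() as session:
    # with_transaction commits on success, aborts on any exception,
    # and retries transient transaction errors
    session.with_transaction(insert_all)
```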
We made the following observations:
- If the cache is small, mongotop shows that insert activity stalls once the default dirty-data eviction trigger (20% of the WiredTiger cache size) is reached, and MongoDB consumes a lot of CPU. Sometimes this finishes after a few minutes, sometimes it runs for a long time. During that time almost no other DB queries are handled, so the system is effectively blocked.
- If the cache size is large enough, the transaction runs smoothly in a few seconds.
- As written in https://www.mongodb.com/blog/post/performance-best-practices-transactions-and-read--write-concerns, transactions touching more than 1,000 documents should be avoided, but we want to guarantee DB consistency. Are we overlooking a way to keep the insert consistent while splitting it into multiple transactions (and without implementing a manual rollback)? A rough sketch of the kind of pattern we have in mind follows this list.
- The required portion of the cache seems to be significantly larger than the data being inserted, especially if the collection already contains data. Is this related to index size? Do you have any idea how to calculate the required cache size beforehand? We could then grow the cache or reject the insert up front (the second sketch after this list shows how we inspect and resize the cache at runtime).
- Do you have any other ideas on how to handle this use case?
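To make the multi-transaction question concrete: the kind of pattern we are wondering about is tagging every document with a batch id, inserting in plain chunks, and making the whole batch visible with one final marker write, so readers never observe a half-inserted batch. A minimal sketch, assuming a hypothetical `committed_batches` marker collection and that all readers filter on committed batch ids:

```python
from bson import ObjectId
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")
db = client["appdb"]
coll = db["events"]
batches = db["committed_batches"]  # hypothetical marker collection

def insert_in_batches(docs, chunk_size=1000):
    batch_id = ObjectId()
    # Phase 1: ordinary chunked inserts, each document tagged with the
    # batch id; readers ignore ids that have no marker document yet
    for i in range(0, len(docs), chunk_size):
        chunk = [dict(d, batchId=batch_id) for d in docs[i:i + chunk_size]]
        coll.insert_many(chunk)
    # Phase 2: one single-document write makes the batch visible atomically
    batches.insert_one({"_id": batch_id})
    return batch_id

# If the process dies before phase 2, the orphaned documents can be
# cleaned up lazily instead of rolled back synchronously:
#   committed = [m["_id"] for m in batches.find()]
#   coll.delete_many({"batchId": {"$nin": committed}})
```

The obvious cost is that every read path has to be aware of the batchId filter, which is what we hoped to avoid by using a single transaction.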
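And for the cache-size question, this is roughly how we inspect the WiredTiger cache today and how the cache could be resized at runtime for an experiment (statistic names as they appear in serverStatus output; the 4 GB value is just an example):

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")

# WiredTiger cache statistics; the dirty-bytes counter is what climbs
# toward the 20% eviction trigger while the transaction is open
cache = client.admin.command("serverStatus")["wiredTiger"]["cache"]
for key in (
    "maximum bytes configured",
    "bytes currently in the cache",
    "tracked dirty bytes in the cache",
):
    print(key, "=", cache.get(key))

# Resize the cache without restarting mongod (affects only the running
# instance, not the storage.wiredTiger settings in the config file)
client.admin.command(
    "setParameter", 1, wiredTigerEngineRuntimeConfig="cache_size=4G"
)
```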