Poor driver performance, how to optimize?

Hi,

We are comparing MongoDB vs our current SQL database solution.
One thing we noticed is that in the C# driver when we want to retrieve around 1 million documents it’s super slow versus de SQL version.

Some background information: we have a collection containing 70 million documents. One query we are currently testing is to retrieve some documents based on a filter. When that result is retrieved, it takes over 1 minute before the list is filled. In SQL that same query and table takes 15 seconds to fill the same list.

Is there a way to optimize this?

Have you performed your comparison on the same hardware?

Have you performed your comparison on cold or hot server? Filling the cache with data from disk might be slow so if SQL was hot and Mongo is cold then you are comparing apples and oranges.

Do you have indexes that support your query?

I think that a use-case that needs to download 1 million documents/rows should be redesigned. For example, if you download 1 million documents to do some calculations, then you should redesign so that the calculations are done on the server with the aggregation framework.

Note that being schema less mongo needs to send field names over the wire. With 1M documents you might have hit a bandwidth bottleneck. One way to optimize is to $project only the fields needed for the computation.

Also blindly copy one SQL table to one MongoDB collection fails to tap into the flexible nature of JSON documents. For example, if rows/docs with dates, you could use the bucketing pattern and group all rows/docs with same date with a single document. This will reduce the total number of documents and data duplication. Your 1_000_000 docs might end up being only 100_000.

That is it for now. I hope others will join the conversation.

Hi Steeve,

Thanks for your inisghts. Let me indeed provide some more info:
Yes, comparison is done on the same hardware. Also on the same exact server so I don’t think hot or cold servers is the issue. We have indeed an index that supports the query, both on SQL as in Mongodb.

I also think you’re right about the use case, that it’s bad design. In this case it was merely a benchmark.

Hopefully others will chime in as well :slight_smile:

1 Like