Problem: query targeting scanned objects / returned above 1000

Hello,
I have been working with a few datasets, each containing around 3-6 million documents. In production, I generate charts from aggregations over the data stored in the database: the MongoDB aggregation framework does all the aggregation and produces the necessary numbers, and ChartJS displays the charts in my web app.

I have tried to use as many indexes as possible on each collection, but there are at least 11 filtering options for each of those datasets, and they can be used in any combination. Many of them are numeric range fields, while around half are categorical. Moreover, users can query by dragging the map, which filters on the Lat-Lon pairs available in the documents.

Basically, I use $facet, together with a $match that filters the data according to the frontend inputs, to perform all the aggregations needed for the chart data in a single pass. Since I started doing that, I have been getting this warning almost daily. Before using $facet, the process was to fetch all the filtered data after the $match stage, which was very slow because many of those queries pulled 300-400 MB of documents to my Python server.

How can I better optimize my pipeline or update my configuration so that all the aggregation is done at the database level?
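For readers following along, the pipeline shape described above might look roughly like the sketch below. It is not the actual production pipeline: the field names (`category`, `price`, `lat`, `lon`), the filter keys, and the facet names are all hypothetical, and the function only builds the pipeline as plain Python dictionaries. One point worth keeping in mind is that only the leading `$match` can use indexes; the sub-pipelines inside `$facet` cannot.

```python
from typing import Any


def build_chart_pipeline(filters: dict[str, Any]) -> list[dict[str, Any]]:
    """Build a $match + $facet pipeline from frontend filter inputs.

    All field names and filter keys here are hypothetical placeholders.
    """
    match: dict[str, Any] = {}

    # Categorical filter (hypothetical field "category").
    if filters.get("categories"):
        match["category"] = {"$in": list(filters["categories"])}

    # Numeric range filter (hypothetical field "price").
    price_range: dict[str, Any] = {}
    if filters.get("price_min") is not None:
        price_range["$gte"] = filters["price_min"]
    if filters.get("price_max") is not None:
        price_range["$lte"] = filters["price_max"]
    if price_range:
        match["price"] = price_range

    # Map bounding box from dragging the map (hypothetical "lat"/"lon" fields).
    bbox = filters.get("bbox")
    if bbox:
        match["lat"] = {"$gte": bbox["south"], "$lte": bbox["north"]}
        match["lon"] = {"$gte": bbox["west"], "$lte": bbox["east"]}

    return [
        # Filter first, so that indexes can be used before any aggregation.
        {"$match": match},
        # Compute every chart's numbers in one pass over the filtered data.
        {"$facet": {
            "by_category": [
                {"$group": {"_id": "$category", "count": {"$sum": 1}}},
            ],
            "price_stats": [
                {"$group": {"_id": None,
                            "avg": {"$avg": "$price"},
                            "max": {"$max": "$price"}}},
            ],
        }},
    ]
```

With PyMongo, the result would be passed to `collection.aggregate(...)`, which returns a single small document holding all chart data instead of hundreds of megabytes of raw documents.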

Hi @Md_Abdur_Rakib ,

First, this sounds like an Atlas alert. If so, why don’t you use Atlas Charts to generate the charts?

Second, we will need the aggregation pipeline used, a sample document, the execution plan, and the available indexes to answer the question better…

The warning by itself does not indicate a problem, but it might if you see a performance impact. Do you? Have you confirmed in the logs that it is the same query that has those large scanned vs. returned ratios?
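As a rough illustration of the ratio this alert refers to, the sketch below computes documents scanned per document returned from an `executionStats` section of an explain output. The helper name is hypothetical, and where `executionStats` sits inside the explain result varies between `find` and `aggregate`, so it is passed in directly here. A pipeline ending in `$group` or `$facet` returns very few documents by design, which is a likely reason for a high ratio even when responses are fast.

```python
def scanned_to_returned_ratio(execution_stats: dict) -> float:
    """Return documents examined per document returned.

    `execution_stats` is assumed to be the "executionStats" section of an
    explain output, which reports "totalDocsExamined" and "nReturned".
    """
    examined = execution_stats.get("totalDocsExamined", 0)
    returned = execution_stats.get("nReturned", 0)
    # Guard against division by zero when nothing was returned.
    return examined / max(returned, 1)
```

For example, a query that examines 5000 documents and returns 4 has a ratio of 1250, which is above the alert threshold of 1000.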

Best
Pavel

@Pavel_Duchovny, I am not the owner of the project; I am just the backend developer responsible for it. Because of GDPR restrictions, I don’t have access to the profiler. Besides, I am not seeing any performance impact; responses come back within 200-500 milliseconds. So I guess this should not be a problem.

One thing I want to ask: can I fetch the Atlas charts and show them in third-party applications? If so, can you please point me to the documentation? And is the JS code for them customizable to fit the application theme? Thanks in advance.

Hi @Md_Abdur_Rakib ,

Yes, Atlas Charts can be embedded via a JS SDK or an iframe:

Look here and configure as needed. There are multiple ways to embed, and it’s easy and very powerful…

Regards
Pavel

@Pavel_Duchovny, thanks a lot, great stuff. I didn’t know it was this easy.
