I need help with the "Query Targeting: Scanned Objects / Returned has gone above 1000" alert

Hello.
I’m using a paid-tier Atlas replica set (Primary-Secondary-Secondary).

Every hour I create a new collection and insert about 1.5 to 2 million documents.

Every time I run the insert and check the Atlas cluster metrics, the primary’s query targeting is unchanged, while the secondaries’ query targeting rises rapidly.

As a result, the Atlas alert fires every hour, which is very noisy and drowns out alerts for genuinely dangerous collection scans (COLLSCAN).

My application reads with readPreference=secondary, so it is difficult to simply disable the alert.

I’d appreciate opinions on how this can happen.

Below is the metric information that I checked.

Hi @gyus and welcome to the MongoDB Community :muscle: !

This alert’s threshold is set quite low by default, and I often raise it in my own alerts so it doesn’t trigger as often.
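
For context, the metric is essentially the ratio of documents examined to documents returned per query. You can check it for a given query with explain; here is a small sketch with pymongo (untested, with a placeholder connection string and hypothetical database/collection/filter names):

```python
from pymongo import MongoClient

client = MongoClient("mongodb+srv://...")  # placeholder connection string

# Run the query through explain with executionStats to compare how many
# documents were examined versus how many were returned.
stats = client.mydb.command(
    "explain",
    {"find": "mycoll", "filter": {"status": "active"}},  # hypothetical query
    verbosity="executionStats",
)["executionStats"]

# e.g. a collection scan that examines 2,000,000 docs to return 1,000
# has a ratio of 2,000 and trips the default alert threshold of 1,000.
ratio = stats["totalDocsExamined"] / max(stats["nReturned"], 1)
print(stats["totalDocsExamined"], stats["nReturned"], ratio)
```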

That being said, why are you writing 1.5M+ docs into MongoDB and then immediately reading them all back from the cluster? Why not write them to MongoDB and also keep them in memory for processing?

Also, why are you reading with readPreference=secondary? Your secondaries are now doing more work than your primary: they perform the same write operations as the primary (replication), plus all the reads on top. Why not read the data directly from the primary?
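
For illustration, pointing those reads back at the primary is just a read preference change; a sketch with pymongo (placeholder names):

```python
from pymongo import MongoClient, ReadPreference

client = MongoClient("mongodb+srv://...")  # placeholder connection string

# What you have now: every read is routed to a secondary, which is
# already busy replicating the hourly bulk insert.
coll_secondary = client.mydb.get_collection(
    "mycoll", read_preference=ReadPreference.SECONDARY
)

# Default behaviour: reads go to the primary, so the secondaries only
# have to keep up with replication.
coll_primary = client.mydb["mycoll"]
```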

If you still want the data as soon as you write it, maybe you could use a Change Stream filtered on insert operations instead of a collection scan?
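
Something like this, for example (an untested sketch with pymongo and placeholder names; the process() handler is hypothetical):

```python
from pymongo import MongoClient

client = MongoClient("mongodb+srv://...")  # placeholder connection string
coll = client.mydb["mycoll"]  # hypothetical database/collection names

# Only surface insert events, so the consumer receives each new document
# as it is written instead of re-reading the whole collection afterwards.
pipeline = [{"$match": {"operationType": "insert"}}]

with coll.watch(pipeline) as stream:
    for change in stream:
        doc = change["fullDocument"]  # populated by default for inserts
        process(doc)  # hypothetical handler for each new document
```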

Cheers,
Maxime.

Hello @MaBeuLux88.
The reason I read from the secondaries is load balancing.

Apart from this, I’ve done some tests and found the cause.

In my case, it is caused by an Atlas Search index that exists on a different database or collection within the same cluster.

When I created a new cluster with nothing in it and inserted 1.5 million documents, the secondaries’ query targeting did not rise.
When an Atlas Search index exists on a completely different database and collection (not the collection receiving the inserts), the secondaries’ query targeting rises sharply during the same 1.5-million-document insert.
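
Roughly, my test looked like this (a simplified sketch rather than my exact code; it assumes pymongo 4.5+, which added create_search_index, and hypothetical names):

```python
from pymongo import MongoClient

client = MongoClient("mongodb+srv://...")  # placeholder connection string
target = client.ingest_db["hourly_batch"]  # hypothetical hourly collection

# Case 1: fresh cluster with no search index anywhere. This bulk insert
# alone did not move the secondaries' query targeting.
target.insert_many({"seq": i} for i in range(1_500_000))

# Case 2: create an Atlas Search index on a *different* database and
# collection (requires pymongo 4.5+ and an Atlas cluster).
client.other_db["unrelated"].create_search_index(
    {"name": "default", "definition": {"mappings": {"dynamic": True}}}
)

# The same bulk insert now makes the secondaries' query targeting
# spike, even though the indexed collection never changed.
target.insert_many({"seq": i} for i in range(1_500_000))
```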

I think it is unreasonable for this to happen because of a search index on a collection where no data changes occurred; the unnecessary alerts and the secondaries’ unnecessary collection-scan work are a waste of cluster resources.

I filed a ticket about this on the MongoDB Jira and will attach the link below.

https://jira.mongodb.org/plugins/servlet/mobile#issue/SERVER-66824

I hope this part can be progressively improved.

Replica sets are not meant to be used to scale read operations. Scaling == Sharding.
If one of your secondaries goes down now, the other one will have to take 100% of that workload and will likely be overwhelmed instantly as well (overloaded => domino effect).

Using a replica set to scale reads => no more High Availability. Replica sets are here for one thing only: HA.
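
If reads really do need to scale out, the supported path is a sharded cluster; a minimal sketch using admin commands (assuming a sharded tier and placeholder names):

```python
from pymongo import MongoClient

client = MongoClient("mongodb+srv://...")  # placeholder sharded-cluster URI

# Distribute the collection across shards on a hashed key so reads and
# writes spread over the shards instead of piling onto the secondaries.
client.admin.command("enableSharding", "mydb")
client.admin.command("shardCollection", "mydb.mycoll", key={"_id": "hashed"})
```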