Secondary reads and orphaned documents guarantees

Hi, we come from an upgrade path 4.0 => 4.2. => 4.4.

We have a large sharded collection with { _id: hashed } as the shard key that has had balancing enabled in the past. In my current use case, I want to read from secondary (with balancing disabled) because I can tolerate any stale documents as long as they are not orphan documents.

I found out recently that reading from secondary returns duplicate documents when querying using a non-shard key index. This is true even when balancing is disabled. I investigated further and found that the duplicate documents do come from two different shards, so they are likely due to failed migration. I found out that this is a known issue from these articles and tickets:

(1) https://www.mongodb.com/community/forums/t/when-a-chunk-is-migrating-do-its-documents-exist-on-both-shards/89332/2

Till 3.4, if you were reading from secondary shard members, there is a possibility of getting duplicate documents (orphans) which seems like your scenario.
When you read from a primary (or a secondary with read concern local starting from mongodb 3.6) the node will apply the shard filter before returning the document. So, we won’t return twice the document. If the shard doesn’t official own the document it will just not returning it even if it has it locally.

^ A MongoDB Employee quotes this but I cannot find this in the referenced docs link or anywhere on google. But I was able to verify that { readConcern: local } with secondary reads does remove duplicate documents. Is this information still accurate without caveats or corner cases?

(2) https://www.mongodb.com/blog/post/background-indexing-on-secondaries-and-orphaned

The scenario where users typically encounter issues related to orphaned documents is when issuing secondary reads. In a sharded cluster, primary replicas for each shard are aware of the chunk placements, while secondaries are not. If you query the primary (which is the default read preference), you will not see any issues as the primary will not return orphaned documents even if it has them. But if you are using secondary reads, the presence of orphaned documents can produce unexpected results, because secondaries are not aware of the chunk ownerships and they can’t filter out orphaned documents.

^ This official article explains why secondaries show up orphaned documents, but doesn’t talk about readConcern: local

(3) database - In MongoDB, why is read concern "available" default option for secondaries in non causally consistent sessions? - Stack Overflow

  • local : returns data on the local node, but with orphaned documents filtered out. This requires the node to communicate with the shard’s primary (if this read is on a secondary), or the config server to service the read. In a degraded sharded cluster, this read may stall indefinitely. This is not an issue for an unsharded collection, though. Possible to return data that could be rolled back.

^ A MongoDB Engineer mentioned this about local concern but I want to clarify on the second statement. Does it communicate with the shard’s primary or config server or both? I am concerned about the performance degradation if it communicates with shard’s primary. Since we want to disable balancing, in theory, we should be able to perform the shard/chunk ownership filter in mongos itself.

My primary questions are:

  1. Does readConcern: local in secondary reads guarantee that we don’t get orphaned documents if balancing is disabled?
  2. Follow up question: How does it guarantee it OR if it doesn’t, how does mongo filter out duplicate documents with readConcern: local?
  • Does it communicate with primary OR does it communicate with mongos only? Could there be some performance implication here?

FYI for anyone looking for answers… After doing a live test on a 1TB production database, readConcern: local does NOT guarantee you don’t get orphaned documents. I resolved it by running cleanupOrphaned which took about 2 days and it didn’t impact performance while it was running.

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.