Starting in MongoDB 8.2, secondary reads in sharded clusters might automatically terminate if there is a risk of missing documents due to chunk migrations.
To support this new behavior, MongoDB 8.2 introduces the following changes:
Adds the terminateSecondaryReadsOnOrphanCleanup parameter (default: true).
Note
If terminateSecondaryReadsOnOrphanCleanup is set to false, the server does not terminate reads and might miss documents in sharded collections due to chunk migrations. This is the default behavior in MongoDB 8.1 or earlier. To learn more, see Disable Secondary Read Termination.
Increases the orphanCleanupDelaySecs default value from 900 seconds to 3600 seconds (1 hour).
Behavior
By default, a sharded cluster performs the following operations when a chunk migration commits:
The source shard initiates an orphan cleanup process to delete documents that migrated to a different shard.
The shard waits for any pre-existing reads on the primary to complete.
The shard waits an additional orphanCleanupDelaySecs seconds (default: 1 hour).
The shard deletes orphaned documents.
Secondaries terminate reads that started before the migration completed.
Secondaries replicate orphaned document deletions.
Terminating secondary reads before deleting orphaned documents ensures that long-running secondary reads do not miss any documents deleted by the cleanup process.
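As a rough illustration of the timing above, the following sketch computes the earliest moment at which orphan deletion can begin after a migration commits, and whether a secondary read is old enough to be a candidate for termination. The function names and the simplified model are assumptions for illustration, not server internals: the model ignores the wait for pre-existing primary reads and any replication lag.

```javascript
// Hypothetical helpers: a simplified model of the timeline described above.
const DEFAULT_ORPHAN_CLEANUP_DELAY_SECS = 3600; // MongoDB 8.2 default

// Earliest time (ms since epoch) at which the source shard can delete orphans.
// The shard also waits for pre-existing primary reads, so the real time can be later.
function earliestOrphanDeletionMs(migrationCommitMs, delaySecs = DEFAULT_ORPHAN_CLEANUP_DELAY_SECS) {
  return migrationCommitMs + delaySecs * 1000;
}

// A secondary read is a candidate for termination only if it started before
// the migration committed and is still running when deletion can begin.
function isReadAtRisk(readStartMs, nowMs, migrationCommitMs, delaySecs = DEFAULT_ORPHAN_CLEANUP_DELAY_SECS) {
  return readStartMs < migrationCommitMs &&
    nowMs >= earliestOrphanDeletionMs(migrationCommitMs, delaySecs);
}
```

Under this model, a read that starts after the migration commits is never at risk from that migration, regardless of how long it runs.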
Monitoring
You can monitor terminated secondary reads due to orphan cleanup in the following ways:
Check the server status of your secondary node with the following mongosh command:
db.serverStatus().metrics.operation.killedDueToRangeDeletion
Review your mongod logs. Each termination results in a log entry like the following example:
{ "t": { "$date": "2025-06-11T12:11:43.361+02:00" }, "s": "I", "c": "SHARDING", "id": 10016300, "svc": "S", "ctx": "conn93", "msg": "Read has been terminated due to orphan range cleanup", "attr": { "type": "command", ... "workingMillis": 0, "durationMillis": 0, "orphanCleanupDelaySecs": 3600 } }
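To count these entries across a log file, a small script can filter on the log ID shown in the example. The following is a minimal sketch that assumes the log uses the structured JSON format with one entry per line. The function name is hypothetical.

```javascript
// Log ID taken from the example entry above.
const ORPHAN_CLEANUP_TERMINATION_LOG_ID = 10016300;

// Returns the parsed log entries that report a read terminated by orphan cleanup.
// `logText` is the content of a mongod log file, one JSON document per line.
function findOrphanCleanupTerminations(logText) {
  const matches = [];
  for (const line of logText.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    let entry;
    try {
      entry = JSON.parse(trimmed);
    } catch (err) {
      continue; // Skip lines that are not valid JSON
    }
    if (entry !== null && typeof entry === "object" &&
        entry.id === ORPHAN_CLEANUP_TERMINATION_LOG_ID) {
      matches.push(entry);
    }
  }
  return matches;
}
```

In Node.js, you could pass the result of fs.readFileSync(path, "utf8") for your own log path and inspect the length of the returned array.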
Managing Long-Running Secondary Reads
If your application performs secondary reads that exceed 1 hour on sharded clusters that perform chunk migrations, you might encounter QueryPlanKilled errors (error code 175) due to terminated reads.
The recommended method to manage long-running secondary reads is to implement a resume mechanism in your application.
You can also manage long-running secondary reads with the following alternative strategies: increase orphanCleanupDelaySecs, disable secondary read termination, or disable the balancer.
Implement Resume Mechanism
A resume mechanism allows your application to create a new read operation that starts where your previous read operation terminates.
To implement an effective resume mechanism, your application must use a consistent sort order for your query results. Consider the following factors when selecting a sort order for your resume mechanism:
The sort operation should utilize an indexed field for efficient query execution.
The sort field should contain unique values.
If the sort field values are not unique, your application must implement additional logic to handle documents that share the same sort value.
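One common way to handle a non-unique sort field is to add a unique tie-breaker to the sort and to the resume filter. The following sketch assumes _id is used as the tie-breaker and that the query sorts by the chosen field and then by _id, both ascending. The function name is hypothetical, and the sketch assumes the base filter does not already contain a top-level $or.

```javascript
// Builds the filter for a resumed query sorted by { [sortField]: 1, _id: 1 }.
// `last` is the last processed document, or undefined on the first run.
// Assumes `baseFilter` has no top-level $or of its own.
function buildResumeFilter(baseFilter, sortField, last) {
  if (last === undefined) {
    return { ...baseFilter };
  }
  return {
    ...baseFilter,
    $or: [
      // Documents that sort strictly after the last sort value
      { [sortField]: { $gt: last[sortField] } },
      // Documents that share the last sort value but have a larger _id
      { [sortField]: last[sortField], _id: { $gt: last._id } },
    ],
  };
}
```

With this approach, the application tracks both the sort value and the _id of the last processed document, and the resumed query skips exactly the documents that were already processed.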
Example
Consider a cities database containing a zipcodes collection with the following structure:
{ "state": "NY", "city": "NEW YORK", "zipcode": "00501" }
For this example, assume the zipcode field values are unique.
The following JavaScript code performs a secondary read operation to retrieve all documents where the state is NY and implements a resume mechanism to handle QueryPlanKilled errors:
let readDoc;
let latestZip;
let cursor = db.getSiblingDB("cities").zipcodes.find({ state: "NY" })
  .sort({ zipcode: 1 })
  .readPref("secondary");

while (true) {
  try {
    // hasNext() can fetch the next batch from the server, so it must be
    // inside the try block to catch a termination error
    if (!cursor.hasNext()) break;
    readDoc = cursor.next();
    // process `readDoc` here
    latestZip = readDoc.zipcode;
  } catch (err) {
    if (err.code === 175 && err.errmsg.includes("Read has been terminated due to orphan range cleanup")) {
      console.log("Query terminated, resuming from zipcode:", latestZip);
      // If no document was processed yet, restart from the beginning
      const filter = latestZip === undefined
        ? { state: "NY" }
        : { state: "NY", zipcode: { $gt: latestZip } };
      cursor = db.getSiblingDB("cities").zipcodes.find(filter)
        .sort({ zipcode: 1 })
        .readPref("secondary");
    } else {
      throw err; // Rethrow non-termination errors
    }
  }
}
When reviewing the example database and application logic, consider the following:
The example code handles QueryPlanKilled errors with a resume mechanism that sorts by zipcode. Sorting on the zipcode field ensures a consistent order and a unique sort value for each document. This allows the application to resume the read operation precisely where it was terminated.
The cities.zipcodes collection implements a {state: 1, zipcode: 1} compound index to ensure the efficiency of the resume mechanism queries. Implementing this compound index prevents both collection scans and in-memory sorts, and supports filter and sort operations. To learn more about creating effective indexes, see The ESR (Equality, Sort, Range) Guideline.
The QueryPlanKilled error (error code 175) can occur for reasons other than terminated secondary reads. To accurately handle QueryPlanKilled errors, you must parse the errmsg field. MongoDB returns the following error message when it terminates a secondary read:
{ code: 175, name: QueryPlanKilled, categories: [CursorInvalidatedError], errmsg: "Read has been terminated due to orphan range cleanup" }
When the application encounters a QueryPlanKilled error due to orphan range cleanup, it uses the last successfully processed zipcode as a starting point for the resumed query. The $gt operator ensures the application does not process duplicate documents.
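The error check described above can be factored into a small predicate so that every read path in the application applies the same test. This is a minimal sketch. The function name is hypothetical, and the fallback to a message property is an assumption about how some drivers surface the server message.

```javascript
const QUERY_PLAN_KILLED = 175;
const ORPHAN_CLEANUP_MESSAGE = "Read has been terminated due to orphan range cleanup";

// Returns true only for QueryPlanKilled errors caused by orphan range cleanup.
function isOrphanCleanupTermination(err) {
  if (err === null || typeof err !== "object") return false;
  // Assumption: the server message is available as `errmsg`, or as `message`
  const text = String(err.errmsg ?? err.message ?? "");
  return err.code === QUERY_PLAN_KILLED && text.includes(ORPHAN_CLEANUP_MESSAGE);
}
```

Errors with code 175 and a different message return false, so the application can rethrow them instead of resuming.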
Test your resume mechanisms in a test environment and monitor your production cluster to understand how often secondary reads are terminated. If terminations occur frequently, you might need to adjust your query patterns, or consider alternative data access approaches. To learn how to monitor your cluster for these errors, see Monitoring.
Increase orphanCleanupDelaySecs
The orphanCleanupDelaySecs server parameter controls the time MongoDB waits before deleting a migrated chunk from the source shard. Increasing orphanCleanupDelaySecs allows secondary read operations to run for a longer period of time. You can set orphanCleanupDelaySecs at both startup and runtime.
The following command sets orphanCleanupDelaySecs to 2 hours:
db.adminCommand({ setParameter: 1, orphanCleanupDelaySecs: 7200 })
Important
Increasing orphanCleanupDelaySecs means that orphaned documents remain on nodes for a longer period of time. If you increase this value, executing a query that uses an index but does not include the shard key might result in degraded performance as the query must filter more orphaned documents before returning results.
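When changing the delay from a script, it can help to convert hours to seconds in one place and record the previous value. The following sketch assumes a hypothetical runAdminCommand callback that forwards a command document to the server, for example (cmd) => db.adminCommand(cmd) in mongosh. The helper name is also hypothetical.

```javascript
// Sets orphanCleanupDelaySecs from a value in hours and reports the old value.
// `runAdminCommand` is a hypothetical callback that runs an admin command
// and returns the server's reply document.
function setOrphanCleanupDelayHours(runAdminCommand, hours) {
  if (!Number.isFinite(hours) || hours <= 0) {
    throw new RangeError("hours must be a positive number");
  }
  const newSecs = Math.round(hours * 3600);
  // Read the current value before changing it
  const current = runAdminCommand({ getParameter: 1, orphanCleanupDelaySecs: 1 });
  runAdminCommand({ setParameter: 1, orphanCleanupDelaySecs: newSecs });
  return { previousSecs: current.orphanCleanupDelaySecs, newSecs };
}
```

Keeping the previous value makes it straightforward to restore the original delay once the long-running reads are finished.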
Disable Secondary Read Termination
Note
In MongoDB 8.1 or earlier, sharded clusters do not automatically terminate long-running secondary reads. To match this behavior in MongoDB 8.2 or later, disable secondary read termination.
The terminateSecondaryReadsOnOrphanCleanup server parameter controls whether long-running secondary reads automatically terminate before orphaned document deletion.
You can disable secondary read termination by setting terminateSecondaryReadsOnOrphanCleanup to false. You can set this parameter at startup or runtime.
The following command sets terminateSecondaryReadsOnOrphanCleanup to false:
db.adminCommand({ setParameter: 1, terminateSecondaryReadsOnOrphanCleanup: false })
Warning
If this feature is disabled and chunk migrations affect the targeted collection, your secondary reads might fail to return all documents.
Disable the Balancer
You can avoid automatically terminating long-running secondary reads by disabling the balancer and not performing any manual migrations.
To disable the balancer for specific collections, use the configureCollectionBalancing command's enableBalancing field.
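As an illustration, the following sketch builds the command document for one collection. It assumes that configureCollectionBalancing takes the full namespace as its value and that enableBalancing accepts a boolean. The helper name is hypothetical.

```javascript
// Builds a command document that disables balancing for a single collection.
// Assumes the command value is the "<database>.<collection>" namespace.
function buildDisableBalancingCommand(dbName, collName) {
  return {
    configureCollectionBalancing: `${dbName}.${collName}`,
    enableBalancing: false,
  };
}
```

In mongosh, you could pass the result to db.adminCommand(), for example db.adminCommand(buildDisableBalancingCommand("cities", "zipcodes")).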
To restrict balancer operations to specific times, see Schedule the Balancing Window.
Warning
Disabling the balancer for extended periods of time can lead to unbalanced shards which degrade cluster performance. Only disable the balancer if it is necessary for your use case.