Definition
Changed in version 6.2.
validate
The validate command checks a collection's data and indexes for correctness and returns the results. The command also repairs any inconsistencies in the count and data size of a collection.
Tip
In mongosh, this command can also be run through the validate() helper method.
Helper methods are convenient for mongosh users, but they may not return the same level of information as database commands. In cases where the convenience is not needed or the additional return fields are required, use the database command.
Changed in version 5.0.
Starting in version 5.0, the validate command can also find inconsistencies in the collection and fix them if possible.
Index inconsistencies include:
An index is multikey but there are no multikey fields.
An index has multikeyPaths covering fields that are not multikey.
An index does not have multikeyPaths but there are multikey documents (for indexes built before 3.4).
If any inconsistencies are detected by the db.collection.validate() command, a warning is returned and the repair flag on the index is set to true.
db.collection.validate() also validates any documents that violate the collection's schema validation rules.
The db.collection.validate() method in mongosh provides a wrapper around validate.
Compatibility
This command is available in deployments hosted in the following environments:
MongoDB Atlas: The fully managed service for MongoDB deployments in the cloud
Important
This command is not supported in M0 and Flex clusters. For more information, see Unsupported Commands.
MongoDB Enterprise: The subscription-based, self-managed version of MongoDB
MongoDB Community: The source-available, free-to-use, and self-managed version of MongoDB
Syntax
The command has the following syntax:
db.runCommand(
   {
      validate: <string>,                // Collection name
      full: <boolean>,                   // Optional
      repair: <boolean>,                 // Optional, added in MongoDB 5.0
      metadata: <boolean>,               // Optional, added in MongoDB 5.0.4
      checkBSONConformance: <boolean>,   // Optional, added in MongoDB 6.2
      background: <boolean>              // Optional
   }
)
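The syntax above can be assembled programmatically. The following is a minimal sketch, not part of MongoDB: `buildValidateCommand` and `OPTIONAL_FIELDS` are hypothetical names introduced here for illustration. It builds a command document from the fields listed in the syntax and rejects anything else.

```javascript
// Hypothetical helper (not a MongoDB API): builds a validate command document
// from the optional fields listed in the syntax above.
const OPTIONAL_FIELDS = ["full", "repair", "metadata", "checkBSONConformance", "background"];

function buildValidateCommand(collectionName, options = {}) {
  if (typeof collectionName !== "string" || collectionName.length === 0) {
    throw new TypeError("collectionName must be a non-empty string");
  }
  const command = { validate: collectionName };
  for (const [name, value] of Object.entries(options)) {
    if (!OPTIONAL_FIELDS.includes(name)) {
      throw new RangeError(`unknown validate option: ${name}`);
    }
    if (typeof value !== "boolean") {
      throw new TypeError(`${name} must be a boolean`);
    }
    command[name] = value;
  }
  return command;
}

// In mongosh, the returned document could be passed to db.runCommand(), e.g.:
//   db.runCommand(buildValidateCommand("myCollection", { full: true }))
```

Rejecting unknown option names early keeps typos from being sent to the server; whether that check is worthwhile depends on your tooling.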
Command Fields
The command takes the following fields:
Field | Type | Description | |
|---|---|---|---|
validate | string | The name of the collection to validate. | |
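full | boolean | Optional. A flag that determines whether the command performs a slower but more thorough check or a faster but less thorough check. The default is false. For the WiredTiger storage engine, only the full validation process forces a checkpoint and flushes all in-memory data to disk before verifying the on-disk data. | |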
repair | boolean | Optional. A flag that determines whether the command performs a repair. The default is false. A repair can only be run on a standalone node. The repair fixes the following issues: ... IMPORTANT: To set ... For more information, see the ... New in version 5.0. | |
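metadata | boolean | Optional. A flag which allows users to perform a quick validation to detect invalid index options without scanning all of the documents and indexes. The default is false. Running the validate command with { metadata: true } is not supported with any other validate options. The metadata validation option only scans collection metadata to find invalid indexes more quickly. If there is an invalid index detected, the validate command will prompt you to use the collMod command to remove invalid indexes. New in version 5.0.4. | |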
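checkBSONConformance | boolean | Optional. If true, the collection is checked to ensure the BSON documents conform to the BSON specifications. New in version 6.2. | |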
background | boolean | Optional. If ... The default is ... New in version 8.1. | |
Behavior
Performance
The validate command can be slow, particularly on
larger data sets.
The validate command obtains an exclusive lock W on
the collection. This will block all reads and writes on the collection
until the operation finishes. When run on a secondary, the
validate operation can block all other operations on that
secondary until it finishes.
Warning
Due to the performance impact of validation, consider running
validate only on secondary replica set nodes.
You can use rs.stepDown() to instruct the current
primary node to become a secondary to avoid impacting a live
primary node.
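To choose a secondary to validate on, you can inspect the replica set status first. The sketch below is one possible approach: `healthySecondaries` is a hypothetical helper, and the sample document only mimics the `members` array (with its `name`, `stateStr`, and `health` fields) that `rs.status()` returns. The host names are made up.

```javascript
// Hypothetical helper: lists healthy secondaries from an rs.status()-style document.
function healthySecondaries(replStatus) {
  return (replStatus.members || [])
    .filter((member) => member.stateStr === "SECONDARY" && member.health === 1)
    .map((member) => member.name);
}

// Sample input shaped like rs.status() output (illustrative host names).
const sampleStatus = {
  members: [
    { name: "host1:27017", stateStr: "PRIMARY", health: 1 },
    { name: "host2:27017", stateStr: "SECONDARY", health: 1 },
    { name: "host3:27017", stateStr: "(not reachable/healthy)", health: 0 },
  ],
};

console.log(healthySecondaries(sampleStatus)); // [ 'host2:27017' ]
```

In mongosh, the same function could be called as `healthySecondaries(rs.status())` before connecting to one of the returned hosts to run validate.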
Data Throughput Metrics
The $currentOp and the currentOp command
include dataThroughputAverage and
dataThroughputLastSecond information for
validate operations in progress.
The log messages for validate operations include
dataThroughputAverage and dataThroughputLastSecond
information.
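As a rough illustration, the throughput fields can be pulled out of a `currentOp` result for running validate operations. This sketch assumes that each entry in the `inprog` array carries the original command document in `command` and exposes `dataThroughputAverage` and `dataThroughputLastSecond` at the top level of the entry. `validateThroughput` is a hypothetical helper name.

```javascript
// Hypothetical helper: extracts throughput metrics for in-progress validate
// operations from a currentOp-style result document.
// Assumption: the throughput fields sit at the top level of each inprog entry.
function validateThroughput(currentOpResult) {
  return (currentOpResult.inprog || [])
    .filter((op) => op.command && typeof op.command.validate === "string")
    .map((op) => ({
      collection: op.command.validate,
      average: op.dataThroughputAverage,
      lastSecond: op.dataThroughputLastSecond,
    }));
}

// In mongosh, this could be called as:
//   validateThroughput(db.currentOp())
```

The fields may be absent while the operation is not actively scanning data, in which case `average` and `lastSecond` are `undefined` in the returned objects.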
Collection Validation Improvements
Starting in MongoDB 6.2, the validate command and
db.collection.validate() method:
Check collections to ensure the BSON documents conform to the BSON specifications.
Check time series collections for internal data inconsistencies.
Have a new option checkBSONConformance that enables comprehensive BSON checks.
Restrictions
The validate command no longer supports afterClusterTime. As such, validate cannot be
associated with causally consistent sessions.
Time series collections were introduced in MongoDB 5.0. Starting in v5.2, the default internal format for storing time series measurements changed. Due to this change:
Time series collections created before v5.2 might contain documents in both the old and new format. Internally, such collections are flagged as timeseriesBucketsMayHaveMixedSchemaData: true.
Time series collections created in v5.2 or later will always contain documents in the new format. Internally, such collections are flagged as timeseriesBucketsMayHaveMixedSchemaData: false or not flagged at all.
When the flag is true, time series queries take both the new and the old format
into account. When the flag is false or missing, time series queries take
only the new format into account.
Due to a bug described in SERVER-91194, under some conditions the flag might be lost. When this happens for time series collections created before v5.2, read query results may be incomplete. That is, some documents may be missed, even though they are still stored on the disk.
To determine if you are impacted by this, run validate on your time
series collection. The command returns an error if the collection is affected
by the bug. Your read query results may be incorrect if this is the case.
If affected, upgrade to a fixed version and set timeseriesBucketsMayHaveMixedSchemaData
to true for each affected collection to ensure that future queries on the
collection return correct results. The full steps for this process are located
here.
Index Key Format
Starting in MongoDB 6.0, the validate command returns a message if a
unique index has a key format that is
incompatible. The message indicates an old format is used.
Count and Data Size Statistics
The validate command updates the collection's count and
data size statistics in the collStats output with their correct values.
Note
In the event of an unclean shutdown, the count and data size statistics might be inaccurate.
Examples
To validate a collection myCollection using the default validation setting (specifically, full: false):
db.runCommand( { validate: "myCollection" } )
To perform a full validation of collection myCollection, specify full: true:
db.runCommand( { validate: "myCollection", full: true } )
To repair collection myCollection, specify repair: true:
db.runCommand( { validate: "myCollection", repair: true } )
To validate the metadata in the myCollection collection, specify metadata: true:
db.runCommand( { validate: "myCollection", metadata: true } )
To perform additional BSON conformance checks in myCollection, specify checkBSONConformance: true:
db.runCommand( { validate: "myCollection", checkBSONConformance: true } )
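The examples above validate one collection at a time. A minimal sketch of running the same command over every collection in a database follows. `validateAllCollections` is a hypothetical helper; it assumes a mongosh-style `db` object whose `getCollectionNames()` and `runCommand()` calls return results directly, and it records an error instead of stopping when the command throws for a given name.

```javascript
// Hypothetical helper: runs validate on each collection of a mongosh-style
// database object and collects a short summary per collection.
function validateAllCollections(database, options = {}) {
  return database.getCollectionNames().map((name) => {
    try {
      const result = database.runCommand({ validate: name, ...options });
      return { collection: name, ok: result.ok, valid: result.valid };
    } catch (error) {
      // Some names may not be validatable; keep going and report the message.
      return { collection: name, ok: 0, error: error.message };
    }
  });
}

// In mongosh, this could be called as:
//   validateAllCollections(db, { full: true })
```

Because validate blocks reads and writes on each collection while it runs, a loop like this is best suited to a secondary or a maintenance window.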
Validate Output
Note
The output may vary depending on the version and specific configuration of your MongoDB instance.
Specify full: true for more detailed output.
validate.nInvalidDocuments
The number of invalid documents in the collection. Invalid documents are those that are not readable, which means the BSON document is corrupt and has an error or a size mismatch.
validate.nNonCompliantDocuments
The number of documents not conforming to the collection's schema. Non-compliant documents are not counted as invalid in nInvalidDocuments.
Starting in MongoDB 6.2, nNonCompliantDocuments also includes the number of documents that do not conform to the BSON or time series collection requirements.
validate.nrecords
The number of documents in the collection.
validate.keysPerIndex
A document that contains the name and index entry count for each index on the collection.
"keysPerIndex" : {
   "_id_" : <num>,
   "<index2_name>" : <num>,
   ...
}
keysPerIndex identifies the index by its name only.
validate.indexDetails
Changed in version 8.1.
A document that contains the status of the index validation for each index and the index specification.
"indexDetails" : {
   "_id_" : {
      "valid" : <boolean>,
      "spec" : <document>
   },
   "<index2_name>" : {
      "valid" : <boolean>,
      "spec" : <document>
   },
   ...
}
indexDetails identifies the specific index (or indexes) that is invalid. Earlier versions of MongoDB would mark all indexes as invalid, if any of the indexes were invalid.
indexDetails identifies the index by its name only. Earlier versions of MongoDB displayed the full namespace of the index; i.e. <db>.<collection>.$<index_name>.
The spec document is the index specification, which varies depending on how the index is defined. Some example spec document fields include:
spec.v. The index version.
spec.unique. A Boolean value that indicates if the index is unique.
spec.key. The index key identifier.
spec.name. The index name.
New in version 8.1.
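Since indexDetails is keyed by index name and each entry carries a valid flag, the invalid indexes can be listed with a few lines of code. This is a sketch only; `invalidIndexNames` is a hypothetical helper name, and the input is assumed to be a validate result document shaped as described above.

```javascript
// Hypothetical helper: returns the names of indexes that validate marked invalid.
function invalidIndexNames(validateResult) {
  return Object.entries(validateResult.indexDetails || {})
    .filter(([, details]) => details.valid === false)
    .map(([indexName]) => indexName);
}

// In mongosh, this could be called as:
//   invalidIndexNames(db.runCommand({ validate: "myCollection" }))
```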
validate.ns
The full namespace name of the collection. Namespaces include the database name and the collection name in the form database.collection.
validate.valid
A boolean that is true if validate determines that all aspects of the collection are valid. When false, see the errors field for more information.
validate.repaired
A boolean that is true if validate repaired the collection.
validate.repairMode
New in version 8.2.
A string that indicates what types of data inconsistencies the validate command attempted to repair, if detected. Possible repairMode values include:
None: No repair actions are taken.
FixErrors: Attempts to fix any validation errors.
AdjustMultikey: Attempts to fix multikey inconsistencies by adjusting multikey metadata.
validate.warnings
An array that contains warning messages, if any, regarding the validate operation itself. The warning messages do not indicate that the collection is itself invalid. For example:
"warnings" : [
   "Could not complete validation of table:collection-28-6471619540207520785. This is a transient issue as the collection was actively in use by other operations."
],
validate.errors
If the collection is not valid (i.e., valid is false), this field contains a message describing the validation error.
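The ok, valid, warnings, and errors fields can be combined into a single status when checking results in scripts. The sketch below shows one way to do that; `summarizeValidation` and its status strings are hypothetical, not part of the command's output. It accepts errors and warnings either as arrays or as single messages.

```javascript
// Hypothetical helper: condenses a validate result into a status plus messages.
function summarizeValidation(result) {
  // Normalize to arrays so a single message and a list are handled alike.
  const warnings = [].concat(result.warnings || []);
  const errors = [].concat(result.errors || []);
  let status;
  if (result.ok !== 1) {
    status = "command failed";
  } else if (!result.valid) {
    status = "invalid";
  } else if (warnings.length > 0) {
    status = "valid with warnings";
  } else {
    status = "valid";
  }
  return { status, warnings, errors };
}

// In mongosh, this could be called as:
//   summarizeValidation(db.runCommand({ validate: "myCollection" }))
```

Keeping "valid with warnings" distinct from "invalid" reflects the point above that warnings concern the validate operation itself and not the collection's validity.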
validate.extraIndexEntries
An array that contains information for each index entry that points to a document that does not exist in the collection.
"extraIndexEntries" : [
   {
      "indexName" : <string>,
      "recordId" : <NumberLong>,  // for the non-existent document
      "indexKey" : {
         "<key1>" : <value>,
         ...
      }
   }
   ...
]
Note
For the extraIndexEntries array, the sum of all the indexKey field sizes has a limit of 1MB where the sizes include both the keys and values for the indexKey. If the sum exceeds this size, the warning field displays a message.
validate.missingIndexEntries
An array that contains information for each document that is missing the corresponding index entry.
"missingIndexEntries" : [
   {
      "indexName" : <string>,
      "recordId" : <NumberLong>,
      "idKey" : <_id key value>,  // The _id value of the document. Only present if an _id index exists.
      "indexKey" : {              // The missing index entry
         "<key1>" : <value>,
         ...
      }
   }
   ...
]
Note
For the missingIndexEntries array, the sum of the idKey field size and all its indexKey field sizes has a limit of 1MB where the field sizes include both the keys and values for the idKey and indexKey. If the sum exceeds this size, the warning field displays a message.
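When either array is long, a per-index tally is often easier to read than the raw entries. The following sketch groups both arrays by their indexName field; `countEntriesByIndex` and `indexInconsistencies` are hypothetical helper names, and the input is assumed to follow the shapes shown above.

```javascript
// Hypothetical helper: counts entries per index name.
function countEntriesByIndex(entries = []) {
  const counts = {};
  for (const entry of entries) {
    counts[entry.indexName] = (counts[entry.indexName] || 0) + 1;
  }
  return counts;
}

// Hypothetical helper: tallies missing and extra index entries of a validate result.
function indexInconsistencies(validateResult) {
  return {
    missing: countEntriesByIndex(validateResult.missingIndexEntries),
    extra: countEntriesByIndex(validateResult.extraIndexEntries),
  };
}

// In mongosh, this could be called as:
//   indexInconsistencies(db.runCommand({ validate: "myCollection", full: true }))
```

Because of the 1MB limit described in the notes above, the arrays may be incomplete, so the tallies are lower bounds whenever a size warning is present.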
validate.corruptRecords
An array of RecordId values for documents that are unreadable, possibly because the data is damaged. These documents are reported as corrupt during validation. A RecordId is a 64-bit integer internal key that uniquely identifies a document in a collection.
"corruptRecords" : [
   Long(1),  // RecordId 1
   Long(2)   // RecordId 2
]
New in version 5.0.
validate.ok
An integer with the value 1 when the command succeeds. If the command fails, the ok field has a value of 0.