How to Index Fields in Arrays of Objects and Documents
Note
The Atlas Search embeddedDocuments type, embeddedDocument operator, and embedded scoring option are in preview. When an Atlas Search index on a replica set or single MongoDB shard reaches 2,147,483,647 index objects, Atlas Search transitions the index to a failed, non-queryable state. A solution to accommodate this limitation will be in place when this feature is generally available. To troubleshoot any issues related to using this feature, contact Support. To request support for more than 2,147,483,647 index objects, upvote this request in the MongoDB Feedback Engine.
You can use the Atlas Search embeddedDocuments type to index fields in documents and objects that are elements of an array. Atlas Search indexes embedded documents independently of their parent document. Each indexed document contains only fields that are part of the embedded document array element. You can use only the embeddedDocument operator to query fields indexed as the embeddedDocuments type.
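As a rough illustration of how the embeddedDocument operator wraps another operator, the following Python sketch builds a $search stage as a plain dictionary. The helper name embedded_document_search is hypothetical, and the teachers and last field names are borrowed from the schools example later on this page. The sketch only constructs the stage and doesn't connect to a cluster.

```python
import json


def embedded_document_search(array_path, field, query):
    """Build a $search stage that matches one field of the documents in an array.

    Hypothetical helper for illustration; it only assembles a dictionary.
    """
    return {
        "$search": {
            "embeddedDocument": {
                # Path of the array field indexed as the embeddedDocuments type.
                "path": array_path,
                # Operator evaluated against each embedded document.
                # The nested path is written as the full dotted path.
                "operator": {
                    "text": {"path": f"{array_path}.{field}", "query": query}
                },
            }
        }
    }


stage = embedded_document_search("teachers", "last", "Smith")
print(json.dumps(stage, indent=2))
```

The resulting stage is meant to be the first stage of an aggregation pipeline run against a collection that has a matching Atlas Search index.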
Note
Use the embeddedDocuments type to index fields inside an array of documents so that you can query each nested document individually. If you only need to query nested documents in relation to the parent document, use the document type (see How to Index Fields in Objects and Documents).
Atlas Search doesn't dynamically index fields of type embeddedDocuments. You must use static mappings to index embeddedDocuments fields. You can use the Visual Editor or the JSON Editor in the Atlas UI to index fields of type embeddedDocuments.
Review embeddedDocuments Type Limitations
Before you create an index that uses the embeddedDocuments type, review Lucene's 2,147,483,647 index objects limit and the other Atlas Search limitations.
Lucene's 2,147,483,647 Index Objects Limit
Atlas Search doesn't support indexing more than 2,147,483,647 index objects on a
replica set or single shard, where each indexed embedded document counts
as a single object. Using the embeddedDocuments
field type can
result in indexing objects over this limit, which causes an index to
transition to a failed, non-queryable state.
The exact number of index objects can vary based on the rate of document changes and deletions. The Search Max Number of Lucene Docs metric provides the upper bound of the current number of index objects across all indexes per replica set or shard. You can approximate the expected number of index objects in a single index by doing the following:
Calculate the number of index objects per document. For every level of nesting, each embedded document counts as a separate index object.
number of index objects per document = 1 + number of nested embedded documents
Multiply the number of index objects per document by the total number of documents in the collection.
total number of index objects = number of index objects per document x total number of documents in collection
Note that this approximation is a lower bound.
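The two steps above can be sketched in a few lines of Python. The function names and the sample figures (2 embedded documents per document, 1000 documents) are assumptions chosen for illustration, and the result is only the lower-bound approximation described above, not the value reported by the Search Max Number of Lucene Docs metric.

```python
# Limit on index objects per replica set or single shard, as stated on this page.
LUCENE_MAX_INDEX_OBJECTS = 2_147_483_647


def index_objects_per_document(nested_embedded_documents):
    """One object for the parent document plus one per indexed embedded document."""
    return 1 + nested_embedded_documents


def estimate_index_objects(nested_embedded_documents, total_documents):
    """Approximate (lower bound) number of index objects for a whole collection."""
    return index_objects_per_document(nested_embedded_documents) * total_documents


estimate = estimate_index_objects(nested_embedded_documents=2, total_documents=1000)
print(estimate)                              # 3000
print(estimate < LUCENE_MAX_INDEX_OBJECTS)   # True
```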
Example
Consider the collection named schools, described in this tutorial, and suppose the collection contains 1000 documents similar to the following:
{
  "_id": 0,
  "name": "Springfield High",
  "mascot": "Pumas",
  "teachers": [
    {
      "first": "Jane",
      "last": "Smith",
      "classes": [
        {
          "subject": "art of science",
          "grade": "12th"
        },
        ... // 2 more embedded documents
      ]
    },
    ... // 1 more embedded document
  ],
  "clubs": {
    "stem": [
      {
        "club_name": "chess",
        "description": "provides students opportunity to play the board game of chess informally and competitively in tournaments."
      },
      ... // 1 more embedded document
    ],
    ... // 1 more embedded document
  }
}
Now consider the index definition for the following fields in the schools collection:
The array of documents named teachers is indexed as the embeddedDocuments type with dynamic mappings enabled. However, the classes field isn't indexed. Use the following to calculate the index objects:
Calculate the number of index objects per document.
Number of teachers embedded documents = up to 2
Total number of index objects per document = 1 + 2 = 3
Multiply by the total number of documents in the collection.
Number of documents in the collection = 1000
Number of index objects per document = 3
Total number of index objects for collection = 1000 x 3 = 3000
The arrays of documents named teachers and teachers.classes are indexed as the embeddedDocuments type with dynamic mappings enabled. Use the following to calculate the index objects:
Calculate the number of index objects per document.
Number of documents = 1
Number of teachers embedded documents = up to 2
Number of classes embedded documents = up to 3
Number of index objects per document = 1 + ( 2 x 3 ) = 7
Multiply by the total number of documents in the collection.
Number of documents in the collection = 1000
Number of index objects per document = 7
Total number of index objects = 1000 x 7 = 7000
If your collection has large arrays that might generate 2,147,483,647
index objects, you must shard any
clusters that contain indexes with the embeddedDocuments
type.
Atlas Search Limitations
The following limitations apply:
You can use embeddedDocuments only on fields with up to 5 levels of nesting. An embeddedDocuments field can't have more than 4 parent embeddedDocuments fields.
You can't use embeddedDocuments for date or numeric faceting.
You can't define a field inside the embeddedDocuments type as the knnVector type.
You can't index children of fields indexed as the embeddedDocuments type as the token type.
For highlighting fields within embedded documents, you must also index the parent of the field that you want to highlight as the document type.
You can do the following only if you index the parents of the embedded document child field as the document type:
Faceted search on string fields within embedded documents. You must also index the field that you want to facet on as the stringFacet type.
Note
When you facet on a string field inside embedded documents, Atlas Search returns facet counts based only on the number of matching parent documents.
You can't facet on numeric and date fields in embedded documents.
Highlight fields within embedded documents. For an example, see the How to Run Atlas Search Queries Against Objects in Arrays tutorial.
Sort by the parent of the embedded document field. You must also index the embedded document field with string values as the token type. For child fields with number and date values, enable dynamic mapping to index those fields automatically. For an example, see Sort Example.
Define the Index for the embeddedDocuments Type
To define the index for the embeddedDocuments type, choose your preferred configuration method in the Atlas UI and then select the database and collection.
Click Refine Your Index to configure your index.
In the Field Mappings section, click Add Field to open the Add Field Mapping window.
Click Customized Configuration.
Select the field to index from the Field Name dropdown.
Note
You can't index fields that contain the dollar ($) sign at the start of the field name.
Click the Data Type dropdown and select EmbeddedDocuments.
Toggle the Enable Dynamic Mapping setting to enable or disable dynamic indexing of all dynamically indexable fields in the document. To learn more, see Configure document Field Properties.
Click Add.
If you disabled dynamic mapping, click Add Embedded Field for the EmbeddedDocuments type field to define field mappings for the fields in the document.
The following is the JSON syntax for the embeddedDocuments type. Replace the default index definition with the following. To learn more about the fields, see Field Properties.
{
  "mappings": {
    "dynamic": true|false,
    "fields": {
      "<field-name>": {
        "type": "embeddedDocuments",
        "dynamic": true|false,
        "fields": {
          "<field-name>": {
            <field-mapping-definition>
          }
        }
      }
    }
  }
}
Configure embeddedDocuments Field Properties
The Atlas Search embeddedDocuments type takes the following parameters:
Field | Type | Necessity | Description | Default |
---|---|---|---|---|
type | string | Required | Human-readable label that identifies the field type. Value must be embeddedDocuments. | |
dynamic | boolean | Optional | Flag that specifies whether to index every dynamically indexable field in the document. Value can be one of the following: true (index all dynamically indexable fields) or false (don't index fields dynamically). | false |
fields | document | Optional | Fields to index. Atlas Search doesn't support indexing facet fields as part of an ... | {} |
Try an Example for the embeddedDocuments Type
The following index definition example uses the sample_supplies.sales collection. If you have the sample data already loaded on your cluster, you can use the Visual Editor or JSON Editor in the Atlas UI to configure the index. After you select your preferred configuration method, select the database and collection, and refine your index to add field mappings.
The following index definition indexes the array of objects in the items field. It also configures Atlas Search to automatically index all dynamically indexable fields inside the objects in the items array.
In the Add Field Mapping window, select items from the Field Name dropdown.
Click the Data Type dropdown and select EmbeddedDocuments.
Toggle Enable Dynamic Mapping to enable dynamic mapping, if needed.
Click Add.
Replace the default index definition with the following index definition.
{
  "mappings": {
    "fields": {
      "items": {
        "type": "embeddedDocuments",
        "dynamic": true
      }
    }
  }
}
Note
To index all fields in an embedded document including fields that Atlas Search doesn't dynamically index, define the fields in the index definition. For string faceting, Atlas Search counts string facets once for each document in the result set.
For example, the following index definition configures Atlas Search to automatically index all dynamically indexable fields inside the objects in the items array. It also configures the purchaseMethod field inside the array of objects to be indexed as stringFacet, which Atlas Search doesn't dynamically index, to support Atlas Search facet queries against that field:
Click Add Field in the Field Mappings section and add the following fields by clicking Add after configuring the settings for each field in the Add Field Mapping window.
Field Name | Data Type |
---|---|
items | Click the dropdown and select EmbeddedDocuments. |
purchaseMethod | Click the dropdown and select StringFacet. |
{
  "mappings": {
    "dynamic": true,
    "fields": {
      "items": {
        "dynamic": true,
        "type": "embeddedDocuments"
      },
      "purchaseMethod": {
        "type": "stringFacet"
      }
    }
  }
}
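To show how an index like this one might be used, the following Python sketch builds a $searchMeta stage that matches embedded documents in items and facets on purchaseMethod. The helper name purchase_method_facet, the facet name purchaseMethodFacet, and the search term are assumptions for illustration. The sketch only builds the stage as a dictionary. Run against a cluster, the facet counts would reflect matching parent documents, as the note above describes.

```python
import json


def purchase_method_facet(item_name):
    """Build a $searchMeta stage: match items by name, then facet on purchaseMethod.

    Hypothetical helper for illustration; it only assembles a dictionary.
    """
    return {
        "$searchMeta": {
            "facet": {
                # Match parent documents through their embedded items documents.
                "operator": {
                    "embeddedDocument": {
                        "path": "items",
                        "operator": {
                            "text": {"path": "items.name", "query": item_name}
                        },
                    }
                },
                # String facet on the field indexed as stringFacet.
                "facets": {
                    "purchaseMethodFacet": {
                        "type": "string",
                        "path": "purchaseMethod",
                    }
                },
            }
        }
    }


stage = purchase_method_facet("laptop")
print(json.dumps(stage, indent=2))
```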
The following index definition configures Atlas Search to index only the name and tags fields as the Atlas Search string type in the items array of objects.
In the Add Field Mapping window, select items from the Field Name dropdown.
Click the Data Type dropdown and select EmbeddedDocuments.
Disable Enable Dynamic Mapping.
Click Add.
Click Add Embedded Field for the items field in the Field Mappings table and add the following fields by clicking Add after configuring the settings for each field in the Add Embedded Field Mapping window.
Field Name | Data Type |
---|---|
items.name | Click the Data Type dropdown and select String. |
items.tags | Click the Data Type dropdown and select String. |
Replace the default index definition with the following index definition.
{
  "mappings": {
    "fields": {
      "items": {
        "type": "embeddedDocuments",
        "dynamic": false,
        "fields": {
          "name": {
            "type": "string"
          },
          "tags": {
            "type": "string"
          }
        }
      }
    }
  }
}
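If you generate index definitions from code instead of pasting JSON, a static mapping like the one above can be assembled programmatically. The following Python sketch uses a hypothetical helper, embedded_documents_index, that reproduces the same definition for any array field and list of string subfields. It only builds and prints the definition and doesn't create an index.

```python
import json


def embedded_documents_index(array_field, string_fields):
    """Build an index definition that statically maps string fields inside an array.

    Hypothetical helper for illustration; dynamic mapping is disabled so that
    only the listed fields are indexed.
    """
    return {
        "mappings": {
            "fields": {
                array_field: {
                    "type": "embeddedDocuments",
                    "dynamic": False,
                    "fields": {name: {"type": "string"} for name in string_fields},
                }
            }
        }
    }


definition = embedded_documents_index("items", ["name", "tags"])
print(json.dumps(definition, indent=2))
```

The printed JSON can be pasted into the JSON Editor in place of the default index definition.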