
How to Index String Fields for Efficient Filtering and Sorting

On this page

  • Review the Behavior of the token Type
  • Review token Type Limitations
  • Define the Index for the token Type
  • Configure token Field Properties
  • Try an Example for the token Type

You can use the Atlas Search token type to index string fields so that you can sort Atlas Search results. After indexing, use the $search sort option in your query to sort the results by the indexed field. To learn more, see Sort Atlas Search Results. You can also use the token type to index string fields for pre-filtering the data that $vectorSearch queries analyze. To learn more, see Perform Semantic Search with Atlas Vector Search.
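
As a rough illustration of the sort option, the following Python sketch builds an aggregation pipeline whose $search stage sorts by a token-indexed field. The index name default, the field name title, the query term summer, and the helper build_sorted_search are assumptions made for illustration. The pipeline is only constructed here, not run against a cluster.

```python
# Sketch: build a $search stage that sorts results by a token-indexed field.
# "default", "title", and "summer" are illustrative assumptions, and
# build_sorted_search is a hypothetical helper, not part of any driver.

def build_sorted_search(index_name, query, search_path, sort_path, ascending=True):
    """Return an aggregation pipeline whose $search stage sorts by sort_path."""
    return [
        {
            "$search": {
                "index": index_name,
                "text": {"query": query, "path": search_path},
                # 1 sorts ascending, -1 sorts descending.
                "sort": {sort_path: 1 if ascending else -1},
            }
        },
        {"$limit": 5},
        {"$project": {"_id": 0, sort_path: 1}},
    ]


pipeline = build_sorted_search("default", "summer", "title", "title")
```

With a driver such as PyMongo, a pipeline like this would typically be passed to the collection's aggregate method. In this sketch the same field is both searched with the text operator and used as the sort key, so it would need to be indexed as both the string and the token type.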

To run queries against string fields using the following operators, you must index the field as the Atlas Search token type:

To learn more, see the documentation for each respective operator.

When you index a field as the token type, Atlas Search indexes the entire string as a single token (searchable term) and stores it in columnar storage for efficient filter and sort operations. You can use a normalizer to transform the token. By default, the normalizer is set to none, so Atlas Search indexes strings in their original form.

The major difference between the Atlas Search string and token types is the number of tokens created: Atlas Search creates one or more tokens for a field indexed as the string type, but only a single token for a field indexed as the token type.

If a string being indexed as a token field type exceeds 8181 characters, Atlas Search truncates it to 8181 characters before indexing.
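
The behavior described above can be pictured with a small conceptual model. This is not how Atlas Search is implemented: index_as_token and index_as_string are hypothetical helpers, the whitespace split is only a crude stand-in for a real analyzer, and truncating before normalizing is an assumption.

```python
# Conceptual model only; it does not reproduce Atlas Search internals.
MAX_TOKEN_CHARS = 8181  # truncation limit stated in the text above


def index_as_token(value, normalizer="none"):
    """Model the token type: the whole string becomes a single term."""
    if normalizer not in ("none", "lowercase"):
        raise ValueError(f"unsupported normalizer: {normalizer!r}")
    term = value[:MAX_TOKEN_CHARS]  # assumption: truncate first, then normalize
    if normalizer == "lowercase":
        term = term.lower()
    return [term]


def index_as_string(value):
    """Crude stand-in for an analyzer: lowercase, then split on whitespace."""
    return value.lower().split()


title = "The Great Train Robbery"
print(index_as_token(title))               # ['The Great Train Robbery']
print(index_as_token(title, "lowercase"))  # ['the great train robbery']
print(index_as_string(title))              # ['the', 'great', 'train', 'robbery']
print(len(index_as_token("x" * 9000)[0]))  # 8181
```

The single term produced by the token model is what makes the field usable as a sort key or exact filter value, while the multiple terms from the string model are what full-text operators match against.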

When you index a field as the token type, you must also index that field as the string type if you want to query its text value with operators such as text and phrase. For the following operators, you don't need to also index the field as the string type to query the text value in the field:
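
One way to meet this requirement is to map the same field to both types in a single index definition. The sketch below assumes a field named title and a hypothetical helper dual_type_index; it only builds and prints the definition.

```python
import json


def dual_type_index(field_name):
    """Return an index definition that maps field_name as both string and token."""
    return {
        "mappings": {
            "dynamic": False,
            "fields": {
                field_name: [
                    {"type": "string"},  # for operators such as text and phrase
                    {"type": "token"},   # for efficient filtering and sorting
                ]
            },
        }
    }


# "title" is an illustrative field name, not one taken from the text above.
index_definition = dual_type_index("title")
print(json.dumps(index_definition, indent=2))
```

The printed JSON is in the shape the JSON Editor in the Atlas UI accepts for a field with more than one type mapping.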

You can't use the token type for children of fields that are indexed as the embeddedDocuments type.

To define the index for the token type, choose your preferred configuration method in the Atlas UI and then select the database and collection.

The Atlas Search token type takes the following parameters:

| Option | Type | Necessity | Description | Default |
|--------|------|-----------|-------------|---------|
| type | string | Required | Human-readable label that identifies this field type. Value must be token. | |
| normalizer | string | Optional | Type of transformation to perform on the field value. Value can be lowercase (transforms text values in string fields to lowercase) or none (performs no transformation). If you don't set this option explicitly, it defaults to none. | none |
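
The parameters above could be assembled and checked with a small helper like the following. This is a minimal sketch: token_field is a hypothetical function, and title is an assumed field name.

```python
ALLOWED_NORMALIZERS = {"lowercase", "none"}


def token_field(normalizer=None):
    """Build a token field mapping, applying the documented default."""
    if normalizer is None:
        normalizer = "none"  # default when the option is not set explicitly
    if normalizer not in ALLOWED_NORMALIZERS:
        raise ValueError(
            f"normalizer must be one of {sorted(ALLOWED_NORMALIZERS)}, got {normalizer!r}"
        )
    return {"type": "token", "normalizer": normalizer}


# Assumed field name "title"; lowercase normalization ignores letter case
# when the indexed value is compared or sorted.
index_definition = {
    "mappings": {
        "dynamic": False,
        "fields": {"title": token_field("lowercase")},
    }
}
```

Validating the normalizer before building the definition surfaces a typo such as "lower" locally instead of when the index is created.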

The following index definition example uses the sample_mflix.movies collection. If you have the sample data already loaded on your cluster, you can use the Visual Editor or JSON Editor in the Atlas UI to configure the index. After you select your preferred configuration method, select the database and collection, and refine your index to add field mappings.
