How to Define a Custom Analyzer and Run an Atlas Search Diacritic-Insensitive Query
This tutorial describes how to create an index that uses a
custom analyzer and run a
diacritic-insensitive query against the sample_mflix.movies
collection. It takes you through the following steps:
Set up an Atlas Search index on the title and genres fields in the sample_mflix.movies collection.
Run an Atlas Search compound query against the title and genres fields in the sample_mflix.movies collection using the wildcard and text operators.
Before you begin, ensure that your Atlas cluster meets the requirements described in the Prerequisites.
Required Access
To create an Atlas Search index, you must have Project Data Access Admin
or higher access to the project.
Create the Atlas Search Index
In this section, you will create an Atlas Search index on the title and genres fields in the sample_mflix.movies collection.
Navigate to the Atlas Search page for your project.
If it is not already displayed, select the organization that contains your desired project from the Organizations menu in the navigation bar.
If it is not already displayed, select your desired project from the Projects menu in the navigation bar.
Click your cluster's name.
Click the Search tab.
Enter the Index Name, and set the Database and Collection.
In the Index Name field, enter diacritic-insensitive-tutorial.
Note
If you name your index default, you don't need to specify an index parameter when using the $search pipeline stage. Otherwise, you must specify the index name using the index parameter.
In the Database and Collection section, find the sample_mflix database, and select the movies collection.
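As a small illustration of the note about index names, the following sketch builds a $search stage as a plain Python dictionary and adds the index field only when the index is not named default. The helper name search_stage is hypothetical, and the text operator arguments are placeholders rather than this tutorial's query.

```python
def search_stage(operator, index_name="default"):
    """Build a $search stage; add "index" only for non-default index names."""
    stage = dict(operator)
    if index_name != "default":
        stage["index"] = index_name
    return {"$search": stage}


# Placeholder operator used only to show where "index" goes.
text_on_genres = {"text": {"query": "Drama", "path": "genres"}}

# Index named "default": the index parameter can be omitted.
default_stage = search_stage(text_on_genres)

# Index named as in this tutorial: the name must be given explicitly.
named_stage = search_stage(text_on_genres, "diacritic-insensitive-tutorial")

print(default_stage)
print(named_stage)
```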
Specify an index definition.
This index definition for the genres and title fields specifies a custom analyzer, diacriticFolder, using the following:
keyword tokenizer that tokenizes the entire input as a single token.
icuFolding token filter that applies character foldings such as accent removal and case folding.
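To get a feel for what this tokenizer and filter combination does to a title, the sketch below approximates it in plain Python. It is not the ICU folding algorithm that Atlas Search applies: it only decomposes characters with unicodedata, drops combining marks, and calls casefold, which handles common accents but not every ICU folding rule. The function name fold_keyword is hypothetical.

```python
import unicodedata


def fold_keyword(text):
    """Return a one-element token list: the whole input, accent- and case-folded."""
    # Split accented characters into base character + combining mark.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks (the accents).
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Keyword-style tokenization: the entire string stays a single token.
    return [stripped.casefold()]


print(fold_keyword("Allèluia Café"))  # ['alleluia cafe']
```

Because the whole title is kept as one token, a match has to account for the full string, which is why a wildcard pattern is a natural fit for querying such a field.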
The index definition specifies a string type for the genres and title fields. It also applies the custom analyzer named diacriticFolder on the title field.
Use the Visual Editor or JSON Editor in the Atlas user interface to create the index.
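For the JSON Editor, an index definition matching the description in this section might look like the following sketch. It is reconstructed from the prose here rather than copied from the Atlas user interface, so check it against the editor's validation before saving. The variable name index_definition is an assumption; the script prints JSON that can be pasted into the editor.

```python
import json

# Sketch of the index definition described above (an assumption, not an
# authoritative copy): string mappings for genres and title, with the
# custom analyzer applied to title only.
index_definition = {
    "mappings": {
        "fields": {
            "genres": {"type": "string"},
            "title": {"type": "string", "analyzer": "diacriticFolder"},
        }
    },
    "analyzers": [
        {
            "name": "diacriticFolder",
            "charFilters": [],
            # keyword tokenizer: the entire input becomes a single token
            "tokenizer": {"type": "keyword"},
            # icuFolding: accent removal, case folding, and similar foldings
            "tokenFilters": [{"type": "icuFolding"}],
        }
    ],
}

print(json.dumps(index_definition, indent=2))
```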
Search the Collection
You can use the compound operator to combine two or more operators into a single query. The sample query in this section uses the compound operator to query the title and genres fields in the movies collection using multiple operators.
In this section, connect to your Atlas cluster and run the sample query against the sample_mflix.movies collection using the compound operator.
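A compound query of this shape could be sketched in Python as follows. The search terms (the wildcard pattern allè* and the genre Drama), the must/should arrangement, and the $project stage are illustrative assumptions, not values stated in this section. The function names build_pipeline and run_query are hypothetical, and run_query assumes the pymongo driver is installed. The allowAnalyzedField option is set because the wildcard clause targets title, which is indexed with a custom analyzer.

```python
def build_pipeline(title_pattern, genre, index_name="diacritic-insensitive-tutorial"):
    """Build an aggregation pipeline with a compound $search stage."""
    return [
        {
            "$search": {
                "index": index_name,
                "compound": {
                    # Required clause: wildcard match on the folded title.
                    "must": [
                        {
                            "wildcard": {
                                "query": title_pattern,
                                "path": "title",
                                "allowAnalyzedField": True,
                            }
                        }
                    ],
                    # Optional clause: documents in this genre score higher.
                    "should": [{"text": {"query": genre, "path": "genres"}}],
                },
            }
        },
        {
            "$project": {
                "_id": 0,
                "title": 1,
                "genres": 1,
                "score": {"$meta": "searchScore"},
            }
        },
    ]


def run_query(uri, title_pattern="allè*", genre="Drama"):
    """Run the pipeline against sample_mflix.movies (requires pymongo)."""
    # Imported here so build_pipeline can be used without the driver installed.
    from pymongo import MongoClient

    client = MongoClient(uri)
    try:
        movies = client["sample_mflix"]["movies"]
        return list(movies.aggregate(build_pipeline(title_pattern, genre)))
    finally:
        client.close()


pipeline = build_pipeline("allè*", "Drama")
```

To execute it, call run_query with the connection string for your own Atlas cluster and iterate over the returned documents.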