Specify the Default Language for a Text Index on Self-Managed Deployments

By default, the default_language for text indexes is english. To improve the performance of non-English text search queries, you can specify a different default language associated with your text index.

The default language associated with the indexed data determines the suffix stemming rules. It also determines which language-specific stop words (for example, "the", "an", "a", and "and" in English) are not indexed.

To specify a different language, use the default_language option when creating the text index. To see the languages available for text indexing, see Text Search Languages on Self-Managed Deployments. Your operation should resemble this prototype:

db.<collection>.createIndex(
   { <field>: "text" },
   { default_language: <language> }
)

If you specify a default_language value of none, the text index uses simple tokenization: it indexes every word in the field, including stop words, and applies no suffix stemming.
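As a minimal sketch of the none setting, the snippet below builds the two createIndex arguments and runs the call only when a mongosh db object is available. The notes collection and body field are hypothetical names chosen for illustration; they do not appear elsewhere on this page.

```javascript
// Hypothetical collection ("notes") and field ("body"), used only for illustration.
const keys = { body: "text" };

// "none" keeps stop words in the index and skips suffix stemming.
const options = { default_language: "none" };

// The createIndex call runs only inside mongosh, where the global `db` exists.
if (typeof db !== "undefined") {
  db.notes.createIndex(keys, options);
}
```

Because a collection can have at most one text index, this assumes notes has no existing text index.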

Create a quotes collection that contains the following documents with a Spanish text field:

db.quotes.insertMany( [
   {
      _id: 1,
      quote: "La suerte protege a los audaces."
   },
   {
      _id: 2,
      quote: "Nada hay más surrealista que la realidad."
   },
   {
      _id: 3,
      quote: "Es este un puñal que veo delante de mí?"
   },
   {
      _id: 4,
      quote: "Nunca dejes que la realidad te estropee una buena historia."
   }
] )

The following operation creates a text index on the quote field and sets the default_language to spanish:

db.quotes.createIndex(
   { quote: "text" },
   { default_language: "spanish" }
)
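One way to confirm the setting is to inspect the index list returned by getIndexes(). The sketch below assumes the text index entry has a key of the form { _fts: "text", _ftsx: 1 } and carries a default_language field. The helper findTextIndex is a hypothetical name introduced here, not part of the MongoDB API.

```javascript
// Hypothetical helper: returns the text index description from a
// getIndexes() result, or undefined when the collection has none.
function findTextIndex(indexes) {
  return indexes.find((ix) => ix.key && ix.key._fts === "text");
}

// The lookup runs only inside mongosh, where the global `db` exists.
if (typeof db !== "undefined") {
  const textIndex = findTextIndex(db.quotes.getIndexes());
  console.log(textIndex ? textIndex.default_language : "no text index found");
}
```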

The resulting index supports text search queries on the quote field with Spanish-language suffix stemming rules. For example, the following query searches for the keyword punal in the quote field:

db.quotes.find(
   {
      $text: { $search: "punal" }
   }
)

Output:

[
   {
      _id: 3,
      quote: "Es este un puñal que veo delante de mí?"
   }
]

Although the $search value is punal, the query returns the document that contains the word puñal because text indexes are diacritic insensitive.
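To illustrate why punal and puñal compare as equal, the following sketch folds diacritics with standard JavaScript string methods. It is a rough approximation for intuition only, not MongoDB's actual folding logic, and foldDiacritics is a hypothetical helper name.

```javascript
// Hypothetical helper: decomposes accented characters (NFD) and then
// removes the combining marks, leaving only the base letters.
function foldDiacritics(text) {
  return text.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
}

const indexedWord = "puñal";
const searchTerm = "punal";

// Both sides reduce to the same base letters, so they compare as equal.
console.log(foldDiacritics(indexedWord) === foldDiacritics(searchTerm)); // true
```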

The index also ignores language-specific stop words. For example, although the document with _id: 2 contains the word hay, the following query returns no documents, because hay is a Spanish stop word and is therefore not included in the text index:

db.quotes.find(
   {
      $text: { $search: "hay" }
   }
)
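The effect of stop-word removal at index time can be pictured with a toy tokenizer. This is a simplified sketch, not MongoDB's tokenizer: the stop-word set is a tiny hand-picked sample (only hay is confirmed by the text above; the other entries are assumptions), and indexableTerms is a hypothetical helper name.

```javascript
// Tiny illustrative sample. A real Spanish stop-word list is much longer
// and would likely drop more words from this sentence.
const SAMPLE_STOP_WORDS = new Set(["hay", "que", "la"]);

// Hypothetical helper: lowercases the text, splits on non-letter
// characters, and drops empty strings and sample stop words.
function indexableTerms(text) {
  return text
    .toLowerCase()
    .split(/[^\p{L}]+/u)
    .filter((word) => word.length > 0 && !SAMPLE_STOP_WORDS.has(word));
}

const terms = indexableTerms("Nada hay más surrealista que la realidad.");

// "hay" never reaches the term list, so a search for it has nothing to match.
console.log(terms.includes("hay")); // false
```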
