Welcome to the community @Katsuhiro_Mihara!
Per the Text Search documentation, whitespace characters are not part of the search terms:
$text will tokenize the search string using whitespace and most punctuation as delimiters, and perform a logical OR of all such tokens in the search string.
This means the behaviour will be similar to SQL-92 by default. However, you can achieve a stricter result using phrase matching if appropriate.
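As a rough illustration of why leading and trailing spaces disappear, here is a minimal sketch of delimiter-based tokenization in plain JavaScript. The tokenize function is hypothetical and is not MongoDB's actual tokenizer: it assumes every non-alphanumeric character is a delimiter, and it skips stemming, stop words, and the special handling of quotes and hyphens.

```javascript
// Simplified model only: NOT MongoDB's real $text tokenizer.
// Assumption: any run of characters that are not letters or digits
// acts as a delimiter between tokens.
function tokenize(searchString) {
  return searchString
    .toLowerCase()
    .split(/[^\p{L}\p{N}]+/u)
    .filter((token) => token.length > 0); // drop empty strings left by leading/trailing delimiters
}

console.log(tokenize("coffee"));      // [ 'coffee' ]
console.log(tokenize("coffee "));     // [ 'coffee' ]
console.log(tokenize(" coffee "));    // [ 'coffee' ]
console.log(tokenize("coffee, tea")); // [ 'coffee', 'tea' ]
```

Under this model, "coffee" and "coffee " reduce to the same single token, which is why both search strings would be expected to return the same documents.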
You can confirm the behaviour by setting up some test data:
db.cuppa.insert([
  { _id: 1, name: "coffee" },
  { _id: 2, name: " coffee" },
  { _id: 3, name: "coffee " },
  { _id: 4, name: " coffee " },
  { _id: 5, name: "tea" }
])
db.cuppa.createIndex( { name: "text" } )
Default text search behaviour
A search on coffee will match every example document whose name contains coffee, regardless of surrounding whitespace:
> db.cuppa.find( { $text: { $search: "coffee" } } )
{ "_id" : 4, "name" : " coffee " }
{ "_id" : 3, "name" : "coffee " }
{ "_id" : 2, "name" : " coffee" }
{ "_id" : 1, "name" : "coffee" }
A search with trailing spaces will perform identically:
> db.cuppa.find( { $text: { $search: "coffee " } } )
{ "_id" : 4, "name" : " coffee " }
{ "_id" : 3, "name" : "coffee " }
{ "_id" : 2, "name" : " coffee" }
{ "_id" : 1, "name" : "coffee" }
Phrase matching
You can match a trailing space by wrapping the search term in double quotes to perform a phrase match:
> db.cuppa.find( { $text: { $search: "\"coffee \"" } } )
{ "_id" : 4, "name" : " coffee " }
{ "_id" : 3, "name" : "coffee " }
Explaining the results
If you want to understand differences in how these text search queries are processed, you can explain() the queries and look at the parsedTextQuery outcome for the winning plan:
> db.cuppa.find( { $text: { $search: "coffee " } } ).explain().queryPlanner.winningPlan.parsedTextQuery
{
"terms" : [
"coffe"
],
"negatedTerms" : [ ],
"phrases" : [ ],
"negatedPhrases" : [ ]
}
In this example, coffe is the stemmed version of coffee in English (according to the Snowball stemming algorithm), and whitespace characters have been removed by default.
A query with phrase matching has the same stemming outcome but adds a phrase match filter:
> db.cuppa.find( { $text: { $search: "\"coffee \"" } } ).explain().queryPlanner.winningPlan.parsedTextQuery
{
"terms" : [
"coffe"
],
"negatedTerms" : [ ],
"phrases" : [
"coffee "
],
"negatedPhrases" : [ ]
}
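To make the two-stage behaviour more concrete, here is a small sketch in plain JavaScript that imitates it against the same sample data. It is only a simplified model under stated assumptions, not how the server is implemented: the phraseSearch function is hypothetical, stemming is skipped (the unstemmed term coffee is used), and the phrase check is assumed to be a case-insensitive substring test against the original field value.

```javascript
// Same sample documents as the cuppa collection above.
const docs = [
  { _id: 1, name: "coffee" },
  { _id: 2, name: " coffee" },
  { _id: 3, name: "coffee " },
  { _id: 4, name: " coffee " },
  { _id: 5, name: "tea" },
];

// Hypothetical helper: a simplified model of term matching followed by a phrase filter.
function phraseSearch(documents, term, phrase) {
  return documents
    // Stage 1: term match on whitespace-delimited tokens (no stemming in this sketch).
    .filter((doc) => doc.name.toLowerCase().split(/\s+/).includes(term))
    // Stage 2: phrase filter, keeping the whitespace inside the quoted phrase.
    .filter((doc) => doc.name.toLowerCase().includes(phrase.toLowerCase()));
}

const matches = phraseSearch(docs, "coffee", "coffee ");
console.log(matches.map((doc) => doc._id)); // [ 3, 4 ]
```

Only the documents whose name has a space directly after coffee survive the second stage, which lines up with the phrase search results shown earlier.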
If you are performing a different type of text search (for example, a regular expression match or Atlas Search), please provide a sample document, an example of your search query, and the version of your MongoDB server.
Regards,
Stennie