Text search in aggregation

I have a collection named users, it has following attributes

{
  "_id": "937a04d3f516443e87abe8308a1fe83e",
  "username": "andy",
  "full_name": "andy white",
  "image" : "https://abc.com/xy.jpg",
... etc
}

i want to make a text search on full_name and username using aggregation pipeline, so that if a user search for any 3 letters, then the most relevant full_name or username returned sorted by relevancy,
i have already created text index on username and full_name and then i tried query from below link:

match_stage = [
    {"$match": {"$text": {"$search": "abc"}}},
    {"$sort": {"score": {"$meta": "textScore"}}},
    {"$project": {"username": 1,"full_name": 1,"image":1}}
]

stages = [
    *match_stage
]
users = users_db.aggregate(stages)

but i am getting below error:

pymongo.errors.OperationFailure: FieldPath field names may not start with ‘$’. Consider using $getField or $setField., full error: {‘ok’: 0.0, ‘errmsg’: “FieldPath field names may not start with ‘$’. Consider using $getField or $setField.”, ‘code’: 16410, ‘codeName’: ‘Location16410’, ‘$clusterTime’: {‘clusterTime’: Timestamp(1657811022, 14), ‘signature’: {‘hash’: b’a\xb4rem\x02\xc3\xa2P\x93E\nS\x1e\xa6\xaa\xb0\xb1\x85\xb5’, ‘keyId’: 7062773414158663703}}, ‘operationTime’: Timestamp(1657811022, 14)}

Note: i am using pymongo

@Zeeshan_Anis, your PyMongo code works fine, without any errors. Just verify if the Text Index is created by running this command from the mongosh or mongo shell - db.collection.getIndexes(), or just lookup in the Compass under the Indexes tab.

What are the versions of MongoDB database, Python and PyMongo you are working with?


Hi @Prasad_Saya i am using MongoDB 5.0.9 Enterprise, Python 3.9.7 and pymongo 4.1.1
screenshot attached after running getindexes on users collection, right now i have removed index from username and index added on just full_name attribute just to get it start working but this issue is still unresolved

@Zeeshan_Anis, I tried the following code and works fine using MongoDB v4.2, Pyhton 3.8 and PyMongo 4.x. The code creates a new collection test in the database test, inserts a document in it, queries the document, creates a Text Index on the field full_name, and queries on the full_name using an aggregation $match stage.

I don’t see any errors in the output. The aggregation prints the document.

import pymongo
client = pymongo.MongoClient()
collection = client.test.test

doc = { "username" : "andy", "full_name" : "andy white", "image" : "https://abc.com/xy.jpg" }
result = collection.insert_one(doc)
print(result, '\n')

for doc in collection.find():
	print(doc)

result = collection.create_index([( "full_name", pymongo.TEXT )])
print('\n', result, '\n')

result = list(collection.list_indexes())
print(result, '\n')

pipeline = [
    { "$match": { "$text": { "$search": "white" } } },
    { "$sort": { "score": { "$meta": "textScore" } } },
    { "$project": { "username": 1, "full_name": 1, "image": 1 } }
]

print(list(collection.aggregate(pipeline)))
1 Like

Thanks @Prasad_Saya for your help, i also able to run this but how can i search with half name, like i search for “and” and i get all the results closer to “and” and sorted by relevancy

@Zeeshan_Anis, for searching with a substring of text field you can use the regular expressions. There is a $regex query operator and also $regexMatch and $regexFind aggregation operators you can try using.

1 Like

@Prasad_Saya i am trying below pipeline:

match_stage = [
    {"$match": {"$text": {"$search": "whi"}, "full_name": {"$regex": "whi"}}},
    {"$sort": {"score": {"$meta": "textScore"}}}
]

but i again started getting below error:

    raise OperationFailure(errmsg, code, response, max_wire_version)
pymongo.errors.OperationFailure: FieldPath field names may not start with '$'. Consider using $getField or $setField., full error: {'ok': 0.0, 'errmsg': "FieldPath field names may not start with '$'. Consider using $getField or $setField.", 'code': 16410, 'codeName': 'Location16410', '$clusterTime': {'clusterTime': Timestamp(1657869096, 4), 'signature': {'hash': b'\xb6\x03Nh\x8cX\xab\xb9)\xa2\x8c_^\xa8\x0f\xf25\xbd\x89_', 'keyId': 7062773414158663703}}, 'operationTime': Timestamp(1657869096, 4)}

one more thing when i was trying without regex, i was getting error due to $project, like the query you are running above successfully, i am able to run if i remove $project from there, but with project my query breaks with the error i posted in my first question, its strange

Are you sure its the correct syntax? I don’t know if you can use the $regex operator within a Text Search.

its the syntax i used before but without text search, but i found one link to use regex with search just now and trying:

About the link in your previous comment:

@Zeeshan_Anis I am not familiar with the Atlas Search. You can lookup in the MongoDB Server documentation.

@Prasad_Saya thanks, i tried below pipeline:

match_stage = [{
    "$search": {
        "index": "full_name_txt",
        "regex": {
            "query": search_key,
            "path": "full_name"

        }
    }
}
]

but its not searching partial strings, i am trying more, but meanwhile can u give the idea, why query is failing in aggregate with $project which i posted at first i.e:

match_stage = [
    {"$match": {"$text": {"$search": "whit"}, "full_name": {"regex": "whit"}}},
    {"$sort": {"score": {"$meta": "textScore"}}},
    {"$project": {"username": 1, "full_name": 1}}
]

Hi @Zeeshan_Anis

I think you’re mixing Atlas Search & Legacy Text Search. Those are the two types of text search supported in MongoDB (see Text Search). However, to use Atlas Search your data would need to be in MongoDB Atlas.

i am using MongoDB 5.0.9 Enterprise

This is an on-prem installation, thus it only supports the Legacy Text Search.

if a user search for any 3 letters, then the most relevant full_name or username returned sorted by relevancy

The legacy text search does not support partial matches. If this is your requirement, then you would have to use Atlas search. There is a tutorial to do exactly this: How to Run Partial Match Atlas Search Queries. There are also examples using Python in the page (note that you can select the language for the examples in that page).

However should you decide that your data needs to stay on-prem, then @Prasad_Saya’s working example is a great starting point. You just don’t have the ability to do partial matches.

Best regards
Kevin

1 Like

Thanks for your answer @kevinadi. its new for me , i will try this partial search in Atlas as i am using Mongo DB Atlas