How to get access to ONLY the object in sub-document where there was a hit (match) in Atlas Search

Hello All:

I am stuck with something that should be very simple to do, but am not able to find any straightforward way of doing using Mongo DB Atlas Search.

Firstly, the MongoDb videos said that the best way to structure no-sql databases like mongo db is not separating data into separate collections and linking via foreign keys, but by embedding and creating sub-documents, so sub-documents (or nested documents) are the recommended data structure of choice for Mongo db.

Now I have a USERS collection and this contains sub collections of related user data. I use the Atlas Search to do a text search on field in a nested or sub -document in the USER collection and using PROJECT I am able to get the fields for the USER where there was a match in the sub-document. However I cannot for the life of me get the _id of the sub document where the actual match took place!

In the Highlights I get the matched text. But i need the _id of the sub-document where the match happened. This should be very straightforward to get in any database system.

I did some research and in other threads there was a recommendation to do some kind of nasty workaround like $unwind the search results and do another $match on them. However I have multiple hits with the same or similar keyword (given the fuzzy aspect of the text search of Atlas search) and the subsequent running of $match as a workaround will result in serious bugs as other records could be matched rather than the one that was initially identified by the $Search.

I am pasting the aggregation code below:

[
       {
         '$search': {
          'index': "indexDeos",
           'text': {
             'query': 'trumpet', 
             'path': 'deos.text',
             'fuzzy': {}
           }, 
           'highlight': {
             'path': 'deos.text'
           }
         }
       }, {
         '$limit': 10
       }, {
         '$project': {
           '_id': 1, 
           'firstname': 1,
           'lastname': 1, 
           'thumbnail': 1,
           'deos': 1,
           'highlights': {
             '$meta': 'searchHighlights'
           }, 
           'score': {
             '$meta': 'searchScore'
           }
         }
       }
     ]

So the query is for the keyword “trumpet” this happens in a subdocument called deos.text. In my projection I have deos listed, and so I get all the deos back, including multiple ‘trumpets’, ‘trampets’ and so on, but I cannot find a way to get the _id of the record where there was the initial hit in the $search.

Please help. To get the _id of the document where the $search found a hit should be elementary.

First to dispel the myth that best way is to embed sub-documents - this is entirely dependent on the use case / workload. Data modeling and schema design in MongoDB are based on the details of how it works internally and what’s needed from the aggregation pipeline. With search, the main collection documents are what are ultimately matched and returned - not sub-documents independently. Consider making deos sub-documents their own collection and searching those instead - what would that look like scale-wise, aggregation pipeline-wise, performance-wise?

From your query, I’m deducing you’re using the document type for deos sub-documents. Could you share your index configuration too?

What’s the scale you’re working with? How many main collection documents? What’s the average number of deos sub-documents per main collection document?

@Arjun_kochhar_leodeo it sounds like you’d like to return only the matching sub-documents of the array, is that right? As Erik mentioned, Atlas Search matches and returns parents documents only. +1 on being interested about the scale of your use case.

To request this functionality and get alerted on updates, I recommend voting on this item in our feedback portal. You can see in the most recent comment that it’s possible to use a similar or equivalent $filter expression to return matching sub-documents – I recreated an example that might work for you here: Atlas Search Playground