Find with Multiple Conditions: Text Search OR In Tag Array

Greetings,

I’m looking to find the latest XX documents in a collection by 1) KEYWORD text search OR 2) tags (array) contains KEYWORD. Something like this (which doesn’t seem to work):

.find(
   { $text: { $search: KEYWORD } },
   { tags: { $in: [KEYWORD] } }
)

So: Is it possible to find by text search and tags in the same query? If so, could someone point me toward the proper syntax?

Related: Would it be more performant to just include the tags field in my text index and simply perform a single text search?

Thanks!

Hi @Jet_Hin,

Since you’re using find(), I’m assuming you’re using Text Search Operators on a Self-Managed Deployments. However, I see that the question is tagged with atlas search. Note that Atlas Search requires the query to use the aggregation pipeline.

Something like this (which doesn’t seem to work):

What doesn’t work specifically? Are you getting any particular error?

Based off your example, it seems you have placed one of the query operators as a projection. More information in the db.collection.find() documentation.

In saying so, please see the similar example below where KEYWORD is "test":

DB> KEYWORD
test
DB> db.testcoll.find({$text:{$search:KEYWORD},tags:{$in:[KEYWORD]}})
[
  {
    _id: ObjectId("63edb5ad8f476e1a65817e7a"),
    tags: 'test',
    sometextfield: 'test'
  }
]

However, I do note you have specified “OR” in your conditions. In this case, perhaps the below example would suit your use case:

db.testcoll.find({$or:[{$text:{$search:KEYWORD}},{tags:{$in:['test']}}]})
[
  {
    _id: ObjectId("63edb5ad8f476e1a65817e7a"),
    tags: 'test',
    sometextfield: 'test'
  }
]

Please note the following:

  • This was only tested on a sample document noted above in my test environment. If you believe it suits your use case please test thoroughly to verify it suits all your requirements.
  • As per the $or operator documentation:

If $or includes a $text query, all clauses in the $or array must be supported by an index. This is because a $text query must use an index, and $or can only use indexes if all its clauses are supported by indexes. If the $text query cannot use an index, the query will return an error.

Might be worth testing this in your own environment since I have only tried this on my my test environment which consists of only a few documents but the results are as follows from my testing:

  1. Single text index consisted of 1 IXSCAN and 1 FETCH
  2. using $or with a text index and standard index on the tags field consisted of 2 IXSCAN’s and 2 FETCH stages.

For more information regarding the above, I would refer to the db.collection.explain("executionStats").find() documentation.

Some operator documentation references for your information:

I would be curious to also know if you’re using an on-prem environment or Atlas deployment? If you’re using Atlas, you may wish to consider looking into using Atlas Search to see if it suits your use case and requirements.

Regards,
Jason

1 Like

Hi Jason, thanks for getting back to me.

Yes, I’m seeking to use find() in an Atlas Search collection.

I have title and description fields which are indexed, but the tags field (array) is not included in the index.

My basic example doesn’t throw an error, but it appears that only the text search is returning results – the tags query appears to be ignored.

The $or query you suggested throws this error: “Failed to produce a solution for TEXT under OR - other non-TEXT clauses under OR have to be indexed as well.” I assume this is as per the documentation you referenced.

Though it might be nice to combine the results from a text search of my indexed fields along with separate “in array” query, it seems it’s not possible. So I suppose the easiest solution would be for me to recreate the index and include the tags array for find purposes.

Atlas search looks very cool, but is frankly overkill for my very basic application.

Thanks again for your help, Jet

Hi @Jet_Hin,

As noted before, Atlas Search’s $search stage requires the use of the aggregation pipeline (not .find()). I believe you have a "text" index on a collection which just so happens to be hosted in an Atlas Deployment as opposed to an “Atlas Search collection” . The "text" index is different to an Atlas Search Index.

Based of the error, you’ll probably need another index on the $or field as well (that is not the text index field).

Are you able to provide some sample document(s) along with the expected output? Just trying to get an idea of what the document(s) look like and what the output you’re trying to achieve is.

I’m curious to know why it would be overkill for your use case. If you have any these details I could possibly communicate these to the product team.

Regards,
Jason

Hi Jason,

Here’s a sample document:

{
  "_id": {
    "$oid": "xxx"
  },
  "title": "Sandy knocked them down. Nothing will make them leave.",
  "description": "The risk of hurricanes hitting New York and southern New England is definitely going up.",
  "tags": [
    "hurricanes","hurricane sandy","New York","New England"
  ]
}

The title and description fields are in a text index; the tags array is not included in the index. I’d like to return the X most recent documents where 1) $text: { $search: KEYWORD } OR 2) tags: { $in: [KEYWORD] }.

I think the $or error is indicating that I need to include the tags field in my index (i.e. a text search doesn’t support a separate $in query.)

My reasoning re: Mongo text vs search is that I just need to find records by simple term matching and return a list sorted by date. Relevance isn’t required.

I’m not entirely sure the error indicates that the tags field needs to be included in the text index (unless you are advising that it just needs to be indexed which I believe is the case). However, see my testing below.

I reproduced the error below by creating a text index only on the 2 fields and then trying to perform the $or operation:

DB> db.collection.createIndex({title:"text",description:"text"})
title_text_description_text
DB> db.collection.find({$text:{$search:'hurricanes'}})
[
  {
    title: 'Sandy knocked them down. Nothing will make them leave.',
    description: 'The risk of hurricanes hitting New York and southern New England is definitely going up.',
    tags: [ 'hurricanes', 'hurricane sandy', 'New York', 'New England' ]
  }
]
/// Performing a query with `$or` now with a text search:
DB> db.collection.find({$or:[{$text:{$search:'hurricanes'}},{tags:{$in:['hurricanes']}}]})
Uncaught:
MongoServerError: error processing query: ns=textdb.collectionTree: $or
    tags $eq "hurricanes"
    TEXT : query=hurricanes, language=english, caseSensitive=0, diacriticSensitive=0, tag=NULL

After creating an index on tags and executing the same $or query above:

DB> db.collection.createIndex({tags:1})
tags_1
DB> db.collection.find({$or:[{$text:{$search:'hurricanes'}},{tags:{$in:['hurricanes']}}]})
[
  {
    title: 'Sandy knocked them down. Nothing will make them leave.',
    description: 'The risk of hurricanes hitting New York and southern New England is definitely going up.',
    tags: [ 'hurricanes', 'hurricane sandy', 'New York', 'New England' ]
  }
]

As you have advised before, you could create the text index so that it also includes the tags field (I added the array element "Dragon" only to the tags field for demonstration):

/// Note : I dropped all the previous indexes mentioned above
DB> db.collection.createIndex({title:"text",description:"text",tags:"text"})
title_text_description_text_tags_text
db.collection.find({$text:{$search:'Dragon'}})
[
  {
    title: 'Sandy knocked them down. Nothing will make them leave.',
    description: 'The risk of hurricanes hitting New York and southern New England is definitely going up.',
    tags: [
      'hurricanes',
      'hurricane sandy',
      'New York',
      'New England',
      'Dragon'
    ]
  }
]

Thanks for the clarification Jet :slight_smile: