I need help regarding the use of the $or operator

I am using Python + pymongo to develop a website, but the process encountered some query efficiency problems.

There are a lot of $or operators in the query process, and the query is too slow.

I checked with “explain” and $or uses a different index. I have also considered using multi-process/multi-threading to turn the $or condition into a single index query. But this cannot guarantee the “non-repetition” of the data.

For example, the following data:

{
 "name": "Alex",
  "Alias": "Ax"
  "other": "A..."
},
{
  "name":"b",
  "Alias": "bbb",
  "other": "..."
}

I need to perform an or search on “name”, “Alias” and “other”. If I use the $or operator, three queries are actually created, and they are synchronized. Correspond to their respective indexes.
I need to increase the query speed. I can create 3 threads to run each conditional query separately.

But doing so will cause duplicate data.

thread1: find({“name”:“$regex”:“A”})
thread2: find({“Alias”:$regex":“A”})
thread3: find({“other”:$regex":“A”})

They will all query this data:

{
 "name": "Alex",
  "Alias": "Ax"
  "other": "A..."
}

If so, I have to process the returned results of the three threads, which greatly reduces the query efficiency.
I want to know, is there any more efficient way for mongodb to deal with this kind of problem?

Hello @binn_zed, you can use the $or operator as follows using PyMongo.

collection.find( { '$or': [
    { 'name': { '$regex': 'A' } },
    { 'other': { '$regex': 'A' } }
] } )

See the following notes on optimizing your query:

  • Regex and Index Use

    For case sensitive regular expression queries, if an index exists for the field, then MongoDB matches the regular expression against the values in the index, which can be faster than a collection scan. Further optimization can occur if the regular expression is a “prefix expression”, which means that all potential matches start with the same string. This allows MongoDB to construct a “range” from that prefix and only match against those values from the index that fall within that range.

  • $or Clauses and Indexes

    When evaluating the clauses in the $or expression, MongoDB either performs a collection scan or, if all the clauses are supported by indexes, MongoDB performs index scans.


So, your query will benefit from having index on each of the three fields. Further optimization can occur if your regular expression is a “prefix expression” (as noted above in the in the Regex and Index Use), for example:

collection.find( { '$or': [
    { 'name': { '$regex': '^A' } },
    { 'other': { '$regex': '^A' } }
] } )
1 Like