Return list of values from mongo array that satisfy regex

field_name = taste
phrase = pep
example:
taste:[‘pear’, ‘pepper’, ‘toffi’]

my code:

result = list(
    collection.find({field_name: {'$regex': phrase, '$options': 'i'}}, {'_id': 0, field_name: 1}).skip(
        offset).limit(limit).distinct(field_name))

I would like it to go through collection and return all tastes that start with pep so expected result would be: [‘pepper’]
but instead it returns me the whole list [‘pear’, ‘pepper’, ‘toffi’]

Second approach with aggregation:

        result = list(collection.aggregate([
            {'$match': {field_name: {'$regex': phrase, '$options': 'i'}}},
            {

                '$project':
                    {
                        '_id': 0,
                        field_name: {
                            '$filter': {
                                'input': f'${field_name}',
                                'cond': {
                                    "$regexMatch": {
                                        'input': "$this",
                                        'regex': phrase,
                                        'options': 'i'
                                    }
                                }
                            }
                        }
                    }
            },
            {'$skip': offset},
            {'$limit': limit}
        ]))

result is [{‘taste’:[‘pepper’]},{‘taste’:[‘pepperoni’, ‘pepper’]}]

How to make it a list of distinct values?

1 Like

Hi @Thyme1 - Welcome to the community.

I had a similar approach with the following pipeline using $filter and $regexMatch:

{
  '$addFields': {
    filteredArray: {
      '$filter': {
        input: '$taste',
        cond: {
          '$regexMatch': { input: '$$taste', regex: /pep/, options: 'i' }
        },
        as: 'taste'
      }
    }
  }
}

In my test environment, I had the following documents:

DB>db.coll.find()
[
  {
    _id: ObjectId("6383fb1931f9fbfd0869f470"),
    taste: [ 'pear', 'pepper', 'toffi' ]
  },
  {
    _id: ObjectId("6383fd8f31f9fbfd0869f471"),
    taste: [ 'test', 'per', 'toffi' ]
  },
  {
    _id: ObjectId("6383fd9f31f9fbfd0869f472"),
    taste: [ 'testpepper', 'toffi' ]
  },
  {
    _id: ObjectId("6383fda631f9fbfd0869f473"),
    taste: [ 'testpepper', 'toffi', 'pepper' ]
  }
]

The output using the pipeline above:

[
  {
    _id: ObjectId("6383fb1931f9fbfd0869f470"),
    taste: [ 'pear', 'pepper', 'toffi' ],
    filteredArray: [ 'pepper' ]
  },
  {
    _id: ObjectId("6383fd8f31f9fbfd0869f471"),
    taste: [ 'test', 'per', 'toffi' ],
    filteredArray: []
  },
  {
    _id: ObjectId("6383fd9f31f9fbfd0869f472"),
    taste: [ 'testpepper', 'toffi' ],
    filteredArray: [ 'testpepper' ]
  },
  {
    _id: ObjectId("6383fda631f9fbfd0869f473"),
    taste: [ 'testpepper', 'toffi', 'pepper' ],
    filteredArray: [ 'testpepper', 'pepper' ]
  }
]

result is [{‘taste’:[‘pepper’]},{‘taste’:[‘pepperoni’, ‘pepper’]}]
How to make it a list of distinct values?

Based off the above 4 output documents, could you describe your desired output in regards distinct values?

Regards,
Jason

In the meantime, you might possibly be able to achieve what you are after regarding the distinct values using $unwind and $group.

I would like it to return just a list of distinct tastes matching regex so: [‘pepper’].

I think you are right, I will try to do it with $unwind and $group.

Oh, I found information that the .aggregate() method always returns Objects no matter what you do and that cannot change.

My goal is to create function that will hint values that are already in database to someone typing in form.

I think I found satisfying result based on Mongodb distinct on a array field with regex query? - Stack Overflow :

        result = list(collection.aggregate([
            # Match the possible documents. Always the best approach
            {'$match': {field_name: {'$regex': phrase, '$options': 'i'}}},
            # De-normalize the array content to separate documents
            {'$unwind': f'${field_name}'},
            # Now "filter" the content to actual matches
            {'$match': {field_name: {'$regex': phrase, '$options': 'i'}}},
            # Group the "like" terms as the "key"
            {
                '$group': {
                    '_id': f'${field_name}'
                }},
            {'$skip': offset},
            {'$limit': limit}
        ]))

So lets say i have two documents containing taste field
taste: [‘pepper’, ‘pepperoni’, ‘tofffi’]
taste: [‘pepper’, ‘pepsomething’, ‘salt’]
It will give me [{‘_id’: ‘pepper’}, {‘_id’: ‘pepperoni’}, {‘_id’: ‘pepsomething’}] which is fine, because I will just extract the values

1 Like

Sounds like you’ve gotten something that works for you. If so, please feel free to mark your comment as the solution :slight_smile:

Regards,
Jason

This topic was automatically closed 5 days after the last reply. New replies are no longer allowed.