Query nested documents mongodb with python pymongo

My data look like this:

{'_id': ObjectId('6068da8878fa2e568c42c7f1'),
 'first': datetime.datetime(2018, 1, 24, 14, 5),
 'last': datetime.datetime(2018, 1, 24, 15, 5),
 'maxid13': 12.5,
 'minid13': 7.5,
 'nsamples': 13,
 'samples': [{'c14': 'C',
              'id1': 3758.0,
              'id10': 0.0,
              'id11': 274.0,
              'id12': 0.0,
              'id13': 7.5,
              'id15': 0.0,
              'id16': 73.0,
              'id17': 0.0,
              'id18': 0.342,
              'id19': 6.3,
              'id20': 1206.0,
              'id21': 0.0,
              'id22': 0.87,
              'id23': 0.0,
              'id6': 2.0,
              'id7': -79.09,
              'id8': 35.97,
              'id9': 5.8,
              'timestamp1': datetime.datetime(2018, 1, 24, 14, 5),
              'timestamp2': datetime.datetime(2018, 1, 24, 9, 5)},
             {'c14': 'C',
              'id1': 3758.0,
              'id10': 0.0,
              'id11': 288.0,
              'id12': 0.0,
              'id13': 8.4,
              'id15': 0.0,
              'id16': 71.0,
              'id17': 0.0,
              'id18': 0.342,
              'id19': 6.3,
              'id20': 1207.0,
              'id21': 0.0,
              'id22': 0.69,
              'id23': 0.0,
              'id6': 2.0,
              'id7': -79.09,
              'id8': 35.97,
              'id9': 6.2,
              'timestamp1': datetime.datetime(2018, 1, 24, 14, 10),
              'timestamp2': datetime.datetime(2018, 1, 24, 9, 10)},
               .
               .
               .
               .

Can someone help on how to find for example the id13 when timestamp1 is equals to
datetime.datetime(2018, 1, 24, 14, 5)
Samples is an array.
This is what i have wrote.

cursor = mydb1.mongodbbucket.aggregate(
          [
              {
                "$match": {
                   "samples.timestamp1": {"$eq": datetime.strptime("2018-01-24 14:10:00", "%Y-%m-%d %H:%M:%S")}
                }
              },

              {
                  "$project": {

                      "samples.id13": 1
                  }
              },
            ]
        )

The ideal output would be id13:7.5

Hi @harris,

You need a postional projection using a find operation:

{"samples.$.id13": 1 }

Thanks
Pavel

1 Like

Thank you @Pavel_Duchovny.This is what i did to fix the problem cursor = mydb1.mongodbbucket.aggregate([ { "$unwind": "$samples" }, { "$match": { "samples.id13": { "$exists": true }, "samples.timestamp1": { "$eq": datetime.strptime("2018-01-24 14:10:00", "%Y-%m-%d %H:%M:%S") } } }, { "$project": { "samples.id13": 1 } } ])

Hi @harris,

I recommend moving match stages to first place to utilize filtering and indexes otherwise each document will need to be unwinded which is not necessary…

I recommend testing a positional projection for an optimal solution.

Thanks
Pavel

@Pavel_Duchovny If i do something like this it prints all the id13 from the samples which is not the ideal…do you mean something else?

cursor = mydb1.mongodbbucket.aggregate([

    {
        "$match": {

            "samples.timestamp1": { "$eq": datetime.strptime("2018-01-24 14:10:00", "%Y-%m-%d %H:%M:%S") }
        }
    },
    { "$unwind": "$samples" },
    {
        "$project": {
            "samples.id13": 1
        }
    }
])

Hi @harris,

You can do match => unwind => match.

But best is:

mydb1.mongodbbucket.find({       "samples.timestamp1": { "$eq": datetime.strptime("2018-01-24 14:10:00", "%Y-%m-%d %H:%M:%S") }}, {"samples.$.id13": 1 });

Thanks
Pavel

1 Like

Hello @Pavel_Duchovny .This is what i get when i do

mydb1.mongodbbucket.find({       "samples.timestamp1": { "$eq": datetime.strptime("2018-01-24 14:10:00", "%Y-%m-%d %H:%M:%S") }}, {"samples.$.id13": 1 });
pymongo.errors.OperationFailure: As of 4.4, it's illegal to specify positional operator in the middle of a path.Positional projection may only be used at the end, for example: a.b.$. If the query previously used a form like a.b.$.d, remove the parts following the '$' and the results will be equivalent., full error: {'ok': 0.0, 'errmsg': "As of 4.4, it's illegal to specify positional operator in the middle of a path.Positional projection may only be used at the end, for example: a.b.$. If the query previously used a form like a.b.$.d, remove the parts following the '$' and the results will be equivalent.", 'code': 31394, 'codeName': 'Location31394'}

I did
{"samples.id13.$": 1 } and now prints the right result!Thank you!

1 Like

@Pavel_Duchovny Hello.I have one question more.when you say do match->unwind->match do you mean the exact same match?..something like this :

cursor = mydb1.mongodbbucket.aggregate([

    {
        "$match": {

            "samples.timestamp1": { "$eq": datetime.strptime("2018-01-24 14:10:00", "%Y-%m-%d %H:%M:%S") }
        }
    },
    { "$unwind": "$samples" },

        "$match": {

            "samples.timestamp1": { "$eq": datetime.strptime("2018-01-24 14:10:00", "%Y-%m-%d %H:%M:%S") }
        }
    }
    {
        "$project": {
            "samples.id13": 1
        }
    }
])

Right?

Yep.

But it will unwind only filtered docs. Unwind is an expensive operations to run on all docs.

Thank you!I appreciate your help a lot!

1 Like

Hello @Pavel_Duchovny .Sorry for interapting you.I have one last question.Is it possible to use position projection on aggregate?cause i tried and did not have any result.Thanks in advance!

I don’t think so, i think you need to do a $filter operator during projection fetching one element.

In that case the unwind might be a cleaner solution…

1 Like

@Pavel_Duchovny Hello again.I have one question more if i may.
Is it possible to use position projection in a query like this:

mydb1.mongodbbucketright.find(
    {"samples.timestamp1": {"$gte": datetime.strptime("2010-01-01 00:05:00", "%Y-%m-%d %H:%M:%S"),
                               "$lte": datetime.strptime("2015-01-01 00:05:00", "%Y-%m-%d %H:%M:%S")},
     "samples.id13":{"$gt":5}},

    {"samples.$": 1 })

It seems that i get less results than expected…

Should i do it with an aggregate like this? :

mydb1.mongodbbucketright.aggregate([

    {
        "$match": {
            "samples.timestamp1": {"$gte": datetime.strptime("2010-01-01 00:05:00", "%Y-%m-%d %H:%M:%S"),
                          "$lte" :datetime.strptime("2015-01-01 00:05:00", "%Y-%m-%d %H:%M:%S")},
            "samples.id13": {"$gt": 5}
        }
    },
    { "$unwind": "$samples" },
    {
        "$match": {
            "samples.timestamp1": {"$gte": datetime.strptime("2010-01-01 00:05:00", "%Y-%m-%d %H:%M:%S"),
                      "$lte": datetime.strptime("2015-01-01 00:05:00", "%Y-%m-%d %H:%M:%S")},
            "samples.id13": {"$gt": 5}
        }
    },


])

Hi @harris,

Not sure maybe there is a cursor you need to iterate until exhausted…

Thanks
Pavel