Same query returning two different count

Hi,

I have a job that triggers the lambda and get the count from mongodb everyday. When i try to retrieve the data from mongodb using the below query {’$and’: [{’_id.a’: 402}, {‘t.u’: {’$gte’: datetime.datetime(2022, 6, 29, 22, 0, tzinfo=datetime.timezone.utc), ‘$lte’: datetime.datetime(2022, 8, 9, 22, 0, tzinfo=datetime.timezone.utc)}}, {’_id.s’: {’$gt’: 99999999}}]} . I’m getting two different counts. when the job ran , i got the count as 661 whereas the actual count was 340. But when i ran it separately in the morning i got the count as 340. No changes were made in the query. Even when the job ran the same query fetched the different count.

Can someone help me on this issue

Hi @Navaneethan_Sukumaran and welcome to the community!!

Could you help me with some informations for the above issue mentioned:

  1. By Lambda do you mean AWS Lambda?
  2. Can you post a code snippet that involves the problematic count? Currently in the post you only have the query but not the actual code. Would also be helpful to know the driver language you’re using.
  3. Is there a pattern for the count appearing to different in different executions or server responding with out of date responses?
  4. What is the MongoDB version you’re using?

You might want to check out:

  1. Write A Serverless Function with AWS Lambda and MongoDB | MongoDB
  2. Manage Connections with AWS Lambda

Please help me with the above details to assist further.

Thanks
Aasawari

HI Aasawari,

Please find the details below,

  1. Yes it is AWS Lambda.Please find the code snippet below,
    query = {’$and’: [{’_id.a’: int(asset_id)}, {
    ‘t.u’: {’$gte’: datetime.strptime(utc_from_date, ‘%Y-%m-%dT%H:%M:%S%z’),
    ‘$lte’: datetime.strptime(utc_to_date, ‘%Y-%m-%dT%H:%M:%S%z’)}}, {’_id.s’: {’$gt’: 99999999}}]}
    print(query)
    data = spots_col.aggregate([
    {’$match’: query},
    {’$count’: ‘count’}
    ])
    nielsen_mongo_count_list = [d for d in data]
    print(nielsen_mongo_count_list[0][‘count’])
  2. I dont get any server out of date response…The Mongodb count was different when aws lambda ran when it was triggered by the batch job. but when i trigger the lambda manually it i got the correct count. The query that ran in mongodb was exactly same with all the correct paramters.
    4.i’m using pymongo version pymongo==3.12.1

Hi @Navaneethan_Sukumaran

Thank you for sharing the above information, but I think we do not have enough information to determine what’s going on.

Could you help with a few more details which may be helpful:

  1. What is the MongoDB server version you are using?
  2. What is the deployment that you are working on? Is it an on-prem deployment? Or, are you using Atlas? Can you please share the connection string URI for the database? Please redact any passwords.
  3. Just a sanity check, can you please confirm if you are triggering the same function with and without the automatic trigger?
  4. While automatically triggering the function, do you see a wrong output in every execution consistently? If not, do you see any pattern with different responses received?

I think a better analysis could be made when we have more logging from the Lambda function’s execution. I would suggest adding more logging statements to the function, that way we can see whether the function was executed differently when it was manually triggered.

Aside from logging, perhaps you can also check out:

  1. Try to create a simple function to count the documents, and trigger it manually and automatically to ensure that the counting discrepancy was not due to an issue in the original function.
  2. Try to use MongoDB group accumulator count as an alternative to the $count aggregation stage.
  3. Follow the AWS Lambda database connection management recommendations in https://www.mongodb.com/docs/atlas/manage-connections-aws-lambda/ and the blog post in Write A Serverless Function with AWS Lambda and MongoDB | MongoDB might be a useful resource as well

Best Regards
Aasawari

1 Like