Difficulty Querying Nested Data in MongoDB Data Lake

Hello MongoDB Community,

I’m currently facing an issue while querying nested data stored in MongoDB Data Lake, and I’m seeking some guidance on how to effectively handle it.

Here’s a brief overview of my setup:

  • I’ve set up a MongoDB Data Lake to store and manage various datasets.
  • One of the datasets contains nested documents with complex structures.
  • I’m using MongoDB Atlas to query the data lake.

The problem arises when I try to query the nested data using aggregation pipelines or simple find queries. It seems that I’m having trouble navigating through the nested structure to extract the desired information.

For example, let’s say I have a dataset structured like this:

{
  "_id": ObjectId("60c2f396537a7e4326797d4c"),
  "name": "John Doe",
  "age": 30,
  "addresses": [
    {
      "type": "home",
      "city": "New York",
      "street": "123 Main St"
    },
    {
      "type": "work",
      "city": "San Francisco",
      "street": "456 Elm St"
    }
  ]
}

I want to retrieve all documents where the city in the “addresses” array is “New York”.

Could someone please provide insights into how I can effectively query this nested data? Any examples or suggestions on using aggregation pipelines or other methods would be greatly appreciated. :slightly_smiling_face: :innocent:

Thank you for your help!

db.collection.find({'addresses.city':'New York'})
or
db.collection.aggregate({$match:{'addresses.city':'New York'}})
You can also use index on that field
db.collection.createIndex({'addresses.city':1})

What is exactly the problem that arises?