Querying deeply nested documents for multiple fields with the same name

I am working with documents whose schemas contain multiple fields with a certain name (let’s call it “fieldOfInterest”), and I would like to make a list of the values of all “fieldOfInterest” fields within a given document (and eventually, query the collection of documents itself to get the list of all “fieldOfInterest” values for that collection). However, these fields are located at varying depths within the deeply nested structure of the mongoose schema, and some documents in the collection may have these fields filled out while others might not; it’s not as simple as just looking for “field1.field2.fieldOfInterest” and “field1.fieldOfInterest”, for example. Is there a simple way (e.g., maybe involving flattening) to query a document to get the values of all fields called “fieldOfInterest”?

Please provide simple documents. To experiment for a solution we have to create document into our test database that matches your description. This is time consuming. It is much easier if we can just cut-n-paste documents rather than typing them.

1 Like

Here is an example schema I just cooked up as an illustration:

const schema = new mongoose.Schema({
  name: String,
  description: { type: String, id: String},
  inventory: [ { description: String, id: String}],
  relations: [{type: String, name: String, id: String}],
  publications: [ {journal: {name: String, id: String}, coauthors: [ {name: String, id: String}]}]
});

In this example, I would want to extract the values of all fields called “id”.
Also, in the collection of documents built from this schema, some documents might have an empty array for the publications field, while others do have entries for publications.

Example document:

{ 
  name: "John Doe",
  description: { type: "person", id: 'abc156'},
  inventory: [
    { description: 'socks', id: 'cl003'},
    { description: 'shoes', id: 'cl301'},
    { description: 'watch', id: 'cl002'},
    { description: 'suit', id: 'cl001'}
  ],
  relations: [
    { type: 'child', name: 'Jane Doe', id: 'abc111222'},
    { type: 'child', name: 'Bob Doe', id: 'abc444555'},
    { type: 'parent', name: 'Martha Smith', id: 'def333666'}
  ]  ,
  publications: [
    { journal: { name: 'Physics Review D', id: 'pub0202'},
      coauthors: [
      { name: 'C. S. Wu', id: '2589320'},
      { name: 'B. Foster', id: '2352380'},
      { name: 'S. Cooper', id: '3252381'}
    ]}
  ]
}

Let me know if this helps.

1 Like

The quotes are all wrong and I cannot insert your document.

Please format the document inside 2 lines of triple back ticks so that we can cut-n-paste it in compass or mongo shell.

const schema = new mongoose.Schema({
  name: String,
  description: { type: String, id: String },
  inventory: [{ description: String, id: String }],
  relations: [{ type: String, name: String, id: String }],
  publications: [
    {
      journal: { name: String, id: String },
      coauthors: [{ name: String, id: String }],
    },
  ],
});

const Person = mongoose.model('Person', schema);

const example = new Person({
  name: 'John Doe',
  description: { type: 'person', id: 'abc156' },
  inventory: [
    { description: 'socks', id: 'cl003' },
    { description: 'shoes', id: 'cl301' },
    { description: 'watch', id: 'cl002' },
    { description: 'suit', id: 'cl001' },
  ],
  relations: [
    { type: 'child', name: 'Jane Doe', id: 'abc111222' },
    { type: 'child', name: 'Bob Doe', id: 'abc444555' },
    { type: 'parent', name: 'Martha Smith', id: 'def333666' },
  ],
  publications: [
    {
      journal: { name: 'Physics Review D', id: 'pub0202' },
      coauthors: [
        { name: 'C. S. Wu', id: '2589320' },
        { name: 'B. Foster', id: '2352380' },
        { name: 'S. Cooper', id: '3252381' },
      ],
    },
  ],
});

Try the following:

[
  {
    '$project': {
      'description_ids': '$description.id', 
      'inventory_ids': '$inventory.id', 
      'relations_ids': '$relations.id', 
      'journal_ids': '$publications.journal.id', 
      'coauthors_ids': '$publications.coauthors.id'
    }
  }, {
    // This unwind is need because coauthors is an array of array
    '$unwind': {
      'path': '$coauthors_ids'
    }
  }, {
    '$addFields': {
      'all_ids': {
        '$setUnion': [
          [
            // Make an array from a single value for $setUnion
            // that require arrays as arguments. 
            '$description_ids'
          ],
          '$inventory_ids',
          '$relations_ids',
          '$journal_ids',
          '$coauthors_ids'
        ]
      }
    }
  }, {
    '$project': {
      '_id': 0, 
      'all_ids': 1
    }
  }, {
    '$unwind': {
      'path': '$all_ids'
    }
  }
  // You might have duplicates that you could eliminate with $sort and $group
]