Query result on one collection informing 2nd query on different collection

Apologies, lots of similar questions have been answered, but I still can’t resolve my issue from those. I have two collections shapes and geometry. I have successfully queried shapes to remove all red shapes

myquery = {"Colour": "Red"}
x = collection.delete_many(myquery)    

But I want to use the result of this query on the geometry collection that shares the common field “Name”, so that I delete any document from geometry that has a Name that was found in the original query (red shapes)

example from shapes

_id: "123"
Colour: "red"
Shape: "square"
Name: "xyz123"
Line: "bold"    

example from geometry

 _id: "456"
Height: "6"
Width: "6"
Name: "xyz123"
Area: "36"    

Many thanks

Hi @Tim_Shurlock, this does not have an easy solution as I don’t think that delete_many supports aggregations. There are 2 approaches coming to my mind, which you can use -

  1. First, query on Shapes collection and retrieve the list of names, you “are going to delete”.

    to_delete_shapes = coll.find({"Colour": "red"}, {"Name":1})
    

    Then you can run 1 query (per collection) to delete the documents from both the collections using the names list and $in operator. -

    .delete_many({"Name": {"$in": array_of_names}})
    

    Do remember that you are fetching the list of names in memory, and if they are too many documents matching the Color query, you might wanna paginate or bucket. (~1000 documents)

  2. The other, more performant way is to add extra information in your Geometry collection’s document. You can easily add another key “Colour” to the geometry document also, so that you can run the delete_many query directly on both the collections.

Also, just want to ask, if you have a 1-1 relationship with Shape and Geometry documents, why not embedded one into the other? That way, your queries will become much less complicated.

Thanks…!

1 Like