I have a question about query

Sungwon_Yoon · January 23, 2022, 10:05am

i have document.
{
‘participants’ : [ { ‘name’ : ‘rooney’, ‘team’ : ‘red’ , ‘win’ : true },
{ ‘name’ : ‘beckham’, ‘team’ : ‘red’ , ‘win’ : true },
{ ‘name’ : ‘giggs’, ‘team’ : ‘red’ , ‘win’ : true},
{ ‘name’ : ‘lampard’, ‘team’ : ‘blue’ , ‘win’ : false},
{ ‘name’ : ‘drogba’, ‘team’ : ‘blue’ , ‘win’ : false},
{ ‘name’ : ‘bosingwa’, ‘team’ : ‘blue’ , ‘win’ : false}
]

}
like this document, there are many document.

i want to find,

two player is in participants,
they are in different team,
and a player win.

for example.
i want to find, there are giggs, Lampard,
and they are in different team,
and lampard win.

(a document which i show, dosent match to this query,
because lampard lose.)

i wonder is it possible? or effiecient with query find.
or i should use aggregation.

elemMatch, expr, cond…maybe they can be used. but i think, it is very difficult.

in MongoDB, other people how can solve this difficult problem.???
in python, pandas,
i can do. because they are program language.

make variable, use for or while sentence, i can do it.
but using MongoDB query, it is very difficult to solve this problem.

santimir · January 23, 2022, 2:15pm

@Sungwon_Yoon There are many ways to do this.

Do a Mongo query, but this isn’t always that simple.
Another way to is process the result of a simple find() using javascript standard tooling, or some processing library.
Also, for a procedure that you just need once in a while, and not in a hurry, you could probably do it with Pandas as long as you feed it with json data, or csv.

I’d say normally 1. is fastest because you don’t send smaller chunks of data.

MongoDB can query the database using MQL or Aggregation (most common ones).

For complex queries, where we may also alter the structure of the documents (like add or remove fields), summarize results etc., we use aggregation, as the MongoDB query language is less powerful.

In this case, I don’t find an easy way to just run a query, so I used aggregation.

Hopefully someone will get a better take, but this should work.

See query

db.collection.aggregate([
  {
    $match: {
      "participants.name": {
        $all: [
          "schopenhauer",
          "leopardi"
        ]
      },
      
    }
  },
  {
    "$addFields": {
      key: {
        "$filter": {
          "input": "$participants",
          "as": "p",
          "cond": {
            $and: [
              {
                $eq: [
                  "$$p.name",
                  "leopardi"
                ]
              },
              {
                $eq: [
                  "$$p.win",
                  true
                ]
              }
            ]
          }
        }
      }
    }
  },
  {
    $match: {
      key: {
        $ne: []
      }
    }
  }
])

The query checks 2 players are in the team, and then creates a field with the winner, that if it’s empty will finally filter that document out.

And a live example here.

steevej · January 24, 2022, 2:49pm

In my opinion, when a use-case query becomes too complicated, this usually indicates a schema issue.

In your case I see a lot of redundancy. All participants of team:red have win:true, all of team:blue have win:false. You could have something like:

{
  win :  {  team : red ,  players : [ rooney, beckham , giggs ] }  ,
  lose : {  team : blue ,  players : [ lampard , drogba , bosingwa ] }
}

Just by a quick look it is easy to see that there is 2 teams of 3 players and red is winning. In the original schema we have to look at each element in details to figure out that information.

query = { win.players : lampard , lose.players : giggs }

steevej · January 24, 2022, 3:07pm

About redundancy.

mongosh> db.participants.find()
{ _id: ObjectId("61eebecf82a670f784fe6181"),
  participants: 
   [ { name: 'rooney', team: 'red', win: true },
     { name: 'beckham', team: 'red', win: true },
     { name: 'giggs', team: 'red', win: true },
     { name: 'lampard', team: 'blue', win: false },
     { name: 'drogba', team: 'blue', win: false },
     { name: 'bosingwa', team: 'blue', win: false } ] }

mongosh> db.win_lose.find()
{ _id: ObjectId("61eebf3282a670f784fe6182"),
  win: { team: 'red', players: [ 'rooney', 'beckham', 'giggs' ] },
  lose: { team: 'blue', players: [ 'lampard', 'drogba', 'bosingwa' ] } }

mongosh> db.participants.stats().size
317

mongosh> db.win_lose.stats().size
187

Sungwon_Yoon · January 25, 2022, 5:59am

i think you suggest me change document schema to query easily . right??
but my document is not i make, but other site api.
and i will use other query too.
so just for this query easily, i cant change.
but i can think about, not change, but can add other schema. thanks.

steevej · January 25, 2022, 2:58pm

Yes, the goal is to have the best model for your use-case, for your most used queries.

Your current model looks like a blind copy of an SQL normalized table.

I do not understand. If your document comes from another site API, why do you want to query it with MongoDB? MongoDB cannot help you on a single documented not stored in MongoDB. Does the API give you access to the stored MongoDB documents? Do you store the documents in MongoDB once you received them from the API? If you store them yourself after you received them from the API, nothing stops you from storing them with you own schema.

Sungwon_Yoon · January 25, 2022, 4:00pm

Thank you, i was newbee in IT, DB, mongodb. i think you are right. i have to start change schema. Thanks!!

Stennie_X · January 27, 2022, 8:58pm

2 posts were split to a new topic: A pipeline stage specification object must contain exactly one field