Query to find duplicate users (ip)

Hi @amazing,

Something like the following might be suitable for your needs:

db.foo.insertMany([
  { "ip" : "127.0.0.1", "user": 1 },
  { "ip" : "127.1.1.1", "user": 2 },
  { "ip" : "127.2.1.1", "user": 3 },
  { "ip" : "127.0.0.1", "user": 4 },
  { "ip" : "127.1.1.1", "user": 5 },
]);
db.foo.createIndex({ ip: 1, user: 1 });
db.foo.aggregate([
  { $sort: { ip: 1 } },
  { $group: {
    _id: "$ip",
    used: { $sum: 1 },
    users: { $push: { user: "$user" } }
  }},
  { $match: { used: { $gt: 1 } } },  
])

Note that if you’re only using the ip and user fields, having an index on these (as shown above) should greatly improve the performance of this operation.

I’ve written up a longer form of this response at Efficiently Identifying Duplicates using MongoDB | ALEX BEVILACQUA as this question comes up pretty frequently :wink:

5 Likes