Hi there,
I’m trying to delete elements in an array if they appear multiple times.
In the example, I have underlined the 2 elements I want to delete.
I tried with UpdateMany and $pull, but I can’t identify the duplicates.
Thanks
Hi there,
I’m trying to delete elements in an array if they appear multiple times.
In the example, I have underlined the 2 elements I want to delete.
I tried with UpdateMany and $pull, but I can’t identify the duplicates.
Thanks
So your problem is doing searches for documents with duplicates in markets.salePeriods.panel.storeSegmentations
… focus on that and then afterwards think about the $pull
Morning Jack,
With $unwind and $group stages, I can identify the duplicates.
Is this the best way to do it?
[
{
$unwind: "$markets"
},
{
$unwind: "$markets.salePeriods"
},
{
$unwind:
"$markets.salePeriods.panel.storeSegmentations"
},
{
$group: {
_id: {
docId: "$_id",
marketId: "$markets.marketId",
salePeriodId:
"$markets.salePeriods.salePeriodId",
storeSegmentationId:
"$markets.salePeriods.panel.storeSegmentations.segmentationType"
},
count: {
$sum: 1
}
}
},
{
$match: {
count: {
$gt: 1
}
}
},
{
$group: {
_id: {
docId: "$_id.docId",
marketId: "$_id.marketId",
salePeriodId: "$_id.salePeriodId"
},
duplicateStoreSegmentations: {
$push: "$_id.storeSegmentationId"
}
}
},
{
$group: {
_id: {
docId: "$_id.docId",
marketId: "$_id.marketId"
},
duplicateSalePeriods: {
$push: {
salePeriodId: "$_id.salePeriodId",
duplicateStoreSegmentations:
"$duplicateStoreSegmentations"
}
}
}
},
{
$group: {
_id: "$_id.docId",
duplicateMarkets: {
$push: {
marketId: "$_id.marketId",
duplicateSalePeriods:
"$duplicateSalePeriods"
}
}
}
}
]
I would think so. Looks neat and tidy!
I want to avoid the $unwind stages which are resource-intensive. I have tried using the updateMany statement with $arrayFilters.
But don’t succeed
Here is a document example
{
_id: ObjectId('66ba22ed128efe0e16f91fab'),
markets: [
{
marketId: '1',
salePeriods: [
{
salePeriodId: 'a',
panel: {
storeSegmentations: [
{
segmentationType: 'A',
delete: false
},
{
segmentationType: 'B',
delete: false
}
]
}
},
{
salePeriodId: 'b',
panel: {
storeSegmentations: [
{
segmentationType: 'A',
delete: false
},
{
segmentationType: 'A',
delete: false
},
{
segmentationType: 'B',
delete: false
}
]
}
}
]
},
{
marketId: '2',
salePeriods: [
{
salePeriodId: 'a',
panel: {
storeSegmentations: [
{
segmentationType: 'A',
delete: false
},
{
segmentationType: 'B',
delete: false
},
{
segmentationType: 'B',
delete: false
}
]
}
}
]
}
]
}
db.test.updateMany(
{},
{
$set: {
"markets.$[].salePeriods.$[p].panel.storeSegmentations.$[s].delete": true
}
},
{
arrayFilters: [
{
"p.panel.storeSegmentations": {
$ne: [
{
$size: {
$filter: {
input: "$$p.panel.storeSegmentations",
as: "storeSegmentation",
cond: {
$eq: ["$$storeSegmentation.segmentationType", "$$s.segmentationType"]
}
}
}
},
1
]
}
},
{
"s.segmentationType": { $exists: true }
}
]
}
);
Hi Emmanuel,
You can’t do this with something as simple as a push nor can you efficient query for duplicates so it’s not worth trying that. You need to use an Expressive update to define what markets looks like relative to it’s current value.
Also - as an aisde you shoudl never $unwind then $group by _id, this is a pathological worst case for performance and a common huge anti-pattern.
I’ve made two assumptions here
(1) You want this to work for not just the first market or first sale period but all of them
(2) This may be wrong but I assum we can just convert storeSegmentations array to a set.
If (2) isn’t true then you would need to figure out logic for which object to keep if you have a duplicate segmentationType in an array and apply that using $reduce.
Anyway - here is the example code, I’ve broken it down to define each clause seperately - if you really want ot look at the whole thing then print out the value of expressiveUpdate.
db.a.drop()
var record = {
_id : "205734",
productId: "205734",
markets : [
{
salePeriods : [
{
panel: {
storeSegmentations :[
{ segmentationType: "C.WEB"},
{ segmentationType: "SUPPLY_SIZES"},
{ segmentationType: "SUPPLY_SIZES"},
]
}
},
{
panel: {
storeSegmentations :[
{ segmentationType: "C.WEB"},
{ segmentationType: "SUPPLY_SIZES"},
{ segmentationType: "SUPPLY_SIZES"},
]
}
}
]
},
{
salePeriods : [
{
panel: {
storeSegmentations :[
{ segmentationType: "C.WEB"},
{ segmentationType: "SUPPLY_SIZES"},
{ segmentationType: "SUPPLY_SIZES"},
]
}
}
]
}
]
}
db.a.insertOne(record);
var setOfSegmentationTypes = { $setUnion : [ "$$salePeriod.panel.storeSegmentations" ]}
var newSalesPeriod = {panel : { storeSegmentations : setOfSegmentationTypes }}
var allSalePeriods = { $map : { input : "$$market.salePeriods", as: "salePeriod", in : newSalesPeriod }}
var allMarkets = { $map : { input :"$markets", as: "market" , in: allSalePeriods }}
var expressiveUpdate = [ { $set : { markets : allMarkets }} ]
db.a.updateMany({},expressiveUpdate)
db.a.find();
Hi John,
My use case is more complex.
The duplicates are not always the same and I want to remove all items that appear more than once.
With this document :
{
_id : "205734",
productId: "205734",
markets : [
{
salePeriods : [
{
panel: {
storeSegmentations :[
{ segmentationType: "C.WEB"},
{ segmentationType: "SUPPLY_SIZES"},
{ segmentationType: "SUPPLY_SIZES"},
]
}
},
{
panel: {
storeSegmentations :[
{ segmentationType: "C.WEB"},
{ segmentationType: "SUPPLY_SIZES"},
]
}
}
]
},
{
salePeriods : [
{
panel: {
storeSegmentations :[
{ segmentationType: "C.WEB"},
{ segmentationType: "C.WEB"},
{ segmentationType: "SUPPLY_SIZES"},
]
}
}
]
}
]
}
Result should be :
{
_id : "205734",
productId: "205734",
markets : [
{
salePeriods : [
{
panel: {
storeSegmentations :[
{ segmentationType: "C.WEB"}
]
}
},
{
panel: {
storeSegmentations :[
{ segmentationType: "C.WEB"},
{ segmentationType: "SUPPLY_SIZES"}
]
}
}
]
},
{
salePeriods : [
{
panel: {
storeSegmentations :[
{ segmentationType: "SUPPLY_SIZES"}
]
}
}
]
}
]
}```
And are there other fields in there as well as Segmentation type - so you want ot remove any Object in the array if the segmentationType field is not unique , is that correct?
The code to iterate of the arrays of arrays stays the same, you jus t need to write an expression that instead of a simple setUnion returns an array with any duplicate elements removed. That shouldn’t be too difficult - Not sure if I’ll have an opportunity to try it though as traveling for a while.
Hi Jack,
I want to remove any Object in the array if the segmentationType field is not unique.
Yes, that is corret.
I want to remove any Object in the array if the segmentationType field is not unique.
I use $map aggregation operator to iterate arrays of arrays of arrays (3 levels).
It’s a bit complicated query but it’s fine