I have two collections (Rec22 and Rec23), each containing 120K documents.
Rec22
{ "_id" : ObjectId("6284a6b84b171c659ec86561"), "22" : 423 }
{ "_id" : ObjectId("6284a6b84b171c659ec86562"), "22" : 506 }
...
Rec23
{ "_id" : ObjectId("6284a6b84b171c659ec86561"), "23" : "AS" }
{ "_id" : ObjectId("6284a6b84b171c659ec86562"), "23" : "DF" }
...
I am running an aggregation to concatenate the "22" and "23" fields and store the result in a new collection. The aggregation pipeline is the following:
pipeline = [
    {'$lookup': {
        'from': 'Rec22',
        'localField': '_id',
        'foreignField': '_id',
        'as': 'fromExtra1'}},
    {'$lookup': {
        'from': 'Rec23',
        'localField': '_id',
        'foreignField': '_id',
        'as': 'fromExtra2'}},
    {'$replaceRoot': {'newRoot': {'$mergeObjects': [
        {'$arrayElemAt': ['$fromExtra1', 0]},
        {'$arrayElemAt': ['$fromExtra2', 0]},
        '$$ROOT']}}},
    {'$project': {'fromExtra2': 0, 'fromExtra1': 0}},
    {'$addFields': {
        '22_str': {'$toString': '$22'},
        '23_str': {'$toString': '$23'}}},
    {'$project': {
        '24': {'$concat': [
            {'$ifNull': ['$22_str', '']},
            '-',
            {'$ifNull': ['$23_str', '']}]}}},
    {'$out': 'Rec24'}
]
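For reference, one variant I am considering (sketch only, not benchmarked): if the aggregation can be started on `Rec22` itself rather than a separate collection, the first `$lookup` disappears and only `Rec23` needs to be joined. The `db` handle and the choice to drop unmatched documents via `$unwind` are assumptions:

```python
# Sketch: aggregate on Rec22 directly, so only one $lookup (into Rec23)
# is needed. The "db" pymongo handle is assumed to exist.
pipeline_single_lookup = [
    # Join the matching Rec23 document by _id (always indexed).
    {'$lookup': {
        'from': 'Rec23',
        'localField': '_id',
        'foreignField': '_id',
        'as': 'rec23'}},
    # The join result is a single-element array; unwind it.
    # Note: this drops Rec22 documents with no Rec23 counterpart.
    {'$unwind': '$rec23'},
    # Build the concatenated field in one step, without the
    # intermediate $replaceRoot / $addFields stages.
    {'$project': {
        '24': {'$concat': [
            {'$ifNull': [{'$toString': '$22'}, '']},
            '-',
            {'$ifNull': ['$rec23.23', '']}]}}},
    {'$out': 'Rec24'},
]

# db.Rec22.aggregate(pipeline_single_lookup)  # uncomment with a live handle
```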
The query execution time is about 10 s. However, when the "22" and "23" fields are in the same collection, concatenating them and storing the result in a new collection takes less than 0.5 s.
Is there any way to optimize the aggregation pipeline to get better performance?