Performance in update upsert mode from ETL tool

Hello dear Community,

I am using MongoDB version 4.2.8 and connect to it via the latest C driver (1.17) from a remote ETL tool (Ab Initio).

The tool has a special component for communicating with the DB. Basically, it executes
update / insert commands. The task is to load ~300000 documents in update/upsert mode. Below are the filter criteria used to decide between update and insert, followed by the update document:

{ttnr: "ttnr", name: "name",  'location._id':ObjectId("location._id") };


{ "$set": {ttnr: "ttnr", name: "name",  'location._id': ObjectId("location._id") } }

batch = 10000: number of records to submit per batch. This does not really affect the update/upsert mode, which updates about 10 records per second; that is quite slow for 300000 items.
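
For illustration, the filter and `$set` document above, submitted as one batch, correspond roughly to MongoDB's `update` database command with one upsert entry per record. The following is only a minimal sketch in Python: the collection name `products`, the helper `build_upsert_command` and the record key `location_id` are hypothetical, and how the Ab Initio component actually submits its batches is not known here.

```python
def build_upsert_command(collection, records, ordered=False):
    """Build one `update` command document holding an upsert per record.

    `records` is a list of dicts with the keys "ttnr", "name" and
    "location_id" (hypothetical input shape). The location value is passed
    through unchanged; the filter above wraps it in ObjectId(...), so the
    caller should supply whatever type is actually stored.
    """
    updates = []
    for rec in records:
        key = {
            "ttnr": rec["ttnr"],
            "name": rec["name"],
            "location._id": rec["location_id"],
        }
        updates.append({
            "q": key,                  # filter that decides update vs. insert
            "u": {"$set": dict(key)},  # same fields as in the $set document above
            "upsert": True,            # insert when no document matches
            "multi": False,
        })
    # ordered=False allows the server to continue past individual failures
    return {"update": collection, "updates": updates, "ordered": ordered}


sample = [{
    "ttnr": "01215555501",
    "name": "EL-; AR14-C",
    "location_id": "60c23939898cb1d1168e4551",
}]
command = build_upsert_command("products", sample)
```

With a driver such as PyMongo, a document of this shape could be sent with `db.command(command)`; the sketch only builds the document and does not contact a server.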

data looks as follows:

{"_id":{"$oid":"60c2fbd3627f8b06d03c98b5"},"name":"EL-; AR14-C","type":"Product","ttnr":"01215555501","location":{"_id":"60c23939898cb1d1168e4551"},"parents":,"versions":,"check_sum":{"$numberLong":"0"}}

The problem is that it takes hours to load these 300000 unique records, whereas in "insert" mode it takes seconds. The insert mode, though, does not check for duplicates with respect to the "ttnr" and "location" values. That is why I use update in upsert mode. Does anybody know how to improve performance? Can any settings be adjusted on the DB side to increase processing speed (it is clear that the DB has to check the criteria for every incoming record)? What could be improved in the query itself?
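
The per-record criteria check mentioned above is the kind of lookup that a compound index on the filter fields is meant to serve. The following is only a minimal sketch in Python of a `createIndexes` command document for those fields: the collection name `products`, the index name and the helper `build_create_index_command` are hypothetical, and whether such an index already exists or would help in this case is not established here.

```python
def build_create_index_command(collection, fields, name):
    """Build a `createIndexes` command for an ascending compound index."""
    return {
        "createIndexes": collection,
        "indexes": [
            # key order follows the order of `fields`; 1 means ascending
            {"key": {field: 1 for field in fields}, "name": name},
        ],
    }


index_command = build_create_index_command(
    "products",                       # hypothetical collection name
    ["ttnr", "name", "location._id"], # fields used in the upsert filter
    "ttnr_name_location",             # hypothetical index name
)
```

The sketch only builds the command document; it would still have to be run against the database, for example through a driver's generic command call.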

Thank you in advance

Best regards

Hi @igor_insights

Thanks for raising this interesting question. I’m not sure how it pertains to M201, and I’m unfamiliar with the tool you mention. I’d suggest reposting it in the Working with Data category, where you may get additional help from people who are more familiar with the ETL tool you are using.

Is there a specific lesson or exercise in M201 that you have an issue with or question about that I can help you with?

Kindest regards,