Hello dear Community,
I am using MongoDB version 4.2.8 and connect to it via the C driver (1.17, the latest) from a remote ETL tool (Ab Initio).
In the tool, there is a special component that communicates with the DB. Basically, it executes
update / insert commands. The task is to load ~300000 documents in update/upsert mode. Below is the criterion used to decide between update and insert:
{ttnr: "ttnr", name: "name", 'location._id':ObjectId("location._id") };
The update action is then:
{ "$set": {ttnr: "ttnr", name: "name", 'location._id': ObjectId("location._id") } }
batch = 10000: number of records to submit per batch (this does not really affect the update/upsert mode; it updates about 10 records per second, which is quite slow for 300000 items)
The data looks as follows:
{"_id":{"$oid":"60c2fbd3627f8b06d03c98b5"},"name":"EL-; AR14-C","type":"Product","ttnr":"01215555501","location":{"_id":"60c23939898cb1d1168e4551"},"parents":,"versions":,"check_sum":{"$numberLong":"0"}}
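For illustration, the upsert described above could be sketched roughly as follows. This is only a sketch in Python, not what the Ab Initio component actually sends: it builds plain dictionaries in the shape of MongoDB's `update` database command, one command per batch. The function name `build_upsert_commands` and the collection name `products` are hypothetical, and the `location._id` value is passed through as it appears in the record, since no BSON library is imported to wrap it in an ObjectId.

```python
import json


def build_upsert_commands(records, collection, batch_size=10000):
    """Yield `update` command documents, each holding up to batch_size upserts.

    Hypothetical helper: it only builds the command documents and does not
    talk to a server.
    """
    for start in range(0, len(records), batch_size):
        statements = []
        for rec in records[start:start + batch_size]:
            # Match criterion from the post: ttnr, name and location._id.
            # Assumption: location._id is used as found in the record; the
            # original filter wraps it in ObjectId(...).
            key = {
                "ttnr": rec["ttnr"],
                "name": rec["name"],
                "location._id": rec["location"]["_id"],
            }
            # The action sets the same three fields, as in the post.
            statements.append(
                {"q": key, "u": {"$set": key}, "upsert": True, "multi": False}
            )
        yield {"update": collection, "updates": statements}


# One record shaped like the sample document above (unused fields omitted).
sample = {
    "name": "EL-; AR14-C",
    "type": "Product",
    "ttnr": "01215555501",
    "location": {"_id": "60c23939898cb1d1168e4551"},
}

commands = list(build_upsert_commands([sample], "products"))
print(json.dumps(commands[0], indent=2))
```

With 300000 records and a batch size of 10000 this would produce 30 command documents, each carrying 10000 individual upsert statements that the server still has to match one by one against the criterion.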
The problem is that it takes hours to load these 300000 unique records; in "insert" mode, however, it takes seconds. The insert mode, though, does not check for duplicates with respect to the "ttnr" and "location" values. That is why I use update in upsert mode. Does anybody know how to improve performance? Can any settings be adjusted on the DB side to increase processing speed (it is clear that the DB has to check the criterion for every incoming record)? What could be improved in the query itself?
Thank you in advance
Best regards
Igor