How to bulk insert documents and skip duplicates

Hey there,

How do I use insert_many() to insert a batch of documents when some of them may have a duplicate id? In this case, the collection has a unique index that enforces uniqueness on id.

For example, using pymongo, with the MongoDB connection assigned to c:

from pymongo.write_concern import WriteConcern

c['test']['test'].create_index('id', unique=True, background=True)

data = [{"id": "234"}, {"id": "1234"}, {"id": "234"}, {"id": "45678"}]

c['test']['test'].with_options(write_concern=WriteConcern(w=0)).insert_many(data)

runs “successfully” but fails to insert the last document.

You have to use an unordered insert, i.e. pass ordered=False to insert_many(), if you want to skip duplicates.
