Timeseries collection gives "E11000 duplicate key error index" on _id field

We are currently using MongoDB 6.0.3. During large data ingestions using Spring Data we occasionally get this error during batch inserts. It does not happen with every batch, and retrying helps in most cases.

I am just curious why this happens, since a time series collection should not have a unique index, and you definitely cannot create one on a collection that was created as time series even if you try.
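For what it's worth, here is a minimal Java driver sketch of what I mean (the collection and field names are made up for illustration): the collection is created as time series, and the attempt to add a unique index on it is expected to be rejected by the server, since time series collections do not support unique indexes.

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoDatabase;
import com.mongodb.client.model.CreateCollectionOptions;
import com.mongodb.client.model.IndexOptions;
import com.mongodb.client.model.Indexes;
import com.mongodb.client.model.TimeSeriesOptions;

public class TimeSeriesIndexCheck {
    public static void main(String[] args) {
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoDatabase db = client.getDatabase("test");

            // create a time series collection keyed on a "timestamp" field
            db.createCollection("measurements",
                    new CreateCollectionOptions()
                            .timeSeriesOptions(new TimeSeriesOptions("timestamp")));

            // trying to add a unique index is expected to fail here:
            // time series collections do not support unique indexes
            db.getCollection("measurements").createIndex(
                    Indexes.ascending("metadata.sensorId"),
                    new IndexOptions().unique(true));
        }
    }
}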

Any help or insight would be appreciated.


Hey @Janari_Parts,

Welcome to the MongoDB Community!

  • Could you kindly share the exact error message you are encountering?
  • Additionally, how frequently does this error occur? Could you quantify it?
  • To better assist you, could you also specify the size of the data you are ingesting into the time series collection and how often you perform these ingestions?

Regards,
Kushagra

We have a standard collection (non-time series) that we perform single inserts into, and we occasionally receive an “E11000 duplicate key error” exception on the _id index too. My issue may not be the same as yours, as it is not a time series collection and not Java or Spring Data, but it may be similar given that they’re both duplicates on the _id key. If this should be a separate topic I can do that, please advise.

We insert roughly 4 million records per day and this error happens 0 to 3 times per week. We are not supplying the _id in the inserted document; we rely on the MongoDB driver to do that for us. We run two processes, one on each of two servers, performing these inserts simultaneously. I expect that these two processes would not interfere with each other because of the random per-process value component in the _id.
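For reference, a minimal sketch using the BSON ObjectId class shows how the duplicate _id from the error below splits into its parts, following the current ObjectId layout of a 4-byte timestamp, a 5-byte per-process random value, and a 3-byte counter:

import org.bson.types.ObjectId;

public class ObjectIdParts {
    public static void main(String[] args) {
        // the duplicate _id reported in the error below
        ObjectId id = new ObjectId("64f88e78d0d43d0a2662c043");
        String hex = id.toHexString();

        // ObjectId layout: 4-byte timestamp, 5-byte per-process random value,
        // 3-byte incrementing counter (seeded with a random value at startup)
        String timestamp = hex.substring(0, 8);   // 64f88e78
        String random    = hex.substring(8, 18);  // d0d43d0a26
        String counter   = hex.substring(18);     // 62c043

        System.out.println("creation time: " + id.getDate());
        System.out.println("timestamp    : " + timestamp);
        System.out.println("random value : " + random);
        System.out.println("counter      : " + counter);
    }
}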

An example is provided below with the exception error and a list of all document _ids sharing the same timestamp and per-process values. The _id counter is incrementing as expected, and the logged duplicate does exist in the incrementing series.

I’m hoping that someone has some insight or guidance on this problem that will help me solve it.

PHP Fatal error: Uncaught MongoDB\Driver\Exception\BulkWriteException: E11000 duplicate key error collection: cet.logs index: _id_ dup key: { _id: ObjectId('64f88e78d0d43d0a2662c043') }

db.getCollection("logs").find({_id: {"$gt": ObjectId('64f88e78d0d43d0a25ffffff'), "$lt": ObjectId('64f88e78d0d43d0a27000000')}}, {"_id":1})
{
    "_id" : ObjectId("64f88e78d0d43d0a2662c042")
}
{
    "_id" : ObjectId("64f88e78d0d43d0a2662c043")
}
{
    "_id" : ObjectId("64f88e78d0d43d0a2662c044")
}
{
    "_id" : ObjectId("64f88e78d0d43d0a2662c045")
}
{
    "_id" : ObjectId("64f88e78d0d43d0a2662c046")
}
{
    "_id" : ObjectId("64f88e78d0d43d0a2662c047")
}
{
    "_id" : ObjectId("64f88e78d0d43d0a2662c048")
}
{
    "_id" : ObjectId("64f88e78d0d43d0a2662c049")
}

Ubuntu 20.04.4 LTS
php-mongodb 1.6.1
mongo-php-library 1.12.0
alcaeus/mongo-php-adapter 1.2.2
php7.4-cli 7.4.3
mongodb-org-server 5.0.20

The error looks like this:

Write errors: [BulkWriteError{index=6887, code=11000, message='E11000 duplicate key error collection: data.system.buckets.data dup key: { _id: ObjectId('64ebe3806fb730fe1d39cad3') }', details={}}]. 
	at org.springframework.data.mongodb.core.MongoExceptionTranslator.translateExceptionIfPossible(MongoExceptionTranslator.java:107)
	at org.springframework.data.mongodb.core.MongoTemplate.potentiallyConvertRuntimeException(MongoTemplate.java:2789)
	at org.springframework.data.mongodb.core.MongoTemplate.execute(MongoTemplate.java:555)
	at org.springframework.data.mongodb.core.MongoTemplate.insertDocumentList(MongoTemplate.java:1456)
	at org.springframework.data.mongodb.core.MongoTemplate.doInsertBatch(MongoTemplate.java:1316)
	at org.springframework.data.mongodb.core.MongoTemplate.doInsertAll(MongoTemplate.java:1285)
	at org.springframework.data.mongodb.core.MongoTemplate.insertAll(MongoTemplate.java:1258)
	at org.springframework.data.mongodb.repository.support.SimpleMongoRepository.insert(SimpleMongoRepository.java:240)
	at jdk.internal.reflect.GeneratedMethodAccessor144.invoke(Unknown Source)

We ingest a large file (9 GB), so the error tends to happen a few thousand times, but the objects inserted are small. It works fine 95% of the time, so most of the data is ingested without problems, and if we add a retry on this failure it may take 5-20 attempts before the batch goes through.

We split everything into chunks; the chunk size did not make a difference, both 100 and 10k get errors. We usually have about 5,000 insertions per second. We let MongoDB generate the _id itself, so we don’t actually add an _id field.
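To make the chunk-plus-retry approach concrete, here is a minimal sketch of the pattern (class names and sizes are illustrative, and it assumes the bulk E11000 reaches us as Spring's DuplicateKeyException; depending on driver and Spring Data versions it may instead surface as a broader DataIntegrityViolationException):

import java.util.List;

import org.springframework.dao.DuplicateKeyException;
import org.springframework.data.mongodb.core.MongoOperations;

public class ChunkedInserter {

    private static final int CHUNK_SIZE = 10_000;  // 100 vs 10k made no difference for us
    private static final int MAX_RETRIES = 20;     // retries usually succeed within 5-20 attempts

    private final MongoOperations mongo;

    public ChunkedInserter(MongoOperations mongo) {
        this.mongo = mongo;
    }

    public <T> void insertAllInChunks(List<T> documents) {
        for (int from = 0; from < documents.size(); from += CHUNK_SIZE) {
            List<T> chunk = documents.subList(from, Math.min(from + CHUNK_SIZE, documents.size()));
            insertWithRetry(chunk);
        }
    }

    private <T> void insertWithRetry(List<T> chunk) {
        for (int attempt = 1; ; attempt++) {
            try {
                mongo.insertAll(chunk);  // _id is left for the driver to generate
                return;
            } catch (DuplicateKeyException e) {
                // E11000 on the time series system.buckets collection;
                // retrying the same chunk normally goes through after a few attempts
                if (attempt >= MAX_RETRIES) {
                    throw e;
                }
            }
        }
    }
}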
