I’m upgrading from pymongo 3.12 to pymongo 4.0.1, and I’m running into some weird behaviour with the UUID representation. The database connection is configured with CodecOptions(uuid_representation=4).
There is a difference between return values when using normal insertion and when using bulk insertion:
Note that PyMongo still sends the expected UUID format to the server; only the client-side upserted_ids field comes back with an unexpected type (a Binary rather than a uuid.UUID). A temporary workaround for this bug is to convert the Binary into the expected type:
import uuid
from typing import Any, Dict, Mapping

from bson.binary import Binary, UuidRepresentation
from pymongo import MongoClient
from pymongo.collection import Collection
from pymongo.operations import ReplaceOne


def convert_ids(upserted_ids: Mapping[int, Any], coll: Collection) -> Dict[int, Any]:
    """Temporary workaround for https://jira.mongodb.org/browse/PYTHON-3075.

    Use like this::

        result = collection.bulk_write([...])
        upserted_ids = convert_ids(result.upserted_ids, collection)
    """
    res = {}
    rep = coll.codec_options.uuid_representation
    for idx, _id in upserted_ids.items():
        if rep == UuidRepresentation.UNSPECIFIED:
            # With UNSPECIFIED, UUIDs are expected to come back as Binary.
            if isinstance(_id, uuid.UUID):
                _id = Binary.from_uuid(_id)
        elif isinstance(_id, Binary):
            # Otherwise decode the Binary using the collection's representation.
            _id = _id.as_uuid(rep)
        res[idx] = _id
    return res


client = MongoClient(uuidRepresentation='standard')
collection = client.test.test
doc = {'_id': uuid.uuid4()}
result = collection.bulk_write([ReplaceOne(doc, doc, upsert=True)])
for _id in convert_ids(result.upserted_ids, collection).values():
    assert isinstance(_id, uuid.UUID)
Please follow the jira ticket for updates.
One question though: are you using pymongo directly or through a wrapper library? I ask because BulkWriteResult doesn’t have an inserted_ids property (it does have an upserted_ids property).
Thank you for the workaround and the bug report! You’re right, it was upserted_ids; I apparently messed something up while copy-pasting and formatting the snippet to make it look nice…