Is there a codec or annotation way to map a python class field to a different field in the mongo document?

I’m trying to take a Python class (a pydantic model) and save it almost as is. The class does not have an _id field, so I want to use either a field annotation or some codec / ODM mechanism to signal that a certain field is to be written as _id into mongo, then read back as my_field when read from mongo (or when converting from bson to dict, or something like that). So with this in mind:

  1. Does pymongo have a field annotation that performs such manipulation?
  2. Does pymongo have a codec notion for classes to perform such manipulation?
  3. Is there a field-mapper that can be overridden to say "if you see a field named x, store it as y"?

Looks like SONManipulator is going out of style: it is deprecated.

bson.codec_options seems to be intended for scalar types, whereas my class is not a scalar.

A pair of wrappers can certainly do the trick:


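def documentize(d: dict, nominal: str) -> dict:
    # Before writing: move the nominal field to '_id' (mutates d in place).
    d['_id'] = d.pop(nominal)
    return d

def pythonize(d: dict, nominal: str) -> dict:
    # After reading: move '_id' back to the nominal field (mutates d in place).
    d[nominal] = d.pop('_id')
    return d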

But the syntax pollution around every entry / exit into a CRUD command seems suboptimal.
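
One way to keep the wrap / unwrap out of every call site is to put it in a thin wrapper around the collection. This is only a minimal sketch and not a pymongo feature: RenamingCollection and InMemoryCollection are hypothetical names, the in-memory class is just a stand-in so the snippet runs without a server, and only insert_one / find_one are covered. Unlike the wrappers above, these versions of documentize / pythonize copy the dict instead of mutating it.

```python
from typing import Any, Dict, List, Optional

Doc = Dict[str, Any]


def documentize(d: Doc, nominal: str) -> Doc:
    """Return a copy of d with the nominal field stored under '_id'."""
    out = dict(d)
    out["_id"] = out.pop(nominal)
    return out


def pythonize(d: Doc, nominal: str) -> Doc:
    """Return a copy of d with '_id' moved back to the nominal field."""
    out = dict(d)
    out[nominal] = out.pop("_id")
    return out


class RenamingCollection:
    """Hypothetical wrapper: renames one field on the way in and out.

    It delegates to any object that exposes insert_one and find_one.
    """

    def __init__(self, collection: Any, nominal: str) -> None:
        self._collection = collection
        self._nominal = nominal

    def insert_one(self, doc: Doc) -> Any:
        return self._collection.insert_one(documentize(doc, self._nominal))

    def find_one(self, query: Optional[Doc] = None) -> Optional[Doc]:
        # Translate the nominal field in the query as well, if present.
        q = dict(query or {})
        if self._nominal in q:
            q["_id"] = q.pop(self._nominal)
        found = self._collection.find_one(q)
        return None if found is None else pythonize(found, self._nominal)


class InMemoryCollection:
    """Stand-in for a real collection (assumption, for illustration only)."""

    def __init__(self) -> None:
        self.docs: List[Doc] = []

    def insert_one(self, doc: Doc) -> None:
        self.docs.append(dict(doc))

    def find_one(self, query: Optional[Doc] = None) -> Optional[Doc]:
        for doc in self.docs:
            if all(doc.get(k) == v for k, v in (query or {}).items()):
                return dict(doc)
        return None


backend = InMemoryCollection()
users = RenamingCollection(backend, nominal="email")
users.insert_one({"email": "a@example.com", "name": "Ada"})

stored = backend.docs[0]                             # what the store sees
loaded = users.find_one({"email": "a@example.com"})  # what Python sees
```

The rename then lives in one place, at the cost of having to wrap each CRUD method you actually use.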

I have gotten around this using some pydantic strong-arming, but pydantic is neither an ODM nor a storage translation layer, so I’m looking for way to do it “right” using the driver.

Hi @Nuri_Halperin, there is no pymongo feature to accomplish this kind of document-level translation. The solutions you’ve mentioned seem reasonable to me: wrap/unwrap around CRUD methods, or use pydantic to rewrite the field. You could also consider renaming the field to be _id. Feel free to open a feature request in our issue tracker: https://jira.mongodb.org/projects/PYTHON

Thanks!

The whole issue arises from the fact that a pydantic model field _id will be silently ignored. Calling my_model.dict() will not return it in the dict, and any repr or listing of fields will skip it.

As for a feature request: since SONManipulator is going out of fashion, it seems prudent to know why before requesting its (partial? similar?) re-introduction. The most closely related ticket seems to be PYTHON-2733, "Allow decoding/deserializing BSON container types to custom Python types".

I built some tooling for this, but have not yet published the package. It is here: pydanticmongo on GitHub

There were a number of reasons that led us to deprecate SONManipulator. If I recall correctly, one was that it only worked at the Database level, when ideally such an ODM-like serialization layer should work using only the bson module (via CodecOptions). Another was performance: problems inherent to its design made it slow with large/nested documents. When designing a new feature we would keep these design issues in mind.

PYTHON-2733 is quite related, but I believe it would be good to file this (field renaming) as a separate feature. We would likely group these features together into an ODM-lite project.

Thank you for the helpful insights!
