Reading "normal" JSON with pymongo from a collection

Hi all,

Sorry for my ignorance, I’m quite new to MongoDB and taking my first steps.
My task is to read all documents from a MongoDB collection as plain JSON, using Python.

Assuming my connection works (it does), if I simply read a document as-is and try to print it as JSON, Python throws this exception:

my_cursor = my_collection.find_one()
print (json.dumps(my_cursor))

TypeError: Object of type datetime is not JSON serializable

What I have found is that I need to use the bson module to dump the data:

l_cursor = json.loads(bson.json_util.dumps(my_cursor))
print (json.dumps(l_cursor, indent=2))

It works, but the representation is a little bit unusual:

"date": { "$date": "2019-01-20T05:00:00Z"}

I have found different workarounds, suggesting custom encoders, lambda functions and so on.

My question: how do I get a representation "key": "value" instead of "key": {"$value type": value} in a more generic way? Without writing an encoder for all possible BSON data types, hopefully.

Thank you and apologies for my ignorance once again.

Best,
Michael

From an example I found, I gather that you might simply have to do this:

my_document = my_collection.find_one()
print( my_document )

I renamed my_cursor from your code as my_document because find_one() returns a single document rather than a cursor. The find() method would return a cursor.

You’re right, but then the BSON types are returned like this: 'date': datetime.datetime(2018, 12, 27, 5, 0)
That’s why json.dumps() can’t serialise it.
And it’s not really human-readable.

I was using find_one() to simplify the example.

In this SO thread they suggest passing default=str to json.dumps() as an option.

I found that too.
It looks better, of course.
But the date should still be ISO 8601 compliant, an integer should remain an integer, and a float should be a float, not a string.
I wrote a basic encoder for datetime and Decimal128, but there are 20 BSON data types, give or take.
I mean, there’s the mongoexport tool, which seems to do the job correctly, so I know it’s possible.
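
For reference, a generic fallback of the kind discussed above could be sketched roughly as follows. This is a minimal sketch, not the pymongo way of doing it: the names plain_default and to_plain_json are hypothetical, and it assumes that stringifying any unrecognised type (ObjectId, Decimal128 and so on) is acceptable. json.dumps() only calls the default hook for values it cannot encode itself, so integers and floats pass through unchanged.

```python
import json
from datetime import date, datetime


def plain_default(obj):
    """Fallback hook: json.dumps() calls this only for values it cannot encode."""
    if isinstance(obj, (datetime, date)):
        # ISO 8601 text; a naive datetime gets no timezone suffix.
        return obj.isoformat()
    # Assumption: every other non-JSON type is acceptable as its str() form.
    return str(obj)


def to_plain_json(document, **kwargs):
    """Serialise a document dict to plain JSON using the fallback above."""
    return json.dumps(document, default=plain_default, **kwargs)


# Stand-in for a document returned by find_one(); no database needed here.
sample = {"date": datetime(2019, 1, 20, 5, 0), "count": 3, "ratio": 0.5}
plain = to_plain_json(sample)
```

The result is one-way only: once a date has become a string, nothing in the JSON says it was ever a date.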

This is beyond my limited knowledge of Python.

But plain JSON has no date type, no Decimal128, and no distinction between integers and floats. The official JSON types are array, object, string, number, boolean and null. BSON has a richer type system, which is why Extended JSON (EJSON) was created, and why you get:

"date": { "$date": "2019-01-20T05:00:00Z"}
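
If the type wrappers are not wanted, one possible approach is to strip them after the fact. The sketch below is only an illustration under stated assumptions: unwrap_extended_json is a hypothetical helper, not part of pymongo or bson, and it assumes that every single-key object whose key starts with "$" is a type wrapper that can be replaced by its inner value. The type information is lost, so the output can no longer be decoded back to BSON types.

```python
import json


def unwrap_extended_json(value):
    """Recursively replace {"$type": inner} wrappers with their inner value."""
    if isinstance(value, dict):
        if len(value) == 1:
            (key, inner), = value.items()
            # Assumption: a lone "$..." key marks an Extended JSON wrapper.
            if key.startswith("$"):
                return unwrap_extended_json(inner)
        return {k: unwrap_extended_json(v) for k, v in value.items()}
    if isinstance(value, list):
        return [unwrap_extended_json(v) for v in value]
    return value


# Hand-written Extended JSON text, standing in for real exporter output.
extended = (
    '{"_id": {"$oid": "5c4b1f0e8e3f1c0001a1b2c3"}, '
    '"date": {"$date": "2019-01-20T05:00:00Z"}, '
    '"tags": ["a", "b"], "n": 1}'
)
flattened = unwrap_extended_json(json.loads(extended))
```

Here flattened["date"] is the bare string "2019-01-20T05:00:00Z". Wrappers whose inner value is itself a string, such as a quoted 64-bit integer, would stay strings with this sketch.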

My suggestion is to commit to using MongoDB Extended JSON wherever possible. For example:

docs = list(coll.find())

# To encode as JSON:
docs_as_extended_json = bson.json_util.dumps(docs)

# To decode the JSON back to python/pymongo objects:
docs_decoded = bson.json_util.loads(docs_as_extended_json)
assert docs_decoded == docs

Using MongoDB Extended JSON should make your life easier because it is cross-platform and supports encoding and decoding all the BSON types. The problem with your initial attempt is that you used json.loads to decode the JSON instead of bson.json_util.loads.
