Yes, I noticed the documentation change and switched to MongoDB Spark Connector 10.2.0.
Now it works if the value of the column is in this format: '{ "$oid" : "xxxxxxxxx"}'
But I have two problems:
- I don't know how to get $ into the struct field name. I can generate the column with to_json(struct(_id as oid)), but PySpark complains if I use $ in the name.
- If I don't use $, the driver won't convert the value to ObjectId.
- In addition, the driver still reads _id (ObjectId) as a String when it loads data from MongoDB.
Finally, there is also a mismatch in the driver documentation: this post says to use "object_Or_Array_Only", but the documentation uses "objectOrArrayOnly".
In my case I was only able to get the above working with the "any" option. When I used "object_Or_Array_Only", it gave an error while converting my DataFrame, I think because of other data in the DataFrame.
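For context, a minimal sketch of the kind of write call this refers to. It assumes the option key is convertJson (taken from the connector's write configuration docs, not confirmed in this thread), that the connection URI is already set on the Spark session, and that the function name and the database and collection arguments are placeholders.

```python
def write_with_json_conversion(df, database, collection, convert_json="any"):
    """Write df to MongoDB, asking the connector to parse JSON strings.

    Assumptions: "convertJson" is the write option key, and the values
    discussed in this thread are "any" and "objectOrArrayOnly".
    The connection URI is expected to be configured on the session.
    """
    return (
        df.write.format("mongodb")
        .mode("append")
        .option("database", database)
        .option("collection", collection)
        # "any" is the only value that worked in my case
        .option("convertJson", convert_json)
        .save()
    )
```

Switching the last argument to "objectOrArrayOnly" would be the variant that failed for me.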
Question
How can I convert _id (String) so that its value carries $oid, in the format '{"$oid": "_id value"}'?
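To make the target concrete, here is a rough sketch of one possible workaround: build the string by concatenation, so that $ only ever appears inside a string literal and never as a struct field or column name. The helper names oid_json and with_oid_json are hypothetical, and whether the connector then turns the string into an ObjectId on write still depends on the conversion option mentioned above.

```python
import json


def oid_json(hex_id: str) -> str:
    """Pure-Python reference for the target string, e.g. '{"$oid": "abc"}'."""
    return json.dumps({"$oid": hex_id})


def with_oid_json(df, src="_id", dst="_id"):
    """Return df with column dst holding '{"$oid": "<value of src>"}'.

    Sketch only: assumes src is a string column. concat() yields null
    when src is null, so null ids stay null.
    """
    # Imported here so the pure-Python helper above works without Spark.
    from pyspark.sql import functions as F

    return df.withColumn(
        dst,
        F.concat(F.lit('{"$oid": "'), F.col(src), F.lit('"}')),
    )
```

Usage would be something like df2 = with_oid_json(df) before writing df2 with the connector.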
This seems like such a simple issue, and with the latest MongoDB and a Spark connector built for this purpose, having to go through all these workarounds is a bit surprising. But I need a good sample for the above question, if you can share one. I would greatly appreciate it, and I can live with whatever workarounds are needed.