How to write an ObjectId value with Spark Connector 10.1 using PySpark?

Yes, I noticed the documentation change and switched to MongoDB Spark Connector 10.2.0.
Now it works, provided the column value is in this format: '{"$oid": "xxxxxxxxx"}'

But I still have a few problems:

  • I don't know how to get $ into the struct field name. I can generate the column with to_json(struct(_id as oid)), but PySpark complains if I use $ in the name.
  • If I don't use $, the driver won't convert the value to ObjectId.
  • In addition, the driver still reads _id (ObjectId) as a string when it loads data from MongoDB.

Finally, there is also a mismatch in the driver documentation: this post says to use "object_Or_Array_Only", but the documentation uses "objectOrArrayOnly".

In my case I was only able to get the above working with the "any" option. When I used "object_Or_Array_Only", it gave an error while converting my DataFrame, I think because of other data in the DataFrame.
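
For reference, a minimal sketch of how the write can be wired up with the "any" setting. The option keys ("connection.uri", "database", "collection", "convertJson") are assumed from the connector 10.2 write configuration and should be checked against the documentation for your version; the helper names, URI, database and collection are hypothetical placeholders.

```python
def mongo_write_options(uri, database, collection, convert_json="any"):
    """Build the option map passed to the connector's DataFrame writer.

    Option keys are assumed from the connector 10.2 write configuration.
    """
    return {
        "connection.uri": uri,
        "database": database,
        "collection": collection,
        # "any" is the value reported to work above; the documentation
        # spells the stricter value "objectOrArrayOnly".
        "convertJson": convert_json,
    }


def write_to_mongo(df, options):
    """Append a Spark DataFrame to MongoDB using the given options."""
    df.write.format("mongodb").mode("append").options(**options).save()


if __name__ == "__main__":
    # Placeholder connection details; replace with your own.
    opts = mongo_write_options("mongodb://localhost:27017", "mydb", "mycoll")
    print(opts)
```

Keeping the options in one dictionary makes it easy to switch between "any" and "objectOrArrayOnly" while testing which one the DataFrame tolerates.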

Question
How can I convert _id (string) so that its value carries $oid, in the format '{"$oid": "_id value"}'?

This seems like such a simple issue, and with the latest MongoDB and a Spark connector built for the purpose, having to go through all these workarounds is a bit surprising. But I need a good sample for the question above. If you can share one I would greatly appreciate it, and I can live with whatever workarounds are needed.