I’m trying to replicate a MongoDB collection to Delta Lake using the Spark Connector with Structured Streaming, but I’ve run into a problem.
When I set the option `change.stream.publish.full.document.only=true`, deleted documents never appear in the stream. That much is expected, since a delete event has no full document to publish.
But if I omit the option, each row only has the `_data` field populated; all other fields are null.
I would at least expect the `_id` of the deleted document to come through, so I can delete the corresponding entry from the Delta table.
Can someone explain to me how to capture deleted documents with Structured Streaming?
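For context, in a MongoDB change stream a delete event carries the key in `documentKey._id`, not in `fullDocument`. Here is a minimal plain-Python sketch (with a hypothetical event payload, not connector output I've verified) of what I'd expect to do: parse the raw change-event JSON and pull out the `_id` of the deleted document so the matching Delta row can be removed:

```python
import json

def extract_delete_key(raw_event: str):
    """Parse a raw change-stream event (JSON string) and return the
    _id of the deleted document, or None for non-delete events."""
    event = json.loads(raw_event)
    if event.get("operationType") != "delete":
        return None
    # Delete events have no fullDocument; the key lives in documentKey.
    return event.get("documentKey", {}).get("_id")

# Hypothetical payload shaped like a MongoDB delete change event.
sample = json.dumps({
    "_id": {"_data": "..."},          # resume token
    "operationType": "delete",
    "documentKey": {"_id": "abc123"},
})
print(extract_delete_key(sample))  # -> abc123
```

In Spark I'd presumably apply something like this inside `foreachBatch` (or define an explicit schema that includes `operationType` and `documentKey`), but that's exactly the part where I only see `_data` and nulls.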