Hi - I am trying to read change stream data from MongoDB and persist the results to a file sink, but I am getting the following error: java.lang.UnsupportedOperationException: Data source mongodb does not support microbatch processing.
Here is my code snippet:
query = (spark.readStream.format("mongodb")
    .option("spark.mongodb.connection.uri", "mongodb+srv://<<mongo-connection>>")
    .option("spark.mongodb.database", "xxx")
    .option("spark.mongodb.collection", "xxx")
    .option("spark.mongodb.change.stream.publish.full.document.only", "true")
    .option("forceDeleteTempCheckpointLocation", "true")
    .load())
query.printSchema()
(query.writeStream
    .outputMode("append")
    .format("parquet")
    .option("checkpointLocation", "s3://xxxx/checkpoint")
    .option("path", "s3://xxx")
    .start())
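
From the error it sounds like the source might only support continuous processing rather than micro-batch. Would adding a continuous trigger be the right direction? A rough sketch of what I mean is below (console sink only because I'm not sure the parquet sink works in continuous mode, and the checkpoint path is just a throwaway placeholder):

# Sketch only: same streaming DataFrame as above, but with a continuous
# trigger and a console sink, since the error mentions micro-batch processing.
# The checkpoint path below is a placeholder for testing.
(query.writeStream
    .outputMode("append")
    .format("console")
    .trigger(continuous="1 second")
    .option("checkpointLocation", "/tmp/mongo-continuous-test")
    .start())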
Environment details:
MongoDB Atlas: 5.0.8
Spark: 3.2.1
MongoDB Spark Connector: 10.0.2
Does this connector support writing to file sinks? Any suggestions?
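In particular, would something like foreachBatch be a workable way to land the data as parquet, or does that still require micro-batch support on the read side? A rough sketch of what I had in mind (write_batch is just a placeholder name, and the S3 paths are the same placeholders as above):

# Sketch only: write each micro-batch out as parquet files on S3.
def write_batch(batch_df, batch_id):
    batch_df.write.mode("append").parquet("s3://xxx")

(query.writeStream
    .foreachBatch(write_batch)
    .option("checkpointLocation", "s3://xxxx/checkpoint")
    .start())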