Structured Streaming with MongoDB
Overview
Spark Structured Streaming is a data stream processing engine you can use through the Dataset or DataFrame API. The MongoDB Spark Connector enables you to stream to and from MongoDB using Spark Structured Streaming.
Important
Spark Structured Streaming and Spark Streaming with DStreams are different.
To learn more about Structured Streaming, see the Spark Programming Guide.
Configuring a Write Stream to MongoDB
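When writing a stream to MongoDB, you configure a DataStreamWriter with the "mongodb" format. The following PySpark sketch uses Spark's built-in rate source as stand-in input; the checkpoint path, connection URI, database name, and collection name are placeholders, and the connector package is assumed to be on the Spark classpath (for example, via the --packages option of spark-submit).

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("write-stream-sketch")
    .getOrCreate()
)

# The built-in rate source emits (timestamp, value) rows and stands in
# for a real streaming source.
stream = (
    spark.readStream
    .format("rate")
    .option("rowsPerSecond", 5)
    .load()
)

# checkpointLocation is required so Spark can recover the stream's progress.
query = (
    stream.writeStream
    .format("mongodb")
    .option("checkpointLocation", "/tmp/pyspark/")                        # placeholder path
    .option("spark.mongodb.connection.uri", "mongodb://localhost:27017")  # placeholder URI
    .option("spark.mongodb.database", "example_db")                       # placeholder database
    .option("spark.mongodb.collection", "example_collection")             # placeholder collection
    .outputMode("append")
    .start()
)

query.awaitTermination()
```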
Configuring a Read Stream from MongoDB
When reading a stream from a MongoDB database, the MongoDB Spark Connector supports both micro-batch processing and continuous processing. Micro-batch processing is the default processing engine, while continuous processing is an experimental feature introduced in Spark version 2.3. To learn more about continuous processing, see the Spark documentation.
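You choose the processing engine through the trigger you set when starting the stream. The following sketch, which assumes an existing SparkSession named spark and uses placeholder schema and connection values, opts in to continuous processing by passing trigger(continuous="1 second"); omit the trigger to keep the default micro-batch engine.

```python
from pyspark.sql.types import StructType, StringType, DoubleType

# Streaming reads need an explicit schema; these fields are placeholders.
read_schema = (
    StructType()
    .add("symbol", StringType())
    .add("price", DoubleType())
)

query = (
    spark.readStream
    .format("mongodb")
    .option("spark.mongodb.connection.uri", "mongodb://localhost:27017")  # placeholder
    .option("spark.mongodb.database", "example_db")                       # placeholder
    .option("spark.mongodb.collection", "example_collection")             # placeholder
    .schema(read_schema)
    .load()
    .writeStream
    .format("console")
    # Opt in to the experimental continuous engine; remove this line to
    # fall back to the default micro-batch engine.
    .trigger(continuous="1 second")
    .outputMode("append")
    .start()
)
```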
Note
The connector reads from your MongoDB deployment's change stream. To generate change events on the change stream, perform write operations, such as inserts or updates, on your database.
To learn more about change streams, see Change Streams in the MongoDB manual.
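For example, a few writes issued through PyMongo, with a placeholder connection string and namespace, each produce a change event that a running read stream picks up:

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")   # placeholder URI
coll = client["example_db"]["example_collection"]   # placeholder namespace

# Each write emits an event on the collection's change stream, which a
# running read stream then consumes.
coll.insert_one({"symbol": "MDB", "price": 1.0})
coll.update_one({"symbol": "MDB"}, {"$set": {"price": 2.0}})
```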
Examples
The following examples show Spark Structured Streaming configurations for streaming to and from MongoDB.
Stream to MongoDB from a CSV File
To stream data from a CSV file to MongoDB:
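The following is a minimal PySpark sketch: the CSV directory, schema columns, connection string, and namespace are placeholders, and the connector package is assumed to be on the Spark classpath.

```python
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StringType, IntegerType

spark = (
    SparkSession.builder
    .appName("csv-to-mongodb")
    .getOrCreate()
)

# File sources require an explicit schema; these columns are placeholders.
csv_schema = (
    StructType()
    .add("name", StringType())
    .add("age", IntegerType())
)

# Each CSV file that appears in the directory becomes new streaming input.
csv_stream = (
    spark.readStream
    .format("csv")
    .option("header", "true")
    .schema(csv_schema)
    .load("/path/to/csv-dir/")  # placeholder directory
)

query = (
    csv_stream.writeStream
    .format("mongodb")
    .option("checkpointLocation", "/tmp/checkpoint/")                     # placeholder
    .option("spark.mongodb.connection.uri", "mongodb://localhost:27017")  # placeholder
    .option("spark.mongodb.database", "example_db")                       # placeholder
    .option("spark.mongodb.collection", "example_collection")             # placeholder
    .outputMode("append")
    .start()
)

query.awaitTermination()
```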
Stream to Your Console from MongoDB
To stream data from MongoDB to your console:
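The sketch below assumes an existing SparkSession named spark and uses the same placeholder schema and connection values as above. Because no trigger is set, the stream runs on the default micro-batch engine; the full-document option shown tells the connector to publish only each changed document rather than the full change event.

```python
from pyspark.sql.types import StructType, StringType, DoubleType

# Placeholder schema for the documents in the source collection.
read_schema = (
    StructType()
    .add("symbol", StringType())
    .add("price", DoubleType())
)

query = (
    spark.readStream
    .format("mongodb")
    .option("spark.mongodb.connection.uri", "mongodb://localhost:27017")  # placeholder
    .option("spark.mongodb.database", "example_db")                       # placeholder
    .option("spark.mongodb.collection", "example_collection")             # placeholder
    # Publish only each changed document rather than the full change event.
    .option("spark.mongodb.change.stream.publish.full.document.only", "true")
    .schema(read_schema)
    .load()
    .writeStream
    .format("console")
    .outputMode("append")  # no trigger set, so the default micro-batch engine runs
    .start()
)

query.awaitTermination()
```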