Release Notes

The 10.3 connector release includes the following new features:

  • Added support for reading multiple collections when using micro-batch or continuous streaming modes.

    Warning

    Breaking Change

    Support for reading multiple collections introduces the following breaking changes:

    • If the name of a collection used in your collection configuration option contains a comma, the Spark Connector treats it as two different collections. To avoid this, you must escape the comma by preceding it with a backslash (\).

    • If the name of a collection used in your collection configuration option is "*", the Spark Connector interprets it as a specification to scan all collections. To avoid this, you must escape the asterisk by preceding it with a backslash (\).

    • If the name of a collection used in your collection configuration option contains a backslash (\), the Spark Connector treats the backslash as an escape character, which might change how it interprets the value. To avoid this, you must escape the backslash by preceding it with another backslash.

    To learn more about scanning multiple collections, see the collection configuration property description.
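
    The escaping rules above can be sketched as a small helper. This is an illustrative sketch only, not part of the connector: the function names escape_collection_name and collection_option are hypothetical, and the sketch assumes that multiple collections are listed as a comma-separated string, as the first breaking change implies.

```python
def escape_collection_name(name: str) -> str:
    """Escape one literal collection name (hypothetical helper, not a connector API)."""
    # Escape backslashes first so the backslashes added for commas are not doubled.
    escaped = name.replace("\\", "\\\\").replace(",", "\\,")
    # Only a name that is exactly "*" is read as "scan all collections".
    return "\\*" if escaped == "*" else escaped


def collection_option(names: list[str]) -> str:
    """Build a value for the collection option from literal collection names."""
    # Unescaped commas separate the individual collections.
    return ",".join(escape_collection_name(n) for n in names)


# Three literal collection names: "orders,2024", "*", and "logs\raw".
print(collection_option(["orders,2024", "*", "logs\\raw"]))
# prints: orders\,2024,\*,logs\\raw
```

    The resulting string would then be supplied as the value of the collection configuration option on a streaming read. Check the collection configuration property description for the exact syntax your connector version expects.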

The 10.2 connector release includes the following new features:

  • Added the ignoreNullValues write-configuration property, which enables you to control whether the connector ignores null values. In previous versions, the connector always wrote null values to MongoDB.

  • Added options for the convertJson write-configuration property.

  • Added the change.stream.micro.batch.max.partition.count read-configuration property, which allows you to divide micro-batches into multiple partitions for parallel processing.

  • Improved change stream schema inference when using the change.stream.publish.full.document.only read-configuration property.

  • Added the change.stream.startup.mode read-configuration property, which specifies how the connector processes change events when no offset is available.

  • Support for adding a comment to operations.

  • Corrected a bug in which aggregations including the $collStats pipeline stage did not return a count field for Time Series collections.

  • Support for Scala 2.13.

  • Support for micro-batch mode with Spark Structured Streaming.

  • Support for BSON data types.

  • Improved partitioner support for empty collections.

  • Option to disable automatic upsert on write operations.

  • Improved schema inference for empty arrays.

  • Support for null values in arrays and lists. The connector now writes these values to MongoDB instead of throwing an exception.

See this post on the MongoDB blog for more information.

  • Support for Spark Structured Streaming.
