Configuring Spark

You can configure read and write operations in both batch and streaming mode. To learn more about the available configuration options, see the read and write configuration pages for batch mode and for streaming mode.

You can specify configuration options with SparkConf using any of the following approaches:

  • The --conf flag at runtime. To learn more, see Dynamically Loading Spark Properties in the Spark documentation.

  • The $SPARK_HOME/conf/spark-defaults.conf file.

The MongoDB Spark Connector uses the settings in SparkConf as defaults.
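As a rough illustration of supplying such defaults from application code, the following sketch collects connector properties in a dictionary and passes them to SparkSession.builder.config(). The helper names mongo_defaults and build_session, the application name, and the URI, database, and collection values are made up for this example. The property keys assume the spark.mongodb.read. and spark.mongodb.write. prefixes.

```python
def mongo_defaults(uri, database, collection):
    """Return SparkConf-style defaults for both reads and writes.

    The key names assume the connector's spark.mongodb.read. and
    spark.mongodb.write. prefixes; the values are caller-supplied.
    """
    defaults = {}
    for direction in ("read", "write"):
        prefix = f"spark.mongodb.{direction}."
        defaults[prefix + "connection.uri"] = uri
        defaults[prefix + "database"] = database
        defaults[prefix + "collection"] = collection
    return defaults


def build_session(defaults, app_name="mongo-config-example"):
    """Create a SparkSession whose SparkConf carries the given defaults."""
    # Imported lazily so mongo_defaults() can be used without PySpark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in defaults.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

The same key and value pairs could instead be passed on the command line as --conf key=value, or listed in $SPARK_HOME/conf/spark-defaults.conf.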

In the Spark API, the DataFrameReader, DataFrameWriter, DataStreamReader, and DataStreamWriter classes each contain an option() method. You can use this method to specify options for the underlying read or write operation.

Note

Options specified in this way override any corresponding settings in SparkConf.
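The override can be sketched with a DataFrameWriter. This example assumes the connector is registered under the data source name "mongodb", and that a collection default may already be set in SparkConf. The function name write_to_collection and the "append" save mode are choices made for illustration.

```python
def write_to_collection(df, collection):
    """Write df through the connector to the given collection.

    The option() call below takes precedence over any
    spark.mongodb.write.collection value set in SparkConf.
    """
    (
        df.write.format("mongodb")  # assumed data source name
        .mode("append")
        .option("spark.mongodb.write.collection", collection)
        .save()
    )
```

Any write setting not passed through option() would still come from the SparkConf defaults.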

Options maps support short-form syntax. You may omit the prefix when specifying an option key string.

Example

The following syntaxes are equivalent to one another:

  • dfw.option("spark.mongodb.write.collection", "myCollection").save()

  • dfw.option("spark.mongodb.collection", "myCollection").save()

  • dfw.option("collection", "myCollection").save()
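To make the equivalence concrete, here is a small illustrative model that reduces a write-option key to its short form by stripping a known prefix. It is not the connector's actual implementation. The names WRITE_PREFIXES and short_write_key are hypothetical, and the model covers only the two write prefixes shown in the example above.

```python
# Prefixes taken from the example above, longest first so that
# "spark.mongodb.write." is tried before "spark.mongodb.".
WRITE_PREFIXES = ("spark.mongodb.write.", "spark.mongodb.")


def short_write_key(key):
    """Return key with the first matching write prefix removed."""
    for prefix in WRITE_PREFIXES:
        if key.startswith(prefix):
            return key[len(prefix):]
    return key  # already in short form
```

Under this model, all three keys in the example reduce to "collection".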

To learn more about the option() method, see the Spark API documentation for the DataFrameReader, DataFrameWriter, DataStreamReader, and DataStreamWriter classes.

The Spark Connector reads some configuration settings before SparkConf is available. You must specify these settings by using a JVM system property.

For more information on Java system properties, see the Java documentation.
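One possible way to pass a JVM system property to a Spark application is through spark-submit arguments, as sketched below. The helper name system_property_args and the property name in the usage note are placeholders, because the specific settings are not named here. Whether a given setting is needed on the driver, the executors, or both is also an assumption. The --driver-java-options flag and the spark.executor.extraJavaOptions property are standard Spark options.

```python
def system_property_args(name, value):
    """Build spark-submit arguments that set -Dname=value on the JVMs.

    The driver receives the property through --driver-java-options and
    the executors through spark.executor.extraJavaOptions.
    """
    flag = f"-D{name}={value}"
    return [
        "--driver-java-options", flag,
        "--conf", f"spark.executor.extraJavaOptions={flag}",
    ]
```

For example, system_property_args("example.setting", "5000") yields arguments that could be appended to a spark-submit command line, where example.setting stands in for a real property name.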

Tip

Configuration Exceptions

If the Spark Connector throws a ConfigException, confirm that your SparkConf or options map uses correct syntax and contains only valid configuration options.
