Overview
You can configure the following properties when writing data to MongoDB in streaming mode.
Note
If you use SparkConf to set the connector's write configurations, prefix each property with spark.mongodb.write.
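The prefix rule can be sketched in a few lines of Python. This is only an illustration: the helper name with_write_prefix and the sample values are hypothetical and are not part of the connector.

```python
WRITE_PREFIX = "spark.mongodb.write."


def with_write_prefix(settings):
    """Return a copy of settings whose keys carry the SparkConf write prefix."""
    return {
        # Leave keys that are already prefixed untouched.
        key if key.startswith(WRITE_PREFIX) else WRITE_PREFIX + key: value
        for key, value in settings.items()
    }


conf_entries = with_write_prefix({
    "connection.uri": "mongodb://127.0.0.1/",
    "database": "myDB",
    "collection": "myCollection",
})

# Print the entries in the key=value form used elsewhere on this page.
for key, value in conf_entries.items():
    print(f"{key}={value}")
```

Each resulting pair can then be passed to SparkConf.set(key, value) or to SparkSession.builder.config(key, value).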
| Property name | Description |
|---|---|
| connection.uri | Required. The connection string configuration key. Default: mongodb://localhost:27017/ |
| database | Required. The database name configuration. |
| collection | Required. The collection name configuration. |
| comment | The comment to append to the write operation. Comments appear in the output of the Database Profiler. Default: None |
| mongoClientFactory | MongoClientFactory configuration key. You can specify a custom implementation, which must implement the com.mongodb.spark.sql.connector.connection.MongoClientFactory interface. Default: com.mongodb.spark.sql.connector.connection.DefaultMongoClientFactory |
| convertJson | Specifies whether the connector parses string values and converts extended JSON into BSON. Default: false |
| idFieldList | Specifies a field or list of fields by which to split the collection data. To specify more than one field, separate the field names with commas. Default: _id |
| ignoreNullValues | When true, the connector ignores any null values when writing, including null values in arrays and nested documents. Default: false |
| maxBatchSize | Specifies the maximum number of operations to batch in bulk operations. Default: 512 |
| operationType | Specifies the type of write operation to perform. Default: replace |
| ordered | Specifies whether to perform ordered bulk operations. Default: true |
| upsertDocument | When true, replace and update operations insert the data if no match exists. For time series collections, you must set upsertDocument to false. Default: true |
| writeConcern.w | Specifies w, a write-concern option requesting acknowledgment that the write operation has propagated to a specified number of MongoDB nodes. For a list of allowed values for this option, see WriteConcern w Option in the MongoDB Server manual. Default: Acknowledged |
| writeConcern.journal | Specifies j, a write-concern option requesting acknowledgment that the data has been written to the on-disk journal for the criteria specified in the w option. You can specify either true or false. For more information on j values, see WriteConcern j Option in the MongoDB Server manual. |
| writeConcern.wTimeoutMS | Specifies wTimeoutMS, a write-concern option to return an error when a write operation exceeds the specified number of milliseconds. If you use this optional setting, you must specify a nonnegative integer. For more information on wTimeoutMS values, see WriteConcern wtimeout in the MongoDB Server manual. |
| checkpointLocation | The absolute file path of the directory where the connector writes checkpoint information. For more information about checkpoints, see the Spark Structured Streaming Programming Guide. Default: None |
| forceDeleteTempCheckpointLocation | A Boolean value that specifies whether to delete existing checkpoint data. Default: false |
| truncateMode | Specifies how to truncate a collection when performing an overwrite. |
| ignoreDuplicatesOnInsert | When set to true, the connector ignores duplicate key errors when performing unordered insert operations. The data being inserted must include an _id field value or whichever fields are specified in the idFieldList option. Default: false |
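As a rough illustration of how such properties might be applied to a streaming write from PySpark, the sketch below configures a DataStreamWriter with the mongodb format. The function name mongo_stream_writer and its parameters are hypothetical, and passing the unprefixed property names as writer options is an assumption you should check against your connector version.

```python
def mongo_stream_writer(streaming_df, uri, database, collection, checkpoint_dir):
    """Configure a streaming write to MongoDB and return the unstarted writer.

    streaming_df is expected to be a streaming Spark DataFrame; only its
    writeStream attribute is used here.
    """
    return (
        streaming_df.writeStream
        .format("mongodb")
        # Connector settings, passed as per-write options (assumed to be
        # accepted without the spark.mongodb.write. prefix).
        .option("connection.uri", uri)
        .option("database", database)
        .option("collection", collection)
        # Directory where checkpoint information is written.
        .option("checkpointLocation", checkpoint_dir)
        .outputMode("append")
    )
```

Calling start() on the returned writer begins the streaming query. Keeping configuration separate from start() makes it easier to inspect or extend the options first.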
Specifying Properties in connection.uri
If you use SparkConf to specify any of the previous settings, you can either include them in the connection.uri setting or list them individually.
The following code example shows how to specify the database, the collection, and the convertJson setting as part of the connection.uri setting:
spark.mongodb.write.connection.uri=mongodb://127.0.0.1/myDB.myCollection?convertJson=any
To keep the connection.uri shorter and make the settings easier to read, you can specify them individually instead:
spark.mongodb.write.connection.uri=mongodb://127.0.0.1/
spark.mongodb.write.database=myDB
spark.mongodb.write.collection=myCollection
spark.mongodb.write.convertJson=any
Important
If you specify a setting both in connection.uri and on its own line, the connection.uri setting takes precedence. For example, in the following configuration, the connection database is foobar:
spark.mongodb.write.connection.uri=mongodb://127.0.0.1/foobar
spark.mongodb.write.database=bar
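The precedence rule can be modeled with a short Python sketch. This is a simplified illustration, not the connector's actual parsing logic: the helper effective_write_settings is hypothetical, and it assumes that the URI path has the form database.collection and that every query parameter is a write setting.

```python
from urllib.parse import parse_qsl, urlsplit

PREFIX = "spark.mongodb.write."


def effective_write_settings(conf):
    """Merge individual settings with those embedded in connection.uri.

    Values found in connection.uri override individually listed ones.
    """
    # Start from the individually listed settings, with the prefix removed.
    settings = {
        key[len(PREFIX):]: value
        for key, value in conf.items()
        if key.startswith(PREFIX) and key != PREFIX + "connection.uri"
    }
    parts = urlsplit(conf[PREFIX + "connection.uri"])
    # The URI path may carry "database" or "database.collection".
    namespace = parts.path.lstrip("/")
    if namespace:
        database, _, collection = namespace.partition(".")
        settings["database"] = database
        if collection:
            settings["collection"] = collection
    # Query parameters such as convertJson=any also override.
    settings.update(parse_qsl(parts.query))
    return settings


conf = {
    "spark.mongodb.write.connection.uri": "mongodb://127.0.0.1/foobar",
    "spark.mongodb.write.database": "bar",
}
print(effective_write_settings(conf)["database"])  # prints "foobar"
```

Applied to the earlier URI mongodb://127.0.0.1/myDB.myCollection?convertJson=any, the same helper yields myDB, myCollection, and any for the database, collection, and convertJson settings.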