Overview
You can configure the following properties when writing data to MongoDB in streaming mode.
Note
If you use SparkConf to set the connector's write configurations, add the prefix spark.mongodb.write. to each property name.
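The prefixing rule can be sketched in plain Python. This is a minimal illustration, not part of the connector: the helper name with_write_prefix and the sample values are assumptions made for the example, and the resulting key/value pairs are what you would pass to SparkConf.

```python
# Hypothetical helper (not a connector API): builds SparkConf-style keys
# by adding the write prefix to each short property name.
WRITE_PREFIX = "spark.mongodb.write."


def with_write_prefix(options):
    """Return a copy of `options` whose keys all carry the write prefix."""
    return {
        # Leave keys alone if they are already prefixed.
        key if key.startswith(WRITE_PREFIX) else WRITE_PREFIX + key: value
        for key, value in options.items()
    }


conf_entries = with_write_prefix({
    "connection.uri": "mongodb://127.0.0.1/",
    "database": "myDB",
    "collection": "myCollection",
})

# Print the entries in the key=value form used in this page's examples.
for key, value in conf_entries.items():
    print(f"{key}={value}")
```

Keys that already start with the prefix are left unchanged, so the helper can be applied to a mix of short and fully qualified names.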
Property name | Description
---|---
connection.uri | Required. The connection string configuration key. Default: mongodb://localhost:27017/
database | Required. The database name configuration.
collection | Required. The collection name configuration.
comment | The comment to append to the write operation. Comments appear in the output of the Database Profiler. Default: None
mongoClientFactory | MongoClientFactory configuration key. You can specify a custom implementation, which must implement the com.mongodb.spark.sql.connector.connection.MongoClientFactory interface. Default: com.mongodb.spark.sql.connector.connection.DefaultMongoClientFactory
convertJson | Specifies whether the connector parses string values and converts extended JSON into BSON. This setting accepts the values any, objectOrArrayOnly, and false. Default: false
idFieldList | Specifies a field or list of fields by which to split the collection data. To specify more than one field, separate the field names with commas, for example fieldName1,fieldName2. Default: _id
ignoreNullValues | When true, the connector ignores any null values when writing, including null values in arrays and nested documents. Default: false
maxBatchSize | Specifies the maximum number of operations to batch in bulk operations. Default: 512
operationType | Specifies the type of write operation to perform. You can set this to insert, replace, or update. Default: replace
ordered | Specifies whether to perform ordered bulk operations. Default: true
upsertDocument | When true, replace and update operations insert the data if no match exists. For time series collections, you must set upsertDocument to false. Default: true
writeConcern.w | Specifies w, a write-concern option requesting acknowledgment that the write operation has propagated to a specified number of MongoDB nodes. For a list of allowed values for this option, see WriteConcern w Option in the MongoDB Server manual. Default: Acknowledged
writeConcern.journal | Specifies j, a write-concern option requesting acknowledgment that the data has been written to the on-disk journal for the criteria specified in the w option. You can specify either true or false. For more information on j values, see WriteConcern j Option in the MongoDB Server manual.
writeConcern.wTimeoutMS | Specifies wTimeoutMS, a write-concern option to return an error when a write operation exceeds the specified number of milliseconds. If you use this optional setting, you must specify a nonnegative integer. For more information on wTimeoutMS values, see WriteConcern wtimeout in the MongoDB Server manual.
checkpointLocation | The absolute file path of the directory where the connector writes checkpoint information. For more information about checkpoints, see the Spark Structured Streaming Programming Guide. Default: None
forceDeleteTempCheckpointLocation | A Boolean value that specifies whether to delete existing checkpoint data. Default: false
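Specifying Properties in connection.uri
If you use SparkConf to specify any of the previous settings, you can either include them in the connection.uri setting or list them individually.
The following code example shows how to specify the database, collection, and convertJson settings as part of the connection.uri setting: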
spark.mongodb.write.connection.uri=mongodb://127.0.0.1/myDB.myCollection?convertJson=any
To keep the connection.uri shorter and make the settings easier to read, you can specify them individually instead:
spark.mongodb.write.connection.uri=mongodb://127.0.0.1/
spark.mongodb.write.database=myDB
spark.mongodb.write.collection=myCollection
spark.mongodb.write.convertJson=any
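The relationship between the combined and the individual forms can be sketched with Python's standard urllib.parse module. The function name split_write_uri is hypothetical, and the sketch makes a simplifying assumption: it moves every query parameter out of the URI, without distinguishing connector settings from driver connection options, and it does not preserve credentials handling beyond what the host part already contains.

```python
from urllib.parse import parse_qsl, urlsplit


def split_write_uri(uri):
    """Split a combined connection.uri into individual (unprefixed) settings.

    Illustrative only: every query parameter is moved out of the URI.
    """
    parts = urlsplit(uri)
    # The path holds "<database>.<collection>" after the leading slash.
    namespace = parts.path.lstrip("/")
    database, _, collection = namespace.partition(".")

    settings = {"connection.uri": f"{parts.scheme}://{parts.netloc}/"}
    if database:
        settings["database"] = database
    if collection:
        settings["collection"] = collection
    # Each query parameter becomes its own setting, e.g. convertJson=any.
    settings.update(parse_qsl(parts.query))
    return settings


individual = split_write_uri(
    "mongodb://127.0.0.1/myDB.myCollection?convertJson=any"
)
for key, value in individual.items():
    print(f"spark.mongodb.write.{key}={value}")
```

Applied to the combined example URI, the sketch yields the same four settings as the individual listing.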
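Important
If you specify a setting both in the connection.uri and on its own line, the connection.uri setting takes precedence. For example, in the following configuration, the connection database is foobar: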
spark.mongodb.write.connection.uri=mongodb://127.0.0.1/foobar
spark.mongodb.write.database=bar
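The precedence rule can be modeled with a few lines of Python. This is a sketch of the documented behavior, not the connector's actual resolution code; the function name effective_database is an assumption made for illustration.

```python
from urllib.parse import urlsplit

URI_KEY = "spark.mongodb.write.connection.uri"
DATABASE_KEY = "spark.mongodb.write.database"


def effective_database(settings):
    """Return the database name, preferring the one embedded in the URI."""
    uri = settings.get(URI_KEY, "")
    # The URI path looks like "/<database>" or "/<database>.<collection>".
    namespace = urlsplit(uri).path.lstrip("/")
    uri_database = namespace.partition(".")[0]
    # Fall back to the separate setting only when the URI names no database.
    return uri_database or settings.get(DATABASE_KEY)


conflicting = {
    URI_KEY: "mongodb://127.0.0.1/foobar",
    DATABASE_KEY: "bar",
}
print(effective_database(conflicting))
```

With the conflicting configuration above, the sketch resolves the database to foobar; if the URI ends at the host (mongodb://127.0.0.1/), it falls back to the separately listed value.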