Write configuration for upsert with PySpark in MongoDB

For example, the MongoDB collection already has 2 fields.

The PySpark dataframe contains 3 fields, including the primary key.
So, when I update a document matched on that key, I want to keep the old fields while writing the dataframe. I don't want to lose the old data; I only want to update the fields that are present in the dataframe.

Is this possible? If there is a PySpark write configuration for this, a suggestion would be helpful.

Example:

Data present in collection:

| A | B |
| --- | --- |
| x | 1 |
| y | 1 |
| z | 1 |

New dataframe:

| A | B | C |
| --- | --- | --- |
| x |   | 2 |
| y | 0 | 2 |
| z | 2 | 2 |

I want the result in the MongoDB collection to be as below:

| A | B | C |
| --- | --- | --- |
| x | 1 | 2 |
| y | 0 | 2 |
| z | 2 | 2 |
| w | 3 | 2 |

Hi Aishwarya,
The write configurations are defined here: https://www.mongodb.com/docs/spark-connector/current/configuration/write/

The relevant config is `operationType`, which specifies the type of write operation to perform. You can set it to one of the following values:

- `insert`: insert the data.
- `replace`: replace an existing document that matches the `idFieldList` value with the new data. If no match exists, the value of `upsertDocument` indicates whether or not the connector inserts a new document.
- `update`: update an existing document that matches the `idFieldList` value with the new data. If no match exists, the value of `upsertDocument` indicates whether or not the connector inserts a new document.
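For the behavior in your example, `operationType=update` together with `upsertDocument=true` is the combination to try. Below is a minimal sketch, not a tested solution: the connection URI, database, and collection names are placeholders, the connector version is an assumption to adjust for your environment, and it assumes `A` is the key field you want to match on via `idFieldList` (option names follow the write configuration page linked above).

```python
from pyspark.sql import SparkSession

# Placeholder connection details -- replace with your own deployment.
uri = "mongodb://localhost:27017"

spark = (
    SparkSession.builder
    .appName("mongo-upsert-example")
    # Spark Connector v10.x package; adjust the version to your environment.
    .config("spark.jars.packages", "org.mongodb.spark:mongo-spark-connector_2.12:10.2.1")
    .getOrCreate()
)

# Two of the rows from the example dataframe above (A is the key field).
df = spark.createDataFrame(
    [("y", 0, 2), ("z", 2, 2)],
    ["A", "B", "C"],
)

(
    df.write.format("mongodb")
    .mode("append")
    .option("connection.uri", uri)
    .option("database", "test")          # placeholder database name
    .option("collection", "mycoll")      # placeholder collection name
    .option("operationType", "update")   # update matched documents instead of replacing them
    .option("idFieldList", "A")          # match existing documents on the "A" field
    .option("upsertDocument", "true")    # insert a new document when no match exists
    .save()
)
```

With `update`, fields already stored on a document but not present in the dataframe (such as `B` for key `x` in your example) should be left in place, whereas `replace` would overwrite the whole document; it's worth verifying the exact merge behavior on a test collection first, especially for rows where a field is null rather than absent.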

Let us know if this answers your question.