For example, say a MongoDB collection already has documents with 2 fields, and a PySpark DataFrame contains 3 fields including the primary key. When I update an existing document on that key while writing the DataFrame, I want to keep the old fields too: I don't want to lose the old data, only update the fields that are present in the DataFrame.

Is this possible? If there is a PySpark write configuration for this, a suggestion would be helpful.
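What I'm imagining is a write option that merges into the existing document instead of replacing it. A minimal sketch of the kind of configuration I mean, assuming the MongoDB Spark Connector 10.x; the URI, database/collection names, and package version are placeholders, and `operationType`, `idFieldList`, and `upsertDocument` are the connector options that, as I understand it, control this behavior:

```python
# Sketch only: assumes MongoDB Spark Connector 10.x; the URI, database,
# collection, and package version below are placeholders.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("mongo-merge-write")
    .config("spark.jars.packages",
            "org.mongodb.spark:mongo-spark-connector_2.12:10.2.1")
    .getOrCreate()
)

# The "New DataFrame" from the example below; A is the key.
df = spark.createDataFrame(
    [("x", 2, None), ("y", 0, 2), ("z", 2, 2)],
    ["A", "B", "C"],
)

(
    df.write.format("mongodb")
    .mode("append")
    .option("connection.uri", "mongodb://localhost:27017")
    .option("database", "mydb")
    .option("collection", "mycoll")
    # Match existing documents on A rather than _id.
    .option("idFieldList", "A")
    # "update" applies $set-style updates to only the DataFrame's
    # fields; the default, "replace", overwrites the whole document.
    .option("operationType", "update")
    # Insert a new document when no match exists.
    .option("upsertDocument", "true")
    .save()
)
```

With the older 3.x connector (`format("mongo")`), the analogous switch appears to be `.option("replaceDocument", "false")`, which likewise updates only the columns present in the DataFrame.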
Example:

Data present in the collection:

| A | B |
|---|---|
| x | 1 |
| y | 1 |
| z | 1 |

New DataFrame:

| A | B | C |
|---|---|---|
| x | 2 |   |
| y | 0 | 2 |
| z | 2 | 2 |

Desired result in the MongoDB collection:

| A | B | C |
|---|---|---|
| x | 1 | 2 |
| y | 0 | 2 |
| z | 2 | 2 |
| w | 3 | 2 |
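For what it's worth, the per-document behavior I'm after is what MongoDB's `$set` with upsert does. A pymongo sketch for two rows of the example, with the connection details and names again being placeholders:

```python
# Sketch of the equivalent per-document operation; the URI and the
# database/collection names are placeholders.
from pymongo import MongoClient

coll = MongoClient("mongodb://localhost:27017")["mydb"]["mycoll"]

# $set updates only B and C on the document with A == "z"; any other
# existing fields on that document are kept.
coll.update_one({"A": "z"}, {"$set": {"B": 2, "C": 2}}, upsert=True)

# upsert=True inserts a new document when no match exists, as for the
# new "w" row in the desired result.
coll.update_one({"A": "w"}, {"$set": {"B": 3, "C": 2}}, upsert=True)
```

I'm looking for a way to get this merge-and-upsert behavior directly from the DataFrame write, rather than looping over rows myself.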