Permissive mode for Spark read with mongo-spark connector - nulls for corrupt fields

Can anyone please explain how to enable Spark's permissive mode in the mongo-spark connector, i.e. substitute null for corrupt fields?

Example
I have a Mongo collection with two records, each with the following structure:
Record 1
_id → String
num → Int32

Record 2
_id → String
num → String

I explicitly pass a schema to the mongo-spark read using the following code:

import com.mongodb.spark.MongoSpark
import org.apache.spark.sql.types.{IntegerType, StringType, StructType}

val schema: StructType = new StructType()
  .add("_id", StringType, true)
  .add("num", IntegerType, true)

val df = MongoSpark.read(spark)
  .option("uri", uri)
  .option("database", database)
  .option("collection", collection)
  .schema(schema)
  .load()

I get a MongoTypeConversionException because the num field in the second record is a string. I want the mongo-spark connector to replace corrupt fields with null so that the Spark read succeeds.
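For reference, one workaround I am considering is to declare num as a string in the schema and cast it after the read, since Spark's cast returns null for any value it cannot convert. This is only a sketch: it assumes the connector renders non-string BSON values (like Int32) as strings when the declared type is StringType, and lenientSchema / dfLenient are just illustrative names:

import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.{StringType, StructType}

// Read both variants of num as strings so neither record fails conversion.
val lenientSchema: StructType = new StructType()
  .add("_id", StringType, true)
  .add("num", StringType, true)

val dfLenient = MongoSpark.read(spark)
  .option("uri", uri)
  .option("database", database)
  .option("collection", collection)
  .schema(lenientSchema)
  .load()
  // cast("int") yields null for strings that are not valid integers,
  // which is the permissive-style behaviour I am after
  .withColumn("num", col("num").cast("int"))

Is there a connector option that does this directly, so I don't have to widen the schema by hand?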
