How to pass "allowDiskUse" parameter to aggregate pipeline in Pyspark

Gaurav_Gupta4 · October 26, 2021, 11:09am

Hi

How to pass “allowDiskUse” parameter to aggregate pipeline in Pyspark?

Robert_Walters · July 20, 2022, 1:19pm

You would specify it as part of the Spark conf. Within your PySpark job, set it as follows:

spark.mongodb.read.aggregation.allowDiskUse=true
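
As a rough sketch of how the property above might be set from PySpark code rather than on the command line: the snippet below puts it on the SparkSession builder. Only the `allowDiskUse` key comes from this thread. The connection URI, database, and collection values are placeholders, the other `spark.mongodb.read.*` keys and the `mongodb` format name assume V10 of the connector, and the helper names `build_session` and `read_collection` are made up for illustration. It also assumes the connector package is already on the Spark classpath.

```python
# Settings applied to the SparkSession. Only the allowDiskUse key is taken
# from the thread; the URI, database and collection are placeholder values.
MONGO_READ_CONF = {
    "spark.mongodb.read.connection.uri": "mongodb://localhost:27017",  # assumption
    "spark.mongodb.read.database": "test",                             # assumption
    "spark.mongodb.read.collection": "events",                         # assumption
    "spark.mongodb.read.aggregation.allowDiskUse": "true",
}


def build_session(app_name="allow-disk-use-example"):
    """Create (or reuse) a SparkSession with the settings above applied."""
    # Imported here so the settings dict can be inspected without Spark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in MONGO_READ_CONF.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


def read_collection(spark):
    """Load the configured collection as a DataFrame.

    Assumes V10 of the connector, where the data source name is "mongodb".
    """
    return spark.read.format("mongodb").load()
```

With a reachable MongoDB instance, `read_collection(build_session())` would return a DataFrame for the configured collection. The same key can also be passed with `--conf` on `spark-submit` instead of being set in code.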

Note that in V10 of the Spark connector, we have a ticket to address an issue with this setting: https://jira.mongodb.org/browse/SPARK-355. If you are reading this response, be sure to check the status of the ticket before using the configuration property.