How to pass "allowDiskUse" parameter to aggregate pipeline in Pyspark

Hi

How to pass “allowDiskUse” parameter to aggregate pipeline in Pyspark?

You can specify it as part of the Spark conf in your PySpark session, as follows:

spark.mongodb.read.aggregation.allowDiskUse=true

Note that for V10 of the Spark connector, we have a ticket to address an issue with this setting: https://jira.mongodb.org/browse/SPARK-355. If you are reading this response, be sure to check the status of the ticket before relying on this configuration property.
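For anyone who wants to see where the property goes, here is a minimal sketch of setting it on a PySpark session. It assumes the 10.x connector is already on the classpath (for example via spark.jars.packages). The connection URI, database and collection names are placeholders, and build_session and load_collection are hypothetical helper names, not part of the connector.

```python
# Read-side options using the 10.x connector's "spark.mongodb.read.*" naming.
# The URI, database and collection values are placeholders for illustration.
MONGO_READ_CONF = {
    "spark.mongodb.read.connection.uri": "mongodb://localhost:27017",
    "spark.mongodb.read.database": "test",
    "spark.mongodb.read.collection": "events",
    # The setting discussed in this thread.
    "spark.mongodb.read.aggregation.allowDiskUse": "true",
}


def build_session(conf=MONGO_READ_CONF, app_name="allow-disk-use-example"):
    """Hypothetical helper: create a SparkSession with the options above."""
    # Imported here so the dict above can be inspected without pyspark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


def load_collection(spark):
    """Hypothetical helper: read the configured collection as a DataFrame."""
    # "mongodb" is the data source name used by the 10.x connector.
    return spark.read.format("mongodb").load()
```

Calling load_collection(build_session()) would then read with the setting applied, subject to the ticket mentioned above. The same property can also be passed on the command line with --conf when launching pyspark or spark-submit.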