My app server creates queries and inserts data into MongoDB based on live user actions. This should take precedence over Spark reading from MongoDB for data analysis, which runs concurrently. At present we get timeouts when trying to do live action-based queries while Spark read tasks are running.
How do I throttle the load that the mongo-spark-connector puts on MongoDB so that my live inserts can continue while Spark is reading?
UPDATE: Maybe a clue to controlling the load from Spark is knowing what that load actually scales with: the number of partitions, the number of cores allocated to the job, or something in the Spark or job configuration?
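To make the question concrete, here is a sketch of the kinds of settings I mean. The key names are my reading of the mongo-spark-connector (v2.x) and Spark configuration docs, so please correct me if these are not the right knobs; the specific values are placeholders:

```python
# Sketch of throttling knobs (assumed mongo-spark-connector v2.x key names;
# values are placeholders, not recommendations).
read_options = {
    # Smaller partitions -> more but lighter queries against Mongo,
    # instead of a few heavy scans.
    "spark.mongodb.input.partitioner": "MongoSamplePartitioner",
    "spark.mongodb.input.partitionerOptions.partitionSizeMB": "32",
    # Read from a secondary so analytics traffic stays off the primary
    # that serves live inserts (requires a replica set).
    "spark.mongodb.input.readPreference.name": "secondaryPreferred",
}

cluster_options = {
    # Cap the analytics job's parallelism so fewer concurrent
    # cursors hit Mongo at once.
    "spark.cores.max": "4",
    "spark.executor.cores": "2",
}
```

Is it any of these, or is the load driven by something else entirely?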