Hello,
I'm working on an Ubuntu machine where everything is installed, and I normally use Hadoop and PySpark without issues. I'm trying to write a Spark DataFrame to MongoDB, but I keep getting an error. I followed all the necessary steps, but still no luck. The error is about the Mongo config DefaultMongoClientFactory; according to the connector documentation this value is optional, and I even tried setting the default value manually, but that didn't help either. Please find the steps with commands/output below:
- mongo: v3.2.10
- connector: mongo-spark-connector_2.12-10.2.0.jar
- spark: 3.1.3
- dependencies: bson-3.2.0.jar, mongodb-driver-3.2.0.jar, mongodb-driver-core-3.2.0.jar
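
For completeness, the session is created roughly like this (a simplified sketch; the package coordinates and the Mongo settings mirror the config dump below):

from pyspark.sql import SparkSession

# Simplified sketch of how the session is built; values match the config dump below
spark = (
    SparkSession.builder
    .appName("MongoDB")
    .master("local[*]")
    .config("spark.jars.packages", "org.mongodb.spark:mongo-spark-connector_2.12-10.2.0")
    .config("spark.mongodb.connection.uri", "mongodb://localhost:27017/")
    .config("spark.mongodb.database", "twitter_db")
    .config("spark.mongodb.collection", "tweets")
    .getOrCreate()
)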
print(spark.sparkContext.getConf().toDebugString())
spark.app.id=local-1697738922184
spark.app.name=MongoDB
spark.app.startTime=1697738921889
spark.driver.host=10.0.2.15
spark.driver.port=43157
spark.executor.id=driver
spark.jars.packages=org.mongodb.spark:mongo-spark-connector_2.12-10.2.0
spark.master=local[*]
spark.mongodb.collection=tweets
spark.mongodb.connection.uri=mongodb://localhost:27017/
spark.mongodb.database=twitter_db
spark.rdd.compress=True
spark.serializer.objectStreamReset=100
spark.sql.catalogImplementation=hive
spark.sql.warehouse.dir=file:/home/hduser/Desktop/CA/spark-warehouse
spark.submit.deployMode=client
spark.submit.pyFiles=
spark.ui.showConsoleProgress=true
data.write.format("mongodb").mode("overwrite").save()
Py4JJavaError: An error occurred while calling o208.save.
: com.mongodb.spark.sql.connector.exceptions.ConfigException: Invalid value com.mongodb.spark.sql.connector.connection.DefaultMongoClientFactory for configuration mongoClientFactory
at com.mongodb.spark.sql.connector.config.ClassHelper.createInstance(ClassHelper.java:79)
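
When I say I tried to manually write the default value, I mean passing the factory class explicitly on the write, roughly like this (simplified sketch; the option key mongoClientFactory and the class name are taken straight from the error above):

# Simplified sketch of the workaround I tried: passing the default factory explicitly
(data.write
    .format("mongodb")
    .mode("overwrite")
    .option("mongoClientFactory",
            "com.mongodb.spark.sql.connector.connection.DefaultMongoClientFactory")
    .save())

It fails with the same ConfigException either way.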