I am trying to connect my MongoDB instance to Spark in Databricks using the MongoDB Spark Connector v10.1.0. I can connect to MongoDB through a MongoClient instance with the same connection string that I pass to Spark through the connection.uri option. The connection string follows the convention below.
f"mongodb://{username}:{password}@{host}/?replicaSet={replicaSet}&readPreference={readPreference}&authSource={authSource}&tls=true&tlsCAFile={tlsCAFile path}&tlsCertificateKeyFile={tlsCertificateKeyFile path}"
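For reference, here is a minimal sketch of how a string in that shape could be assembled. The helper name build_mongo_uri and every value passed to it are hypothetical placeholders, not my real settings; the only thing it adds to the f-string above is percent-encoding of the credentials, which matters if the password contains characters such as '@', ':' or '/'.

```python
from urllib.parse import quote_plus, urlencode


def build_mongo_uri(username, password, host, options):
    """Assemble a mongodb:// URI in the convention shown above (hypothetical helper)."""
    # Percent-encode the credentials so reserved characters do not break the URI.
    credentials = f"{quote_plus(username)}:{quote_plus(password)}"
    # Encode option values but leave '/' alone so file paths stay readable.
    query = urlencode(options, safe="/")
    return f"mongodb://{credentials}@{host}/?{query}"


# All values below are made-up placeholders.
connection_string = build_mongo_uri(
    "user",
    "p@ss:w/rd",
    "host1:27017",
    {
        "replicaSet": "rs0",
        "readPreference": "secondaryPreferred",
        "authSource": "admin",
        "tls": "true",
        "tlsCAFile": "/path/to/ca.pem",
        "tlsCertificateKeyFile": "/path/to/client.pem",
    },
)
```

The resulting connection_string is what gets handed both to MongoClient and to the connection.uri option.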
Here is my Spark configuration and the way I am trying to connect:
self.spark = SparkSession.builder \
    .config("spark.jars.packages", "org.mongodb.spark:mongo-spark-connector:10.1.0") \
    .appName("APP NAME") \
    .getOrCreate()

df = self.spark.read.format("mongodb") \
    .option("connection.uri", connection_string) \
    .option("database", <DB NAME>) \
    .option("collection", <COLLECTION NAME>) \
    .load()
Here is the error I am running into:
exception={com.mongodb.MongoSocketWriteException: Exception sending message}, caused by {javax.net.ssl.SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target}, caused by {sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target}, caused by {sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target}}]