October 01, 2018
We are delighted to announce that the MongoDB Connector for Apache Spark is officially certified by Cloudera. MongoDB users may already integrate Spark and MongoDB using the MongoDB Connector for Apache Spark, a fully supported package maintained by MongoDB. This connector allows you to perform advanced analytics and machine learning against the data sets that reside in MongoDB. Users of Cloudera may use this same connector to run Spark jobs from their managed clusters against both MongoDB Atlas and self-managed MongoDB instances.
Apache Spark and MongoDB are a potent analytics combination. MongoDB’s flexible schema, secondary indexing, aggregation pipelines, and workload isolation make it easy to consolidate data from multiple sources into a single database and process it efficiently, with zero impact on other business-critical database operations. Running Spark jobs directly against MongoDB also reduces operational overhead: it eliminates the need to ETL duplicate data to a separate cluster of HDFS servers, which greatly simplifies your architecture and increases the speed at which analytics can be executed.
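To make this concrete, here is a minimal PySpark sketch of reading a MongoDB collection into a DataFrame through the connector, with an aggregation pipeline handed to MongoDB so that filtering happens in the database. It assumes the 2.x line of the connector, where the data source is named `com.mongodb.spark.sql.DefaultSource` (newer 10.x releases use `mongodb` and different option keys). The URI, the `sales.orders` namespace, the `status` and `region` fields, and the helper function names are all hypothetical and chosen only for illustration.

```python
import json

# Data source name for the 2.x connector line (an assumption about your setup;
# 10.x releases of the connector use the short name "mongodb" instead).
MONGO_FORMAT = "com.mongodb.spark.sql.DefaultSource"


def mongo_read_options(uri, database, collection, pipeline=None):
    """Build the reader options for one MongoDB collection."""
    options = {"uri": uri, "database": database, "collection": collection}
    if pipeline:
        # The pipeline is passed as a JSON string and executed by MongoDB,
        # so only the documents it returns are shipped to Spark.
        options["pipeline"] = json.dumps(pipeline)
    return options


def load_collection(spark, options):
    """Return a DataFrame backed by the collection described in `options`."""
    return spark.read.format(MONGO_FORMAT).options(**options).load()


def run_example():
    """Illustrative job; needs pyspark and the connector JAR on the classpath."""
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("mongo-spark-sketch").getOrCreate()
    options = mongo_read_options(
        "mongodb://localhost:27017",  # hypothetical; an Atlas URI also works
        "sales",                      # hypothetical database
        "orders",                     # hypothetical collection
        pipeline=[{"$match": {"status": "shipped"}}],
    )
    # Count shipped orders per region without copying the data elsewhere first.
    load_collection(spark, options).groupBy("region").count().show()
```

Calling `run_example()` from a job submitted with the connector package available would run the aggregation in place. The same options work whether the URI points at MongoDB Atlas or at a self-managed deployment.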
MongoDB Atlas, our on-demand, fully managed cloud database service for MongoDB, makes it even easier to run sophisticated analytics processing by eliminating the operational overhead of managing database clusters directly. By combining Cloudera and MongoDB, Atlas users can take advantage of a fully managed analytics platform, freeing engineering resources to focus on their core business domain and deliver actionable insights quickly.