
Use your local SparkSession's read method to create a DataFrame representing a collection.

Note

A DataFrame is represented by a Dataset of Rows. It is an alias of Dataset[Row].
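The alias relationship in the note can be checked with a small local sketch. It builds an in-memory DataFrame, so no MongoDB connection is involved; the object name `DataFrameAliasSketch`, the helper `sample`, and the sample rows are illustrative assumptions, not part of the connector.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, Row, SparkSession}

object DataFrameAliasSketch {
  // Hypothetical helper: builds a tiny in-memory DataFrame (no MongoDB involved).
  def sample(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq(("Bilbo Baggins", 50), ("Gandalf", 1000)).toDF("name", "age")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("dataframe-alias-sketch")
      .getOrCreate()

    val df: DataFrame = sample(spark)
    // Compiles without a cast because DataFrame is a type alias for Dataset[Row].
    val rows: Dataset[Row] = df
    rows.show()

    spark.stop()
  }
}
```

Because the two names refer to the same type, any method that accepts a Dataset[Row] also accepts a DataFrame returned by the connector.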

The following example loads the collection specified in the SparkConf:

val df = spark.read.format("mongodb").load() // Uses the SparkConf for configuration

To specify a different collection, database, and other read configuration settings, use the option method:

val df = spark.read
  .format("mongodb")
  .option("database", "<example-database>")
  .option("collection", "<example-collection>")
  .load()
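For context, here is a minimal sketch of a complete program built around the option method. It assumes a MongoDB deployment reachable at `mongodb://127.0.0.1/`, a database named `test`, and a collection named `characters`; these values, along with the object name `ReadWithOptionsSketch` and the helper `readCollection`, are placeholders for illustration. The sketch also assumes the connector's `spark.mongodb.read.connection.uri` setting is used to supply the connection string.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object ReadWithOptionsSketch {
  // Assumed placeholder URI; replace with your own deployment's connection string.
  val connectionUri: String = "mongodb://127.0.0.1/"

  // Hypothetical helper: reads one collection, overriding whatever
  // database and collection are set in the SparkConf.
  def readCollection(spark: SparkSession, database: String, collection: String): DataFrame =
    spark.read
      .format("mongodb")
      .option("database", database)
      .option("collection", collection)
      .load()

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("read-with-options-sketch")
      .config("spark.mongodb.read.connection.uri", connectionUri)
      .getOrCreate()

    // "test" and "characters" are assumed names for illustration.
    val df = readCollection(spark, "test", "characters")
    df.show()

    spark.stop()
  }
}
```

Options passed on the reader apply only to that read, so the same SparkSession can load several collections by calling the helper with different arguments.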

Schema Inference

When you load a Dataset or DataFrame without a schema, Spark samples the records to infer the schema of the collection.

Consider a collection named characters:

{ "_id" : ObjectId("585024d558bef808ed84fc3e"), "name" : "Bilbo Baggins", "age" : 50 }
{ "_id" : ObjectId("585024d558bef808ed84fc3f"), "name" : "Gandalf", "age" : 1000 }
{ "_id" : ObjectId("585024d558bef808ed84fc40"), "name" : "Thorin", "age" : 195 }
{ "_id" : ObjectId("585024d558bef808ed84fc41"), "name" : "Balin", "age" : 178 }
{ "_id" : ObjectId("585024d558bef808ed84fc42"), "name" : "Kíli", "age" : 77 }
{ "_id" : ObjectId("585024d558bef808ed84fc43"), "name" : "Dwalin", "age" : 169 }
{ "_id" : ObjectId("585024d558bef808ed84fc44"), "name" : "Óin", "age" : 167 }
{ "_id" : ObjectId("585024d558bef808ed84fc45"), "name" : "Glóin", "age" : 158 }
{ "_id" : ObjectId("585024d558bef808ed84fc46"), "name" : "Fíli", "age" : 82 }
{ "_id" : ObjectId("585024d558bef808ed84fc47"), "name" : "Bombur" }

The following operation loads data from the MongoDB collection specified in SparkConf and infers the schema:

val df = spark.read.format("mongodb").load() // Uses the SparkConf for configuration
df.printSchema() // Prints DataFrame schema

df.printSchema() outputs the following schema to the console:

root
 |-- _id: struct (nullable = true)
 |    |-- oid: string (nullable = true)
 |-- age: integer (nullable = true)
 |-- name: string (nullable = true)
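When sampling is not wanted, one common alternative is to pass an explicit schema to the reader with Spark's standard `schema` method. The following is a sketch under stated assumptions, not the connector's prescribed approach: the URI, the database name `test`, and the names `ExplicitSchemaSketch`, `charactersSchema`, and `loadCharacters` are illustrative. The schema lists only `name` and `age`, so `_id` is left out of the resulting DataFrame.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

object ExplicitSchemaSketch {
  // Explicit schema for the characters collection; age is nullable
  // because some documents (such as Bombur's) have no age field.
  val charactersSchema: StructType = StructType(Seq(
    StructField("name", StringType, nullable = true),
    StructField("age", IntegerType, nullable = true)
  ))

  // Hypothetical helper: loads the collection with the schema above
  // instead of letting Spark infer one.
  def loadCharacters(spark: SparkSession): DataFrame =
    spark.read
      .format("mongodb")
      .option("database", "test")          // assumed database name
      .option("collection", "characters")
      .schema(charactersSchema)
      .load()

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("explicit-schema-sketch")
      .config("spark.mongodb.read.connection.uri", "mongodb://127.0.0.1/") // assumed URI
      .getOrCreate()

    val df = loadCharacters(spark)
    df.printSchema()
    // Lists the characters whose documents have no age value.
    df.filter(col("age").isNull).show()

    spark.stop()
  }
}
```

Declaring the schema up front keeps column types stable across runs, at the cost of having to update the definition when the collection gains new fields.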
© 2022 MongoDB, Inc.