Quick Start: Java and MongoDB - Creating Documents

Maxime Beugnet


In a previous blog post, I showed you how to connect MongoDB and Java and perform a very simple query. In this blog post, I will show you how to insert documents into MongoDB.

Getting Set Up

I will use the same repository as I used in the previous blog post. If you don't have a copy of it yet, you can clone it:

git clone https://github.com/mongodb-developer/java-quick-start

If you already cloned this repository, just make sure you are using the latest version:

git pull

In the previous blog post, we created the classes HelloMongoDB and Connection. This time we will work on the Create class.

If you didn't set up your free cluster on MongoDB Atlas, now is a great time to do so. You have all the instructions in this blog post.

Checking the Collection and Data Model

In the sample dataset, you can find the database sample_training, which contains a collection grades. Each document in this collection represents a student's grades for a particular class.

Here is the JSON representation of a document in the Mongo Shell.

MongoDB Enterprise Cluster0-shard-0:PRIMARY> db.grades.findOne({student_id: 0, class_id: 339})
{
	"_id" : ObjectId("56d5f7eb604eb380b0d8d8ce"),
	"student_id" : 0,
	"scores" : [
		{
			"type" : "exam",
			"score" : 78.40446309504266
		},
		{
			"type" : "quiz",
			"score" : 73.36224783231339
		},
		{
			"type" : "homework",
			"score" : 46.980982486720535
		},
		{
			"type" : "homework",
			"score" : 76.67556138656222
		}
	],
	"class_id" : 339
}

And here is the extended JSON representation of the same student. You can retrieve it in MongoDB Compass if you want.

Extended JSON is the human-readable version of a BSON document without loss of type information. You can read more about the Java driver and BSON here.

{
    "_id": {
        "$oid": "56d5f7eb604eb380b0d8d8ce"
    },
    "student_id": {
        "$numberDouble": "0"
    },
    "scores": [{
        "type": "exam",
        "score": {
            "$numberDouble": "78.40446309504266"
        }
    }, {
        "type": "quiz",
        "score": {
            "$numberDouble": "73.36224783231339"
        }
    }, {
        "type": "homework",
        "score": {
            "$numberDouble": "46.980982486720535"
        }
    }, {
        "type": "homework",
        "score": {
            "$numberDouble": "76.67556138656222"
        }
    }],
    "class_id": {
        "$numberDouble": "339"
    }
}

As you can see, MongoDB stores BSON documents and, for each key-value pair, the BSON contains the key and the value along with its type. This is how MongoDB knows that class_id is actually a double and not an integer, which is not explicit in the Mongo Shell representation of this document.

We have 10,000 students (student_id from 0 to 9999) already in this collection, and each of them took 10 different classes, which adds up to 100,000 documents in this collection. Let's say a new student (student_id 10,000) just arrived at this university and received a bunch of (random) grades in their first class. Let's insert this new student using Java.

In this university, the class_id varies from 0 to 500 so I can use any random value between 0 and 500.
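Here is a minimal sketch of picking such a value, assuming the range is inclusive on both ends. The class RandomClassId and the helper randomClassId are names made up for this illustration; only java.util.Random is used. The value is returned as a double to match the type of class_id in the existing documents.

```java
import java.util.Random;

class RandomClassId {

    private static final Random rand = new Random();

    // Hypothetical helper: returns a whole number between 0 and 500 (both inclusive) as a double.
    static double randomClassId() {
        // nextInt(501) yields an int between 0 and 500; the bound 501 itself is exclusive.
        return rand.nextInt(501);
    }

    public static void main(String[] args) {
        System.out.println("class_id = " + randomClassId());
    }
}
```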

Connecting to a Specific Collection

Firstly, we need to set up our Create class and access this sample_training.grades collection.

package com.mongodb.quickstart;

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoDatabase;
import org.bson.Document;

import java.util.logging.Level;
import java.util.logging.Logger;

public class Create {

    public static void main(String[] args) {
        Logger.getLogger("org.mongodb.driver").setLevel(Level.WARNING);
        try (MongoClient mongoClient = MongoClients.create(System.getProperty("mongodb.uri"))) {

            MongoDatabase sampleTrainingDB = mongoClient.getDatabase("sample_training");
            MongoCollection<Document> gradesCollection = sampleTrainingDB.getCollection("grades");

        }
    }
}

Create a BSON Document

Secondly, we need to represent this new student in Java using the Document class.

Random rand = new Random();
Document student = new Document("_id", new ObjectId());
student.append("student_id", 10000d)
       .append("class_id", 1d)
       .append("scores", asList(new Document("type", "exam").append("score", rand.nextDouble() * 100),
                                new Document("type", "quiz").append("score", rand.nextDouble() * 100),
                                new Document("type", "homework").append("score", rand.nextDouble() * 100),
                                new Document("type", "homework").append("score", rand.nextDouble() * 100)));

As you can see, we reproduced the data model of the existing documents in this collection: we made sure that student_id, class_id and score are all doubles.

Also, the Java driver would have generated the _id field with an ObjectId for us if we hadn't explicitly created one here, but it's good practice to set the _id ourselves. This won't change our life right now, but it makes more sense when we directly manipulate POJOs and want to create a clean REST API. I will show you how to do this in a future blog post.
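Coming back to the d suffix on student_id and class_id: the driver chooses the BSON type from the Java runtime type of each value, so 10000 would be stored as a 32-bit integer while 10000d is stored as a double. The sketch below shows which type Java boxes each literal into. To keep it runnable without the driver on the classpath, it uses a plain Map<String, Object> as a stand-in for Document (which is itself a Map<String, Object>); the class name LiteralTypes and its helpers are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LiteralTypes {

    // Returns the simple class name of the value stored under the given key.
    static String typeOf(Map<String, Object> doc, String key) {
        return doc.get(key).getClass().getSimpleName();
    }

    // Builds a stand-in document with the same number written with and without the d suffix.
    static Map<String, Object> sample() {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("with_suffix", 10000d);   // boxed as java.lang.Double
        doc.put("without_suffix", 10000); // boxed as java.lang.Integer
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = sample();
        System.out.println(typeOf(doc, "with_suffix"));    // prints "Double"
        System.out.println(typeOf(doc, "without_suffix")); // prints "Integer"
    }
}
```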

Insert Operation

Finally, we can insert this document.

gradesCollection.insertOne(student);

Final Code to Insert One Document

Here is the final Create class to insert one document in MongoDB with all the details I mentioned above.

package com.mongodb.quickstart;

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoDatabase;
import org.bson.Document;
import org.bson.types.ObjectId;

import java.util.Random;
import java.util.logging.Level;
import java.util.logging.Logger;

import static java.util.Arrays.asList;

public class Create {

    public static void main(String[] args) {
        Logger.getLogger("org.mongodb.driver").setLevel(Level.WARNING);
        try (MongoClient mongoClient = MongoClients.create(System.getProperty("mongodb.uri"))) {

            MongoDatabase sampleTrainingDB = mongoClient.getDatabase("sample_training");
            MongoCollection<Document> gradesCollection = sampleTrainingDB.getCollection("grades");

            Random rand = new Random();
            Document student = new Document("_id", new ObjectId());
            student.append("student_id", 10000d)
                   .append("class_id", 1d)
                   .append("scores", asList(new Document("type", "exam").append("score", rand.nextDouble() * 100),
                                            new Document("type", "quiz").append("score", rand.nextDouble() * 100),
                                            new Document("type", "homework").append("score", rand.nextDouble() * 100),
                                            new Document("type", "homework").append("score", rand.nextDouble() * 100)));

            gradesCollection.insertOne(student);
        }
    }
}

You can execute this class with the following Maven command line in the root directory or using your IDE (see the previous post for details). Do not forget the double quotes around the MongoDB URI to avoid surprises.

mvn compile exec:java -Dexec.mainClass="com.mongodb.quickstart.Create" -Dmongodb.uri="mongodb+srv://USERNAME:PASSWORD@cluster0-abcde.mongodb.net/test?w=majority"
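If the -Dmongodb.uri option is left out, System.getProperty("mongodb.uri") returns null. A small guard can turn that into a clearer message before any connection is attempted. The sketch below is one possible way to do it; the class UriProperty and the helper requireUri are hypothetical and not part of the repository.

```java
class UriProperty {

    // Hypothetical helper: reads the mongodb.uri system property or fails with a clear message.
    static String requireUri() {
        String uri = System.getProperty("mongodb.uri");
        if (uri == null || uri.isEmpty()) {
            throw new IllegalStateException(
                    "Missing system property: pass -Dmongodb.uri=\"...\" on the command line.");
        }
        return uri;
    }

    public static void main(String[] args) {
        try {
            // Only the length is printed so that credentials never end up in the console.
            System.out.println("mongodb.uri is set (" + requireUri().length() + " characters).");
        } catch (IllegalStateException e) {
            System.err.println(e.getMessage());
        }
    }
}
```

The returned value could then be passed to MongoClients.create in place of the direct System.getProperty call.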

And here is the document I extracted from MongoDB Compass.

{
    "_id": {
        "$oid": "5d97c375ded5651ea3462d0f"
    },
    "student_id": {
        "$numberDouble": "10000"
    },
    "class_id": {
        "$numberDouble": "1"
    },
    "scores": [{
        "type": "exam",
        "score": {
            "$numberDouble": "4.615256396625178"
        }
    }, {
        "type": "quiz",
        "score": {
            "$numberDouble": "73.06173415145801"
        }
    }, {
        "type": "homework",
        "score": {
            "$numberDouble": "19.378205578990727"
        }
    }, {
        "type": "homework",
        "score": {
            "$numberDouble": "82.3089189278531"
        }
    }]
}

Note that the order of the fields is different from the initial document with "student_id": 0.

We could get exactly the same order if we wanted to by creating the document like this.

Random rand = new Random();
Document student = new Document("_id", new ObjectId());
student.append("student_id", 10000d)
       .append("scores", asList(new Document("type", "exam").append("score", rand.nextDouble() * 100),
                                new Document("type", "quiz").append("score", rand.nextDouble() * 100),
                                new Document("type", "homework").append("score", rand.nextDouble() * 100),
                                new Document("type", "homework").append("score", rand.nextDouble() * 100)))
       .append("class_id", 1d);

But if you do things correctly, this should not have any impact on your code and logic, as fields in a JSON document are not ordered.

I'm quoting json.org for this.

An object is an unordered set of name/value pairs.
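To illustrate with plain Java, the sketch below uses two LinkedHashMap instances as stand-ins for documents (an assumption made so that it runs without the driver; the class name FieldOrder and its helpers are hypothetical). Both hold the same fields in a different order. Reading by key gives the same values and Map.equals ignores insertion order, so code written against keys behaves the same for both.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FieldOrder {

    // Stand-in document with student_id inserted before class_id.
    static Map<String, Object> studentIdFirst() {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("student_id", 10000d);
        doc.put("class_id", 1d);
        return doc;
    }

    // Same fields and values, but class_id inserted before student_id.
    static Map<String, Object> classIdFirst() {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("class_id", 1d);
        doc.put("student_id", 10000d);
        return doc;
    }

    // Returns the keys in insertion order.
    static List<String> keys(Map<String, Object> doc) {
        return new ArrayList<>(doc.keySet());
    }

    public static void main(String[] args) {
        Map<String, Object> a = studentIdFirst();
        Map<String, Object> b = classIdFirst();
        System.out.println(keys(a));     // prints "[student_id, class_id]"
        System.out.println(keys(b));     // prints "[class_id, student_id]"
        System.out.println(a.equals(b)); // prints "true": same fields, order ignored
        System.out.println(a.get("class_id").equals(b.get("class_id"))); // prints "true"
    }
}
```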

Bulk Inserts

Now that we know how to create one document, let's learn how to insert many documents.

Of course, we could just wrap the previous insert operation in a for loop. If we looped 10 times on this method, we would send 10 insert commands to the cluster and expect 10 insert acknowledgments. As you can imagine, this would not be very efficient, as it would generate a lot more TCP communication than necessary.

Instead, we want to wrap our 10 documents and send them in one call to the cluster and we want to receive only one insert acknowledgment for the entire list.

Let's refactor the code. First, let's make the random generator a private static final field.

private static final Random rand = new Random();

Let's make a grade factory method.

private static Document generateNewGrade(double studentId, double classId) {
    List<Document> scores = asList(new Document("type", "exam").append("score", rand.nextDouble() * 100),
                                   new Document("type", "quiz").append("score", rand.nextDouble() * 100),
                                   new Document("type", "homework").append("score", rand.nextDouble() * 100),
                                   new Document("type", "homework").append("score", rand.nextDouble() * 100));
    return new Document("_id", new ObjectId()).append("student_id", studentId)
                                              .append("class_id", classId)
                                              .append("scores", scores);
}

And now we can use this to insert 10 documents all at once.

List<Document> grades = new ArrayList<>();
for (int classId = 1; classId <= 10; classId++) {
    grades.add(generateNewGrade(10001d, classId));
}

gradesCollection.insertMany(grades, new InsertManyOptions().ordered(false));

As you can see, we are now wrapping our grade documents into a list and we are sending this list in a single call with the insertMany method.

By default, the insertMany method will insert the documents in order and stop if an error occurs during the process. For example, if you try to insert a new document with the same _id as an existing document, you would get a duplicate key error, which insertMany reports through a MongoBulkWriteException.

Therefore, with an ordered insertMany, the last documents of the list would not be inserted and the insertion process would stop and return the appropriate exception as soon as the error occurs.

This is not the behaviour we want here, because all the grades are completely independent of one another. If one of them fails, we still want to process all the other grades and only then handle the exception for the ones that failed.

This is why you see the second parameter, new InsertManyOptions().ordered(false): the ordered option is true by default.
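The difference between the two modes can be sketched without a cluster. The simulation below is not the driver's implementation: a HashSet of _id values stands in for the collection (an assumption), and the class BulkSimulation and its insertMany method are hypothetical names. It only mimics the behaviour described above: an ordered run stops at the first duplicate _id, while an unordered run attempts every document and reports the failures at the end.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class BulkSimulation {

    // Result of a simulated bulk insert: which ids were stored and which were rejected.
    static final class Result {
        final List<String> inserted = new ArrayList<>();
        final List<String> failed = new ArrayList<>();
    }

    // Tries to add each id to the fake collection. A duplicate id counts as a write error.
    static Result insertMany(Set<String> collection, List<String> ids, boolean ordered) {
        Result result = new Result();
        for (String id : ids) {
            if (collection.add(id)) {
                result.inserted.add(id);
            } else {
                result.failed.add(id);
                if (ordered) {
                    break; // ordered: stop at the first error, later ids are never attempted
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The second "b" plays the role of a document whose _id already exists.
        List<String> ids = Arrays.asList("a", "b", "b", "c");

        Result ordered = insertMany(new HashSet<>(), ids, true);
        System.out.println(ordered.inserted + " " + ordered.failed);     // prints "[a, b] [b]"

        Result unordered = insertMany(new HashSet<>(), ids, false);
        System.out.println(unordered.inserted + " " + unordered.failed); // prints "[a, b, c] [b]"
    }
}
```

In the ordered run, "c" is never inserted even though nothing is wrong with it, which is exactly what we want to avoid for independent grades.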

The Final Code

Let's refactor the code a bit and here is the final Create class.

package com.mongodb.quickstart;

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoDatabase;
import com.mongodb.client.model.InsertManyOptions;
import org.bson.Document;
import org.bson.types.ObjectId;

import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.logging.Level;
import java.util.logging.Logger;

import static java.util.Arrays.asList;

public class Create {

    private static final Random rand = new Random();

    public static void main(String[] args) {
        Logger.getLogger("org.mongodb.driver").setLevel(Level.WARNING);
        try (MongoClient mongoClient = MongoClients.create(System.getProperty("mongodb.uri"))) {

            MongoDatabase sampleTrainingDB = mongoClient.getDatabase("sample_training");
            MongoCollection<Document> gradesCollection = sampleTrainingDB.getCollection("grades");

            insertOneDocument(gradesCollection);
            insertManyDocuments(gradesCollection);
        }
    }

    private static void insertOneDocument(MongoCollection<Document> gradesCollection) {
        gradesCollection.insertOne(generateNewGrade(10000d, 1d));
        System.out.println("One grade inserted for studentId 10000.");
    }

    private static void insertManyDocuments(MongoCollection<Document> gradesCollection) {
        List<Document> grades = new ArrayList<>();
        for (int classId = 1; classId <= 10; classId++) {
            grades.add(generateNewGrade(10001d, classId));
        }

        gradesCollection.insertMany(grades, new InsertManyOptions().ordered(false));
        System.out.println("Ten grades inserted for studentId 10001.");
    }

    private static Document generateNewGrade(double studentId, double classId) {
        List<Document> scores = asList(new Document("type", "exam").append("score", rand.nextDouble() * 100),
                                       new Document("type", "quiz").append("score", rand.nextDouble() * 100),
                                       new Document("type", "homework").append("score", rand.nextDouble() * 100),
                                       new Document("type", "homework").append("score", rand.nextDouble() * 100));
        return new Document("_id", new ObjectId()).append("student_id", studentId)
                                                  .append("class_id", classId)
                                                  .append("scores", scores);
    }
}

As a reminder, every write operation (create, replace, update, delete) performed on a SINGLE document is ACID in MongoDB. This means insertMany is not ACID by default but, good news, since MongoDB 4.0, we can wrap this call in a multi-document ACID transaction to make it fully ACID. I explain this in more detail in this blog post.

Wrapping Up

We have seen how to create MongoDB BSON documents in Java and insert them into MongoDB one by one with the insertOne method, or several at a time with the insertMany method for bulk inserts.

The MongoDB Java Driver also supports a direct mapping between POJOs and BSON documents using codecs. I will show you how to do this in a future blog post, but using the Document class like we did in this blog post is the best way to work with data that does not follow a strict data model. You can only leverage codecs and POJOs when you have a fixed data model.

Thanks for reading my blog post, and I will see you in the next one to discuss how to read documents.

Previous articles in this Quick Start Java and MongoDB series: