Aggregation

On this page

  • Overview
  • Aggregation and Find Operations Compared
  • Useful References
  • Runnable Examples
  • Base Setup
  • Basic Aggregation Example
  • Explain Aggregation Example
  • Aggregation Expression Example

In this guide, you can learn how to use aggregation operations in the MongoDB Java driver.

Aggregation operations process data in your MongoDB collections and return computed results. MongoDB's aggregation pipeline, part of the Query API, is modeled on the concept of data processing pipelines. Documents enter a multi-stage pipeline that transforms them into an aggregated result.

Another way to think of aggregation is as a car factory. Within the car factory is an assembly line, along which are assembly stations with specialized tools, such as drills and welders, that each do a specific job. Raw parts enter the factory and are transformed and assembled into a finished product.

The aggregation pipeline is the assembly line, aggregation stages are the assembly stations, and operator expressions are the specialized tools.

Using find operations, you can:

  • select what documents to return

  • select what fields to return

  • sort the results

Using aggregation operations, you can:

  • perform all find operations

  • rename fields

  • calculate fields

  • summarize data

  • group values

Aggregation operations have some limitations you must keep in mind:

  • Returned documents must not violate the BSON document size limit of 16 megabytes.

  • Pipeline stages have a memory limit of 100 megabytes by default. If required, you can exceed this limit by calling the allowDiskUse() method on the AggregateIterable that aggregate() returns.

    Important

    $graphLookup exception

    The $graphLookup stage has a strict memory limit of 100 megabytes and will ignore allowDiskUse.

Create a new Java file called AggTour.java and include the following import statements:

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoDatabase;
import com.mongodb.ExplainVerbosity;
import com.mongodb.client.model.Accumulators;
import com.mongodb.client.model.Aggregates;
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.Projections;
import org.bson.Document;
import java.util.Arrays;
import java.util.List;

public class AggTour {

    public static void main(String[] args) {
        // Replace the uri string with your MongoDB deployment's connection string
        String uri = "<connection string uri>";

        MongoClient mongoClient = MongoClients.create(uri);
        MongoDatabase database = mongoClient.getDatabase("aggregation");
        MongoCollection<Document> collection = database.getCollection("restaurants");

        // aggregation here
    }
}

Tip

See also:

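For information on connecting to MongoDB, see the Connection Guide.

After connecting, insert the following sample documents into the restaurants collection. The examples in this guide run against this data: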

collection.insertMany(Arrays.asList(
new Document("name", "Sun Bakery Trattoria").append("contact", new Document().append("phone", "386-555-0189").append("email", "SunBakeryTrattoria@example.org").append("location", Arrays.asList(-74.0056649, 40.7452371))).append("stars", 4).append("categories", Arrays.asList("Pizza", "Pasta", "Italian", "Coffee", "Sandwiches")),
new Document("name", "Blue Bagels Grill").append("contact", new Document().append("phone", "786-555-0102").append("email", "BlueBagelsGrill@example.com").append("location", Arrays.asList(-73.92506, 40.8275556))).append("stars", 3).append("categories", Arrays.asList("Bagels", "Cookies", "Sandwiches")),
new Document("name", "XYZ Bagels Restaurant").append("contact", new Document().append("phone", "435-555-0190").append("email", "XYZBagelsRestaurant@example.net").append("location", Arrays.asList(-74.0707363, 40.59321569999999))).append("stars", 4).append("categories", Arrays.asList("Bagels", "Sandwiches", "Coffee")),
new Document("name", "Hot Bakery Cafe").append("contact", new Document().append("phone", "264-555-0171").append("email", "HotBakeryCafe@example.net").append("location", Arrays.asList(-73.96485799999999, 40.761899))).append("stars", 4).append("categories", Arrays.asList("Bakery", "Cafe", "Coffee", "Dessert")),
new Document("name", "Green Feast Pizzeria").append("contact", new Document().append("phone", "840-555-0102").append("email", "GreenFeastPizzeria@example.com").append("location", Arrays.asList(-74.1220973, 40.6129407))).append("stars", 2).append("categories", Arrays.asList("Pizza", "Italian")),
new Document("name", "ZZZ Pasta Buffet").append("contact", new Document().append("phone", "769-555-0152").append("email", "ZZZPastaBuffet@example.com").append("location", Arrays.asList(-73.9446421, 40.7253944))).append("stars", 0).append("categories", Arrays.asList("Pasta", "Italian", "Buffet", "Cafeteria")),
new Document("name", "XYZ Coffee Bar").append("contact", new Document().append("phone", "644-555-0193").append("email", "XYZCoffeeBar@example.net").append("location", Arrays.asList(-74.0166091, 40.6284767))).append("stars", 5).append("categories", Arrays.asList("Coffee", "Cafe", "Bakery", "Chocolates")),
new Document("name", "456 Steak Restaurant").append("contact", new Document().append("phone", "990-555-0165").append("email", "456SteakRestaurant@example.com").append("location", Arrays.asList(-73.9365108, 40.8497077))).append("stars", 0).append("categories", Arrays.asList("Steak", "Seafood")),
new Document("name", "456 Cookies Shop").append("contact", new Document().append("phone", "604-555-0149").append("email", "456CookiesShop@example.org").append("location", Arrays.asList(-73.8850023, 40.7494272))).append("stars", 4).append("categories", Arrays.asList("Bakery", "Cookies", "Cake", "Coffee")),
new Document("name", "XYZ Steak Buffet").append("contact", new Document().append("phone", "229-555-0197").append("email", "XYZSteakBuffet@example.org").append("location", Arrays.asList(-73.9799932, 40.7660886))).append("stars", 3).append("categories", Arrays.asList("Steak", "Salad", "Chinese"))
));

To perform an aggregation, pass a list of aggregation stages to the MongoCollection.aggregate() method.

The Java driver provides the Aggregates helper class that contains builders for aggregation stages.

In the following example, the aggregation pipeline:

  • Uses a $match stage to filter for documents whose categories array field contains the element Bakery. The example uses Aggregates.match to build the $match stage.

  • Uses a $group stage to group the matching documents by the stars field, accumulating a count of documents for each distinct value of stars.

Tip

See also:

You can build the expressions used in this example using the aggregation builders.

collection.aggregate(
    Arrays.asList(
        Aggregates.match(Filters.eq("categories", "Bakery")),
        Aggregates.group("$stars", Accumulators.sum("count", 1))
    )
// Prints the result of the aggregation operation as JSON
).forEach(doc -> System.out.println(doc.toJson()));

The preceding aggregation should produce the following results:

{"_id": 4, "count": 2}
{"_id": 5, "count": 1}

For more information about the methods and classes mentioned in this section, see the API documentation for MongoCollection.aggregate(), Aggregates.match, Aggregates.group, and Accumulators.sum.
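
To see what the $match and $group stages compute without connecting to a deployment, you can reproduce the same logic in memory. The following plain-Java sketch is an illustration only, not driver code: the class MatchGroupSketch and its methods restaurant and countByStars are hypothetical names, and the sample list holds only a subset of the documents and fields from this guide. It relies on the fact that an equality match against an array field succeeds when the array contains the value.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class MatchGroupSketch {

    // Builds a document-like map holding only the fields this pipeline reads (hypothetical helper).
    static Map<String, Object> restaurant(String name, int stars, String... categories) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("name", name);
        doc.put("stars", stars);
        doc.put("categories", Arrays.asList(categories));
        return doc;
    }

    // Mirrors $match (the categories array contains the value) followed by
    // $group on stars with a count of one per matching document.
    static Map<Integer, Long> countByStars(List<Map<String, Object>> docs, String category) {
        return docs.stream()
                .filter(doc -> ((List<?>) doc.get("categories")).contains(category))
                .collect(Collectors.groupingBy(
                        doc -> (Integer) doc.get("stars"),
                        TreeMap::new,
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Map<String, Object>> docs = Arrays.asList(
                restaurant("Sun Bakery Trattoria", 4, "Pizza", "Pasta", "Italian", "Coffee", "Sandwiches"),
                restaurant("Hot Bakery Cafe", 4, "Bakery", "Cafe", "Coffee", "Dessert"),
                restaurant("XYZ Coffee Bar", 5, "Coffee", "Cafe", "Bakery", "Chocolates"),
                restaurant("456 Cookies Shop", 4, "Bakery", "Cookies", "Cake", "Coffee"));

        System.out.println(countByStars(docs, "Bakery")); // prints {4=2, 5=1}
    }
}
```

Note that "Sun Bakery Trattoria" is not counted: the match compares array elements, not the name field. Because org.bson.Document implements Map<String, Object>, the same approach also works on documents you have already fetched, although running the pipeline on the server avoids transferring the unmatched documents.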

To view information about how MongoDB executes your operation, use the explain() method of the AggregateIterable class. The explain() method returns execution plans and performance statistics. An execution plan is a potential way MongoDB can complete an operation. The explain() method provides both the winning plan (the plan MongoDB executed) and rejected plans.

You can specify the level of detail of your explanation by passing a verbosity level to the explain() method.

The following table shows all verbosity levels for explanations and their intended use cases:

Verbosity Level | Use Case
QUERY_PLANNER | You want to know which plan MongoDB will choose to run your query.
EXECUTION_STATS | You want to know if your query is performing well.
ALL_PLANS_EXECUTIONS | You have a problem with your query and you want as much information as possible to diagnose the issue.

In the following example, we print the JSON representation of the winning plans for aggregation stages that produce execution plans:

Document explanation = collection.aggregate(
    Arrays.asList(
        Aggregates.match(Filters.eq("categories", "Bakery")),
        Aggregates.group("$stars", Accumulators.sum("count", 1))
    )
).explain(ExplainVerbosity.EXECUTION_STATS);

// Reads the per-stage explain output as a typed list
List<Document> stages = explanation.getList("stages", Document.class);
List<String> keys = Arrays.asList("queryPlanner", "winningPlan");

// Prints the JSON representation of the winning execution plans.
// stages is null if the explain output has no "stages" field.
if (stages != null) {
    for (Document stage : stages) {
        Document cursorStage = stage.get("$cursor", Document.class);
        if (cursorStage != null) {
            System.out.println(cursorStage.getEmbedded(keys, Document.class).toJson());
        }
    }
}

The preceding code snippet should produce output similar to the following. The exact fields can vary with the MongoDB Server version:

{
  "stage": "PROJECTION_SIMPLE",
  "transformBy": {"stars": 1, "_id": 0},
  "inputStage": {
    "stage": "COLLSCAN",
    "filter": {"categories": {"$eq": "Bakery"}},
    "direction": "forward"
  }
}

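For more information about the topics mentioned in this section, see the MongoDB Server manual entries on explain output and query plans, and the API documentation for AggregateIterable.explain() and ExplainVerbosity.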
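
When you inspect a winning plan, it can help to list the stage names it contains, for example to check whether the plan relies on a COLLSCAN. The following sketch walks a plan by following its inputStage field, and also inputStages, which the server uses for stages that have more than one child. It assumes the plan is available as a Map<String, Object>, which holds for org.bson.Document. The class PlanStages and the method stageNames are hypothetical names, not part of the driver.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PlanStages {

    // Collects the "stage" name of a plan node and of its descendants, outermost first.
    static List<String> stageNames(Map<String, Object> plan) {
        List<String> names = new ArrayList<>();
        collect(plan, names);
        return names;
    }

    private static void collect(Object node, List<String> names) {
        if (!(node instanceof Map)) {
            return; // missing child or unexpected type: nothing to record
        }
        Map<?, ?> plan = (Map<?, ?>) node;
        Object stage = plan.get("stage");
        if (stage instanceof String) {
            names.add((String) stage);
        }
        // Single child
        collect(plan.get("inputStage"), names);
        // Multiple children
        Object children = plan.get("inputStages");
        if (children instanceof List) {
            for (Object child : (List<?>) children) {
                collect(child, names);
            }
        }
    }

    public static void main(String[] args) {
        // Hand-built stand-in for the winning plan shown in this section
        Map<String, Object> scan = new LinkedHashMap<>();
        scan.put("stage", "COLLSCAN");
        scan.put("direction", "forward");
        Map<String, Object> projection = new LinkedHashMap<>();
        projection.put("stage", "PROJECTION_SIMPLE");
        projection.put("inputStage", scan);

        List<String> names = stageNames(projection);
        System.out.println(names);                      // prints [PROJECTION_SIMPLE, COLLSCAN]
        System.out.println(names.contains("COLLSCAN")); // prints true
    }
}
```

In the explain example, you could pass the Document returned by getEmbedded(keys, Document.class) to stageNames instead of the hand-built map.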

The Java driver provides builders for accumulator expressions for use with $group. You must declare all other expressions in JSON format or a compatible document format.

Tip

The syntax in either of the following examples will define an $arrayElemAt expression.

The $ in front of "categories" tells MongoDB that this is a field path, using the "categories" field from the input document.

new Document("$arrayElemAt", Arrays.asList("$categories", 0))
Document.parse("{ $arrayElemAt: ['$categories', 0] }")
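
To make the behavior of $arrayElemAt concrete, the following plain-Java sketch models how the operator selects an element: a non-negative index counts from the start of the array, a negative index counts from the end, and an out-of-range index yields no result. This is an in-memory approximation for illustration, not driver code. The class ArrayElemAtSketch and the method arrayElemAt are hypothetical names, and Optional.empty() stands in for the missing value the server produces.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class ArrayElemAtSketch {

    // Returns the element at the given index, counting from the end for
    // negative indexes, or an empty Optional when the index is out of range.
    static <T> Optional<T> arrayElemAt(List<T> array, int index) {
        int position = index >= 0 ? index : array.size() + index;
        if (position < 0 || position >= array.size()) {
            return Optional.empty();
        }
        return Optional.ofNullable(array.get(position));
    }

    public static void main(String[] args) {
        // categories of "Hot Bakery Cafe" from the sample data
        List<String> categories = Arrays.asList("Bakery", "Cafe", "Coffee", "Dessert");

        System.out.println(arrayElemAt(categories, 0));  // prints Optional[Bakery]
        System.out.println(arrayElemAt(categories, -1)); // prints Optional[Dessert]
        System.out.println(arrayElemAt(categories, 10)); // prints Optional.empty
    }
}
```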

In the following example, the aggregation pipeline uses a $project stage and various Projections to return the name field and the calculated field firstCategory whose value is the first element in the categories field.

collection.aggregate(
    Arrays.asList(
        Aggregates.project(
            Projections.fields(
                Projections.excludeId(),
                Projections.include("name"),
                Projections.computed(
                    "firstCategory",
                    new Document("$arrayElemAt", Arrays.asList("$categories", 0))
                )
            )
        )
    )
).forEach(doc -> System.out.println(doc.toJson()));

The preceding aggregation should produce the following results:

{"name": "456 Cookies Shop", "firstCategory": "Bakery"}
{"name": "Sun Bakery Trattoria", "firstCategory": "Pizza"}
{"name": "456 Steak Restaurant", "firstCategory": "Steak"}
{"name": "Blue Bagels Grill", "firstCategory": "Bagels"}
{"name": "XYZ Steak Buffet", "firstCategory": "Steak"}
{"name": "Hot Bakery Cafe", "firstCategory": "Bakery"}
{"name": "Green Feast Pizzeria", "firstCategory": "Pizza"}
{"name": "ZZZ Pasta Buffet", "firstCategory": "Pasta"}
{"name": "XYZ Coffee Bar", "firstCategory": "Coffee"}
{"name": "XYZ Bagels Restaurant", "firstCategory": "Bagels"}

For more information about the methods and classes mentioned in this section, see the API documentation for Aggregates.project, Projections, and Document.
