Java - Aggregation Pipeline
- Update to Java 21
- Update Java Driver to 5.0.0
- Update to Java 17
- Update Java Driver to 4.11.1
- Update mongodb-crypt to 1.8.0
- Update Java Driver to 4.2.2.
- Added Client Side Field Level Encryption example.
- Update Java Driver to 4.1.1.
The aggregation pipeline is the most powerful way to work with your data in MongoDB. It lets us write advanced queries that group documents, manipulate arrays, reshape document models, and more.
Let's see how we can harness this power using Java.
I will use the same repository as usual in this series. If you don't have a copy of it yet, you can clone it or just update it if you already have it:
As you can see, we have one document for each zip code in the USA, and each document holds the associated population.
To calculate the population of New York, I would have to sum the population of each zip code to get the population of the entire city.
Let's try to find the 3 biggest cities in the state of Texas. Let's design this on paper first.
- I don't need to work with the entire collection. I need to keep only the cities in Texas.
- Once this is done, I can group all the zip codes of the same city together to get the total population.
- Then I can sort my cities in descending order of population.
- Finally, I can keep the first 3 cities of my list.
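To make these four steps concrete, here is a minimal in-memory sketch that uses plain Java streams instead of the MongoDB driver. It is only an analogy for what the corresponding stages ($match, $group, $sort, and $limit) do on the server. The `Zip` and `CityPop` records, the `topCities` method, and the sample data are all invented for illustration, and the field names `city`, `state`, and `pop` are assumptions about the document shape.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// In-memory analogy only: this is NOT MongoDB driver code.
final class TopCitiesSketch {

    // Hypothetical stand-in for one zip code document.
    record Zip(String city, String state, int pop) {}

    // Hypothetical result type: a city with its total population.
    record CityPop(String city, int totalPop) {}

    static List<CityPop> topCities(List<Zip> zips, String state, int n) {
        // Steps 1 and 2: keep the zip codes of one state,
        // then sum the population of all zip codes sharing a city name.
        Map<String, Integer> popByCity = zips.stream()
                .filter(z -> z.state().equals(state))
                .collect(Collectors.groupingBy(Zip::city, Collectors.summingInt(Zip::pop)));

        // Steps 3 and 4: sort by descending population and keep the first n cities.
        return popByCity.entrySet().stream()
                .map(e -> new CityPop(e.getKey(), e.getValue()))
                .sorted(Comparator.comparingInt(CityPop::totalPop).reversed())
                .limit(n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Invented sample data, not real census figures.
        List<Zip> zips = List.of(
                new Zip("Alpha", "TX", 10),
                new Zip("Alpha", "TX", 15),
                new Zip("Beta", "TX", 40),
                new Zip("Gamma", "TX", 5),
                new Zip("Gamma", "TX", 8),
                new Zip("Delta", "TX", 30),
                new Zip("Alpha", "NY", 1000));
        topCities(zips, "TX", 3).forEach(System.out::println);
    }
}
```

With this invented data, the result is Beta (40), Delta (30), and Alpha (25): the two Texas zip codes of Alpha are summed, and the New York one is filtered out. The real pipeline does the same work inside the database, so the full collection never has to be loaded into the application.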
After a little code refactoring, here is what I have:
The MongoDB driver provides a lot of helpers to make the code easy to write and to read.
As you can see, I solved this problem with:
Here is the output we get:
This time, I'm using the collection posts in the same database.
This collection of 500 posts was generated artificially, but it contains arrays, and I want to show you how we can manipulate arrays in a pipeline.
Let's try to find the three most popular tags and, for each tag, the list of titles of the posts it tags.
Here is my solution in Java.
The $unwind stage breaks down the array of tags into one document per tag. This allows the following $group stage to group by tag, count the posts, and collect the titles in a new array.
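If the effect of unwinding an array before grouping is hard to picture, the following in-memory sketch may help. It uses plain Java streams rather than the MongoDB driver: `flatMap` plays the role of a stage like $unwind by emitting one (tag, title) pair per array element, and `groupingBy` plays the role of $group. The `Post` and `TagStats` records, the `popularTags` method, and the sample posts are invented for illustration.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// In-memory analogy only: this is NOT MongoDB driver code.
final class PopularTagsSketch {

    // Hypothetical stand-in for one post document.
    record Post(String title, List<String> tags) {}

    // Hypothetical result type: a tag, how many posts carry it, and their titles.
    record TagStats(String tag, int count, List<String> titles) {}

    static List<TagStats> popularTags(List<Post> posts, int n) {
        // "Unwind": one (tag, title) pair per element of each post's tag list.
        // "Group": collect the titles of all pairs that share the same tag.
        Map<String, List<String>> titlesByTag = posts.stream()
                .flatMap(p -> p.tags().stream().map(t -> Map.entry(t, p.title())))
                .collect(Collectors.groupingBy(Map.Entry::getKey,
                        Collectors.mapping(Map.Entry::getValue, Collectors.toList())));

        // Count the titles per tag, sort by descending count, keep the first n tags.
        return titlesByTag.entrySet().stream()
                .map(e -> new TagStats(e.getKey(), e.getValue().size(), e.getValue()))
                .sorted(Comparator.comparingInt(TagStats::count).reversed())
                .limit(n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Invented sample posts.
        List<Post> posts = List.of(
                new Post("Post 1", List.of("java", "mongodb")),
                new Post("Post 2", List.of("java", "cloud")),
                new Post("Post 3", List.of("java", "mongodb", "cloud")),
                new Post("Post 4", List.of("java", "mongodb")),
                new Post("Post 5", List.of("atlas")));
        popularTags(posts, 3).forEach(System.out::println);
    }
}
```

Without the unwinding step, grouping would operate on whole tag arrays, so two posts would only land in the same group if their arrays were identical. Splitting the arrays first is what makes a per-tag count possible.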
Here is the final output I get.
The aggregation pipeline is very powerful. We have just scratched the surface with these two examples, but trust me when I tell you that it will be your best ally once you master it.