Aggregates Builders
Overview
In this guide, you can learn how to use the Aggregates class, which provides static factory methods that build aggregation pipeline stages in the MongoDB Kotlin driver.
For a more thorough introduction to aggregation, see the Aggregation guide.
The examples on this page assume imports for methods of the following classes:
Aggregates
Filters
Projections
Sorts
Accumulators
import com.mongodb.client.model.Aggregates
import com.mongodb.client.model.Filters
import com.mongodb.client.model.Projections
import com.mongodb.client.model.Sorts
import com.mongodb.client.model.Accumulators
Use these methods to construct pipeline stages and specify them in your aggregation as a list:
val matchStage = Aggregates.match(Filters.eq("someField", "someCriteria"))
val sortByCountStage = Aggregates.sortByCount("\$someField")
val results = collection.aggregate(
    listOf(matchStage, sortByCountStage)
).toList()
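While assembling a pipeline, it can help to check what each builder produces before you send it to the server. The following is a minimal sketch, not part of the original guide. It assumes that your driver version provides the no-argument Bson.toBsonDocument() method, and the helper name renderPipeline is hypothetical:

```kotlin
import com.mongodb.client.model.Aggregates
import com.mongodb.client.model.Filters
import org.bson.conversions.Bson

// Hypothetical helper: renders each stage as a JSON string for inspection.
// Assumes Bson.toBsonDocument() (no arguments) exists in your driver version.
fun renderPipeline(stages: List<Bson>): List<String> =
    stages.map { it.toBsonDocument().toJson() }

// The same two stages as in the example above.
val exampleStages: List<Bson> = listOf(
    Aggregates.match(Filters.eq("someField", "someCriteria")),
    Aggregates.sortByCount("\$someField")
)
```

Calling renderPipeline(exampleStages).forEach(::println) prints one JSON document per stage, which you can compare against the stage syntax in the Server manual.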
Many aggregation examples in this guide use the Atlas sample_mflix.movies dataset. The documents in this collection are modeled by the following Movie data class for use with the Kotlin driver:
data class Movie(
    val title: String,
    val year: Int,
    val genres: List<String>,
    val rated: String,
    val plot: String,
    val runtime: Int,
    val imdb: IMDB
) {
    data class IMDB(
        val rating: Double
    )
}
Match
Use the match()
method to create a $match
pipeline stage that matches incoming documents against the specified
query filter, filtering out documents that do not match.
Tip
The filter can be an instance of any class that implements Bson, but it's convenient to use the Filters class.
The following example creates a pipeline stage that matches all documents
in the movies
collection where the
title
field is equal to "The Shawshank Redemption":
Aggregates.match(Filters.eq(Movie::title.name, "The Shawshank Redemption"))
Project
Use the project()
method to create a $project
pipeline stage that projects the specified document fields. Field projection
in aggregation follows the same rules as field projection in queries.
Tip
Though the projection can be an instance of any class that implements Bson, it's convenient to use the Projections class.
The following example creates a pipeline stage that includes the title
and
plot
fields but excludes the _id
field:
Aggregates.project( Projections.fields( Projections.include(Movie::title.name, Movie::plot.name), Projections.excludeId()) )
Projecting Computed Fields
The $project
stage can project computed fields as well.
The following example creates a pipeline stage that projects the rated
field
into a new field called rating
, effectively renaming the field:
Aggregates.project( Projections.fields( Projections.computed("rating", "\$${Movie::rated.name}"), Projections.excludeId() ) )
Documents
Use the documents()
method to create a
$documents
pipeline stage that returns literal documents from input values.
Important
If you use a $documents
stage in an aggregation pipeline, it must be the first
stage in the pipeline.
The following example creates a pipeline stage that creates
sample documents in the movies
collection with a title
field:
Aggregates.documents( listOf( Document(Movie::title.name, "Steel Magnolias"), Document(Movie::title.name, "Back to the Future"), Document(Movie::title.name, "Jurassic Park") ) )
Important
If you use the documents()
method to provide the input to an aggregation pipeline,
you must call the aggregate()
method on a database instead of on a
collection.
val docsStage = database.aggregate<Document>(
    // ...
)
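To make the two requirements above concrete, the following sketch builds a pipeline that starts with a $documents stage and runs it on a database. It is an illustration rather than the guide's own code: it assumes the coroutine flavor of the Kotlin driver (the com.mongodb.kotlin.client.coroutine package), and the function names literalTitlesPipeline and literalTitles are hypothetical:

```kotlin
import com.mongodb.client.model.Aggregates
import com.mongodb.kotlin.client.coroutine.MongoDatabase
import kotlinx.coroutines.flow.toList
import org.bson.Document
import org.bson.conversions.Bson

// Hypothetical helper: a pipeline whose first (and only) stage is $documents.
fun literalTitlesPipeline(): List<Bson> = listOf(
    Aggregates.documents(
        listOf(
            Document("title", "Steel Magnolias"),
            Document("title", "Back to the Future")
        )
    )
)

// Runs the pipeline on the database, not on a collection, because the
// input comes from the $documents stage itself.
suspend fun literalTitles(database: MongoDatabase): List<Document> =
    database.aggregate<Document>(literalTitlesPipeline()).toList()
```

Any further stages, such as a match() or project() stage, would be appended to the list after the documents() stage.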
Sample
Use the sample()
method to create a $sample
pipeline stage to randomly select documents from input.
The following example creates a pipeline stage that randomly selects 5 documents
from the movies
collection:
Aggregates.sample(5)
Sort
Use the sort()
method to create a $sort
pipeline stage to sort by the specified criteria.
Tip
Though the sort criteria can be an instance of any class that implements Bson, it's convenient to use the Sorts class.
The following example creates a pipeline stage that sorts in descending order according
to the value of the year
field and then in ascending order according to the
value of the title
field:
Aggregates.sort( Sorts.orderBy( Sorts.descending(Movie::year.name), Sorts.ascending(Movie::title.name) ) )
Skip
Use the skip()
method to create a $skip
pipeline stage to skip over the specified number of documents before
passing documents into the next stage.
The following example creates a pipeline stage that skips the first 5
documents
in the movies
collection:
Aggregates.skip(5)
Limit
Use the limit() method to create a $limit pipeline stage to limit the number of documents passed to the next stage.
The following example creates a pipeline stage that limits the number of documents returned from the movies collection to 4:
Aggregates.limit(4)
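The sort(), skip(), and limit() builders are often combined to page through results. The following is a minimal sketch under stated assumptions: the helper name pageStages, the zero-based page numbering, and the choice of the title field as the sort key are all illustrative and not part of the original guide:

```kotlin
import com.mongodb.client.model.Aggregates
import com.mongodb.client.model.Sorts
import org.bson.conversions.Bson

// Hypothetical helper: builds the stages for one page of results.
// pageNumber is zero-based; pageSize is the number of documents per page.
fun pageStages(pageNumber: Int, pageSize: Int): List<Bson> {
    require(pageNumber >= 0 && pageSize > 0) { "invalid page request" }
    return listOf(
        // Sort first so that skip and limit operate on a predictable order.
        Aggregates.sort(Sorts.ascending("title")),
        // Skip the documents that belong to earlier pages.
        Aggregates.skip(pageNumber * pageSize),
        // Keep only the documents for the requested page.
        Aggregates.limit(pageSize)
    )
}
```

If several documents share the same title, consider adding a unique field such as _id to the sort so that consecutive pages do not overlap.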
Lookup
Use the lookup()
method to create a $lookup
pipeline stage to perform joins and uncorrelated subqueries between two collections.
Left Outer Join
The following example creates a pipeline stage that performs a left outer join between the movies and comments collections in the sample_mflix database:
- It joins the _id field from movies to the movie_id field in comments
- It outputs the results in the joined_comments field
Aggregates.lookup( "comments", "_id", "movie_id", "joined_comments" )
Full Join and Uncorrelated Subqueries
The following example uses the fictional orders
and warehouses
collections.
The data is modeled using the following Kotlin data classes:
data class Order(
    val id: Int,
    val customerId: Int,
    val item: String,
    val ordered: Int
)
data class Inventory(
    val id: Int,
    val stockItem: String,
    val inStock: Int
)
The example creates a pipeline stage that joins the two collections by item and by whether the quantity available in the inStock field is enough to fulfill the ordered quantity:
val variables = listOf(
    Variable("order_item", "\$item"),
    Variable("order_qty", "\$ordered")
)
val pipeline = listOf(
    Aggregates.match(
        Filters.expr(
            Document("\$and", listOf(
                Document("\$eq", listOf("$\$order_item", "\$${Inventory::stockItem.name}")),
                Document("\$gte", listOf("\$${Inventory::inStock.name}", "$\$order_qty"))
            ))
        )
    ),
    Aggregates.project(
        Projections.fields(
            Projections.exclude(Order::customerId.name, Inventory::stockItem.name),
            Projections.excludeId()
        )
    )
)
val innerJoinLookup = Aggregates.lookup("warehouses", variables, pipeline, "stockData")
Group
Use the group()
method to create a $group
pipeline stage to group documents by a specified expression and output a document
for each distinct grouping.
Tip
The driver includes the Accumulators class with static factory methods for each of the supported accumulators.
The following example creates a pipeline stage that groups documents
in the orders
collection by the value of the customerId
field.
Each group accumulates the sum and average
of the values of the ordered
field into the totalQuantity
and
averageQuantity
fields:
Aggregates.group("\$${Order::customerId.name}", Accumulators.sum("totalQuantity", "\$${Order::ordered.name}"), Accumulators.avg("averageQuantity", "\$${Order::ordered.name}") )
Learn more about accumulator operators from the Server manual section on Accumulators.
Pick-N Accumulators
The pick-n accumulators are aggregation accumulation operators that return the top and bottom elements given a specific ordering. Use one of the following builders to create an aggregation accumulation operator:
- minN()
- maxN()
- firstN()
- lastN()
- top()
- topN()
- bottom()
- bottomN()
Tip
You can only perform aggregation operations with these pick-n accumulators when running MongoDB v5.2 or later.
Learn which aggregation pipeline stages you can use accumulator operators with from the Server manual section on Accumulators.
The pick-n accumulator examples use documents from the movies collection in the sample_mflix database.
MinN
The minN()
builder creates the $minN
accumulator which returns data from documents that contain the n
lowest
values of a grouping.
Tip
The $minN
and $bottomN
accumulators can perform similar tasks.
See
Comparison of $minN and $bottomN Accumulators
for recommended usage of each.
The following example demonstrates how to use the minN()
method to return
the lowest three imdb.rating
values for movies, grouped by year
:
Aggregates.group( "\$${Movie::year.name}", Accumulators.minN( "lowestThreeRatings", "\$${Movie::imdb.name}.${Movie.IMDB::rating.name}", 3 ) )
See the minN() API documentation for more information.
MaxN
The maxN()
accumulator returns data from documents that contain the n
highest values of a grouping.
The following example demonstrates how to use the maxN()
method to
return the highest two imdb.rating
values for movies, grouped by year
:
Aggregates.group( "\$${Movie::year.name}", Accumulators.maxN( "highestTwoRatings", "\$${Movie::imdb.name}.${Movie.IMDB::rating.name}", 2 ) )
See the maxN() API documentation for more information.
FirstN
The firstN()
accumulator returns data from the first n
documents in
each grouping for the specified sort order.
Tip
The $firstN
and $topN
accumulators can perform similar tasks.
See
Comparison of $firstN and $topN Accumulators
for recommended usage of each.
The following example demonstrates how to use the firstN()
method to
return the first two movie title
values, based on the order they came
into the stage, grouped by year
:
Aggregates.group( "\$${Movie::year.name}", Accumulators.firstN( "firstTwoMovies", "\$${Movie::title.name}", 2 ) )
See the firstN() API documentation for more information.
LastN
The lastN()
accumulator returns data from the last n
documents in
each grouping for the specified sort order.
The following example demonstrates how to use the lastN() method to show the last three movie title values, based on the order they came into the stage, grouped by year:
Aggregates.group( "\$${Movie::year.name}", Accumulators.lastN( "lastThreeMovies", "\$${Movie::title.name}", 3 ) )
See the lastN() API documentation for more information.
Top
The top()
accumulator returns data from the first document in a group
based on the specified sort order.
The following example demonstrates how to use the top()
method to return
the title
and imdb.rating
values for the top rated movies based on the
imdb.rating
, grouped by year
.
Aggregates.group( "\$${Movie::year.name}", Accumulators.top( "topRatedMovie", Sorts.descending("${Movie::imdb.name}.${Movie.IMDB::rating.name}"), listOf("\$${Movie::title.name}", "\$${Movie::imdb.name}.${Movie.IMDB::rating.name}") ) )
See the top() API documentation for more information.
TopN
The topN()
accumulator returns data from documents that contain the
highest n
values for the specified field.
Tip
The $firstN
and $topN
accumulators can perform similar tasks.
See
Comparison of $firstN and $topN Accumulators
for recommended usage of each.
The following example demonstrates how to use the topN()
method to return
the title
and runtime
values of the three longest movies based on the
runtime
values, grouped by year
.
Aggregates.group( "\$${Movie::year.name}", Accumulators.topN( "longestThreeMovies", Sorts.descending(Movie::runtime.name), listOf("\$${Movie::title.name}", "\$${Movie::runtime.name}"), 3 ) )
See the topN() API documentation for more information.
Bottom
The bottom()
accumulator returns data from the last document in a group
based on the specified sort order.
The following example demonstrates how to use the bottom()
method to
return the title
and runtime
values of the shortest movie based on the
runtime
value, grouped by year
.
Aggregates.group( "\$${Movie::year.name}", Accumulators.bottom( "shortestMovies", Sorts.descending(Movie::runtime.name), listOf("\$${Movie::title.name}", "\$${Movie::runtime.name}") ) )
See the bottom() API documentation for more information.
BottomN
The bottomN()
accumulator returns data from documents that contain the
lowest n
values for the specified field.
Tip
The $minN
and $bottomN
accumulators can perform similar tasks.
See Comparison of $minN and $bottomN Accumulators
for recommended usage of each.
The following example demonstrates how to use the bottomN()
method to
return the title
and imdb.rating
values of the two lowest rated movies
based on the imdb.rating
value, grouped by year
:
Aggregates.group(
    "\$${Movie::year.name}",
    Accumulators.bottomN(
        "lowestRatedTwoMovies",
        Sorts.descending("${Movie::imdb.name}.${Movie.IMDB::rating.name}"),
        listOf("\$${Movie::title.name}", "\$${Movie::imdb.name}.${Movie.IMDB::rating.name}"),
        2
    )
)
See the bottomN() API documentation for more information.
Unwind
Use the unwind()
method to create an $unwind
pipeline stage to deconstruct an array field from input documents, creating
an output document for each array element.
The following example creates a document for each element in the lowestRatedTwoMovies array:
Aggregates.unwind("\$lowestRatedTwoMovies")
To preserve documents that have missing or null values for the array field, or where the array is empty:
Aggregates.unwind(
    "\$lowestRatedTwoMovies",
    UnwindOptions().preserveNullAndEmptyArrays(true)
)
To include the array index (in this example, in a field called "position"):
Aggregates.unwind(
    "\$lowestRatedTwoMovies",
    UnwindOptions().includeArrayIndex("position")
)
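The two options shown above can also be set on the same UnwindOptions instance, because each setter returns the options object. The following sketch illustrates this; the helper name unwindKeepingEmpty and its parameters are hypothetical:

```kotlin
import com.mongodb.client.model.Aggregates
import com.mongodb.client.model.UnwindOptions
import org.bson.conversions.Bson

// Hypothetical helper: unwinds arrayField, keeps documents whose array is
// missing, null, or empty, and records each element's index in indexField.
fun unwindKeepingEmpty(arrayField: String, indexField: String): Bson =
    Aggregates.unwind(
        "\$$arrayField", // field path, for example "$lowestRatedTwoMovies"
        UnwindOptions()
            .preserveNullAndEmptyArrays(true)
            .includeArrayIndex(indexField)
    )
```

For example, unwindKeepingEmpty("lowestRatedTwoMovies", "position") builds a single $unwind stage with both options applied.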
Out
Use the out()
method to create an $out
pipeline stage that writes all documents to the specified collection in
the same database.
Important
The $out
stage must be the last stage in any aggregation pipeline.
The following example writes the results of the pipeline to the classic_movies
collection:
Aggregates.out("classic_movies")
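Because $out must come last, it is typically appended to the end of the stage list. The following sketch shows one way to arrange this; the function name classicMoviesPipeline and the year cutoff of 1960 are assumptions made for illustration and do not come from the original guide:

```kotlin
import com.mongodb.client.model.Aggregates
import com.mongodb.client.model.Filters
import org.bson.conversions.Bson

// Hypothetical pipeline: selects older movies and writes them to the
// classic_movies collection. The cutoff year is an arbitrary assumption.
fun classicMoviesPipeline(): List<Bson> = listOf(
    Aggregates.match(Filters.lt("year", 1960)),
    // $out is the final stage, as the server requires.
    Aggregates.out("classic_movies")
)
```

Stages that filter or reshape the documents go before the out() stage; nothing can follow it.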
Merge
Use the merge()
method to create a $merge
pipeline stage that merges all documents into the specified collection.
Important
The $merge
stage must be the last stage in any aggregation pipeline.
The following example merges the pipeline results into the nineties_movies collection using the default options:
Aggregates.merge("nineties_movies")
The following example merges the pipeline results into the movie_ratings collection in the aggregation database. It uses non-default options that replace the existing document if both year and title match, and insert the document otherwise:
Aggregates.merge( MongoNamespace("aggregation", "movie_ratings"), MergeOptions().uniqueIdentifier(listOf("year", "title")) .whenMatched(MergeOptions.WhenMatched.REPLACE) .whenNotMatched(MergeOptions.WhenNotMatched.INSERT) )
GraphLookup
Use the graphLookup()
method to create a $graphLookup
pipeline stage that performs a recursive search on a specified collection to match
a specified field in one document to a specified field of another document.
The following example uses the contacts
collection. The data is modeled
using the following Kotlin data class:
data class Users( val name: String, val friends: List<String>?, val hobbies: List<String>? )
The example computes the friend network for users in the contacts collection, recursively matching the value in the friends field to the name field:
Aggregates.graphLookup( "contacts", "\$${Users::friends.name}", Users::friends.name, Users::name.name, "socialNetwork" )
Using GraphLookupOptions
, you can specify the depth to recurse as well as
the name of the depth field, if desired. In this example, $graphLookup
will
recurse up to two times, and create a field called degrees
with the
recursion depth information for every document.
Aggregates.graphLookup( "contacts", "\$${Users::friends.name}", Users::friends.name, Users::name.name, "socialNetwork", GraphLookupOptions().maxDepth(2).depthField("degrees") )
Using GraphLookupOptions
, you can specify a filter that documents must match
in order for MongoDB to include them in your search. In this
example, only links with "golf" in their hobbies
field will be included:
Aggregates.graphLookup( "contacts", "\$${Users::friends.name}", Users::friends.name, Users::name.name, "socialNetwork", GraphLookupOptions().maxDepth(1).restrictSearchWithMatch( Filters.eq(Users::hobbies.name, "golf") ) )
SortByCount
Use the sortByCount()
method to create a $sortByCount
pipeline stage that groups documents by a given expression and then sorts
these groups by count in descending order.
Tip
The $sortByCount
stage is identical to a $group
stage with a
$sum
accumulator followed by a $sort
stage.
[ { "$group": { "_id": <expression to group on>, "count": { "$sum": 1 } } }, { "$sort": { "count": -1 } } ]
The following example groups documents in the movies
collection by the
genres
field and computes the count for each distinct value:
Aggregates.sortByCount("\$${Movie::genres.name}")
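The equivalence described in the tip above can also be written with the group() and sort() builders. The following sketch is an illustration only, and the function name sortByCountExpanded is hypothetical:

```kotlin
import com.mongodb.client.model.Accumulators
import com.mongodb.client.model.Aggregates
import com.mongodb.client.model.Sorts
import org.bson.conversions.Bson

// Hypothetical helper: builds the $group and $sort stages that the tip
// describes as equivalent to $sortByCount on the given expression.
fun sortByCountExpanded(expression: String): List<Bson> = listOf(
    // Count the documents in each group.
    Aggregates.group(expression, Accumulators.sum("count", 1)),
    // Order the groups from the largest count to the smallest.
    Aggregates.sort(Sorts.descending("count"))
)
```

Writing the stages out this way is useful mainly when you need to add further accumulators to the $group stage; otherwise sortByCount() is shorter.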
ReplaceRoot
Use the replaceRoot()
method to create a $replaceRoot
pipeline stage that replaces each input document with the specified document.
The following example uses a fictional books
collection that contains data
modeled using the following Kotlin data class:
data class Libro(val titulo: String)
data class Book(val title: String, val spanishTranslation: Libro)
Each input document is replaced by the nested document in the
spanishTranslation
field:
Aggregates.replaceRoot("\$${Book::spanishTranslation.name}")
AddFields
Use the addFields()
method to create an $addFields
pipeline stage that adds new fields to documents.
Tip
Use $addFields
when you do not want to project field inclusion
or exclusion.
The following example adds two new fields, watched and type, to the input documents in the movies collection:
Aggregates.addFields( Field("watched", false), Field("type", "movie") )
Count
Use the count()
method to create a $count
pipeline stage that counts the number of documents that enter the stage, and assigns
that value to a specified field name. If you do not specify a field,
count()
defaults the field name to "count".
Tip
The $count
stage is syntactic sugar for:
{ "$group":{ "_id": 0, "count": { "$sum" : 1 } } }
The following example creates a pipeline stage that outputs the count of incoming documents in a field called "total":
Aggregates.count("total")
Bucket
Use the bucket()
method to create a $bucket
pipeline stage that automates the bucketing of data around predefined boundary
values.
The following examples use data modeled with the following Kotlin data class:
data class Screen( val id: String, val screenSize: Int, val manufacturer: String, val price: Double )
This example creates a pipeline stage that groups incoming documents based
on the value of their screenSize
field, inclusive of the lower boundary
and exclusive of the upper boundary:
Aggregates.bucket("\$${Screen::screenSize.name}", listOf(0, 24, 32, 50, 70, 1000))
Use the BucketOptions
class to specify a default bucket for values
outside of the specified boundaries, and to specify additional accumulators.
The following example creates a pipeline stage that groups incoming documents based on the value of their screenSize field, counting the number of documents that fall within each bucket, pushing the value of screenSize into a field called matches, and capturing any screen sizes of 70 or greater in a bucket called "monster" for monstrously large screen sizes:
Tip
The driver includes the Accumulators class with static factory methods for each of the supported accumulators.
Aggregates.bucket("\$${Screen::screenSize.name}", listOf(0, 24, 32, 50, 70), BucketOptions() .defaultBucket("monster") .output( Accumulators.sum("count", 1), Accumulators.push("matches", "\$${Screen::screenSize.name}") ) )
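The boundary rule (inclusive lower bound, exclusive upper bound, with a default bucket for everything else) can be modeled in a few lines of plain Kotlin. The following sketch is only a model of the rule for illustration. It is not driver code, it does not query MongoDB, and the function name bucketFor is hypothetical:

```kotlin
// Plain-Kotlin model of the $bucket assignment rule; not driver code.
// Returns the lower boundary of the matching bucket, or defaultBucket
// when the value falls outside every [lower, upper) range.
fun bucketFor(value: Int, boundaries: List<Int>, defaultBucket: String): Any {
    // zipWithNext pairs each boundary with the one that follows it.
    for ((lower, upper) in boundaries.zipWithNext()) {
        if (value >= lower && value < upper) return lower
    }
    return defaultBucket
}
```

With the boundaries listOf(0, 24, 32, 50, 70) and the default bucket "monster", a value of 24 lands in the bucket labeled 24, a value of 69 lands in the bucket labeled 50, and a value of 70 falls into "monster" because the upper boundary is exclusive.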
BucketAuto
Use the bucketAuto()
method to create a $bucketAuto
pipeline stage that automatically determines the boundaries of each bucket
in its attempt to distribute the documents evenly into a specified number of buckets.
The following examples use data modeled with the following Kotlin data class:
data class Screen( val id: String, val screenSize: Int, val manufacturer: String, val price: Double )
This example creates a pipeline stage that will attempt to create and evenly
distribute documents into 5 buckets using the value of their price
field:
Aggregates.bucketAuto("\$${Screen::price.name}", 5)
Use the BucketAutoOptions class to specify a preferred number-based scheme to set boundary values, and to specify additional accumulators.
The following example creates a pipeline stage that will attempt to create and evenly
distribute documents into 5 buckets using the value of their price
field,
setting the bucket boundaries at powers of 2 (2, 4, 8, 16, ...). It also counts
the number of documents in each bucket, and calculates their average price
in a new field called avgPrice
:
Tip
The driver includes the Accumulators class with static factory methods for each of the supported accumulators.
Aggregates.bucketAuto( "\$${Screen::price.name}", 5, BucketAutoOptions() .granularity(BucketGranularity.POWERSOF2) .output(Accumulators.sum("count", 1), Accumulators.avg("avgPrice", "\$${Screen::price.name}")) )
Facet
Use the facet()
method to create a $facet
pipeline stage that allows for the definition of parallel pipelines.
The following examples use data modeled with the following Kotlin data class:
data class Screen( val id: String, val screenSize: Int, val manufacturer: String, val price: Double )
This example creates a pipeline stage that executes two parallel aggregations:
- The first aggregation distributes incoming documents into 5 groups according to their screenSize field.
- The second aggregation counts all manufacturers and returns their count, limited to the top 5.
Aggregates.facet( Facet( "Screen Sizes", Aggregates.bucketAuto( "\$${Screen::screenSize.name}", 5, BucketAutoOptions().output(Accumulators.sum("count", 1)) ) ), Facet( "Manufacturer", Aggregates.sortByCount("\$${Screen::manufacturer.name}"), Aggregates.limit(5) ) )
SetWindowFields
Use the setWindowFields()
method to create a $setWindowFields
pipeline stage that allows using window operators to perform operations
on a specified span of documents in a collection.
Tip
Window Functions
The driver includes the Windows class with static factory methods for building windowed computations.
The following example uses a fictional weather
collection using data modeled
with the following Kotlin data class:
data class Weather( val localityId: String, val measurementDateTime: LocalDateTime, val rainfall: Double, val temperature: Double )
The example creates a pipeline stage that computes the
accumulated rainfall and the average temperature over the past month for
each locality from more fine-grained measurements presented in the rainfall
and temperature
fields:
val pastMonth = Windows.timeRange(-1, MongoTimeUnit.MONTH, Windows.Bound.CURRENT)
val resultsFlow = weatherCollection.aggregate<Document>(
    listOf(
        Aggregates.setWindowFields(
            "\$${Weather::localityId.name}",
            Sorts.ascending(Weather::measurementDateTime.name),
            WindowOutputFields.sum(
                "monthlyRainfall",
                "\$${Weather::rainfall.name}",
                pastMonth
            ),
            WindowOutputFields.avg(
                "monthlyAvgTemp",
                "\$${Weather::temperature.name}",
                pastMonth
            )
        )
    )
)
Densify
Use the densify()
method to create a
$densify
pipeline stage that generates a sequence of documents to span a specified
interval.
Tip
You can use the $densify aggregation stage only when running MongoDB v5.1 or later.
Consider the following documents retrieved from the Atlas sample weather dataset
that contain measurements for a similar position
field, spaced one hour
apart:
Document{{ _id=5553a..., position=Document{{type=Point, coordinates=[-47.9, 47.6]}}, ts=Mon Mar 05 08:00:00 EST 1984, ... }}
Document{{ _id=5553b..., position=Document{{type=Point, coordinates=[-47.9, 47.6]}}, ts=Mon Mar 05 09:00:00 EST 1984, ... }}
These documents are modeled using the following Kotlin data class:
data class Weather( val id: ObjectId = ObjectId(), val position: Point, val ts: LocalDateTime )
Suppose you needed to create a pipeline stage that performs the following actions on these documents:
- Add a document at every 15-minute interval for which a ts value does not already exist.
- Group the documents by the position field.
The call to the densify()
aggregation stage builder that accomplishes
these actions should resemble the following:
Aggregates.densify(
    "ts",
    DensifyRange.partitionRangeWithStep(15, MongoTimeUnit.MINUTE),
    DensifyOptions.densifyOptions().partitionByFields("position.coordinates")
)
The following output highlights the documents generated by the aggregate stage
which contain ts
values every 15 minutes between the existing documents:
Document{{ _id=5553a..., position=Document{{type=Point, coordinates=[-47.9, 47.6]}}, ts=Mon Mar 05 08:00:00 EST 1984, ... }}
Document{{ position=Document{{coordinates=[-47.9, 47.6]}}, ts=Mon Mar 05 08:15:00 EST 1984 }}
Document{{ position=Document{{coordinates=[-47.9, 47.6]}}, ts=Mon Mar 05 08:30:00 EST 1984 }}
Document{{ position=Document{{coordinates=[-47.9, 47.6]}}, ts=Mon Mar 05 08:45:00 EST 1984 }}
Document{{ _id=5553b..., position=Document{{type=Point, coordinates=[-47.9, 47.6]}}, ts=Mon Mar 05 09:00:00 EST 1984, ... }}
See the densify package API documentation for more information.
Fill
Use the fill()
method to create a
$fill
pipeline stage that populates null
and missing field values.
Tip
You can use the $fill aggregation stage only when running MongoDB v5.3 or later.
Consider the following documents that contain temperature and air pressure measurements at an hourly interval:
Document{{_id=6308a..., hour=1, temperature=23C, air_pressure=29.74}}
Document{{_id=6308b..., hour=2, temperature=23.5C}}
Document{{_id=6308c..., hour=3, temperature=null, air_pressure=29.76}}
These documents are modeled using the following Kotlin data class:
data class Weather( val id: ObjectId = ObjectId(), val hour: Int, val temperature: String?, val air_pressure: Double? )
Suppose you needed to populate missing temperature and air pressure data points in the documents as follows:
- Populate the air_pressure field for hour "2" using linear interpolation to calculate the value.
- Set the missing temperature value to "23.6C" for hour "3".
The call to the fill()
aggregation stage builder that accomplishes
these actions resembles the following:
val resultsFlow = weatherCollection.aggregate<Weather>(
    listOf(
        Aggregates.fill(
            FillOptions.fillOptions().sortBy(Sorts.ascending(Weather::hour.name)),
            FillOutputField.value(Weather::temperature.name, "23.6C"),
            FillOutputField.linear(Weather::air_pressure.name)
        )
    )
)
resultsFlow.collect { println(it) }
The preceding code prints output that resembles the following:
Weather(id=6308a..., hour=1, temperature=23C, air_pressure=29.74)
Weather(id=6308b..., hour=2, temperature=23.5C, air_pressure=29.75)
Weather(id=6308c..., hour=3, temperature=23.6C, air_pressure=29.76)
See the fill package API documentation for more information.
Atlas Full-Text Search
Use the search()
method to create a $search
pipeline stage that specifies a full-text search of one or more fields.
Tip
Only Available on Atlas for MongoDB v4.2 and later
This aggregation pipeline operator is only available for collections hosted on MongoDB Atlas clusters running v4.2 or later that are covered by an Atlas search index. Learn more about the required setup and the functionality of this operator from the Atlas Search documentation.
The following example creates a pipeline stage that searches the title
field in the movies
collection for text that contains the word "Future":
Aggregates.search( SearchOperator.text( SearchPath.fieldPath(Movie::title.name), "Future" ), SearchOptions.searchOptions().index("title") )
Learn more about the builders from the search package API documentation.
Atlas Search Metadata
Use the searchMeta()
method to create a
$searchMeta
pipeline stage which returns only the metadata part of the results from
Atlas full-text search queries.
Tip
Only Available on Atlas for MongoDB v4.4.11 and later
This aggregation pipeline operator is only available on MongoDB Atlas clusters running v4.4.11 and later. For a detailed list of version availability, see the MongoDB Atlas documentation on $searchMeta.
The following example shows the count
metadata for an Atlas search
aggregation stage:
Aggregates.searchMeta( SearchOperator.near(1985, 2, SearchPath.fieldPath(Movie::year.name)), SearchOptions.searchOptions().index("year") )
Learn more about this helper from the searchMeta() API documentation.
Atlas Vector Search
Important
To learn about which versions of MongoDB Atlas support this feature, see Limitations in the Atlas documentation.
Use the vectorSearch()
method to create a $vectorSearch
pipeline stage that specifies a semantic search. A semantic search is
a type of search that locates pieces of information that are similar in meaning.
To use this feature when performing an aggregation on a collection, you must create a vector search index and index your vector embeddings. To learn how to set up search indexes in MongoDB Atlas, see How to Index Vector Embeddings for Vector Search in the Atlas documentation.
The example in this section uses data modeled with the following Kotlin data class:
data class MovieAlt( val title: String, val year: Int, val plot: String, val plotEmbedding: List<Double> )
This example shows how to build an aggregation pipeline that uses the vectorSearch() method to perform a vector search with the following specifications:
- Searches plotEmbedding field values by using vector embeddings of a string value
- Uses the mflix_movies_embedding_index vector search index
- Considers up to 2 nearest neighbors
- Returns 1 document
- Filters for documents in which the year value is at least 2016
Aggregates.vectorSearch( SearchPath.fieldPath(MovieAlt::plotEmbedding.name), listOf(-0.0072121937, -0.030757688, -0.012945653), "mflix_movies_embedding_index", 2.toLong(), 1.toLong(), vectorSearchOptions().filter(Filters.gte(MovieAlt::year.name, 2016)) )
To learn more about this helper, see the vectorSearch() API documentation.