I have an application in which I am using MongoDB Atlas . I have more than 1M documents in a single collection and I am trying to do some aggregations, my document structure looks like this:
One of My aggregation query looks like this:
/**
* Get top cities.
* @param queryText Search text filter.
* @param limit Return that many results.
* @param language Language filter, in language code format (ex. "fr" for French)
* @return {*} Promise of a list of cities and their total count.
*/
getTopCities = (queryText, limit, language) => {
try {
return this.getMostImportantList(queryText, limit, "$city", language);
} catch (err) {
throw new InternalError(err.message);
}
};
/**
* Get most important list of the given property:
* Find insights matching the query text, filter them by the specified language, deconstruct the array field of the given property,
* group and count the output, sort them in descending order, and return the desired number.
* @param queryText Search text filter.
* @param limit Return that many results.
* @param property Name of the field, in the form "$fieldName".
* @param language Language filter, in language code format (ex. "fr" for French)
* @return {*} Promise of a list of the given property and its total count.
*/
getMostImportantList = (queryText, limit, property, language) => {
// Filter insights matching the specified language
const mostImportantListPipeline = [{ $match: { language: language } }];
// Deconstruct the field, if it is an array field.
mostImportantListPipeline.push({
$unwind: property,
});
// Case-insensitive grouping and count.
mostImportantListPipeline.push({
$group: {
_id: { $toLower: property },
firstValue: { $first: property }, // Return one of the group values without converting it to lowercase (the first one), because we need to keep some capitals (ex. C++ Programming)
totalCount: { $sum: 1 },
},
});
// Sort the result-set in descending order.
mostImportantListPipeline.push({
$sort: {
totalCount: -1,
},
});
// Number of items returned.
mostImportantListPipeline.push({ $limit: limit });
// Filter insights matching the query text.
this.addSearchPipelineStage(mostImportantListPipeline, queryText);
return this.insightModel.aggregate(mostImportantListPipeline);
};
/**
* Add Search stage to an aggregation pipeline.
* @param queryText Search text filter.
* @param pipeline Aggregation pipeline.
*/
addSearchPipelineStage = (pipeline, queryText) => {
// If queryText is defined, add it to the pipeline.
if (queryText.length > 0) {
pipeline.unshift({
$search: {
index: "insights-index",
phrase: {
query: queryText,
path: {
wildcard: "*",
},
},
},
});
}
};
This query takes more than 20 seconds to execute. I wonder if there is any way to optimize it?
I have dynamic search index “insights-index” set up.
Please can you let me know if there is anything I can do to make this work faster?