We are using MongoDB to store analytics records for our application. One collection holds a huge amount of data (~200M documents). A reporting feature runs an aggregation over it that takes an extremely long time (~60 seconds). We tried adding a couple of indexes so that the query becomes a covered projection (PROJECTION_COVERED), but that did not reduce the time. If there is anything we can do to improve the performance, I would really appreciate it.
Schema:

```js
{
  _id: ObjectId,
  agent: String,
  organisation: String,
  session: String,
  timestamp: Date,
  text: String,
  ...
}
```
Indexes:

| Name | Type | Size | Usage | Properties |
|---|---|---|---|---|
| _id | Regular | 89 MB | 4 since Thu Dec | Unique |
| agent_organisation | Regular | 90 MB | 9 since Thu Dec | Compound |
| agent_organisation_session_timestamp | Regular | 180 MB | 14 since Thu Dec | Compound |
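For reference, the compound index the query uses was created roughly like this (a sketch reconstructed from the key pattern in the explain output; the explicit `name` option is an assumption):

```js
// Sketch of the compound index creation (index name/options assumed).
db.analytics.createIndex(
  { agent: 1, organisation: 1, session: 1, timestamp: 1 },
  { name: "agent_organisation_session_timestamp" }
)
```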
Pipeline:

```js
db.getCollection("analytics").aggregate([
  {
    $match: {
      agent: "6194df628be5f9a484979dee",
      organisation: "61387a6d02cce479622bff92"
    }
  },
  {
    $group: {
      _id: { session: "$session" },
      start: { $min: "$timestamp" },
      end: { $max: "$timestamp" },
      count: { $sum: 1 }
    }
  },
  {
    $project: {
      _id: 0,
      session: "$_id.session",
      start: 1,
      end: 1,
      count: 1
    }
  },
  { $sort: { count: -1 } },
  { $limit: 1000 }
]);
Explain Result:

```js
...
...
"queryPlanner" : {
    "plannerVersion" : 1.0,
    "namespace" : "analytics.analytics",
    "indexFilterSet" : false,
    "parsedQuery" : {
        "$and" : [
            { "organisation" : { "$eq" : "61387a6d02cce479622bff92" } },
            { "agent" : { "$eq" : "6194df628be5f9a484979dee" } }
        ]
    },
    "queryHash" : "7D53FE85",
    "planCacheKey" : "E7A83797",
    "winningPlan" : {
        "stage" : "PROJECTION_COVERED",
        "transformBy" : {
            "session" : 1.0,
            "timestamp" : 1.0,
            "_id" : 0.0
        },
        "inputStage" : {
            "stage" : "IXSCAN",
            "keyPattern" : {
                "agent" : 1.0,
                "organisation" : 1.0,
                "session" : 1.0,
                "timestamp" : 1.0
            },
            "indexName" : "agent_organisation_session_timestamp",
            ...
...
...
```
Is there anything we could do to improve the performance of the query?