Hi there,
I’ve been working a lot with mongodb, but there is one question I’ve never really got an answer for. I have a lot of data in one collection (+10 million docs), where each day thousands of docs are added. In my aggregation I want to filter all documents in a specific timeframe, mostly per month. My first stage in the pipeline is a match query where I reduce to amount of documents to around 50-100 thousand. With an index this is extremly fast, however the next two stages are always eather unwind or groupby stages with multiple attributes. This process then takes at least 3-10 seconds to return the desired result.
My question now is, is there any way to speed up the group_by or unwind stage, when considering lots of documents. Or is there any technique to faster summarize nummerical values. Besides, we are currently using an M30 cluster. Does this also have a major impact on the pipeline duration? Any advice would help me enormously.