Release date for Mongo 4.4?

Hi All

Specifically, we would really appreciate the fix for aggregate queries with sort running very slowly (SERVER-7568). The status is ‘Resolved’ against version 4.3.1, so I presume that this means it will be released as part of Mongo version 4.3.

Hence the question: do we have a date on the road map for the release of 4.3?

Welcome to the community Simon!

MongoDB 4.3 is the development release series leading up to the 4.4 production release (see MongoDB Versioning). You can download development/unstable releases for testing, but we do not recommend running them in a production environment as they will include work in progress and have not been thoroughly tested yet.

MongoDB 4.4 is currently in the “release candidate” stage of testing, where there will be several 4.4.0 release candidates available before a final Generally Available (GA) release. Release candidates are also for testing purposes only, but you could test in a development/staging environment to confirm if your issue will be resolved.

The specific issue you have highlighted is not a general fix for “sorts running very slowly”, so I recommend starting a new topic to explore why your aggregation pipelines are slow.

It would be helpful to include:

  • your current MongoDB server version
  • your deployment type (standalone, replica set, or sharded cluster)
  • aggregation pipeline and explain() output


I’ll see if I can’t experiment with the rc dev version.

Hi Stennie

So it’s release 4.4 I’m after. I’m not after a fix for general sorts running very slowly, sorts on find work perfectly well. The bug I mentioned is related to aggregate queries not selecting the index to sort on correctly which results in the entire collection being reloaded (AIUI) and explains the problems we are having completely. I really just need to know when the next release of Mongo is scheduled, I assume the referenced bugs will be addressed in it.

We have implemented a work-around for now as recommended by a bug linked to 7568 by replacing our aggregate call with find but this is not going to be a long term viable option without a lot of code upheaval which I would rather not have to do.

The issue manifests itself (at least for us) working on a collection of 34,000,000 documents. Weirdly if we run an aggregate query that returns many documents (8,782,333 to be precise) the query completes in approx 8 seconds. But if we then run the same query against a different objectId such that it returns a very small number of documents (27) then it takes 2 minutes to return:

So we have a collection containing documents that reference other objects in an array of ObjectIds e.g.:

    groups: [

and we wish to run a query that returns every document that has a specific ObjectId in the groups array:

db.collection.aggregate([ { $match : { $and: [ { groups: ObjectId("5ce283422ab79c000f9040f5") } ] } }, {$sort: {_id: 1}}  ] )

8,782,333 documents
7.92 seconds

db.collection.aggregate([ { $match : { $and: [ { groups: ObjectId("5e9d01c5a5db2000075764fe") } ] } }, {$sort: {_id: 1}}  ] )

27 documents
127 seconds

Now I know that this is not a generic sort issue as the following equivalent find queries demonstrate:

db.collection.find({ $and: [ {groups: ObjectId("5ce283422ab79c000f9040f5")} ]}).sort({_id: 1})

8,782,333 documents
5.13 seconds

db.collection.find({ $and: [ {groups: ObjectId("5e9d01c5a5db2000075764fe")} ]}).sort({_id: 1})

27 documents
0.18 seconds

127 seconds down to 0.18 seconds replacing the aggregate with the find and both using sort.

MongoDB: 4.2 Atlas cluster M30 tier, replica set, not sharded (yet)

‘Explain’ output for the 2 aggregate queries, I could see no discernible difference between the 2

Experimenting with the 4.3.1 release candidate locally shows that this issue of the aggregate -> sort query running slowly is fixed in the RC.

Do we have any idea when 4.4 will be available on General Release? Are we talking days, weeks, months, years? Any insight you can provide would be very helpful to us, otherwise we are going to have to look into either re-working our queries or look at other solutions.

Neither of which fills me with joy.

While I don’t know the timing of the releases, the 4.4 builds are currently on RC2 and there are only about a dozen items left on their JIRA marked as 4.4 Required, so I would assume that the release is only a few months off, at most. That is purely a guess, however, based on what I see.

I don’t think that MongoDB provides estimated release dates either, but maybe someone on the inside will come by later and give a more official answer.

Hi Simon,

Since MongoDB 4.4 is a major release which introduces new features and compatibility changes, there is an extended testing and release candidate phase. Release candidates are considered feature complete and ready for testing, with perhaps some polish left on new features or documentation. Any testing or feedback provided on release candidates is extremely helpful.

The release date is generally determined based on when the release is ready (for example, documentation complete and no known blocking issues). As Doug mentioned, we are currently at rc2 which is the third release candidate (release candidates start at rc0). As a general timescale guideline, the expectation for GA will be weeks or months from now rather than days or years.

For more details on what has changed, please see the Release Notes for MongoDB 4.4.

The specific issue you are focused on is about aggregation preferring indexes that can be used for sort (avoiding a blocking in-memory sort which may potentially fail). If you compare the explain output for equivalent find() versus aggregate() queries you will likely find that aggregation is choosing the _id index (or one which supports your sort criteria) over an index like {groups:1} (which is more selective but does not support the sort criteria). If you are able to share the explain output, someone might be able to provide a more informed suggestion.
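As a rough illustration of the comparison described above, the following sketch pulls the index names out of an explain document so the find() and aggregate() plans can be compared side by side. The helper names `winningPlan` and `indexesUsed` are hypothetical, and the explain shapes handled here (a top-level `queryPlanner` for find(), a `$cursor` stage for aggregate()) are assumptions that vary between server versions, so treat it as a starting point rather than a complete parser.

```javascript
// Locate the winning plan in an explain document.
// Assumption: find() explain has a top-level queryPlanner, while aggregate()
// explain may nest it under a $cursor stage. Other shapes are not handled.
function winningPlan(explainOutput) {
  if (explainOutput.queryPlanner) {
    return explainOutput.queryPlanner.winningPlan;
  }
  const cursorStage = (explainOutput.stages || []).find((s) => s.$cursor);
  return cursorStage ? cursorStage.$cursor.queryPlanner.winningPlan : null;
}

// Recursively collect the index names of every IXSCAN stage in a plan tree.
function indexesUsed(plan, found = []) {
  if (!plan || typeof plan !== "object") {
    return found;
  }
  if (plan.stage === "IXSCAN" && plan.indexName) {
    found.push(plan.indexName);
  }
  if (plan.inputStage) {
    indexesUsed(plan.inputStage, found);
  }
  (plan.inputStages || []).forEach((child) => indexesUsed(child, found));
  return found;
}
```

In the mongo shell this could be called as `indexesUsed(winningPlan(db.collection.explain().aggregate(pipeline)))` and again with the equivalent find() explain; differing index names would point to the plan selection described above.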

Possible workarounds include:

  • Add a compound index to support both your match and sort criteria (for example, on {groups: 1, _id: 1}).
  • Hint the index to be used. The workaround on SERVER-21471 was from 2015, before aggregation hints were added in MongoDB 3.6.
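
The two workarounds listed above might look roughly like this. It is a sketch under assumptions: `coll` stands in for a collection handle such as `db.collection`, the function names are hypothetical, and the single-clause `$and` from the original query is simplified to a plain match, which is equivalent.

```javascript
// Build the match-then-sort pipeline from the thread for one group value.
// In the mongo shell the value would be an ObjectId("...").
function buildPipeline(groupValue) {
  return [
    { $match: { groups: groupValue } },
    { $sort: { _id: 1 } },
  ];
}

// Workaround 1: a compound index with the match field first and the sort
// field second, so one index can serve both stages.
function createMatchSortIndex(coll) {
  return coll.createIndex({ groups: 1, _id: 1 });
}

// Workaround 2: pass an index hint to aggregate() (available since 3.6).
// indexHint is a key pattern such as { groups: 1 } or an index name.
function aggregateWithHint(coll, groupValue, indexHint) {
  return coll.aggregate(buildPipeline(groupValue), { hint: indexHint });
}
```

For example, `aggregateWithHint(db.collection, ObjectId("5e9d01c5a5db2000075764fe"), { groups: 1 })` would hint the more selective index for the small result set; whether that helps for large result sets depends on the data.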

Your aggregation pipeline may include more stages than the redacted example, but if it can be replaced by a find() query that would also be a straightforward solution.



@Doug_Duncan @Stennie
Thanks both for your replies. We have implemented the ‘find’ work-around presently to the queries against the collection that is currently affected by this issue. I am trying to pre-empt the approaching tsunami when others grow to the point that they too will be affected. We really would prefer to not have to re-write the whole back-end to use find rather than aggregate.

We could add compound indexes to all of the collections, but we have requirements where queries can be written to sort on many different fields (never more than one at a time, I might add), so we would need a lot of compound indexes.

We have experimented with using hint as suggested by the bug reports, but we have found that all that seems to occur is the hinted index is applied to both the match and sort phases, with the net effect that the sort is carried out as a non-index sort. This works well on small result sets and fixes the immediate problem, but it breaks queries that return large result sets: they need the allowDiskUse option set to true and run more slowly than the problem we are trying to fix in the first place.

So if we hint the index that satisfies the match on a query:

  • for a small result set, performance is good
  • for a large result set, allowDiskUse: true is required and performance is terrible

If we hint the index that satisfies the sort:

  • for a small result set, performance is terrible
  • for a large result set, performance is good

So we would need compound indexes to keep both situations happy.
