Assessing the optimality of using Aggregation Framework to populate references

Joao_Santos2 · November 18, 2024, 1:35am

Motivation

Reference population is commonplace in web development
Traditional MongoDB ODMs use a “query each of the references, then build the final object in JS” approach
It is possible to deep populate references by sending a single aggregation pipeline payload to the MongoDB server
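
To make the last point concrete, here is a minimal sketch of how such a pipeline could be assembled. It is not the pipeline Aeria generates (that one is in the linked repository); the `Ref` type, the `buildPopulateStages` helper, and the collection and field names are all hypothetical and exist only for illustration.

```typescript
// Hypothetical description of a reference field (not taken from the benchmark repository).
type Ref = {
  field: string;      // field that stores the referenced _id
  collection: string; // collection the _id points to
  refs?: Ref[];       // references nested inside the referenced document
};

type Stage = Record<string, unknown>;

// Builds one $lookup + $unwind pair per reference, recursing into nested references.
function buildPopulateStages(refs: Ref[]): Stage[] {
  const stages: Stage[] = [];
  for (const ref of refs) {
    const lookup: Stage = {
      from: ref.collection,
      localField: ref.field,
      foreignField: "_id",
      as: ref.field,
    };
    if (ref.refs && ref.refs.length > 0) {
      // Combining localField/foreignField with a sub-pipeline needs MongoDB 5.0+.
      // The sub-pipeline runs on the joined documents, so nested references
      // are resolved in the same round trip.
      lookup.pipeline = buildPopulateStages(ref.refs);
    }
    stages.push({ $lookup: lookup });
    // $lookup yields an array; unwind it into a single embedded document.
    stages.push({
      $unwind: { path: "$" + ref.field, preserveNullAndEmptyArrays: true },
    });
  }
  return stages;
}

// Assumed schema: posts.author -> users, users.company -> companies.
const pipeline = buildPopulateStages([
  {
    field: "author",
    collection: "users",
    refs: [{ field: "company", collection: "companies" }],
  },
]);
```

With the official Node.js driver, the resulting array could be passed as `db.collection("posts").aggregate(pipeline).toArray()`, so the whole population happens in one request.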

Methods

A GitHub CI runner was used for each test
The MongoDB version used was 8.0.3 (the latest at the time)
Documents containing no references, a single reference, and a nested reference were first inserted; then the time to retrieve the documents N times was recorded
No caching mechanisms were used except those of MongoDB itself
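
For readers who want to reproduce the measurement, the retrieval timing could be sketched roughly as follows. This is an assumption about the shape of the harness, not the code from the repository: the `timeRuns` name, the sequential execution, and the millisecond unit are all choices made for this example.

```typescript
// Runs `task` sequentially `n` times and returns the total elapsed
// wall-clock time in milliseconds (unit and sequencing are assumptions).
async function timeRuns(
  n: number,
  task: () => Promise<unknown>,
): Promise<number> {
  const start = Date.now();
  for (let i = 0; i < n; i++) {
    await task();
  }
  return Date.now() - start;
}

// Usage idea (findDeep is a hypothetical function that performs one retrieval):
//   const elapsed = await timeRuns(1000, () => findDeep());
```

Running the same `timeRuns` call once per library and per document shape (no references, single reference, nested reference) gives one number per cell of the results.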

Results

Prisma was the slowest
Mongoose with mongoose-autopopulate was slightly slower to retrieve plain and nested references than Mongoose with the .populate() method
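Aeria (which leverages the Aggregation Framework) was the fastest whenever references had to be populated: on nested references, Mongoose took roughly 2.1x as long and Prisma roughly 2.7x as long (the 213% and 272% figures). With no references, Mongoose with mongoose-autopopulate was marginally faster than Aeria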
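
The 213% and 272% figures can be rechecked from the raw results in this post. The post does not say which columns were compared, so the pairing below is an assumption: it uses the deep-reference numbers for "aeria (bypassSecurity)", "mongoose (autopopulate)", and "prisma", which happen to reproduce those figures.

```typescript
// Deep-reference numbers copied from the results in this post.
const deep = {
  aeria: 2683,    // "aeria (bypassSecurity)"
  mongoose: 5702, // "mongoose (autopopulate)"
  prisma: 7296,
};

// Ratio of another library's time to the baseline time, rounded to two decimals.
function ratio(baseline: number, other: number): number {
  return Math.round((other / baseline) * 100) / 100;
}

const vsMongoose = ratio(deep.aeria, deep.mongoose); // 2.13
const vsPrisma = ratio(deep.aeria, deep.prisma);     // 2.72
```

Read this way, the percentages are time ratios (Mongoose took 213% of Aeria's time) rather than a 213% increase over Mongoose's throughput.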

Conclusion

The Aggregation Framework is the optimal way to deep populate references.

{
  "mongodbVersion": "8.0.3",
  "results": {
    "aeria (bypassSecurity)": {
      "norefs": 1722,
      "plain": 2226,
      "deep": 2683
    },
    "aeria (default)": {
      "norefs": 1996,
      "plain": 2327,
      "deep": 2699
    },
    "mongoose (autopopulate)": {
      "norefs": 1672,
      "plain": 3871,
      "deep": 5702
    },
    "mongoose (populate method)": {
      "norefs": 1924,
      "plain": 3720,
      "deep": 5612
    },
    "prisma": {
      "norefs": 3106,
      "plain": 5208,
      "deep": 7296
    }
  }
}

Link to the repository (contains example aggregation pipeline): GitHub - aeria-org/benchmark

Opinions will be highly appreciated.

Joao_Santos2 · November 18, 2024, 1:38am

PS: I have yet to check whether there are any slowdowns when the references are nested too deeply, or whether the Aggregation Framework stays linearly faster than multiple queries and JS hooks.