Thoughts on our new feature, $lookup

Eliot Horowitz


Editors note: the content in this blog was revised in a follow-up post, Revisiting $lookup.

One MongoDB 3.2 feature we demoed at MongoDB World in June was $lookup. $lookup is an aggregation pipeline stage that lets you insert data from a different collection into your pipeline. This is effectively a left outer join. $lookup will only be included in MongoDB Enterprise Server, and we’d like to give some background as to why and how we have decided this.

When we decided that MongoDB would be a document-based database, we did so for pragmatic technical reasons. The relational database model has several theoretical merits, but also many real-world limitations, chiefly the need to join data together across tables. Joins impede performance, inhibit scaling, and introduce substantial technical and cognitive overhead into all but textbook examples. In the real world, people adopt complicated workarounds in order to get good performance for modern applications: any large, performant, RDBMS-backed application will incorporate ad hoc denormalization, materialized views, external caching layers, and so on.

It was these limitations that motivated us to look for a data model that would both benefit developers building applications and also perform well at scale. We believe that documents are a natural, general-purpose data model that suits most applications people want to create today, and that, with careful schema design, allows for applications that are simultaneously more performant, more easily understood, and more scalable than applications that must rely on frequent joins.

Using a document model does not mean that joins aren't useful. There's no question that when data is being used for reporting, business intelligence, and analytics, joins are effective. Many of our users, having built fast, flexible, and feature-rich applications with MongoDB, tell us they now have so much application data in MongoDB that they'd like to use their existing BI tools to derive more business-value from it.

The full integration with these tools is not limited to $lookup, but also includes MongoDB Connector for BI, which translates SQL queries into Mongo Query Language queries. Our integration with BI and visualization tools can add tremendous value to companies looking for insights into their data, but it requires an outsized amount of time and testing to ensure smooth interoperation with these other costly products. As such, it is offered as part of our MongoDB Enterprise Advanced subscription. Our philosophy continues to be that documents are the right model for application development.

If you'd like to try out this feature, it's available in the 3.1.8 Enterprise Server build.

About the Author - Eliot Horowitz

Eliot Horowitz is CTO and Co-Founder of MongoDB. Eliot is one of the core MongoDB kernel committers. Previously, he was Co-Founder and CTO of ShopWiki. Eliot developed the crawling and data extraction algorithm that is the core of its innovative technology. He has quickly become one of Silicon Alley's up and coming entrepreneurs and was selected as one of BusinessWeek's Top 25 Entrepreneurs Under Age 25 nationwide in 2006. Earlier, Eliot was a software developer in the R&D group at DoubleClick (acquired by Google for $3.1 billion). Eliot received a BS in Computer Science from Brown University.