New Lookup Fields: Build Charts with Data from Multiple Collections

David Macias

#Charts

We’re excited to announce the new Lookup Field feature in MongoDB Charts! The new Lookup Field will allow you to join data from two collections, or data sources, within the Chart builder and create a single chart with just a few clicks. While the power to do lookups has always existed in Charts, it could only be done in a data source pipeline using code (MQL). With the Lookup Field feature, you can do it at the individual chart level and without any knowledge of MQL.

Finding deeper insights by leveraging more data in a single data visualization

While the flexibility of the document model can reduce the need for complex data architectures that include many tables of data that reference each other, there is often still the need to have data live in multiple collections. It’s likely that as your projects or companies grow, so will your need for data organized in separate collections.

It might be that one collection in your database houses your primary user data while another collection has activity data that references what user was completing the activity. In fact, you could have multiple collections of data specific for several different apps, sites, or other data sources. Thus, the scenario could arise where you want to see activity for different microservices or apps, but at an aggregated view.

Whatever your purpose for viewing data from multiple sources in a single view, the path to that single view is now easier within Charts. Data visualizations are only as good as the data that constructs them, and the new Lookup Field enables access to more data with less effort, ultimately empowering more robust insights.

The power of the MongoDB aggregation framework with or without coding and queries

Even if you consider yourself a MongoDB Query Language (MQL) power user, the new Lookup Fields will make it easier, quicker, and more intuitive to bring fields from a second data source into the chart builder.

To add the fields from a second collection, follow these steps: (It can literally be as few as five clicks)

  1. Click on the ‘...’ of the common field between your two collections
  2. Click on ‘Lookup field’
  3. Select the ‘Remote data source’ you want to pull in data from
  4. Select the ‘Remote field’ that is common between your two data sources
  5. Click ‘Save’
Lookup Field in Another Data Source

Note that you can change the field name for how it’s displayed in the chart builder. The lookup field will show with a binoculars icon in the field panel with the data from the second collection as sub-fields. If you choose the “Return all matches as an array”, you will be able to view all the matching documents from the second collection as an array under the main italicized lookup field, and you can perform all built-in array reductions to visualize the data as you wish.

Seeing the Lookup Field in action with YouTube data

Let’s look at how the new Lookup Field feature helps with a real world data problem. I uploaded two datasets of YouTube data, U.S. videos and U.S. comments (from Kaggle). The problem I want to solve, or better, the insight I want to find, is what YouTube channels create the most amount of engagement within the comments. I know you’re probably thinking who wants to dive into the comments of any social media platform, but thankfully I just want to see the amount of discussion, not what is actually being said. Ultimately, I want to quickly see which channels have the most replies of comments, not just the most comments of the original videos.

  • The “USvideos” dataset includes data about the video including the channel title it belongs to, how many views, likes, dislikes, and total comments each video has received.

  • The “UScomments” dataset includes documents per comment that include data of how many likes and replies each comment received, and what video that comment belongs to.

  • This is a parent/child relationship in that the “parent” data, or the video collection, is referred to by an ID in the comment collection, or “child” data. The ID, or key, that connects the two collections is a video_id. You can see in the image below how the video_id enables a join between documents in the two USvideos and UScomments collections.

LookupGif

Once I was in the chart builder and had the “USvideos” data source connected, I could see all the fields per video. As seen below I then added the Lookup Field on the “video_ID” to connect to the “UScomments” data source and bring in the comment data per video.

LookupGif

Quick aside, it might be that your collections are quite large and that you need to create an index so that the Lookup Field can be created quicker and doesn’t time out. I created an index {“video_id”:”1”} on the UScomments collection within MongoDB Atlas to account for this.

I was then able to create the following chart just by dragging and dropping videos.channel_title, comments.replies fields into the chart builder. I had to unwind the replies fields, since it is a part of the collection added.

There I have it! With just a few clicks, I was able to join two collections and visualize fields from each dataset in a single chart. I limited the results to the top 10 channels, and I can see which channels’ comments get the most replies.

Now it’s your turn to explore the Charts sample dashboard. If you’re new to MongoDB, you can kick off with Charts for free, simply by signing up for MongoDB Cloud, deploying a free Atlas cluster, and activating Charts. We love receiving feedback, so if you have suggestions on how to improve anything in Charts, use the MongoDB Feedback Engine. We use this feedback to help improve Charts, and figure out where users need help.