
Ingesting and Visualizing API Data with Stitch and Charts

Introduction

We are living in a world full of APIs - one where any data about your devices, your community and even yourself is just an HTTP call away. But while the data returned from APIs can be incredibly useful, it is often just a point-in-time snapshot that makes it impossible to get a complete picture of what’s going on. To get the insight you need, a useful pattern is to ingest the data from the API into a database and then analyze and visualize the data to understand how it changes over time. Since most API data is represented in JSON, MongoDB is the ideal database for this data. This post describes how to complete the story using MongoDB Stitch for ingestion and MongoDB Charts for visualization.

The Pattern

The pattern we are going to use is pretty straightforward, and could be implemented with many different technologies:

  • A Scheduler kicks off the process at a regular interval (as warranted by the problem space), such as every hour
  • An Ingestion Function queries the API, retrieves (and perhaps transforms) the data and saves it in a database
  • The Database stores the latest and historical data retrieved from the API
  • The Visualization Tool queries the database (leveraging appropriate filters) and shows charts of the data and how it changes over time
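To make the four roles concrete, here is a minimal sketch of the pattern in plain JavaScript. Everything in it is a stand-in invented for illustration: `fakeApi` plays the API, an array plays the database, a loop plays the scheduler, and `history` plays the query a visualization tool might run. It is not tied to any MongoDB product.

```javascript
// Stand-in API: returns a new point-in-time reading on every call.
let callCount = 0;
function fakeApi() {
  const reading = { dt: 1530000000 + 3600 * callCount, temp: 20 + callCount };
  callCount += 1;
  return reading;
}

// Stand-in database: an in-memory array of documents.
const database = [];

// Ingestion function: query the API, transform the result, save it.
function ingest(api, db) {
  const doc = api();
  doc.observationDate = new Date(doc.dt * 1000); // seconds -> Date
  db.push(doc);
  return doc;
}

// Stand-in scheduler: runs the job a fixed number of times.
// A real scheduler would run it at a regular interval instead.
function runSchedule(job, times) {
  for (let i = 0; i < times; i++) job();
}

// Visualization query: filter by date and return points ordered by time.
function history(db, since) {
  return db
    .filter(doc => doc.observationDate >= since)
    .sort((a, b) => a.observationDate - b.observationDate)
    .map(doc => ({ x: doc.observationDate, y: doc.temp }));
}

runSchedule(() => ingest(fakeApi, database), 3);
const points = history(database, new Date(0));
console.log(points.length + ' points stored');
```

The value of the pattern shows up in `history`: a single API call only ever returns the latest reading, while the database can answer questions about every reading collected so far.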

Implementation using the MongoDB stack

Since this is the MongoDB Blog, you won’t be surprised to hear that we’re storing the data in MongoDB. Given that the data from modern APIs is usually represented in JSON, you can store it directly without needing to transform it or losing the richness of the original structure. But what about the other components of the solution?

The Ingestion Function runs infrequently, so it’s well-suited for a serverless compute platform. You could definitely use a platform like AWS Lambda or Azure Functions for this task, but since the goal is to write to MongoDB, the quickest path will be to use MongoDB Stitch. This gives you simple, rule-based access to MongoDB collections without needing to worry about connection URIs, drivers or package management.

Stitch does not currently have a built-in scheduler. However, Stitch allows you to expose a function as an HTTP Webhook, allowing it to be easily called by a scheduler of your choice.

Finally, to query, filter and visualize the data we are going to use MongoDB Charts beta. This tool lets you quickly and easily create charts of data stored in MongoDB, without needing to flatten the schema or convert the data.

Tutorial: Weather Data

As a simple example, let’s build a solution that ingests and visualizes data from a weather API - we’ll use OpenWeatherMap which is free for basic scenarios (although you’ll still need to sign up for an account to get an App ID to call the API). Our goal is to build our own personal weather dashboard, like this:

Setting up the Stitch App

First, make sure you have a MongoDB Atlas cluster (the free tier will work just fine), and create a Stitch App against that cluster.

Next, we’ll add a Stitch Value that contains some metadata needed for calling the API. This is a JSON document that specifies which city we want weather readings from (I’m using “Sydney, AU”) and the OpenWeatherMap App ID to use when calling the API (available from their website after you register for an account).
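As a rough sketch, the value might look like the object below. The field names `city` and `appId` match what the webhook function later in this post reads; the App ID shown is a placeholder, not a real key. The `buildWeatherUrl` helper is hypothetical and runs in plain JavaScript outside Stitch; it adds URL encoding, which the function in this post does not do, so that the comma and space in the city name are safe in a query string.

```javascript
// Assumed shape of the "weatherApiInfo" Stitch Value (placeholder App ID).
const weatherApiInfo = {
  city: 'Sydney, AU',
  appId: 'YOUR_APP_ID'
};

// Hypothetical helper: builds the OpenWeatherMap request URL from the value,
// encoding each parameter so special characters do not break the query string.
function buildWeatherUrl(info) {
  const city = encodeURIComponent(info.city);
  const appId = encodeURIComponent(info.appId);
  return `http://api.openweathermap.org/data/2.5/weather?q=${city}&units=metric&appid=${appId}`;
}

console.log(buildWeatherUrl(weatherApiInfo));
```

Keeping these settings in a value rather than in the function body means you can change the city or rotate the App ID without editing code.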

Because we’re dealing with an HTTP API, we need to create a Stitch HTTP Service which is responsible for making the requests. We’ll call our service “http” and we’ll create a rule allowing it to perform a GET request against any URL.

Finally, we’ll add an Incoming Webhook to the service, which is a function that can be called from outside of Stitch. We’ll call our webhook “ingestWeatherData”, configure it to run as the System user, make it invokable via an HTTP POST, and use a secret provided as a query string parameter. Don’t forget to set your secret to a nice random value.

Once you’ve configured your webhook, you need to provide the code that runs each time the webhook is invoked. This function calls the OpenWeatherMap API with the parameters from our Stitch Value. Once it receives a result, it adds a new observationDate field to the document (derived from the dt field, but with a proper Date data type), and then stores the updated document to the weather.observations collection within our MongoDB Atlas cluster.


exports = function(payload) {
  // 'http' is the Stitch HTTP Service and 'weatherApiInfo' the Stitch Value created above
  const httpService = context.services.get('http');
  const weatherApiInfo = context.values.get('weatherApiInfo');
  const url = `http://api.openweathermap.org/data/2.5/weather?q=${weatherApiInfo.city}&units=metric&appid=${weatherApiInfo.appId}`;
  console.log("Fetching " + url);
  return httpService.get({ url: url }).then(response => {
    const json = JSON.parse(response.body.text());
    // dt is in seconds since the Unix epoch, while Date expects milliseconds
    json.observationDate = new Date(json.dt * 1000);

    const collection = context.services.get('mongodb-atlas').db('weather').collection('observations');
    // Return the insert promise so the webhook waits for the write to complete
    return collection.insertOne(json);
  }).then(() => {
    console.log('Inserted document!');
  });
};
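To see what the transformation step does on its own, the sketch below applies it to a trimmed, made-up response body in plain JavaScript, outside Stitch. The sample values are invented and real OpenWeatherMap responses contain more fields; only `dt` and the `main` sub-document with `temp`, `temp_min` and `temp_max` are referred to in this post.

```javascript
// Invented sample body; a real API response has many more fields.
const body = '{"name":"Sydney","dt":1530000000,"main":{"temp":14.2,"temp_min":13,"temp_max":15}}';

// Same transformation as the webhook: parse the JSON and add a real Date.
function toObservation(text) {
  const json = JSON.parse(text);
  json.observationDate = new Date(json.dt * 1000); // seconds -> milliseconds
  return json;
}

const observation = toObservation(body);
console.log(observation.observationDate.toISOString()); // 2018-06-26T08:00:00.000Z
console.log(observation.main.temp);                      // 14.2
```

Storing `observationDate` as a proper Date rather than a number is what later lets Charts treat it as a time axis and lets date filters work.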


To schedule our ingestion function, we could use any scheduler we want, including one from a cloud provider like AWS or Azure, or a cron job on a VM. One simple free option is IFTTT, which lets you create “applets” with a trigger and an action. For this example, I used the “Every hour” trigger from the Date/Time category and the “Webhook” action. Whatever scheduler you decide to use, choose an HTTP POST and copy the URL from the Stitch webhook settings page, making sure you append “?secret=” followed by your webhook secret to the end of the URL.
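If you prefer to script the call yourself, the request a scheduler needs to make can be sketched as follows. The URL and secret below are placeholders (copy the real URL from the Stitch webhook settings page), and the helper names are hypothetical. The fetch function is passed in as a parameter so the sketch can be exercised with a stub; in an environment that provides a global `fetch`, you could pass that instead.

```javascript
// Placeholders only: substitute the URL and secret from your own Stitch app.
const webhookUrl = 'https://example.com/your-stitch-webhook-url';
const secret = 'replace-with-your-secret';

// Builds the POST request: the secret travels as a query string parameter.
function buildWebhookRequest(baseUrl, secretValue) {
  return {
    url: `${baseUrl}?secret=${encodeURIComponent(secretValue)}`,
    options: { method: 'POST' }
  };
}

// Sends the request using whichever fetch-like function is supplied.
async function triggerIngestion(fetchFn, baseUrl, secretValue) {
  const { url, options } = buildWebhookRequest(baseUrl, secretValue);
  const response = await fetchFn(url, options);
  return response.status;
}

// Stub that records the call instead of touching the network.
const calls = [];
const stubFetch = async (url, options) => {
  calls.push({ url, options });
  return { status: 200 };
};

triggerIngestion(stubFetch, webhookUrl, secret)
  .then(status => console.log('Webhook responded with status ' + status));
```

Because the secret is part of the URL, treat the full URL as a credential and avoid pasting it into logs or shared documents.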

Building your Dashboard

While you’re probably keen to get started on your dashboard, keep in mind that readings are only coming through once an hour, so you may want to wait a few days until you have enough data to visualize. While you’re waiting, you can download and configure MongoDB Charts beta, and get to know the tool by following the tutorials.

Once you’ve got a critical mass of data in your collection, it’s time to build your dashboard! In the screenshot earlier in this article, you can see I’ve created a number of different charts showing different views of the data. In this article we’ll go through how to create the top chart—after that, it’s up to you to replicate the other charts or come up with your own ideas.

In order to visualize data in Charts, you first need to create a Data Source—simply navigate to the Data Sources tab and hit the New Data Source button. On the first step of this dialog, you’ll need the connection URI (including the username and password) to your Atlas cluster. You can get this by clicking the Connect button in the Atlas web console. While you’re here, make sure you’ve configured your IP Whitelist (on the Atlas Security tab) to allow access to the cluster from your Charts server.

Back in Charts, once you’ve successfully connected to your cluster you can choose the `weather.observations` collection for your data source. Finally, you can decide if you want to share the Data Source with other Charts users, and your data source will be created.

Now we’ve got a data source, it’s time to create a dashboard. Go to the Dashboards tab, click New Dashboard and enter an appropriate name and description (“Weather” will do nicely!). You’ll now be presented with a blank canvas, ready to add your first chart.

Clicking Add Chart will take you to the Chart Builder. This is where all of the magic happens. Building a chart involves the following steps, although you may sometimes do things in a different order or iterate until you’re happy with the result:

  1. Choose a Data Source
  2. Choose a Chart Type
  3. Drag fields from the field panel onto the encoding channels
  4. Apply a filter, if desired
  5. Give your chart a title
  6. Save your chart to your dashboard

Selecting the Data Source is always the first step. Once you do this, Charts will sample the Data Source and show the resulting fields in the field panel at the left of the window. Note that the field panel also shows the types of each field, and identifies the nested documents and arrays in the document’s structure.
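To get a feel for what such a field list contains, the sketch below walks one sample document and reports each field path with a rough type. This is only an illustration of the idea, written for this article; it is not how Charts samples or infers types, and the document and type labels are invented.

```javascript
// Invented sample document, shaped like the observations in this post.
const sample = {
  name: 'Sydney',
  observationDate: new Date(1530000000 * 1000),
  main: { temp: 14.2, temp_min: 13, temp_max: 15 },
  weather: [{ description: 'clear sky' }]
};

// Recursively list field paths and a simple type label for each one.
function describeFields(doc, prefix = '') {
  const fields = [];
  for (const [key, value] of Object.entries(doc)) {
    const path = prefix + key;
    if (value instanceof Date) {
      fields.push({ path, type: 'date' });
    } else if (Array.isArray(value)) {
      fields.push({ path, type: 'array' });
    } else if (value !== null && typeof value === 'object') {
      fields.push({ path, type: 'document' });
      fields.push(...describeFields(value, path + '.'));
    } else {
      fields.push({ path, type: typeof value });
    }
  }
  return fields;
}

for (const field of describeFields(sample)) {
  console.log(field.path + ': ' + field.type);
}
```

The nested `main.temp` path in the output corresponds to expanding the main document in the field panel to reach its children.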

Next, we’ll choose our chart type. To show our graph of how the temperature changes over time, we’ll select the Line category and then choose the Continuous Line chart. A continuous chart shows a data point for each individual document from the collection, as opposed to the discrete chart which aggregates multiple points together when they share the same category value.
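The difference between the two chart types can be sketched with a few made-up documents. The continuous view keeps one point per document, while the discrete view combines documents that share a category value. The averaging used here is an assumption chosen for the illustration; the aggregation Charts applies depends on how you configure the chart.

```javascript
// Invented documents: two readings share the same category value.
const docs = [
  { hour: '08:00', temp: 14 },
  { hour: '08:00', temp: 16 },
  { hour: '09:00', temp: 18 }
];

// Continuous: one data point per document.
function continuousPoints(documents) {
  return documents.map(d => ({ x: d.hour, y: d.temp }));
}

// Discrete: one data point per category, here using the mean as the aggregate.
function discretePoints(documents) {
  const groups = new Map();
  for (const d of documents) {
    const group = groups.get(d.hour) || { sum: 0, count: 0 };
    group.sum += d.temp;
    group.count += 1;
    groups.set(d.hour, group);
  }
  return [...groups].map(([x, g]) => ({ x, y: g.sum / g.count }));
}

console.log(continuousPoints(docs).length); // 3
console.log(discretePoints(docs));          // two points: 08:00 -> 15, 09:00 -> 18
```

With one reading per hour and a timestamp on the X axis, every document already has a unique X value, which is why the continuous chart suits this data.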

After completing these steps, your Chart will look like this:

Now it’s time to encode some fields. Drag the observationDate field onto the X-Axis channel, and then expand the main nested document to reveal its children. We’ll drag the temp field onto the Y-axis, and you should see the beginnings of a useful chart!

To add a bit more information to the chart, let’s add additional series to show the minimum and maximum observed temperatures across the city. Charts lets you create multi-series charts in two ways: you can either create each series from a distinct field, or you can use a “series” field to split values from a single field into multiple series. For the shape of the data in this collection, we’ll go with the first option. We’ll do this by dragging the temp_min and temp_max fields as additional fields in the Y-axis channel:

This is starting to look nice! But note that the chart is currently showing all of the data in the collection. While we only have a few days of data in the system at the moment, each passing hour adds another data point, gradually resulting in a more cluttered chart and slower performance. To get around this, we can create a filter that limits the data to a sliding time window, such as the previous 5 days.

You specify filters in Charts using an MQL Query Document. To specify a relative date query, you need to do a bit of arithmetic: subtract the size of the window in milliseconds from the current date, represented by new Date(), and wrap the result in a new Date object. To specify the last 5 days, use the following filter:


{ observationDate: {$gte: new Date(new Date() - 5 * 24 * 60 * 60 * 1000)}}
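The arithmetic behind the filter can be checked in plain JavaScript. The sketch below fixes "now" to an arbitrary example date so the result is reproducible, computes the cutoff five days earlier, and applies the same greater-than-or-equal comparison to two invented observations. The helper name and sample data are made up for the illustration.

```javascript
// 5 days x 24 hours x 60 minutes x 60 seconds x 1000 milliseconds
const FIVE_DAYS_MS = 5 * 24 * 60 * 60 * 1000; // 432000000

// Returns the start of the sliding window relative to the given time.
function cutoffDate(now, windowMs) {
  return new Date(now.getTime() - windowMs);
}

// Fixed example time, so the output does not depend on when this runs.
const now = new Date('2018-06-26T08:00:00Z');
const cutoff = cutoffDate(now, FIVE_DAYS_MS);
console.log(cutoff.toISOString()); // 2018-06-21T08:00:00.000Z

// Invented observations: one outside the window, one inside it.
const observations = [
  { observationDate: new Date('2018-06-20T08:00:00Z'), temp: 12 },
  { observationDate: new Date('2018-06-25T08:00:00Z'), temp: 15 }
];

// Equivalent of { observationDate: { $gte: cutoff } }
const recent = observations.filter(o => o.observationDate >= cutoff);
console.log(recent.length); // 1
```

Because the cutoff is computed from the current date each time the chart refreshes, the window slides forward on its own and never needs to be edited.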

After applying the filter, you can give your chart a title and we’re done!

After saving the chart, you can resize it on your dashboard, and then start to create additional charts to give you even more insight into the weather in your city.

Another Example: Home Battery Data

I chose the Weather scenario for the main example in this article as it’s easy to get working and something we can all relate to. However, in practice, you can probably get a lot of this info from weather websites. The real value is not so much in this one specific example but in the pattern and its implementation using the MongoDB platform.

No matter what your interests are, chances are there are APIs that return data that is of value to you, and you can use the same approach from the weather example to ingest and visualize this data. The following screenshot shows one I created to show the status and history of my home battery storage system. You can see the storage level over the last week, the current storage level and understand how the usage changes over the seasons.

Hopefully, this will give you some ideas on dashboards that you’d like to create. Have fun exploring and visualizing your data!

MongoDB Charts Beta Now Available

Today at MongoDB World 2018 in New York, we are excited to announce the beta release of MongoDB Charts, available now.

MongoDB Charts is the fastest and easiest way to build visualizations of MongoDB data. Now you can easily build visualizations with an intuitive UI and analyze complex, nested data (like arrays and subdocuments), something other visualization technologies designed for tabular databases struggle with. Then assemble those visualizations into a dashboard for at-a-glance, up-to-the-minute information. Dashboards can be shared with other users, either for collaboration or viewing only, so that entire groups and organizations can become data-driven and benefit from data visualizations and the insights they provide. When you connect to a live data source, MongoDB Charts will keep your charts and dashboards up to date with the most recent data.

Charts allows you to connect to any MongoDB instance on which you have access permissions, or use existing data sources that other Charts users share with you. With MongoDB’s workload isolation capabilities—enabling you to separate your operational from analytical workloads in the same cluster—you can use Charts for a real-time view without having any impact on production workloads.

Unlike other visualization products, MongoDB Charts is designed to natively handle MongoDB’s rich data structures. MongoDB Charts makes it easy to visualize complex arrays and subdocuments without flattening the data or spending time and effort on ETL. Charts will automatically generate an aggregation pipeline from your chart design which is executed on your MongoDB server, giving you full access to the power of MongoDB when creating visualizations.
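To give a sense of what an aggregation pipeline for a chart could look like, here is a hand-written example for a temperature line chart over the last 5 days of weather observations. This is a guess at the general shape only, not the pipeline Charts actually generates; the output field names `x`, `temp`, `tempMin` and `tempMax` are invented, and the input fields assume documents with an `observationDate` and a nested `main` document.

```javascript
// Start of a 5-day sliding window, relative to the current time.
const windowStart = new Date(Date.now() - 5 * 24 * 60 * 60 * 1000);

// Hand-written illustration; Charts builds its own pipeline from the chart design.
const pipeline = [
  // Keep only documents inside the time window.
  { $match: { observationDate: { $gte: windowStart } } },
  // Pull nested fields up using dot notation, with no flattening of stored data.
  {
    $project: {
      _id: 0,
      x: '$observationDate',
      temp: '$main.temp',
      tempMin: '$main.temp_min',
      tempMax: '$main.temp_max'
    }
  },
  // Order the points by time for a line chart.
  { $sort: { x: 1 } }
];

console.log(JSON.stringify(pipeline, null, 2));
```

The `$project` stage shows why no ETL step is needed: nested values are addressed in place with dot notation when the query runs on the server.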

Charts already supports all common visualization types, including bar, column, line, area, scatter and donut charts, with table views and other advanced charts coming soon. Multiple visualizations can be assembled into dashboards, providing an at-a-glance understanding of all of your most important data. Dashboards can be used to track KPIs, understand trends, spot outliers and anomalies, and more.

Graphs and charts are a great way to communicate insights with others, but we know data can often be sensitive. Charts lets you stay in control over which users in your organization have access to your data sources and dashboards. You can choose to share with your entire organization, select individuals or keep things private to yourself. Dashboards can be shared as read-only or as read-write, depending on whether you want to communicate or collaborate.

Since this is a beta release, it’s free for anyone to try, but keep in mind that we may add or change functionality before the final release. MongoDB Charts is available as a Docker image which you can install on a server or VM within your environment. Once the Docker container is running, people in your organization can access Charts from any desktop web browser. To download the image and get started with Charts, head to the MongoDB Download Center.

We think that MongoDB Charts is the best way to get quick, self-service visualizations of the data you’re storing in MongoDB. But we’re only just getting started. Once you try out the beta, please use the built-in feedback tool to report issues and to let us know what you do and don’t like, and we’ll use this input to make future releases even better.

Happy charting!


The development, release, and timing of any features or functionality described for our products remains at our sole discretion. This information is merely intended to outline our general product direction and it should not be relied on in making a purchasing decision nor is this a commitment, promise or legal obligation to deliver any material, code, or functionality.