Ingesting and Visualizing API Data with Stitch and Charts

Introduction

We are living in a world full of APIs - one where any data about your devices, your community and even yourself is just an HTTP call away. But while the data returned from APIs can be incredibly useful, it is often just a point-in-time snapshot that makes it impossible to get a complete picture of what’s going on. To get the insight you need, a useful pattern is to ingest the data from the API into a database and then analyze and visualize the data to understand how it changes over time. Since most API data is represented in JSON, MongoDB is the ideal database for this data. This post describes how to complete the story using MongoDB Stitch for ingestion and MongoDB Charts for visualization.

The Pattern

The pattern we are going to use is pretty straightforward, and could be implemented with many different technologies:

  • A Scheduler kicks off the process at a regular interval (as warranted by the problem space), such as every hour
  • An Ingestion Function queries the API, retrieves (and perhaps transforms) the data and saves it in a database
  • The Database stores the latest and historical data retrieved from the API
  • The Visualization Tool queries the database (leveraging appropriate filters) and shows charts of the data and how it changes over time
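
To make the four roles concrete, here is a minimal, technology-neutral sketch in JavaScript. Everything in it is hypothetical and for illustration only: `ingest`, `readingsSince` and `startScheduler` are made-up names, the "database" is a plain in-memory array, and `fetchReading` stands in for whatever call retrieves data from your API.

```javascript
// Hypothetical sketch of the pattern; an in-memory array stands in for the database.

// Ingestion Function: stamp a reading with its retrieval time and store it.
function ingest(reading, store, now = new Date()) {
  const doc = Object.assign({}, reading, { retrievedAt: now });
  store.push(doc);
  return doc;
}

// Visualization query: return only the readings inside a time window.
function readingsSince(store, cutoff) {
  return store.filter(doc => doc.retrievedAt >= cutoff);
}

// Scheduler: call the API at a regular interval and ingest each result.
// fetchReading is an assumed function that returns a Promise of one reading.
function startScheduler(fetchReading, store, intervalMs) {
  return setInterval(() => {
    fetchReading()
      .then(reading => ingest(reading, store))
      .catch(err => console.error('Ingestion failed:', err));
  }, intervalMs);
}
```

The rest of this post swaps each of these stand-ins for a real component.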

Implementation using the MongoDB stack

Since this is the MongoDB Blog, you won’t be surprised to hear that we’re storing the data in MongoDB. Given that the data from modern APIs is usually represented in JSON, you can store it directly without needing to transform it or lose the richness of the original structure. But what about the other components of the solution?

The Ingestion Function runs infrequently, so it’s well-suited for a serverless compute platform. You could definitely use a platform like AWS Lambda or Azure Functions for this task, but since the goal is to write to MongoDB, the quickest path will be to use MongoDB Stitch. This gives you simple, rule-based access to MongoDB collections without needing to worry about connection URIs, drivers or package management.

Stitch does not currently have a built-in scheduler. However, it lets you expose a function as an HTTP webhook, so it can easily be called by a scheduler of your choice.

Finally, to query, filter and visualize the data we are going to use MongoDB Charts beta. This tool lets you quickly and easily create charts of data stored in MongoDB, without needing to flatten the schema or convert the data.

Tutorial: Weather Data

As a simple example, let’s build a solution that ingests and visualizes data from a weather API - we’ll use OpenWeatherMap which is free for basic scenarios (although you’ll still need to sign up for an account to get an App ID to call the API). Our goal is to build our own personal weather dashboard, like this:

Setting up the Stitch App

First, make sure you have a MongoDB Atlas cluster (the free tier will work just fine), and create a Stitch App against that cluster.

Next, we’ll add a Stitch Value that contains some metadata needed for calling the API. This is a JSON document that specifies which city we want weather readings from (I’m using “Sydney, AU”) and the OpenWeatherMap App ID to use when calling the API (available from their website after you register for an account).
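
As a sketch of what this value might contain: the `weatherApiInfo` name and the `city` and `appId` field names below match what the webhook function later in this post reads, while the App ID shown is a placeholder, not a real key. The exact city string is an assumption; OpenWeatherMap generally accepts a city name followed by a comma and a country code.

```javascript
// Assumed shape of the "weatherApiInfo" Stitch Value.
// Replace the placeholder with the App ID from your OpenWeatherMap account.
const weatherApiInfo = {
  city: 'Sydney,AU',
  appId: 'YOUR_OPENWEATHERMAP_APP_ID'
};
```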

Because we’re dealing with an HTTP API, we need to create a Stitch HTTP Service which is responsible for making the requests. We’ll call our service “http” and we’ll create a rule allowing it to perform a GET request against any URL.

Finally, we’ll add an Incoming Webhook to the service, which is a function that can be called from outside of Stitch. We’ll call our webhook “ingestWeatherData”, configure it to run as the System user, make it invokable via an HTTP POST, and use a secret provided as a query string parameter. Don’t forget to set your secret to a nice random value.

Once you’ve configured your webhook, you need to provide the code that runs each time the webhook is invoked. This function calls the OpenWeatherMap API with the parameters from our Stitch Value. Once it receives a result, it adds a new observationDate field to the document (derived from the dt field, but with a proper Date data type), and then stores the updated document to the weather.observations collection within our MongoDB Atlas cluster.


exports = function(payload) {
  // HTTP service and API settings configured earlier in the Stitch app
  const httpService = context.services.get('http');
  const weatherApiInfo = context.values.get('weatherApiInfo');
  const url = `http://api.openweathermap.org/data/2.5/weather?q=${weatherApiInfo.city}&units=metric&appid=${weatherApiInfo.appId}`;
  console.log("Fetching " + url);
  return httpService.get({url: url}).then(response => {
    const json = JSON.parse(response.body.text());
    // dt is a Unix timestamp in seconds; also store it as a proper Date
    json.observationDate = new Date(json.dt * 1000);

    const collection = context.services.get('mongodb-atlas').db('weather').collection('observations');
    // Return the insert promise so the webhook waits for the write to finish
    return collection.insertOne(json).then(() => {
      console.log('Inserted document!');
    });
  });
};
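
To see what the observationDate step does in isolation, here is a small sketch that can be run outside Stitch. OpenWeatherMap reports `dt` as a Unix timestamp in seconds, while JavaScript dates count milliseconds, hence the multiplication by 1000. The `addObservationDate` helper and the sample values are hypothetical, and the sample is trimmed to a few fields.

```javascript
// Hypothetical helper mirroring the conversion done inside the webhook.
function addObservationDate(reading) {
  // dt is in seconds since the Unix epoch; Date expects milliseconds.
  return Object.assign({}, reading, {
    observationDate: new Date(reading.dt * 1000)
  });
}

// Made-up sample, trimmed to a few fields of an API response.
const sample = { name: 'Sydney', dt: 1546300800, main: { temp: 24.5 } };
const stored = addObservationDate(sample);
console.log(stored.observationDate.toISOString()); // 2019-01-01T00:00:00.000Z
```

Storing a real Date alongside the raw `dt` value is what later lets Charts treat the field as a time axis and filter on it.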


To schedule our ingestion function, we could use any scheduler we want, such as one offered by a cloud provider like AWS or Azure, or a cron job on a VM. One simple free option is IFTTT, which lets you create “applets” with a trigger and an action. For this example, I used the “Every hour” trigger from the Date/Time category and the “Webhook” action. Whatever scheduler you decide to use, choose an HTTP POST and copy the URL from the Stitch webhook settings page, making sure you append “?secret=” followed by your secret value to the end of the URL.
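
If you would rather test the webhook by hand, or call it from your own scheduler, the request can be sketched as below. The base URL here is a made-up placeholder (copy the real one from the Stitch webhook settings page), `buildWebhookUrl` and `triggerIngestion` are hypothetical helper names, and the sketch assumes a runtime with a global `fetch`, such as Node.js 18 or later.

```javascript
// Hypothetical helper: append the webhook secret as a query string parameter.
function buildWebhookUrl(baseUrl, secret) {
  const url = new URL(baseUrl);
  url.searchParams.set('secret', secret);
  return url.toString();
}

// Hypothetical helper: invoke the webhook with an HTTP POST.
// Assumes a global fetch (for example Node.js 18+).
function triggerIngestion(baseUrl, secret) {
  return fetch(buildWebhookUrl(baseUrl, secret), { method: 'POST' })
    .then(response => response.status);
}

// Placeholder URL and secret, for illustration only.
const exampleUrl = buildWebhookUrl(
  'https://example.com/incoming_webhook/ingestWeatherData',
  'my-random-secret'
);
console.log(exampleUrl);
```

Building the URL with `URL` and `searchParams` rather than string concatenation means a secret containing special characters is encoded correctly.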

Building your Dashboard

While you’re probably keen to get started on your dashboard, keep in mind that readings are only coming through once an hour, so you may want to wait a few days until you have enough data to visualize. While you’re waiting, you can download and configure MongoDB Charts beta, and get to know the tool by following the tutorials.

Once you’ve got a critical mass of data in your collection, it’s time to build your dashboard! In the screenshot earlier in this article, you can see I’ve created a number of different charts showing different views of the data. In this article we’ll go through how to create the top chart—after that, it’s up to you to replicate the other charts or come up with your own ideas.

In order to visualize data in Charts, you first need to create a Data Source—simply navigate to the Data Sources tab and hit the New Data Source button. On the first step of this dialog, you’ll need the connection URI (including the username and password) to your Atlas cluster. You can get this by clicking the Connect button in the Atlas web console. While you’re here, make sure you’ve configured your IP Whitelist (on the Atlas Security tab) to allow access to the cluster from your Charts server.

Back in Charts, once you’ve successfully connected to your cluster you can choose the `weather.observations` collection for your data source. Finally, you can decide if you want to share the Data Source with other Charts users, and your data source will be created.

Now that we’ve got a data source, it’s time to create a dashboard. Go to the Dashboards tab, click New Dashboard and enter an appropriate name and description (“Weather” will do nicely!). You’ll now be presented with a blank canvas, ready to add your first chart.

Clicking Add Chart will take you to the Chart Builder. This is where all of the magic happens. Building a chart involves the following steps, although you may sometimes do things in a different order or iterate until you’re happy with the result:

  1. Choose a Data Source
  2. Choose a Chart Type
  3. Drag fields from the field panel onto the encoding channels
  4. Apply a filter, if desired
  5. Give your chart a title
  6. Save your chart to your dashboard

Selecting the Data Source is always the first step. Once you do this, Charts will sample the Data Source and show the resulting fields in the field panel at the left of the window. Note that the field panel also shows the types of each field, and identifies the nested documents and arrays in the document’s structure.

Next, we’ll choose our chart type. To show our graph of how the temperature changes over time, we’ll select the Line category and then choose the Continuous Line chart. A continuous chart shows a data point for each individual document from the collection, as opposed to the discrete chart which aggregates multiple points together when they share the same category value.

After completing these steps, your Chart will look like this:

Now it’s time to encode some fields. Drag the observationDate field onto the X-Axis channel, and then expand the main nested document to reveal its children. We’ll drag the temp field onto the Y-axis, and you should see the beginnings of a useful chart!

To add a bit more information to the chart, let’s add additional series to show the minimum and maximum observed temperatures across the city. Charts lets you create multi-series charts in two ways: you can either create each series from a distinct field, or you can use a “series” field to split values from a single field into multiple series. For the shape of the data in this collection, we’ll go with the first option. We’ll do this by dragging the temp_min and temp_max fields as additional fields in the Y-axis channel:

This is starting to look nice! But note that the chart is currently showing all of the data in the collection. While we only have a few days of data in the system at the moment, with each passing hour we’ll get another data point, gradually resulting in a more cluttered chart and slower performance. To get around this, we can create a filter that limits the data to a sliding time window, such as the previous 5 days. You specify filters in Charts using an MQL Query Document. To specify a relative date query, you need to do a bit of arithmetic to create a new Date object based on the number of milliseconds relative to the current date, represented by Date(). To specify the last 5 days, use the following filter:


{ observationDate: {$gte: Date(Date() - 5 * 24 * 60 * 60 * 1000)}}
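
The arithmetic behind this filter can be checked in plain JavaScript. Note that this sketch is ordinary JavaScript rather than Charts filter syntax: in plain JavaScript the `new` keyword is needed to get a Date object, and `daysAgo` is a hypothetical helper name.

```javascript
// Milliseconds in one day: 24 hours x 60 minutes x 60 seconds x 1000 ms.
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical helper: the Date exactly `days` days before `now`.
function daysAgo(days, now = new Date()) {
  return new Date(now.getTime() - days * MS_PER_DAY);
}

// A fixed "now" is used here so the result is reproducible.
const cutoff = daysAgo(5, new Date('2019-01-10T00:00:00Z'));
console.log(cutoff.toISOString()); // 2019-01-05T00:00:00.000Z
```

Any document whose observationDate is on or after this cutoff falls inside the five-day window.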

After applying the filter, you can give your chart a title and we’re done!

After saving the chart, you can resize it on your dashboard, and then start to create additional charts to give you even more insight into the weather in your city.

Another Example: Home Battery Data

I chose the Weather scenario for the main example in this article as it’s easy to get working and something we can all relate to. However, in practice, you can probably get a lot of this info from weather websites. The real value is not so much in this one specific example but in the pattern and its implementation using the MongoDB platform.

No matter what your interests are, chances are there are APIs that return data that is of value to you, and you can use the same approach from the weather example to ingest and visualize this data. The following screenshot shows one I created to show the status and history of my home battery storage system. You can see the storage level over the last week and the current storage level, and understand how usage changes over the seasons.

Hopefully, this will give you some ideas on dashboards that you’d like to create. Have fun exploring and visualizing your data!