GIANT Stories at MongoDB

This blog post has charts and yours could too!

MongoDB Charts brings dynamic visualization to your data, and now you can embed those live charts and graphs into web pages, apps, blogs and more. And the best part: embedded charts are really simple to deploy.

MongoDB Charts Gets Embeddable

Jane Fine

Releases, Charts

MongoDB Charts’ latest feature is embeddable charts, and MongoDB Atlas users are the first to get a preview of it. Now you can present your users with live charts from your data, and you can do it without writing stacks of code.

MongoDB Charts Beta, Now Available in Atlas

Earlier in the year, we announced the availability of MongoDB Charts Beta, the fastest and easiest way to build visualizations of MongoDB data. Today at MongoDB.local San Francisco, we are excited to announce that an update to the beta is now available and integrated into MongoDB Atlas, our hosted database-as-a-service platform. This means that Atlas users can now visualize their data and share it with their team, without the need to install or maintain any servers or tools.

Getting started with MongoDB Charts in Atlas couldn’t be simpler. After logging into Atlas, select the Project with the clusters containing the data you want to visualize and click the Charts link in the left navigation bar. After a one-time step to activate Charts, you will be ready to start charting!

MongoDB Charts(Beta) inside MongoDB Atlas

If you’ve used MongoDB Charts before, the new Atlas-integrated version will be instantly familiar. The main difference is that you can easily add data sources from any Atlas cluster in your project without needing to enter a connection URI. You’re also freed from the burden of managing users separately: all Atlas Project members can access Charts with their existing Atlas credentials, provided they have the Data Access Read Only role or higher.

MongoDB Charts(Beta) New Data Source
New Data Source

We’ve also been busy adding some of the most requested features to the charting experience. Charts has always been great at handling MongoDB’s flexible schema, allowing you to build charts from document-based data that contains nested documents or arrays. In this latest release, we’ve added a number of options for chart authors to customize their charts, including changing axis titles, colors, date formats and more.

Sample chart with MongoDB Charts
Sample Line Chart

After you’ve created a few charts, you can arrange them on a dashboard to get all of the information you need at a glance. Dashboards can be kept private, shared with selected individuals, or with everyone in your project team.

Sample MongoDB Charts Dashboard
Sample Dashboard

If you’re not currently using Atlas, we haven’t forgotten about you. MongoDB Charts Beta is also still available to install into your own server environment, allowing you to visualize data from any MongoDB server. We’ll be refreshing the on-premises beta to include the same charting enhancements as seen in the new Atlas version over the coming weeks.

We hope you enjoy this update and that it helps you get the insight you need from your data. If you have any questions or feature requests, you can always send a note to the Charts team by clicking the support button on the bottom of every page.

Happy Charting!

Reducing the Need for ETL with MongoDB Charts

Ken W. Alger

Charts

Databases as we know them have been around for over 40 years. When they first came about, businesses would often keep data in separate systems and separate formats. There were a variety of reasons for these decisions. One of the side effects of these separate data stores is the need to combine them in order to perform data analysis. This led to the long-standing practice of ETL, or Extract, Transform, Load.

ETL is a process to extract data from a starting data source, transform the data in some fashion, then load it into another data store. Sounds simple enough, but in fact, there is a lot of work going on under the covers and a lot of steps and decisions to navigate. These additional steps reduce the speed at which we can get meaningful insights from our data. Further, they rely on many assumptions about transforming data into what is assumed to be the correct format for later consumption - without knowing very much about the business questions to be asked of this data down the road.

From Data Warehouses to the Cloud

Traditionally, enterprise applications have relied on performing ETL operations to move data into an enterprise data warehouse (EDW).

Traditional ETL Architecture

Creating a successful data warehouse can be a long, complicated, and expensive process. One of the technologies created to help with the process is Apache Hadoop, which allows massive amounts of data to be processed on commodity hardware with open source technologies. However, instead of becoming simpler, the ETL and data warehousing landscape has only become more complex and cumbersome, and the proliferation of tools, combined with maturity and adoption issues, has only increased the cost. Further, according to Gartner analyst Nick Heudecker, 85% of big data projects fail, mostly due to the complexity of the process itself.

With the transition to the cloud that many organizations are undertaking, ETL becomes even more complicated from a meaningful and timely data analytics standpoint. Moving data from one source to another takes time. Now there are hidden data transfer and compute costs and latencies to navigate. While some meaningful analytics can be performed on stale data, most modern analytics need to be as close to real-time as possible.

Issues With ETL

A few of the problems that we are faced with when setting up ETL processes are:

  1. Latency & Downtime - There is an inherent cost of moving data from point A to point B. Forty years ago, when ETL started, we were working with megabytes of data and did not need “instant” access. Today we’re dealing with terabytes or petabytes of data and need real-time insight from that data.

    Moving data across the network isn’t free. On a 100BASE-T network, transferring one gigabyte of data takes about 100 seconds. At that rate, a terabyte takes about 100,000 seconds, or almost 28 hours. All of this assumes a dedicated network that isn’t used by other applications. As ETL demands grow, data could easily be stale by many hours.

    We used to be able to schedule these transfers during “downtime” at midnight. However, in today’s global world, users are always online somewhere demanding instant access and insight. Downtime is simply no longer acceptable and latency has become the new downtime. Should suppliers on one side of the world suffer from poor performance just so executives on the other side of the world have up to date dashboards in the morning?

  2. Storage is cheap, labor is expensive - Data warehouses started at a point in time in which storage was expensive. In 1981, one gigabyte of data storage cost about $290,000. Today that cost is under $0.10. It was, therefore, important to transform and compress as much data as possible when storing to save costs.

    As storage costs have decreased, labor costs have gone the opposite direction. Having a good database administrator to design, manage, and maintain your data warehouse and ETL path is expensive. Storing raw data is frequently seen as a more economically viable choice.

  3. ETL is hard - ETL takes planning. Lots of it. And not just for your current load of data, but for what might happen to the load down the road. Additionally, ETL scripts can get long and complex.

    Bringing in data from a variety of sources, looping over them, and adding logging, error handling, and configuration are just the start. Determining how the data needs to be transformed can be complex and fragile. What happens if data stored today as a string gets changed down the road? The process breaks and adjustments need to be made.

    Do you ever wonder why the first answer out of a DBA’s mouth is an emphatic “No!” when asked if something can be changed? One “simple” change can mean changing dozens or hundreds of lines of code. For these reasons and more, ETL requires planning for current and future data needs, loads, and shape.

  4. Are developers the right people to build the ETL pipeline? - Developers are great at many things; however, knowing about data storage and ETL pipelines often isn’t one of them. ETL design and implementation are typically best done by data engineers. While a developer may be able to get data through an ETL pipeline and into a data warehouse, it often isn’t done in the most efficient manner. Specialized data engineers should be responsible for these tasks. If you don’t have them on your team, this is another cost of ETL.
  5. Maintenance headaches - As the size and complexity of data, applications, and analytics requirements grow, so does ETL maintenance. Maintaining changes in data velocity, formats, connections, and features takes time. Many of these challenges may not be thought of at the start of a project, but lead to long-term maintenance needs.
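
The transfer-time arithmetic in item 1 can be checked with a short calculation. This is a minimal sketch that assumes decimal units (1 TB = 1,000 GB) and the effective throughput implied by moving one gigabyte in 100 seconds, roughly 10 MB/s; the function name is illustrative.

```javascript
// Assumption: decimal units and the rate implied by "one gigabyte in 100 seconds".
const BYTES_PER_GB = 1e9;
const THROUGHPUT_BYTES_PER_SEC = BYTES_PER_GB / 100; // about 10 MB/s

// Seconds needed to move `bytes` at the assumed throughput.
function transferSeconds(bytes) {
  return bytes / THROUGHPUT_BYTES_PER_SEC;
}

const gigabyteSeconds = transferSeconds(1e9);  // 100
const terabyteSeconds = transferSeconds(1e12); // 100000
const terabyteHours = terabyteSeconds / 3600;  // about 27.8

console.log(gigabyteSeconds, terabyteSeconds, terabyteHours.toFixed(1));
```

Real networks add protocol overhead and contention, so these figures are best read as a lower bound on how stale the data can get.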

Use MongoDB Charts to Avoid the Headache of ETL

Companies today still have data in a variety of systems. In certain instances, ETL is the only option to be able to perform visualization and analysis of your data. Or, perhaps, you’ve explored ETL but haven’t taken the steps needed to get your data ready for analysis because it’s overwhelming.

If you’ve leveraged MongoDB as your database, the need for ETL procedures has been dramatically reduced with the introduction of MongoDB Charts, now in beta. MongoDB Charts natively understands the MongoDB Document Model allowing for the rapid creation of data visualizations over your data.

With MongoDB Charts you can connect to your MongoDB server, assign user authorization policies to your reports, and easily generate visualization dashboards. With over a dozen different chart variations to choose from, stunning visualizations are just a few clicks away.

MongoDB Charts allows for data to be visualized without performing ETL operations, saving valuable time and resources. You don’t need to write any code or rely on third-party tools. Further, you still get to leverage the richness of the Document Model.

Conclusion

For those situations in which you want to quickly access your MongoDB data, MongoDB Charts is a terrific option. If you’re in a situation that requires multiple data sources to be analyzed, we offer the MongoDB Connector for Business Intelligence. If you are doing advanced analytics with Apache Spark, we have an option for that as well with the MongoDB Connector for Apache Spark.

For many roles in an organization, MongoDB Charts is a great tool for analyzing your data. There’s no need to go through the pain of the ETL process. It is the fastest way to build visualizations over your MongoDB data, wherever it’s stored, whether on-premises or in the cloud hosted by MongoDB Atlas. Give it a try today!

Download MongoDB Charts and start visualizing your MongoDB data today.

Visualizing Your Data With MongoDB Charts

Ken W. Alger

Charts

Having data stored in a database is practically a given for today’s businesses. Customer information, order history, product pricing, IoT sensor data, and much more are being recorded for future use. However, just having the data stored isn’t enough to form a competitive market advantage. We must be able to analyze the data as well, and there are many ways to do so. If you have data in MongoDB that needs to be visually analyzed, MongoDB Charts is a terrific option.

Prior to MongoDB Charts, there were really three ways to visualize your MongoDB Data.

  1. Leverage the MongoDB Business Intelligence (BI) Connector in conjunction with third-party BI tools,
  2. Perform Extract-Transform-Load (ETL) operations and leverage third-party tools, or
  3. Write custom code and use charting libraries such as D3.js or Bokeh.

MongoDB Charts, currently in Beta, provides an easy way to visualize your data living in MongoDB. You don’t need to move your data to a different repository, write your own code, or purchase third-party tools. MongoDB Charts knows and understands the richness of the Document Data Model and allows for easy data visualization.

Further, MongoDB Charts allows for a secure way to create and share visualization dashboards with everyone, or just targeted team members. Similarly, the data source being used behind the scenes can be shared securely as well. For example, data for the Sales Department doesn’t have to be made available to Marketing unless needed. This is very powerful and follows MongoDB’s design of security being a top priority.

After downloading the MongoDB Charts Docker image and following the installation instructions, we’re able to connect to a data source stored in MongoDB Atlas and start making visualization dashboards. Once connected to the MongoDB Charts server, there are three steps we need to take:

  1. Add a data source
  2. Create a dashboard
  3. Create our charts

Analyzing Airbnb Data with MongoDB Charts

I have set up a database with some Airbnb data from various cities. We’ll be exploring the dataset from Seattle, WA here, but feel free to explore others on your own. We need to get the connection string from the Atlas Cluster that has our data and connect to it in Charts.

Get URI from MongoDB Atlas

Add a Data Source

With our MongoDB Charts server running on localhost:80, we can log in and head to the Data Sources tab. We use the URI from Atlas (mongodb+srv://airbnbdemo:airbnb@airbnb-rgl39.mongodb.net/test?retryWrites=true) and select Connect. We’re next asked which data source we want to use from that cluster; I’ll select the seattleListingAndReviews collection from the airbnb database for this example. For permissions, I just want to keep everything private, so I’ll accept the defaults and select Publish Data Source. Once published, I can add an alias to the data source. I’ll call it Airbnb Seattle.

Note: The URI above is a sample. You should connect to your own Atlas Cluster and use an authorized username and password.

Create a Dashboard

Next up is to create an actual dashboard to house our visualizations. In the Dashboards section choose New Dashboard and give it a name and description, like Ken’s Airbnb Dashboard. This will take me to where I can add charts to my dashboard.

Create a Chart

After clicking on the Add Chart button we can start building our visualization. We’ll want to choose the Airbnb Seattle data source from the drop-down. MongoDB Charts automatically determines which fields are available for exploration. For this exercise, I’d like to see which neighborhoods in Seattle have the most Airbnb properties and split them by property type. We’ll use the Stacked Bar chart for the type.

  1. For the X-Axis, we’ll want the id field, aggregated by count.
     Assign X-Axis value to a MongoDB Chart
  2. Along the Y-Axis we’ll look at the address and the suburb. Notice that address is a subdocument here and that MongoDB Charts natively knows how to handle this type of data. I’d like to sort the suburb by aggregated value, in descending order, and limit our results to the top 20 suburbs.
     Assign Y-Axis value to a Stacked Bar chart
  3. Let’s add the property_type field as our series.
     Assign a Series value to a Stacked Bar chart
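
For readers who think in queries, the chart configured above corresponds roughly to a grouped count. The following is a minimal sketch of a comparable aggregation pipeline, not the pipeline Charts actually generates; the field names address.suburb and property_type come from the steps above, and the helper name is hypothetical.

```javascript
// Hypothetical helper: builds a pipeline that counts listings per suburb,
// split by property type, and keeps the suburbs with the most listings.
function buildPropertiesByLocationPipeline(limit) {
  return [
    // Count listings for each (suburb, property type) pair.
    { $group: {
        _id: { suburb: '$address.suburb', propertyType: '$property_type' },
        count: { $sum: 1 }
    } },
    // Roll the pairs up per suburb, keeping per-type counts as the series.
    { $group: {
        _id: '$_id.suburb',
        total: { $sum: '$count' },
        series: { $push: { propertyType: '$_id.propertyType', count: '$count' } }
    } },
    // Sort suburbs by total listings, descending, and keep the top N.
    { $sort: { total: -1 } },
    { $limit: limit }
  ];
}

const pipeline = buildPropertiesByLocationPipeline(20);
console.log(JSON.stringify(pipeline, null, 2));
```

In the mongo shell, a pipeline like this could be passed to db.seattleListingAndReviews.aggregate(pipeline) to compare its output with the chart.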

Now we can name our chart, Properties by Location, and save it. We’re then taken back to our dashboard where we can add other visualizations for further exploration.

Have a look at this short video to see some other visualizations being created from this same data source.

Conclusion

MongoDB Charts is an excellent new tool to visually explore your data. It has some great features, such as:

  • Ad hoc analysis of your data
  • Native understanding of the Document Data Model
  • Easy collaboration on projects with user-based sharing and permissions
  • An interface intuitive enough for non-developers to use, allowing for self-service data analysis

MongoDB Charts is the fastest way to build visualizations over your MongoDB data. I’d encourage you to download it and try it out today. Let me know what visualizations you come up with from the Airbnb dataset. I always enjoy seeing how people explore their data.

Ingesting and Visualizing API Data with Stitch and Charts

Introduction

We are living in a world full of APIs - one where any data about your devices, your community and even yourself is just an HTTP call away. But while the data returned from APIs can be incredibly useful, it is often just a point-in-time snapshot that makes it impossible to get a complete picture of what’s going on. To get the insight you need, a useful pattern is to ingest the data from the API into a database and then analyze and visualize the data to understand how it changes over time. Since most API data is represented in JSON, MongoDB is the ideal database for this data. This post describes how to complete the story using MongoDB Stitch for ingestion and MongoDB Charts for visualization.

The Pattern

The pattern we are going to use is pretty straightforward, and could be implemented with many different technologies:

  • A Scheduler kicks off the process at a regular interval (as warranted by the problem space), such as every hour
  • An Ingestion Function queries the API, retrieves (and perhaps transforms) the data and saves it in a database
  • The Database stores the latest and historical data retrieved from the API
  • The Visualization Tool queries the database (leveraging appropriate filters) and shows charts of the data and how it changes over time

Implementation using the MongoDB stack

Since this is the MongoDB Blog, you won’t be surprised to hear that we’re storing the data in MongoDB. Given the data from modern APIs is usually represented in JSON, you can store it directly without needing to transform the data or lose richness defined in the original structure. But what about the other components of the solution?

The Ingestion Function runs infrequently, so it’s well-suited for a serverless compute platform. You could definitely use a platform like AWS Lambda or Azure Functions for this task, but since the goal is to write to MongoDB, the quickest path will be to use MongoDB Stitch. This gives you simple, rule-based access to MongoDB collections without needing to worry about connection URIs, drivers or package management.

Stitch does not currently have a built-in scheduler. However, Stitch allows you to expose a function as an HTTP Webhook, allowing it to be easily called by a scheduler of your choice.

Finally, to query, filter and visualize the data we are going to use MongoDB Charts beta. This tool lets you quickly and easily create charts of data stored in MongoDB, without needing to flatten the schema or convert the data.

Tutorial: Weather Data

As a simple example, let’s build a solution that ingests and visualizes data from a weather API - we’ll use OpenWeatherMap which is free for basic scenarios (although you’ll still need to sign up for an account to get an App ID to call the API). Our goal is to build our own personal weather dashboard, like this:

Setting up the Stitch App

First, make sure you have a MongoDB Atlas cluster (the free tier will work just fine), and create a Stitch App against that cluster.

Next, we’ll add a Stitch Value that contains some metadata needed for calling the API. This is a JSON document that specifies which city we want weather readings from (I’m using “Sydney, AU”) and the OpenWeatherMap App ID to use when calling the API (available from their website after you register for an account).
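
As a rough illustration, the Value could hold an object like the one below. The field names city and appId match what the webhook function later in this post reads; the App ID shown is a placeholder, and the helper that builds the request URL is hypothetical, included only to show how the two fields are used.

```javascript
// Example contents of the Stitch Value (placeholder App ID, not a real key).
const weatherApiInfo = {
  city: 'Sydney, AU',
  appId: 'YOUR_OPENWEATHERMAP_APP_ID'
};

// Hypothetical helper: builds the OpenWeatherMap request URL from the Value.
// encodeURIComponent keeps the comma and space in the city name URL-safe.
function buildWeatherUrl(info) {
  const city = encodeURIComponent(info.city);
  const appId = encodeURIComponent(info.appId);
  return `http://api.openweathermap.org/data/2.5/weather?q=${city}&units=metric&appid=${appId}`;
}

const exampleWeatherUrl = buildWeatherUrl(weatherApiInfo);
console.log(exampleWeatherUrl);
```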

Because we’re dealing with an HTTP API, we need to create a Stitch HTTP Service which is responsible for making the requests. We’ll call our service “http” and we’ll create a rule allowing it to perform a GET request against any URL.

Finally, we’ll add an Incoming Webhook to the service, which is a function that can be called from outside of Stitch. We’ll call our webhook “ingestWeatherData”, configure it to run as the System user, make it invokable via an HTTP POST, and use a secret provided as a query string parameter. Don’t forget to set your secret to a nice random value.

Once you’ve configured your webhook, you need to provide the code that runs each time the webhook is invoked. This function calls the OpenWeatherMap API with the parameters from our Stitch Value. Once it receives a result, it adds a new observationDate field to the document (derived from the dt field, but with a proper Date data type), and then stores the updated document to the weather.observations collection within our MongoDB Atlas cluster.


exports = function(payload) {
  // HTTP service configured earlier, named "http"
  const httpService = context.services.get('http');
  // Stitch Value holding the city and the OpenWeatherMap App ID
  const weatherApiInfo = context.values.get('weatherApiInfo');
  const url = `http://api.openweathermap.org/data/2.5/weather?q=${weatherApiInfo.city}&units=metric&appid=${weatherApiInfo.appId}`;
  console.log("Fetching " + url);
  return httpService.get({ url: url }).then(response => {
    const json = JSON.parse(response.body.text());
    // dt is a Unix timestamp in seconds; store it as a proper Date
    json.observationDate = new Date(json.dt * 1000);

    const collection = context.services.get('mongodb-atlas').db('weather').collection('observations');
    // Return the insert promise so the webhook waits for the write to finish
    return collection.insertOne(json);
  }).then(() => {
    console.log('Inserted document!');
  });
};


To schedule our ingestion function, we could use any scheduler we want, including one from a cloud provider like AWS or Azure, or a cron job in a VM. One simple free option is to use IFTTT, which lets you create “applets” with a trigger and action. For this example, I used the “Every hour” trigger from the Date/Time category and the “Webhook” action. Whatever scheduler you decide to use, choose an HTTP POST and copy the URL from the Stitch webhook settings page, making sure you append “?secret=” followed by your secret value to the end of the URL.
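
The call a scheduler makes can be sketched in a few lines of JavaScript. This is an illustration under stated assumptions rather than part of the original setup: the helper names, the example.com URL and the secret are placeholders, and the fetch function is passed in so the sketch can be exercised without network access.

```javascript
// Hypothetical helper: appends the webhook secret as a query string parameter.
function buildWebhookUrl(webhookUrl, secret) {
  return `${webhookUrl}?secret=${encodeURIComponent(secret)}`;
}

// Sends the HTTP POST that a scheduler would send.
// `fetchImpl` is injected; in a browser or Node 18+ the global fetch can be passed.
async function triggerIngestion(fetchImpl, webhookUrl, secret) {
  const response = await fetchImpl(buildWebhookUrl(webhookUrl, secret), { method: 'POST' });
  if (!response.ok) {
    throw new Error(`Webhook call failed with status ${response.status}`);
  }
  return response.status;
}

// Placeholder values; copy the real URL from the Stitch webhook settings page.
const exampleWebhookUrl = buildWebhookUrl('https://example.com/your-stitch-webhook', 'my-secret');
console.log(exampleWebhookUrl);
```

A cron job or cloud scheduler would then call triggerIngestion(fetch, url, secret) once per interval.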

Building your Dashboard

While you’re probably keen to get started on your dashboard, keep in mind that readings are only coming through once an hour, so you may want to wait a few days until you have enough data to visualize. While you’re waiting, you can download and configure MongoDB Charts beta, and get to know the tool by following the tutorials.

Once you’ve got a critical mass of data in your collection, it’s time to build your dashboard! In the screenshot earlier in this article, you can see I’ve created a number of different charts showing different views of the data. In this article we’ll go through how to create the top chart—after that, it’s up to you to replicate the other charts or come up with your own ideas.

In order to visualize data in Charts, you first need to create a Data Source—simply navigate to the Data Sources tab and hit the New Data Source button. On the first step of this dialog, you’ll need the connection URI (including the username and password) to your Atlas cluster. You can get this by clicking the Connect button in the Atlas web console. While you’re here, make sure you’ve configured your IP Whitelist (on the Atlas Security tab) to allow access to the cluster from your Charts server.

Back in Charts, once you’ve successfully connected to your cluster you can choose the `weather.observations` collection for your data source. Finally, you can decide if you want to share the Data Source with other Charts users, and your data source will be created.

Now we’ve got a data source, it’s time to create a dashboard. Go to the Dashboards tab, click New Dashboard and enter an appropriate name and description (“Weather” will do nicely!). You’ll now be presented with a blank canvas, ready to add your first chart.

Clicking Add Chart will take you to the Chart Builder. This is where all of the magic happens. Building a chart involves the following steps, although you may sometimes do things in a different order or iterate until you’re happy with the result:

  1. Choose a Data Source
  2. Choose a Chart Type
  3. Drag fields from the field panel onto the encoding channels
  4. Apply a filter, if desired
  5. Give your chart a title
  6. Save your chart to your dashboard

Selecting the Data Source is always the first step. Once you do this, Charts will sample the Data Source and show the resulting fields in the field panel at the left of the window. Note that the field panel also shows the types of each field, and identifies the nested documents and arrays in the document’s structure.

Next, we’ll choose our chart type. To show our graph of how the temperature changes over time, we’ll select the Line category and then choose the Continuous Line chart. A continuous chart shows a data point for each individual document from the collection, as opposed to the discrete chart which aggregates multiple points together when they share the same category value.

After completing these steps, your Chart will look like this:

Now it’s time to encode some fields. Drag the observationDate field onto the X-Axis channel, and then expand the main nested document to reveal its children. We’ll drag the temp field onto the Y-axis, and you should see the beginnings of a useful chart!

To add a bit more information to the chart, let’s add additional series to show the minimum and maximum observed temperatures across the city. Charts lets you create multi-series charts in two ways: you can either create each series from a distinct field, or you can use a “series” field to split values from a single field into multiple series. For the shape of the data in this collection, we’ll go with the first option. We’ll do this by dragging the temp_min and temp_max fields as additional fields in the Y-axis channel:

This is starting to look nice! But note that the chart is currently showing all of the data in the collection. While we only have a few days of data in the system at the moment, with each passing hour we’ll get another data point, gradually resulting in a more cluttered chart and slower performance. To get around this, we can create a filter that limits the data to a sliding time window, such as the previous 5 days. You specify filters in Charts using an MQL Query Document. To specify a relative date query, you need to do a bit of arithmetic to create a new Date object based on the number of milliseconds relative to the current date, represented by Date(). To specify the last 5 days, use the following filter:


{ observationDate: {$gte: Date(Date() - 5 * 24 * 60 * 60 * 1000)}}
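
The arithmetic behind this filter can be tried outside Charts. The sketch below is plain JavaScript, where dates are constructed with new Date(...) rather than the Date() form used in the Charts filter above; the function names and the sample timestamp are illustrative.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns the Date that lies `days` days before `now`.
function daysAgo(days, now) {
  return new Date(now.getTime() - days * MS_PER_DAY);
}

// Builds a filter document with the same shape as the one above.
function lastDaysFilter(days, now = new Date()) {
  return { observationDate: { $gte: daysAgo(days, now) } };
}

// Fixed sample timestamp so the result is reproducible.
const sampleNow = new Date('2018-08-10T00:00:00Z');
const sampleFilter = lastDaysFilter(5, sampleNow);
console.log(sampleFilter.observationDate.$gte.toISOString()); // 2018-08-05T00:00:00.000Z
```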

After applying the filter, you can give your chart a title and we’re done!

After saving the chart, you can resize it on your dashboard, and then start to create additional charts to give you even more insight into the weather in your city.

Another Example: Home Battery Data

I chose the Weather scenario for the main example in this article as it’s easy to get working and something we can all relate to. However, in practice, you can probably get a lot of this info from weather websites. The real value is not so much in this one specific example but in the pattern and its implementation using the MongoDB platform.

No matter what your interests are, chances are there are APIs that return data that is of value to you, and you can use the same approach from the weather example to ingest and visualize this data. The following screenshot shows a dashboard I created to show the status and history of my home battery storage system. You can see the storage level over the last week and the current storage level, and understand how the usage changes over the seasons.

Hopefully, this will give you some ideas on dashboards that you’d like to create. Have fun exploring and visualizing your data!

MongoDB Charts Beta Now Available

Today at MongoDB World 2018 in New York, we are excited to announce the beta release of MongoDB Charts, available now.

MongoDB Charts is the fastest and easiest way to build visualizations of MongoDB data. Now you can easily build visualizations with an intuitive UI and analyze complex, nested data (like arrays and subdocuments), something other visualization technologies designed for tabular databases struggle with. Then assemble those visualizations into a dashboard for at-a-glance, up-to-the-minute information. Dashboards can be shared with other users, either for collaboration or viewing only, so that entire groups and organizations can become data-driven and benefit from data visualizations and the insights they provide. When you connect to a live data source, MongoDB Charts will keep your charts and dashboards up to date with the most recent data.

Charts allows you to connect to any MongoDB instance on which you have access permissions, or use existing data sources that other Charts users share with you. With MongoDB’s workload isolation capabilities—enabling you to separate your operational from analytical workloads in the same cluster—you can use Charts for a real-time view without having any impact on production workloads.

Unlike other visualization products, MongoDB Charts is designed to natively handle MongoDB’s rich data structures. MongoDB Charts makes it easy to visualize complex arrays and subdocuments without flattening the data or spending time and effort on ETL. Charts will automatically generate an aggregation pipeline from your chart design which is executed on your MongoDB server, giving you full access to the power of MongoDB when creating visualizations.

Charts already supports all common visualization types, including bar, column, line, area, scatter and donut charts, with table views and other advanced charts coming soon. Multiple visualizations can be assembled into dashboards, providing an at-a-glance understanding of all of your most important data. Dashboards can be used to track KPIs, understand trends, spot outliers and anomalies, and more.

Graphs and charts are a great way to communicate insights with others, but we know data can often be sensitive. Charts lets you stay in control over which users in your organization have access to your data sources and dashboards. You can choose to share with your entire organization, select individuals or keep things private to yourself. Dashboards can be shared as read-only or as read-write, depending on whether you want to communicate or collaborate.

Since this is a beta release, it’s free for anyone to try, but keep in mind that we may add or change functionality before the final release. MongoDB Charts is available as a Docker image which you can install on a server or VM within your environment. Once the Docker container is running, people in your organization can access Charts from any desktop web browser. To download the image and get started with Charts, head to the MongoDB Download Center.

We think that MongoDB Charts is the best way to get quick, self-service visualizations of the data you’re storing in MongoDB. But we’re only just getting started. Once you try out the beta, please use the built-in feedback tool to report issues and to let us know what you do and don’t like, and we’ll use this input to make future releases even better.

Happy charting!


The development, release, and timing of any features or functionality described for our products remains at our sole discretion. This information is merely intended to outline our general product direction and it should not be relied on in making a purchasing decision nor is this a commitment, promise or legal obligation to deliver any material, code, or functionality.