EventGet 50% off your ticket to MongoDB.local NYC on May 2. Use code Web50!Learn more >>
MongoDB Developer
Atlas
plus
Sign in to follow topics
MongoDB Developer Centerchevron-right
Developer Topicschevron-right
Productschevron-right
Atlaschevron-right

Using the Atlas Data API from Excel with Power Query

John Page9 min read • Published Dec 09, 2021 • Updated Aug 26, 2022
ExcelAtlasData APIJavaScript
Facebook Icontwitter iconlinkedin icon
Rate this quickstart
star-empty
star-empty
star-empty
star-empty
star-empty

Data Science and the Ubiquity of Excel

This tutorial discusses the preview version of the Atlas Data API which is now generally available with more features and functionality. Learn more about the GA version here.
When you ask what tools you should learn to be a data scientist, you will hear names like Spark, Jupyter notebooks, R, Pandas, and Numpy mentioned. Many enterprise data wranglers, on the other hand, have been using, and continue to use, industry heavyweights like SAS, SPSS, and Matlab as they have for the last few decades.
The truth is, though, that the majority of back-office data science is still performed using the ubiquitous Microsoft Excel.
Excel has been the go-to choice for importing, manipulating, analysing, and visualising data for 34 years and has more capabilities and features than most of us would ever believe. It would therefore be wrong to have a series on accessing data in MongoDB with the data API without including how to get data into Excel.
This is also unique in this series or articles in not requiring any imperative coding at all. We will use the Power Query functionality in Excel to both fetch raw data, and to push summarization tasks down to MongoDB and retrieve the results.
The MongoDB Atlas Data API is an HTTPS-based API that allows us to read and write data in Atlas, where a MongoDB driver library is either not available or not desirable. In this article, we will see how a business analyst or other back-office user, who often may not be a professional Developer, can access data from, and record data, in Atlas. The Atlas Data API can easily be used by users, unable to create or configure back-end services, who simply want to work with data in tools they know like Google Sheets or Excel.

Prerequisites

To access the data API using Power Query in Excel, we will need a version of Excel that supports it. Power Query is only available on the Windows desktop version, not on a Mac or via the browser-based Office 365 version of Excel.
We will also need an Atlas cluster for which we have enabled the data API, and our endpoint URL and API key. You can learn how to get these in this article or this video if you do not have them already.
A common use-case of Atlas with Microsoft Excel sheets might be to retrieve some subset of business data to analyse or to produce an export for a third party. To demonstrate this, we first need to have some business data available in MongoDB Atlas, this can be added by selecting the three dots next to our cluster name and choosing "Load Sample Dataset" or following instructions here.

Using Excel Power Query with HTTPS POST Requests

If we open up a new blank Excel workbook and then go to the Data ribbon, we can see on the left-hand side an option to get data From Web. Unfortunately, Microsoft has chosen in the wizard that this launches, to restrict data retrieval to API's that use GET rather than POST as the HTTP verb to request data.
An HTTP GET request is passed all of its data as part of the URL, the values after the website and path encodes additional parts to the request, normally in a simple key-value format. A POST request sends the data as a second part of the request and is not subject to the same length and security limitations a GET has.
HTTP GET is used for many simple read-only APIs, but the richness and complexity of queries and aggregations possible using the Atlas Data API. do not lend themselves to passing data in a GET rather than the body of a POST, so we are required to use a POST request instead.
Fortunately, Excel and Power Query do support POST requests when creating a query from scratch using what Microsoft calls a Blank Query.
To call a web service with a POST from Excel, start with a new Blank Workbook.
Click on Data on the menu bar to show the Data Ribbon. Then click Get Data on the far left and choose From Other Sources->Blank Query. It's right at the bottom of the ribbon bar dropdown.
We are then presented with the Query Editor.
We now need to use the Advanced Editor to define our 'JSON' payload, and send it via an HTTP POST request. Click Advanced Editor on the left to show the existing Blank Query.
This has two blocks. The let part is a set of transformations to fetch and manipulate data and the in part defines what the final data set should be called.
This is using Power Query M syntax. To help understand the next steps, let's summarise the syntax for that.

Power Query M syntax in a nutshell

Power Query M can have constant strings and numbers. Constant strings are denoted by double quotes like "MongoDB." Numbers are just the unquoted number alone, i.e., 5.23. Constants cannot be on the left side of an assignment.
Something not surrounded by quotes is a variable—e.g., People or Source and can be used either side of an assignment. To allow variable names to contain any character, including spaces, without ambiguity variables can also be declared as a hash symbol followed by double quotes so #"Number of runs" is a variable name, not a constant.
Power Query M defines arrays/lists of values as a comma separated list enclosed in braces (a.k.a. curly brackets) so #"State Names" = { "on", "off", "broken" } defines a variable called State Names as a list of three string values.
Power Query M defines Records (Dynamic Key->Value mappings) using a comma separated set of variable=value statements inside square brackets, for example Person = [Name="John",Dogs=3]. These data types can be nested—for example, Person = [Name="John",Dogs={ [name="Brea",age=10],[name="Harvest",age=5],[name="Bramble",age=1] }].
If you are used to pretty much any other programming language, you may find the contrarian syntax of Power Query M either amusing or difficult.

Defining a JSON Object to POST to the Atlas Data API with Power Query M

We can set the value of the variable Source to an explicit JSON object by passing a Power Query M Record to the function Json.FromValue like this.
This is the request we are going to send to the Data API. This request will search the collection listingsAndReviews in a Cluster called Cluster0 for documents where the field property_type equals "House".
We paste the code above into the advanced Editor, and verify that there is a green checkmark at the bottom with the words "No syntax errors have been detected," and then we can click Done. We see a screen like this.
The small CSV icon in the grey area represents our single JSON Document. Double click it and Power Query will apply a basic transformation to a table with JSON fields as values as shown below.

Posting payload JSON to the Find Endpoint in Atlas from Excel

To get our results from Atlas, we need to post this payload to our Atlas API find endpoint and parse the response. Click Advanced Editor again and change the contents to those in the box below changing the value "data-amzuu" in the endpoint to match your endpoint and the value of YOUR-API-KEY to match your personal API key. You will also need to change Cluster0 if your database cluster has a different name.
You will notice that two additional steps were added to the Query to convert it to the CSV we saw above. Overwrite these so the box just contains the lines below and click Done.
You will now see this screen, which is telling us it has retrieved a list of JSON documents.
Before we go further and look at how to parse this result into our worksheet, let us first review the connection we have just set up.
The first line, as before, is defining postData as a JSON string containing the payload for the Atlas API.
The next line, seen below, makes an HTTPS call to Atlas by calling the Web.Contents function and puts the return value in the variable response.
The first parameter to Web.Contents is our endpoint URL as a string.
The second parameter is a record specifying options for the request. We are specifying two options: Headers and Content.
Headers is a record used to specify the HTTP Headers to the request. In our case, we specify Content-Type and also explicitly include our credentials using a header named api-key.
Ideally, we would use the functionality built into Excel to handle web authentication and not need to include the API key in the query, but Microsoft has disabled this for POST requests out of security concerns with Windows federated authentication (DataSource.Error: Web.Contents with the Content option is only supported when connecting anonymously). We unfortunately need to, therefore, supply it explicitly as a header.
We also specify Content=postData , this is what makes this become a POST request rather than a GET request and pass our JSON payload to the HTTP API.
The next line Source = Json.Document(response) parses the JSON that gets sent back in the response, creating a Power Query record from the JSON data and assigning it to a variable named Source.

Converting documents from MongoDB Atlas into Excel Rows

So, getting back to parsing our returned data, we are now looking at something like this. The parsed JSON has returned a single record with one value, documents, which is a list.In JSON it would look like this {documents : [ { … }, { … } , { … } ] }
How do we parse it? The first step is to press the Into Table button in the Ribbon bar which converts the record into a table. Now we have a table with one value 'Documents' of type list. We need to break that down.
Right click the second column (value) and select Drill Down from the menu. As we do each of these stages, we see it being added to the list of transformations in the Applied Steps list on the right-hand side.
We now have a list of JSON documents but we want to convert that into rows. First, we want to right-click on the word list in row 1 and select Drill Down from the menu again.
Now we have a set of records, convert them to a table by clicking the To Table button and setting the delimiter to None in the dialog that appears. We now see a table but with a single column called Column1. Finally, If you select the Small icon at the right-hand end of the column header you can choose which columns you want. Select all the columns then click OK.
Finally, click Close and Load from the ribbon bar to write the results back to the sheet and save the Query.

Parameterising Power Queries using JSON Parameters

We hardcoded this to fetch us properties of type "House"' but what if we want to perform different queries? We can use the Excel Power Query Parameters to do this.
Select the Data Tab on the worksheet. Then, on the left, Get Data->Launch Power Query Editor.
From the ribbon of the editor, click Manage Parameters to open the parameter editor. Parameters are variables you can edit via the GUI or populate from functions. Click New (it's not clear that it is clickable) and rename the new parameter to Mongo Query. Wet the type to Text and the current value to { beds: 2 }, then click OK. Now select Query1 again on the left side of the window and click Advanced Editor in the ribbon bar. Change the source to match the code below. Note that we are only changing the postData line.
What we have done is make postData take the value in the Mongo Query parameter, and parse it as JSON. This lets us create arbitrary filters by specifying MongoDB queries in the Mongo Query Parameter. The changed line is shown below.

Running MongoDB Aggregation Pipelines from Excel

We can apply this same technique to run arbitrary MongoDB Aggregation Pipelines. Right click on Query1 in the list on the left and select Duplicate. Then right-click on Query1(2) and rename it to Aggregate. Select it and then click Advanced Editor on the ribbon. Change the word find in the URL to aggregate and the word filter in the payload to pipeline.
You will get an error at first like this. This is because the parameter Mongo Query is not a valid Aggregation Pipeline. Click Manage Parameters on the ribbon and change the value to [{$sortByCount : "$beds" }]. Then Click the X next to Expanded Column 1 on the right of the screen  as the expansion is now incorrect.
Again, click on the icon next to Column1 and Select All Columns to see how many properties there are for a given number of beds - processing the query with an aggregation pipeline on the server.

Putting it all together

Using Power Query with parameters, we can specify the cluster, collection, database, and parameters such as the query, fields returned, sort order ,and limit. We can also choose, by changing the endpoint, to perform a simple query or run an aggregation pipeline.
To simplify this, there is an Excel workbook available here which has all of these things parameterised so you can simply set the parameters required and run the Power Query to query your Atlas cluster. You can use this as a starting point in exploring how to further use the Excel and Power Query to access data in MongoDB Atlas.

Facebook Icontwitter iconlinkedin icon
Rate this quickstart
star-empty
star-empty
star-empty
star-empty
star-empty
Related
Article

Exact Matches in Atlas Search: Beginners Guide


Aug 30, 2022 | 6 min read
Article

Build an E-commerce Search Using MongoDB Vector Search and OpenAI


Mar 12, 2024 | 12 min read
Tutorial

MongoDB Atlas With Terraform - Cluster and Backup Policies


Mar 04, 2024 | 22 min read
Tutorial

Adding Semantic Caching and Memory to Your RAG Application Using MongoDB and LangChain


Mar 20, 2024 | 16 min read
Table of Contents