MongoDB
MongoDB Developer Centerchevron-right
Developer Topicschevron-right
Productschevron-right
MongoDBchevron-right

Generating MQL Shell Commands Using OpenAI and New mongosh Shell

Pavel DuchovnyPublished Jul 23, 2021 • Updated May 10, 2022
MongoDBShell
Copy Link
facebook icontwitter iconlinkedin icon
random alt
Rate this article
star-empty
star-empty
star-empty
star-empty
star-empty

Generating MQL Shell Commands Using OpenAI and New mongosh Shell

OpenAI is a fascinating and growing AI platform sponsored by Microsoft, allowing you to digest text cleverly to produce AI content with stunning results considering how small the “learning data set” you actually provide is.
MongoDB’s Query Language (MQL) is an intuitive language for developers to interact with MongoDB Documents. For this reason, I wanted to put OpenAI to the test of quickly learning the MongoDB language and using its overall knowledge to build queries from simple sentences. The results were more than satisfying to me. Github is already working on a project called Github copilot which uses the same OpenAI engine to code.
In this article, I will show you my experiment, including the game-changing capabilities of the new MongoDB Shell (mongosh) which can extend scripting with npm modules integrations.

What is OpenAI and How Do I Get Access to It?

OpenAI is a unique project aiming to provide an API for many AI tasks built mostly on Natural Language Processing today. You can read more about their projects in this blog.
There are a variety of examples for its text processing capabilities.
If you want to use OpenAI, you will need to get a trial API key first by joining the waitlist on their main page. Once you are approved to get an API key, you will be granted about $18 for three months of testing. Each call in OpenAI is billed and this is something to consider when using in production. For our purposes, $18 is more than enough to test the most expensive engine named “davinci.”
Once you get the API key, you can use various clients to run their AI API from your script/application.
Since we will be using the new mongosh shell, I have used the JS API.

Preparing the mongosh to Use OpenAI

First, we need to install the new shell, if you haven’t done it so far. On my Mac laptop, I just issued:
Windows users should download the MSI installer from our download page and follow the Windows instructions.
Once my mongosh is ready, I can start using it, but before I do so, let’s install OpenAI JS, which we will import in the shell later on:
I’ve decided to use the Questions and Answers pattern, in the form of Q: <Question> and A: <Answer>, provided to the text to command completion API to provide the learning material about MongoDB queries for the AI engine. To better feed it, I placed the training questions and answers in a file called AI-input.txt and its content:
We will use this file later in our code.
This way, the completion will be based on a similar pattern.
Prepare Your Atlas Cluster
MongoDB Atlas, the database-as-a-platform service, is a great way to have a running cluster in seconds with a sample dataset already there for our test. To prepare it, please use the following steps:
  1. Create an Atlas account (if you don’t have one already) and use/start a cluster. For detailed steps, follow this documentation.
Use the copied connection string, providing it to the mongosh binary to connect to the pre-populated Atlas cluster with sample data. Then, switch to sample_restaurants database.

Using OpenAI Inside the mongosh Shell

Now, we can build our textToMql function by pasting it into the mongosh. The function will receive a text sentence, use our generated OpenAI API key, and will try to return the best MQL command for it:
In the above function, we first load the OpenAI npm module and initiate a client with the relevant API key from OpenAI.
The new shell allows us to import built-in and external modules to produce an unlimited flexibility with our scripts.
Then, we read the learning data from our AI-input.txt file. Finally we add our Q: <query> input to the end followed by the A: value which tells the engine we expect an answer based on the provided learningPath and our query.
This data will go over to an OpenAI API call:
The call performs a completion API and gets the entire initial text as a prompt and receives some additional parameters, which I will elaborate on:
  • engine: OpenAI supports a few AI engines which differ in quality and purpose as a tradeoff for pricing. The “davinci” engine is the most sophisticated one, according to OpenAI, and therefore is the most expensive one in terms of billing consumption.
  • temperature: How creative will the AI be compared to the input we gave it? It can be between 0-1. 0.3 felt like a down-to-earth value, but you can play with it.
  • Max_tokens: Describes the amount of data that will be returned.
  • Stop: List of characters that will stop the engine from producing further content. Since we need to produce MQL statements, it will be one line based and “\n” is a stop character.
Once the content is returned, we parse the returned JSON and print it with console.log.
Lets Put OpenAI to the Test with MQL
Once we have our function in place, we can try to produce a simple query to test it:
Nice! We never taught the engine about the restaurants collection or how to filter with regex operators but it still made the correct AI decisions.
Let's do something more creative.
Okay, now let's put it to the ultimate test: aggregations!
Now that is the AI power of MongoDB pipelines!

DEMO

asciicast

Wrap-Up

MongoDB's new shell allows us to script with enormous power like never before by utilizing npm external packages. Together with the power of OpenAI sophisticated AI patterns, we were able to teach the shell how to prompt text to accurate complex MongoDB commands, and with further learning and tuning, we can probably get much better results.
Try this today using the new MongoDB shell.

Copy Link
facebook icontwitter iconlinkedin icon
Rate this article
star-empty
star-empty
star-empty
star-empty
star-empty
Related
Article
Separating Data That is Accessed Together

May 31, 2022
Tutorial
Currency Analysis with Time Series Collections #1 — Generating Candlestick Charts Data

May 16, 2022
Quickstart
Getting Started with Aggregation Pipelines in Python

Sep 23, 2022
Quickstart
Quick Start: BSON Data Types - ObjectId

Sep 23, 2022
Table of Contents