Building a voice-activated movie search app powered by Amazon Lex, Lambda, and MongoDB Atlas - Part 1

Raphael Londner
November 12, 2017
Category: AWS re:Invent 2017, Technical, Cloud
It's that time of year again! This post is part of our Road to AWS re:Invent 2017 blog series. In the weeks leading up to AWS re:Invent in Las Vegas this November, we'll be posting about a number of topics related to running MongoDB in the public cloud. See all posts here.

Introduction

As we prepare to head out to Las Vegas for AWS re:Invent 2017, I thought it’d be a good opportunity to explore how to combine serverless and artificial intelligence services such as Lex and Lambda with MongoDB Atlas, our fully managed database service.

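This tutorial is divided into 3 parts: Part 1 (this post) introduces Amazon Lex and sets up the movie database in MongoDB Atlas, Part 2 creates the Lex bot, and Part 3 builds the Lambda function that queries the database.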

Since this is Part 1 of our blog series, let’s dig right into it now.

What is Amazon Lex?

Amazon Lex is a deep learning service provided by AWS to power conversational bots (more commonly known as "chatbots"), which can be either text- or voice-activated. It’s worth mentioning that Amazon Lex is the technology that powers Alexa, the popular voice service available with Amazon Echo products and mobile applications (hence the Lex name). Amazon Lex bots are built to perform actions (such as ordering a pizza), which in Amazon lingo are referred to as intents.

Note that each bot may perform multiple intents (such as “booking a flight” and “booking a hotel”), which can each be kicked off by distinct phrases (called utterances). This is where the Natural Language Understanding (NLU) power of Lex bots shines: you define a few sample utterances and let the Lex AI engine infer the possible variations of these utterances. Another interesting aspect of Lex’s AI engine is its Automatic Speech Recognition (ASR) technology, which converts spoken input into text.

Let's illustrate this concept with a fictitious movie search scenario. If you create a SearchMovies intent, you may want to define a sample utterance such as “I would like to search for a movie”, since you expect it to be what the user will say to express their movie search intention. But as you may well know, human beings tend to express the same intention in many different ways, depending on their mood, cultural background, language proficiency, etc. So if the user types (or says) “I’d like to find a movie” or “I’d like to see a movie”, what happens? Well, you’ll find that Lex is smart enough to figure out that those phrases have the same meaning as “I would like to search for a movie” and consequently triggers the SearchMovies intent.

However, as our ancestors the Romans would say, dura lex sed lex: if the user’s utterance veers too far away from the sample utterances you have defined, Lex stops detecting the match. For instance, while “I’d like to search for a motion picture” and “I’d like to see a movie” are detected as matches of our sample utterance (“I would like to search for a movie”), “I’d like to see a motion picture” is not (at least in the tests I performed).

The interim conclusion I drew from that small experiment is that Lex’s AI engine is not yet ready to power Blade Runner’s replicants or Westworld’s hosts, but it can definitely be useful in a variety of situations (and I’m sure the AWS researchers are hard at work refining it).

In order to fulfill the intent (such as providing the name of the movie the user is looking for), Amazon Lex typically needs some additional information, such as the name of a cast member, the movie genre and the movie release year. These additional parameters are called slots in Lex terminology, and they are collected one at a time, each after a specific Lex prompt.

For instance, after an utterance is detected to launch the SearchMovies intent, Lex may ask the following questions to fill all the required slots:

  • What's the movie genre? (to fill the genre slot)

  • Do you know the name of an actor or actress with a role in that movie? (to fill the castMember slot)

  • When was the movie released? (to fill the year slot)
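
The slot-filling flow above can be sketched in a few lines of Python. This is only an illustration of the concept: the dictionary layout and the next_prompt helper are hypothetical (they are not the Lex console or API schema), and only the slot names and prompts come from the list above.

```python
from typing import Optional

# Hypothetical description of the SearchMovies intent. The slot names and
# prompts mirror the questions listed above; the dict layout is illustrative
# only and is NOT the schema used by the Lex console or API.
SEARCH_MOVIES_INTENT = {
    "name": "SearchMovies",
    "sampleUtterances": ["I would like to search for a movie"],
    "slots": [
        {"name": "genre", "prompt": "What's the movie genre?"},
        {"name": "castMember",
         "prompt": "Do you know the name of an actor or actress with a role in that movie?"},
        {"name": "year", "prompt": "When was the movie released?"},
    ],
}


def next_prompt(intent: dict, filled: dict) -> Optional[str]:
    """Return the prompt for the first slot that has no value yet, or None."""
    for slot in intent["slots"]:
        if filled.get(slot["name"]) is None:
            return slot["prompt"]
    return None  # every slot is filled: the intent is ready for fulfillment


if __name__ == "__main__":
    # The genre is already known, so the bot asks about the cast member next.
    print(next_prompt(SEARCH_MOVIES_INTENT, {"genre": "Action"}))
```

Lex handles this elicitation loop for you; the sketch only shows why slots are collected one at a time until none are missing.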

Once all the required slots have been filled, Lex tries to fulfill the intent by passing all the slot values to some business logic code that performs the necessary action (e.g., searching for matching movies in a movie database or booking a flight). As expected, AWS promotes its own technologies, so Lex has built-in support for Lambda functions, but you can also "return parameters to the client", which is the method you’ll want to use if you want to process the fulfillment in your application code (used in conjunction with the Amazon Lex Runtime Service API).
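
To make the Lambda fulfillment path more concrete, here is a minimal sketch of what such a function could look like in Python. It assumes the event and response shapes of the original Lex (V1) Lambda integration, where the slot values arrive under currentIntent.slots and the function answers with a Close dialog action. The message text is a placeholder of mine, not the author's implementation (the real function is the subject of Part 3).

```python
def lambda_handler(event, context):
    """Fulfillment sketch: read the filled slots and close the conversation."""
    # Assumption: Lex (V1) passes the slot values under currentIntent.slots.
    slots = event["currentIntent"]["slots"]
    genre = slots.get("genre")
    cast_member = slots.get("castMember")
    year = slots.get("year")

    # Placeholder for the business logic: a real function would query the
    # movie database here instead of echoing the slot values back.
    summary = "Searching for {} movies released in {} with {}".format(
        genre, year, cast_member
    )

    # A "Close" dialog action tells Lex that the intent has been fulfilled.
    return {
        "dialogAction": {
            "type": "Close",
            "fulfillmentState": "Fulfilled",
            "message": {"contentType": "PlainText", "content": summary},
        }
    }
```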

Demo bot scenario

Guess what? This will be a short section since the scenario we will implement in this blog post series is exactly the "fictitious example" I described above (what a coincidence!).

Indeed, we are going to build a bot allowing us to search for movies among those stored in a movie database. The data store we will use is a MongoDB database running in MongoDB Atlas, which is a good serverless fit for developers and DevOps folks who don’t want to set up and manage infrastructure.

Speaking of databases, it’s time for us to deploy our movie database to MongoDB Atlas before we start building our Lex bot.

Data setup and exploration

To set up the movie database, follow the instructions available in this GitHub repository.

Note that in order to keep the database dump file under GitHub's 100 MB limit per file, the database I have included isn’t complete (for instance, it doesn’t include movies released prior to 1950, with sincere apologies to Charlie Chaplin fans).

Now, let’s take a look at a typical document in this database (Mr. & Mrs. Smith released in 2005):

{
    "_id" : ObjectId("573a13acf29313caabd287dd"),
    "ID" : 356910,
    "imdbID" : "tt0356910",
    "Title" : "Mr. & Mrs. Smith",
    "Year" : 2005,
    "Rating" : "PG-13",
    "Runtime" : "120 min",
    "Genre" : "Action, Comedy, Crime",
    "Released" : "2005-06-10",
    "Director" : "Doug Liman",
    "Writer" : "Simon Kinberg",
    "Cast" : [
        "Brad Pitt",
        "Angelina Jolie",
        "Vince Vaughn",
        "Adam Brody"
    ],
    "Metacritic" : 55,
    "imdbRating" : 6.5,
    "imdbVotes" : 311244,
    "Poster" : "http://ia.media-imdb.com/images/M/MV5BMTUxMzcxNzQzOF5BMl5BanBnXkFtZTcwMzQxNjUyMw@@._V1_SX300.jpg",
    "Plot" : "A bored married couple is surprised to learn that they are both assassins hired by competing agencies to kill each other.",
    "FullPlot" : "John and Jane Smith are a normal married couple, living a normal life in a normal suburb, working normal jobs...well, if you can call secretly being assassins \"normal\". But neither Jane nor John knows about their spouse's secret, until they are surprised to find each other as targets! But on their quest to kill each other, they learn a lot more about each other than they ever did in five (or six) years of marriage.",
    "Language" : "English, Spanish",
    "Country" : "USA",
    "Awards" : "9 wins & 17 nominations.",
    "lastUpdated" : "2015-09-04 00:02:26.443000000",
    "Type" : "movie",
    "Genres" : [
        "Action",
        "Comedy",
        "Crime"
    ]
}

The properties of interest to our use case are Cast, Genres and Year. Each movie record typically includes the principal cast members (Cast, stored as a string array), a list of genres the movie can be categorized in (Genres, stored as a string array) and a release year (Year, stored as a 4-digit integer).

These are the 3 properties we will leverage in our Lex bot (which we will create in Part 2) and consequently in our Lambda function (which we will build in Part 3) responsible for querying our movies database.
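
As a rough preview of how these three properties could drive a query, the sketch below turns slot values into a MongoDB filter document. The build_movie_filter helper is hypothetical, and matching the year exactly is an assumption on my part (the actual query logic is built in Part 3). It relies on the fact that an equality condition on an array field matches when any element of the array equals the value.

```python
from typing import Any, Dict, Optional


def build_movie_filter(
    cast_member: Optional[str],
    genre: Optional[str],
    year: Optional[int],
) -> Dict[str, Any]:
    """Build a MongoDB filter document from the values collected by the bot."""
    query: Dict[str, Any] = {}
    if cast_member:
        # "Cast" is a string array: equality matches any element of the array.
        query["Cast"] = cast_member
    if genre:
        # Same array semantics for "Genres".
        query["Genres"] = genre
    if year is not None:
        # Assumption: the release year is matched exactly.
        query["Year"] = int(year)
    return query


if __name__ == "__main__":
    print(build_movie_filter("Brad Pitt", "Action", 2005))
```

With a driver such as PyMongo, the resulting dictionary could be passed as the filter argument of a find call on the movies collection. Note that these equality matches are case-sensitive.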

Storing the cast members and genres as string arrays (rather than as a single comma-separated string, as in the Genre property) is key to ensuring that our bot is responsive: arrays allow us to build small, multikey indexes that make our queries much faster than full collection scans (which regex queries would trigger).
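
For illustration, here is one way such indexes might be declared from Python. The index layout is an assumption of mine, not necessarily the one used later in this series: each array field is paired with Year in its own compound index, because MongoDB does not allow a single compound index to cover two array fields of the same document. The ensure_indexes helper is hypothetical and expects a PyMongo-style collection object exposing create_index.

```python
ASCENDING = 1  # same value as pymongo.ASCENDING

# Hypothetical index layout: one compound multikey index per array field.
# Cast and Genres cannot share a compound index, since both are arrays.
INDEX_SPECS = [
    [("Cast", ASCENDING), ("Year", ASCENDING)],
    [("Genres", ASCENDING), ("Year", ASCENDING)],
]


def ensure_indexes(collection) -> list:
    """Create each index on a PyMongo-style collection; return the index names."""
    return [collection.create_index(spec) for spec in INDEX_SPECS]
```

MongoDB marks an index as multikey automatically as soon as an indexed field holds an array, so no special option is needed when creating it.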

Summary

In this blog post, we introduced the core concepts of Amazon Lex and described the scenario of the Lex bot we’ll create in Part 2. We then deployed a sample movie database to MongoDB Atlas, explored the structure of a typical movie document and identified the fields we’ll use in the Lambda function we’ll build in Part 3. Finally, we reviewed the benefits of using secondary indexes on these fields to speed up our queries.

I have only scratched the surface on all these topics, so here is some additional content for those of you who strive to learn more:

Stay tuned for Part 2!

About the Author - Raphael Londner

Raphael Londner is a Principal Developer Advocate at MongoDB, focused on cloud technologies such as Amazon Web Services, Microsoft Azure and Google Cloud Engine. Previously he was a developer advocate at Okta as well as a startup entrepreneur in the identity management space. You can follow him on Twitter at @rlondner
