Optimizing AWS Lambda With MongoDB Atlas & NodeJS

Raphael Londner
April 10, 2017 | Updated: February 22, 2021

I attended an AWS user group meeting some time ago, and many of the questions from the audience concerned caching and performance. In this post, I review the performance implications of using Lambda functions with any database-as-a-service (DBaaS) platform (such as MongoDB Atlas). Based on internal investigations, I offer a specific workaround available for Node.js Lambda functions. Note that other supported languages (such as Python) may only require implementing some parts of the workaround, as the underlying AWS containers may differ in their resource disposal requirements. I will specifically call out below which parts are required for any language and which ones are Node.js-specific.

AWS Lambda is serverless, which means that it is essentially stateless. Well, almost. As stated in its developer documentation, AWS Lambda relies on a container technology to execute its functions. This has several implications:

The first time your application invokes a Lambda function it will incur a penalty hit in latency – time that is necessary to bootstrap a new container that will run your Lambda code. The definition of "first time" is fuzzy, but word on the street is that you should expect a new container (i.e. a “first-time” event) each time your Lambda function hasn’t been invoked for more than 5 minutes.
If your application makes subsequent calls to your Lambda function within 5 minutes, you can expect that the same container will be reused, thus saving some precious initialization time. Note that AWS makes no guarantee it will reuse the container (i.e. you might just get a new one), but experience shows that in many cases, it does manage to reuse existing containers.
As mentioned in the How It Works page, any Node.js variable that is declared outside the handler method remains initialized across calls, as long as the same container is reused.
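The last point can be illustrated with a minimal sketch. The handler below is hypothetical and exists only for this illustration: it keeps a counter at module scope, so the counter is reset on a cold start but keeps growing while the same container is reused.

```javascript
'use strict';

// Module scope: this code runs once per container (on a cold start),
// not once per invocation.
let invocationCount = 0;
const containerStartedAt = Date.now();

// Hypothetical handler, used only to show that module-scope state survives
// between invocations served by the same container.
const handler = (event, context, callback) => {
    invocationCount += 1;
    callback(null, {
        invocationCount,
        coldStart: invocationCount === 1,
        containerAgeMs: Date.now() - containerStartedAt
    });
};

exports.handler = handler;
```

If two calls land on the same warm container, the second one reports an invocationCount of 2. If AWS starts a new container, the count starts again at 1, which is why this state should only ever be treated as a cache.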

Understanding Container Reuse in AWS Lambda, written in 2014, dives a bit deeper into the whole lifecycle of a Lambda function and is an interesting read, though it may not reflect more recent architectural changes to the service. Note that AWS makes no guarantee that containers are kept alive (even in a "frozen" mode) for 5 minutes, so don’t rely on that specific duration in your code.

In our very first attempt to build Lambda functions that would run queries against MongoDB Atlas, our database as a service offering, we noticed the performance impact of repeatedly calling the same Lambda function without trying to reuse the MongoDB database connection. The wait time for the Lambda function to complete was around 4-5 seconds, even with the simplest query, which is unacceptable for any real-world operational application.

In our subsequent attempts to declare the database connection outside the handler code, we ran into another issue: we had to call db.close() to effectively release the database handle, lest the Lambda function time out without returning to the caller. The AWS Lambda documentation doesn’t explicitly mention this caveat, which seems to be language-dependent, since we couldn’t reproduce it with a Lambda function written in Python.

Fortunately, we found out that Lambda’s context object exposes a callbackWaitsForEmptyEventLoop property that allows a Lambda function to return its result to the caller without requiring that the MongoDB database connection be closed (you can find more information about callbackWaitsForEmptyEventLoop in the Lambda developer documentation). This allows the Lambda function to reuse a MongoDB Atlas connection across calls and reduces the execution time to a few milliseconds (instead of a few seconds).

In summary, here are the specific steps you should take to optimize the performance of your Lambda function:

Declare the MongoDB database connection object outside the handler method, as shown below in Node.js syntax (this step is required for any language, not just Node.js):

'use strict';

const MongoClient = require('mongodb').MongoClient;

// Declared at module scope so the value survives across invocations of a warm container.
let cachedDb = null;

In the handler method, set context.callbackWaitsForEmptyEventLoop to false before attempting to use the MongoDB database connection object (this step is only required for Node.js Lambda functions):

exports.handler = (event, context, callback) => {

    context.callbackWaitsForEmptyEventLoop = false;

Reuse the cached database connection object if it is not null and db.serverConfig.isConnected() returns true; otherwise, create a new connection with the MongoClient.connect(uri) method and cache it (this step is required for any language, not just Node.js):

function connectToDatabase(uri) {
    // Reuse the cached handle when it exists and is still connected.
    if (cachedDb && cachedDb.serverConfig.isConnected()) {
        console.log('=> using cached database instance');
        return Promise.resolve(cachedDb);
    }
    const dbName = 'YOUR_DATABASE_NAME';
    // Otherwise open a new connection and cache the database handle for later calls.
    return MongoClient.connect(uri)
        .then(client => { cachedDb = client.db(dbName); return cachedDb; });
}

Do NOT close the database connection! (so that it can be reused by subsequent calls).
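Put together, the four steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the tutorial: createHandler, its connect and isAlive parameters, and the 'items' collection are hypothetical names introduced here so that the caching logic can be exercised without a live cluster. Some versions of the MongoDB Node.js driver may not expose serverConfig.isConnected(), which is another reason the liveness check is passed in rather than hard-coded.

```javascript
'use strict';

// Hypothetical factory. `connect(uri)` must return a Promise of a database
// handle and `isAlive(db)` must return true while that handle is usable.
// Both are injected (an assumption of this sketch, not a driver API).
function createHandler({ connect, isAlive, uri }) {
    // Call createHandler once at module load so that this variable
    // survives across invocations of a warm container.
    let cachedDb = null;

    function connectToDatabase() {
        if (cachedDb && isAlive(cachedDb)) {
            return Promise.resolve(cachedDb);
        }
        return connect(uri).then(db => {
            cachedDb = db;
            return cachedDb;
        });
    }

    return (event, context, callback) => {
        // Node.js-specific: return to the caller even though the open
        // database connection keeps the event loop non-empty.
        context.callbackWaitsForEmptyEventLoop = false;

        connectToDatabase()
            .then(db => db.collection('items').findOne({})) // 'items' is a placeholder
            .then(doc => callback(null, doc), err => callback(err));
        // The connection is deliberately not closed here.
    };
}

// Possible wiring with the driver calls used in this post (left commented out
// so the sketch has no dependencies; MONGODB_URI is an assumed variable name):
// const MongoClient = require('mongodb').MongoClient;
// exports.handler = createHandler({
//     uri: process.env.MONGODB_URI,
//     connect: uri => MongoClient.connect(uri)
//         .then(client => client.db('YOUR_DATABASE_NAME')),
//     isAlive: db => db.serverConfig.isConnected()
// });
```

Injecting connect and isAlive is a design choice of this sketch: it lets you check locally, with a stub, that a second invocation does not trigger a second connection.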

The Serverless development with Node.js, AWS Lambda and MongoDB Atlas tutorial post makes use of all these best practices, so I recommend that you take the time to read it. More experienced developers can also find optimized Lambda Node.js functions (with relevant comments) in:

I’d love to hear from you, so if you have any questions or feedback, don’t hesitate to leave them below.

Additionally, if you’d like to learn more about building serverless applications with MongoDB Atlas, I highly recommend our webinar below where we have an interactive tutorial on serverless architectures with AWS Lambda.

Watch Serverless Architectures with AWS Lambda and MongoDB Atlas

About the Author - Raphael Londner

Raphael Londner is a Principal Developer Advocate at MongoDB, focused on cloud technologies such as Amazon Web Services, Microsoft Azure and Google Cloud Engine. Previously he was a developer advocate at Okta as well as a startup entrepreneur in the identity management space. You can follow him on Twitter at @rlondner.

Learn more about using MongoDB with AWS, either self-managed or with our fully-managed database as a service, MongoDB Atlas. You can also check out information about the estimated cost of running MongoDB on AWS with MongoDB Atlas.
