Adopting a Serverless Approach at Bazaarvoice with MongoDB Atlas and AWS Lambda
I recently had the pleasure of welcoming Ani Hammond, Senior Staff Software Engineer from Bazaarvoice, to the MongoDB World stage. To a completely packed room, Ani chronicled her team’s journey as they replatformed Bazaarvoice’s Curations service from a runaway monolith architecture to a completely serverless architecture backed by MongoDB Atlas.
Even if you’ve never heard of Bazaarvoice, it’s almost impossible that you’ve never interacted with their services. To use Ani’s own description, “If you're shopping online and you’re reading a review, it's probably powered by us.”
Bazaarvoice strives to connect brands and retailers with consumers through the gathering, curation, and display of user-generated content—anything from pictures on Instagram to an online product review—during a potential customer’s buying journey.
To give you a sense of the scale of this task, Bazaarvoice clocked over a billion total page views between Thanksgiving Day and Cyber Monday in 2017, peaking at around 6,000 page views per second!
One of the technologies behind this herculean task is the Curations platform. To understand how this platform works, let’s look at an example:
An Instagram user posts a cute photo of their child wearing a particular brand’s rain boots. Using Curations, that brand is watching for specific content that mentions their products, so the social collection service picks up that post and shows it to the client team in the Curations application. The post can then be enriched in various manual and automatic ways. For example, a member of the client team can append metadata describing the product contained in the image, or automatic rules can filter content for potentially offensive material. The Curations platform then automates the process of securing the original poster’s permission for the client to use their content. Now, this user-generated content can be displayed in real time on the brand’s homepage or product pages to potential customers considering similar products.
In a nutshell, this is what Curations does for hundreds of clients and hundreds of thousands of individual content pieces.
The technology behind Curations was previously a monolithic Python/Django-based stack on Amazon EC2 instances on top of a MySQL datastore deployed via RDS.
This platform was effective in allowing Bazaarvoice to scale to hundreds of new clients. However, this architecture did have an Achilles heel: each additional client onboarded to Bazaarvoice’s platform represented an additional Python/Django/MySQL cluster to manage. Not only was this configuration expensive (approximately $60,000/month), but the operational overhead generated by each additional cluster made debugging, patching, releases, and general data management an ever-growing challenge. As Ani put it, “Most of our solutions were basically to throw more hardware/money at the problem and have a designated DevOps person to manage these clusters.”
One of the primary factors in selecting MongoDB for the new Curations platform was its support for a variety of different access patterns. For example, the part of the platform responsible for sourcing new social content had to support high write volume whereas the mechanism for displaying the content to consumers is read-intensive with strict availability requirements.
Diving into the specifics of why the Bazaarvoice team opted to move from a MySQL-based stack to one built on MongoDB is a blog post for another day. (Though, if you’d like to see what motivated other teams to do so, I recommend How DevOps, Microservices, and MongoDB are Making HSBC “Simpler, Better, and Faster” and Breuninger delivers omnichannel shopping experience for thousands of daily online users.)
That is to say, the focus of this particular post is the paradigm shift the Curations team made from a linearly-scaling monolith to a completely serverless approach, underpinned by MongoDB Atlas.
The new Curations platform is broken into three distinct services for content collection, enrichment, and display. The collection service is powered by a series of AWS Lambda functions, written in Node.js and triggered by an Amazon Kinesis stream, whereas the enrichment and display services are built on autoscaling AWS Elastic Beanstalk instances. All three services making up the new Curations platform are backed by MongoDB Atlas.
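To make the collection path more concrete, here is a minimal sketch of a Kinesis-triggered Lambda handler in TypeScript. The record layout (a base64-encoded `kinesis.data` field per record) follows the standard event that Lambda delivers for Kinesis triggers. The `SocialPost` fields and the injected `storePosts` callback are hypothetical stand-ins, since the talk does not describe the real schema or write path.

```typescript
// Minimal subset of the event shape Lambda passes for a Kinesis trigger.
interface KinesisEvent {
  Records: { kinesis: { data: string } }[];
}

// Hypothetical payload; the real Curations schema is not described.
interface SocialPost {
  id: string;
  network: string;
  text: string;
}

// Kinesis delivers each record's payload as base64-encoded bytes.
function decodePosts(event: KinesisEvent): SocialPost[] {
  return event.Records.map((record) => {
    const json = Buffer.from(record.kinesis.data, "base64").toString("utf8");
    return JSON.parse(json) as SocialPost;
  });
}

// Handler factory: the storage step is injected (an assumption of this
// sketch) so the code stays independent of any particular database driver.
function makeHandler(storePosts: (posts: SocialPost[]) => Promise<void>) {
  return async (event: KinesisEvent): Promise<number> => {
    const posts = decodePosts(event);
    await storePosts(posts);
    return posts.length; // number of posts handled in this batch
  };
}
```

In a deployment along these lines, `storePosts` would be the function that writes the decoded batch to the database, and the function returned by `makeHandler` would be exported as the Lambda entry point.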
Not only did this approach address the cluster-per-customer challenges of the old system, but the monthly costs were reduced by nearly 90% to approximately $6,500/month. The results are, again, best captured by Ani’s own words:
Massive cost savings, huge performance gains, strong consistency, and a handful of services rather than hundreds of clusters.
MongoDB Atlas was a natural fit in this new serverless paradigm as the team is fully able to focus on developing their product rather than on infrastructure management. In fact, the team had originally opted to manage the MongoDB instances on AWS themselves. After a couple of iterations of manual deployment and management, a desire to gain even more operational efficiency and increased insight into database performance prompted their move to Atlas. According to Ani, the cost of migrating to and leveraging a fully managed service was “way cheaper than having dedicated DevOps engineers.” Atlas’ support for direct VPC peering also made the transition to a hosted solution straightforward for the team.
Speaking of DevOps, one of the first operational benefits Ani and her team experienced was the ability to easily optimize their index usage in MongoDB. Previously, their approach to indexing was “build stuff that makes sense at the time and is easy to iterate on.” After getting up and running on Atlas, they were able to use the built-in Performance Advisor to make informed decisions on indexes to add and unused ones to remove. As Ani puts it:
An index killed is as valuable as an index added. This ensures all your indexes fit into memory and a bad index doesn't push out the good ones.
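One way to approximate this kind of index review by hand is to look at per-index access counters. The sketch below is not how Performance Advisor works. It assumes usage statistics have already been collected (for example from MongoDB's `$indexStats` aggregation stage, simplified here to a plain numeric `ops` counter) and only flags indexes with no recorded accesses. The `IndexUsage` type and the sample values are hypothetical.

```typescript
// Simplified, assumed shape of one index-usage entry.
interface IndexUsage {
  name: string;
  accesses: { ops: number };
}

// Returns the names of indexes with zero recorded accesses. The default
// _id index is skipped because it cannot be dropped.
function findUnusedIndexes(stats: IndexUsage[]): string[] {
  return stats
    .filter((entry) => entry.name !== "_id_" && entry.accesses.ops === 0)
    .map((entry) => entry.name);
}

// Hypothetical sample data for illustration only.
const sampleStats: IndexUsage[] = [
  { name: "_id_", accesses: { ops: 0 } },
  { name: "clientId_1_createdAt_-1", accesses: { ops: 48210 } },
  { name: "legacyTag_1", accesses: { ops: 0 } },
];

const dropCandidates = findUnusedIndexes(sampleStats);
```

Access counters typically start over when a server restarts, so a zero count is best treated as a prompt to investigate rather than proof that an index is safe to drop.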
Ani’s team also used the Atlas Performance Advisor to diagnose and correct inefficient queries. According to her, the built-in tools helped keep the team honest: “[People] say, ‘My database isn't scaling. It's not able to perform complex queries in real time...it doesn't work.’ Fix your code. The hardware is great, the tools are great but they can only carry you so far. I think sometimes we tend to get sloppy with how we write our code because of how cheap and how easy hardware is but we have to write code responsibly too.”
In another incident, a different Atlas feature, the Real Time Performance Panel, was key to identifying an issue with high load times in the display service. Some clients’ displays were taking more than 6 seconds to load. (For context, content delivery network provider Akamai found that a two-second delay in web page load time can cause bounce rates to double!) High-level metrics in Datadog reported query response times of 5+ seconds, while Atlas reported response times of less than 100 ms for the same query. The team used both data points to triangulate and soon realized the discrepancy was a result of the time it took for Lambda to connect to MongoDB for each new operation. Switching from standard Lambda functions to a dockerized service ensured each operation could leverage an open connection rather than initiating a “cold start.”
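The connection-reuse idea can be sketched as follows: a long-running service opens its database connection once and shares it across requests, instead of paying the connection cost on every operation. This is an illustration under stated assumptions, not the team's code. `DbClient`, `openConnection`, and `handleRequest` are hypothetical names, and the fake client stands in for a real MongoDB driver client.

```typescript
// Hypothetical stand-in for a database client; a real service would use
// the MongoDB driver's client type here instead.
interface DbClient {
  query(displayId: string): Promise<string[]>;
}

// Counts how many connections were opened, for illustration only.
let connectionsOpened = 0;

// Assumed connection factory. In practice this is the slow step.
function openConnection(): Promise<DbClient> {
  connectionsOpened += 1;
  const client: DbClient = {
    query: async (displayId) => [`content for ${displayId}`],
  };
  return Promise.resolve(client);
}

// Cached at module scope so it outlives any single request.
let cachedClient: Promise<DbClient> | undefined;

// Opens the connection on first use and returns the same promise afterwards.
function getClient(): Promise<DbClient> {
  if (cachedClient === undefined) {
    cachedClient = openConnection();
  }
  return cachedClient;
}

// Each request reuses the shared connection instead of reconnecting.
async function handleRequest(displayId: string): Promise<string[]> {
  const client = await getClient();
  return client.query(displayId);
}
```

Caching the promise rather than the resolved client means that requests arriving while the first connection is still being established wait on the same attempt instead of each opening their own.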
I know a lot of the cool things that Atlas does can be done by hand but unless this is your full-time job, you're just not going to do it and you’re not going to do it as well.
Before wrapping up her presentation, Ani shared an improvement over the old system that the team wasn’t expecting. Using Atlas, they were able to provide the customer support and services teams read-only views into the database. This afforded them deeper insight into the data and allowed them to perform ad-hoc queries directly. The result was a more proactive approach to issue management, leading to an 80% reduction in inbound support tickets.
By re-architecting their Curations platform, Bazaarvoice is well-positioned to bring on hundreds of new clients without a proportional increase in operations work for the team. But once again, Ani summarized it best:
As the old commercial goes… ‘Old platform: $60,000. New platform: $6,000. Getting to focus all of my time on development: priceless.’
Thank you very much to Ani Hammond and the rest of the Curations team at Bazaarvoice for putting together the presentation that inspired this post. Be sure to check out Ani’s full presentation in addition to dozens of other high-quality talks from MongoDB World on our YouTube channel.
High-end retailer in Germany delivers omni-channel shopping experience on MongoDB Atlas for thousands of daily online users
The importance of delivering an optimized customer experience cannot be overstated, especially if your business is high-end retail. For Breuninger, the customer-first approach has been in their DNA for more than 130 years.
When the top German retailer set out to build a new e-commerce platform, they wanted the online experience to match that of walking in to one of Breuninger’s premium department stores. Accomplishing this goal required a feature-rich, high-performance, and reliable database capable of supporting complex data sets across multiple categories.
“Today, our development teams have a lot of independence. We only have a handful of rules about how they design and build applications within their respective business units,” says Benedikt Stemmildt, Lead Software Architect of E. Breuninger GmbH & Co. “It’s not quite a rule that you have to use MongoDB, but you do have to explain yourself if you don’t.”
However, it wasn’t always this way. Breuninger’s previous platform was built on one of the industry-standard product content management (PCM) platforms, which Stemmildt felt was “monolithic and difficult to code for.” Code freezes were common and the underlying architecture was a frequent cause of frustration for an organization striving to adopt more agile processes.
A new development and feature roll-out approach was needed to execute the company’s aggressive omni-channel integration plans, and time to market for new online features became a top priority. Breuninger decided to build a technology group in response, going from 10 to 30 in-house developers in just a year.
“We broke down our monolithic architecture and split our application into separate microservices that reflect how our customers shop in the physical stores,” Stemmildt says. “It’s the customer journey — they search, discover, evaluate, and buy not just individual products, but complete outfits.”
“To reflect this architectural change, we split our development teams by different steps of the customer journey and kept dependencies to an absolute minimum,” Stemmildt continues. “One key to making this work is a high-performance database capable of working easily with data in lots of different ways. The document model of MongoDB means we can deliver data with the quality and detail that reflects our products and shopping experience.”
The result? Much faster time to market. Breuninger was able to build their omni-channel platform in months rather than years by enabling teams to decide on important architectural components for their own sections, without having to ask the permission of other teams.
As a seven-year veteran of MongoDB, Stemmildt was confident in recommending the database to his organization. “There are a lot of good databases,” he says. “However, many of them require developers to have a deep knowledge about how they work before getting any benefit. MongoDB is not like that. It’s very quick to learn and start getting results. Our teams are able to deliver features straight away. Once users do expand their use of the database, it’s so feature-rich that you never get a sense of having to push it beyond what it was designed for.”
And agile wouldn’t be agile without automation. “Everything we deploy is automated, and with MongoDB Atlas on AWS, the deployment and management of our databases fit neatly into our processes. After a period of operating MongoDB ourselves on EC2, it’s great not having to worry about the details and not having to spend time setting up, configuring, and managing database[s]. You free up a lot of opportunities to add value to your service by not running things yourself.”
AWS offers a healthy mix of other tools for the teams at Breuninger to leverage, such as a managed Kubernetes service and serverless Lambda functions. MongoDB Atlas and AWS also help Breuninger stay on the right side of the regulators. “We need to comply with GDPR so we keep everything running within our borders. MongoDB Atlas’s built-in security features have helped us satisfy these requirements.”
The finished platform might look different to someone who is used to traditional architectures, but to Stemmildt, not being restrained by legacy approaches makes a lot of sense. “Each of our teams owns one or more sections of the customer journey. The search team updates its own database, pulling data in from the product data producer via a feed and re-populating its own database as needed. We don’t have to ripple refreshes out across the system as they happen. That means each team is free to add new features without changing some core database component and affecting other teams. Self-contained systems are an important design rule.”
And although there are some 25 different and largely independent systems, the customers see just one website. A front-end proxy uses server-side includes to marshal data as required from a mix of micro-frontends before delivering the final composite to the shopper. Product data, product availability, outfit data, price information, navigation metadata — these are all woven together from separate MongoDB databases as the customer goes through the shopping experience online.
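The composition step can be illustrated with a small sketch. It assumes the common `<!--#include virtual="..." -->` directive syntax for server-side includes and resolves fragments from an in-memory map, whereas a real proxy would fetch each fragment from the owning team's micro-frontend. The function name and the paths used are hypothetical.

```typescript
// Matches directives of the form <!--#include virtual="/path" -->
const INCLUDE_DIRECTIVE = /<!--#include virtual="([^"]+)" -->/g;

// Replaces each include directive with the fragment registered for its
// path. Unknown paths become an empty string so that one missing fragment
// does not break the whole page.
function composePage(template: string, fragments: Map<string, string>): string {
  return template.replace(
    INCLUDE_DIRECTIVE,
    (_match: string, path: string) => fragments.get(path) ?? "",
  );
}

// Hypothetical fragments, each owned by a different team.
const fragments = new Map<string, string>([
  ["/product/123", "<h1>Rain jacket</h1>"],
  ["/price/123", "<p>EUR 199</p>"],
]);

const composedPage = composePage(
  '<main><!--#include virtual="/product/123" --><!--#include virtual="/price/123" --></main>',
  fragments,
);
```

Because each directive is resolved independently, a team can change its own fragment without coordinating a release with the others, which matches the self-contained-systems rule described above.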
Comparing a microservices architecture to a monolithic one revealed to Breuninger that some metrics don’t matter as much as they once did, while others matter more. “With multiple teams developing things so rapidly, I don’t know exactly how much total data is in play. But we are a very metrics-driven company, not just in the technical infrastructure but across the business. We know when a component is and is not working well from both a technical and business perspective, if it needs optimizing for performance, or whether it is delivering value to the business or we need to revisit that aspect of the system architecture.”
While Stemmildt couldn’t comment too much on future plans, he’s enthusiastic about MongoDB’s part in whatever they may be. “We wanted high performance, but most importantly we wanted to be able to add more features. We’re not using MongoDB’s graph database feature yet, but we may be by the end of the year. There are a lot of things we could do with text search, too.”
Other new features — such as multi-document transaction support in MongoDB 4.0 — may also be useful, but in unorthodox ways. “I don’t actually think transactions are needed anymore for our platform,” he laughs, “But there are some teams, like the customer data team, who don’t agree with me yet and won’t use MongoDB because of that. So the release of MongoDB 4.0 will help me to help them make the transition.”
While customers won’t see the nuts and bolts of Breuninger’s transformation to a data-driven enterprise, they will benefit from the company’s newly integrated omni-channel platform, which delivers an improved customer experience and more ways to get inspired.
And to anyone thinking about using MongoDB on their next project, Stemmildt has just one piece of advice: “Use it. Get a MongoDB Atlas account, create a cluster, and play with it. The way we see it, after the majority of our teams have naturally adopted MongoDB, if you can’t say why you should use another database, then you should just use MongoDB.”
2018 MongoDB Innovation Award Winners
We received an overwhelming number of nominations for the fifth annual MongoDB Innovation Awards, recognizing companies who are using MongoDB to dream big and deliver incredibly bold, innovative solutions that are moving forward industries and changing lives for the better.
We are thrilled to announce our 12 winners who will be honored at MongoDB World, New York, June 26 and 27.
See the full list and read a bit more about how they are disrupting the status quo here:
Global Go to Market Partner of the Year: Accenture
Accenture is a leading global professional services company, providing a broad range of services and solutions in strategy, consulting, digital, technology and operations. Combining unmatched experience and specialized skills across more than 40 industries and all business functions, Accenture works at the intersection of business and technology to help clients improve their performance and create sustainable value for their stakeholders. The company partners with more than three-quarters of the Fortune Global 500, driving innovation to improve the way the world works and lives. Accenture and MongoDB have worked together to help organizations leverage the power of data to gain a competitive edge.
The Enterprise: Charles Schwab
Charles Schwab is one of the largest financial services firms in the United States. To improve customer experience, speed up development cycles, and prepare for cloud portability, Charles Schwab is modernizing a significant portion of its applications by migrating to MongoDB-powered microservices. Multiple applications are built on MongoDB, including an authentication app leveraged by retail customers as well as a portfolio management solution utilized by registered investment advisors.
Launch Fast: Coinbase
Coinbase is dedicated to creating an open financial system for the world and defining what the future of finance will look like. To do this, they built the most trusted and regulatory-compliant global cryptocurrency trading platform to broker exchanges of Bitcoin, Bitcoin Cash, Ethereum and Litecoin, as well as pioneering cryptocurrency indexes and institutional cryptocurrency trading. In 2017, they experienced exponential growth, with 20M+ users and $150B+ traded on their platform in over 190 countries. The Coinbase engineering team scaled and optimized MongoDB to respond to this unprecedented volume of traffic and to prepare for future waves of cryptocurrency enthusiasm.
Scale: Epic Games
Epic Games develops cutting-edge games and cross-platform gaming engine technology. Their massively popular, multi-platform game, Fortnite, has been played by more than 125 million gamers around the globe. The Epic team has implemented a number of best practices and performance improvements to get the best scaling and availability characteristics out of MongoDB.
Data-Driven Business: Freddie Mac
Freddie Mac set out to modernize a number of applications that were previously built on legacy relational databases. One mission-critical application, a property appraisal tool, held massive amounts of property and loan information, but was increasingly expensive and time consuming to update. Turning to MongoDB, Freddie Mac was able to collect information from a variety of different sources in a variety of formats to build a single view of all the information needed to accurately appraise a property. In the months since using MongoDB, Freddie Mac has seen an increase in developer productivity.
Customer Experience: Fresenius Medical Care North America
Fresenius Medical Care North America is the premier health care company focused on providing the highest quality care to people with renal and other chronic conditions. Through its industry-leading network of dialysis facilities, outpatient cardiac and vascular labs, and urgent care centers, Fresenius Medical Care North America (FMCNA) provides coordinated health care services at pivotal care points for hundreds of thousands of chronically ill customers throughout the continent.
Since 2015, FMCNA has used MongoDB Enterprise Advanced for a variety of projects to help support their mission to deliver superior care that improves the quality of life of every patient. These projects have included analytics platforms, a data lake and the FHIR platform (a healthcare standard for exchanging medical records securely and at scale). However, the most impactful application has been a single view of the patient platform built on MongoDB. This platform brings together a variety of data sources to ensure the patient, doctors and other caregivers all have a complete understanding of the treatments required and can make adjustments with confidence.
Healthcare: Genomics England
Genomics England, a company owned by the UK government's Department of Health and Social Care, is working with the NHS to sequence 100,000 genomes from patients with rare diseases and their families, as well as patients with common cancer. In the future, there may be a diagnosis where there wasn't one before and, in time, there is the potential of new and more effective personalized treatments for patients.
On average, 1,000 genomes are sequenced per week, which amounts to around 10 terabytes of data per day. To manage this immense and sensitive data set as well as power the data science that makes it all possible, Genomics England used MongoDB Enterprise Advanced. The partnership with MongoDB allows the processing time for complex queries to be reduced from hours to milliseconds, which means scientists can discover new insights more quickly.
Internet of Things: Humana Inc.
With a variety of applications built on MongoDB, Humana is changing healthcare for the better. One of their IoT applications, Go365, is a corporate wellness and rewards program that helps employees live healthier lives, which in turn increases productivity and reduces overall health claims costs for employers. Go365 features a personalized program that inspires, supports, and rewards members for taking steps to improve and continue healthy behavior. Users are able to compete in challenges, connect their fitness devices and mobile apps to log healthy activities and earn points, reward themselves through the Go365 Mall, and track their progress. In fact, by year 3, people who engaged with the program saw the cost of their health claims reduced by over 10% relative to those of unengaged members.
Delivery Partner of the Year: Infosys
A perennial winner, Infosys has now won a MongoDB Innovation Award three years in a row. As a global leader in consulting, technology and next-generation services, this year Infosys has been working closely with MongoDB to accelerate application modernization for client organizations. A key part of this is the joint delivery of single view and mainframe modernization offerings to migrate and digitize business-critical applications away from rigid tabular databases and onto next-generation technology. In this long-standing partnership, Infosys and MongoDB are already helping many large enterprises renew and modernize their IT landscape.
The William Zola Award for Community Excellence: Ken W. Alger
Ken Alger is one of the most prolific bloggers on MongoDB's technology with dozens of posts in the past two years. He is a self-taught programmer and a teacher at Treehouse. An avid follower of open-source, he has previously sat on the board of directors of the Django Software Foundation. He is delighted to share his extensive MongoDB knowledge via his blog, Twitter, and his GitHub account. He exemplifies the true community spirit of MongoDB and The William Zola Award for Community Excellence.
Savvy Startup: Radar
Radar, a seed-stage startup and member of the MongoDB Startup Accelerator program, has built iOS and Android SDKs on MongoDB Atlas and AWS. As the location platform for modern apps, Radar allows developers to easily add location context and tracking to their applications. Radar currently runs on more than 25 million devices around the globe, processing billions of locations each week.
7-Eleven is continuing to redefine what convenience is. By leveraging MongoDB Atlas on AWS and a microservices architecture, 7-Eleven has built an e-commerce application called 7-Now, which allows consumers to browse a product catalog connected to their local store’s inventory, make purchases on their mobile phones, and schedule in-store pick up or delivery through services like Postmates. This application not only streamlines the consumer’s experience, but also gives the 7-Eleven team extensive analytics capabilities, allowing them to improve the overall customer experience. This is sure to have a major impact across their 10,000 stores in the US and Canada, especially with 60% of the US population living within one mile of a 7-Eleven.
Stratifyd & MongoDB: AI-Driven Business Insights to Keep Customers Happy
2017 was a banner year for MongoDB's partner ecosystem. We remain strategic about engaging with our channels, and the results are validating our approach. Our strong network of ISVs, global system integrators, cloud, resellers, and technology partners is a competitive differentiator that helps us scale.
We are especially excited about the innovation and growth in store for our ISV business in 2018. It's already off to a great start. Our newest ISV partner Stratifyd is a fantastic example of how platforms built around MongoDB address serious market needs with the most cutting-edge, innovative technology.
Stratifyd is an end-to-end customer analytics platform powered by AI. The platform provides competitive advantages to some of the most recognized brands in the world. LivePerson, Etsy, MASCO, Kimberly-Clark, and many more rely on Stratifyd for a 360-degree view of their end customers.
Stratifyd analyzes customer interactions such as online reviews, social media posts, phone calls, emails, chats, surveys, CRM data, and more to turn them into actionable business insights which increase customer acquisition and retention, which is critical to the continued success of Stratifyd’s clients. In addition to these benefits, Stratifyd is just a flat-out cool implementation of AI.
I caught up with Stratifyd's CTO, Kevin O'Dell, to discuss the data technology behind the platform, and how MongoDB drives value for their customers.
For anyone that isn’t familiar with Stratifyd yet, how do you describe the platform?
Stratifyd uses human generated data to analyze, categorize, and understand intent with the purpose of changing human behavior. This changes the way brands interact with their customers, but also the way customers interact with brands, increasing customer acquisition and raising retention rates.
What was the genesis of the company? Why did you set out to build this?
Stratifyd was a result of postdoctoral research done at the University of North Carolina, Charlotte. Our founders were researching how AI could analyze unstructured data. During their research, they discovered strong business and government use cases. The founding team was working with numerous three letter agencies on predicting terrorist and disease movements globally. They were able to raise millions of dollars in funding from these agencies. The demand for a product organically grew from there, which led to the development of the Stratifyd platform.
I love the insights Stratifyd can provide – how would you describe the unique advantages that Stratifyd gives its customers?
Stratifyd provides near real-time business intelligence for contact center, marketing, product, and customer experience teams, all based on customer interactions. These insights enable businesses to be proactive rather than reactive in regard to business strategy. Stratifyd customers are able to respond to customer requests, complaints, or general feedback in near real time, changing the way companies interact with their end users. For example, we have empowered a customer with the knowledge to launch a new product line. Another customer gained insights that fundamentally changed how they are rolling out a 700+ million-dollar brand in a new continent.
What kind of feedback are you getting from customers that have deployed Stratifyd in their businesses?
Our customers love using our platform. They are surprised at how simple it is to use, and how powerful it is. They really appreciate how Stratifyd is making AI and machine learning meaningful for them in their day-to-day lives. Stratifyd helps ensure measurable results from day one. Speaking of implementation, they REALLY love that we don’t just use the term day one figuratively – customers are up and running in less than a day.
Talk to me about how you landed on MongoDB. What were you looking for in a database, and what problems were you having before moving to MongoDB?
That's an easy one to answer: speed and flexibility. Stratifyd ingests data from hundreds of sources. We needed a database that could keep up with high read and write request rates while handling a flexible schema. The hardest problem we were trying to solve was lack of secondary indexes; with those, MongoDB accelerated our query response times by at least 100x.
Can you share any best practices for scaling your MongoDB infrastructure? Any impressive metrics around the number of interactions, the volume of reads / writes per second, response times?
As a SaaS-first platform, being always-on is a HUGE best practice for us. MongoDB’s innate replication and failover abilities ensured less than 17 minutes of total downtime last year! Using MongoDB as our backend system, our AI can process a quarter of a million words in less than a minute.
How do you measure the impact of MongoDB on your business?
Our business wouldn’t be able to succeed without MongoDB. The uptime, failover, query response times, secondary indexes, and dynamic schemas have empowered most of Stratifyd’s key differentiators.
What advice would you give someone who is considering using MongoDB for their next project?
With all projects, I recommend truly understanding the requirements for the end results. There is a ton of excellent technology out there, but picking the wrong one can be detrimental to project success. Always run numerous tests, comparing different stacks to make sure you find the right one and fail fast on the wrong technology stack.
Stratifyd has some impressive customers, from the Fortune 500 to some really innovative startups: what’s next for the company?
We have some pretty big things planned for 2018. We're now providing more than just actionable intelligence; we are now streamlining and automating workflows. We are closing the customer feedback loop, which enables us to plug Stratifyd into any business process quickly to deliver measurable results.
DarwinBox Evolves HR SaaS Platform and Prepares for 10x Growth with MongoDB Atlas
Evolution favors those that find ways to thrive in changing environments. DarwinBox has done just that, providing a full spectrum of HR services online and going from a standing start to a top-four sector brand in the Indian market in just two years. From 40 enterprise clients in its first year to more than 80 in its second, it now supports over 200,000 employees, and is hungrily eyeing expansion in new territories.
“We’re expecting 10x growth in the next two years,” says Peddi. “That means aggressive scaling for our platform and MongoDB Atlas will play a big role.”
Starting from a blank sheet of paper
The company’s key business insight is that employees have grown accustomed to the user experience of online services they access in their personal lives. However, the same ease of use is simply not found at work, especially in HR solutions that address holiday booking, managing benefits, and appraisals. DarwinBox’s approach is to deliver a unified platform of user-friendly HR services to replace a jumble of disparate offerings, and to do so in a way that supports its own aggressive growth plans. The company aims to support nearly every employee interaction with corporate HR, such as recruitment, employee engagement, expense management, separation, and more.
“We started in 2015 from a blank sheet of paper,” Peddi says. “It became very clear very quickly that for most of our use cases, only a non-relational database would work. Not only did we want to provide an exceptionally broad set of integrated services, but we also had clients with a large number of customization requirements. This meant we needed a very flexible data model. We looked at a lot of options. We wanted an open source technology to avoid lock-in and our developers pushed for MongoDB, which fit all our requirements and was a pleasure to work with. Our databases are now 90 percent MongoDB. We expect that to be at 100 percent soon.”
Reducing costs and future-proofing database management
When DarwinBox launched, it ran its databases in-house, which wasn’t ideal. “We have a team of 40+ developers, QA and testers, and three running infrastructure, and suddenly we’re growing much faster than we expected. It’s a good problem to have, but we couldn’t afford to offer anything less than excellent service.” Peddi emphasized that of all the things they wanted to do to succeed, becoming database management experts wasn’t high on the list.
This wasn’t the only reason that MongoDB Atlas looked like the next logical step for the company when it became available, says Peddi. “We were rapidly developing our services and our customer base, but our strategies for backing up the databases, for scaling, for high availability, and for monitoring performance weren’t keeping up. In the end, we decided that we’d migrate to Atlas for a few major reasons.”
The first reason was the most obvious. “The costs of managing the databases, infrastructure, and backups were increasing. In addition, it became increasingly difficult to self-manage everything as requirements became more sophisticated and change requests became more frequent. Scaling up and down to match demand and launching new clusters consumed precious man hours. Monitoring performance and issue resolution was taking up more time than we wanted. We had built custom scripts, but they weren’t really up to the task.”
With MongoDB Atlas on AWS, Peddi says, all these issues are greatly reduced. “We’re able to do everything we need with our fully managed database very quickly – scale according to business need at the press of a button, for example. There are other benefits. With MongoDB technical engineers a phone call away, we’re able to fix issues far quicker than we could in the past. MongoDB Compass, the GUI for the database, is proving helpful in letting our teams visually explore our data and tune things accordingly.”
Migrating to Atlas has also helped DarwinBox dramatically reduce costs.
“We’ve optimized our database infrastructure and how we manage backups. Not only did we bring down costs by 40%, but by leveraging the queryable snapshot feature, we’re able to restore the data we actually need 80% faster.”
Chaitanya Peddi, Co-founder and Head of Product, DarwinBox
The increased availability and data resilience from the switch to MongoDB Atlas on AWS eases the responsibility of managing the details of 200,000 employees’ working lives. “Data is the most sensitive part of our business, the number one thing that we care about,” says Peddi. “We can’t lose even 0.00001 percent of our data. We used to take snapshots of the database, but that was costly and difficult to manage. Now, it’s more a live copy process. We can guarantee data retention for over a year, and it only takes a few moments to find what you need with MongoDB Atlas.”
For DarwinBox to achieve its target of 10x growth in two years, it has to – and plans to – go international.
“We had that in mind from the outset. We’ve designed our architecture to cope with a much larger scale, both in total employee numbers and client numbers, and to handle different regulatory regimes.” According to Peddi, that means moving to microservices, developing data analytics, maybe even looking at other cloud providers to host the DarwinBox HR Platform. He added: “If we were to do this on AWS and self-manage the database with our current resources, we would have to invest a significant amount of effort into orchestrating and maintaining a globally distributed database. MongoDB Atlas with its cross-region capabilities makes this all much easier.”
DarwinBox is confident that MongoDB Atlas will help the organization achieve its product plans.
“MongoDB Atlas will be able to support the business needs that we've planned out for the next two years,” says Peddi. “We’re happy to see how rapidly the Atlas product roadmap is evolving.”
BookMyShow Continues to Lead Online Entertainment Ticketing in India and Scales to 25 Million Users with MongoDB
India's twin passions for cinema and tech make it a natural fit for automated ticketing. But if ever a market needs scalable solutions, this 1.4 billion-strong nation is it.
That’s a lesson Viraj Patel, VP Technology for BigTree Entertainment, learned the hard way. "We started out in ticketing distribution in 1999 using telephones," he says, "before mobile platforms and internet access were on the scene. It just didn't work. The investors pulled the plug in 2002.”
Undeterred, the company successfully pivoted to selling software to cinema chains. By 2006, Viraj and team were ready to aim for the big prize again. They just needed the right tools. With the internet and mobile data fitting into place, a trial project in online ticket aggregation looked promising enough for investors to fund the launch of BookMyShow in 2007.
“We launched with a 100 percent Microsoft stack,” says Viraj, “but soon realized that scaling with Microsoft was not an easy job.” It wasn’t the Windows platform or the developer tools that were the problem, he recalls: “It was the SQL Server database. That was the first bottleneck as we got more and more traffic, and it soaked up more and more resources and money. It wasn’t the right solution. It couldn’t scale with us.”
Spoiler: By 2018, BookMyShow was selling more than 10 million tickets a month for all manner of movies and events, and serving three billion pages a month across the web and its 50 million-plus installed apps. Scaling happened.
The plot changed for the better in 2010 with the discovery of MongoDB. “We were looking around for alternatives, and it was the new kid on the block.” (In fact, MongoDB 1.0 had launched just the year before, and MongoDB India was yet to come.) “We tested it internally as a straight swap, replacing a monolithic SQL database with a distributed one. Every web and mobile application we built needed a database that had performance and scalability, and MongoDB blew us away on both.”
MongoDB really won its spurs when the company added Facebook Connect to its registration process. “The registration database was the first thing we built, and it was running on SQL Server. Which was OK, until Facebook Connect came along and we added that as a registration option. Then the database really struggled. We switched to MongoDB and it was night and day. Tremendous gains. Not only did we get the ability to represent customers directly as JSON documents in the database, which made our data model much simpler, but we got all our performance back.
“We want the flexibility of upgrading the schema for future use cases, and that’s so much easier in MongoDB. The data structures we create are clear and easy to read, and it’s so much simpler to understand and extend,” Viraj adds, about their discovery of the advantages of document-model storage.
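To make the document-model point concrete, here is a minimal sketch in Python of two customer records stored as JSON-style documents. The field names (`registration`, `facebook`, `user_id`, and so on) are hypothetical and are not BookMyShow's actual schema. The sketch only shows that a customer who registers through Facebook Connect can carry an extra nested sub-document while existing records stay untouched.

```python
import json

# Hypothetical customer documents; every field name here is illustrative.
email_customer = {
    "_id": "cust-001",
    "name": "Asha",
    "registration": {"method": "email", "email": "asha@example.com"},
}

# A customer who signed up through Facebook Connect carries an extra nested
# sub-document. Existing documents need no migration to accommodate it.
facebook_customer = {
    "_id": "cust-002",
    "name": "Ravi",
    "registration": {
        "method": "facebook",
        "facebook": {"user_id": "1234567890"},
    },
}


def registration_method(customer: dict) -> str:
    """Return how the customer registered, or 'unknown' if not recorded."""
    return customer.get("registration", {}).get("method", "unknown")


if __name__ == "__main__":
    for doc in (email_customer, facebook_customer):
        print(registration_method(doc), json.dumps(doc))
```

Code that reads these documents only has to tolerate a missing field, which is what the `.get()` defaults do here.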
MongoDB’s second big job was also thoroughly web scale, as it took on the task of giving each of those millions of users their own bespoke, personalized view of the service. This time, the engineering team knew where to start. “About five years ago, we built our personalization engine on MongoDB,” says Viraj, “and it continues to scale with us. It stores a lot of customer information and when a customer visits, it pulls it out, personalizes it in real time and delivers it. That really improves the customer experience. We see an 18 percent increase in conversion, personalized versus non-personalized.”
Today, MongoDB is the default database for developing ideas and services in BigTree, and Viraj cheerfully admits he long ago stopped counting how many nodes are in use. “Last time I looked, it was between 100 and 160,” he says.
Future plans include containerization of the databases to smooth out upgrades and ease of deployment with BigTree’s agile DevOps production pipeline and, when the time comes, sharding the customer database. That’s planned for, but not currently necessary. He explains: “We just haven’t reached the point where writes to MongoDB are the limiting factor anywhere in the service. We get a long way with MongoDB replica sets, and are safe in the knowledge that there are no limitations to scaling further when we need to.”
Viraj cares deeply about latency – “We’re a performance-sensitive company” – and much of the service is instrumented by monitoring and management platforms such as New Relic. While initial performance gains were superlative, he says, things have only continued to improve as new features and technologies have been added. “We had been using SQL tabular databases for customer booking history,” says Viraj. “We moved this to MongoDB and have seen a superb performance boost. What used to take up to 5000 ms on traditional SQL databases went down to 10-20 ms on MongoDB using the MMAP storage engine. When we moved to MongoDB’s default WiredTiger storage engine, it improved five to ten times further, to 2 ms. We’re still getting this performance, even though the database now has close to 200 million documents.”
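The quoted latencies can be turned into speedup factors with a few lines of arithmetic. This sketch uses only the figures in the quote above. It treats the 5000 ms figure as a worst case, since the quote says "up to", so the SQL comparisons are upper bounds rather than averages.

```python
# Latencies quoted in the text, in milliseconds.
sql_ms = 5000          # "up to" figure on the traditional SQL databases
mmap_ms = (10, 20)     # range on MongoDB with the MMAP storage engine
wiredtiger_ms = 2      # after moving to the WiredTiger storage engine


def speedup(before_ms: float, after_ms: float) -> float:
    """How many times faster the 'after' latency is than the 'before'."""
    return before_ms / after_ms


# Worst-case SQL versus the slow and fast ends of the MMAP range.
sql_to_mmap = (speedup(sql_ms, mmap_ms[1]), speedup(sql_ms, mmap_ms[0]))
# MMAP range versus WiredTiger: this is the "five to ten times" in the quote.
mmap_to_wiredtiger = (
    speedup(mmap_ms[0], wiredtiger_ms),
    speedup(mmap_ms[1], wiredtiger_ms),
)
# Worst-case SQL versus WiredTiger, end to end.
sql_to_wiredtiger = speedup(sql_ms, wiredtiger_ms)

if __name__ == "__main__":
    print(sql_to_mmap, mmap_to_wiredtiger, sql_to_wiredtiger)
```

The middle pair works out to 5x and 10x, which matches the "five to ten times further" in the quote.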
There have been other benefits from following MongoDB’s roadmap. “WiredTiger has made things much more cost-effective,” he says. “Security is better as we now encrypt data instead of storing it in plain JSON. Our customer database is five times more compact and our personalization database uses nearly eight times less storage.”
In the future, he says, they expect aggregation queries and query caching mechanisms will improve performance still more. As for reliability, “MongoDB auto-heals so well in the event of any failures in our platform we don’t even need to worry about it. That’s highly appreciated, and much better than any of the other databases we have used.”
There can be few better stories of early adoption and innovation with MongoDB than the success BigTree Entertainment has enjoyed with BookMyShow. Viraj and his engineers insist on picking the right tools for each part of the job running India’s favourite online ticketing service, and their long experience of casting this particular actor in so many roles makes MongoDB a performer they’ve come to rely on.
Future Facilities Triples the Speed of Development with MongoDB
Future Facilities is an OEM partner of MongoDB that helps engineers and IT professionals use virtual prototyping to better plan IT deployments within data centers. By leveraging Computational Fluid Dynamics (CFD) simulation, users can test what-if scenarios unique to their facilities. Their web-based platform was originally built on MySQL, but the team quickly realized that the database couldn’t scale to meet their needs.
Instead, Future Facilities chose to migrate to MongoDB Enterprise Advanced. We sat down with Akhil Docca, Corporate Marketing & Product Strategy Manager of Future Facilities, to learn how migrating to MongoDB helped to triple the speed of development.
Can you tell us a little bit about yourself and Future Facilities?
I lead the marketing and product strategy here at Future Facilities. We provide software and services specifically focused on physical infrastructure design and management to customers in the data center market. Our solutions span the entire data center ecosystem, from design to operations. By utilizing a digital clone that we call the Virtual Facility (VF), our users can see the impact of any change like adding new capacity, upgrading equipment, etc., before it is implemented.
In 2004 we released 6SigmaRoom, the industry’s leading CFD software for data centers. 6SigmaRoom is how our users create a VF, where they can input live data from their facility, and include necessary objects such as cooling and power units, servers and racks. Having this digital twin allows engineers to troubleshoot, predict and analyze the impact of any deployment plan, and find the optimal method for implementation. With 6SigmaRoom, engineers can speed up capacity planning and improve the overall efficiency and resilience of their data center.
6SigmaRoom is essential for accurate data center capacity planning. However, it’s a heavy-duty desktop application developed for engineers. We wanted to create a product that Facilities and IT teams could use to improve both their processes and overall data center performance. In 2016 we launched a new product, 6SigmaAccess, to do just that.
6SigmaAccess is a multi-user, browser-based software platform that allows IT professionals to interact with their data center model and propose changes through a central management system. The browser-based architecture allows us to load up a lighter version of the 3D model specifically tailored to the IT capacity planning process.
Here’s how it works. IT planners propose changes such as adding new IT or racks, decommissioning equipment or cabinets, or simply editing attributes. These changes are then submitted and queued up via MongoDB. When the data center engineer opens up 6SigmaRoom, the proposed changes are automatically merged, allowing the engineer to simply run the simulation to see how the changes would affect the facility. If the analysis reveals that the proposed installations don’t impact performance, they can then be approved, merged back into the database and scheduled for deployment.
MongoDB is the integration layer between 6SigmaAccess and 6SigmaRoom that makes this process possible.
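The propose, merge, simulate and approve flow described above can be sketched as a small state machine. This is a Python illustration under stated assumptions, not Future Facilities' implementation: the class names, status values, and the in-memory list standing in for the MongoDB collection are all hypothetical, and the `REJECTED` state is an assumption about what happens when a simulation does show a performance impact.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    QUEUED = "queued"        # submitted from the browser, awaiting review
    IN_REVIEW = "in_review"  # merged into the engineer's model for simulation
    APPROVED = "approved"    # simulation showed no performance impact
    REJECTED = "rejected"    # assumed outcome when simulation shows a problem


@dataclass
class ProposedChange:
    change_id: int
    description: str
    status: Status = Status.QUEUED


@dataclass
class ChangeQueue:
    """In-memory stand-in for the database collection holding proposals."""
    changes: list = field(default_factory=list)

    def submit(self, description: str) -> ProposedChange:
        """Queue a new proposal, as an IT planner would from the browser."""
        change = ProposedChange(len(self.changes) + 1, description)
        self.changes.append(change)
        return change

    def merge_pending(self) -> list:
        """Move every queued change into review, as when the model is opened."""
        pending = [c for c in self.changes if c.status is Status.QUEUED]
        for change in pending:
            change.status = Status.IN_REVIEW
        return pending

    def record_result(self, change: ProposedChange, impacts_performance: bool) -> None:
        """Approve or reject a reviewed change based on the simulation outcome."""
        if change.status is not Status.IN_REVIEW:
            raise ValueError("change must be in review before recording a result")
        change.status = Status.REJECTED if impacts_performance else Status.APPROVED


if __name__ == "__main__":
    queue = ChangeQueue()
    queue.submit("Add two racks to row C")
    queue.submit("Decommission cabinet 14")
    for change in queue.merge_pending():
        queue.record_result(change, impacts_performance=False)
    print([(c.description, c.status.value) for c in queue.changes])
```

Keeping the status on each proposal means the browser side and the desktop side only need to agree on the stored record, which is the integration role the text assigns to the database.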
What were you using before MongoDB?
We initially started building on MySQL, but quickly ran into challenges. Whenever we wanted to make an update to the database schema, there would be a huge demand on time and resources from our developers, DBAs, and ops teams. It quickly became apparent that we wouldn’t be able to scale to meet the needs of our customers. While redesigning the platform, we knew that we had to get away from the rigid architecture of a SQL tabular database.
Our goal was to find a data platform that was easy to work with, that developers would like, and that could scale as our business grew. After briefly considering Cassandra and CouchDB, we selected MongoDB for its strong community ecosystem, which made adopting the technology seamless. MongoDB allows us to focus on delivering new features instead of having to worry about managing the database. We are able to code, test and deliver incremental changes to 6SigmaAccess without having to change 6SigmaRoom. This will shorten our development cycles by 66%, from 9 to 3 months.
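As a quick check on how the 66% figure relates to the "triples the speed" headline, the arithmetic can be written out in Python. It uses only the 9-month and 3-month cycle lengths quoted above.

```python
old_cycle_months = 9
new_cycle_months = 3

# Fraction of the original cycle length that is saved.
reduction = (old_cycle_months - new_cycle_months) / old_cycle_months
# How many new cycles fit in the time one old cycle used to take.
speed_factor = old_cycle_months / new_cycle_months

if __name__ == "__main__":
    print(f"{reduction:.1%} shorter, {speed_factor:.0f}x the release cadence")
```

A cycle that is two-thirds shorter is the same thing as a threefold increase in release cadence, so the two claims are consistent.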
Can you describe your MongoDB deployment?
The key components of 6SigmaAccess are node.js, angular.js, JSON, and RESTful APIs. 6SigmaRoom is built on C++. We are currently deploying a 3-node cluster to our enterprise customers.
Our technology is built in such a way that we aren’t always writing massive amounts of data to the database. 6SigmaAccess changes tend to be a few MBs at a time. 6SigmaRoom data files tend to be in the 100s of GB range, but we only write the data into the database based on a user action. The typical (minimum) server configuration that we’ve sized for our applications is 4-16 cores, 64 GB of RAM, and 1 TB of disk space.
We are Windows Active Directory compliant and have additional access controls built into our software that enforces roles and permissions when connecting to the database.
What advice would you give someone who is considering using MongoDB for their next project?
Start early and incorporate MongoDB in your project from the beginning. Redundancy and scalability are important at the heart of any application and planning how to achieve those goals from the outset will make development much smoother down the road. Additionally, choose a vendor with a strong support team. We were extremely impressed with MongoDB’s sales and technical team prowess throughout the conversion process, and look forward to working with them in the future.
STREAM: How MongoDB Atlas and AWS help make it easier to build, scale, and personalize feeds that reach millions of users
Stream is a platform designed for building, personalizing, and scaling activity feeds that reach over 200 million users. We offer an alternative to building app feed functionality from scratch by simplifying implementation and maintenance so companies can stay focused on what makes their products unique.
Today our feed-as-a-service platform helps personalize user experiences for some of the most engaging applications and websites. For example, Product Hunt, which surfaces new products daily and allows enthusiasts to share and geek out about the latest mobile apps, websites, and tech creations, uses our API to do so.
We’ve recently been working on an application called Winds, an open source RSS and podcast application powered by Stream, that provides a new and personalized way to listen, read, and share content.
We chose MongoDB to support the first iteration of Winds as our developers found the database very easy to work with. I personally feel that the mix of data model flexibility, scalability, and rich functionality that you get with MongoDB makes it superior to what you would get out of the box with other NoSQL databases or tabular databases such as MySQL and PostgreSQL.
Our initial MongoDB deployment was managed by a vendor called Compose but that ultimately didn’t work out due to issues with availability and cost. We migrated off Compose and built our own self-managed deployment on AWS. When MongoDB’s own database as a service, MongoDB Atlas, was introduced to us, we were very interested. We wanted to reduce the operational work that our team was doing and found Atlas’s pricing much more predictable than what we had experienced with our previous MongoDB service provider. We also needed a database service that would be highly available out of the box. The fact that MongoDB Atlas sets a minimum replica set member count and automatically distributes each cluster across AWS availability zones had us sold.
The great thing about managing or scaling MongoDB with MongoDB Atlas is that almost all of the time, we don’t have to worry about it. We run our application on a deployment using the M30 size instances with the auto-expanding storage option enabled. When our disk utilization approaches 90%, Atlas automatically provisions more storage for us with no impact to availability. And if we experience spikes in traffic like we have in the past, we can easily scale up or out using MongoDB Atlas by either clicking a few buttons in the UI or triggering a scaling event using the API.
Another benefit that MongoDB Atlas has provided us is on the cost savings side. With Atlas, we no longer need a dedicated person to worry about operations or maintaining uptime. Instead, that person can work on the projects that we’d rather have them working on. In addition, our team is able to move much faster. Not only can we make changes on the fly to our application leveraging MongoDB’s flexible data model, but we can deploy any downstream database changes on the fly or easily spin up new clusters to test new ideas. All of these can happen without impacting things in production; no worrying about provisioning infrastructure, setting up backups, monitoring, etc. It’s a real thing of beauty.
In the near future, we plan to look into utilizing change streams from MongoDB 3.6 for our Winds application, which is already undergoing some major upgrades (users can sign up for the beta here). This may eliminate the need to maintain separate Redis instances, which would further increase our savings and reduce architectural complexity.
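For readers unfamiliar with change streams, here is a rough Python sketch of how an application might keep an in-memory view up to date from a stream of change events. It is not how Winds is implemented. The events below are hand-written samples shaped like MongoDB change events (with `operationType`, `documentKey`, and `fullDocument` fields), and the article fields are hypothetical. Against a live replica set, the events would instead come from iterating over PyMongo's `collection.watch()`.

```python
from typing import Iterable


def apply_changes(events: Iterable[dict], articles: dict) -> dict:
    """Apply change events to an in-memory view keyed by document _id.

    Only inserts and deletes are handled in this sketch; other operation
    types are ignored.
    """
    for event in events:
        doc_id = event["documentKey"]["_id"]
        if event["operationType"] == "insert":
            articles[doc_id] = event["fullDocument"]
        elif event["operationType"] == "delete":
            articles.pop(doc_id, None)
    return articles


if __name__ == "__main__":
    # Hand-written sample events; the "title" field is illustrative only.
    sample_events = [
        {
            "operationType": "insert",
            "documentKey": {"_id": 1},
            "fullDocument": {"_id": 1, "title": "New podcast episode"},
        },
        {
            "operationType": "insert",
            "documentKey": {"_id": 2},
            "fullDocument": {"_id": 2, "title": "RSS article"},
        },
        {"operationType": "delete", "documentKey": {"_id": 1}},
    ]
    print(apply_changes(sample_events, {}))
```

Because the database pushes each change to the subscriber, a design like this could remove the need for a separate system that exists only to notify the application of new data.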
We’re also looking into migrating more applications onto MongoDB Atlas as its built-in high availability, automation, fully managed backups, and performance optimization tools make it a no-brainer. While there are other MongoDB-as-a-service providers out there (Compose, mLab, etc.), no other solution comes close to what MongoDB Atlas can provide.
Interested in reducing costs and faster time to market? Get started today with a free 512 MB database managed by MongoDB Atlas.
Longbow Advantage - Helping companies move beyond the spreadsheet for a real-time view of logistics operations
The global market in supply chain analytics is estimated at some $2.7 billion — and yet, far too often supply chain leaders use spreadsheets to manage their operation, limiting the real-time visibility into their systems.
Longbow Advantage, a supply chain partner, helps companies get the maximum ROI from their supply chain software products. Moving beyond the spreadsheet and generic enterprise BI tools, Longbow developed an application called Rebus™ which allows users to harness the power of smart data and get real-time visibility into their entire supply chain. That means ingesting data in many formats from a wide range of systems, storing it for efficient reference, and presenting it as needed to users — at scale.
MongoDB Atlas is at the heart of Rebus. We talked to Alex Wakefield, Chief Commercial Officer, to find out why they chose to trust such a critical part of their business to MongoDB and how it’s panned out both technically and commercially.
Tell us a little bit about Longbow Advantage. How did you come up with the idea?
Sixteen years ago our Founder, Gerry Brady, left his job at a distribution company to build Longbow Advantage. The goal was to build a company that could help streamline warehouse and workforce management implementations, upgrades, and integrations, and put more focus on customer experience and success.
Companies of all sizes have greatly improved distribution processes but still lack real-time visibility into their systems. While there’s a desire to use BI/analytics systems, automate manual processes, and work with information in as close to real-time as possible, most companies continue to rely on manually generated spreadsheets to measure their logistics KPIs, slowing down speed to insights.
There had to be a better way to help companies address this problem. We built an application called Rebus. This SaaS-based analytics platform, used by industry leaders such as Del Monte Foods and Subaru of America, aggregates and harmonizes logistics data from any supply chain execution software to provide a near real-time view of logistics operations and deliver cross-functional insights. The idea is quite simply to provide more accurate data in as close to real-time as technically possible within a common platform that can be shared across the supply chain.
For example, one company may have a KPI around labor productivity. When that company receives a customer order to ship, there is a lot of information they want to know:
- Was the order shipped and on-time?
- How efficiently is the labor staff filling orders?
- How many orders are processing?
- How many individual lines or tasks on the order are being filled?
The list goes on. With Rebus, manufacturers, retailers and distributors can segment different business lines like ecommerce, traditional retail, direct to consumer and more, to ensure that they are being productive and meeting the appropriate deadlines. Without this information, a company may miss major deadlines, negatively impact customer satisfaction, miss out on revenue opportunities, and in some cases, incur significant financial penalties.
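The questions in the list above map naturally onto a few aggregate calculations. The following Python sketch shows one way such KPIs could be computed from order records. It does not describe how Rebus works internally. The record fields (`due`, `shipped`, `lines`, `lines_filled`) and the sample data are invented for illustration.

```python
from datetime import date

# Invented sample orders; "shipped" is None while an order is still processing.
orders = [
    {"id": 1, "due": date(2018, 3, 1), "shipped": date(2018, 2, 28), "lines": 4, "lines_filled": 4},
    {"id": 2, "due": date(2018, 3, 1), "shipped": date(2018, 3, 2), "lines": 2, "lines_filled": 2},
    {"id": 3, "due": date(2018, 3, 5), "shipped": None, "lines": 5, "lines_filled": 3},
]


def logistics_kpis(orders: list) -> dict:
    """Compute on-time rate, orders in process, and line fill rate."""
    shipped = [o for o in orders if o["shipped"] is not None]
    on_time = [o for o in shipped if o["shipped"] <= o["due"]]
    in_process = [o for o in orders if o["shipped"] is None]
    total_lines = sum(o["lines"] for o in orders)
    filled_lines = sum(o["lines_filled"] for o in orders)
    return {
        # Share of shipped orders that left on or before their due date.
        "on_time_rate": len(on_time) / len(shipped) if shipped else None,
        # Orders that have not shipped yet.
        "orders_in_process": len(in_process),
        # Share of all order lines that have been filled so far.
        "line_fill_rate": filled_lines / total_lines if total_lines else None,
    }


if __name__ == "__main__":
    print(logistics_kpis(orders))
```

Segmenting by business line, as the text describes, would amount to running the same calculation over each subset of orders.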
What are some of the benefits that your customers are experiencing?
Our customers are able to automate a manual and time-intensive metrics process and collect near real-time data in a common platform that can be used across the organization. All of this leads to more efficient decision-making and a coordinated communication effort.
Customers are also able to identify inaccurate or duplicate data that may be contributing to slow performance in their Warehouse and Labor Management software. Rebus provides an immediate way to identify data issues and improve overall performance. This is a huge benefit for customers who are shipping thousands of orders every week.
Why did you decide to use MongoDB?
Four years ago, when we first came up with the idea for Rebus, we gathered a group of employees to brainstorm the best way to build it.
In that brainstorm, one of our employees suggested that we use MongoDB as the underlying datastore. After doing some research, it was clear that the document model was a good match for Rebus. It would allow us to gather, store, and build analytics around a lot of disparate data in close to real time. We decided to build our application on MongoDB Enterprise Advanced.
When and why did you decide to move to MongoDB Atlas?
We first heard about MongoDB Atlas in July 2016 shortly after it launched, but were not able to migrate right away. We maintain strict requirements around compliance and data management, so it was not until May 2017, when MongoDB Atlas became SOC2 compliant, that we decided to migrate. Handing off our database management to the team that builds MongoDB gave us peace of mind and has helped us stay efficient and agile. We wanted to ensure that our team could remain focused on the application and not have to worry about the underlying infrastructure. Atlas allowed us to do just that.
The migration wasn’t hard. We were moving half a terabyte of data into Atlas, which took a couple of goes — the first time didn’t take. But the support team was proactive. After working with us to pinpoint the issue, one of our key technical people reconfigured an option and the process re-ran without any issues. We hit our deadline.
Why did you decide to use Atlas on Google Cloud Platform (GCP)?
Google Cloud Platform is SOC2 compliant and allows us to keep our team highly efficient and focused on developing the application instead of managing the back end. Additionally, GCP gave us great responses that we weren’t getting from other cloud vendors.
How has your experience been so far?
MongoDB Atlas has been fantastic for us. In particular, the real-time performance panel lets us see what is going on in our cluster as it’s happening.
In comparison to other databases, both NoSQL and SQL, MongoDB provides huge benefits. Despite the fact that many of our developers have worked with relational databases their entire careers, the way we can get data out of MongoDB is unlike anything they’ve ever seen. That’s even with a smaller, more efficient footprint on our system.
Additionally, the speed of MongoDB has been really helpful. We’re still looking at the results from our load tests, but the ratio of timeouts to successes was very low. Atlas outperforms what we were doing before. We know we can support at least a couple hundred users at one time. That tells us we will be able to go and grow with MongoDB Atlas for years to come.
Thank you for your time Alex.
Grand View Research, Supply Chain Analytics Market Analysis, 2014 - 2025, https://www.grandviewresearch.com/industry-analysis/the-global-supply-chain-analytics-market
Rebus is a trademark of Longbow Advantage Inc.
Powering an online community of coders with MongoDB Atlas
If you’re learning to code, or if you already have coding experience, it helps to have other people around -- like mentors, coworkers, hackathon buddies and study partners -- to help accelerate your learning, especially when you get stuck.
But not everyone can commute to a tech meetup, or lives in a city with access to a network of study partners or mentors/coworkers who can help them.
CodeBuddies started in 2014 as a free virtual space for independent code learners to share knowledge and help each other learn. It is fully remote and 100% volunteer-driven, and helps those who — due to geography, schedule or personal responsibilities — might not be able to easily attend in-person tech meetups and workshops/hackathons where they could find study partners and mentors.
The community is now comprised of a mix of experienced software engineers and beginning coders from countries around the world, who share advice and knowledge in a friendly Slack community. Members also use the website at codebuddies.org to start study groups and schedule virtual hangouts. We have a pay-it-forward mentality.
The platform, an open-sourced project, was painstakingly built by volunteer contributors to help members organize study groups and schedule focused hangouts to learn together. In those peer-to-peer organized remote hangouts, the scheduler of the hangout might invite others to join them in:
- Working through a coding exercise together
- Screen sharing and helping each other through a contribution to an open-sourced project
- Co-working silently in a “silent” hangout (peer motivation)
- Helping them practice their knowledge of a topic by attempting to teach it
- Reading through a chapter of a programming tutorial together
Occasionally, the experience will be magical: a single hangout on a popular framework might have participants joining in at the same time from Australia, the U.S., Finland, Hong Kong, and Nigeria.
The site uses the MeteorJS framework, and the data is stored in a MongoDB database.
For years, with a zero budget, CodeBuddies was hosted on a sandbox instance from mLab. When we had the opportunity to migrate to MongoDB Atlas, our database was small enough that we didn’t need to use live migration (which requires a paid mLab plan), but could migrate it manually. These are the three easy steps we took to complete the migration:
1) Dump the mongo database to a local folder
Once you have stopped application writes to your old database, run:
mongodump -h ds015995.mlab.com --port 15992 --db production-database -u username -p password -o Downloads/dump/production-database
2) Create a new cluster on MongoDB Atlas
3) Use mongorestore to populate the dumped DB into the MongoDB Atlas cluster
First, whitelist your droplet IP (the IP address of the server hosting your app) in MongoDB Atlas.
Then you can restore the mlab dump you have in a local folder to MongoDB Atlas:
mongorestore --host my-awesome-cluster-shard-00-00-dpkz5.mongodb.net --port 27017 --authenticationDatabase admin --ssl -u username -p password Downloads/dump/production-database
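An optional sanity check after the restore, which is not part of the three steps above, is to compare per-collection document counts between the old and new databases before switching the app over. The Python sketch below shows only the comparison logic. The collection names and counts are made up, and in practice the counts might be gathered with something like PyMongo's `count_documents({})` on each collection.

```python
def count_mismatches(source_counts: dict, target_counts: dict) -> dict:
    """Return {collection: (source, target)} where the counts differ.

    A collection missing from either side is treated as having 0 documents.
    """
    mismatches = {}
    for name in sorted(set(source_counts) | set(target_counts)):
        src = source_counts.get(name, 0)
        dst = target_counts.get(name, 0)
        if src != dst:
            mismatches[name] = (src, dst)
    return mismatches


if __name__ == "__main__":
    # Made-up counts for hypothetical collections.
    old_db = {"users": 1200, "hangouts": 340, "study_groups": 85}
    new_db = {"users": 1200, "hangouts": 339}
    print(count_mismatches(old_db, new_db))
```

An empty result means every collection arrived with the same number of documents. Matching counts do not prove the contents are identical, but a mismatch is a cheap early warning.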
We host our app on DigitalOcean, and use Phusion Passenger to manage our app. When we were ready to make the switchover, we stopped Phusion Passenger, added our MongoDB connection string to our nginx config file, and then restarted Phusion Passenger.
CodeBuddies is a small project now, but we do not want to be unprepared when the community grows. We chose MongoDB Atlas for its mature performance monitoring tools, professional support, and easy scaling.