GIANT Stories at MongoDB

Leaf in the Wild: Qumram Migrates to MongoDB to Deliver Single Customer View for Regulatory Compliance & Customer Experience

Every financial services organization is tasked with two, often conflicting, priorities:

  1. The need for rapid digital transformation
  2. Implementing compliance controls that go beyond internal systems, extending all the way to their digital borders.

However, capturing and analyzing billions of customer interactions in real time across web, social, and mobile channels is a major data engineering challenge. Qumram solves that challenge with its Q-Suite portfolio of on-premise and cloud services.

Qumram’s software is used by the most heavily regulated industries in the world to help organizations capture every moment of the customer’s journey: every keystroke, every mouse movement, and every button click, across all digital channels, stored for years. As you can imagine, this generates an extraordinary volume and variety of data. Some Qumram customers ingest and store multiple terabytes of this sensitive data every day.

Starting out with relational databases, Qumram quickly hit the scalability wall. After extensive evaluation of alternatives, the company selected MongoDB to provide a single source of truth for all customer interactions across any digital channel.

I met with Simon Scheurer, CTO of Qumram AG, to learn more.

Can you start by telling us a little bit about your company?

Qumram provides a single view of all customer interactions across an organization’s digital channels, helping our customers to ensure compliance, prevent fraud, and enrich the experience they deliver to their users. Our unique session recording, replay, and web archival solution captures every user interaction across web, mobile, and social channels. This means that any user session can be replayed at a moment’s notice, in a movie-like form, giving an exact replica of the activity that occurred, when, and for how long. It’s pretty rare to provide a solution that meets the needs of compliance and risk officers while also empowering marketing teams – but that is what our customers can do with Q-Suite, built on modern technologies like MongoDB.

Figure 1: Q-Suite recording of all digital interactions for regulatory compliance

Most of our customers operate in the highly regulated financial services industry, providing banking and insurance services. Qumram customers include UBS, Basler Kantonalbank, Luzerner Kantonalbank, Russell Investments, and Suva.

How are you using MongoDB?

Our solution provides indisputable evidence of all digital interactions, in accordance with the global regulatory requirements of SEC, US Department of Labor (DOL), FTC, FINRA, ESMA, MiFID II, FFSA, and more. Qumram also enables fraud detection, and customer experience analysis that is used to enhance the customer journey through online systems – increasing conversions and growing sales.

Because of the critical nature of regulatory compliance, we cannot afford to lose a single user session or interaction – unlike competitors, our system provides lossless data collection for compliance-mandated recording.

We use MongoDB to ingest, store, and analyze the firehose of data generated by user interactions across our customers’ digital properties. This includes session metadata, and the thousands of events that are generated per session, for example, every mouse click, button selection, keystroke, and swipe. MongoDB stores events of all sizes, from small documents of typically just 100-200 bytes through to session and web objects that can grow to several megabytes each. We also use GridFS to store binary content such as screenshots, CSS, and HTML.
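For illustration, here is a minimal Python sketch of how an interaction event and a binary session artifact might be written to MongoDB and GridFS. The connection string, database, collection, and field names are assumptions for the example, not Qumram’s actual schema.

```python
from pymongo import MongoClient
import gridfs

# Hypothetical connection and namespace; Qumram's real schema is not public.
client = MongoClient("mongodb://localhost:27017")
db = client["qsuite"]

# A small interaction event (the article notes many events are ~100-200 bytes)
db.events.insert_one({
    "sessionId": "abc123",
    "ts": "2017-03-01T09:15:02Z",
    "type": "click",
    "target": "#submit"
})

# Larger binary artifacts (screenshots, CSS, HTML) are stored via GridFS
fs = gridfs.GridFS(db)
with open("screenshot.png", "rb") as f:
    fs.put(f, filename="screenshot.png", sessionId="abc123")
```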

Capturing and storing all of the session data in a single database, rather than splitting content across a database and a separate file system, massively simplifies our application development and operations. With this design, MongoDB provides a single source of truth, enabling any session to be replayed and analyzed on demand.

You started out with a relational database. What were the challenges you faced there?

We initially built our products on one of the popular relational databases, but we quickly concluded that there was no way to scale the database to support billions of sessions every year, with each session generating thousands of discrete events. Also, as digital channels grew, our data structures evolved to become richer and more complex. These structures were difficult to map into the rigid row and column format of a relational schema. So in Autumn 2014, we started to explore non-relational databases as an alternative.

What databases did you look at?

There was no shortage of choice, but we narrowed our evaluation down to Apache Cassandra, Couchbase, and MongoDB.

What drove your decision to select MongoDB?

We wanted a database that would enable us to break free of the constraints imposed by relational databases. We were also looking for a technology that was best-in-class among modern alternatives. There were three drivers for selecting MongoDB:

  1. Flexible data model with rich analytics. Session data is richly structured – there may be up to four levels of nesting and over 100 different attributes. These complex structures map well to JSON documents, allowing us to embed all related data into a single document, which provides two advantages (see the sketch after this answer):

    1. Boosting developer productivity by representing data in the same structure as objects in our application code.
    2. Making our application faster, as we only need to issue a single query to the database to replay a session. At the same time, we need to be able to analyze the data in place, without the latency of moving it to an analytics cluster. MongoDB’s rich query language and secondary indexes allow us to access data by single keys, ranges, full text search, graph traversals, and geospatial queries, through to complex aggregations.
  2. Scalability. The ability to grow seamlessly by scaling the database horizontally across commodity servers deployed on-premise and in the cloud, while at the same time maintaining data consistency and integrity.

  3. Proven. We surveyed customers across our target markets, and the overwhelming feedback was that they wanted us to use a database they were already familiar with. Many global financial institutions had already deployed MongoDB and didn’t want to handle the complexity that came from running yet another database for our application. They knew MongoDB could meet the critical needs of regulatory compliant services, and that it was backed by excellent technical support, coupled with extensive management tools and rock-solid security controls.

As a result, we began development on MongoDB in early 2015.
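As a rough illustration of the first driver, the sketch below (Python, with hypothetical field names rather than Qumram’s real schema) shows a compound secondary index and the single query needed to pull back an entire embedded session for replay.

```python
from pymongo import MongoClient, ASCENDING

sessions = MongoClient("mongodb://localhost:27017")["qsuite"]["sessions"]

# Secondary index supporting lookups by session key and start time
sessions.create_index([("sessionId", ASCENDING), ("startedAt", ASCENDING)])

# One query returns the whole nested session document, events included
session = sessions.find_one({"sessionId": "abc123"})
if session:
    for event in session.get("events", []):
        print(event["ts"], event["type"])
```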

How do your customers deploy and integrate your solution?

We offer two deployment models: on-premise and as a cloud service.

Many of the larger financial institutions deploy the Q-Suite with MongoDB within their own data centers, due to data sensitivity. From our application, they can instantly replay customer sessions. We also expose the session data from MongoDB with a REST API, which allows them to integrate it with their back-office processes, such as records management systems and CRM suites, often using message queues such as Apache Kafka.

We are also rolling out the Q-Suite as a “Compliance-as-a-Service” offering in the cloud. This option is typically used by smaller banks and insurers, as well as the FinTech community.

How do you handle analytics against the collected session data?

Our application relies heavily on the MongoDB aggregation pipeline for native, in-database analytics, allowing us to roll up session data for analysis and reporting. We use the new $graphLookup operator for graph processing of the session data, identifying complex relationships between events, users, and devices. For example, we can detect if a user keeps returning to a loan application form to adjust salary in order to secure a loan that is beyond his or her capability to repay. Using MongoDB’s built-in text search along with geospatial indexes and queries, we can explore session data to generate behavioral insights and actionable fraud intelligence.
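As a sketch of what such a graph traversal can look like, assuming an events collection in which each event references the one before it via a previousEventId field (an illustrative schema, not Qumram’s), a $graphLookup stage can walk back through the chain of interactions that led to a suspicious form submission:

```python
from pymongo import MongoClient

events = MongoClient()["qsuite"]["events"]

# Walk back through the chain of events leading up to a form submission
# (requires MongoDB 3.4+ for $graphLookup; field names are illustrative)
pipeline = [
    {"$match": {"sessionId": "abc123", "type": "form_submit"}},
    {"$graphLookup": {
        "from": "events",
        "startWith": "$previousEventId",
        "connectFromField": "previousEventId",
        "connectToField": "_id",
        "as": "priorEvents",
        "maxDepth": 50
    }}
]
for doc in events.aggregate(pipeline):
    print(doc["_id"], len(doc["priorEvents"]))
```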

Doing all of this within MongoDB, rather than having to couple the database with separate search engines, graph data stores, and geospatial engines, dramatically simplifies development and ongoing operations. It means our developers have a single API to program against, and operations teams have a single database to deploy, scale, and secure.

I understand you are also using Apache Spark. Can you tell us a little more about that?

We use the MongoDB Connector for Apache Spark to feed session data from the database into Spark processes for machine learning, and then persist the models back into MongoDB. We use Spark to generate user behavior analytics that are applied to both fraud detection, and for optimization of customer experience across digital channels.
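A minimal PySpark sketch of this round trip, assuming the 2.x-era MongoDB Connector for Apache Spark is on the classpath and using illustrative URIs and namespaces (the real feature engineering and model training are not shown here):

```python
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("session-analytics")
         .config("spark.mongodb.input.uri", "mongodb://mongo1/qsuite.sessions")
         .config("spark.mongodb.output.uri", "mongodb://mongo1/qsuite.user_scores")
         .getOrCreate())

# Read session data out of MongoDB into a Spark DataFrame
sessions = spark.read.format("com.mongodb.spark.sql.DefaultSource").load()

# Placeholder for MLlib model training; here we just aggregate per user,
# then persist the result back into MongoDB
scores = sessions.groupBy("userId").count()
scores.write.format("com.mongodb.spark.sql.DefaultSource").mode("append").save()
```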

We are also starting to use Spark with MongoDB for Natural Language Processing (NLP) to extract customer sentiment from their digital interactions, and other deep learning techniques for anti-money laundering initiatives.

What does a typical installation look like?

The minimum MongoDB configuration for Q-Suite is a 3-node replica set, though we have many customers running larger MongoDB clusters deployed across multiple active data centers for disaster recovery and data locality. Most customers deploy on Linux, but because MongoDB is multi-platform, we can also serve those institutions that run on Windows.

We support both MongoDB 3.2 and the latest MongoDB 3.4 release, which gives our users the new graph processing functionality and faceted navigation with full text search. We recommend customers use MongoDB Enterprise Advanced, especially to access the additional security functionality, including the Encrypted storage engine to protect data at rest.

For our Compliance-as-a-Service offering, we are currently evaluating the MongoDB Atlas managed service in the cloud. This would allow our teams to focus on the application, rather than operations.

What sort of data volumes are you capturing?

Capturing user interactions is a typical time-series data stream. A single MongoDB node can support around 300,000 sessions per day, with each session generating up to 3,000 unique events. To give an indication of scale in production deployments, one of our Swiss customers is ingesting multiple terabytes of data into MongoDB every day. Another in the US needs to retain session data for 10 years, and so they are scaling MongoDB to store around 3 trillion documents.

Of course, capturing the data is only part of the solution – we also need to expose it to analytics without impacting write volume. MongoDB replica sets enable us to separate out these two workloads within a single database cluster, simultaneously supporting transactional and analytics processing.
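In driver terms, that separation is largely a matter of read preference. A small Python sketch, with hypothetical hostnames and replica set name:

```python
from pymongo import MongoClient

# Ingest traffic uses the default read preference (primary)
ingest = MongoClient("mongodb://mongo1,mongo2,mongo3/?replicaSet=rs0")
ingest["qsuite"]["events"].insert_one({"sessionId": "abc123", "type": "click"})

# Analytics queries are routed to secondaries so they don't compete with writes
analytics = MongoClient(
    "mongodb://mongo1,mongo2,mongo3/?replicaSet=rs0&readPreference=secondaryPreferred"
)
clicks = analytics["qsuite"]["events"].count_documents({"type": "click"})
print(clicks)
```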

Figure 2: Analysis of funnel metrics to monitor customer conversion through digital properties

How are you measuring the impact of MongoDB on your business?

Companies operating in highly regulated industries, from financial services to healthcare to communications, are facing a whole host of new government and industry directives designed to protect digital boundaries. The Q-Suite solution, backed by MongoDB, is enabling us to respond to our customers’ compliance requirements. By using MongoDB, we can accelerate feature development to meet new regulatory demands, and implement solutions faster, with lower operational complexity.

The security controls enforced by MongoDB further enable our customers to achieve regulatory compliance.

Simon, thanks for sharing your time and experiences with the MongoDB community.

To learn more about cybersecurity and MongoDB, download our whitepaper: Building the Next Generation of Threat Intelligence with MongoDB.

Leaf in the Wild: World’s Most Installed Learning Record Store Migrates to MongoDB Atlas to Scale Data 5x, while Reducing Costs

Learning Locker moves away from ObjectRocket to scale its learning data warehouse, used by the likes of Xerox, Raytheon, and UK universities.

From Amazon’s recommendations to the Facebook News Feed, personalization has become ingrained in the consumer experience, so it should come as no surprise that resourceful educators are now trying to improve learning outcomes with that same concept. After all, no two students are identical, in much the same way that no two consumers are exactly alike. Developing a truly personalized educational experience is no easy feat, but emerging standards like the xAPI are helping to make this lofty goal a reality.

xAPI is an emerging specification that enables communication between disparate learning systems in a way that standardizes learning data. That data could include things like a student’s attendance in classes or participation in online tools, but it can also stretch to real-world performance measures: how students apply their learning. This data-led approach to Learning Analytics is helping educators improve learning practices, tailor teaching, and intervene early if it looks like a student is moving in the wrong direction.

But the implications of this go far beyond the classroom, and increasingly companies are using these same techniques to support their employees’ development and to measure the impact of training on performance outcomes. Whilst educators are predicting the chances of a particular student dropping out, businesses can use these same tools to forecast organizational risk, based on compliance training and performance data, for example.

We recently spoke with James Mullaney, Lead Developer at HT2 Labs, a company that is at the forefront of the learning-data movement. HT2 Labs’ flagship product, Learning Locker, is an open source data warehouse used by the likes of Xerox, Raytheon, and a wide range of universities to prove the impact of training and to make more informed decisions on future learning design. To continue to scale the project, better manage its operations, and reduce costs, Learning Locker migrated from ObjectRocket to MongoDB Atlas, the database as a service from MongoDB.

Tell us about HT2 Labs and Learning Locker.

HT2 Labs is the creator of Learning Locker, which is a data warehouse for learning activity data (commonly referred to as a Learning Record Store or LRS). We have a suite of other learning products that are all integrated; Learning Locker acts as the hub that binds everything together. Our LRS uses the xAPI, which is a specification developed in part by the U.S. Department of Defense to help track military training initiatives. It allows multiple learning technology providers to send data into a single data store in a common format.

We started playing around with xAPI around four years ago as we were curious about the technology and had our own Social Learning Management System (LMS), Curatr. Today, Learning Locker receives learning events via an API, analyzes the data stored, and is instrumental in creating reports for our end customers.

Who is using Learning Locker?

The software is open source so our users range from hobbyists to enterprise companies, like Xerox, who use our LRS to track internal employee training.

Another example is Jisc, the R&D organization that advances technologies in UK Higher & Further Education. Jisc are running one of the largest national-level initiatives to implement Learning Analytics across universities in the UK, and our LRS is used to ingest data and act as a single source of data for predictive models. This increased level of insight into individual behavior allows Jisc to do some interesting things, such as predict and preempt student dropouts.

How has Learning Locker evolved?

We’re currently on version two of Learning Locker. We’ve open sourced the product and we’ve also launched it as a hosted Software as a service (SaaS) product. Today we have clients using our LRS in on-premise installations and in the cloud. Each on-prem installation comes packaged with MongoDB. The SaaS version of Learning Locker typically runs in AWS supported by MongoDB Atlas, the managed MongoDB as a Service.

Tell us about your decision to go with MongoDB for the underlying database.

MongoDB was a very natural choice for us as the xAPI specification calls for student activities to be sent as JSON. These documents are immutable. For example, you might send a document that says, “James completed course XYZ.” You can’t edit that document to say that he didn’t complete it. You would have to send another document to indicate a change. This means that scale is very important as there is a constant stream of student activity that needs to be ingested and stored. We’ve been very happy with how MongoDB, with its horizontal scale-out architecture, is handling increased data volume; to be frank, MongoDB can handle more than our application can throw at it.
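To give a feel for the data model, here is a simplified xAPI-style statement being recorded as a new, immutable document with the Python driver; the statement structure is abridged from the specification, and the database and collection names are assumptions rather than Learning Locker's internals.

```python
from datetime import datetime, timezone
from pymongo import MongoClient

statements = MongoClient()["learninglocker"]["statements"]

# A completion is recorded as a new statement, never as an update to an old one
statements.insert_one({
    "actor": {"name": "James", "mbox": "mailto:james@example.com"},
    "verb": {"id": "http://adlnet.gov/expapi/verbs/completed",
             "display": {"en-US": "completed"}},
    "object": {"id": "http://example.com/courses/xyz",
               "definition": {"name": {"en-US": "Course XYZ"}}},
    "timestamp": datetime.now(timezone.utc)
})
```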

In fact, our use of MongoDB is actually award-winning: Last year we picked up the MongoDB Innovation Award for best open source project.

Beyond using the database for ingesting and storing data in Learning Locker, how else are you using MongoDB?

As mentioned earlier, our LRS runs analytics on the data stored, and those analytics are then used in reporting for our end users. For running those queries, we use MongoDB’s aggregation framework and the associated aggregation APIs. This allows our end users to get quick reports on information they’re interested in, such as course completion rates, score distribution, etc.
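A report like course completion rates might boil down to an aggregation along these lines (a sketch with hypothetical field names, not Learning Locker's actual pipeline):

```python
from pymongo import MongoClient

statements = MongoClient()["learninglocker"]["statements"]

# Count completion statements per course, most-completed first
pipeline = [
    {"$match": {"verb.id": "http://adlnet.gov/expapi/verbs/completed"}},
    {"$group": {"_id": "$object.id", "completions": {"$sum": 1}}},
    {"$sort": {"completions": -1}}
]
for row in statements.aggregate(pipeline):
    print(row["_id"], row["completions"])
```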

Our indexes are also rather large compared to the data. We index on a lot of different fields using MongoDB’s secondary indexes. This is absolutely necessary for real-time analytics, especially when the end user wants to ask many different questions. We work closely with our clients to figure out the indexes that make the most sense based on the queries they want to run against the data.
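Creating those secondary indexes is a one-liner per query shape; for example (illustrative fields only):

```python
from pymongo import MongoClient, ASCENDING, DESCENDING

statements = MongoClient()["learninglocker"]["statements"]

# Indexes matched to the questions customers actually ask of the data
statements.create_index([("actor.mbox", ASCENDING), ("timestamp", DESCENDING)])
statements.create_index([("verb.id", ASCENDING), ("object.id", ASCENDING)])
```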

Tell us about your decision to run MongoDB in the cloud. Did you start with MongoDB Atlas or were you using a third party vendor?

Our decision to use a MongoDB as a service provider was pretty simple — we wanted someone else to manage the database for us. Initially we were using ObjectRocket and that made sense for us at the time because we were hosting our application servers on Rackspace.

Interesting. Can you describe your early experiences with MongoDB Atlas and the migration process?

We witnessed the launch of MongoDB Atlas last year at MongoDB World 2016 and spun up our first cluster with Atlas in October. It became pretty clear early on that it would work for what we needed. First we migrated our Jisc deployment and our hosted SaaS product to MongoDB Atlas and we also moved our application servers to AWS for lower latency. The migration was completed in December with no issues.

Why did you migrate to MongoDB Atlas from ObjectRocket?

Cost was a major driving force for our migration from ObjectRocket. We’ve been growing and are now storing five times as much data in MongoDB Atlas at about the same cost.

ObjectRocket was also pretty opaque about what was happening in the background and that’s not the case with MongoDB Atlas, which gives you greater visibility and control. I can see, for example, exactly how much RAM I’m using at any point in time.

And finally, nobody is going to tell you that security isn’t important, especially in an industry where we’re responsible for handling potentially-sensitive student data. We were very happy with the native security features in MongoDB Atlas and the fact that we aren’t charged a percentage uplift for encryption, which was not the case with ObjectRocket.

Do you have any plans to integrate MongoDB with any other technologies to build more functionality for Learning Locker?

We’re looking into Hadoop, Spark, and Tableau for a few of our clients. MongoDB’s native connectors for Hadoop, Spark, and BI platforms come in handy for those projects.

Any advice for people looking into MongoDB and MongoDB Atlas?

Plan for scale. Think about what you’re doing right now and ask yourself, “Will this work when I have 100x more data? Can we afford this at 100x the scale?”

The MongoDB Atlas UI makes most things extremely easy, but remember that some things you can only do through the mongo shell. You should ensure your employees learn or retain the skills necessary to be dangerous in the CLI.

And this isn’t specific to just MongoDB, but think about the technology you’re partnering with and the surrounding community. For us, it’s incredibly important that MongoDB is a leader in the NoSQL space as it’s made it that much easier to talk about Learning Locker to prospective users and clients. We view it as a symbiotic relationship; if MongoDB is successful then so are we.

James, thanks for taking the time to share your experiences with the MongoDB community and we look forward to seeing you at MongoDB World 2017.

For deploying and running MongoDB, MongoDB Atlas is the best mix of speed, scalability, security, and ease-of-use.

Learn more about MongoDB Atlas





Hyperscale Performance with Nuxeo Platform and MongoDB Enterprise Advanced

This is a guest post by Eric Barroca, CEO - Nuxeo.

Digitizing content and processes is a 20- or even 30-year-old story. The first wave of solutions delivered huge efficiency gains for enterprises – from paper, to PDF and checks, to Paypal. Today, companies across industries – from Media to Financial Services to Telecommunications – see new opportunities in a second wave of technology, such as creating new revenue streams and developing new products and services for their customers. But, so many are still managing their content in systems architected in the first wave and which now stand in the way of transformation.

Legacy systems can't support today's digital transformation needs. They lack the enterprise-wide visibility, searchability, and control to keep content and the metadata that makes it valuable together. They are staggering under the crushing complexity of content and the firehose of information flowing in and out of these systems on a daily basis. And, after implementation, they can’t be easily adapted to today’s ever-more dynamic and unpredictable business needs and opportunities.

To succeed in your business transformation, you need an approach that can unlock the value of assets through a system that can see, search and manage assets and metadata across 100s or even 1,000s of places across your enterprise. You need to multiply the value of your assets by empowering your entire business to easily leverage these critical assets and information. You need to accelerate innovation by leveraging the speed and investment of a cloud services ecosystem. And, you have to assume and plan for evolution and scale as your business responds to new opportunity and growth.

Nuxeo & MongoDB Enterprise Advanced: Unmatched Performance in Content Management

Nuxeo’s integration with MongoDB Enterprise Advanced, as an alternative to a relational database, is first of its kind in the Enterprise Content Management (ECM) space. The Nuxeo Platform for content management and Digital Asset Management (DAM) allows enterprises to discover the full value of their most complex digital assets, and scales to support even the largest content repositories, leveraging MongoDB Enterprise Advanced’s scaling, performance and replication capabilities.

Legacy ECM systems fall short when trying to turn data into valuable assets. Content Management and Digital Asset Management are now data-centric. Digital assets are core to any successful digital transformation. Unfortunately, value is often locked in the data surrounding these assets and many organizations have trouble unlocking this data to enable true transformation. The Nuxeo Platform helps to transform this data into valuable assets and, together with MongoDB Enterprise Advanced, allows enterprises to do it at true enterprise scale.

The performance of the Nuxeo Platform has already been tested and benchmarked at the scale of several billion documents with MongoDB Enterprise Advanced. The latest benchmarks from Nuxeo, run on an average cloud instance and using complex content objects, show the following results:

  • Document Processing: 30,000 doc/sec
  • Bulk Import: 5x faster than any relational database implementation
  • Overall, a 15x performance increase compared to the fastest relational database implementation

Check out our benchmark results and learn more in this video: Using MongoDB Enterprise Advanced to Build a Fast and Scalable Document Repository

Why MongoDB Enterprise Advanced as backend storage for Nuxeo apps

Nuxeo chose MongoDB Enterprise Advanced because it enables organizations to deploy cloud-ready applications with unmatched performance and scalability. Used with the Nuxeo Platform, MongoDB Enterprise Advanced provides a database storage option offering high performance, high availability, and exceptional scalability for Enterprise Content Management (ECM) applications.

Nuxeo customers with extremely large content store requirements are able to leverage MongoDB Enterprise Advanced to get features such as replication, zero downtime and scalability. It is also a good combination with Elasticsearch, leveraging Elastic for advanced queries and MongoDB Enterprise Advanced for highly scalable content and asset storage. Nuxeo customers now have access to capabilities such as full-index support, rich querying, auto-sharding, replication and high availability, and much more.

Using the Nuxeo Platform with MongoDB Enterprise Advanced provides the opportunity to build content management applications with big data tools capable of dealing with complex, enterprise-scale data volumes at unmatched speeds.

Nuxeo for Giant ECM Applications

Nuxeo provides a Hyperscale Digital Asset Platform that helps enterprise organizations unlock the full value of their digital assets to create new revenue streams, improve performance, and maximize existing IT investments. Over 200 leading organizations use Nuxeo for digital asset management, document management, knowledge management, and other content-centric business applications.

Nuxeo is headquartered in New York with five additional offices worldwide, and raised $30 million in capital from Goldman Sachs and Kennet Partners in 2016.

More information is available at www.nuxeo.com.

Leaf in the Wild: Powering Smart Factory IoT with MongoDB

Jason Ma

Customer Stories

BEET Analytics OEMs MongoDB for its Envision manufacturing IoT platform. MongoDB helps Envision deliver 1-2 orders of magnitude better performance than SQL Server, resulting in increased manufacturing throughput and reduced costs.

Leaf in the Wild posts highlight real world MongoDB deployments. [Read other stories](https://www.mongodb.com/blog/category/customer-stories "Read other stories") about how companies are using MongoDB for their mission-critical projects.

BEET Analytics Technology creates solutions to help the manufacturing industry transition to smart IoT factories for the next evolution of manufacturing. BEET’s Process Visibility System, Envision, makes the assembly line machine process visible and measurable down to every motion and event. Built on MongoDB, Envision is able to precisely analyze telemetry data streamed from sensors on the production line to help improve the manufacturing process.

At MongoDB World 2016, BEET Analytics was a recipient of a MongoDB Innovation Award, which recognizes organizations and individuals that took a giant idea and made a tangible impact on the world.

I had the opportunity to sit down with Girish Rao, Director of Core Development, to discuss how BEET Analytics harnesses MongoDB to power its Envision platform.

Can you tell us a little bit about BEET Analytics?

Founded in June 2011, BEET Analytics Technology is a smart manufacturing solution provider. We provide a Process Visibility System (PVS) built upon Envision, the software created by BEET. Envision monitors the automated assembly line for any potential issues in throughput, performance, and availability – and alerts users about possible breakdowns before they occur. For our customers, one minute of lost production time can result in a significant loss of revenue, so we collect and monitor the vital details of an automated assembly line. This provides predictive analytics and insights that avoid unplanned downtime and help sustain higher manufacturing throughput.

Why did you decide to OEM MongoDB?

When we started using MongoDB about 4 years ago, it was not as well known as it is now – at least not in the manufacturing industry. Our strategy was to build a complete product with MongoDB embedded within our system. We could then bring our complete system, deploy it in the plant, and have it run out of the box. This helped minimize the burden on our customer plant’s IT department to manage multiple software and hardware products. This model has worked well for us to introduce our product into several customer plants. Not only have we been able to provide a seamless customer experience, but MongoDB’s expertise both in development and production has helped us to accelerate our own product development. Additionally, co-marketing activities that promote our joint solution have been extremely beneficial to us.

How does BEET Analytics use MongoDB today?

The Envision platform consists of multiple data collectors, which are PC-based devices deployed close to the assembly line that stream data from the Programmable Logic Controllers (PLCs). The PLCs (0.05 - 0.08 second scan cycle) continuously monitor the “motion” of hundreds of moving parts in the manufacturing facility. Each “motion” is captured by the data collectors and stored in MongoDB. The daily transactional data for an assembly line creates about 1-3 million MongoDB documents per day, and we typically keep between 3-6 months’ worth of data, which comes out to about 500 million documents.
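The interview does not say how BEET enforces that retention window, but one common way to age out time-series documents is a TTL index keyed on the capture timestamp; a hypothetical sketch with illustrative names:

```python
from pymongo import MongoClient, ASCENDING

motions = MongoClient()["envision"]["motions"]  # hypothetical namespace

# Expire motion documents roughly six months after they were captured
motions.create_index([("capturedAt", ASCENDING)],
                     expireAfterSeconds=180 * 24 * 3600)
```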

Can you describe your MongoDB deployment and how it’s configured?

Each data collector on the assembly line runs its own standalone MongoDB instance. For a medium sized assembly line, we will typically have 1-2 data collectors, while a larger assembly line can have 4-6 data collectors. The data collectors transfer the data through a web service up to a central repository that is backed by a MongoDB replica set and where the Envision application server runs. The central MongoDB replica set consists of a primary node, running Linux, and two secondaries that run Windows. We use Windows as a secondary because we also run Internet Information Services (IIS) for our application. This architecture is cost effective for us. In the future, we will probably run both the primary and secondary on Linux. We have failed over a few times to the secondary without any application downtime. Users interact with the application server through a browser to visualize the “heartbeat” of the entire manufacturing process. We use the MongoDB aggregation and map reduce framework to aggregate the data and create analytics reporting.

Were you using something different before MongoDB?

Our first version of the Envision platform was developed about 6 years ago using a Microsoft SQL Server database. SQL Server was okay up to a certain point, but we couldn’t scale up without using very expensive hardware. Our primary requirement was to support the throughput that our system needed without resorting to massive server arrays. In our internal benchmarks, MongoDB had 1-2 orders of magnitude better performance than SQL Server for the same hardware. At that point, we decided to build the solution using MongoDB.

Are there specific tools you use to manage your MongoDB deployment?

We currently use Ops Manager internally for development servers, and are looking to implement Ops Manager in production. Ops Manager has been extremely useful in helping us automate our deployment of MongoDB and ensuring we follow MongoDB best practices. It’s also invaluable that Ops Manager provides visibility into key metrics, so we are able to diagnose any potential problems before they happen.

Any best practices for deploying MongoDB that you can share with the community?

Understanding your dataset is a critical component. As we understood our dataset better, we were able to size the hardware more appropriately. Another important practice is indexing. Make sure you have index coverage for most of the queries to avoid full collection scans. MongoDB offers an extensive range of secondary indexes that you typically don’t get in a NoSQL database. Capped collections work really well for log type data that does not need to be saved for a long period of time. Finally, use a replica set to help you with performance, always-on availability, and scalability.
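A quick sketch of the last two practices, with hypothetical collection and field names: a capped collection for short-lived log-style data, and a compound index covering a common query shape so it never falls back to a collection scan.

```python
from pymongo import MongoClient, ASCENDING

db = MongoClient()["envision"]

# Capped collection: fixed size on disk, oldest documents age out automatically
db.create_collection("cycle_log", capped=True, size=512 * 1024 * 1024)

# Compound index covering the most common query (station + time range)
db["motions"].create_index([("stationId", ASCENDING), ("capturedAt", ASCENDING)])
```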

How are you measuring the impact of MongoDB?

MongoDB has allowed BEET to reduce the overall infrastructure cost and provide better value to customers. From a development perspective, MongoDB’s flexible data model with dynamic schema has allowed us to make application changes faster, and rapidly add new features to our product to maintain competitive advantage and better serve our customers.

What advice would you give for someone using MongoDB for their project?

MongoDB has [best practices guides](https://www.mongodb.com/collateral/mongodb-performance-best-practices "MongoDB performance best practices") and whitepapers that are really helpful for ensuring you follow the right guidelines. Also, we have been using Ops Manager in our development environment and it has been a huge advantage for troubleshooting any performance or setup issues. This is something we plan to implement in production and recommend other users do as well.

Girish, thank you so much for taking the time to share your experiences with the MongoDB community.


Learn how BEET harnesses MongoDB for its IoT solutions

Leaf in the Wild: How EG Built a Modern, Event-Driven Architecture with MongoDB to Power Digital Transformation

Mat Keep

Customer Stories

*UK’s Leading Commercial Property Data Service Delivers 50x More Releases Per Month, Achieving Always-On Availability*

The total value of UK commercial property is estimated at close to £1.7 trillion [1]. Investment decisions on numbers that big require big data, especially when handling a wide variety of multi-structured data. And that is why EG, the UK’s leading commercial property data service, turned to MongoDB.

I met with Chris Fleetwood, Senior Director of Technology, and Chad Macey, Principal Architect at EG. We discussed how they are using agile methodologies with devops, cloud computing, and MongoDB as the foundation for the company’s digital transformation – moving from a century-old magazine into a data-driven technology service.

Can you start by telling us a little bit about your company?

Building on over 150 years of experience, [EG](http://www.estatesgazette.com/ "EG") (formerly Estates Gazette) delivers market-leading data & analytics, insight, and decision support tools covering the UK commercial property market. Our services are used by real estate agents, lawyers, investors, surveyors, and property developers. We enable them to make faster, better informed decisions, and to win more business in the property market. We offer a comprehensive range of market data products with information on hundreds of thousands of properties across the UK, accessible across multiple channels including print, digital, online, and events. EG is part of Reed Business Information, providing data and analytics to business professionals around the world.

What is driving digital transformation in your business?

Our business was built on print media, with the Estates Gazette journal serving as the authoritative source on commercial property across the UK for well over a century. Back in the 1990s, we were quick to identify the disruptive potential of the Internet, embracing it as a new channel for information distribution. Pairing our rich catalog of property data and market intelligence with new data sources from mobile and location services – and the ability to run sophisticated analytics across all of it in the cloud – we are accelerating our move into enriched market insights, complemented with decision support systems.

IT was once just a supporting function for the traditional print media side of our business, but digital has now become our core engine for growth and revenue.

Can you describe your platform architecture?

Our data products are built on a microservices-inspired architecture that we call “pods”. Each pod serves a specific data product and audience. For example, the agent pod provides market intelligence for each geographic area such as recent sales, estimated values, local amenities, and zone regulations, along with lists of potential buyers and renters. Meanwhile the surveyor pod will maintain data used to generate valuations of true market worth. The pods also implement the business rules that drive the workflow for each of our different user audiences.

The advantage of our pod-based architecture is that it improves on our deployment and operational capabilities, supporting the transition to continuous delivery – giving us faster time to market for new features demanded by our customers. Each pod is owned by a “squad” with up to half a dozen members, comprising devops engineers, architects, and product managers.


Figure 1: EG Pod Architecture

How are you using MongoDB in this architecture?

Each pod maintains a local instance of the data it needs, pulling from a centralized system of record that stores all property details, transactions, locations, market data, availability, and so on. The system of record – or the “data-core pod” as we call it – in addition to each of the data product pods all run on MongoDB.

MongoDB is at the center of our event driven architecture. All updates are committed to the system of record – for example, when a new property comes onto the market – which then uses the Amazon Web Services (AWS) SNS push notification and SQS message queue services to publish the update to all the other product pods. This approach means all pods are working with the latest copy of the data, while avoiding tight coupling between each pod and the core database.
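EG's exact integration code is not shown in the interview, but the pattern can be sketched with the Python drivers for MongoDB and AWS; the connection string, topic ARN, and field names below are hypothetical.

```python
import json

import boto3
from pymongo import MongoClient

core = MongoClient("mongodb://data-core-pod/?replicaSet=rs0")["eg"]["properties"]
sns = boto3.client("sns", region_name="eu-west-1")

# 1. Commit the update to the system of record (the "data-core pod")
result = core.insert_one({"propertyId": "W1-1234", "status": "on_market"})

# 2. Publish a notification so each product pod refreshes its local copy
sns.publish(
    TopicArn="arn:aws:sns:eu-west-1:123456789012:property-events",  # hypothetical
    Message=json.dumps({"event": "listed", "propertyId": "W1-1234",
                        "docId": str(result.inserted_id)})
)
```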

Why did you select MongoDB?

Agility and time to market are critical. We decided to use a Javascript-based technology stack that allows consistent developer experience from the client, to the server, through to the database, without having to deal with any translations between layers.

We evaluated multiple database options as part of our platform modernization:

  • Having used relational databases extensively in the past, we knew the pain and friction involved with having to define a schema in the database, and then re-implement that same schema again in an ORM at the application layer. And we would need to repeat this process for each pod we developed, and for each change in the data model as we evolved application functionality.
  • We also took a look at an alternative NoSQL document database, but it did not provide the development speed we needed as we found it was far too complex and difficult to use.

As the M in the MEAN stack, we knew MongoDB would work well with JavaScript and Node.js. I spun up a local instance on my laptop, and was up and running in less than 5 minutes, and productive within the hour. I judge all technology by my one hour rule. Basically, if within an hour I can start to understand and be productive with the technology, that tells me it's got a really good developer experience, supported by comprehensive documentation. If it's harder than that, I’m not likely to get along with that technology in the longer term. We didn’t look back from that point onwards – we put MongoDB through its paces to ensure it delivered the schema flexibility, query functionality, performance, and scale we needed, and it passed all of our tests.

Can you describe your MongoDB deployment?

We run [MongoDB Enterprise Advanced](https://www.mongodb.com/products/mongodb-enterprise-advanced "MongoDB Enterprise Advanced") on AWS. We get access to MongoDB support, in addition to advanced tooling. We are in the process of installing [Ops Manager](https://www.mongodb.com/products/ops-manager "Ops Manager") to take advantage of fine-grained monitoring telemetry delivered to our devops team. This insight enables them to continue to scale the service as we onboard more data products and customers.

MongoDB Compass is a new tool we’ve started evaluating. The schema visualizations can help our developers to explore and optimize each pod’s data model. The new integrated geospatial querying capability is especially valuable for our research teams. They can use the Compass GUI to construct sophisticated queries with a simple point and click interface, returning results graphically, and as sets of JSON documents. Without this functionality, our valuable developer resource would have been tied up creating the queries for them.

We will also be upgrading to the latest MongoDB release to take advantage of the MongoDB Encrypted storage engine to extend our security profile, and prepare for the new EU General Data Protection Regulation (GDPR) market legislation that is coming into force in 2018.

How is MongoDB performing for you?

MongoDB has been rock solid for us. With very few operational issues our team is able to focus on building new products. The flexible data model, rich query language, and indexing makes development super-fast. Geospatial search is the starting point for the user experience – a map is the first thing a customer sees when they access our data products. [MongoDB’s geospatial queries and indexes](https://docs.mongodb.com/manual/applications/geospatial-indexes/ "MongoDB Documentation for geospatial queries and indexes") allow our customers to easily navigate market data by issuing polygon searches that display properties within specific coordinates of interest.
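That polygon search maps directly onto a 2dsphere index and a $geoWithin query; a small Python sketch with made-up coordinates and field names:

```python
from pymongo import MongoClient, GEOSPHERE

properties = MongoClient()["eg"]["properties"]
properties.create_index([("location", GEOSPHERE)])

# Properties inside a polygon drawn on the map (GeoJSON, longitude/latitude order)
polygon = {
    "type": "Polygon",
    "coordinates": [[[-0.15, 51.50], [-0.15, 51.52],
                     [-0.10, 51.52], [-0.10, 51.50], [-0.15, 51.50]]]
}
for prop in properties.find({"location": {"$geoWithin": {"$geometry": polygon}}}):
    print(prop.get("propertyId"))
```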


Figure 2: Navigating property market data with sophisticated geospatial search UI

We also depend on the MongoDB aggregation pipeline to generate the analytics and insights the business, and our customers, rely on. For example, we can quickly generate averages for rents achieved in a specific area, aggregated against the total number of transactions over a given time period. Each MongoDB release enriches the aggregation pipeline, so we always have new classes of analytics we can build on top of the database.
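A rollup like "average rent achieved in an area over a period, alongside the number of transactions" is a short aggregation pipeline; the following is an illustrative sketch, not EG's production query:

```python
from datetime import datetime
from pymongo import MongoClient

transactions = MongoClient()["eg"]["transactions"]

pipeline = [
    {"$match": {"area": "EC2", "type": "rent",
                "completedAt": {"$gte": datetime(2016, 1, 1)}}},
    {"$group": {"_id": "$area",
                "avgRent": {"$avg": "$annualRentPerSqFt"},
                "transactions": {"$sum": 1}}}
]
print(list(transactions.aggregate(pipeline)))
```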

How are you measuring the impact of MongoDB on your business?

It’s been a core part of our team transitioning from being a business support function to being an enabler of new revenue streams. With our pod-based architecture powered by MongoDB, we can get products to market faster, and release more frequently.

With relational databases, we were constrained to one new application deployment per month. With MongoDB, we are deploying 50 to 60 times per month. With MongoDB’s self-healing replica set architecture, we’ve delivered improved uptime to deliver always-on availability to the business.

Chris and Chad, thanks for taking the time to share your experiences with the MongoDB community.

To learn more about microservices, download our new whitepaper:

Microservices and the Evolution of Building Modern Apps


1. http://www.bpf.org.uk/about-real-estate

How a 520% Increase in Players Sparked Online Gaming Platform Pirate Kings’ Migration to MongoDB Professional and Cloud Manager

MongoDB

Customer Stories

70 million pirates battle it out on MongoDB

Jelly Button Games is a free-to-play mobile gaming company based in Tel Aviv, Israel that focuses on building original games that are mobile-friendly, multi-platform, and allow people to play together no matter where in the world they are located. Founded in 2011, Jelly Button has grown from the five original founders to more than 85 employees.

I’m Shai Zonis, a senior server developer at Jelly Button for the game Pirate Kings. Pirate Kings is a fully realized world where over 70 million pirates battle it out to conquer exotic islands in a quest for gold, artifacts, and revenge. Most users notice the palm trees, glimmers of gold, and the quality of the animation, but few think about the tools working behind the scenes to make the game operate seamlessly.

![Pirate Kings](https://webassets.mongodb.com/_com_assets/cms/PirateKings1-asetu3pmrr.png "Pirate Kings")

After upgrading to MongoDB Professional and Cloud Manager, we have scaled to easily manage 70 million users with 60% cost savings compared to our previous MongoDB hosting provider. While today everything is running smoothly, the path to success wasn’t always nicely paved - we had to fight our own battles to win the day.

Challenges of a third party database hosting service

Our team originally had experience with relational database technologies. However, we knew that a relational database would not provide the scale, agility, and performance needed to make a game like Pirate Kings successful. MongoDB was the clear choice, though at the time, we didn’t know much about the operational aspects of running it. In the end we decided to work with a third party MongoDB hosting service to manage our database.

In the early days Jelly Button had a million daily unique users and, for a while, all was going well.

Suddenly, the game went viral and there was a 520% increase in users in just two weeks. The business was excited by this increase in popularity, though the engineering team got a little nervous about the latency spikes impacting the user experience.

Despite the challenges we faced, we initially did not want to migrate from our existing hosting service because of the amount of time and money we had already invested in the platform.

![Pirate Kings](https://webassets.mongodb.com/_com_assets/cms/PirateKings_2-y6i6skptmm.png "Pirate Kings")

The final straw

Fast forward to February of 2016 when our existing third party MongoDB hosting service began to strangle our ability to scale and expand the game. We were constantly facing issues with performance, and the third party service was not able to help us address the problem.

At that point, it was necessary to move beyond a third party and instead work directly with the team that develops the database. We needed to find ways to better manage our data and scale to meet our growing number of users. We tried to make the transition on our own, but quickly realized we could accelerate the upgrade and transition by working directly with MongoDB Professional Services.

Working with Masters - how MongoDB helped replatform our database and grow the business

Before the migration, we were facing exorbitant costs and had very little insight into how the database was performing.

MongoDB Professional Services worked alongside our team to successfully migrate Pirate Kings from the third party hosting service to MongoDB 3.2 configured with the WiredTiger storage engine in under two months. Together we were able to migrate, fix and optimize our database with little downtime. Our consultant was focused on teaching and mentoring the team, and the amount of know-how and technical discussions we had during this time were truly empowering. Working with professional services felt like working with true MongoDB masters.

Once upgraded, we saw a 60% cost savings and we were able to compress 18 shards down to one single replica set. With the transition to WiredTiger, the data size on disk dropped by 60% due to its native compression libraries.

MongoDB Cloud Manager, a platform for managing MongoDB, was also instrumental in giving us full insight into the database for the first time. With Cloud Manager we had much higher levels of data protection and lower operational complexity with managed backups. We were finally able to dig deep into database telemetry to understand the pitfalls that were inherent in our previous service. With MongoDB Professional, we were able to get direct access to 24x7 support.

Overall, the complexity of our database significantly decreased and our database administrators are able to sleep much better.

What’s Next

While the main motivation for migrating away from a third party hosted service was to better manage Pirate Kings data, MongoDB provided us the promise of a better life for our developers and a better future for our company. Today Pirate Kings easily manages 10 million unique players per month. Better yet, our team now feels very comfortable and confident with the technology.

Moving forward, you can expect to see Jelly Button develop two new games per year, all of which - we are excited to say - are being built on MongoDB. They are the pirate kings!


Try MongoDB Cloud Manager

Thermo Fisher moves into the cloud with MongoDB Atlas & AWS

Biotechnology giant uses MongoDB Atlas and an assortment of AWS technologies and services to reduce experiment times from days to minutes.

Thermo Fisher (NYSE: TMO) is moving its applications to the public cloud as part of a larger Thermo Fisher Cloud initiative with the help of offerings such as MongoDB Atlas and Amazon Web Services. Last week, our CTO & Cofounder Eliot Horowitz presented at AWS re:Invent with Thermo Fisher Senior Software Architect Joseph Fluckiger on some of the transformative benefits they’re seeing internally and across customers. This recap will cover Joseph’s portion of the presentation.

Joseph started by telling the audience that Thermo Fisher is maybe the largest company they’d never heard of. Thermo Fisher employs over 51,000 people across 50 countries, with over $17 billion in revenues in 2015. Formed a decade ago through the merger of Thermo Electron & Fisher Scientific, it is one of the leading companies in the world in the genetic testing and precision laboratory equipment markets.

The Thermo Fisher Cloud is a new offering built on Amazon Web Services consisting of 35 applications supported by over 150 Thermo Fisher developers. It allows customers to streamline their experimentation, processing, and collaboration workflows, fundamentally changing how researchers and scientists work. It serves 10,000 unique customers and stores over 1.3 million experiments, making it one of the largest cloud platforms for the scientific community. For internal teams, Thermo Fisher Cloud has also streamlined development workflows, allowing developers to share more code and create a consistent user experience by taking advantage of a microservices architecture built on AWS.

One of the precision laboratory instruments the company produces is a mass spectrometer, which works by taking a sample, bombarding it with electrons, and separating the ions by accelerating the sample and subjecting it to an electric or magnetic field. Atoms within the sample are then sorted by mass and charge and matched to known values to help customers figure out the exact composition of the sample in question. Joseph’s team develops the software powering these machines.

Thermo Fisher mass spectrometers are used to:

  • Detect pesticides & pollutants — anything that’s bad for you
  • Identify organic molecules on extraplanetary missions
  • Process samples from athletes to look for performance-enhancing substances
  • Drive product authenticity tests
  • And more

Thermo Fisher mass spectrometers

*Mass Spectrometry Results*


*MS Instrument Connect*

During the presentation, Joseph showed off one application in the Thermo Fisher Cloud called MS Instrument Connect, which allows customers to see the status of their spectrometry instruments with live experiment results from any mobile device or browser. No longer does a scientist have to sit at the instrument to monitor an ongoing experiment. MS Instrument Connect also allows Thermo Fisher customers to easily query across instruments and get utilization statistics. Supporting MS Instrument Connect and marshalling data back and forth is a MongoDB cluster deployed in MongoDB Atlas, our hosted database as a service.

Joseph shared that MongoDB is being used across multiple projects in Thermo Fisher and the Thermo Fisher Cloud, including Instrument Connect, which was originally deployed on DynamoDB. Other notable applications include the Thermo Fisher Online Store (which was migrated from Oracle), Ion Reporter (which was migrated from PostgreSQL), and BioPharma Finder (which is being migrated from SQLite).


*Thermo Fisher apps moving to MongoDB*

To support scientific experiments, Thermo Fisher needed a database that could easily handle a wide variety of fast-changing data and allow its customers to slice and dice their data in many different ways. Experiment data is also very large; each experiment produces millions of “rows” of data. When explaining why MongoDB was chosen for such a wide variety of use cases across the organization, Joseph called the database a “swiss army knife” and cited the following characteristics:

  • High performance
  • High flexibility
  • Ability to improve developer productivity
  • Ability to be deployed in any environment, cloud or on premises

What really got the audience’s attention was a segment where Joseph compared incumbent databases that Thermo Fisher had been using with MongoDB.

MongoDB compared to MySQL (Aurora)

“If I were to reduce my slides down to one, this would be that slide,” Joseph stated. “This is absolutely phenomenal. What we did was we inserted data into MongoDB & Aurora and with only 1 line of code, we were able to beat the performance of MySQL.”

Inserting data: MongoDB vs. MySQL

*"If I were to reduce my slides down to one, this would be that slide.”*

In addition to delivering 6x higher performance with 40x less code, MongoDB also helped reduce the schema complexity of the app.
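The benchmark code itself was not published, but the "one line of code" point is easy to picture: a nested experiment document that would span several relational tables lands in MongoDB with a single insert call. An illustrative Python sketch with hypothetical names:

```python
from pymongo import MongoClient

experiments = MongoClient()["thermofisher"]["experiments"]  # hypothetical namespace

# The entire experiment, nested scan results included, is written in one call
experiments.insert_one({
    "instrumentId": "MS-0042",
    "sample": {"name": "sample-17", "matrix": "plasma"},
    "scans": [
        {"mz": 445.12, "intensity": 1.8e6},
        {"mz": 512.30, "intensity": 9.2e5}
    ]
})
```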

MongoDB compared to SQLite

For the mass spectrometry application used in performance-enhancing drug testing, Thermo Fisher rewrote the data layer from SQLite to MongoDB and reduced their code by a factor of about 3.5.

MongoDB compared to DynamoDB

Joseph then compared MongoDB to DynamoDB, stating that while both databases are great and easy to deploy, MongoDB offers a more powerful query language for richer queries to be run and allows for much simpler schema evolution. He also reminded the audience that MongoDB can be run in any environment while DynamoDB can only be run on AWS.

Finally, Joseph showed an architecture diagram of how MongoDB is being used alongside several other technologies and services (including AWS Lambda, Docker, and Apache Spark) to parallelize algorithms and significantly reduce experiment processing times.

Parallel data processing

*Reducing experiment times from days to minutes*

He concluded his presentation by explaining why Thermo Fisher is pushing applications to MongoDB Atlas, citing its ease of use, the seamless migration process, and how there has been no downtime, even when reconfiguring the cluster. The company began testing MongoDB Atlas around its release date in early July and began launching production applications on the service in September. With the time Thermo Fisher team is saving by using MongoDB Atlas (that would have otherwise been spent on writing and optimizing their data layer), they’re able to invest more time in improving their algorithms, their customer experience, and their processing infrastructure.

“Anytime I can use a service like MongoDB Atlas, I’m going to take that so that we at Thermo Fisher can focus on what we’re good at, which is being the leader in serving science.”

To view Joseph & Eliot’s AWS re:Invent presentation in its entirety, click here.

--

Get started with MongoDB Atlas today!


Travellers Instantly Browse Billions of Recommendations With MongoDB-Powered New Amadeus Flight Search Portfolio

Mat Keep

Customer Stories

I’ve got $600 and I want to escape to somewhere warm for winter – where can I go? Answering that question efficiently is a tricky data challenge but it’s increasingly the type of query travellers are asking. That’s why Amadeus called on MongoDB to help build an Instant Search application which can browse billions of travel options across multiple criteria in real time. Travel site KAYAK is already using Amadeus Instant Search technology to improve user experience for travellers and increase conversion rates.

“Online channels have transformed the way people plan and shop for trips. They want to be inspired by travel choices and they want to explore, compare and buy now. Providing instant results to complex search queries is daunting and requires cutting-edge technology,” explained Wolfgang Krips, Executive Vice President Global Operations at Amadeus.

Amadeus and KAYAK travel site

Amadeus is a leading provider of advanced technology solutions for the global travel industry, serving over 190 countries and handling over 3.9 million travel-related bookings on peak days. To serve its hundreds of millions of customers, Amadeus has to process an enormous amount of data, including more than 1.6 billion data requests every day. As part of its mission to shape the future of travel at massive scale, Amadeus’s infrastructure team has been increasing its use of MongoDB as the database powering the latest generation of its feature-rich travel applications.

MongoDB is working with Amadeus on more than a dozen applications, from back-office airline accounting platforms right through to consumer facing, mission critical search applications. MongoDB, the fastest growing database on the planet, gives Amadeus the flexibility to accelerate time to value and handle key data structures at immense scale, for the industry's most demanding travel companies. As a central part of its digital initiatives, the organisation has become one of the most advanced global users of MongoDB.

Olaf Schnapauff, Chief Technology Officer Global Operations at Amadeus, said: “Expectations for online and mobile services are now incredibly high. With MongoDB, we’ve been able to build an application that has incredible performance even while concurrently handling billions of queries. Regular databases just don’t have that scale.”

MongoDB and Amadeus Instant Search

When travellers sit down to book a holiday, 25% of them have not decided on a destination and 42% don’t even know what date they’ll want to leave. This makes most travel search applications, which require dates and destinations to be specified, far too limited.

Amadeus Instant Search solves that problem, making it possible to look for holidays in a more human and natural way. Users can search billions of travel options across multiple dimensions – cost, distance, location, date, and route. It’s a simple idea, but complex to implement.
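
Amadeus hasn’t published its data model, but a query expressing the “somewhere warm for under $600 this winter” question across the dimensions listed above might look roughly like this sketch (the collection, field names, and values are purely illustrative):

```python
from pymongo import MongoClient, ASCENDING

client = MongoClient("mongodb://localhost:27017")
options = client.travel.options  # hypothetical collection of pre-computed travel options

# Multi-dimensional browse: no fixed destination or exact date required.
winter_escapes = (
    options.find({
        "price_usd": {"$lte": 600},
        "climate": "warm",
        "departure_date": {"$gte": "2016-12-01", "$lt": "2017-03-01"},
    })
    .sort("price_usd", ASCENDING)
    .limit(50)
)
```

A compound index covering these fields would typically back such a query so the browse stays fast as the option set grows into the billions.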

Instant Search was originally developed on an internal NoSQL database, but whenever the testing load increased, performance would suffer. As other teams within Amadeus were having success with MongoDB, the Instant Search team adopted it and immediately saw major benefits. No matter how large the data set grew, MongoDB continued to deliver outstanding performance.

The application takes advantage of MongoDB Enterprise Advanced, with the operations team using MongoDB Ops Manager to monitor and manage the database. Proactive alerts generated from monitoring telemetry help ensure any potential performance bottlenecks or issues are spotted well before they affect customers.

To scale effectively, Amadeus has distributed the database across multiple shards (the practice of breaking a database deployment into smaller, more manageable parts). Using the MongoDB WiredTiger storage engine, the team has also compressed storage by 80%, leading to significant cost reductions and performance improvements.
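
The article doesn’t detail Amadeus’s shard key or compression settings, but enabling sharding for a collection and switching WiredTiger to a more aggressive block compressor can be sketched like this (the database name, collection name, and shard key are assumptions for illustration):

```python
from pymongo import MongoClient

client = MongoClient("mongodb://mongos.example.net:27017")  # a mongos query router

# Create the collection with zlib block compression (WiredTiger defaults to snappy).
client.travel.create_collection(
    "options",
    storageEngine={"wiredTiger": {"configString": "block_compressor=zlib"}},
)

# Distribute the collection across shards on a hypothetical shard key.
client.admin.command("enableSharding", "travel")
client.admin.command("shardCollection", "travel.options",
                     key={"destination": 1, "departure_date": 1})
```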

For many of its other MongoDB projects, Amadeus has also worked closely with MongoDB’s consulting services team. In fact, the range of complex and industry-leading projects that the company is working on has made it one of the most sophisticated users of non-relational technologies, even winning MongoDB’s Innovation Award in 2015. Many best practices, and some features, have been created in concert with Amadeus staff.

“We’re working to build a world where all travel is personalised and connected. To do this we can’t just solve yesterday’s problems, we have to continually run ahead of the pack by building giant ideas no one else has the capabilities or creativity for,” said Wolfgang Krips.

Olaf Schnapauff continued: “MongoDB plays a crucial role in that work as it gives us the right features for critical use cases, stability, and security, but it also has the product maturity to run a massive application at scale, a combination that is hard to find.”

Try MongoDB Enterprise Advanced



About the Author - Mat Keep

Mat is a director within the MongoDB product marketing team, responsible for building the vision, positioning and content for MongoDB’s products and services, including the analysis of market trends and customer requirements. Prior to MongoDB, Mat was director of product management at Oracle Corp. with responsibility for the MySQL database in web, telecoms, cloud and big data workloads. This followed a series of sales, business development and analyst / programmer positions with both technology vendors and end-user companies.

Cisco & MongoDB: E-commerce Transformation

Mat Keep

Customer Stories

Migrating a Major Revenue Generation Platform to Improve Customer Experience

The recent Launchpad event in San Francisco was the venue for our announcement of MongoDB 3.4. This new release offers a major evolution in capabilities for users seeking to address the emerging opportunities presented by digital transformation. Even before MongoDB 3.4, many of the world’s most progressive and innovative companies were using MongoDB as a cornerstone of their transformation initiatives.

To illustrate this point, at the launch event Dharmesh Panchmatia, Director of e-commerce at Cisco Systems, discussed how his company is using MongoDB to modernize its e-commerce platform. This is a mission-critical application in every sense of the word – it handles the online configuration, pricing and purchasing of all Cisco products and services globally. Through the transformation, Cisco is aiming to do two things:

  1. Replatform for the cloud. Now the company’s e-commerce solution can tap into the business agility enabled by cloud computing.
  2. Improve customer experience. Deliver 8x faster response times, while providing continuous availability to ensure users are served 24x7.

You can get all of the detail by watching a recording of Dharmesh discussing the Cisco e-commerce transformation project in a fireside chat with Seong Park, VP of Solutions Architecture at MongoDB. I had the chance to catch up with Dharmesh after his chat to get a summary of the lessons learned from using MongoDB to transform a mission-critical application.

Can you start by telling us about your role at Cisco?
I manage a team of 45 Cisco engineers and architects, in addition to the external resources we use for development and testing. We are distributed across multiple engineering sites in the US and India.

What is your team responsible for?
We are responsible for the e-commerce platform – a suite of 35 different services that power product configuration, pricing, quoting, export compliance, credit checks, and order booking across all Cisco product lines. It manages both B2B and B2C transactions, serving 170,000 unique users across Cisco's sales teams, Cisco partners, and direct customers across the globe. We handle an average of 3.6 million hits per day, rising to 6 million at quarter end, and we’re experiencing continual growth. The average financial value of each transaction is extremely high, so it is truly critical to our business.

What challenges were you seeking to address in the transformation project?
We’re driven by improving customer experience and business agility. By modernizing the underlying database layer, we wanted to make the e-commerce platform cloud-ready. What does that look like for the business?

  • Ensuring resilience of the core platform by providing fault tolerance against any outage.
  • Enabling continuous delivery of new code deployments and upgrades with zero downtime.
  • Improving response times to users by a minimum of 5x. Our previous SLAs required 98% of all page loads to be completed in less than 3 seconds, and 99% within 5 seconds. To provide a seamless user experience, we needed to make page rendering on the client a sub-second operation.

What led you to MongoDB?
We looked across the entire non-relational database landscape. MongoDB came out as the top candidate for three reasons:

  1. Strong consistency with partition tolerance. As a distributed transactional platform, we can’t afford the risk of acting on stale data. Eventual consistency wasn’t an option. MongoDB was ahead of all other databases in providing the consistency guarantees demanded by the business.
  2. Secondary Indexes. We don’t have control over the types of questions business users want to ask of the data, so we needed the ability to efficiently query against any attribute. MongoDB’s secondary indexes gave us this flexibility in ways other databases simply couldn’t match.
  3. Ease of development. Time to market is critical, so we needed a database that was easy to develop against. MongoDB’s flexible document model allows us to store objects in the same structure as our applications, without requiring transformations or mappings between the code and persistence layers. This makes it much simpler for our engineers to interact with the database. Our UI is written in AngularJS and JavaScript, while the application layer is developed in Java. MongoDB’s JSON documents fit perfectly into this environment.
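
To make the last two points concrete, here is a minimal sketch of storing an order-like object and indexing it for ad hoc queries; the document shape, index, and names are illustrative, not Cisco’s actual schema:

```python
from pymongo import MongoClient, ASCENDING

client = MongoClient("mongodb://localhost:27017")
orders = client.commerce.orders  # hypothetical database/collection

# The application object is stored as-is, with no object-relational mapping layer.
orders.insert_one({
    "order_id": "ORD-1001",
    "partner": "Example Partner Inc.",
    "status": "BOOKED",
    "line_items": [
        {"sku": "ASR-9001", "qty": 2, "unit_price": 45000},
    ],
})

# A secondary index supports efficient queries on attributes chosen after the fact.
orders.create_index([("partner", ASCENDING), ("status", ASCENDING)])
print(orders.find_one({"partner": "Example Partner Inc.", "status": "BOOKED"}))
```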

How did you prove MongoDB as the right solution for you?
We needed to ensure MongoDB delivered both the scalability and availability demanded by the e-commerce platform. So we embarked on a rigorous PoC (Proof of Concept) to test MongoDB’s behavior under a variety of conditions:

  • We started by validating MongoDB’s performance against required peak load. We then increased that load by 5x. MongoDB scaled seamlessly, delivering the same low latency and high throughput. This gave us the headroom to accommodate projected business growth.
  • We then went on to explore MongoDB’s availability guarantees by inducing a range of failure conditions. We randomly killed primary replica set members, then measured recovery times. We were extremely impressed by how quickly MongoDB self-healed – typically in 2 seconds – without administrators having to step in to restore operations.
  • Finally, we tested resilience to outages in the underlying network and storage infrastructure. MongoDB’s distributed architecture was able to withstand these failures without interruption to the application or its users.

With the testing complete, we had the confidence we needed to move forward with MongoDB.
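
Cisco’s actual test harness wasn’t shared, but a heavily simplified version of the failover test described above (force the primary to step down, then time how long writes are unavailable) might look like this in Python; the hosts, database, and timings are illustrative only.

```python
import time
from pymongo import MongoClient
from pymongo.errors import PyMongoError

client = MongoClient("mongodb://host1,host2,host3/?replicaSet=rs0")  # hypothetical replica set

try:
    # Force an election, standing in for the "kill the primary" step of the PoC.
    client.admin.command("replSetStepDown", 60)
except PyMongoError:
    pass  # the stepping-down primary may drop the connection

# Measure how long writes are rejected before a new primary takes over.
start = time.time()
while True:
    try:
        client.poc.heartbeat.insert_one({"ts": time.time()})
        break
    except PyMongoError:
        time.sleep(0.1)
print("writes resumed after %.1f seconds" % (time.time() - start))
```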

At what stage are you now with the project?
We started development in February 2015, and went live in December of that year. We have focused first on the most mission-critical part of the app – capturing orders from partners and customers. Any issue in that part of the application would have materially affected Cisco’s financial results. We had to ensure there was no system impact as we migrated the database layer to MongoDB, so testing has been exhaustive.

It’s not just the database we’ve changed. As part of the migration, we have delivered 26 additional business capabilities that involved the development of 600,000 lines of new application-side code and 120,000 lines of rewritten code.

Our e-commerce platform is such a critical application, and naturally the business is averse to introducing any risk. As a result, we went through a lot of regression testing, and ran database operations in parallel for a while where we wrote transactions to both the existing database and to MongoDB. While this approach involved more work for my team, it was essential to giving the business owners the confidence they needed to move forward with the project.
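
The parallel run Dharmesh describes is essentially a dual-write pattern. A heavily simplified sketch of that idea is below; `write_to_legacy_db` is a placeholder for the existing database call, and all names are hypothetical rather than Cisco’s implementation.

```python
from pymongo import MongoClient

mongo_orders = MongoClient("mongodb://localhost:27017").commerce.orders  # hypothetical

def write_to_legacy_db(order):
    """Placeholder for the existing relational write path."""
    ...

def record_order(order):
    # During the parallel-run phase every transaction is written to both stores,
    # so the two systems can be compared before cutting over to MongoDB.
    write_to_legacy_db(order)
    mongo_orders.insert_one(dict(order))
```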

With this part of the project successfully delivered, we are now actively moving the remaining parts of the application across to MongoDB, and expect to be done in around 12 months. We are also spending more time with other application owners in Cisco who have learned about what we’ve achieved with MongoDB, and are now seeking to replicate it in their own domains.

What does your MongoDB deployment look like?
Currently we are running MongoDB Enterprise Advanced on a 5-node replica set across three data centers located across the US. This architecture gives us the performance, scale, and fault tolerance demanded by the business. We use MongoDB Ops Manager extensively to monitor node health and query performance – generating alerts so that my team can proactively address any issues before impact to our users.
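
Beyond the node count and data center layout, deployment specifics weren’t shared, but an application connecting to a geographically distributed five-member replica set would typically use a majority write concern so that acknowledged writes survive the loss of a data center. A sketch with placeholder host names:

```python
from pymongo import MongoClient

client = MongoClient(
    "mongodb://dc1-a.example.net,dc1-b.example.net,dc2-a.example.net,"
    "dc2-b.example.net,dc3-a.example.net/?replicaSet=ecomm",
    w="majority",                       # acknowledged by a majority of the 5 members
    readPreference="primaryPreferred",  # read from the primary, with a fallback
)
orders = client.commerce.orders
```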

What have the results of the migration to MongoDB been?
The results have been excellent – we’ve encountered no issues since launch. We’ve just closed our first full quarter on the new platform, with perfect performance. The maximum query latency in MongoDB is close to 10x lower than the latency target we specified at the start of the project. Pages render in 600–700ms at the 99th percentile, compared to 5 seconds in the old system.

We've also completed a MongoDB upgrade since launch. In my experience with other databases, upgrading can be an arduous operation. Plus, for such a critical revenue-generating application, we were not able to schedule a downtime window. Using Ops Manager, we were able to upgrade from MongoDB 3.0 to 3.2 in less than 5 minutes, and without any service interruption.

You’ve taken a look at MongoDB 3.4. What is most interesting to you in the new release?
There are a couple of new features that really stand out:

  1. Faceted navigation. This allows us to deliver an enriched search experience to our users while streamlining our environment (see the sketch after this list).
  2. MongoDB Compass. This is an essential tool for my DBAs. It enables them to fully control the database schema – from enforcing data governance to optimizing query performance and user experience.
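
Faceted navigation in MongoDB 3.4 is exposed through aggregation stages such as `$facet` and `$bucket`. A minimal sketch of computing two product facets in a single pass is shown below; the collection, fields, and boundaries are illustrative, not Cisco’s catalog:

```python
from pymongo import MongoClient

products = MongoClient("mongodb://localhost:27017").commerce.products  # hypothetical

facets = products.aggregate([
    {"$match": {"category": "routers"}},
    {"$facet": {
        # Two independent groupings computed over the same matched set.
        "by_family": [{"$group": {"_id": "$family", "count": {"$sum": 1}}}],
        "price_bands": [{"$bucket": {
            "groupBy": "$list_price",
            "boundaries": [0, 1000, 10000, 100000],
            "default": "other",
            "output": {"count": {"$sum": 1}},
        }}],
    }},
])
print(list(facets))
```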

What guidance would you give to other teams planning transformation initiatives with MongoDB?
There are four best practices I’d recommend:

  1. Make sure you identify the key business metrics that will define success, and get buy-in from stakeholders.
  2. Work with the MongoDB solution architecture and consulting teams. They provide unparalleled expertise in key areas such as schema design, system sizing, and performance best practices. You will be accelerating your path to success by leveraging these teams.
  3. Invest in regression testing and in backwards compatibility.
  4. Don’t build your project in isolation. Instead, fully think through impacts to other applications and develop APIs where necessary to reduce tight coupling between systems.

Dharmesh, thank you for sharing your experiences with the MongoDB community.

Want to get a head-start on your digital transformation initiative?

Download our Relational Database to MongoDB Migration Guide



About the Author - Mat Keep

Mat is a director within the MongoDB product marketing team, responsible for building the vision, positioning and content for MongoDB’s products and services, including the analysis of market trends and customer requirements. Prior to MongoDB, Mat was director of product management at Oracle Corp. with responsibility for the MySQL database in web, telecoms, cloud and big data workloads. This followed a series of sales, business development and analyst / programmer positions with both technology vendors and end-user companies.

MongoDB Atlas Customer Spotlight: Checkr

MongoDB Atlas continues to explode in popularity amongst our users. One of the big reasons MongoDB has grown to be the fourth largest database in the world is its users’ willingness to share their implementation successes.

Checkr is a company that provides an API that helps businesses perform modern and compliant background checks. Their discovery of, and eventual migration to, Atlas was recently shared publicly by Sašo Matejina of the Checkr team.

Checkr

The blog post "MongoDB cluster migration with zero downtime" provides a detailed look at why they decided to make the move to Atlas, and even includes additional software written in Go by Sašo and his team. I got a chance to catch up with Sašo recently to discuss his blog post and some other thoughts on his migration, open source software, and MongoDB Atlas.

Jay: How long have you been working with MongoDB personally and at Checkr?

Sašo: I started using MongoDB in 2010, right after it shifted its model to open source, so for the last several years I have used, deployed, and managed MongoDB at different scales. Checkr was built using MongoDB from the start and we are actively using it with our microservices.

Jay: What was your most compelling reason to switch to Atlas?

Sašo: We started evaluating the product a couple of months ago, and we were satisfied with the easy management, the ability to scale our clusters with the click of a button, the flexibility on storage, and securing our data with at-rest encryption.

Jay: Talk to me about your support experience a bit. Did you need to reach out to our team? What was it like?

Sašo: We were quite active on support channels as our migration required some special permissions that were not available in Atlas at the time. Support was helpful, and we like that the Atlas team listened to us and made the necessary product changes.

Jay: What's your favorite feature of MongoDB Atlas?

Sašo: I would go with ease of management and scaling, but there are a lot of other cool features, like access control with support for teams and alerting at the host level. There are also some features we would like to see in the future, such as a slow query log and index management.

Jay: How do you feel about the performance you see with Atlas?

Sašo: Performance is good, and the web interface is very responsive.

Jay: You wrote a wrapper to sync additional data after your data import, was this your first open source project?

Sašo: I've done and contributed to open source before. We are all using open source every day and it's important and it feels good to give something back to the community.

Jay: Would you recommend MongoDB Atlas as an alternative to other existing MongoDB-as-a-service offerings?

Sašo: I would recommend it because I think it provides a good product with competitive pricing.

Great to hear about another successful migration. Tell me more about yours! Email us at cloud-tam@mongodb.com and share your success!

About the Author - Jay Gordon

Jay is a Technical Account Manager with MongoDB and is available via our chat to discuss MongoDB Cloud Products at https://cloud.mongodb.com.