GIANT Stories at MongoDB

Modernizing and Protecting Data Center Operations with MongoDB and Dell EMC

As part of our ongoing series highlighting our partner ecosystem, we recently sat down with Dell EMC Global Alliances Director Tarik Dwiek and Director of Product Management Philip Fote to better understand how Dell EMC and MongoDB partner to help modernize and protect data center operations.

What do you want customers to know about the MongoDB and Dell EMC relationship?
Tarik: We have been partnering for over a year on Dell EMC Flash and software-defined platforms, and the traction has been amazing. To fully realize the potential of MongoDB, customers need to modernize their infrastructure and transform their data center operations. At Dell EMC, our strategy is to help customers achieve this modernization by taking advantage of four key pillars: flash, software-defined, scale-out, and cloud-enabled solutions. In addition, we are working on a data protection strategy for enterprise-grade backup and restore of MongoDB.

Can you further explain how this strategy relates directly to MongoDB?
Tarik: First off, MongoDB unlocks unparalleled performance at the database layer. This is where flash is essential, meeting those performance requirements with compelling economics. Second, scale-out architectures like MongoDB have become a requirement because customers are generating orders of magnitude more data. Third, many organizations are implementing a software-defined data center. This model automates the deployment and configuration of IT services, resulting in agility and flexibility for managing data services. Finally, we want to ensure that the on-prem data center can leverage public cloud economics non-disruptively.

Tell us more about Dell EMC Data Protection solutions.
Philip: At Dell EMC, we believe data needs to be protected wherever it lives and no matter what happens. With this in mind, we start with the reality that data protection cannot be one size fits all in terms of service levels. Protection and availability should be based on data value and service levels that align to business objectives. Dell EMC looks at data protection as a continuum that spans over many protection tiers, including availability, replication, backup, snapshots, and archive; we offer products and solutions that span this continuum. With this, customers can tailor their data protection solution to best serve their specific needs.

What is Data Domain?
Philip: Dell EMC Data Domain systems deliver industry-leading protection storage. Data Domain can reduce the amount of disk storage needed to retain and protect data by ratios of 10-30x and greater. It can scale up to 150 PB of logical capacity managed by a single system, with throughput up to 68 TB/hour. Data Domain systems make it possible to complete more backups in less time and provide faster, more reliable restores. The Data Domain Operating System (DD OS) is the intelligence behind Data Domain systems that makes them the industry’s most reliable and cloud-enabled protection storage.

What is DD Boost?
Philip: DD Boost provides advanced integration between Data Domain systems and leading backup and enterprise applications. DD Boost distributes parts of the deduplication process to the backup server or application client to speed backups by up to 50 percent and reduce bandwidth requirements by up to 99 percent.

What is DD Boost file system plug-in?
Philip: The BoostFS file system plug-in makes DD Boost immediately available to workloads that previously couldn’t use it, through a standard file system interface. BoostFS can be deployed in minutes, reducing backup windows and storage capacity requirements.

Why did you choose to certify MongoDB with BoostFS?
Philip: Dell EMC is committed to providing customers a holistic data protection strategy that evolves with changes in the market. The adoption of NoSQL open source databases is one of those changes, and MongoDB is a market leader. This new partnership with the Data Domain ecosystem will better allow our customers to add MongoDB workloads to their existing infrastructure. BoostFS provides all the benefits and efficiencies of DD Boost, and does so in a simple, cost effective manner. With Dell EMC and MongoDB, customers are now given a valuable, synergistic solution built from two industry leaders.

What MongoDB configurations are supported with BoostFS?
  • Database: MongoDB v2.6, 3.0, 3.2, and 3.4 (future)
  • Storage Engines: MMAPv1 and WiredTiger
  • Backup Tools: Ops Manager 2.0.7, mongodump
  • Data Domain: All Platforms and DDVE
  • DD OS: v6.0
  • BoostFS: v1.0

For more information or to ask questions about BoostFS with MongoDB, please visit the Data Domain Community web site.

Where do you see this relationship going?
Philip: As the Product Manager for DD Boost and BoostFS, my responsibilities include running the partner ecosystem for DD Boost, so I have a lot of experience in dealing with partners. When working in that capacity, it’s easy to separate the good from the bad. Working with MongoDB has been great from the start – they have been responsive, flexible, and proactive in solving problems. Both firms are excited about the solution being offered today, and discussions have already started on extending this solution to cloud use cases.

What is the main use case for MongoDB with BoostFS?
Philip: One of the main use cases for BoostFS is to provide an enterprise backup and recovery solution with the option to replicate to a remote site. This secondary site can be used for disaster recovery or long-term retention. The BoostFS plug-in resides on the MongoDB Ops Manager server as a Linux file system mount point, and the DD Boost protocol transports the data that Ops Manager writes to the file system to and from the Data Domain system. Backups are then replicated using MTree replication to a remote Data Domain system.

MongoDB and Boost

What are the benefits you’ll get with BoostFS for MongoDB as opposed to Network File System (NFS)?
Philip: BoostFS offers advanced features while retaining the user experience you get with NFS, including load balancing, failover, and security. The chart below shows the benefits of BoostFS over NFS. Details on these features can be found on DellEMC.com or at the Data Domain User Community site.

BoostFS for MongoDB

What exciting things can we look forward to next from MongoDB and Dell EMC?
Tarik: We have invested heavily in hyper-converged infrastructure. More and more customers are seeing the benefits in shifting their focus from maintaining infrastructure to innovating their application. We see tremendous potential in validating and eventually embedding MongoDB into our converged offerings.

Thank you for speaking with us, Tarik and Philip. If you’d like to learn more:

Dell EMC and MongoDB Solutions Brief



Leaf in the Wild: How Loopd uses MongoDB to Power its Advanced Location-Tracking Platform for Conferences

Leo Zheng

Customer Stories

Conferences can be incredibly hectic experiences for everyone involved. You have attendees wanting to meet and exchange information, sponsors and exhibitors looking to maximize foot traffic to their booths, and the conference hosts trying to get a sense of how they can optimize their event and if it was all worth it in the end.

While sponsors usually do get a lead list immediately after an event for their troubles, attendees often struggle to remember who they actually spoke to and event hosts are often left in the dark about what they can do to maximize the returns on their investments. Enter Loopd, an advanced events engagement platform.

I sat down with their CEO, Brian Friedman, to understand how they’re using MongoDB to help conference attendees and event hosts separate the signal from the noise.

Tell us about Loopd.
Loopd provides physical intelligence for corporate events. We help corporate marketers learn how people interact with each other, with their company, and with their company's products. The Loopd event engagement system is the industry's only bi-directional CRM solution that enables the exchange of content and contact information passively and automatically. We equip conference attendees with Loopd wearable badges, which can be used to easily exchange contact information or gain entry into sessions. Through our enterprise IoT analytics and sensors, we then gather and interpret rich data so that marketers have a more sophisticated understanding of business relationships and interactions at conferences, exhibits and product activation events.

Some of our clients include Intel, Box, Twilio, and MongoDB.

Bluetooth LE Loopd Badges

How are you using MongoDB?
We use MongoDB to store millions of datapoints from connected advertising and Bluetooth LE Loopd Badges on the conference floor. All of the attendee movement data captured by the Loopd Badge at an event can be thought of as time series data associated with location information. We track each Loopd Badge’s location and movement path in real time during the event. As a result, we handle heavy write operations during an event to make sure any and all calculations are consistent, timely, and accurate.

We also use the database for real-time analysis. For example, we calculate the number of attendee visits and returns, and average visit durations, in near real time. We use the aggregation framework in MongoDB to make this happen.
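
As a concrete illustration, an aggregation of this kind can roll visits and dwell times up by zone. The collection and field names below are our own invention, not Loopd’s actual schema:

```javascript
// Illustrative badge-event documents:
//   { badgeId: "b-1024", zone: "expo-hall", ts: ISODate(...), durationSecs: 45 }
db.badge_events.aggregate([
  // Restrict to the current event day
  { $match: { ts: { $gte: ISODate("2016-06-28T00:00:00Z") } } },
  // One bucket per attendee per zone: visit count and total dwell time
  { $group: {
      _id: { badgeId: "$badgeId", zone: "$zone" },
      visits: { $sum: 1 },
      totalSecs: { $sum: "$durationSecs" }
  } },
  // Roll up per zone: unique visitors and average dwell time per visitor
  { $group: {
      _id: "$_id.zone",
      uniqueVisitors: { $sum: 1 },
      avgSecsPerVisitor: { $avg: "$totalSecs" }
  } }
])
```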

What did you use before MongoDB?
Before MongoDB, we used PostgreSQL as our main data store. We used Redis as a temporary data buffer queue for storing new movement data. The data was dumped, inserted, and updated into rows in the SQL database once per second. The raw location data was read and parsed from the SQL database into a user-readable format. We needed a temporary buffer because the high volume of insert and update requests drained available resources.

What challenges did you face with PostgreSQL?
With PostgreSQL, we needed a separate Redis caching server to buffer write and update operations before storing them in the database, which added architectural and operational complexity. It also wasn’t easy to scale as it’s not designed to be deployed across multiple instances.

How did MongoDB help you resolve those challenges?
When we switched to MongoDB from PostgreSQL, our write throughput significantly increased, removing the need for a separate caching server in between the client and the database. We were able to halve our VM resource consumption (CPU power and memory), which translated to significant cost savings. As a bonus, our simplified underlying architecture is now much easier to manage.

Finally, one of the great things about MongoDB is its data model flexibility, which allows us to rapidly adapt our schema to support new application demands, without the need to incur downtime or manage complex schema migrations.

Please describe your MongoDB deployment.
We typically run one replica set per event. The database size depends on the event — for MongoDB World 2016, we generated about 2 million documents over the course of a couple of days. We don’t shard our MongoDB deployments yet but having that ability in our back pocket will be very important for us going forward.

At the moment, all of our read queries are executed on the secondaries in the replica set, which means write throughput isn’t impacted by read operations. The smallest analytics window in our application is a minute, which means we can tolerate any eventual consistency from secondary reads.
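
As an illustration of that pattern, a shell query can be directed at the secondaries with a read preference (the collection and field names are assumed):

```javascript
// Route an analytics read to a secondary. Results may lag the
// primary slightly, which is acceptable given the one-minute
// analytics window described above.
db.badge_events.find({ zone: "expo-hall" })
               .readPref("secondaryPreferred")
```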

Our MongoDB deployments are hosted in Google Cloud VM instances. We’re exploring using containers but they’re currently not in use for any production environments. We’re also evaluating Spark and Hadoop for doing some more interesting things with the data we have in MongoDB.

What version of MongoDB are you running?
We use MongoDB 3.2. We find the added document validation feature very valuable for checking data types. While we will still perform application-level error validation, we appreciate this added level of security.
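
For reference, here is a minimal sketch of 3.2-style document validation, assuming an invented collection and fields rather than Loopd’s real schema:

```javascript
// MongoDB 3.2 validators are ordinary query expressions
// ($jsonSchema arrived in a later release); in 3.2, $type takes
// numeric BSON type codes.
db.createCollection("badge_events", {
  validator: {
    badgeId: { $type: 2 },        // BSON string
    ts: { $type: 9 },             // BSON date
    durationSecs: { $exists: true }
  },
  validationAction: "error"       // reject non-conforming writes
})
```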

What advice do you have for other companies looking to start with MongoDB?
MongoDB is flexible, scalable, and quite developer and DBA friendly, even if you’re used to RDBMS.

We would recommend familiarizing yourself with the basic concepts of MongoDB first, leaning heavily on the community as you learn. I’d also recommend reading the production notes to optimize system configuration and operational parameters.

Brian, thanks for taking the time to share your story with the MongoDB community.

Thinking of migrating to MongoDB from a relational database? Learn more from our guide:

Download RDBMS Migration Guide


Transforming Customer Experiences with Sitecore and MongoDB Enterprise

Modern customers live across multiple channels, creating data with every tweet, comment, swipe, and click. They expect deep personalization based on their interactions with a company's brand and won’t settle for anything less than instant gratification. Customer data is a company’s lifeblood. But it can’t help a company if it’s strewn across an organization, locked away in siloed systems. Designed to alleviate IT organizations’ data burden and empower marketers, Sitecore Experience Database (xDB) is a Big Marketing Data repository that collects all customer interactions, connecting them to create a comprehensive, unified view of the individual customer.

Sitecore is a leader in this space and their product makes data available to marketers in real-time, for automated interactions across all channels. xDB is a critical component of Sitecore Experience Platform™, a single platform that allows you to create, deliver, measure, and optimize experiences for your prospects and customers. xDB is powered by MongoDB, and collects and connects all of a customer’s interactions with a company's brand - including those on other customer-facing platforms such as ERP, CRM, customer service, and non-Sitecore websites, creating comprehensive, unified views of each customer. Those views are available to your marketers in real-time to help you create catered customer experiences across all your channels. Sitecore is one of our best and most strategic partners and we are proud to state the relationship is stronger than ever.

Recently, Sitecore launched Sitecore® Experience Platform 8.2, which includes new features such as advanced publishing, e-commerce enhancements, and data-at-rest encryption. Encryption is a critical component of any application, and as a best practice Sitecore recommends that all xDB deployments encrypt data at rest. MongoDB provides comprehensive native at-rest database encryption through the WiredTiger storage engine. There is no need for third-party applications that may only encrypt files at the application, file system, or disk level. MongoDB’s WiredTiger storage engine is fully integrated and allows enterprises to safeguard their xDB deployments by encrypting and securing customer data at rest.
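
As a sketch of how this is switched on, native encryption at rest is a startup option of MongoDB Enterprise; the paths below are placeholders, and a KMIP key server rather than a local keyfile is the usual production choice:

```
# MongoDB Enterprise: start mongod with the WiredTiger encrypted
# storage engine. A local keyfile is shown for brevity; a KMIP key
# server is the recommended production configuration.
mongod --enableEncryption \
       --encryptionKeyFile /etc/mongodb/encryption-keyfile \
       --dbpath /data/xdb
```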

MongoDB is the leading non-relational database on the market and an integral part of xDB. It is the ideal database for collecting varied interactions and connecting them to create a “single view” of your customers.

Store and analyze anything

Instead of using rows and columns, MongoDB stores data using a flexible document data model, allowing you to ingest, store, and analyze any customer interaction or data type.

Scale without limit

MongoDB enables you to handle up to hundreds of billions of visits or interactions per year. When your deployment hits the scalability limits of a single server (e.g., the CPU, memory, or storage is fully consumed), MongoDB uses a process called sharding to partition and distribute data across multiple servers. Automatic load balancing ensures that performance is consistent as your data grows. The database runs on commodity hardware so you can scale on-demand while keeping costs low.
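
A minimal sketch of how sharding is enabled, using an invented namespace and shard key rather than Sitecore’s actual xDB schema:

```javascript
// Sharding is enabled per database, then per collection with a
// shard key; MongoDB then balances chunks across shards
// automatically.
sh.enableSharding("analytics")
sh.shardCollection("analytics.interactions", { _id: "hashed" })
```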

Minimize downtime

MongoDB supports native data replication with automated failover in the event of an outage. High availability and data redundancy is built into each MongoDB replica set, which serves as the basis for all production deployments of the database. The database’s self-healing architecture ensures that your team will always have access to the tools they need to deliver the best customer experience.

Deploy anywhere

MongoDB can be deployed in your data center or in the cloud. MongoDB management tools providing monitoring, backup and operational automation are available for each type of deployment.

In xDB, the collection database acts as a central repository for storing contact, device, interaction, history and automation data. An optimal collection database configuration helps organizations increase the availability, scalability, and performance of their Sitecore deployments.

With MongoDB, companies can ingest, store, and analyze varied data from billions of visits with ease. MongoDB scales horizontally across commodity servers, allowing customers to cost-effectively grow their deployments to handle increasing data volumes or throughput.

We offer a number of products and customized services to ensure your success with Sitecore xDB. If you’re interested in learning more about how we can help, click here and a member of our team will be in touch with you shortly.

MongoDB Enterprise Advanced is the best way to run MongoDB in your data center. It includes the advanced features and the round-the-clock, enterprise-grade support you need to take your deployment into production with the utmost confidence. Features include:

  • Advanced Security
  • Commercial License
  • Management Platform
  • Enterprise Software Integration
  • Platform Certification
  • On-Demand Training
  • Enterprise-grade support, available 24 x 365

The MongoDB Deployment for Sitecore consulting engagement helps you create a well-designed plan to deploy a highly available and scalable Sitecore xDB. Our consulting engineer will collaborate with your teams to configure MongoDB’s replication and sharding features to satisfy your organization’s requirements for Sitecore xDB availability and performance.

Click here to learn more


About the Author - Alan Chhabra

Alan is responsible for Worldwide Partners at MongoDB, which includes System Integrators, ISVs, and Technology Alliances. Before joining the company, Alan was responsible for WW Cloud & Data Center Automation Sales at BMC Software, managing a $200M annual revenue business unit that touched over 1,000 customers. Alan has also held senior sales, services, engineering & IT positions at Egenera (a cloud pioneer), Ernst & Young consulting, and the Charles Stark Draper Laboratory. Alan is a graduate of the Massachusetts Institute of Technology, where he earned his B.S. in mechanical engineering and his master’s in aerospace engineering.

Leaf in the Wild: KPMG France Enters the Cloud Era with New MongoDB Data Lake

Love it or loathe it, the term “big data” continues to gain awareness and adoption in every industry. No longer just the preserve of internet companies, traditional businesses are innovating with “big data” applications in ways that were unimaginable just a few years ago.

A great example of this is KPMG France’s deployment of a MongoDB-based data lake to support its accounting suite named Loop, and the release of its industry-first financial benchmarking service – enabling KPMG France customers to unlock new levels of insight into how each of their businesses are really performing. In the true spirit of big data, this application would have truly overwhelmed the capabilities of traditional data management technologies. I spoke with Christian Taltas, Managing Director of KPMG France Technologies Services, to learn more.

Can you start by telling us about KPMG France?
KPMG is one of the world’s largest professional services firms operating as independent businesses in 155 countries, with 174,000 staff. KPMG provides audit, tax and advisory services used by corporations, governments and not-for-profit organizations.

KPMG France provides accounting services to 65,000 customers. I am the managing director of KPMG Technologies Services (KTS), a software company subsidiary of KPMG France. KTS developed Loop, a complete collaborative accounting solution which is used by KPMG France’s Certified Public Accountants (CPAs) and their clients.

Please describe how you use MongoDB.
MongoDB is the database powering the Loop accounting suite, used by KPMG’s 4,800 CPAs. The suite is also currently used in collaboration with around 2,000 of KPMG’s customers. We are expecting more than 20,000 customers to adopt Loop’s collaborative accounting within the next 18 months.

What services does MongoDB provide for the accounting suite?
It serves multiple functions for the suite:

Data Lake: All raw accounting data from our customers’ business systems, such as sales data, invoices, bank statements, cash transactions, expenses, payroll and so on, is ingested from Microsoft SQL Server into MongoDB. This data is then accessible to our CPAs to generate the customer’s KPIs. A unique capability we have developed for our customers is financial benchmarking. We can use the data in the MongoDB data lake to allow our customers to benchmark their financial performance against competitors operating in the same industries within a specified geographic region. They can compare salary levels, expenses, margin, marketing costs – in fact almost any financial metric – to help determine their overall market competitiveness against other companies operating in the same industries, regions and markets. The MongoDB data lake enables us to manage large volumes of structured, semi-structured, and unstructured data, against which we can run both ad-hoc and predefined queries supporting advanced analytics and business intelligence dashboards. We are continuously loading new data to the data lake, while simultaneously supporting thousands of concurrent users.
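
A simplified sketch of what such a benchmarking query could look like in the aggregation pipeline; the collection and field names are assumptions, not KPMG’s actual schema:

```javascript
// Hypothetical benchmarking query: average key financial metrics
// across peers in the same industry, region, and year.
db.financials.aggregate([
  { $match: { industry: "restaurants", region: "Ile-de-France", year: 2016 } },
  { $group: {
      _id: "$industry",
      companies: { $sum: 1 },
      avgMarginPct: { $avg: "$marginPct" },
      avgPayrollCost: { $avg: "$payrollCost" }
  } }
])
```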

Metadata Management: Another unique feature of our accounting suite is the ability to customize reporting for each customer, based on specific criteria they want to track. For example, a restaurant chain will be interested in different metrics than a construction company. We enable this customization by creating a unique schema for each customer which is inherited from a standard business application schema, and then written to MongoDB. It stores the schema classes for each customer, which are then applied at run time when accounts and reports are generated. The Loop application has been designed as a business framework that generates reports in real time, running on top of Node.js. MongoDB is helping us manage the entire application codebase in order to deliver the right schemas and application business modules to each user depending on their role and profile, i.e., bookkeeper, CPA, or sales executive. It is a very powerful feature enabled by the flexibility of the MongoDB document data model that we could not have implemented with the constraints imposed by a traditional relational data model.

Caching Layer: The user experience is critical, so we use MongoDB as a high-speed layer to manage user authentication and sessions.

Logging Layer: We also use MongoDB to store the Loop application’s millions of client requests each day. This enables us to build Tableau reports on top of the logs to troubleshoot production performance issues for each user session, and for each of the 220 regional KPMG sites spread across France. We are using the MongoDB Connector for BI to generate these reports in Tableau.
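
To illustrate the shape of this workload, here is a hypothetical request-log document and a supporting index; the names are invented for this sketch:

```javascript
// Illustrative request-log document and the index that supports
// per-site, per-session troubleshooting queries.
db.request_logs.insert({
  sessionId: "7f3a2c",
  site: "KPMG-Lyon",
  path: "/loop/reports/balance",
  durationMs: 184,
  ts: new Date()
})
db.request_logs.createIndex({ site: 1, sessionId: 1, ts: -1 })
```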

Why did you choose MongoDB?
When we started development back in 2012, we knew we needed schema flexibility to handle the massive variances in data structures the accounting suite would need to store and process. This requirement disqualified traditional relational databases from handling the caching, metadata management, and KPI benchmarking computation. As we explored different NoSQL options, we were concerned that we’d over-complicate our architecture by running separate caches and databases. However, in performance testing, MongoDB offered the flexibility and scalability to serve both use cases. It outperformed the NoSQL databases and dedicated caches we tested, and so we took the decision to build our platform around MongoDB.

As our accounting suite is built on JavaScript, close integration between the JavaScript application and the database was also a significant advantage in helping us accelerate development cycles.

As we were developing our new financial benchmarking service last year, we evaluated Microsoft’s Azure Cosmos DB (note, at the time this was called DocumentDB), but MongoDB offered much richer query and indexing functionality. We also considered building the benchmarking analytics on Hadoop, but the architecture of MongoDB, coupled with the power of the aggregation pipeline gave us a much simpler solution, while delivering the data lake functionality we needed. Aggregation enhancements delivered in MongoDB 3.2, especially the introduction of the $lookup operator, were key to our technology decisions.
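
For readers unfamiliar with it, $lookup performs a left outer join within the aggregation pipeline. A minimal illustration with invented collection and field names:

```javascript
// $lookup (new in MongoDB 3.2) joins documents from a second
// collection into the pipeline.
db.invoices.aggregate([
  { $match: { year: 2016 } },
  { $lookup: {
      from: "customers",
      localField: "customerId",
      foreignField: "_id",
      as: "customer"
  } },
  { $unwind: "$customer" },
  { $project: { amount: 1, "customer.industry": 1, "customer.region": 1 } }
])
```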

Can you describe what your MongoDB deployment looks like?
Both the caching layer and metadata management run on dedicated three-node replica sets. This gives the accounting suite fault resilience to ensure always-on availability. The metadata is largely read-only, while the caching layer serves a mixed read/write workload.

The data lake is deployed as a sharded cluster, handling large batch loads of data from clients’ business systems while concurrently serving complex analytics queries and reporting to the CPAs.

We are running MongoDB on Windows instances in the Microsoft Azure cloud, after migrating from our own data center. We needed to ensure we could meet the scalability demands of the app, and the cloud is a better place to do that, rather than investing in our own infrastructure.

How do you support and manage your deployment?
We use MongoDB's fully managed database, MongoDB Atlas, and have access to 24x7 proactive support from MongoDB engineers. We have also recently used the Production Readiness package from MongoDB consulting services.

The combination of the cloud database service, professional services, and technical support is proving invaluable:

  • The MongoDB consultants reviewed our operational processes and Azure deployment plans, from which they were able to provide guidance and best practices to execute the migration without interruption to the business. They also helped us create an operations playbook to institutionalize best practices going forward.
  • MongoDB Atlas automated the configuration and provisioning of MongoDB instances onto Azure, and we rely on it now to handle on-going upgrades and maintenance. A few simple clicks in the UI eliminates the need for us to develop our own configuration management scripts.
  • MongoDB Atlas also provides high-resolution telemetry on the health of our MongoDB databases, enabling us to proactively address any issues before they impact the CPAs.
  • Data integrity is obviously key to our business, and so Atlas is invaluable in providing continuous backups of our data lake. We evaluated managing backup ourselves, but ultimately it was much more cost effective for MongoDB to manage it for us as part of the fully managed backup service available through Atlas.

As part of your migration to Azure, you also migrated to the latest MongoDB 3.2 release. Can you share the results of that upgrade?
One word – scalability. With MongoDB 3.2 now using WiredTiger as its default storage engine, we can achieve much higher throughput and scalability on a lower hardware footprint.

The accounting suite supports almost 7,000 internal and external customers today, with half of them connecting for an average of 5 hours every working day. But we plan to roll it out to 20,000 customers over the next 18 months. We’ve been able to load test the suite against our development cluster, and MongoDB has scaled to 5x the current sessions, analytics and data volumes with no issues at all. WiredTiger’s document level concurrency control and storage compression are key to these results.

What future plans do you have for the Loop accounting suite?
We want to automate more of the benchmarking, and enable further data exploration to build predictive analytics models for our customers. This will enable us to provide benchmarks against historic data, as well as evaluate likely future business outcomes. We plan on using the Azure Machine Learning framework against our MongoDB data lake.

How are you measuring the impact of MongoDB on your business?
We estimate that by selecting MongoDB for the accounting suite we achieved at least a 50% faster time to market than using any other non-relational database. The tight integration with JavaScript, flexible data model, out-of-box performance and sophisticated management platform have all been key to enabling developer productivity and reducing operational costs.

The accounting suite’s financial benchmarking service is a highly innovative application that provides KPMG France with significant competitive advantage. We have access to a lot of customer information which becomes actionable with our data lake built on MongoDB. It allows us to store that data cost effectively, while supporting rich analytics to give insights that other accounting practices just can’t match.

Christian, thanks for taking the time to share your story with the MongoDB community.

Thinking of implementing a data lake? Learn more from our guide:

Bringing Online Big Data to BI & Analytics with MongoDB


About the Author - Mat Keep

Mat is a director within the MongoDB product marketing team, responsible for building the vision, positioning and content for MongoDB’s products and services, including the analysis of market trends and customer requirements. Prior to MongoDB, Mat was director of product management at Oracle Corp. with responsibility for the MySQL database in web, telecoms, cloud and big data workloads. This followed a series of sales, business development and analyst / programmer positions with both technology vendors and end-user companies.

How Saavn Grew to India’s Largest Music Streaming Service with MongoDB

Building a push notification system on a sophisticated data analytics pipeline powered by Apache Kafka, Storm and MongoDB

2015 was an important year for the music industry. It was the first time digital became the primary revenue source for recorded music, overtaking sales of physical formats. Key to this milestone was the revenue generated by streaming services – growing over 45% in a single year.

As with many consumer services, the music streaming market is fragmented across the globe. In India – the second most populous country on the planet and second-largest smartphone market – Saavn has grown to become the sub-continent’s largest music service. It has 80m subscribers, experiencing a 9x increase in Daily Active Users (DAU) in just 24 months, with 90% of its streams served to mobile users. There are many factors that collectively have driven Saavn’s growth – but at the heart of it is data. And for this, they rely on MongoDB.

Saavn started out using MongoDB as a persistent cache, replacing an existing memcached layer. The team soon realised the database was versatile and flexible enough to serve as the system of record for its data on subscribers, devices, and user activity. It was MongoDB’s flexibility and scalability that proved instrumental in maintaining pace with Saavn’s breakneck growth.

Through its extensive collection of music, the company quickly attracted new users to its streaming service, but found engagement often dropped away. It identified that push notifications sent directly to client devices were key to reconnecting with users, and keeping them engaged by serving personalized playlists. At this year’s MongoDB World conference, CTO Sriranjan Manjunath presented how Saavn has used MongoDB as part of a sophisticated analytics pipeline to drive a 3x increase in user engagement.

As Sriranjan and his team observed, it wasn’t enough to simply broadcast generic notifications to its users. Instead, Saavn needed to craft notifications that provided playlists personalized to each user. Saavn built a sophisticated data processing pipeline that uses a scheduler to extract device, activity and user data stored in MongoDB. From there, it computes relevant playlists by analyzing a user’s listening preferences, activity, device, location and more. It then sends the computed recommendations to a dispatcher process that delivers the playlist to each user’s device and inbox. To refine personalizations, all user activity is ingested back into a Kafka queue where it is processed by Apache Storm and written back to MongoDB. Saavn is also expanding its use of artificial intelligence to better predict users’ interests, and is using MongoDB to store the resultant machine learning models and serve them in real time to the recommender application.

The system currently sends 30m notifications per day, but has been sized to support up to 1m per minute, providing plenty of headroom to support Saavn’s continued growth.

In his presentation, Sriranjan discussed how Saavn migrated from MongoDB 2.6 to MongoDB 3.0, taking advantage of the WiredTiger storage engine’s document level concurrency control to deliver improved performance. He shared his key learnings in modifying schema design to reflect the differences in how updates are handled by the underlying storage engine, and the usage of TTL indexes to automatically expire data from MongoDB. Sriranjan also discussed shard key selection to optimize uniform data distribution across the cluster, and the benefits of using MongoDB Cloud Manager for system monitoring and continuous backups, including integration with Slack for automated alerting to the ops team.
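
A TTL index of the kind mentioned is a one-line declaration; the collection and field below are illustrative:

```javascript
// A TTL index removes documents automatically once the indexed
// timestamp is older than expireAfterSeconds (30 days here).
db.user_activity.createIndex(
  { createdAt: 1 },
  { expireAfterSeconds: 60 * 60 * 24 * 30 }
)
```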

Click through to view Saavn’s presentation from MongoDB World

To learn more about managing real time streaming data, download:

The MongoDB and Kafka white paper


About the author - Mat Keep

Mat is a director within the MongoDB product marketing team, responsible for building the vision, positioning and content for MongoDB’s products and services, including the analysis of market trends and customer requirements. Prior to MongoDB, Mat was director of product management at Oracle Corp. with responsibility for the MySQL database in web, telecoms, cloud and big data workloads. This followed a series of sales, business development and analyst / programmer positions with both technology vendors and end-user companies.

Leaf in the Wild: Ogilvy & Mather Delivers Security Compliance with MongoDB Enterprise Advanced

Audit Event Logging with Real-Time Reporting Deployed Across Continents with Write-Local and Read-Anywhere MongoDB Cluster

Security and compliance is a top-of-mind issue for executives in all enterprises around the world. Enforcing robust security controls over your most critical digital assets and systems through fine-grained auditing is a key step towards defending against potentially costly breaches.

I had the opportunity to sit down with Addison Chappell, Enterprise Architect at Ogilvy & Mather, to discuss an innovative auditing application built on top of MongoDB.

Can you start by telling us a little bit about your company?
Ogilvy & Mather (O&M) is one of the largest marketing communications companies in the world. We comprise industry leading business units operating across a range of disciplines including advertising, public relations, branding, promotions and market analytics. O&M services Fortune Global 500 companies as well as local businesses through its network of more than 500 offices in 126 countries.

Please describe how you are using MongoDB
MongoDB is being used for our core auditing application, capturing authentication and authorization activities of all users as they access O&M’s systems. From events written to MongoDB, we are able to build an audit trail of system access, which is used by our support, compliance and security teams.

From the auditing application, our teams have fine-grained visibility into who did what and when, what privileges each user has, and how each account is configured. The teams can enforce security policies such as password resets; resolve application access issues; monitor application usage by user, business unit and region; and much more.

What are the key characteristics of the auditing application?
The application is write-heavy, with MongoDB ingesting tens of gigabytes of data every day, from tens of thousands of users distributed around the globe.

MongoDB is used for both data ingest, and in generating real time analytics. We are using the MongoDB aggregation pipeline to roll up key metrics, such as snapshots of how many users are active on our system at any one time.
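
A simplified sketch of such a rollup, using invented names rather than O&M’s actual schema:

```javascript
// Snapshot of active users per region over the last five minutes.
db.auth_events.aggregate([
  { $match: { ts: { $gte: new Date(Date.now() - 5 * 60 * 1000) } } },
  // Deduplicate: one bucket per (region, user) pair
  { $group: { _id: { region: "$region", user: "$userId" } } },
  // Count distinct users per region
  { $group: { _id: "$_id.region", activeUsers: { $sum: 1 } } }
])
```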

Did you start out on MongoDB, or were you using another database?
We originally built the application on a relational database, but it just was not able to keep up with the increasing size of our data set.

We also needed to ensure users got the same consistent low latency wherever they were located around the world. Creating a distributed database environment that spans continents proved extremely challenging using relational technology. It was hard to both capture data reliably, and query it within the performance SLAs demanded by the business.

What led you to MongoDB?
We discovered MongoDB back in 2013 while looking for alternatives to our SQL-based solutions. We chose MongoDB based on technical maturity, the size of the community actively using it, and the quality of support.

Please describe your MongoDB deployment
We have deployed a sharded MongoDB cluster in three data centers across two continents. We have active/active data centers in North America and on mainland Europe both serving read and write traffic, with a third data center in London housing replica set arbiters to protect against network partitions causing divergent copies of the database.

*O&M’s globally distributed MongoDB cluster: supporting local writes, with read-anywhere access*

As illustrated in our architectural diagram, each active data center is provisioned with two shards, each of which is configured as a replica set with a primary and local secondary node, and then two remote secondaries in the other data center. This way, we can achieve continuous availability in the event of a regional data center outage.

We use MongoDB zones (also known as Tag-Aware sharding) to partition our database according to user location. With MongoDB zones, we ensure audit event data is written to nodes local to the user, thereby minimizing the effects of network latency, and then we can read that data globally for centralized analytics and reporting.
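
In outline, tag-aware sharding pins chunk ranges to tagged shards. The shard names, namespace, and shard key below are illustrative:

```javascript
// Tag shards by data center, then pin each region's chunk range
// to its local shards (assumes a shard key of { region, userId }).
sh.addShardTag("shard-na-0", "NA")
sh.addShardTag("shard-eu-0", "EU")
sh.addTagRange("audit.events",
  { region: "NA", userId: MinKey },
  { region: "NA", userId: MaxKey }, "NA")
sh.addTagRange("audit.events",
  { region: "EU", userId: MinKey },
  { region: "EU", userId: MaxKey }, "EU")
```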

Do you use MongoDB’s commercial subscriptions?
We have built the auditing application on MongoDB Enterprise Advanced, providing us with access to the expert, proactive technical support required for this mission-critical application.

Through MongoDB Enterprise Advanced, we also get access to the Connector for BI for advanced analytics and data visualization, and Ops Manager for advanced operational tooling. We will be exploring both of these options in the future.

I understand you are planning to upgrade to the latest MongoDB 3.2 release later this year. Can you share your motivations for the upgrade?
There are three drivers behind the upgrade:

What have been the major benefits to you of switching from your previous relational database to MongoDB?

  1. Scalability to handle a data set growing at tens of GBs every day
  2. The ability to generate near real-time analytics against live operational data, even in a write heavy app like ours
  3. True cross-continent geo-distribution to support the performance and availability requirements of a global audience

What advice would you give to someone considering MongoDB for their next project?
MongoDB is very easy to get started with, but that doesn’t mean you shouldn't plan and architect your application carefully:

Addison, thanks for sharing your experiences with the MongoDB community.

If you are struggling with your traditional database, download our:

Relational to MongoDB Migration Guide


About the Author - Mat Keep

Mat is a director within the MongoDB product marketing team, responsible for building the vision, positioning and content for MongoDB’s products and services, including the analysis of market trends and customer requirements. Prior to MongoDB, Mat was director of product management at Oracle Corp. with responsibility for the MySQL database in web, telecoms, cloud and big data workloads. This followed a series of sales, business development and analyst / programmer positions with both technology vendors and end-user companies.

MongoDB and AXA France: Creating a Connected Homes Project

The Internet of Things (IoT) is a transformative technology, touching all of our lives – in the workplace, in the gym, on the road, and in so many other places and contexts. But it’s in the home where the IoT is making some of its largest strides.

The concept of the smart, connected home is revolutionizing how we protect our families and most valuable personal assets – whether it’s from intruders, fires, or floods, through to environmental controls. Insurance companies have been quick to identify the opportunity smart homes present to them, and to their customers. By providing an “IoT Hub”, insurers can aggregate sensor data from multiple devices in the home to allow their customers complete control to defend, protect and monitor their families and property from a single user interface.

At this year’s MongoDB World customer conference, AXA France developers Guillaume Chervet and Vincent Gillot presented their connected homes project, built on MongoDB, Node.js and the cloud. In their presentation, they discuss:

  • How they optimized schema design through multiple iterations to most efficiently represent customers, sensor data, and notifications
  • How they consume, process and alert against sensor events emitted within the same millisecond from multiple devices in a single connected home (a hedged sketch follows this list)
  • How they use the MongoDB aggregation pipeline to analyse and report against sensor data
  • How the move to MongoDB 3.2 has allowed them to simplify their application code. By taking advantage of new features such as document validation they are able to enforce structure over their IoT events schema; and the $lookup aggregation operator allows the team to JOIN data from multiple collections for richer analytics
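
To make the same-millisecond handling concrete, here is a hedged sketch of one way such events could be modeled; the schema is our own invention, not AXA’s:

```javascript
// One event per document; identity is the compound of home,
// device, and millisecond timestamp, so two devices firing in the
// same millisecond never collide.
db.sensor_events.insert({
  homeId: "home-42",
  deviceId: "smoke-01",
  type: "alarm",
  ts: ISODate("2016-06-28T18:00:00.123Z")
})
db.sensor_events.createIndex(
  { homeId: 1, deviceId: 1, ts: 1 },
  { unique: true }
)
```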

Guillaume and Vincent shared how the development team had come from SQL Server and Oracle database backgrounds before working with MongoDB on this project. Selected quotes from the team included:

MongoDB is really easy to install, to configure. Its immediate ease of use impressed me.

Developing new features in agile with sprint modes is easier with a NoSQL database than with a relational database.

Querying system is rich, manipulating data in JSON becomes intuitive, and my algorithms are simpler!

View the slides to learn more about AXA’s connected home project with MongoDB

Download our IoT & Big Data whitepaper, co-authored with Bosch Software Innovations to learn more about the challenges presented by managing IoT data.

Download the white paper


About the Author - Mat Keep

Mat is a director within the MongoDB product marketing team, responsible for building the vision, positioning and content for MongoDB’s products and services, including the analysis of market trends and customer requirements. Prior to MongoDB, Mat was director of product management at Oracle Corp. with responsibility for the MySQL database in web, telecoms, cloud and big data workloads. This followed a series of sales, business development and analyst / programmer positions with both technology vendors and end-user companies.

MongoDB and Stratio: Building an Operational Data Lake for One of Spain’s Largest Insurance Companies

MongoDB, Apache Spark, Zeppelin, Hadoop and Kafka Improve Customer Experience and Optimize Marketing ROI

Data lakes are playing an increasingly critical role in modern enterprise data architectures. Providing a centralized repository to aggregate data from multiple internal and external sources, the data lake – often built on Apache Hadoop – provides a foundation for sophisticated analytics that can improve customer insights, deliver higher operational efficiency and reduce costs.

Beyond just storing data, a key requirement of the data lake is being able to serve analytic models to real-time, operational applications. And that requires more than just Hadoop. Powerful operational databases are needed to make analytics models accessible and actionable within operational applications. Exposing these models to online applications makes our business processes smarter and more contextually aware – for example, presenting personalized recommendations to users, detecting and preventing fraudulent transactions while they are in flight, or predicting imminent failures in critical systems.

Mutua Madrileña, one of the largest insurance companies in Spain, recognized the importance of creating a data lake in its goal to improve customer experience and optimize marketing spend. They worked with the Pure Spark Platform from Stratio to implement an operational data lake, bringing together:

  • Apache Kafka and Apache Flume for data ingestion
  • Apache Hadoop for storing raw data
  • Apache Spark for analytical processing and machine learning, orchestrated by Apache Mesos
  • MongoDB and Postgres for serving analytics models to operational applications and reporting tools
  • R Studio and Apache Zeppelin for business intelligence and analytics
*Figure 1: Stratio Pure Spark Platform: creating an operationalized data lake*

Alvaro Santos, Big Data Architect at Stratio, presented the data lake built for Mutua Madrileña at this year’s MongoDB World conference. In his session, Alvaro discussed how the data lake is ingesting data from over 25 different sources to power a range of applications, including:

  • The creation of machine learning models to personalize user experience across web and mobile channels, present product recommendations, and classify insurance applicants by risk
  • Mapping of the customer journey through Mutua’s systems to understand user context and identify gaps in business processes
  • Collection and analysis of marketing campaign data to measure impact and improve performance.

Alvaro discussed the selection of technology for the data lake. Apache Spark was chosen because of its speed as a distributed processing engine, access to rich machine learning libraries, and developer-friendly APIs. MongoDB was chosen for:

  1. Its flexible data model, allowing rapidly changing, semi-structured data to be stored and processed with ease
  2. Rich secondary indexes that push query filtering down to the database, so operational applications can execute complex queries with low latency
  3. Automatic sharding to support a doubling of data volumes in the data lake, and native replication for always-on availability

View the slides to learn more about Mutua’s data lake journey:

Learn more about unlocking operational intelligence from the data lake with our new whitepaper:

Download the white paper


Leaf in the Wild: FACEIT Scales to 4m+ Gamers with MongoDB and Cloud Manager, Fuels Major New Partnerships with Industry Leaders

Computer gaming is one of the fastest growing industries on the planet. Revenues are expected to exceed $100bn for the first time this year, with games redefining how new generations of users consume and interact with media.

To be successful in this industry, companies need to rely on technology infrastructure that is agile, scalable and cloud-ready. That is why FACEIT, the world's leading competitive gaming platform, selected MongoDB.


I sat down with FACEIT CTO Maurizio Attisani to learn more about their journey from zero to over 4m users with MongoDB running in the cloud.

Can you start by telling us a little bit about your company?
We founded FACEIT in 2011. It has quickly become the leading platform for online competitions in Player versus Player (PvP) multiplayer games including League of Legends, Counter-Strike and DOTA 2. We have attracted over 4.3 million users since launch, and now every month host 2,000+ tournaments and manage 6 million game sessions. We also just partnered with Twitch, the video gaming platform, to create the world’s first professional eSports league that is offering teams co-ownership positions.

Please describe how you are using MongoDB
MongoDB is the main database underpinning our platform. We use it to orchestrate services between players, teams, and competitions. All user profiles and tournament data are managed by MongoDB.

On average, each tournament match involves 10 players generating streams of time-series statistics, all of which are written to MongoDB. We use the data we collect in MongoDB to drive sophisticated analytics that track player behavior, engagement, and competitive performances in tools such as Mixpanel and Keen IO.

Why did you select MongoDB for your gaming platform?
We evaluated several different database options before making our technology choice. Our primary requirement was the management of user profiles. Each player could have one or more games registered to their profile, and we’d be constantly extending the schema with additional attributes as we added new game features, tournament types and gaming platforms to our service.

We quickly recognized that the static data model of relational databases would inhibit development agility in evolving our platform features. Also, the need to perform costly JOIN operations between player tables and game tables at run time would restrict future scalability.

As a result, we explored document databases that provided us with both a flexible schema and rich document data model. This capability enabled us to embed related data into a single user object that could be quickly accessed in just one call to the database.

We took a look at several options. But MongoDB was a long way ahead in terms of query functionality and performance, community vibrancy and ecosystem strength, coupled with the quality of its documentation and online tutorials. There was also broader availability of developer skills in the market, and options to run MongoDB as a managed service in the cloud. We recognized these attributes would help us to accelerate our time to market and deliver a better customer experience.

How did you get started with MongoDB?
We live by cloud-first principles here at FACEIT, so we initially chose to consume MongoDB as a service from the Compose platform. It was great to get started with, but we quickly outgrew it as users and traffic scaled. So we migrated to our own database instances hosted on AWS, and contracted a third-party MongoDB support provider to run the platform with its remote DBA services. But as our growth accelerated, it became clear that we needed more than just reactive troubleshooting for any issues we encountered.

To put our growth into perspective, in the first six months after launch, we went from 300,000 to 1m users, and then in the next 12 months, we scaled from 1m to 4m+ users. To sustain this growth, we needed proactive support and access to dedicated consultants to help optimize our schema and queries, especially for the streams of time series data the platform needed to manage.

So what did you do next?
What better way to support our growth than partnering directly with the company behind MongoDB! This gives us direct access to its architects, engineers, and consultants who have worked with thousands of other companies, and who bring that aggregated expertise to optimizing our environment.

We use MongoDB Professional for access to proactive support, and Cloud Manager Premium for operational automation and database monitoring. We also use MongoDB’s consulting services:

  • The Major Version Upgrade package enabled us to complete the migration from MongoDB 2.6 to the latest 3.2 release in under one week. Our services experienced no downtime at all during the migration, and the upgrade has resulted in a reduced database footprint with significantly lower costs.
  • The Performance Evaluation and Tuning package has been instrumental in helping us redesign our schema to improve experience for our gamers.

We also make extensive use of the courses provided by MongoDB University to accelerate the on-ramp for new developers coming into the company. The result is that they are productive and contributing new features to our service much faster.

Can you describe your current environment?
We are running MongoDB across two replica sets provisioned on AWS EC2. We use Cloud Manager to configure and deploy new instances, which is super easy and quick for our developers.

We use New Relic to provide global oversight across our application stack, and then Cloud Manager for MongoDB-specific low-level telemetry. The monitoring data we get from Cloud Manager enables our developers to make more informed choices on query optimization. Its new visual query profiler makes it simple to identify slow-running queries, and provides recommendations on how to address them.

*Cloud Manager’s visual query profiler makes it fast to find and fix problem queries*

Most of our apps are written in Java, and we also use Ruby for transformations of game statistics stored in MongoDB.

You mentioned you had upgraded to the latest MongoDB 3.2 release. How is that performing for you?
It’s going great!

How are you measuring the impact of MongoDB on your business?
MongoDB has been central to us scaling from zero to over 4m users in just 18 months. This growth has given us the capacity to scale our service to support new partnerships for FACEIT with Microsoft Xbox and Twitch.

Expanding to new platforms demonstrates the benefit of MongoDB’s flexible document model. Adding game-specific player IDs to existing user profiles without downtime or changes to the codebase is impossible when you are tied to a relational database. With MongoDB, we don’t give it a second thought. The database is flexible enough to adapt and grow as our business continues to expand.
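
As a hedged illustration of that flexibility, adding a new platform-specific ID is a single update against the existing document; the collection and fields are invented, not FACEIT’s schema:

```javascript
// Adding a platform-specific player ID is a single field update;
// no schema migration, no downtime.
db.users.updateOne(
  { _id: "user-123" },
  { $set: { "gameIds.xbox": "FaceitFan", "gameIds.csgo": "STEAM_0:1:23456" } }
)
```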

What advice would you give someone who is considering using MongoDB for their next project?
MongoDB schema design is very straightforward – don’t be afraid to denormalize your data and optimize for the application’s query patterns. But also pay attention to future scaling needs – make sure you design your collections in such a way that it is easy to select a shard key when the time comes. The MongoDB documentation contains some good tutorials to help you on the right path.
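
As one illustration of that advice, assuming most queries are scoped to a single player, a compound shard key keeps a player’s data together while spreading the write load; the namespace and fields are invented:

```javascript
// A compound key on player and time keeps per-player reads on one
// shard while distributing writes across players.
sh.shardCollection("faceit.match_stats", { playerId: 1, ts: 1 })
```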

Maurizio, thanks for taking the time to share your experiences with the MongoDB community.

Start your free trial of MongoDB Cloud Manager today


Leaf in the Wild: Foyer Assurances SA Moves from Relational Databases to MongoDB to Accelerate Digital Transformation

Every enterprise is embarking on digital transformation – bringing applications and services online to more quickly and efficiently serve their customers and enter new markets. But some of their technology choices of the past are now holding them back.

I met with Damien Goosse, Chief Architect in the IT department at Foyer, to discuss how they turned to MongoDB to accelerate application innovation, and build new services that were just not possible with relational databases.

Can you start by telling us a little bit about your company?
The Foyer Group was founded back in 1922 and has grown to become market leader in Luxembourg’s insurance sector, and one of the country’s largest financial services companies, providing insurance, pension and asset management products to the consumer and corporate markets across Europe.

What challenges are you trying to address with MongoDB?
We have traditionally built our applications on top of relational databases. But we are finding that they are increasingly holding back our digital transformation initiatives, and our strategy to accelerate application delivery by embracing agile development methodologies and cloud computing.

How is MongoDB helping you address those challenges?
We have several applications running on MongoDB today, with many new projects in the pipeline.

Our first application with MongoDB was a migration from one of the leading commercial relational databases. The application is used to capture customer data, which then drives back-end operational processes. We increasingly engage with our customers through web and mobile channels, using online forms to collect the information required for insurance quotes, applications, claims and other business applications. The forms are rich data structures, which we had to flatten and split across relational tables. Any new attributes we needed to capture through the forms incurred painful migrations to our relational database schema. These migrations meant lots of coordination between developers, DBAs, and the operations team, which slowed down how quickly we could implement feature enhancements for the business.

So we decided to switch the application to MongoDB. Now forms are stored in their native structures as JSON documents directly within MongoDB. This has given us a number of benefits:

  • We have removed the complex ORM layers that were required in the previous implementation to translate between application objects and the relational database schema.
  • We can add new attributes and features without the convoluted coordination cycles of the past, and without performance impact to our customer-facing applications.
  • It’s much easier for our backend business applications that need to process the form data to extract JSON data directly from MongoDB, without converting the data between different formats.
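
As a hedged illustration (the document below is invented, not Foyer’s actual schema), a quote form keeps its nested shape when stored:

```javascript
// A quote form stored in its native nested structure, rather than
// flattened and split across relational tables.
db.forms.insert({
  type: "autoQuote",
  customer: { name: "J. Muller", email: "jm@example.lu" },
  vehicle: { make: "Audi", model: "A4", year: 2014 },
  coverage: ["liability", "theft", "glass"],
  submittedAt: new Date()
})
```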

As a result of the migration to MongoDB, my team can innovate faster, responding instantly to business requirements, and we’ve seen significantly greater application performance, which translates directly to improved customer experience.

I believe you have also built a new mobile application using MongoDB?
We have equipped our agents and partners with tablet devices, enabling them to more efficiently collect information when they are out with customers. All of the data they collect is stored locally on the device, and then securely synchronized back to our central MongoDB database when they are connected to the network.

This is a full-stack, end-to-end, JavaScript application. Using MongoDB’s flexible data model and native JSON support, we have been able to build this application in weeks, rather than the months it would have taken had we used our traditional relational database.

What led you to use MongoDB in the first place?
Recommendations from my developers. A number had used MongoDB in previous roles, and believed that it would be a great fit for us. I asked several members of the team to build a prototype app on MongoDB. They did it in hours. Not days or weeks.

Clearly there are many “NoSQL” choices available. For me, what was really critical in my final selection was production-proven deployments at scale, breadth of tooling, developer productivity, and community vibrancy. MongoDB was the clear leader across all of these dimensions.

Can you describe how you deploy MongoDB?
We are using the latest MongoDB 3.2 release. We deploy against MongoDB replica sets, which provide fault tolerance against failures and the ability to perform zero-downtime maintenance operations. As we bring more projects onto MongoDB, we will move our replica sets to sharded clusters. All of our hosts run Red Hat Linux and VMware.

We are customers of MongoDB Enterprise Advanced, which provides our teams with access to 24x7 proactive support. As part of Enterprise Advanced, we make extensive use of Ops Manager to automate the configuration, provisioning and upgrading of our deployment. We also use its continuous backup with point in time recovery. I didn’t want my team to spend time scripting all of these operations. Ops Manager allows us to run a very lean and highly productive ops organization.

We will also be integrating MongoDB into our LDAP infrastructure in the next few months.

Do you make use of other MongoDB Services?
Yes, we have used MongoDB’s Production Readiness consulting service. The consultant was able to advise on our overall architecture, and help with MongoDB configuration best practices. The quality of the advice was extremely high, and was key in accelerating the launch of our new MongoDB services. We did initially use a third party consultancy that had worked with us in the past, but they did not have the depth of expertise we expected.

What advice would you give to someone considering MongoDB?
Take time to study the MongoDB documentation, and the online MongoDB University courses. These are both excellent resources that will get you up to speed quickly.

I think what is important is to select a meaningful project, and to use it in order to demonstrate productivity gains and quality of the applications you can build with MongoDB. You will always encounter someone within your organization who is cautious about selecting any new technology. So use your project as an example, to prove what is possible when you think outside of the world of relational databases.

Damian, thank you for taking the time to share your experiences with the MongoDB community.

Want to explore moving from relational databases to MongoDB?

Download our Migration Guide