Editor's note: This post was edited on June 30, 2015 to reflect the change from MongoDB Management Service (MMS) to MongoDB Cloud Manager. Learn more here.
Leaf in the Wild posts highlight real-world MongoDB deployments. Read other stories about how companies are using MongoDB for their mission-critical projects.
24x7 Uptime Across Dozens of Sharded Clusters, Supported by MongoDB Enterprise Advanced, Cloud Manager & Just a Single Administrator
Square Enix is one of the world’s leading providers of gaming experiences, publishing such iconic titles as TOMB RAIDER and FINAL FANTASY. As more of its games moved online, Square Enix rapidly hit the scalability limits of relational databases and migrated to MongoDB. By adopting a multi-tenant database-as-a-service, Square Enix has been able to consolidate database instances, resulting in increased performance and reliability. Advanced operational tooling enables the operations team at Square Enix to scale dozens of database clusters on-demand and deliver 24x7 availability to gamers around the world, all with just one administrator.
With this year’s E3 Expo just around the corner, I had the opportunity to sit down with Tomas Jelinek, Senior Online Operations Administrator at Square Enix Europe, to discuss how his platforms have evolved to meet the demands of millions of dedicated online gamers around the world.
Can you start by telling us a little bit about Square Enix?
Square Enix is one of the leading and most influential providers of digital entertainment content in the world. We have a valuable portfolio of gaming titles including FINAL FANTASY, DRAGON QUEST, TOMB RAIDER, HITMAN, DEUS EX and JUST CAUSE, which collectively have sold hundreds of millions of units worldwide.
We are always seeking ways to push the boundaries of creativity and innovation by providing high quality entertainment content and products. As gaming is right at the intersection of big data, mobile, and cloud computing, our infrastructure platforms are critical to staying ahead of the market and giving gamers experiences they won’t get anywhere else.
We will be making some really interesting announcements at E3, so it’s an exciting time to be working as part of the Ops team that keeps all of our gaming platforms running.
Tell us how Square Enix is using MongoDB.
Originally we ran our games on a platform where data was stored in old-fashioned relational databases. But that wasn’t efficient as our titles multiplied, game complexity increased and datasets grew out of proportion. So we built our multi-tenant Online Suite – a central shared infrastructure used across the company. We deliver MongoDB-as-a-Service to all of our studios and developers. As part of this Online Suite we provide an API that allows the studios to use MongoDB to store and manage metrics, player profiles, info cast information, leaderboards and competitions. MongoDB is also used to enable players to share messages across all supported platforms, such as PlayStation, Xbox, PC, web, iOS and Android. Essentially, the Online Suite supports any functionality that is needed across multiple games.
Every title also needs to support its own specific in-game functionality, and so each is provisioned with dedicated infrastructure connected to MongoDB to store game state and player metrics, along with specific content and features. For example, Hitman Absolution introduced the ability for players to create their own contracts, and then share those with other players. This is all managed by MongoDB and Cloud Manager.
We also use MongoDB for inter-game and inter-player messaging. Our web and mobile sites use MongoDB for content management and product catalogs.
Just Cause 3 - dropping onto a platform near you.
Just Cause 3 © 2015 Square Enix Ltd
What were you using before MongoDB? Was this a new project or did you migrate from a different database?
The move to online gaming started back in 2007. We used relational databases to store player profiles and leaderboards, and to run analytics across metrics we collected from the game play. But as our online audience grew, the databases didn’t scale with us. We ended up bringing in teams of consultants to buy us more headroom, but the existing databases just couldn’t keep up.
We started to ask increasingly tough questions of our data. One complex analytics query took 3 weeks to run in our relational database! We knew it was time to start looking for an alternative.
How did you hear about MongoDB? Did you consider other alternatives?
We started our search for a replacement in 2011. We needed a database that could meet the requirements of both our development teams in Montreal and our ops team here in London.
The development teams were focused on how quickly they could build new games and add functionality to maximize game lifecycles. They also needed to ensure all of the operational and analytical functionality they demanded could be supported by the database. So a flexible schema and expressive query language were important to them.
In operations, we needed to validate scalability and robustness of the database. Customer experience is not something we could afford to compromise on, even if the developers loved the database! For ongoing manageability, we also needed to verify that the database could fit into our existing operational workflows and tooling.
Each team embarked on its own evaluations. We built Proofs of Concept and looked at various established and new database technologies, including MongoDB. We hired an external consultant to help guide the process.
All of the teams reached the same conclusion: MongoDB was the best fit for our next generation gaming platform. And that complex query that took 3 weeks to run in our relational databases…we got it down to 2 minutes in MongoDB! So in-game analytics was a go.
In the context of database evolution, 2011 was an eternity ago, and I’m sure a lot has changed in products we evaluated. But we’ve never looked back with MongoDB.
Can you describe your MongoDB deployment?
We mainly run MongoDB 2.6 today. Each game server instance is deployed on a VM running Ubuntu Linux, Nginx and Jetty, connecting to MongoDB with the Java driver.
Our multi-tenant Online Suite is provisioned across the main 10-shard MongoDB cluster. Each shard is configured with a 3-node replica set running a primary and secondary node and a replica set arbiter. Each MongoDB instance runs on its own physical server in our data centres.
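As a rough illustration of that topology, the sketch below builds one replica set configuration document per shard, in the shape that MongoDB's `replSetInitiate` command accepts. The shard names, hostnames and port are hypothetical placeholders, not values from the Square Enix deployment.

```python
# Sketch only: shard names, hostnames and the port are hypothetical.

def shard_replica_set_config(shard_index, port=27017):
    """Return a replica set config for one shard: two data-bearing
    members (one of which is elected primary) plus a voting-only arbiter."""
    name = f"shard{shard_index:02d}"
    return {
        "_id": name,
        "members": [
            {"_id": 0, "host": f"{name}-a.example.internal:{port}"},
            {"_id": 1, "host": f"{name}-b.example.internal:{port}"},
            {"_id": 2, "host": f"{name}-arb.example.internal:{port}",
             "arbiterOnly": True},
        ],
    }


def votes_needed_for_primary(config):
    """A primary needs a strict majority of the voting members."""
    return len(config["members"]) // 2 + 1


# One config per shard in a 10-shard cluster.
configs = [shard_replica_set_config(i) for i in range(10)]
print(len(configs), "shards,",
      votes_needed_for_primary(configs[0]), "of 3 votes needed")
```

With three voting members, two votes are enough to elect a primary, so each shard can keep serving writes after losing any single node. The arbiter holds no data; it exists only to break ties.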
We use a different architecture for the clusters dedicated to other projects supporting individual titles. The load our games place on the backend is very spiky. It’s not uncommon to provision 60+ front-end servers to support a game at launch, and then dial that back as traffic reduces. Our marketing team often runs promotions around individual games, and with this approach, we are able to just add nodes back into the cluster when they are needed. This type of elasticity is essential to ensure we are keeping costs down by avoiding over-provisioning. MongoDB gives us a solid persistence layer to support such an approach.
Many of our dedicated games clusters are deployed on AWS, and we have started to use MongoDB Cloud Manager to automate provisioning and configuration, as well as manage upgrades.
I’ve found Cloud Manager to be extremely useful – it saves us a lot of time, and means we can scale our infrastructure without having to scale our ops personnel at the same time. It also means I can get a lot more done, and have a life outside of work!
Automated MongoDB provisioning on AWS EC2
In addition to provisioning, we also use Cloud Manager to gather telemetry from our MongoDB clusters. We constantly monitor and alert on key metrics like ops counters, queues, connections, memory and CPU utilization, along with disk IOPS. By establishing baselines and alerting on these key metrics, we can automate MongoDB scaling to respond to growing traffic, before our service starts to degrade.
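The baseline-and-alert idea can be sketched in a few lines. This is a minimal illustration that assumes a simple "mean plus k standard deviations" rule; the sample values, function names and the factor `k` are hypothetical, and this is not how Cloud Manager itself evaluates alerts.

```python
from statistics import mean, stdev


def baseline_threshold(history, k=3.0):
    """Alert threshold: mean of past samples plus k standard deviations."""
    return mean(history) + k * stdev(history)


def should_alert(history, sample, k=3.0):
    """True when the new sample exceeds the baseline threshold."""
    return sample > baseline_threshold(history, k)


# Hypothetical samples of open connections on one node.
connections = [410, 395, 402, 420, 398, 405, 415, 400]

print(should_alert(connections, 425))  # within normal variation
print(should_alert(connections, 900))  # well above the baseline
```

The same check applies to any of the metrics mentioned above, such as queue lengths or disk IOPS, as long as enough history exists to form a meaningful baseline.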
Looking at our broader backend infrastructure, we use Nagios and Cacti for monitoring and automate everything using Ansible and Puppet.
Our web and mobile properties run a slightly different configuration, with MongoDB connected to Ruby-based services, and Docker for orchestration.
Can you share any best practices on scaling your MongoDB infrastructure?
For starters, it’s critical to have good monitoring in place. You should never wait until your system is over-utilized before you add capacity. Trying to scale without impacting the application's performance is really hard if your system’s resources are already highly consumed.
I would also avoid co-locating too much stuff between completely different applications. For example, sharing nodes or config servers can complicate upgrades, so I would recommend isolating components when your apps have very different query patterns and traffic profiles.
Are you integrating MongoDB with other data analytics platforms?
We use Pentaho to visualize and dashboard data stored in MongoDB.
Metrics data collected from games and players is stored in MongoDB and then loaded into a Cloudera Hadoop cluster to run deeper, offline analytics.
Do you use any MongoDB services to support your deployment?
Yes. We use MongoDB Enterprise Advanced, which gives us access to MongoDB technical support services. When we’ve had issues, we’ve got fast resolution from the support team. For example, last Christmas we started to experience performance degradation across a couple of clusters. The MongoDB support engineers were able to quickly diagnose the problem and trace it to an underlying hardware issue. We were able to avoid impacting customer experience at one of our busiest times of the year. So MongoDB support has been highly beneficial for our business.
How are you measuring the impact of MongoDB on your business?
Customer experience is everything. We must be able to scale quickly, on-demand when new games launch. We cannot afford downtime – we need to be able to stay up 24x7 during both outages and planned maintenance. MongoDB’s architecture enables us to do this.
We can add and remove MongoDB nodes from sharded clusters when we need to. This is essential when performing various management tasks. The amount of data we’re collecting is continuously growing, and MongoDB is scaling with us. A new game will typically generate 120GB of data per day. With this data aggregated across multiple games and years, you can imagine how quickly the dataset grows. New data is stored on fast disks, while aged data is migrated to slower disks.
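To put the 120GB-per-day figure in perspective, here is a back-of-the-envelope sketch. It assumes a constant daily rate and ignores replication and compression, and the 30-day cutoff between fast and slow disks is a hypothetical value chosen only for illustration.

```python
DAILY_GB_PER_GAME = 120  # figure quoted in the interview


def projected_gb(days, games=1, daily_gb=DAILY_GB_PER_GAME):
    """Raw data generated over `days`, assuming a constant daily rate."""
    return daily_gb * games * days


def storage_tier(age_days, hot_days=30):
    """Pick a disk tier by data age; the 30-day cutoff is hypothetical."""
    return "fast" if age_days <= hot_days else "slow"


print(projected_gb(365) / 1000, "TB per game per year")
print(storage_tier(7), storage_tier(90))
```

At that rate a single title produces roughly 43.8TB of raw data in a year, before any replica copies are counted.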
Replica sets give us resilience, and rolling upgrades mean we don’t need to take services offline when we change things in our platform.
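The usual ordering behind a rolling upgrade of a replica set can be sketched as follows: upgrade the non-primary members one at a time, then have the primary step down and upgrade it last. The hostnames and the helper function are hypothetical, and the sketch covers only the ordering, not the upgrade commands themselves.

```python
def rolling_upgrade_order(members):
    """Order members so non-primary nodes go first and the primary last.

    `members` is a list of (host, role) pairs, where role is
    "primary", "secondary" or "arbiter".
    """
    others = [host for host, role in members if role != "primary"]
    primaries = [host for host, role in members if role == "primary"]
    return others + primaries


# Hypothetical hostnames for one shard.
shard = [
    ("db-a.example.internal", "primary"),
    ("db-b.example.internal", "secondary"),
    ("db-arb.example.internal", "arbiter"),
]

for host in rolling_upgrade_order(shard):
    print("upgrade", host)
```

Before the final step, the primary is asked to step down (for example with `rs.stepDown()` in the mongo shell) so that an already upgraded member takes over and clients see only a brief failover.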
The operational efficiency we get with MongoDB is key. We are able to run our entire MongoDB estate with just myself and occasional help from my colleague. Cloud Manager coupled with proactive support is central to achieving this type of efficiency.
Do you have any plans to upgrade to MongoDB 3.0?
We are excited about the new release – more granular concurrency control will give us greater performance, especially in our write-intensive applications. Compression is really important in enabling us to make more efficient use of our storage as our data sets continue to grow. We plan on rolling out MongoDB 3.0 later this year.
What advice would you give someone who is considering using MongoDB for their next project?
It has never been easier to try out MongoDB. With Cloud Manager, you can spin up new instances in seconds on AWS. Load your data and just start testing it.
Want to try Cloud Manager for yourself? Sign up for our 30-day free trial.
Square Enix games running on MongoDB include Tomb Raider, Lara Croft and the Guardian of Light, Lara Croft and the Temple of Osiris, Hitman Absolution and Hitman Go, Deus Ex, Thief, Just Cause 3, Sleeping Dogs, Life is Strange and Nosgoth.
DEUS EX, DRAGON QUEST, FINAL FANTASY, HITMAN, JUST CAUSE, SQUARE ENIX, the SQUARE ENIX logo, and TOMB RAIDER are registered trademarks or trademarks of the Square Enix Group. All other trademarks are the property of their respective owners.