Auto Trader UK is by far the leading place to buy and sell used cars in the country, capturing over 70% of all time spent on car classified sites. What originated as a publishing company with a magazine as its primary offering successfully transitioned to a fully digital business in 2013. Since the closure of the magazine, Auto Trader has been evolving its online business into an entirely technology-focused company.
Auto Trader has always concentrated on engineering agility. “In FY 2019, we did 15,000 releases to production and 40% of those were either tier one or highly important applications. In the first five months of FY 2020, we’ve already done 16,200 deployments and we're looking to break around 35,000 deployments in the coming year” says Mohsin Patel (twitter, blog), a Principal Database Engineer and eleven-year Auto Trader veteran. The challenge was maintaining this kind of agility in an on-premise environment.
Previously, they had been heavily invested in Oracle and SQL Server. “If someone came to me and said ‘I have a new application! I need a database’ we said to them, ‘Here is your Oracle schema. This is your relational database, go use it.’ There was no real choice in the matter.” Due to the demand for greater agility and the growing popularity of open source and emerging technologies, Mohsin recognized it was time for a change.
MongoDB arrives at Auto Trader
Engineers began feeling more comfortable using open source in production environments, so when teams were handed their Oracle schema, more of a conversation started to arise. That conversation is how MongoDB was first deployed on-premise at Auto Trader nine years ago.
Given the increase in deployments over just the past year, you can imagine how much more complex the architecture has become since MongoDB was first implemented. On-premise MongoDB clusters are remarkably powerful tools, but they’re not magic. As a company’s needs evolve, its databases should too.
Auto Trader had created a MongoDB on-premise monolith: one cluster supporting over 30 applications, plus a multitude of other clusters all running on different versions, which created a range of unique dependencies and reduced their ability to remain agile. This is why, in 2019, Auto Trader migrated to a microservice-aligned database architecture with MongoDB Atlas on Google Cloud Platform (GCP).
Migration to Atlas
This migration had been carefully considered for years. In fact, in November 2017 Mohsin attended a meeting in which he jotted down the company architecture goals under discussion, all of which aligned with MongoDB Atlas: the ability to use infrastructure as code, in-flight encryption (SSL) as standard, and improved backup and recovery confidence. Moving to the cloud and away from the MongoDB on-premise monolith was an ongoing process, and at the time of the MongoDB.local London event it was 90% complete after a span of six months. Auto Trader has since finished the Atlas migration. So how did they initiate such a monumental undertaking?
First Step - Proof
The first step of the migration was a timeboxed proof of concept on Atlas. Auto Trader gave themselves three months to thoroughly test whether Atlas was the right fit. “It’s really important to choose a complex application for your proof of concept. It will give you more accurate feedback. Don’t choose your easiest application and say ‘we’ve done a good job, let’s move everything!’ Put Atlas to the test.”
During this period, they tested numerous Atlas features, including export/import, queryable snapshots, backups and restores. “We tested a feature called Live Migrate. It’s a really great feature to help move your on-premise clusters into Atlas” all while keeping the destination cluster in sync with the source.
Second Step - Arranging Owners and Applications
After a positive proof of concept experience, the grander database analysis can begin. One of the first things to establish is database ownership. “It may sound easy, but it’s really difficult to find owners for databases when application owners move along from one project to the next. Create a map of who owns which databases.”
Then create a logical grouping of applications to clusters. Deciding which applications are similar enough to run on the same cluster is crucial, because a one-to-one application-to-cluster ratio isn’t the most cost-effective or efficient management technique. Next, purge unnecessary data: who wants to migrate data they don’t need?
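The ownership map and cluster grouping described above can be sketched as a small script. Everything here is hypothetical: the app names, team names, and the cap on databases per cluster are illustrative placeholders, not Auto Trader's actual values.

```python
def group_apps_to_clusters(owners: dict, max_per_cluster: int = 5) -> list:
    """Group applications onto shared clusters by owning team, then cap
    cluster size so no single cluster accumulates too many databases.

    owners: {app_name: team_name} -- the "who owns which databases" map.
    """
    # Bucket applications by their owning team.
    by_team: dict = {}
    for app, team in sorted(owners.items()):
        by_team.setdefault(team, []).append(app)

    # Split each team's apps into clusters of at most max_per_cluster.
    clusters = []
    for team, apps in sorted(by_team.items()):
        for i in range(0, len(apps), max_per_cluster):
            clusters.append({"team": team, "apps": apps[i : i + max_per_cluster]})
    return clusters


# Hypothetical ownership map; small cap to show the splitting behaviour.
owners = {
    "search-api": "platform",
    "listings-api": "platform",
    "deal-builder": "platform",
    "billing": "finance",
}
clusters = group_apps_to_clusters(owners, max_per_cluster=2)
```

The grouping key (owning team) is one reasonable choice; grouping by traffic profile or data sensitivity would follow the same pattern.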
Third Step - Analyze your Infrastructure
The third step is infrastructure analysis which includes testing networking, choosing a custom backup policy and database version, determining cluster sizes and more. “We were very keen on using small clusters. We tried to limit the number of databases that we put on one cluster because keeping them lean was important to us.”
One of the key elements in the infrastructure analysis is the budget forecast. “If you’re migrating from an enterprise or professional version of MongoDB on-premise, you’re going to enter a period where it’s going to cost you to be on Atlas as well. There’s going to be that little uplift in cost before it all drops down again. Be considerate of that and forecast accordingly.”
Migrating with Methodology
Following the methodology above, Auto Trader continued to migrate on a per-application basis. With each application, they bore in mind the impact of application scaling, as it’s different in the cloud. “In the cloud, cluster sizes dictate how many connections you can have. For example, the lowest cluster, an M10, can have 350 connections, so you have to be really considerate about how you look after them, which is why introducing connection pools is one of the most critical things you’ll do.”
For each application, determine how many connections are needed by examining the disk IOPS, upgrading your driver versions, and deciding how much downtime each application can tolerate. Use the Live Migrate feature for those that can’t handle any downtime at all.
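One way to introduce a connection-pool cap is through the connection string itself: `maxPoolSize`, `minPoolSize` and `maxIdleTimeMS` are standard MongoDB connection-string options honoured by the official drivers (PyMongo among them). A minimal sketch, assuming placeholder hostnames and credentials:

```python
from urllib.parse import quote_plus, urlencode


def atlas_uri(host: str, user: str, password: str,
              max_pool_size: int = 20, min_pool_size: int = 0,
              max_idle_ms: int = 60_000) -> str:
    """Build an Atlas SRV connection string with an explicit pool cap.

    Capping maxPoolSize per application keeps the sum of pooled
    connections across all apps under the cluster-wide limit
    (350 connections on an M10, per the talk).
    """
    options = urlencode({
        "retryWrites": "true",
        "maxPoolSize": max_pool_size,   # hard cap per client instance
        "minPoolSize": min_pool_size,   # idle connections kept warm
        "maxIdleTimeMS": max_idle_ms,   # recycle long-idle connections
    })
    return f"mongodb+srv://{quote_plus(user)}:{quote_plus(password)}@{host}/?{options}"


# Hypothetical usage; a driver client (e.g. PyMongo's MongoClient)
# would consume this URI and enforce the pool limits.
uri = atlas_uri("cluster0.example.mongodb.net", "app_user", "s3cret",
                max_pool_size=20)
```

Setting the cap in the URI (rather than in per-app code) makes the pool policy visible in configuration, which helps when auditing how many connections each application can claim.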
The step often interlaced between the others, the one that’s the actual goal yet the very last priority: decommissioning everything you don’t need. “We started by removing user access on-premise, then monitored the logs for a week. That got us to the point where we felt comfortable with dropping the database from pre-production, then we waited another week, dropped the database from production, reclaimed the space and decommissioned the nodes in our clusters.” The value of decommissioning for Auto Trader included retiring their on-premise datacenter housing 26 virtual machines, reclaiming 272 GB of RAM and 8 TB of flash storage.
Time Saving Tips
Given such a successful migration, Mohsin shares a few tips he wishes he could have known before beginning this process.
One of the biggest time savers is to start talking to development teams about upgrading your applications’ driver versions and dependencies long before it’s time to migrate, so that outdated drivers don’t delay the migration process.
Another tip: Atlas clusters have connection limits. “Connection pooling is huge in Atlas. M10s have 350 connections and every cluster size you move up, the connections just about double. That really becomes important in a cloud model where you’ve got to scale up and scale down and you don’t want your applications connection storming.”
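The budgeting behind that tip is simple arithmetic, sketched below. The per-tier limits follow the talk's figures (350 connections on an M10, roughly doubling per tier); the exact current limits for each tier are documented by Atlas, and the 80% headroom factor is an illustrative assumption, not an Atlas rule.

```python
def tier_connection_limit(steps_above_m10: int, m10_limit: int = 350) -> int:
    """Rough per-tier connection limit from the talk: an M10 allows 350
    connections and "every cluster size you move up, the connections
    just about double". Check the Atlas docs for a tier's exact limit."""
    return m10_limit * (2 ** steps_above_m10)


def pool_budget_ok(app_instances: int, max_pool_size: int,
                   cluster_limit: int, headroom: float = 0.8) -> bool:
    """True if every instance's pool fits under ~80% of the cluster
    limit, leaving room for monitoring agents and ad hoc connections."""
    return app_instances * max_pool_size <= cluster_limit * headroom


# Example: 10 app instances, each capped at 20 pooled connections,
# against an M10's limit (200 needed vs. 280 available with headroom).
fits = pool_budget_ok(10, 20, tier_connection_limit(0))
```

Running this check before scaling out an application is one way to avoid the "connection storming" Mohsin warns about: the budget fails before the cluster does.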
Since Auto Trader was migrating from MongoDB on-premise to Atlas, they noticed a difference in admin restrictions. On-premise, database users are scoped to a single cluster, but “when you create a user in Atlas, it goes to all the clusters you have in a project so there’s a slightly different security tone you need to adhere to.”
Data Purging Policies
“If you have a lean database, you have a fast database.” Backups are a variable cost, so it’s important to carefully control your data. Mohsin recommends using cloud provider snapshots to reduce the cost of continuous backup.
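One common way to enforce a purging policy in MongoDB is a TTL index, which has the server delete expired documents automatically. A minimal sketch, assuming a driver such as PyMongo is available and using a hypothetical 90-day retention window and `audit_events` collection:

```python
from datetime import timedelta

# Hypothetical retention window; choose per collection based on your
# purging policy -- shorter windows mean a leaner database and backups.
RETENTION = timedelta(days=90)


def ttl_seconds(retention: timedelta) -> int:
    """TTL indexes take their expiry window in whole seconds."""
    return int(retention.total_seconds())


# With a MongoDB driver (PyMongo assumed), a single TTL index on a
# timestamp field lets the server purge old documents on its own:
#
#   db.audit_events.create_index("createdAt",
#                                expireAfterSeconds=ttl_seconds(RETENTION))
#
# Documents whose createdAt is older than the window are removed by the
# server's background TTL monitor, with no application-side delete jobs.
```

Since backups snapshot whatever is on disk, data expired by a TTL index never reaches the backup in the first place, which is exactly the "lean database" effect described above.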
These tips, in combination with the migration guide, have given Auto Trader enormous benefits in availability, monitoring, scalability and application development. With Atlas, Auto Trader has created a Unified Delivery Platform that runs Kubernetes on GCP with Docker containers. On this platform, their engineers receive the same seamless experience, regardless of the language they’re coding in.
Mohsin shares an example of the platform’s capabilities in which five applications ran on an Atlas cluster and one was consuming all the disk IOPS. After the team detected this, the cluster was upgraded in less than 20 minutes, giving the team time to diagnose the underlying problem. “You need to ask yourselves where you want to be. Do you want to be in an on-premise world where you’re having to look at how you’re going to scale up that cluster, ask a VM person to scale it up for you, then do a rolling bounce and it takes a day? Or do you want to be in the Atlas world where you basically set a parameter and it does it for you so you can actually concentrate on what’s important?”
The tips and experiences in this article come from Mohsin Patel's presentation at MongoDB.local London. If you missed MongoDB.local London or Mohsin's talk, never fear: you can watch the whole thing here: