Smartling

Companies increasingly need to reach a global audience and sell more products internationally, but often struggle to find the best approach to delivering high-quality translations appropriate for their brand. Smartling addresses this need by helping customers translate web sites and mobile apps through professional translators, crowdsourcing, and, when appropriate, machine translation. With their translation management platform, companies can compress translation time from 8-12 months to mere weeks, but the platform’s success depends on low latency and the ability to scale to meet customer demand.

As their MySQL database grew, performance suffered and it became increasingly difficult to manage. After a thorough evaluation of multiple NoSQL solutions, Smartling moved a core component of their application to MongoDB, providing a long-term solution that allows them to scale horizontally for capacity and throughput.

The Problem

Smartling initially built their application as a monolith, with all data stored in MySQL, but the database rapidly became time-consuming to manage (for example, restoring data from backups took too long), which hurt performance and reduced developer agility. Smartling’s growth was also constrained by the complexity of scaling MySQL with sharding.

The Smartling application provides translators with text strings, as well as the HTML in which the strings are contained, providing important translation context (for example, the use of “home” referring to a “home page” as opposed to a house). This enables translators to get immediate feedback on where the content fits, ultimately speeding up the process and resulting in higher quality translation. In order to minimize the back-and-forth between project manager and translator and keep translators happy, Smartling can’t afford any latency when rendering HTML pages to translators. They needed a way to store the HTML and metadata so that it would load very quickly when requested.

“We realized we needed to split the application into multiple components with multiple databases in order to shift the load from our main database server,” said Smartling CTO Andrey Akselrod.

In addition to several NoSQL solutions, Smartling evaluated Amazon Simple Storage Service (S3) but Smartling’s storage/retrieval requirements – high-speed, asynchronous storage and very quick reads – weren't met by S3. A predictive caching solution would have been necessary to speed up S3 loads while keeping data secure.

Why MongoDB?

Testing showed MongoDB to be the best choice for Smartling from a feature/performance standpoint as well as speed of development. Akselrod said the New York headquarters location and communication with the MongoDB team sealed the decision.

Smartling now uses MongoDB to store HTML and metadata for their Translation Delivery Network. Text strings are still stored in MySQL, though Smartling is considering using MongoDB as their primary data store in the future.

HIGH PERFORMANCE, LOW LATENCY

With MongoDB, Smartling can reliably store a large number of small text files with very fast read times and automatically expire data, such as outdated web site text, after a certain period of time. MongoDB's memory-mapped architecture allows for quick content ingestion without a separate caching layer.
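The source does not say which mechanism Smartling uses to expire data. One common way to get this behavior in MongoDB is a TTL index, which is created with the `expireAfterSeconds` option on a date field. The sketch below is plain Python with no database connection. It only mirrors the expiry rule such an index applies. The field name `createdAt`, the 30-day window, and the helper names are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention window; the source does not state the real value.
EXPIRE_AFTER_SECONDS = 30 * 24 * 60 * 60


def make_page_doc(url, html, created_at):
    """Build a document shaped for a TTL index on the (assumed) createdAt field."""
    return {"url": url, "html": html, "createdAt": created_at}


def is_expired(doc, now, expire_after_seconds=EXPIRE_AFTER_SECONDS):
    """Mirror the TTL rule: a document expires once createdAt + window has passed."""
    return doc["createdAt"] + timedelta(seconds=expire_after_seconds) <= now


def purge_expired(docs, now, expire_after_seconds=EXPIRE_AFTER_SECONDS):
    """Return only the documents that an expiry pass would keep."""
    return [d for d in docs if not is_expired(d, now, expire_after_seconds)]


now = datetime(2013, 1, 31, tzinfo=timezone.utc)
docs = [
    make_page_doc("/home", "<h1>Home</h1>", now - timedelta(days=45)),
    make_page_doc("/pricing", "<h1>Pricing</h1>", now - timedelta(days=2)),
]
kept = purge_expired(docs, now)
print([d["url"] for d in kept])  # prints ['/pricing']
```

With a real TTL index the database removes expired documents in the background, so the application would not need its own cleanup job.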

HORIZONTAL SCALING WITH BUILT-IN SHARDING

“Sharding is ridiculously easy with MongoDB and satisfied Smartling’s requirement far better than other NoSQL solutions tested,” said Akselrod. MongoDB’s built-in sharding capability makes it very easy to scale horizontally, paving the path for growth without the scalability challenges of relational database solutions.
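To illustrate the idea behind horizontal scaling, the following is a simplified sketch of how a shard key can route documents to one of three shards (the shard count comes from the deployment list; everything else is assumed). It is not MongoDB's actual algorithm: MongoDB divides the shard key space into chunks and balances those chunks across shards, whereas this sketch uses a plain hash-and-modulo rule. The key format and shard names are hypothetical.

```python
import hashlib
from collections import Counter

# Three shards, matching the deployment list; the names are hypothetical.
SHARDS = ["shard0", "shard1", "shard2"]


def shard_for(key, shards=SHARDS):
    """Pick a shard from a stable hash of the shard key (simplified routing)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


# Hypothetical shard key values: one per stored page.
keys = [f"site-{i % 50}/page-{i}" for i in range(3000)]
placement = Counter(shard_for(k) for k in keys)

# Each shard should end up with roughly a third of the documents.
for shard in SHARDS:
    print(shard, placement[shard])
```

Because the same key always hashes to the same shard, a router can send each read or write to a single shard, and adding capacity becomes a matter of adding shards rather than buying a larger server.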

EASE OF DEVELOPMENT

Changing the structure of a MySQL database with millions of records can become impractical over time. According to Akselrod, MongoDB “handles that eloquently,” allowing developers to add fields on the fly and easily update object structure.

HIGH AVAILABILITY

Redundancy through replica sets has kept Smartling up and running with 100% uptime. In cases where an Amazon instance is lost, MongoDB automatically fails over “without any fuss.”

Deployment

  • Stack: CentOS, Java with Spring / Hibernate, Apache Tomcat
  • Middleware: RabbitMQ, memcached
  • Deployment platform: Amazon Web Services
  • Server hardware configuration: 13 servers on Amazon EC2
      • 3 configuration servers (t1.micro)
      • 3 shards, each a replica set with 3 members (m1.large)
      • 1 router (t1.micro)
  • Number of developers: 10
  • Monitoring: Zabbix
  • Database size: 400GB
  • Number of documents in largest cluster: 15 million
  • Number of writes/second in cluster: 500
  • Number of reads/second: 10
  • Time to production: 1 month
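As a quick sanity check on the server count, the topology can be written down and totaled. This sketch assumes each of the three shards is a single replica set with three members, which is the reading that adds up to the 13 servers listed. The variable and function names are illustrative only.

```python
# Counts taken from the deployment list.
CONFIG_SERVERS = 3  # t1.micro
SHARDS = 3          # m1.large
ROUTERS = 1         # t1.micro

# Assumption: each shard is one replica set with three members.
MEMBERS_PER_SHARD = 3


def total_servers(config_servers, shards, members_per_shard, routers):
    """Total EC2 instances: config servers + every replica set member + routers."""
    return config_servers + shards * members_per_shard + routers


total = total_servers(CONFIG_SERVERS, SHARDS, MEMBERS_PER_SHARD, ROUTERS)
print(total)  # prints 13
```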

Results

MongoDB allowed Smartling to easily scale one of the core components of their translation platform with minimal development effort. As a result, they can provide high-quality translation efficiently to customers.

“Availability, hardware requirements and costs are superb, and development team excitement cannot be measured – it is priceless,” said Akselrod. “This was a ‘set it and forget it’ project which just worked, and sometimes that is all you need.”

Next Steps

Smartling is planning several new MongoDB projects which they expect to increase developer productivity. In addition to moving their primary data store to MongoDB, Akselrod expects that MongoDB will be a great fit for other infrastructure components, such as metadata and statistics aggregation.
