Migrating Terabytes of IoT Data From a Legacy NoSQL Database to MongoDB Atlas With MongoDB's Custom Migration Tool
In 2020, a large European energy company began an ambitious plan to replace its traditional metering devices, all 7.6 million of them, with smart meters. That would allow the energy company to monitor gas use remotely and allow customers’ bills to more accurately reflect their energy consumption. At the same time, the company began installing smart components along their production network to monitor operations in real time, manage alarms, use predictive maintenance tools, and find leaks using advanced technologies.

The energy company knew this shift would result in a massive amount of data coming into their systems, and they thought they were ready for it. They understood the complexities of managing and leveraging data from the Internet of Things (IoT), such as the high velocity at which data must be ingested and the need for time-based data aggregations. They rolled out an IoT platform with big data and analytics tools to help them make progress toward their objectives of high-quality, efficient, and safe service.

This article examines how the company migrated its system to MongoDB Atlas in order to handle the massive influx of data.

Managing data

The energy company was managing 3 TB of data on its NoSQL database, with the remainder housed and managed on a relational database. However, it started facing challenges, including a lack of scalability, increasing costs, and poor performance. The costs to maintain the pre-production and production environments were unsustainable, and the situation wasn’t going to get better: By 2023, the energy company planned to increase the number of IoT devices and sensors by a factor of five. They needed a viable solution for the long term.

Migrating to MongoDB Atlas

The energy company decided to migrate to MongoDB Atlas for several reasons. Atlas’s online archive, combined with the ability to create time-series sharded collections, makes Atlas an ideal fit for IoT data, as does the flexibility of the document data model.
Additionally, an API that was compatible with the existing database would minimize the impact on application code and make it easier to migrate applications.

The customer chose PeerIslands to be its technical partner and help with the migration. PeerIslands, a MongoDB partner, is an enterprise-class digital transformation company with an expert, multilingual team with significant experience working across multiple technologies and cloud platforms. PeerIslands has developed solutions for both homogeneous and heterogeneous workload migrations. Among these solutions is a MongoDB migration tool that performs one-time migrations and change data capture while minimizing downtime. The tool is fully GUI-based, and tasks such as infrastructure provisioning, dump and restore, and running change stream listeners and processors have all been automated. For change capture, the tool uses the native MongoDB change stream APIs.

Migration challenges

In working with the energy company to perform the migration, the PeerIslands team faced two particular challenges:

The large volume of data. Initial snapshotting of the data would take about a day.
The application had significant write loads. On average, it was writing about 12,000 messages per second. However, the load was unevenly distributed, with spikes when devices would “wake up” and report their status.

These two factors quickly generated close to 20 million change events that had to be synced to MongoDB. Meanwhile, new data was constantly being written into the source.

Migration tool

PeerIslands’ migration tool uses mongodump and mongorestore for one-time data migration and the MongoDB Kafka Connector for real-time data synchronization. By using Apache Kafka, the migration tool was able to handle the large amount of change stream data and successfully manage the migration.
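As a rough illustration of the one-time migration step, the sketch below builds mongodump and mongorestore command lines in Python. It is not the PeerIslands tool, whose internals are not described here. The URIs, the archive file name, and the worker count are placeholders chosen for the example, and the flags shown (`--uri`, `--archive`, `--gzip`, `--numInsertionWorkersPerCollection`) should be checked against the version of the MongoDB Database Tools in use.

```python
from typing import List


def build_dump_command(source_uri: str, archive_path: str) -> List[str]:
    """Build a mongodump command that writes one gzip-compressed archive."""
    return [
        "mongodump",
        f"--uri={source_uri}",
        f"--archive={archive_path}",
        "--gzip",
    ]


def build_restore_command(
    target_uri: str, archive_path: str, insertion_workers: int = 4
) -> List[str]:
    """Build a mongorestore command that loads the archive into the target."""
    return [
        "mongorestore",
        f"--uri={target_uri}",
        f"--archive={archive_path}",
        "--gzip",
        # Assumption: a few parallel insertion workers per collection;
        # the right value depends on the target cluster.
        f"--numInsertionWorkersPerCollection={insertion_workers}",
    ]


if __name__ == "__main__":
    # Placeholder URIs; nothing is executed here, the commands are only printed.
    print(" ".join(build_dump_command(
        "mongodb://source.example:27017", "snapshot.archive.gz")))
    print(" ".join(build_restore_command(
        "mongodb+srv://target.example.mongodb.net", "snapshot.archive.gz")))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where the MongoDB Database Tools are installed.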
To address the complexity of the migration, PeerIslands also enhanced the migration tool with additional capabilities:

Parallelized Kafka change stream processing using partitions. The Kafka partitioning strategy was kept in sync with the target Atlas sharding strategy.
Use of ReplaceOneBusinessKeyStrategy as the write model for the Kafka MongoDB sink connector to write into sharded Atlas collections.

By using its in-house tooling, PeerIslands was able to successfully complete the migration with near-zero downtime.

Improved performance

With the migration complete, the customer has already begun to realize the benefits of MongoDB Atlas for their massive amounts of IoT data. The user interface has become extremely responsive, even when running more expensive queries. Because of the improved performance of the database, the customer is now able to pursue improvements and efficiencies in other areas.

With better performance, the company expects consumption of the data to rise and their schema design to evolve. They’re looking to leverage the time-series benefits of MongoDB both to simplify their schema design and deliver richer IoT functionality. They’re also better equipped to rapidly respond to and fulfill business needs, because the database is no longer a limitation. Importantly, costs have decreased for the production environment, and even more dramatic cost reductions have been seen for the pre-production environment.

Learn more about the migration tool and MongoDB’s time series capabilities. Interested in your own data migration? Contact us.
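The two enhancements to the migration tool described earlier, partition-aligned processing and the ReplaceOneBusinessKeyStrategy write model, can be sketched roughly as follows. This is an illustration under stated assumptions, not PeerIslands' implementation. The shard key fields (`deviceId`, `timestamp`), the topic, database, and collection names, and the partition count are hypothetical. The routing function uses an MD5 hash purely for demonstration, whereas Kafka's default partitioner hashes the record key differently. The connector property names are given as recalled from the MongoDB Kafka Connector documentation and should be verified against the connector version in use.

```python
import hashlib
from typing import Any, Dict

NUM_PARTITIONS = 12  # hypothetical partition count


def partition_for(
    event: Dict[str, Any],
    shard_key_field: str = "deviceId",
    num_partitions: int = NUM_PARTITIONS,
) -> int:
    """Map a change event to a partition by hashing its shard key value.

    Events for the same shard key value always land on the same partition,
    so their relative order is preserved while other keys run in parallel.
    """
    key = str(event["fullDocument"][shard_key_field]).encode("utf-8")
    digest = hashlib.md5(key).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


# Sink connector settings (assumed names and values, see lead-in).
SINK_CONNECTOR_CONFIG: Dict[str, str] = {
    "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
    "topics": "iot.readings",          # hypothetical topic
    "connection.uri": "mongodb+srv://<user>:<password>@<cluster>/",
    "database": "iot",                 # hypothetical database
    "collection": "readings",          # hypothetical collection
    "tasks.max": str(NUM_PARTITIONS),
    # The business key is projected from the record value; for a sharded
    # target it should include the shard key fields.
    "document.id.strategy":
        "com.mongodb.kafka.connect.sink.processor.id.strategy.PartialValueStrategy",
    "document.id.strategy.partial.value.projection.type": "AllowList",
    "document.id.strategy.partial.value.projection.list": "deviceId,timestamp",
    "writemodel.strategy":
        "com.mongodb.kafka.connect.sink.writemodel.strategy.ReplaceOneBusinessKeyStrategy",
}

if __name__ == "__main__":
    first = {"fullDocument": {"deviceId": "meter-1", "reading": 10}}
    second = {"fullDocument": {"deviceId": "meter-1", "reading": 11}}
    # Both events share a shard key value, so they share a partition.
    print(partition_for(first) == partition_for(second))
```

Keying on the shard key means each sink task writes documents that belong together on the target, and the business-key filter lets replace operations be routed to the correct shard.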
PeerIslands Cosmos DB Migrator Tool to MongoDB Atlas on Google Cloud
When you’re in the midst of innovating, the last thing you want to worry about is infrastructure. Whether you’re looking to streamline inventory management or reimagine marketing, you need applications that can scale fast and maintain high availability. That’s where MongoDB Atlas on Google Cloud comes in.

With MongoDB Atlas’ general-purpose, document-based database, users can free themselves from the hassle of database management, and give back precious time to developers to focus on innovation. Combine these benefits with Google Cloud’s cloud computing power, high availability, and ability to integrate with tools like BigQuery, Dataflow, Dataproc, and more, and it’s hard to find a comparable joint solution.

In fact, many current Microsoft Azure Cosmos DB users are now considering making the move to MongoDB. Microsoft’s Cosmos DB only supports single-partition transactions, has no schema governance, and forces developers to work with five different APIs to deliver full application functionality. Conversely, MongoDB Atlas on Google Cloud supports distributed multi-document ACID transactions, includes schema governance, and offers integrated full-text search, auto-archiving, data lakes, and edge-to-cloud data sync.

The following blog illustrates how PeerIslands’ Cosmos DB Migrator tool can help users move from Cosmos DB to MongoDB Atlas on Google Cloud.

Why PeerIslands

PeerIslands is an enterprise-class digital transformation company composed of a team of polyglots who are comfortable across multiple technologies and cloud platforms. As a services firm, PeerIslands is focused on helping customers with both cloud-native development and application transformation. With best-in-the-industry talent, PeerIslands has been working with the MongoDB team to build a suite of solutions around two key objectives:

For a customer evaluating MongoDB, how can we rapidly address common questions?
Once a customer has chosen MongoDB, how can we reduce time to value by rapidly migrating workloads to MongoDB?

With this in mind, PeerIslands developed a suite of tools for schema generation, understanding MongoDB query performance, and helping customers understand the code changes required for upgrading MongoDB versions. In terms of workload migrations, PeerIslands developed solutions for both homogeneous and heterogeneous migrations. The company is also contributing to the open source community with a mobile app that enables MongoDB admins to manage Atlas on the go.

PeerIslands' Cosmos DB migration use case

The current approach for migrating data from Cosmos DB to MongoDB is to use MongoDB dump and restore. But there are several problems with this approach:

It’s fully manual and CLI-based, which creates a poor user experience and requires technical resources even for simple migrations.
There’s no change capture capability, so downtime is required for the duration of the migration. For large Cosmos DB migrations, this causes significant issues.
The team is also under pressure to deliver the entire migration in a short period of time.
Migrations often get delayed as customers have difficulty identifying the right migration window.

The Cosmos to MongoDB tool is a “Live Migrate”-like tool that performs one-time migrations and change data capture from Cosmos DB (MongoDB model) to MongoDB Atlas and minimizes the downtime associated with migrations. The tool is fully GUI-based, and nearly everything is automated: infrastructure provisioning, dump and restore, and change stream listeners and processors. The Cosmos to Mongo migration tool uses native MongoDB tools, and its performance is similar to theirs. For change capture, we leverage the native MongoDB change stream APIs.
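To make the combination of a one-time copy plus change data capture more concrete, here is a minimal in-memory sketch of replaying captured change events onto restored data. It is not the tool's code. Plain dictionaries stand in for the target collection, and the events borrow the `operationType`, `documentKey`, and `fullDocument` field names from MongoDB change events. It assumes update events carry the full document, which with real change streams requires requesting the `updateLookup` full-document option.

```python
from typing import Any, Dict, List

ChangeEvent = Dict[str, Any]
Collection = Dict[Any, Dict[str, Any]]  # stand-in: documents keyed by _id


def apply_change_events(target: Collection, events: List[ChangeEvent]) -> Collection:
    """Replay captured change events, in capture order, onto restored data."""
    for event in events:
        doc_id = event["documentKey"]["_id"]
        op = event["operationType"]
        if op in ("insert", "replace", "update"):
            # Writing the full document as an upsert keeps replay idempotent:
            # a change that was already included in the dump ends up the same.
            target[doc_id] = dict(event["fullDocument"])
        elif op == "delete":
            # Deleting a document that is already absent is a no-op.
            target.pop(doc_id, None)
    return target


if __name__ == "__main__":
    restored = {1: {"_id": 1, "status": "old"}, 2: {"_id": 2, "status": "old"}}
    captured = [
        {"operationType": "update", "documentKey": {"_id": 1},
         "fullDocument": {"_id": 1, "status": "new"}},
        {"operationType": "insert", "documentKey": {"_id": 3},
         "fullDocument": {"_id": 3, "status": "added"}},
        {"operationType": "delete", "documentKey": {"_id": 2}},
    ]
    print(apply_change_events(restored, captured))
```

Idempotent replay matters because a listener that runs while the copy is in progress may capture changes the copy already contains.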
A high-level view of the solution is provided in Figure 1.

Figure 1: Solution Map

Migration steps:

Migration configuration: Provide the name of the migration task, source Cosmos DB details, and target MongoDB details. The tool supports key vault integration as well.
Migration infrastructure provisioning: Provide the migration infrastructure details required for creating the virtual machine (VM), including location, type of VM instance, etc.
Migration execution: The tool automates the migration once the configuration is complete. The migration is executed in three steps: backup, restore, and change event processing. As a user, you can initiate the backup process. The change event listener is started in parallel with the backup process and captures all the changes. Once the backup is complete, the user can restore the initial data and then perform change event processing to apply all the changes to MongoDB.
Migration validation: The tool also provides facilities for validating the migration. Users can view the total number of documents in both the source Cosmos DB collection and the target MongoDB collection. They can also compare random documents picked from Cosmos DB and MongoDB side by side and validate whether the data elements have been loaded correctly.

For a more detailed demo and description of events, watch the accompanying video.

Migrating to a new database can feel daunting at first, but PeerIslands’ Cosmos DB Migrator makes it easy. Major concerns like delays and downtime are minimized, helping you run your business smoothly and reap the benefits of MongoDB more quickly. And with PeerIslands’ suite of tools, you can rapidly address MongoDB-specific questions and accelerate time to value. Reach out today to get started.
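The validation step described above, comparing document counts and spot-checking random documents, can be sketched as follows. This is a simplified stand-in rather than the tool's own validation logic. Dictionaries keyed by `_id` play the role of the source and target collections, and the function name and report fields are invented for the example. Against live databases, the equivalent checks would typically use the driver's document count and a random sample of documents.

```python
import random
from typing import Any, Dict, Optional

Collection = Dict[Any, Dict[str, Any]]  # stand-in: documents keyed by _id


def validate_migration(
    source: Collection,
    target: Collection,
    sample_size: int = 5,
    seed: Optional[int] = None,
) -> Dict[str, Any]:
    """Compare document counts and a random sample of documents."""
    rng = random.Random(seed)
    # Pick random source ids, never more than the source actually holds.
    sample_ids = rng.sample(list(source), min(sample_size, len(source)))
    # A sampled document mismatches if it is missing or differs on the target.
    mismatched = [doc_id for doc_id in sample_ids
                  if source[doc_id] != target.get(doc_id)]
    return {
        "source_count": len(source),
        "target_count": len(target),
        "counts_match": len(source) == len(target),
        "sampled": len(sample_ids),
        "mismatched_ids": mismatched,
    }


if __name__ == "__main__":
    source = {i: {"_id": i, "reading": i * 10} for i in range(100)}
    target = {doc_id: dict(doc) for doc_id, doc in source.items()}
    print(validate_migration(source, target, sample_size=10, seed=1))
```

A matching count plus a clean random sample does not prove every document is identical, but it is a quick check that catches most loading errors.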