Companies in the security and monitoring market capture and analyze data from cameras, alarm panels, and other security devices to help keep their customers safe. That means using a security management platform to manage data about alerts, events, and service calls.
Segware provides that platform for hundreds of monitoring companies around the world. The platform ingests the huge volume of event data those monitoring systems produce, and it has to do so at scale and at low cost.
That’s not easy. Segware is growing, so its data needs are growing too. Its customers have different data storage requirements, as data regulations differ from country to country. And Segware is also in the process of migrating customers who use its legacy desktop product into the cloud.
Enter MongoDB and Atlas Data Lake.
Lots of data, hot and cold
Founded in 2001, Segware is based in Florianópolis, Brazil, with branches in Campinas and Miami, Florida. By late 2012, it had locked in a large proportion of the local market, so it started to look farther afield. Today, it has around 1,200 central monitoring station customers guarding some two million end-users in over 15 countries, including Brazil, Mexico, Argentina, and the US. Supermarkets, pharmacies, and other businesses—as well as individual homeowners—pour all sorts of data into Segware’s service.
“Since we had a large amount of events coming to our platform, treating those events in a relational database was going to be tough for us,” said Renato Martins, Segware’s development manager. Early on, the company began developing a multi-tenant platform with plans to build that on a NoSQL database. “We did some research and we came up with MongoDB as the best solution for us.”
Segware self-hosted MongoDB before deciding to migrate to MongoDB Atlas two years ago, and data demands have kept growing since then. Today, it has around one terabyte of data in some four billion documents. Storage costs were becoming a challenge, but there was a solution: not all of that data needed to be instantly or frequently accessible.
"We needed to solve a problem since we had so many events and huge growth of data in our database."
Renato Martins, Development Manager, Segware
Martins said, “Our platform needs to be cost-effective—we cannot charge too much from our clients. So, we determined that we needed a data lake so we could access hot information and some cold information when needed.”
Segware began looking for a way to manage older, less frequently accessed data more cost-effectively, while also enabling such data to be easily queried when the need arises. That could happen, for example, if a pharmacy business in Brazil—where monitoring data must, in most cases, be stored for at least five years—needed to access sensor data to prove that a former employee suing the company wasn't at work on the days they claimed.
Initially, Segware considered building its own data lake on Amazon S3 until it learned about MongoDB’s newest offering: Atlas Data Lake. Segware decided to dive in.
No need to rewrite queries
Work on a proof-of-concept for Data Lake began in December 2019. A few months later, it was in production: infrequently accessed data is moved from Atlas to S3, with Atlas Data Lake serving as the query engine. Segware now runs that archival process every two weeks.
To extract, transform, and load data from Atlas into Atlas Data Lake, Segware uses Apache Spark and the MongoDB Spark connector. One of Atlas Data Lake's big wins for the team was that it uses the MongoDB Query Language (MQL), which gave everyone a familiar way to handle queries.
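The core of a biweekly archival job like this is deciding which events stay "hot" in Atlas and which move to cold storage on S3. The sketch below models that selection step in plain Python; the 90-day cutoff, the `timestamp` field name, and the document shapes are illustrative assumptions, not Segware's actual schema or retention policy.

```python
from datetime import datetime, timedelta

def partition_events(events, retention_days=90, now=None):
    """Split events into hot (kept in Atlas) and cold (archived to S3).

    The retention window and the `timestamp` field name are
    illustrative; the real job would run this logic as a query
    against the live collection rather than in application code.
    """
    now = now or datetime.utcnow()
    cutoff = now - timedelta(days=retention_days)
    hot, cold = [], []
    for event in events:
        # Anything older than the cutoff is a candidate for S3.
        (hot if event["timestamp"] >= cutoff else cold).append(event)
    return hot, cold

# Example: one recent event stays hot, a year-old event goes cold.
now = datetime(2020, 6, 1)
events = [
    {"_id": 1, "timestamp": datetime(2020, 5, 30)},
    {"_id": 2, "timestamp": datetime(2019, 6, 1)},
]
hot, cold = partition_events(events, retention_days=90, now=now)
```

In production, the same cutoff would typically be expressed as an MQL `$match` filter that Spark uses to read only the archivable documents, rather than pulling every event into memory.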
"The principal [reason] to use MongoDB Atlas Data Lake is MQL, the MongoDB Query Language," said software specialist developer Igor Agenor Piovezan.
"We use MQL in every part [of our business]. And with Data Lake, we can use that easily and find data in any storage with a very familiar query, and this is good for us."
Igor Agenor Piovezan, Software Specialist Developer, Segware
“We didn’t want to rewrite all the queries to a new language,” Martins added. “Since MongoDB offered the Data Lake solution, it was way easier for us because the query that we type in for applications to access hot information is going to be the same query that we’re going to use to access the cold information. It’s like we snap our fingers and it’s done.”
Because customers don’t need to access their historical data often, it wouldn’t have been a problem for Segware if queries took longer to execute against data in S3 compared to those on Atlas. But in reality, Segware has found that those queries don’t take much time at all. In fact, Segware queries both their Atlas cluster and Atlas Data Lake at the same time—something that’s made easier through Atlas Data Lake’s new federated query feature, which automates this process. Atlas Data Lake users can run a single query to analyze their live Atlas data and historical data on S3 together and in-place. Such queries return a single query response, no matter where the data resides.
“The time to get this data can be one minute, two minutes,” said Piovezan. “This is no problem for customers.”
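That "same query for hot and cold data" point can be made concrete. Below is an illustrative MQL aggregation pipeline, expressed as Python dicts; the collection name, account identifier, field names, and date range are all assumptions for the example, not Segware's actual schema. Run through an Atlas Data Lake (federated database) connection, one pipeline like this would span live Atlas data and archived S3 data and return a single result set.

```python
from datetime import datetime

# Hypothetical federated query: fetch five years of events for one
# customer account, regardless of whether each event currently lives
# in the Atlas cluster (hot) or in S3 (cold). Against a federated
# connection this would be submitted as, e.g.:
#     db.events.aggregate(pipeline)
pipeline = [
    {"$match": {
        "accountId": "pharmacy-123",
        "timestamp": {
            "$gte": datetime(2015, 1, 1),
            "$lt": datetime(2020, 1, 1),
        },
    }},
    {"$sort": {"timestamp": 1}},
]
```

The application code never needs to know which storage tier holds a given document, which is why Segware could reuse its existing queries instead of rewriting them for S3.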
Key to healthy growth
Data Lake’s biggest benefit, however, is its cost-effectiveness.
Before, Martins said, “[data storage] was costing too much for us. And if we migrate everyone from desktop to cloud right now with this architecture, we are not going to be a healthy company. It’s going to be more [costly] and no more money coming into our pockets. So this was key to maintaining our company growing in a healthy way.”
Being able to handle larger volumes of data affordably will be especially critical as Segware moves ahead with migrating customers currently using its desktop product into the cloud. “We intend to migrate those clients that have the on-premises version,” said Martins. “We are going to migrate the data. So I believe that within one year, one year and a half, we’ll have 60 to 70 percent more, maybe double the size that we have today of data.”
With Data Lake on its side, Segware expects that kind of growth in data volumes to be easy to manage. So far, Piovezan said, the performance has been "incredible for us."