Getting Production Ready at Gliph with MMS
This is a guest post from Nick Asch, CTO of Gliph.
Gliph is a startup with a secure messaging app that allows easy bitcoin payments. Gliph runs on top of a technology stack that includes MongoDB as the primary database for long-term data storage, from user profiles to hundreds of thousands of messages and pictures. In this post, I’ll describe how we transitioned from a proof of concept MongoDB implementation to a production-ready setup, and how we now utilize MongoDB Management Service (MMS) to ensure an excellent user experience.
From Hackathon to Production
A service like Gliph has fairly complicated components, including user data encryption, messaging with scheduled delivery and expiry, and integration with major wallet providers (each with its own API), among many other pieces. We’re a small team adding features quickly, and iterating is easiest when your system is stable. But our system wasn’t always as robust as it is today!
Gliph started as a proof of concept at a hackathon. Our use of MongoDB at that time was fairly simple. The system was designed to work 99 percent of the time, but as usage increased, edge cases popped up daily. By the time we launched our iPhone app three months later, it was time to move to production. For that, we configured multiple servers for a replica set, scheduled regular backups (and tested them), and added comprehensive error logging.
Once we had a production-ready database set up and configured, I started to learn how closely infrastructure and efficient use of the database are tied. Our code needed to take advantage of the intricacies of the database, and the database needed to take advantage of the structure of our app and data. Being relatively new to MongoDB, I lacked detailed knowledge of the internals and of how to optimize our backend. I needed greater visibility into my queries and a better understanding of system performance and health.
Taking off the blindfold
Using monitoring from MongoDB Management Service (MMS), I was able to uncover and correct inefficiencies in our system. MMS monitors as many databases as you have, tracking system stats such as available RAM. It lets you visualize the queries being performed and set alerts for when certain metrics are out of range.
Here are three things that I discovered and was able to fix with the assistance of MMS:
1. Missing indexes. I didn’t realize that some queries for analytics were not using indexes. Our analytics queries run every ten minutes, and it wasn’t apparent to me that they were affecting performance until I looked at the graphs MMS provides. While these queries executed, unlucky users would have to wait longer than necessary to read a new message or send one. I still wonder how many users tried signing up and were unimpressed with the latency! Several graphs, including the network bandwidth graph, showed periodic spikes exactly ten minutes apart. This pointed right at the analytics queries, but it still required some analysis to determine the cause of the problem. A few query hints later and I knew the problem was with missing indexes. We added an index, performance improved, and the analytics queries no longer impacted our users.
2. Unnecessary fields in queries. While looking at network bandwidth, I noticed that our primary database server was returning far more data to the web server than the web server was returning to the user. After further investigation, I realized that MongoDB was returning fields that were never returned to the client. We only needed a small subset of the document. When we updated the query to make it more efficient, bandwidth decreased.
3. Completely unnecessary queries. When users retrieved messages, their messages were being marked as read, even if they were already read. This led to unnecessary update queries. I first noticed this when looking at the opcounters graph. Updates were almost as frequent as read queries, which is counter-intuitive when we mostly send and receive read-only messages. Because updates should be so infrequent in our application, I could quickly find the unnecessary “mark-as-read” function in the code. Rather than updating each message to mark it as read, it now keeps track of only the last message the user saw. Once the fix was deployed, the graph made the improvement clear.
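The three fixes above can be sketched roughly as follows. This is a minimal illustration in Python, not Gliph's actual code: the collection names (messages, read_state) and field names (created_at, recipient_id, sender_id, body, last_seen) are hypothetical, and the functions assume they are handed PyMongo-style collection objects.

```python
# Hypothetical schema throughout: collection and field names are illustrative
# assumptions, not Gliph's real data model.

ASCENDING = 1  # same value as pymongo.ASCENDING


def ensure_analytics_index(messages):
    """Fix 1: index the field that the periodic analytics query filters on."""
    return messages.create_index([("created_at", ASCENDING)])


def fetch_message_previews(messages, recipient_id):
    """Fix 2: pass a projection so only the needed fields cross the network."""
    projection = {"sender_id": 1, "body": 1, "created_at": 1}
    return list(messages.find({"recipient_id": recipient_id}, projection))


def mark_seen(read_state, user_id, last_message_id):
    """Fix 3: keep one 'last seen' pointer per user instead of updating
    every message, and skip the write entirely when nothing has changed.

    Returns True if an update was issued, False otherwise.
    """
    current = read_state.find_one({"_id": user_id})
    if current is not None and current.get("last_seen") == last_message_id:
        return False  # pointer already up to date, no update issued
    read_state.update_one(
        {"_id": user_id},
        {"$set": {"last_seen": last_message_id}},
        upsert=True,  # create the pointer document on first use
    )
    return True
```

With PyMongo, these functions would be called with real collections, for example db.messages and db.read_state. Calling explain() on the cursor returned by find() is one way to check whether a query actually uses an index.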
Conclusion
MongoDB is a great option for people looking to create web applications at the earliest stage of their venture. In my experience, it is possible to successfully move from a single instance to a robust production deployment. And tools like MMS make that transition simpler by enabling you to find inefficiencies in your web application’s interaction with the database.