Leaf in the Wild posts highlight real world MongoDB deployments. Read other stories about how companies are using MongoDB for their mission-critical projects.
I had the opportunity to sit down with Michael Lettner, CTO of Hardware & Services and Bernhard Wolkerstorfer, Head of Web & Services at Tractive, to discuss how they use MongoDB at their Internet of Things startup.
Tell us a little bit about your company. What are you trying to accomplish? How do you see yourself growing in the next few years?
Tractive is a cool 18-month old startup designed for pet owners. We extend the concept of the “quantified self” to the quantified pet, enabling owners to monitor their beloved companions through wearable sensor technology.
Our first service was the GPS Pet Tracking device that attaches to the pet’s collar and enables the owner to receive real time location-based tracking on their iOS or Android device. Users can also define a safe zone that acts as a virtual fence - whenever the pet leaves the safe zone, a notification is sent to the owner’s device. We have extended our products to include Tractive Motion that tracks a pet’s activity. Owners can compare how much exercise their pet is getting to other owners with the same breed. The Peterest image gallery enables owners to share images and activity with other members of their social network, and Pet Manager can be used to record veterinary appointments, allergies, vaccination schedules and more.
Tractive is currently available in over 70 countries, mainly across Europe and the Middle East, and is now rapidly extending worldwide with our first customers recently added in the USA, Asia, Australia and New Zealand.
Please describe your application using MongoDB.
MongoDB is our primary database - we use it to store all of the data we rely on to deliver our services - from sensor and geospatial data, to activity data, to user data and social sharing. Image data is stored in AWS S3 with its metadata managed by MongoDB.
We also use MongoDB to log all data from our infrastructure, ensuring our service is always available.
Why did you select MongoDB for Tractive? Did you consider other alternatives?
We initially came from a background of using relational databases, but we believed that these were not appropriate tools for managing the diversity of sensor data we would rely on for the Tractive services. In addition, we knew we would be rapidly evolving the functionality of our apps and were concerned the rigidity of the relational data model would constrain our creativity and time to market.
We knew the way forward was a non-relational database, and many would give us the flexible data model our app needed. Beyond a dynamic schema, we had additional criteria that guided our ultimate decision
- How easily would the database allow us to store and query geospatial data?
- How well could the database handle time-series and event-based data?
- What sort of query flexibility did the database offer to support analytics against the data?
- How easily and quickly could the database scale as our customer base and data volumes grew?
- Was the database open source?
There are a multitude of key-value, wide column and document databases we could have chosen. There were many that could ingest time-series data quickly, but they lacked the ability to run rich queries against the data in place – instead forcing us to replicate the data to external systems.
Only MongoDB met all of key criteria – easy to develop against, simple to run in operations and without throwing away the type of query functionality we had come to expect from relational databases.
Please describe your MongoDB deployment
We run our MongoDB cluster across three shards with each shard configured as a three-node replica set. This architecture gives us the resilience we need to deliver always-on availability, and enables us to rapidly add shards as our service continues to grow.
The cluster is deployed in a colocation facility with an external service provider.
Our backend is primarily based on Ruby and currently running MongoDB 2.2 in production. We are planning a move to MongoDB 2.6 to take advantage of some specific new capabilities:
- Aggregation framework improvements such as cursors
- Geospatial enhancements
- Index intersection with the ability to use more than one index to resolve a query
Can you share best practices you learned while scaling MongoDB? For best results, shard before you have to. Get a thorough understanding of your data structures and query patterns. This will help you select a shard key that best suits your applications. If you follow these simple rules, sharding in MongoDB is really simple. It’s automatic and transparent to the developer.
Scaling is of course much more than simply throwing hardware at the database cluster. So we got a lot of benefits from MongoDB tooling in optimizing our queries. During development, we used the MongoDB explain operator to ensure good index coverage. We also use the MongoDB Database Profiler to log all slow queries for further analysis and optimization.
For our analytics queries, we initially used MongoDB’s inbuilt MapReduce, but have since moved to the aggregation framework, which is faster and simpler.
Are you using any tools to monitor, manage and backup your MongoDB deployment? We rely heavily on the MongoDB Management Service application for proactive monitoring of our database cluster. Through MMS alerting we identified a potential issue with replication and were able to rectify it before it caused an outage.
For backups, we currently use mongodump, but are evaluating MMS Backup as this has the potential to extend our disaster recovery capabilities.
For overall performance monitoring of our application stack, we use New Relic which is implemented in the drivers we use.
What business advantage is MongoDB delivering?
As a startup, time to market is key. We could not have got to market as quickly with other databases. MongoDB’s flexible document model and dynamic schema have been essential not only in launching the original service, but now as we evolve our products. Requirements change quickly and we are always adding new features. MongoDB enables us to do that.
As we add more products and features, we add new customers. We need the ability to scale our infrastructure fast. Again MongoDB provides that scalability and operational simplicity we need to focus on the business, rather than the database.
What advice would you give someone who is considering using MongoDB for their next project?
We came from a relational database background and were surprised how easy it was for us in development and ops to transfer that knowledge to MongoDB. That helps us get up and running quickly.
MongoDB schema design is new concept and requires a change in thinking - from a normalized model that packs data into rows and columns across multiple tables to a document model that allows embedding of related data into a single object. Developers need to move on from focusing on how data is stored, to how it is queried by the application. You need to identify your queries and build your schema from there.
The good news is that there is a wealth of documentation online. The MongoDB blog is a great resource to learn best practices from the community. An example is the awesome post on MongoDB schema design for time series data - this will help anyone managing this type of data in IoT applications.
The MongoDB University provides free self-paced training for developers (in multiple languages), administrators and operations staff. There are also some really useful tutorials covering every step of MongoDB replication and sharding.
Our recommendation would be to perform due diligence during your research - ensure you understand your requirements, then download the software and get started in your evaluation.
Mike and Bernhard - I’d like to thank you for taking the time to share your experiences with us!