Making Diabetes Data More Accessible and Meaningful with Tidepool and MongoDB
The data behind diabetes management can be overwhelming, but understanding it all is empowering. Tidepool turns diabetes data points into accessible, actionable, and meaningful insights using an open source tech stack that incorporates MongoDB. Tidepool is a nonprofit organization founded by people with diabetes, caregivers, and leading healthcare providers committed to helping all people with insulin-dependent diabetes safely achieve great outcomes through more accessible, actionable, and meaningful diabetes data.
Tidepool is committed to empowering the next generation of innovations in diabetes management, and harnesses the power of technology to provide intuitive software products that help people with diabetes.
In this episode of the MongoDB Podcast, Michael and Nic sit down with Tapani Otala, V.P. of Engineering at Tidepool, to talk about their platform, how it was built, and how it uses MongoDB to provide unparalleled flexibility and visibility into the critical data that patients use to manage their condition.
Tapani: [00:00:00] Hi, my name is Tapani Otala. I'm the VP of engineering at Tidepool. We are a nonprofit organization whose mission is to make diabetes data more accessible, meaningful, and actionable. The software we develop is designed to integrate [00:01:00] data from various diabetes devices like insulin pumps, continuous glucose monitors, and blood glucose meters into a single intuitive interface that allows people with diabetes and their care team to make sense of that data. And we're using MongoDB to power all this. Stay tuned for more.
Chris: [00:00:47] My name is Christopher Snyder. I've been living with type one diabetes since 2002. I'm also Tidepool's community and clinic success manager. Having this data available to me just gives me the opportunity to make sense of everything that's happening. Prior to using Tidepool, if I wanted to look at my data, I either had to write everything down and keep track of all those notes, or I'd use proprietary software for each of my devices and then potentially print things out and hold them up to the light to align events and data points and things like that. Because Tidepool brings everything together in one place (I am biased, but I think it looks real pretty), it makes it a lot easier for me to identify trends, make meaningful changes in my diabetes management habits, and hopefully lead a healthier life.
Mike: [00:01:28] So we're talking today about Tidepool and maybe you could give us a quick description of what Tidepool is and who it may appeal to?
Tapani: [00:01:38] We're a nonprofit organization. And we're developing software that helps people with diabetes manage that condition. We enable people to upload data from their devices, different types of devices, like glucose monitors, meters, insulin pumps, and so on into a single place where you can view that data in one place. And you can share it with your care team members like doctors, clinicians, or [00:02:00] your family members. They can view that data in real time as well.
Mike: [00:02:03] Are there many companies that are doing this type of thing today?
Tapani: [00:02:06] There are a few companies. As far as I'm aware, we're the only nonprofit in this space, though. Everything else is for profit. And there are a lot of companies that look at diabetes from different perspectives. They might work with type two diabetes or type one. We work with any kind. There's no difference.
Nic: [00:02:24] In regards to Tidepool, are you building hardware as well as software? Or are you just looking at data? Can you shed some more light on that?
Tapani: [00:02:33] Sure. We're a hundred percent software company. We don't make any hardware. We do work with lots of great manufacturers that make those devices, in the medical space in general but in diabetes in particular. And so we collaborate with them.
Mike: [00:02:48] So what stage is Tidepool in today? Are you live?
Tapani: [00:02:50] Yeah, we've been live since 2013 and we've grown a fair bit since. We're now at 33 or so people, but still, I guess you could consider us a [00:03:00] startup, in substance.
Nic: [00:03:01] I'd actually like to dig deeper into the software that Tidepool produces. So you said that there are many great hardware manufacturers working in this space. How are you obtaining that data? Are you like a mobile application connecting to the hardware? Are you some kind of IoT or are they sending you that information and you're working with it at that point?
Tapani: [00:03:22] So it really depends on the device and the integration that we have. For most devices, we talk directly to the device. So these are devices that you would use at your home and you connect them to a PC over Bluetooth or USB, or to your phone for that matter. And we have software that can read the data directly from the device and upload it to our backend service that's using MongoDB to store that data.
Mike: [00:03:43] Is there a common format that is required in order to send data to Tidepool?
Tapani: [00:03:49] We wish. That would make our life a whole lot simpler. No, actually a good chunk of the work that's involved here is writing software that knows how to talk to each individual device. And there are some [00:04:00] families of devices that use similar protocols and so on, but no, there's really no universal protocol for talking to the devices, or for the format of the data that comes from the devices for that matter. So a lot of the work goes into normalizing that data so that when it is stored in our backend, it's then visible and viewable by people.
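To make the normalization step Tapani describes more concrete, here is a minimal Python sketch. It is not Tidepool's code: the two device payload formats, the field names, and the `GlucoseReading` shape are all assumptions invented for illustration. The only fixed fact is the glucose unit conversion (1 mmol/L is about 18.016 mg/dL).

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Glucose has a molar mass of about 180.16 g/mol, so 1 mmol/L = 18.016 mg/dL.
MGDL_PER_MMOLL = 18.016


@dataclass
class GlucoseReading:
    """Hypothetical common shape that every device-specific parser produces."""
    device_id: str
    time: datetime        # always timezone-aware UTC
    value_mmol_l: float   # always mmol/L, regardless of the device's native unit


def from_meter_a(raw: dict) -> GlucoseReading:
    # Hypothetical meter "A": reports mg/dL and a Unix timestamp in seconds.
    return GlucoseReading(
        device_id=raw["serial"],
        time=datetime.fromtimestamp(raw["ts"], tz=timezone.utc),
        value_mmol_l=round(raw["mgdl"] / MGDL_PER_MMOLL, 1),
    )


def from_cgm_b(raw: dict) -> GlucoseReading:
    # Hypothetical CGM "B": reports mmol/L and an ISO 8601 timestamp string.
    return GlucoseReading(
        device_id=raw["id"],
        time=datetime.fromisoformat(raw["timestamp"]).astimezone(timezone.utc),
        value_mmol_l=float(raw["glucose"]),
    )


# Two very different payloads end up in the same shape.
readings = [
    from_meter_a({"serial": "A-001", "ts": 1609459200, "mgdl": 180}),
    from_cgm_b({"id": "B-042", "timestamp": "2021-01-01T00:05:00+00:00", "glucose": 7.2}),
]
```

Once every parser emits the same shape, the storage and display layers only need to understand one record type, which is the point of doing the normalization up front.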
Nic: [00:04:21] So we'll get to this in a second. It does sound like a perfect case for a kind of a document database, but in regards to supporting all of these other devices, I imagine that any single device over its lifetime might produce different kinds of data output across its versions. What kind of compatibility does Tidepool have with these devices? Do you, say, support only the latest version? Maybe you can shed some light on that, and on how many devices in general you're supporting.
Tapani: [00:04:50] Right now, we support over 50 different devices. And then by extension anything that Apple Health supports. So if you have a device that stores data in Apple [00:05:00] HealthKit, we can read that as well. But 50 devices directly. You can actually go to tidepool.org/devices, and you can see the full list there. You can filter it by different types of devices and manufacturers and so on. And some of those devices are actually obsolete at this point. They're end of life. You can't buy them anymore. So we support devices even long past the point when they've been sold. We try to keep up with the latest devices, but that's not always feasible.
Mike: [00:05:26] So this is like a health-oriented IoT application, right?
Tapani: [00:05:30] Yeah. In a way that's certainly true. The only difference here maybe is that those devices don't usually connect directly to the net. So they need an intermediary. In our case, we have a mobile application and a desktop application that talk to the device that's in your possession, but you can't reach the device directly over the internet.
Mike: And just so we can understand the scale, how many devices are reporting into Tidepool today?
Tapani: I don't actually know exactly how many devices there are. Those are discrete, different types of devices. [00:06:00] What I can say is that in our main production database, we're storing something approaching 6 billion documents at this point, across hundreds of thousands of users.
Nic: [00:06:11] Just for clarity, because the diabetes space and the different hardware that exists are not something I'm personally too familiar with. So say I'm a user of the hardware and it's reporting to Tidepool. Is Tidepool gonna alert you if there's some kind of low blood sugar level or does it serve a different purpose?
Tapani: [00:06:32] Both. And this is actually a picture that's changing. Right now, what we have out there in terms of products is backward-looking: it shows what happened in the past. You might be using these devices and uploading data a few times a day. But if you're using some of the newer devices like continuous glucose monitors, those record data every five minutes, so the upload frequency could be much higher. That's going to change going [00:07:00] forward as more and more people start using these continuous glucose monitors. An older device might be the classic fingerstick blood glucose meter, where you poke your finger, draw a little bit of blood, and measure it, and you might do that five to 10 times a day, versus 288 times if you have a continuous glucose monitor that sends data every five minutes. So it varies from device to device.
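The 288 figure follows directly from the five-minute sampling interval. A short Python check of the arithmetic (the function name is just for illustration):

```python
MINUTES_PER_DAY = 24 * 60  # 1440


def readings_per_day(interval_minutes: int) -> int:
    """Number of samples a device produces per day at a fixed interval."""
    return MINUTES_PER_DAY // interval_minutes


cgm_per_day = readings_per_day(5)  # one sample every five minutes
print(cgm_per_day)                 # 288
```

Against five to 10 fingerstick readings a day, that is roughly 29 to 58 times as many data points per person.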
Mike: [00:07:24] This is a fascinating space. I test myself on a regular basis as part of my diet, not necessarily for diabetes, but for ketosis, and that's an interesting concept to me. The continuous monitoring devices, though, that's something that you attach to your body, right?
Tapani: [00:07:39] Yeah. These are little devices about the size of a stack of quarters that sit somewhere on your skin, on an arm or leg or somewhere on your body. There's a little filament that goes into your skin, that does the actual measurements, but it's basically a little full.
Mike: [00:07:54] So thinking about the application itself and how you're leveraging MongoDB, do you want to talk a little bit about how the [00:08:00] application comes together and what the stack looks like?
Tapani: [00:08:01] Sure. So we're hosted in AWS, first of all. We have about 20 or so microservices in there, and they all communicate with MongoDB Atlas. That's implemented with security best practices in mind, because security and privacy are critically important for us. So we're using VPC peering from our microservices to MongoDB Atlas. And we're using a three-node replica set in MongoDB Atlas, so that there's no chance of losing any of that data.
Mike: [00:08:32] And in terms of the application itself, is it largely an API? I'm sure that there's a user interface for your application set, but what does the backend or the API look like in terms of the technology?
Tapani: [00:08:43] So, what people see in front of them is either a desktop application or a mobile application. That's the visible manifestation of it. Both of those communicate with our backend through a set of REST APIs for authentication, authorization, data upload, data retrieval, and so on. Those APIs then take that data and store it in our MongoDB production cluster. So the APIs vary from "give me a user profile" to "upload this pile of continuous glucose monitor samples."
Mike: [00:09:13] What is the API written in? What technologies are you using?
Tapani: [00:09:16] It's a mix of Node.js and Golang. I would say 80% Golang and 20% Node.js.
Nic: [00:09:23] I'm interested in why Golang for this type of application. I wouldn't have thought it as a typical use case. So are you able to shed any light on that?
Tapani: [00:09:32] The decision to switch to Golang, and so this growing set of Go services, happened before my time. I would say it's pretty well suited for this particular application. The backend service is fundamentally a set of APIs that have no real user-visible manifestation themselves. We do have a web front end to all this as well, and that's written in React and so on, but Golang has proven to be a very good language for developing services that respond to API requests, because really all they do is take a bunch of inputs from the caller, translate them, apply business policy and so on, and then store the data in Mongo. So it's a good way to do it.
Nic: [00:10:16] Awesome. So we know that you're using Go and Node for your APIs, and we know that you're using MongoDB as your data layer. What features in particular are you using with MongoDB?
Tapani: [00:10:26] So right now, as I mentioned, we're running a three-node replica set. We don't yet use sharding, but that's actually the next big thing that we'll be tackling in the near future, because the set of data that we have is growing fairly fast and it will be growing even faster in the future with a new product coming out. So sharding will be the next one. We do a lot of aggregate queries across several different collections. So some fairly complicated queries. And as I mentioned, that largest collection is fairly large. So performance becomes critical. Having the right indices in place and being able to look up all the right data is critical.
Nic: [00:11:07] You mentioned aggregations across numerous collections at a high level. Are you able to talk us through what exactly you're aggregating to give us an idea of a use case?
Tapani: [00:11:16] Yeah. Sure. In fact, the one thing I should've mentioned earlier perhaps is that besides being a nonprofit, we're also open source. So everything we do is actually visible on GitHub in our open-source repo. So if anybody's interested in the details, they're welcome to take a look in there. But in the broader sense, we have a user collection where all the user account profiles are stored. We have a data collection, or device data collection, rather. That's where all the data from diabetes devices goes. There are other collections for things like messages that we send to the users: emails, basically, invitations to join an account and so on, and confirmations of those. So different collections for different use cases. Broadly speaking, there's one collection for each use case, like user profiles or messages, notifications, device data.
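As a rough illustration of what an aggregate query across a device data collection and a user collection can look like, here is a Python sketch that builds a MongoDB aggregation pipeline as plain data. The collection name `users`, the field names (`userId`, `type`, `time`, `value`), and the `"cbg"` type tag are assumptions made for this example, not Tidepool's actual schema; the stage operators (`$match`, `$group`, `$lookup`, `$sort`) are standard MongoDB aggregation stages.

```python
from datetime import datetime, timezone


def daily_average_pipeline(user_id: str, since: datetime) -> list:
    """Build a pipeline: daily glucose averages for one user, joined to the profile.

    All collection and field names here are hypothetical.
    """
    return [
        # 1. Narrow down early so an index on (userId, type, time) can be used.
        {"$match": {"userId": user_id, "type": "cbg", "time": {"$gte": since}}},
        # 2. Collapse the five-minute samples into one document per user per day.
        {"$group": {
            "_id": {
                "userId": "$userId",
                "day": {"$dateToString": {"format": "%Y-%m-%d", "date": "$time"}},
            },
            "avgValue": {"$avg": "$value"},
            "samples": {"$sum": 1},
        }},
        # 3. Pull in the matching profile from a second collection.
        {"$lookup": {
            "from": "users",
            "localField": "_id.userId",
            "foreignField": "userId",
            "as": "profile",
        }},
        # 4. Return the days in chronological order.
        {"$sort": {"_id.day": 1}},
    ]


pipeline = daily_average_pipeline("u1", datetime(2021, 1, 1, tzinfo=timezone.utc))
stage_names = [next(iter(stage)) for stage in pipeline]
```

With a driver such as PyMongo, a list like this would be passed to the device data collection's `aggregate()` method. Putting `$match` first matters at the scale mentioned in the interview, since it lets the query use an index before any grouping or joining happens.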
Mike: [00:12:03] And I'm thinking about the schema and the aggregations across multiple collections. Can you share what that schema looks like? And maybe even just the number of collections that you're storing.
Tapani: [00:12:12] Sure. The number of collections is actually relatively small. It's only a half a dozen or so, and the schema is pretty straightforward for most of them. Take the user profiles: there are only so many things you store in a user profile. But the device data collection is perhaps the most complex, because it stores data from all the devices, regardless of type. So the data that comes out of a continuous glucose monitor is different than the data that comes from an insulin pump, for instance. So there are different fields. There are different units that we're dealing with and so on.
Mike: [00:12:44] Okay, so Tapani, what other features within the Atlas platform are you leveraging today? And have you possibly looked at automated scalability as a solution moving forward?
Tapani: [00:12:55] So our use of MongoDB Atlas right now is pretty straightforward but intensive. So a lot of data in the different collections, and indices and aggregate queries that are used to manage that data and so on. The things that we're looking forward to in the future are things like sharding, because of the scale of data that's growing. Other things are a data lake, for instance, for archiving some of the data. Currently our production database stores all the data from 2013 onwards. And really the value of that data beyond the past few months to a few years is not that high. So we'd want to archive it. We can't lose it because it's important data, but we do want to archive it and move it someplace else. So that, and bucketizing the data in more effective ways, so it's faster to access by different stakeholders in the company.
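The two ideas Tapani mentions here, archiving older data and bucketizing readings, can be sketched in a few lines of Python. This is only an illustration under assumed names: the document fields (`userId`, `time`, `value`), the two-year retention window, and the one-hour bucket size are all hypothetical choices, not Tidepool's or Atlas's actual behavior.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def split_for_archive(docs, now, keep=timedelta(days=730)):
    """Partition documents into (hot, cold) around an assumed retention cutoff."""
    cutoff = now - keep
    hot = [d for d in docs if d["time"] >= cutoff]
    cold = [d for d in docs if d["time"] < cutoff]  # candidates for cheaper storage
    return hot, cold


def bucket_by_hour(docs):
    """Group individual readings into one bucket per user per hour."""
    buckets = defaultdict(list)
    for d in docs:
        hour = d["time"].replace(minute=0, second=0, microsecond=0)
        buckets[(d["userId"], hour)].append(d["value"])
    return dict(buckets)


utc = timezone.utc
docs = [
    {"userId": "u1", "time": datetime(2014, 3, 1, 8, 0, tzinfo=utc), "value": 6.1},
    {"userId": "u1", "time": datetime(2021, 5, 31, 8, 0, tzinfo=utc), "value": 5.4},
    {"userId": "u1", "time": datetime(2021, 5, 31, 8, 5, tzinfo=utc), "value": 5.6},
]

hot, cold = split_for_archive(docs, now=datetime(2021, 6, 1, tzinfo=utc))
buckets = bucket_by_hour(hot)
```

Bucketing trades many small documents for fewer, larger ones: a continuous glucose monitor sampling every five minutes would produce 12 values per hourly bucket instead of 12 separate documents.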
Mike: [00:13:43] So there are some really compelling features available today around online archiving. I think we can definitely help out there. And coming down the pike, we've got some really exciting stuff happening in the time series space. We'll be talking more about that at our .live conference in July. So stay tuned for that.
Nic: [00:14:04] Hey Mike, how about you give a plug for that conference right now?
Mike: [00:14:06] Yeah, sure. It's our biggest user conference of the year. And we get together, thousands of developers join us and we present all of the feature updates. We're going to be talking about MongoDB 5.0, which is the latest upcoming release and some really super exciting announcements there. There's a lot of breaks and brain breaking activities and just a great way to get plugged into the MongoDB community. You can get more information at mongodb.com/live. So Tapani, thanks so much for sharing the details of how you're leveraging MongoDB. As we touched on earlier, this is an application where users are going to be sharing very sensitive details about their health. Do you want to talk a little bit about the security?
Tapani: [00:14:49] Sure. Yeah, it's actually a critically important piece for us. So first of all, for those APIs that we talked about earlier, all the traffic is encrypted in transit. There's no unauthorized or unauthenticated access to any of the data or APIs. In MongoDB Atlas, what we're obviously leveraging is encryption at rest. So all the data that's stored by MongoDB is encrypted. We're using VPC peering between our services and MongoDB Atlas, to make sure that traffic is even more secure. And yeah, privacy and security of the data is a key thing for us, because this is all what Health and Human Services calls protected health information, or PHI. That's the sort of highest level of private information you could possibly have.
Nic: [00:15:30] So in regards to the information being sent, we know that the information is being encrypted at rest. Are you collecting data that could be sensitive, like social security numbers and things like that that might need to be encrypted at a field level to prevent prying eyes of DBAs and similar?
Tapani: [00:15:45] We do not collect any social security information or anything like that. That's purely healthcare data. Um, diabetes device data, and so on. No credit cards. No SSNs.
Nic: [00:15:56] Got it. So nothing that could technically tie the information back to an individual or be used in a malicious way?
Tapani: [00:16:02] Not in that way, no. I mean, I think it's fair to say that this is obviously people's healthcare information, so that is sensitive regardless of whether it could be used maliciously or not.
Mike: [00:16:13] Makes sense. Okay. So I'm wondering if you want to talk a little bit about what's next for Tidepool. You did make a brief mention of another application that you'll be launching. Maybe talk a little bit about the roadmap.
Tapani: [00:16:25] Sure. Besides the existing products, we're working on a new product that's called Tidepool Loop, and that's an effort to build an automatic insulin dosing system. This takes a more proactive role in the treatment of diabetes. Existing products show data that you already have. This is actually helping you administer insulin. And so it's a smartphone application that's currently under FDA review. We are working with a couple of great partners in the medical device space to launch that with them, with their products.
Mike: [00:16:55] Well, I love the open nature of Tidepool. It seems like everything you're doing is kind of out in the open. From open source to full disclosure on the architecture stack. That's something that I can really appreciate as a developer. I love the ability to kind of dig a little deeper and see how things work. Is there anything else that you'd like to cover from an organizational perspective? Any other details you wanna share?
Tapani: [00:17:16] Sure. I mean, you mentioned the transparency and openness. We practice what some people might call radical transparency. Not only is our software open source. It's in GitHub. Anybody can take a look at it. Our JIRA boards for bugs and so on are also open, visible to anybody. Our interactions with the FDA, our meeting minutes, filings, and so on. We also make those available. Our employee handbook is open. We actually forked another company's employee handbook and made ours open as well, in the hopes that people can benefit from that. Ultimately, why we do this is we hope that we can help improve public health by making as much as possible publicly available. And as far as the open source projects go, we have several people out there who are making open source contributions or pull requests and so on. Now, because we do operate in the healthcare space, we have to review those submissions pretty carefully before we integrate them into the product. But yeah, we do take pull requests from people. We've gotten community submissions, for instance, translations of our products to Spanish and German and French. But we have to verify those before we can roll them out.
Mike: [00:18:25] Well, this has been a great discussion. Is there anything else that you'd like to share with the audience before we begin to wrap up?
Tapani: [00:18:29] Oh, a couple of things in closing. One is, first of all, we're a hundred percent remote-first and globally distributed organization. We have people in five countries and 14 states within the US right now. We're always hiring in some form or another. So if anybody's interested, they're welcome to take a look at our job postings at tidepool.org/jobs. The other thing is, as a nonprofit, we also gratefully accept donations. So there's another link there to donate. And if anybody's interested in the technical details of how we actually built this all, there are a couple of links that I can throw out there. One is tidepool.org/pubsecc, that'll be secc. That's our security white paper, basically a whole lot of information about the architecture and infrastructure and security and so on. We also publish a series of blog postings at tidepool.org/blog, where the engineering team has put out a couple of things about our infrastructure. We went through some pretty significant upgrades over the past couple of years. And then finally github.com/tidepool is where all our sources are.
Nic: [00:19:30] Awesome. And you mentioned that you're a remote company and that you were looking for candidates. Are these candidates global, or strictly in the US? Does it matter?
Tapani: [00:19:39] So we hire anywhere people are, and they work from wherever they are. We don't require relocation. We don't require a visa in the sense that you'd have to come to the US, for instance, to work. We have people in five countries right now: the US, Canada, the UK, Bulgaria, and Croatia.
Mike: [00:19:55] Well, Tapani, I want to thank you so much for joining us today. I really enjoyed the conversation.
Tapani: [00:19:58] Thanks as well. Really enjoyed it.