ALERT: MMS Public IP Addresses are Changing
As a result of a network migration from AWS Classic networking to AWS VPC networking, the public-facing IP addresses of MongoDB’s services, including both JIRA (jira.mongodb.org) and MMS (mms.mongodb.com and api-backup.mongodb.com), will be changing on Thursday, January 29, 2015 (16:00 UTC / 11:00 EST).
The current IP addresses and new IP addresses are documented (along with a procedure for making a seamless change, if you have them hard-coded anywhere) here.
This will only impact customers that:
* are using DNS TTL overrides,
* have our IP addresses hard-coded into one or more of their systems, or
* have firewalls controlling outbound access to those IP addresses.
Workarounds

Any hard-coded IP address will need to be changed on Thursday, January 29, 2015 (16:00 UTC / 11:00 EST), or access to our systems will be degraded once the IP address change has been made. Please also note that the old IPs will be decommissioned shortly afterwards, at which point access will be lost completely until your systems are updated.
Firewalls can add both sets of IP addresses in advance, and then remove the old IP addresses once the change has been made. DNS servers should be adjusted to honor the TTL returned during this process, or will need their caches manually cleared once the change has been made. Hard-coded IP addresses in applications should be switched to use DNS or will need to be manually adjusted post-change.
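As a minimal sketch of how a team might audit hard-coded addresses before and after the change, the snippet below compares a pinned list against what DNS currently returns. The function names and the idea of passing in a `resolver` are illustrative assumptions, not part of any MongoDB tooling, and the real old and new addresses must come from the published list.

```python
import socket
from typing import Callable, Iterable, Set


def resolve_ipv4(hostname: str) -> Set[str]:
    """Return the IPv4 addresses that DNS currently reports for hostname."""
    infos = socket.getaddrinfo(
        hostname, 443, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for AF_INET, sockaddr is an (address, port) pair.
    return {info[4][0] for info in infos}


def stale_addresses(
    pinned: Iterable[str],
    hostname: str,
    resolver: Callable[[str], Set[str]] = resolve_ipv4,
) -> Set[str]:
    """Return the pinned addresses that DNS no longer reports for hostname."""
    return set(pinned) - resolver(hostname)
```

Calling `stale_addresses(pinned_ips, "mms.mongodb.com")` after the cutover would list any pinned entries that should be replaced. The `resolver` parameter exists only so the check can be exercised without network access.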
If you have any questions about this change or need help implementing any workarounds on your systems, feel free to file a ticket in JIRA prior to the change date and we will work with you to make sure that your team can make the appropriate updates on your side.
Leaf in the Wild: Qihoo Scales with MongoDB
Leaf in the Wild posts highlight real world MongoDB deployments. Read other stories about how companies are using MongoDB for their mission-critical projects.

100+ apps, 1,500+ Instances, 20B Queries per Day

Qihoo is China’s number 1 Android mobile distribution platform. Qihoo is also China’s top malware protection company, providing products for both web and mobile platforms. A MongoDB user since 2011, Qihoo has built over 100 different applications on MongoDB – including new services and migrations from MySQL and Redis – running on 1,500+ instances and supporting 20 billion queries per day.

I had the chance to sit down with Yang Yan Jie, the Senior DBA at Qihoo, to learn more about how and why they use MongoDB, his scaling best practices, and recommendations for those getting started with the database.

Can you start by telling us about Qihoo?

Qihoo 360 Technology Co. Ltd. is a leading Chinese Internet company. At the end of June 2014, we had around 500 million monthly active PC Internet users and over 640 million mobile users. Recognizing malware protection as a fundamental need of all Internet and mobile users, we built our large user base by offering comprehensive, effective and user-friendly Internet and mobile security products and services to protect users' computers and mobile devices against malware and malicious websites. Our products and services are supported by our cloud-based security technology, which we believe is one of the most advanced and robust technologies in the malware protection industry. We monetize our user base primarily through online advertising and Internet value-added services.

In terms of our market position, we are:
* A top three Internet company as measured by user base in China
* No. 1 Android-based mobile distribution platform in China
* No. 1 provider of Internet and mobile malware protection products and services in China
* No. 2 PC search engine in China

When did Qihoo start using MongoDB?
We were a very early adopter of MongoDB, building our first applications on the database back in 2011. I think we were using version 1.8 then!

How is Qihoo using MongoDB today?

MongoDB has become our standard modern database platform. We now have over 100 applications powered by MongoDB – both external customer-facing services and internal business applications. In total we have more than 1,500 MongoDB instances running on our in-house built “HULK” cloud platform, collectively serving 20 billion queries per day.

Three particularly critical applications for our business are:

* Location-based mobile search application. We use MongoDB with its geospatial indexes and queries to deliver geo-aware search results to mobile users. The user can be searching for anything, from a local restaurant, to a shop, to a car dealership. The app will detect their location and serve search results based on proximity. MongoDB handles 1.2 billion queries per day from this application.
* Caching layer for user authentication data. Qihoo is a central portal for many Chinese Internet users. We have many partners that our users can connect to directly after logging into our site. We provide Single Sign On (SSO) to multiple services so users don’t need to keep providing their security credentials as they navigate around the web. The user’s SSO session is cached in MongoDB for ultra-fast access. MongoDB supports millions of concurrent users, handling 30,000 operations per second and 1.8 billion queries daily.
* Log analytics platform. We need to know our infrastructure is running well. Our internal business users also want to measure user engagement with new promotions and campaigns. To accomplish this, we collect log data from all of our Linux, Apache web server and Tomcat servers, and stream it directly into MongoDB. From there, our internal business users can generate real time analytics and reports using our PHP-based Business Intelligence (BI) platform.
MongoDB stores 2.5 billion documents at any one time across 18 shards configured with 3-node replica sets for always-on availability. MongoDB serves nearly 3 billion queries per day, including 1 billion writes.

What other databases do you use?

MongoDB is one of the three database technologies used in our company. It isn’t necessarily suitable for all applications, so we also use MySQL for relational data problems and Redis for certain caching use-cases. Over time, we have migrated more than a dozen projects from MySQL and Redis to MongoDB.

What factors drove this migration?

Our goal is to use the best technology where it best fits.

In the case of MySQL, migration was driven by scalability and developer productivity. As a relational database, MySQL does not scale out, so as our user base grew above 100 million active users, we hit the limits of how far we could push MySQL. MongoDB auto-sharding allows us to scale on-demand using commodity hardware. The MongoDB data model is also far more flexible. Our developers can get more done and iterate faster with MongoDB than they can with the relational model.

In the case of Redis, the migrations were driven by cost and flexibility. We found that MongoDB meets our low latency caching requirements for many applications, while its on-disk persistence reduces the need to provision costly systems configured with high-memory footprints. In addition, there is much more you can do with MongoDB’s document data model than you can with Redis’ Key-Value model. This translates directly to richer application functionality. For applications where data volumes are expected to grow rapidly, we choose MongoDB over Redis.

Tell us about the platforms you are running MongoDB on.

Most of our applications are PHP based. We run CentOS on x86 hardware. We have standardized on local SSD storage as this gives us the best performance. We are running MongoDB 2.4 and the latest 2.6 releases. We are also looking forward to MongoDB 3.0!
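To illustrate the kind of proximity query behind the location-based search application mentioned earlier, here is a minimal sketch that builds a MongoDB `$near` filter. The field name `location`, the choice of a `2dsphere` index, and the helper name are assumptions made for the example; the interview does not say how Qihoo models its location data.

```python
from typing import Any, Dict, List, Tuple

# Index specification in the (field, type) form accepted by pymongo's
# Collection.create_index. The field name "location" is hypothetical.
LOCATION_INDEX: List[Tuple[str, str]] = [("location", "2dsphere")]


def nearby_query(longitude: float, latitude: float, max_meters: float) -> Dict[str, Any]:
    """Build a filter that matches documents near a point, nearest first."""
    if not (-180 <= longitude <= 180 and -90 <= latitude <= 90):
        raise ValueError("coordinates out of range")
    return {
        "location": {
            "$near": {
                # GeoJSON points list longitude first, then latitude.
                "$geometry": {"type": "Point", "coordinates": [longitude, latitude]},
                # With a GeoJSON point, the distance is given in meters.
                "$maxDistance": max_meters,
            }
        }
    }
```

With a pymongo collection, one would typically call `collection.create_index(LOCATION_INDEX)` once and then pass `nearby_query(116.40, 39.90, 2000)` to `collection.find()`. The coordinates here are only an example point in Beijing.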
How is MongoDB configured?

We run both single replica sets and sharded clusters, depending on the application. We have data centers across the country, with the main ones located in Beijing. We deploy MongoDB on our private cloud across multiple data centers, both for disaster recovery and for low latency local reads and writes. We don’t control our own fiber, so network quality is out of our control. For the most critical apps, we spin up identical MongoDB clusters in multiple data centers and use our own message queue to replicate between them – this gives us assurance of maintaining availability in the face of network partitions.

How do you manage your MongoDB deployment?

We have developed a centralized orchestration web platform, which we call the HULK cloud platform. It is used by nearly all of our technical engineers to control our mission critical infrastructure and services. It is a complex piece of engineering which we are very proud of.

When we originally started the cloud platform project, we hoped it would allow our engineers to stand on the shoulders of giants, relying on the platform to speed up the time to market for their applications. Hence we named it “HULK”. HULK currently provides elastic services such as Web, relational database, NoSQL and distributed storage. At the same time, the open platform concept attracted various internal teams to move their applications onto the platform. The re-platforming of these applications provided immediate access to other LoBs internally, and in the process of doing that we helped the business groups to attain higher efficiency and greater technology expertise.

MongoDB is one of the most critical services on HULK and it is fully integrated into the platform with a high degree of automation, allowing us to operate more than 1,500 MongoDB instances with just one and a half DBAs. The DBAs can perform “one click deployment” and “one click upgrade” tasks via the HULK management interface.
All backup and monitoring is fully automated. For instance, if you add a new MongoDB node or cluster, HULK automatically configures the monitoring and backup strategy, and deploys the necessary agents. Developers can monitor a multitude of MongoDB metrics and status. In addition, they can open a ticket right on the management portal itself, instead of using email or IM, all with a few mouse clicks.

How do you back up MongoDB?

We use a combination of approaches, governed by the application’s RPO and RTO objectives:

* Filesystem backups. This is the default approach. We shut down a secondary replica set member and snapshot the filesystem image.
* Incremental replication. For continuous backup, we have built a tool that tails the MongoDB oplog. We use this approach for more critical apps where we need faster restoration of service.
* Delayed replicas. We use this approach for additional assurances, again governed by how quickly we need to bring the data back.

Can you share any best practices on scaling your MongoDB infrastructure?

There are three tips I would like to share:

* From a DBA perspective, invest time to understand application usage. The developers will give their guidance, but we generally take any number they give us and add 50%!
* If you encounter performance issues, start with your hardware. We found upgrading from hard disks to SSDs gave us an instant performance boost without any other optimizations.
* For highly dynamic, write-intensive workloads, make sure you monitor storage fragmentation and compact regularly if needed.

Are you measuring the impact of MongoDB on your business?

Yes – in terms of time to market. An example of the impact this makes is our reaction to the 2014 earthquake in Yunnan province. Everyone in China wanted to have access to the latest updates and to be able to check in on friends and family in the region.
The business felt the best way to do this was to build an app that verified and then consolidated newsfeeds from multiple sources. We designed the app in the morning after the earthquake, coded it in the afternoon and launched it in the evening. One business day from concept to production. Only MongoDB could support that velocity of development.

Are you looking forward to MongoDB 3.0?

We started testing MongoDB 3.0 and filing bugs as soon as we could get our hands on the first Release Candidate. We are especially excited about document level concurrency control. This will further improve write scaling and fully saturate the latest generation of dense multi-core systems we are using now.

Compression is also a huge benefit for us. We have standardized on SSDs, so compression means we can pack more onto each drive, which will bring costs down. It will also give us another performance boost as fewer bits are read from disk, making better use of disk I/O cycles.

What advice would you give to those considering using MongoDB for their next project?

MongoDB’s document data model and dynamic schema bring great flexibility and power. But they also bring great responsibility! I’d recommend not storing multitudes of different document types and formats within a single collection as it makes ongoing application maintenance complex. Split out documents of different types and structures into their own collections. We have implemented tools that scan and sample documents from each collection. If variances in structure exceed our best practices, we alert the devs so they can go and address the issue. So that is where I’d start.

Mr. Yang – I’d like to thank you for taking the time to share your insights with the MongoDB community.

Struggling to scale your relational database?
Download our Migration White Paper.

About the Author - Mat Keep

Mat is part of the MongoDB product marketing team, responsible for building the vision, positioning and content for MongoDB’s products and services, including the analysis of market trends and customer requirements. Prior to MongoDB, Mat was director of product management at Oracle Corp. with responsibility for the MySQL database in web, telecoms, cloud and big data workloads. This followed a series of sales, business development and analyst / programmer positions with both technology vendors and end-user companies.
How Edenlab Built a High-Load, Low-Code FHIR Server to Deliver Healthcare for 40 Million Plus Patients
The Kodjin FHIR server has speed and scale in its DNA. Edenlab, the Ukrainian company behind Kodjin, built our original FHIR solution to digitize and service the entire Ukrainian national health system. The learnings and technologies from that project informed our development of the Kodjin FHIR server.

“At Edenlab, we have always been driven by our passion for building solutions that excel in speed and scale. With Kodjin, we have embraced a modern tech stack to deliver unparalleled performance that can handle the demands of large-scale healthcare systems, providing efficient data management and seamless interoperability.”
Eugene Yesakov, Solution Architect, Author of Kodjin

Built for speed and scale

While most healthcare projects involve handling large volumes of data, including patient records, medical images, and sensor data, the Kodjin FHIR server is based on a system developed to handle tens of millions of patient records and thousands of requests per second, to ensure timely access and efficient decision-making for a population of over 40 million people. And all of this information had to be processed and exchanged in real-time or near real-time, without delays or bottlenecks.

This article will explore some of the architectural decisions the Edenlab team took when building Kodjin, specifically the role MongoDB played in enhancing performance and ensuring scalability. We will examine the benefits of leveraging MongoDB's scalability, flexibility, and robust querying capabilities, as well as its ability to handle the increasing velocity and volume of healthcare data without compromising performance.

About Kodjin FHIR server

Kodjin is an ONC-certified and HIPAA-compliant FHIR Server that offers hassle-free healthcare data management. It has been designed to meet the growing demands of healthcare projects, allowing for the efficient handling of increasing data volumes and concurrent requests.
Its architecture, built on a horizontally scalable microservices approach, utilizes cutting-edge technologies such as the Rust programming language, MongoDB, ElasticSearch, Kafka, and Kubernetes. These technologies enable Kodjin to provide users with a low-code approach while harnessing the full potential of the FHIR specification.

A deeper dive into the architecture approach - the role of MongoDB in Kodjin

When deciding on the technology stack for the Kodjin FHIR Server, the Edenlab team knew that a document database would be required to serve as a transactional data store. In a FHIR server, a transactional data store ensures that data operations occur in an atomic and consistent manner, preserving the integrity and reliability of the data. Document databases are well-suited for this purpose as they provide a flexible schema and allow for storing complex data structures, such as those found in FHIR data.

FHIR resources are represented in a hierarchical structure and can be quite intricate, with nested elements and relationships. Document databases, like MongoDB, excel at handling such complex and hierarchical data structures, making them an ideal choice for storing FHIR data.

In addition to supporting document storage, the Edenlab team needed the chosen database to provide transactional capabilities for FHIR data operations. FHIR transactions, which encompass a set of related data operations that should either succeed or fail as a whole, are essential for maintaining data consistency and integrity. They can also be used to roll back changes if any part of the transaction fails. MongoDB provides support for multi-document transactions, enabling atomic operations across multiple documents within a single transaction. This aligns well with the transactional requirements of FHIR data and ensures data consistency in Kodjin.
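The all-or-nothing behaviour described above can be sketched with the session and transaction API that pymongo exposes. This is an illustration only: Kodjin itself is written in Rust, and the function name and the one-collection-per-resource-type layout are assumptions made for the example, not details from Edenlab.

```python
from typing import Any, Iterable, Mapping


def store_resources_atomically(
    client: Any, db_name: str, resources: Iterable[Mapping[str, Any]]
) -> int:
    """Insert every resource in one transaction, so all succeed or none do.

    `client` is expected to be a pymongo.MongoClient connected to a replica
    set or sharded cluster; standalone servers do not support transactions.
    """
    written = 0
    with client.start_session() as session:
        # The transaction commits when the block exits normally and is
        # aborted if the block raises an exception.
        with session.start_transaction():
            for resource in resources:
                # Hypothetical layout: one collection per FHIR resource type.
                collection = client[db_name][resource["resourceType"]]
                collection.insert_one(dict(resource), session=session)
                written += 1
    return written
```

Passing `session=session` to each write is what ties the operation to the transaction. A write issued without it would run outside the transaction and would not be rolled back.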
Implementation of GridFS as a storage for the terminologies in Terminology service

Terminology service plays a vital role in FHIR projects, requiring a reliable and efficient storage solution for terminologies used. Kodjin employs GridFS, a file system within MongoDB designed for storing large files, which makes it ideal to handle terminologies. GridFS offers a convenient way to store and manage terminology files, ensuring easy accessibility and seamless integration within the FHIR ecosystem. By utilizing MongoDB's GridFS, Kodjin ensures efficient storage and retrieval of terminologies, enhancing the overall functionality of the terminology service.

Kodjin FHIR server performance

To evaluate the efficiency and responsiveness of the Kodjin FHIR server in various scenarios, we conducted multiple performance tests using Locust, an open-source load testing tool. One of the performance metrics measured was the retrieval of resources by their unique ids using the GET by ID operation. Kodjin with MongoDB achieved a performance of 1721.8 requests per second (RPS) for this operation. This indicates that the server can efficiently retrieve specific resources, enabling quick access to desired data.

The search operation, which involves querying ElasticSearch to obtain the ids of the searched resources and retrieving them from MongoDB, exhibited a performance of 1896.4 RPS. This highlights the effectiveness of polyglot persistence in Kodjin, leveraging ElasticSearch for fast and efficient search queries and MongoDB for resource retrieval. The system demonstrated its ability to process search queries and retrieve relevant results promptly.

In terms of resource creation, Kodjin with MongoDB showed a performance of 1405.6 RPS for POST resource operations. This signifies that the system can effectively handle numerous resource-creation requests. The efficient processing and insertion of new resources into the MongoDB database ensure seamless data persistence and scalability.
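The two-step search flow described above (ids from the search index, documents from the document store) can be sketched as follows. The callables `search_ids` and `fetch_by_ids` are hypothetical stand-ins for an ElasticSearch query and a MongoDB lookup; this is not Kodjin's implementation.

```python
from typing import Any, Callable, Dict, List

Resource = Dict[str, Any]


def search_resources(
    search_ids: Callable[[str], List[str]],
    fetch_by_ids: Callable[[List[str]], List[Resource]],
    query: str,
) -> List[Resource]:
    """Two-step search: the index returns ids, the store returns documents."""
    ids = search_ids(query)
    if not ids:
        return []
    # A bulk lookup such as a MongoDB {"$in": ids} filter does not guarantee
    # result order, so restore the ranking produced by the search index.
    by_id = {doc["id"]: doc for doc in fetch_by_ids(ids)}
    # Ids the store no longer holds are skipped rather than raising.
    return [by_id[i] for i in ids if i in by_id]
```

Keeping the two backends behind plain callables makes the ordering and missing-id handling easy to test in isolation, independently of either service.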
Overall, the performance tests confirm that Kodjin with MongoDB delivers efficient and responsive performance across various FHIR operations. The high RPS values obtained demonstrate the system's capability to handle significant workloads and provide timely access to resources through GET by ID, search, and POST operations.

Conclusion

Kodjin leverages a modern tech stack including Rust, Kafka, and Kubernetes to deliver the highest levels of performance. At the heart of Kodjin is MongoDB, which serves as a transactional data store. MongoDB's capabilities, such as multi-document transactions and flexible schema, ensure the integrity and consistency of FHIR data operations. The utilization of GridFS within MongoDB ensures efficient storage and retrieval of terminologies, optimizing the functionality of the Terminology service.

To experience the power and potential of the Kodjin FHIR server firsthand, we invite you to contact the Edenlab team for a demo. For more information on MongoDB’s work in healthcare, and to understand why the world’s largest healthcare companies trust MongoDB, read our whitepaper on radical interoperability.