What is WiredTiger? What is MMAPv1?
WiredTiger is MongoDB’s new storage engine. It is available as an option for all 64-bit MongoDB 3.0 and higher builds. Among other features, it supports document-level locking and compression on disk. Check out MongoDB’s docs on WiredTiger for more information.
MMAPv1 is MongoDB’s traditional memory-mapped files storage engine. In MongoDB 3.0 we added collection-level locking while remaining compatible with the on-disk storage of MongoDB 2.6 and older.
Why you should match
Matching your currently running storage engine to your backed-up engine is critical to easy restores. If your restore files are in a different format than what you are used to running, you will have to make certain that you set your command line options correctly, and differently from what you are running elsewhere.
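As a rough illustration of keeping the two in step, the Python sketch below builds a mongod command line whose storage engine matches the format of the snapshot being restored. It assumes the `--storageEngine` option with the values `mmapv1` and `wiredTiger`, as offered by MongoDB 3.0. The helper name `mongod_args` and the path `/data/restore` are hypothetical, chosen only for this example.

```python
from typing import List

# Engine names assumed here: the two values discussed in this post.
KNOWN_ENGINES = {"mmapv1", "wiredTiger"}


def mongod_args(snapshot_engine: str, dbpath: str) -> List[str]:
    """Build a mongod command line whose engine matches the restored snapshot."""
    if snapshot_engine not in KNOWN_ENGINES:
        raise ValueError(f"unknown storage engine: {snapshot_engine!r}")
    return ["mongod", "--storageEngine", snapshot_engine, "--dbpath", dbpath]


# Hypothetical restore directory holding a WiredTiger snapshot.
print(" ".join(mongod_args("wiredTiger", "/data/restore")))
```

If the snapshot and your running servers already use the same engine, the same options you use elsewhere apply and no special handling is needed.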
Where to change your setting
First, you have to be running the new MMS. MMS Classic customers cannot change their backup format, and only have MMAPv1 backups available to them. You are running the new MMS if the upper-left of your MMS window looks like this:
Once you are in the new MMS, you can change your backup format by going to the “Backup” tab, clicking the “…” for your replica set or cluster, and choosing “Edit Storage Engine”.
Once there, you can choose MMAPv1 (“MongoDB Memory Mapped Files”) or WiredTiger:
Once you make this change, an initial sync will be triggered. Choose the server you want to sync from and confirm. An initial sync is required so we can build your new backup. This will not change your existing snapshot formats, so if you request an older restore, it will still be in MMAPv1. You can tell if a snapshot is in WiredTiger format by looking at the “Mongod Version” column on your snapshots listing page (just click on a replica set name on your Backup tab). If the version has “(wiredTiger)” after it, the snapshot is in WiredTiger format. You can see I converted this replica set to WiredTiger:
MongoDB and Leap Seconds
The short answer
As the June 30, 2015 leap second event approaches, I have received a number of questions about how MongoDB is expected to behave during a leap second event. The short answer is “just fine.” MongoDB treats the observation of leap seconds similarly to the observation of clock skew between machines or the observation of other time-setting events, like manual clock adjustment.
In more detail
To understand why MongoDB is robust to leap seconds, it helps to think about how leap seconds affect the observation of wall clock time, especially the case where they can make it appear to processes that time has gone backwards, and about how MongoDB uses wall clock time.
Leap seconds come in one of two forms: either an extra second added at the end of the last minute of a specific calendar day or the omission of the last second at the end of the last minute of a specific calendar day in UTC. So, this can lead to a time 23:59:60Z on a day with a leap second in the first case, or to time transitioning from 23:59:58Z to 00:00:00Z on a day with a leap second in the second case. Unfortunately, the time standard used by almost all computers defines a calendar day as being composed of 86,400 seconds.
Two techniques are used to deal with this discrepancy. The cool but by far less common one is to make all the computer-reported seconds for a period of time leading up to the end of the leap-second day slightly longer or shorter than true seconds, “smearing” the leap second over several hours. Google apparently does this. The more mundane technique is for the OS clock to have the last second occur two times, from the point of view of observing processes, or to skip the last second, depending on the type of leap second. When the last second of the day occurs twice, an observer reading time with subsecond granularity could observe 23:59:59.800Z and subsequently observe 23:59:59.200Z, making it seem as though time has moved backwards.
When the last second of the day is omitted, a process might believe that two seconds have passed when in fact only one has, because it observes 23:59:58Z and then 00:00:00Z.
With this information about the observable effects of leap seconds in hand, we can now look at how this might affect MongoDB’s use of wall clock time. MongoDB uses wall clock time for the following:
- To generate diagnostic information, such as log messages;
- To record the wall clock time in fields of documents via the $currentDate update operator and related operators, and to generate OIDs;
- To generate “optime” fields in replication oplogs;
- To schedule periodic events, such as replication heartbeats or cursor expirations.
Impact on Diagnostic Information
Diagnostic data is used by human beings and tools such as MMS Monitoring to monitor the health of a MongoDB cluster, or to perform a forensic analysis after an observed failure. In these cases, the accuracy of the reported wall clock time aids in diagnosis, but is not required for correct operation of the cluster or for the analytic task. This must be so, because MongoDB clusters are distributed over asynchronous networks, and tight synchronization of clocks among the components of the system cannot be assured.
One caveat in the forensics and monitoring use case is that, if your operating system might allow MongoDB to observe time moving backwards, some diagnostic log messages may indicate that an operation took a very long time when it in fact did not. These false positives for slow operations are typically easy to identify because they report absurdly long or negative durations (frequently on the order of two weeks, positive or negative). This can also occur if you manually reset your system clock during MongoDB operation.
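To see how a duration derived from wall clock readings can go wrong, here is a minimal Python sketch. It is not MongoDB's code. It models an operating system that repeats the last second of the day and times an operation by subtracting two readings. The function `stepped_clock` and the chosen timings are hypothetical, and the sketch shows only the sign error, not the two-week magnitudes mentioned above.

```python
from datetime import datetime, timedelta, timezone

# Midnight UTC immediately after the June 30, 2015 leap second.
LEAP_END = datetime(2015, 7, 1, 0, 0, 0, tzinfo=timezone.utc)


def stepped_clock(true_elapsed: float) -> datetime:
    """Wall clock reading `true_elapsed` real seconds after 23:59:59.000Z,
    on a hypothetical OS that makes the last second of the day occur twice."""
    start = LEAP_END - timedelta(seconds=1)  # 23:59:59.000Z
    if true_elapsed < 1.0:
        # First pass through 23:59:59.
        return start + timedelta(seconds=true_elapsed)
    # The clock is set back one second, so 23:59:59 is observed again.
    return start + timedelta(seconds=true_elapsed - 1.0)


begin = stepped_clock(0.8)  # operation starts during the first pass
end = stepped_clock(1.2)    # operation ends 0.4 real seconds later
duration = (end - begin).total_seconds()

# The computed duration is negative even though real time moved forward.
print(begin.time(), end.time(), duration)
```

A log line built from `duration` would report an impossible value, which is why such entries are easy to recognize as artifacts of a clock adjustment rather than real slow operations.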
Impact on $currentDate et al.
When a client application requests a document be updated with the server’s notion of the current date and time, MongoDB simply asks the operating system for the current wall clock time and records that value in a client document. Any impact of clock adjustments for leap seconds or otherwise will effectively be passed through to the client application. Applications that require second-granularity precision of timestamps should be examined, whether or not they use MongoDB, as should the time synchronization technology used in support of that application (typically NTP).
Impact on the replica set oplog
MongoDB replica sets use a replicated operation log, or oplog, to inform secondary nodes of changes to make in order to stay consistent with the primary node. These changes are kept in a total order, described by an “optime”, sometimes called the timestamp. This optime is composed of wall clock time paired with an “increment”, an integer which uniquely identifies operations that execute during the same wall clock time. For example, the first operation recorded at 23:59:59Z would be recorded as optime (23:59:59Z,1) and the third operation would have optime (23:59:59Z,3).
But wall clock time is not used indiscriminately, because system clocks can drift, or be reset. The time portion of the optime is actually the maximum of the current observed time and the greatest previous observation. If MongoDB records operation A with an optime of (23:59:59Z,1), and then observes a time of 23:59:58Z when it attempts to log a subsequent operation B, it will act as if operation B occurred during 23:59:59Z, and thus log it with an optime of (23:59:59Z,2). In addition to leap seconds, unsynchronized clocks between replica set members may cause the optime to be ahead of any one node’s local wall clock time. This situation is common and does not negatively affect replication operation.
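The optime rule described above can be sketched in a few lines of Python. This is a toy model written for this post, not MongoDB's implementation: the class `OptimeGenerator` is hypothetical, times are plain integer seconds, and it assumes the increment restarts at 1 whenever the time portion advances.

```python
from typing import Tuple


class OptimeGenerator:
    """Toy model of (time, increment) optimes that never move backwards."""

    def __init__(self) -> None:
        self.last_secs = 0   # greatest wall clock time observed so far
        self.increment = 0   # counter within the current second

    def next_optime(self, observed_secs: int) -> Tuple[int, int]:
        if observed_secs > self.last_secs:
            # Time advanced: adopt it and restart the increment (assumption).
            self.last_secs = observed_secs
            self.increment = 1
        else:
            # Clock stood still or went backwards: keep the greatest time
            # seen so far and bump the increment instead.
            self.increment += 1
        return (self.last_secs, self.increment)


gen = OptimeGenerator()
SEC_235959 = 23 * 3600 + 59 * 60 + 59      # 23:59:59 as seconds of the day
op_a = gen.next_optime(SEC_235959)         # operation A
op_b = gen.next_optime(SEC_235959 - 1)     # clock now reads 23:59:58
print(op_a, op_b, op_a < op_b)
```

Because Python compares tuples element by element, `op_a < op_b` holds even though the second observation was earlier, which mirrors the total order the oplog relies on.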
Impact on the scheduling of periodic tasks
The final way that MongoDB uses wall clock time is to schedule periodic activities, like sending heartbeats to replica set nodes, cleaning up expired cursors or invalidating caches that use age-based invalidation policies. These activities are typically scheduled to run after some amount of wall clock time has elapsed, rather than at specific absolute wall clock times; the difference is not material. In either event, the introduction of a positive leap second may cause an event to occur later than it otherwise would have, and the introduction of a negative leap second may cause an event to occur sooner than it otherwise would have. MongoDB’s algorithms must already be robust to these behaviors, because they are typically indistinguishable from delays induced by higher-than-average network latency or virtual machine and operating system scheduling issues.
Your Operating System matters
Remember, MongoDB relies on host operating system capabilities for reading the wall clock time, and for synchronizing events with wall clock time. As such, you should ensure that the operating system running under MongoDB is itself prepared for leap seconds. The most widely documented database problems during the June 2012 leap second were actually caused by a livelock bug in the Linux kernel futex synchronization primitive. The DataStax developer blog has a brief summary of the cause of the June 2012 issue in Cassandra, which correctly assigns responsibility to a since-resolved issue in the Linux kernel. If you use Red Hat Enterprise Linux, Red Hat has a nice knowledge base article that covers the topic of leap second preparedness for RHEL. If you’re running on Windows, Microsoft has a very brief knowledge base article on the subject of leap seconds.
If you’re interested in learning more about the operational best practices of MongoDB, download our guide: Learn Best Practices for Operations
About the Author - Andy
Andy Schwerin is the Director of Distributed Systems Engineering at MongoDB in New York.
Australian Start-Up Ynomia Is Building an IoT Platform to Transform the Construction Industry and its Hostile Environments
The trillion-dollar construction industry has not yet experienced the technology revolution you might have expected. Low levels of R&D and difficult working environments have led to a lack of innovation, and fundamental improvements have been slow. But one Australian start-up is changing that by building an Internet of Things (IoT) platform to harness construction and jobsite data in real time.
“Productivity in construction is down there with hunting and fishing as one of the least productive industries per capita in the entire world. It's a space that's ripe for people to come in and really help,” explains Rob Postill, CTO at Ynomia.
Ynomia has already been closely involved with many prestigious construction projects, including the residential N06 development in London’s famous 2012 Olympic Village. It was also integral to the construction of the Victoria University Tower in Australia.
“These projects involve massive outflow of money: think about glass facades on modern buildings, which can represent 20-30 percent of the overall project cost. They are largely produced in China and can take 12 weeks to get here,” says Postill. “Meanwhile, the plasterer, the plumber, the electrician are all waiting for those glass facades to be put on so it is safe for them to work. If you get it wrong, you can go in the deep red very quickly.”
To tackle these longstanding challenges, Ynomia aims to address the lack of connectivity, transparency and data management on construction sites, which has traditionally resulted in the inefficient use of critical personnel, equipment and materials; compressed timelines; and unpredictable cash flows. To optimize productivity, Ynomia offers a simple end-to-end technology solution that creates a Connected Jobsite, helping teams manage materials, tools, and people across the worksite in real time.
IoT in a Hostile Environment
The deployment of technology in construction is often fraught with risk.
As a result, construction sites are still largely run on paper, such as blueprints, diagrams and models as well as the more traditional invoices and filing. At the same time, there is a constant need to track progress and monitor massive volumes of information across the entire supply chain. Engineers, builders, electricians, plumbers, and all the other associated professionals need to know what they need to do, where they need to be, and when they need to start.
“The environment is hostile to technology like GPS, computers, and mobile phone reception because you have a lot of Faraday cages and lots of water and dust,” explains Postill. “You can't have somebody wandering around a construction site with a laptop; it'll get trashed pretty quickly.”
Enter MongoDB Atlas
“On a site, you might be talking about materials, then if you add to that plant & equipment, or bins, or tools etc, you're rapidly getting into thousands and thousands of tags, talking all the time, every day,” said Postill.
That means thousands of tags now send millions of readings on Ynomia building sites around the world. All these IoT data packets must be stored efficiently and accurately so Ynomia can reassemble the history of what has happened and track tagged inventory, personnel, and vehicles around the site. Many of the tag events are also safety critical so accuracy is a vital component and packets can't be missed.
To address these needs Ynomia was looking for a database that was scalable, flexible, resilient and could easily handle a wide variety of fast-changing sensor data captured from multiple devices. The final component Postill was looking for in a database layer was freedom: a database that didn't lock them into a single cloud platform as they were still in the early stages of assessing cloud partners.
The Commonwealth Scientific and Industrial Research Organisation, which Postill had worked with in the past, suggested MongoDB, a general purpose, document-based database built for modern applications.
“The most important factor was that the database is event-driven, which I knew would be difficult in the traditional relational model. We deal with millions of tag readings a day, which is a massive wall of data,” said Postill.
A Cloud Database
Ynomia is using MongoDB Atlas, the global cloud database service, now hosted on Microsoft Azure. Atlas offers best-in-class automation and proven practices that combine availability, scalability, and compliance with the most demanding data security and privacy standards.
“When we started we didn't know enough about the problem and we didn't want to be constrained," explained Postill. "MongoDB Atlas gives us a cloud environment that moves with us. It allows us to understand what is happening and make changes to the architecture as we go."
Postill says this combination of flexibility and management tooling also allows his developers to focus on business value, not undifferentiated code. One example Postill gave was cluster administration: "Cluster administration for a start-up like us is wasted work," he said. "We’re not solving the customer's problem. We're not moving anything on. We’re focusing on the wrong thing. For us to be able to just make that problem go away is huge. Why wouldn’t you?"
Atlas also gives Ynomia the option to spin out new clusters seamlessly anywhere in the world. This allows customers to keep data local to their construction site, improving latency and helping solve for regional data regulations.
Real Time Analytics
The company has also deployed MongoDB Charts, which takes this live data and automatically provides a real time view.
Charts is the fastest and easiest way to visualize event data directly from MongoDB in order to act instantly and decisively based on the real-time insights generated by event-driven architecture. It allows Ynomia to share dashboards so all the right people can see what they need to and can collaborate accordingly.
“Charts enables us to quickly visualize information without having to build more expensive tools, both internally and externally, to examine our data,” comments Postill. “As a startup, we go through this journey of: what are we doing and how are we doing it? There's a lot of stuff we are finding out along the way on how we slice and re-slice our data using Charts.”
A Platform for Future Growth
Ynomia is targeting a huge market and is set for ambitious growth in the coming years. How the platform, and its underlying architecture, can continue to scale and evolve will be crucial to enabling that business growth.
“We do anything we can to keep things simple,” concluded Postill. “We pick technology partners that save us from spending time we shouldn't spend so we can solve real problems. We pick technologies that roll with the punches and that's MongoDB.”