We are very excited to announce that Cloud Manager Backup now supports the following command-line options:
smallfiles (an option for MMAPv1 only)
Cloud Manager Backup will now take into account the command-line options of the primary during an initial sync. If your primary uses one of these options and you want your backup to as well, just resync your backups. If you’ve been holding off on using Cloud Manager Backup because we lacked support for these options, you no longer need to wait.
Please note that existing snapshots will not be converted; only snapshots created for jobs that were resynced after noon EDT on May 16 will have these options enabled.
Unlocking Operational Intelligence from the Data Lake: Part 2 - Operationalizing the Data Lake
As we discussed in part 1, Hadoop-based data lakes excel at generating new forms of insight from diverse data sets, but they are not designed to provide real-time access to operational applications. Users need to make analytic outputs from Hadoop available to their online, operational apps. These applications have specific access demands that cannot be met by HDFS, including:

- Millisecond-latency query responsiveness.
- Random access to indexed subsets of data.
- Support for expressive ad-hoc queries and aggregations against the data, making online applications smarter and contextual.
- Updating fast-changing data in real time as users interact with online applications, without having to rewrite the entire data set.

Bringing together operational and analytical processing across high volumes of variably structured data in a single database requires capabilities unique to MongoDB:

Workload isolation. MongoDB replica sets can be provisioned with dedicated analytics nodes. This allows users to simultaneously run real-time analytics and reporting queries against live data, without impacting the nodes servicing the operational application, and avoiding lengthy ETL cycles.

Dynamic schema, coupled with data governance. MongoDB's document data model makes it easy for users to store and combine data of any structure, without giving up sophisticated validation rules, data access controls, and rich indexing functionality. If new attributes need to be added – for example, enriching user profiles with geo-location data – the schema can be modified without application downtime, and without having to update all existing records.

Expressive queries. The MongoDB query language enables developers to build applications that can query and analyze the data in multiple ways – by single keys, ranges, text search, and geospatial queries, through to complex aggregations and MapReduce jobs – returning responses in milliseconds.
Complex queries are executed natively in the database, without having to use additional analytics frameworks or tools, and without the latency that comes from moving data between operational and analytical engines.

Rich secondary indexes. Providing fast filtering and access to data by any attribute, MongoDB supports compound, unique, array, partial, TTL, geospatial, sparse, and text indexes to optimize for multiple query patterns, data types, and application requirements. Indexes are essential when operating across slices of the data – for example, updating the churn analysis of a subset of high-net-worth customers without having to scan all customer data.

BI & analytics integration. The MongoDB Connector for BI enables industry-leading analytics and visualization tools such as Tableau to efficiently access data stored in MongoDB using standard SQL.

Robust security controls. Extensive access controls, auditing for forensic analysis, and encryption of data both in flight and at rest enable MongoDB to protect valuable information and meet the demands of big data workloads in regulated industries.

Scale-out on commodity hardware. MongoDB can be scaled within and across geographically distributed data centers, providing high levels of availability and scalability. As your data lake grows, MongoDB scales easily, with no downtime and no application changes.

Advanced management and cloud platform. To reduce data lake TCO and the risk of application downtime, MongoDB Ops Manager provides powerful tooling to automate database deployment, scaling, monitoring and alerting, and disaster recovery. Further simplifying operations, MongoDB Atlas delivers MongoDB as a service, providing all of the features of the database without the operational heavy lifting required for any application. MongoDB Atlas is a great choice if you want the database run for you, or if your data lake and apps are also running on a public cloud platform.
MongoDB Atlas is available on demand through a pay-as-you-go model and billed on an hourly basis.

High skills availability. With the availability of Hadoop skills cited by Gartner analysts as a top challenge, it is essential to choose an operational database with a large available talent pool, so that you can find staff who can rapidly build differentiated big data applications. Across multiple measures – including the DB-Engines Rankings, The 451 Group NoSQL Skills Index, and the Gartner Magic Quadrant for Operational Databases – MongoDB is the leading non-relational database.

In addition, the ability to apply the same distributed processing frameworks, such as Apache Spark, MapReduce, and Hive, to data stored in both HDFS and MongoDB allows developers to converge analytics of real-time, rapidly changing data sets with the models created by batch Hadoop jobs. Through sophisticated connectors, Spark and Hadoop can pass queries down as filters and take advantage of MongoDB’s rich secondary indexes to extract and process only the range of data they need – for example, retrieving all customers located in a specific geography. This is very different from less featured datastores that do not support a rich query language or secondary indexes. In those cases, Spark and Hadoop jobs are limited to extracting all data based on a simple primary key, even if only a subset of that data is required for the query. This means more data movement between the data lake and the database, more processing overhead, more hardware, and longer time to insight for the user.

Table 1: How MongoDB stacks up for operational intelligence

As demonstrated in Table 1, operational intelligence requires a fully featured database serving as a system of record for online applications.
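The pushdown behavior described above can be sketched in a few lines of Python. This is an illustration with in-memory stand-ins and invented collection and field names, not the actual Spark or Hadoop connector API:

```python
# Sketch of query "pushdown": the connector passes a filter down to the
# database, which can answer it from a secondary index so that only the
# matching subset of documents ever crosses the network. All names here
# are invented for illustration.

customers = [
    {"_id": 1, "region": "EMEA", "net_worth": 2_000_000},
    {"_id": 2, "region": "APAC", "net_worth": 350_000},
    {"_id": 3, "region": "EMEA", "net_worth": 90_000},
]

def extract_all(store):
    """A primary-key-only datastore: every document must be shipped to the
    processing framework, which then does the filtering itself."""
    return list(store)

def extract_with_pushdown(store, region):
    """An indexed datastore: the filter (MongoDB equivalent:
    {"region": region}) runs inside the database, so only matches move."""
    return [doc for doc in store if doc["region"] == region]
```

With pushdown, only the two EMEA documents leave the database; without it, all three are moved and filtered inside the framework – exactly the extra data movement and processing overhead described above.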
These requirements exceed the capabilities of simple key-value or column-oriented datastores, which are typically used for short-lived, transient data, and of legacy relational databases structured around rigid row-and-column table formats and scale-up architectures.

Figure 1: Design pattern for operationalizing the data lake

Figure 1 presents a design pattern for integrating MongoDB with a data lake:

- Data streams are ingested to a pub/sub message queue, which routes all raw data into HDFS.
- Processed events that drive real-time actions, such as personalizing an offer to a user browsing a product page, or alarms for vehicle telemetry, are routed to MongoDB for immediate consumption by operational applications.
- Distributed processing frameworks such as Spark or MapReduce jobs materialize batch views from the raw data stored in the Hadoop data lake.
- MongoDB exposes these models to the operational processes, serving queries and updates against them with real-time responsiveness.
- The distributed processing frameworks can re-compute analytics models against data stored in either HDFS or MongoDB, continuously flowing updates from the operational database to the analytics views.

In part 3, we’ll demonstrate how leading companies are using the design pattern discussed above to operationalize their data lakes. Learn more by reading the Operational Data Lake white paper: Unlocking Operational Intelligence from the Data Lake.

About the Author - Mat Keep

Mat is director of product and market analysis at MongoDB. He is responsible for building the vision, positioning, and content for MongoDB’s products and services, including the analysis of market trends and customer requirements. Prior to MongoDB, Mat was director of product management at Oracle Corp., with responsibility for the MySQL database in web, telecoms, cloud, and big data workloads. This followed a series of sales, business development, and analyst/programmer positions with both technology vendors and end-user companies.
Australian Start-Up Ynomia Is Building an IoT Platform to Transform the Construction Industry and its Hostile Environments
The trillion-dollar construction industry has not yet experienced the technology revolution you might have expected. Low levels of R&D and difficult working environments have led to a lack of innovation, and fundamental improvements have been slow. But one Australian start-up is changing that by building an Internet of Things (IoT) platform to harness construction and jobsite data in real time.

“Productivity in construction is down there with hunting and fishing as one of the least productive industries per capita in the entire world. It's a space that's ripe for people to come in and really help,” explains Rob Postill, CTO at Ynomia.

Ynomia has already been closely involved with many prestigious construction projects, including the residential N06 development in London’s famous 2012 Olympic Village. It was also integral to the construction of the Victoria University Tower in Australia.

“These projects involve massive outflow of money: think about glass facades on modern buildings, which can represent 20-30 percent of the overall project cost. They are largely produced in China and can take 12 weeks to get here,” says Postill. “Meanwhile, the plasterer, the plumber, the electrician are all waiting for those glass facades to be put on so it is safe for them to work. If you get it wrong, you can go in the deep red very quickly.”

To tackle these longstanding challenges, Ynomia aims to address the lack of connectivity, transparency, and data management on construction sites, which has traditionally resulted in the inefficient use of critical personnel, equipment, and materials; compressed timelines; and unpredictable cash flows. To optimize productivity, Ynomia offers a simple end-to-end technology solution that creates a Connected Jobsite, helping teams manage materials, tools, and people across the worksite in real time.

IoT in a Hostile Environment

The deployment of technology in construction is often fraught with risk.
As a result, construction sites are still largely run on paper: blueprints, diagrams, and models, as well as the more traditional invoices and filing. At the same time, there is a constant need to track progress and monitor massive volumes of information across the entire supply chain. Engineers, builders, electricians, plumbers, and all the other associated professionals need to know what they need to do, where they need to be, and when they need to start.

“The environment is hostile to technology like GPS, computers, and mobile phone reception because you have a lot of Faraday cages and lots of water and dust,” explains Postill. “You can't have somebody wandering around a construction site with a laptop; it'll get trashed pretty quickly."

Enter MongoDB Atlas

“On a site, you might be talking about materials, then if you add to that plant & equipment, or bins, or tools etc, you're rapidly getting into thousands and thousands of tags, talking all the time, every day,” said Postill.

That means thousands of tags now send millions of readings on Ynomia building sites around the world. All these IoT data packets must be stored efficiently and accurately so Ynomia can reassemble the history of what has happened and track tagged inventory, personnel, and vehicles around the site. Many of the tag events are also safety critical, so accuracy is vital and packets can't be missed.

To address these needs, Ynomia was looking for a database that was scalable, flexible, and resilient, and that could easily handle a wide variety of fast-changing sensor data captured from multiple devices. The final thing Postill was looking for in a database layer was freedom: a database that didn't lock the company into a single cloud platform, as it was still in the early stages of assessing cloud partners.
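As a sketch of what this kind of flexible, fast-changing sensor data might look like as documents, here is one possible shape for a tag reading, plus a helper that reassembles a tag's history from raw packets. All field names are hypothetical, not Ynomia's actual schema:

```python
from datetime import datetime, timezone

# One possible document shape for a single IoT tag reading. With a flexible
# schema, different tag types (materials, tools, people) can carry different
# payload fields without a migration. Field names are invented.
def make_reading(tag_id, kind, ts, **payload):
    return {"tag_id": tag_id, "kind": kind, "ts": ts, "payload": payload}

def tag_history(readings, tag_id):
    """Reassemble the time-ordered history of one tag from raw packets."""
    return sorted(
        (r for r in readings if r["tag_id"] == tag_id),
        key=lambda r: r["ts"],
    )

readings = [
    make_reading("TAG-0042", "material",
                 datetime(2020, 5, 16, 11, tzinfo=timezone.utc),
                 zone="level-12"),
    make_reading("TAG-0042", "material",
                 datetime(2020, 5, 16, 9, tzinfo=timezone.utc),
                 zone="loading-bay"),
    # A tool tag carries an extra battery field; no schema change needed.
    make_reading("TAG-0099", "tool",
                 datetime(2020, 5, 16, 10, tzinfo=timezone.utc),
                 zone="level-3", battery=0.8),
]
```

In a document database, each reading would be stored as one document, and an index on (tag_id, ts) would make the history query above efficient even across millions of readings a day.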
The Commonwealth Scientific and Industrial Research Organisation, which Postill had worked with in the past, suggested MongoDB, a general-purpose, document-based database built for modern applications.

“The most important factor was that the database is event-driven, which I knew would be difficult in the traditional relational model. We deal with millions of tag readings a day, which is a massive wall of data,” said Postill.

A Cloud Database

Ynomia is using MongoDB Atlas, the global cloud database service, now hosted on Microsoft Azure. Atlas offers best-in-class automation and proven practices that combine availability, scalability, and compliance with the most demanding data security and privacy standards.

“When we started we didn't know enough about the problem and we didn't want to be constrained," explained Postill. "MongoDB Atlas gives us a cloud environment that moves with us. It allows us to understand what is happening and make changes to the architecture as we go."

Postill says this combination of flexibility and management tooling also allows his developers to focus on business value rather than undifferentiated code. One example Postill gave was cluster administration: "Cluster administration for a start-up like us is wasted work," he said. "We’re not solving the customer's problem. We're not moving anything on. We’re focusing on the wrong thing. For us to be able to just make that problem go away is huge. Why wouldn’t you?"

Atlas also gives Ynomia the option to spin up new clusters seamlessly anywhere in the world. This allows customers to keep data local to their construction sites, improving latency and helping to meet regional data regulations.

Real-Time Analytics

The company has also deployed MongoDB Charts, which takes this live data and automatically provides a real-time view.
Charts is the fastest and easiest way to visualize event data directly from MongoDB, making it possible to act instantly and decisively on the real-time insights generated by an event-driven architecture. It allows Ynomia to share dashboards so all the right people can see what they need to and collaborate accordingly.

“Charts enables us to quickly visualize information without having to build more expensive tools, both internally and externally, to examine our data,” comments Postill. “As a startup, we go through this journey of: what are we doing and how are we doing it? There's a lot of stuff we are finding out along the way on how we slice and re-slice our data using Charts.”

A Platform for Future Growth

Ynomia is targeting a huge market and is set for ambitious growth in the coming years. How the platform, and its underlying architecture, can continue to scale and evolve will be crucial to enabling that business growth.

“We do anything we can to keep things simple,” concluded Postill. “We pick technology partners that save us from spending time we shouldn't spend so we can solve real problems. We pick technologies that roll with the punches, and that's MongoDB.”