In this blog post we will explore how to manage Mongos servers using Cloud Manager. Mongos (short for “MongoDB Shard”) is a routing service for MongoDB sharded clusters that processes queries from the application layer and determines the location of the data in the sharded cluster.
How to Add a Mongos server
For deployments under automation, it’s very easy to add a Mongos to your Sharded Cluster. Go to Deployment > Processes and click the wrench at the Sharded Cluster top level.
Next, scroll down to the Mongos settings, where you will see your current configuration. Increase the number of Mongos processes to the desired value. To determine which host receives the new Mongos, here I chose Regular Expression and entered a string matching an existing host. Choose your desired port range for the Mongos. When you are finished, click Apply.
Next, review your changes. Here you see that the number of Mongos processes is increasing from 3 to 4. Confirm the host, port, and other details and click Confirm & Deploy.
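For readers who prefer to think about this change as data rather than clicks, here is a minimal sketch of what adding a Mongos amounts to in an automation-style configuration document. The field names (`processes`, `processType`, `cluster`, `args2_6`) follow my reading of the Cloud Manager automation configuration format and should be checked against the API reference; the helper `add_mongos`, the naming scheme, and the host names are hypothetical and only for illustration.

```python
import copy


def add_mongos(config, cluster, hostname, port, version):
    """Return a copy of an automation-style config with one more mongos.

    Assumption: `config["processes"]` is a list of process documents and a
    mongos is identified by processType == "mongos" plus its cluster name.
    """
    updated = copy.deepcopy(config)  # never mutate the caller's config
    existing = [
        p for p in updated["processes"]
        if p.get("processType") == "mongos" and p.get("cluster") == cluster
    ]
    updated["processes"].append({
        # Hypothetical naming scheme; a real deployment may name processes differently.
        "name": f"{cluster}_mongos_{len(existing) + 1}",
        "processType": "mongos",
        "cluster": cluster,
        "hostname": hostname,
        "version": version,
        "args2_6": {"net": {"port": port}},
    })
    return updated


# Usage: a cluster that currently has three mongos processes gains a fourth.
current = {
    "processes": [
        {"name": f"myCluster_mongos_{i}", "processType": "mongos", "cluster": "myCluster"}
        for i in range(1, 4)
    ]
}
proposed = add_mongos(current, "myCluster", "host4.example.net", 27017, "3.2.0")
print(len(current["processes"]), "->", len(proposed["processes"]))
```

Working on a copy mirrors the review step in the UI: the proposed configuration can be compared with the current one before anything is deployed.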
How to Remove a Mongos server
Conversely, it’s also simple to remove a Mongos from your deployment if it’s under automation. From the Deployment > Processes view, change the filter to ‘Mongos’ as shown below. We must first shut down the process. Click the “…” menu for the target Mongos and choose the Shutdown option. Review & Deploy the change.
Once the target Mongos is shut down, go back to Deployment > Processes and select the ‘Mongos’ filter again. Click the “…” menu for the target Mongos and this time choose ‘Remove from Cluster’.
Finally, review your changes. Confirm that the number of Mongos processes is reduced and that the target host and server look correct. Click Confirm & Deploy to complete the operation.
What’s New in MongoDB 3.2, Part 3: Tools and Integrations for Data Analysts and DBAs
Welcome to the final post in our 3-part MongoDB 3.2 blog series. In part 1 we covered the new storage engines and the use-cases they served. In part 2 we covered features designed to support mission-critical applications, including document validation and the enhanced replication protocol. In this post, I’ll provide an overview of new tools and integrations supporting data analysts, DBAs, and operations teams. Remember, you can get the detail now on everything MongoDB 3.2 offers by downloading the What’s New white paper.
New Users
With MongoDB deployed across a wider range of an organization’s application portfolio, data analysts, DBAs, and operations teams will need to integrate MongoDB within their existing processes and tool sets. MongoDB 3.2 allows analysts to support the business with new insights from untapped data sources, while DBAs and Ops teams are able to operationalize MongoDB alongside existing relational databases, protecting existing investments in management platforms and skillsets.
Data Analysts and Scientists
MongoDB Connector for BI
Driven by growing requirements for self-service analytics, faster discovery and prediction based on real-time operational data, and the need to integrate multi-structured and streaming data sets, BI and analytics platforms are one of the fastest growing software markets. To address these requirements, modern application data stored in MongoDB can for the first time be easily explored with industry-standard SQL-based BI and analytics platforms. Using the MongoDB Connector for BI, analysts, data scientists and business users can now seamlessly visualize semi-structured and unstructured data managed in MongoDB, alongside traditional data in their SQL databases, using the same BI tools deployed within millions of enterprises.
Figure 1: Uncover new insights with powerful visualizations generated from MongoDB
MongoDB Connector for BI Implementation
SQL-based BI tools expect to connect to a data source with a fixed schema presenting tabular data. This presents a challenge when working with MongoDB’s dynamic schema and rich, multi-dimensional documents. In order for BI tools to query MongoDB as a data source, the BI Connector does the following:
● Provides the BI tool with the schema of the MongoDB collection to be visualized. Users can review the schema output to ensure data types, sub-documents and arrays are correctly represented
● Translates SQL statements issued by the BI tool into equivalent MongoDB queries that are then sent to MongoDB for processing
● Converts the returned results into the tabular format expected by the BI tool, which can then visualize the data based on user requirements
The BI Connector is available with MongoDB Enterprise Advanced. Watch the demo to see the BI Connector in action, and review the documentation to learn more. You can also download the BI Connector for evaluation.
Dynamic Lookup: Bringing Left Outer JOINs to MongoDB
Applications get great efficiency from MongoDB by combining data that is accessed together into a single document. In contrast, a typical relational database schema scatters related data across scores of tables – e.g., a blog site that stores every tag, category, comment, author and callback as rows in separate tables from the blog post they’re associated with. Typically it is most advantageous to take a denormalized data modeling approach for operational databases – the efficiency of reading or writing an entire record in a single operation outweighing any modest increase in storage requirements. However, there are examples where normalizing data can be beneficial, especially when data from multiple sources needs to be blended for analysis.
Consider a shopping cart, which presents two options for handling the order and product information:
Include all data for an order in the same document
● Fast reads – one find delivers all the required data
● The order document contains the product details that were correct at the time the order was placed; the price of that product may change in the future but the order document remains an accurate representation of the order
● Consumes additional storage – the details of each product are stored in many order documents; this has become less of an issue as memory and storage prices have fallen
Order document references product documents
● Space efficient – product details are stored just once
● Slower reads – multiple trips to the database
● If an attribute of the product (such as the unit price) changes in the future, any older order documents are then incorrect as they reference this newer version
● Extra application logic – an application must iterate over product IDs in the order document and then fetch the product documents
MongoDB 3.2 introduces the ability to combine data from multiple collections by implementing left outer joins through the $lookup operator, which can now be included as a stage in a MongoDB Aggregation Framework pipeline. The new $lookup stage provides more flexibility in data modeling, and allows richer analytics to be run with higher performance and less application-side code. You can learn more from the documentation and see worked examples in this series of blog posts.
Real-Time Analytics and Search
MongoDB 3.2 extends the options for performing analytics on live, operational data – ensuring that answers are delivered quickly and simply, and are based on current data. Work that would previously have needed to be implemented in client code can now be performed by the database – freeing the developer to focus on building new features.
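To make the $lookup stage described above more concrete, here is a minimal sketch for the shopping cart case where orders reference products. The collection and field names (`products`, `product_ids`, `product_docs`) are assumptions made for illustration. The pipeline is written as plain Python data, as it could be handed to a driver's aggregate call; the small `emulate_lookup` helper is only a simplified in-memory model of the left outer join semantics, not how the server implements it.

```python
# Pipeline as plain data. $unwind emits one document per referenced product,
# then $lookup attaches the matching product documents from "products".
order_lines_with_products = [
    {"$unwind": "$product_ids"},
    {"$lookup": {
        "from": "products",            # collection to join against
        "localField": "product_ids",   # field in the (unwound) order document
        "foreignField": "_id",         # field in the product document
        "as": "product_docs",          # output array holding the matches
    }},
]


def emulate_lookup(local_docs, foreign_docs, local_field, foreign_field, as_field):
    """Simplified model of $lookup: every local document is kept, and gets an
    array of matching foreign documents (empty when nothing matches)."""
    joined = []
    for doc in local_docs:
        matches = [
            f for f in foreign_docs
            if f.get(foreign_field) == doc.get(local_field)
        ]
        joined.append({**doc, as_field: matches})
    return joined


# Usage: one order line matches a product, the other does not but is still returned.
order_lines = [
    {"_id": 1, "product_ids": "p1"},
    {"_id": 2, "product_ids": "p9"},
]
products = [{"_id": "p1", "name": "espresso cup", "price": 5}]
for line in emulate_lookup(order_lines, products, "product_ids", "_id", "product_docs"):
    print(line)
```

The point of the emulation is the "left outer" part: an order line with no matching product is not dropped, it simply carries an empty `product_docs` array.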
Improved Aggregation
The aggregation pipeline is a powerful way to perform complex analytical queries on MongoDB data. Aggregation pipeline stages allow manipulation of a "stream" of documents from a collection that can either be returned via a cursor to the client (similar to find), or be stored in a new collection via a final $out stage.
When analyzing very large data sets, it is frequently sufficient to look at a random sample of documents rather than all of the data. For example, if you wanted to compare the number of check-ins to coffee shops to those at bars, you can get a good approximation without searching through every single check-in. Previously, this sampling would have had to be implemented in the application, but MongoDB 3.2 introduces the $sample operator, which can be included at any point in the aggregation pipeline.
MongoDB documents can store arrays as well as simple values. While this feature is very expressive and powerful, without the corresponding ability to manipulate and filter arrays in documents during aggregations, their usefulness has been limited in an analytical context. New operators have been added to allow more flexibility when dealing with arrays: $slice, $arrayElemAt, $concatArrays, $isArray, $filter, and $min.
New mathematical operators have been added for operations such as truncate, ceiling, floor, absolute, rounding, square root, logarithms and standard deviations. These operators can be used to move code from the client tier directly into the database, allowing higher performance with lower developer complexity.
By combining the new and existing operators, aggregation pipelines can be built to generate sophisticated results with a single query. Review the documentation to learn more about all of the aggregation pipeline enhancements.
Improved Text Search
Text searches on the data in MongoDB can either be performed in the database or by an external search engine.
Performing the search within the database is more efficient and simpler to administer, so that is the preferred option whenever possible. MongoDB 3.2 increases the set of use cases that can be met with in-database text searches by adding support for case-sensitive searches, as well as additional languages including Arabic, Farsi, Urdu, Simplified Chinese, and Traditional Chinese. To provide support for these languages, MongoDB Enterprise Advanced provides integration with the Basis Technology Rosette Linguistics Platform (RLP) to perform normalization, word breaking, sentence breaking, and stemming or tokenization depending on the language. More information on text search enhancements can be found in the documentation.
DBAs: MongoDB Compass
MongoDB’s dynamic schema and rich document model make developers more productive, but they also make it difficult to explore and understand the underlying data and its structure – in particular for non-developers who aren't familiar with the MongoDB query language. The MongoDB Compass GUI allows users to understand the structure of data in the database and perform ad hoc queries against it – all with zero knowledge of MongoDB's query language.
Typical users could include architects building a new MongoDB project or a DBA who has inherited a database from an engineering team, and who must now maintain it in production. They need to understand what kind of data is present, define what indexes might be appropriate, and identify if Document Validation rules should be added to enforce a consistent document structure.
Until now, users wishing to understand the shape of their data would have to connect to the MongoDB shell and write queries to reverse engineer the document structure, field names and data types. Similarly, anyone wanting to run custom queries on the data would need to understand MongoDB's query language.
MongoDB Compass provides users with a graphical view of their MongoDB schema by sampling a subset of documents from a collection. By using sampling, MongoDB Compass minimizes database overhead and can present results to the user almost instantly.
Figure 2: Document structure and contents exposed by MongoDB Compass
Querying Data
As illustrated in Figure 3, a query can be built and executed by simply selecting document elements from the MongoDB Compass user interface. By selecting multiple values, sophisticated queries can be built. The query can then be executed at the push of a button and the results viewed both graphically and as a set of JSON documents.
Figure 3: Interactively build and execute database queries with MongoDB Compass
MongoDB Compass samples the database to provide a fast, interactive experience no matter how large the database. If the full results are needed then the query can be simply copied and pasted into a MongoDB shell window.
MongoDB Compass is included with MongoDB Professional and MongoDB Enterprise Advanced. You can learn more about Compass in the documentation, and see it in action in our short 3-part demo series:
● Part 1: Introduction to Compass
● Part 2: Schema visualization
● Part 3: Visual query building
To evaluate MongoDB Compass, head to the MongoDB Download Center.
Operations Teams
MongoDB Ops Manager and Cloud Manager are the best way to run MongoDB, reducing tasks such as deployment, scaling, upgrades and backups to just a few clicks or an API call. Operations teams can be 10-20x more productive using the Ops or Cloud Manager platforms.
With the enhancements to both Ops and Cloud Manager in MongoDB 3.2, administrators can:
● Integrate MongoDB alongside existing Application Performance Monitoring platforms for global health visibility over the entire IT estate, all from a single pane of glass
● Drill down into any MongoDB-specific issues using Ops Manager’s granular monitoring of database telemetry, including new query profiler visualizations
● Use Ops Manager automation to initiate zero-downtime maintenance and upgrade activities, such as rolling out new indexes across a sharded cluster
● Create point-in-time, consistent snapshots of the database on standard network-mountable filesystems, and restore complete running MongoDB clusters from backup files
APM Integration: New Relic & AppDynamics
Many operations teams use Application Performance Monitoring (APM) platforms such as New Relic and AppDynamics to gain global oversight of their complete IT infrastructure from a single management UI. Issues that risk affecting customer experience can be quickly identified and isolated to specific components – whether attributable to devices, hardware infrastructure, networks, APIs, application code, databases and more.
The MongoDB drivers have been enhanced with a new API that exposes query performance metrics to APM tools. Administrators can monitor time spent on each operation, and identify slow running queries that require further analysis and optimization.
In addition, Cloud Manager will provide packaged integration with the New Relic platform. Key metrics from Cloud Manager are accessible to the APM for visualization, enabling MongoDB health to be monitored and correlated with the rest of the application estate.
Figure 4: MongoDB integrated into a single view of application performance
As shown in Figure 4, summary metrics familiar to Cloud Manager users are presented within the APM’s UI.
Administrators can also run New Relic Insights for analytics against monitoring data to generate dashboards that provide real-time tracking of Key Performance Indicators (KPIs).
If the operations team needs finer grained telemetry, they can drill down into the 100+ system metrics maintained by Cloud Manager. For example, the new visual query profiler helps diagnose slow running queries, which can then be resolved by adding a new index and automatically deploying that across every node in the cluster.
Query Performance Visualization: Enabling Fast and Simple Query Optimization
The MongoDB database profiler collects fine-grained information that can be used to analyze query performance. However, the output could be difficult to parse, making slow running queries difficult to correct. In addition, the profiler had to be individually activated for each MongoDB instance, and the output manually aggregated from every node to provide a holistic view across the entire deployment.
Delivered as part of Ops Manager and Cloud Manager, the new Visual Query Profiler provides a quick and convenient way for operations teams and DBAs to analyze specific queries or query families. The Visual Query Profiler (as shown in Figure 5) displays how query and write latency varies over time – making it simple to identify slower queries with common access patterns and characteristics, as well as identify any latency spikes. A single click in the Ops Manager UI activates the profiler, which then consolidates and displays metrics from every node in a single screen.
Figure 5: Visual Query Profiling in MongoDB Ops Manager
Index Suggestions & Automated Index Builds
Further simplifying operations, the visual query profiler will analyze data it collects to provide recommendations for new indexes that can be created to improve query performance. Once identified, these new indexes need to be rolled out in the production system.
In order to minimize the impact to the live system, the best practice is to perform a rolling index build – starting with each of the secondaries and finally applying changes to the original primary, after swapping its role with one of the secondaries. While this rolling process can be performed manually, Ops Manager and Cloud Manager can now automate the process across MongoDB replica sets, reducing operational overhead and the risk of failovers caused by incorrectly sequencing management processes.
New Indexing Option: Partial Indexes
Secondary indexes are one of the ways that MongoDB distinguishes itself from other NoSQL databases – allowing applications to efficiently access their data in multiple ways. However, secondary indexes do come with a cost:
● Database writes will be slower when they need to update the secondary index
● Memory and storage are needed to store the secondary index
Partial indexes deliver good query performance while consuming fewer system resources. For example, consider an order processing application. The order collection is frequently queried by the application to display all incomplete orders for a particular user. Building an index on the userID field of the collection is necessary for good performance. However, only a small percentage of orders are in progress at a given time. Limiting the index on userID to contain only orders that are in the “active” state could reduce the number of index entries from millions to thousands, saving working set memory and disk space, while speeding up queries even further as smaller indexes result in faster searches.
By specifying a filtering expression during index creation, a user can instruct MongoDB to include only documents that meet the desired conditions. When performing the database find operation, the application should include the value being used for the filtering as well as the indexed value in order for the partial index to be used by the optimizer.
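As a minimal sketch of the order processing example, the index definition and the two kinds of query can be written as plain Python data. The field names `userID` and `status` and the value `"active"` are assumptions for illustration; `partialFilterExpression` is the index option MongoDB 3.2 uses for the filtering expression. The helper `repeats_filter` is hypothetical and only a simplified equality check, not the query optimizer's real logic.

```python
# Index on userID that only contains documents for orders in the "active" state.
index_keys = [("userID", 1)]
index_options = {"partialFilterExpression": {"status": "active"}}

# With a driver such as PyMongo (assumed to be installed and connected), the
# index could be created along these lines:
#   db.orders.create_index(index_keys, **index_options)

# A query that repeats the filter condition, so the partial index can be considered.
active_orders_for_user = {"userID": 1234, "status": "active"}

# A query that omits the filter condition; the partial index cannot answer it
# completely, because non-active orders are not in the index at all.
all_orders_for_user = {"userID": 1234}


def repeats_filter(query, partial_filter):
    """Simplified check: does the query repeat every equality condition
    of the partial filter expression?"""
    return all(query.get(field) == value for field, value in partial_filter.items())


for query in (active_orders_for_user, all_orders_for_user):
    usable = repeats_filter(query, index_options["partialFilterExpression"])
    print(query, "can use the partial index" if usable else "cannot rely on the partial index")
```

The second query shows why the application has to include the filter value: an index that leaves out completed orders can never be used to return all orders for a user.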
Review the documentation to learn more.
Additional Ops Manager Enhancements
Beyond the functionality discussed above, a number of enhancements improve the productivity of Ops teams and simplify installation and management.
● Ops teams can now automate their database restores reliably and safely using Ops Manager and Cloud Manager. Complete development, test, and recovery clusters can now be built in a few simple clicks.
● Integrating with existing storage infrastructure, MongoDB backup files can now be stored on a standard network-mountable file system.
● Operations teams can configure backups against specific collections only, rather than the entire database, speeding up backups and reducing the requisite storage space. This enhancement is also available in the Cloud Manager platform.
● Installation and configuration of all application and backup components can now be made through the centralized Ops Manager UI, which also provides a single, at-a-glance dashboard for health monitoring.
● Enhancements to the backup architecture provide faster time to first database snapshot.
● Eliminating false alarms, maintenance windows can now be defined during which Ops Manager and Cloud Manager alerts will not be triggered.
From APM integration through to profiler visualization with index suggestions and automated rolling index builds, through to platform simplification and automated cluster restores, Ops Manager can make your teams more productive.
Next Steps
This post wraps up our 3-part blog series. As you have seen, MongoDB 3.2 is a significant release of the world’s fastest growing database:
● New storage engine options, coupled with document validation, the enhanced replication protocol and sharding improvements extend the range of mission-critical applications MongoDB can serve.
● New tools such as the BI Connector, Compass, and Cloud Manager integration to APM platforms allow organizations to take advantage of MongoDB while protecting investments in existing frameworks and workflows.
Remember, you can get the detail now on everything MongoDB 3.2 offers by downloading the What’s New white paper.
Alternatively, if you’ve had enough of reading about it and want to get your hands on the code now, then:
● Download MongoDB 3.2 today.
● Evaluate MongoDB Enterprise, along with the Connector for BI, MongoDB Compass and Ops Manager by heading to the MongoDB download center.
To start using MongoDB 3.2 as quickly and efficiently as possible, bring in the experts. MongoDB’s consulting engineers can deliver a private training on 3.2 features tailored to your needs, then work with you to develop a customized upgrade plan for your deployment. Interested? Learn more
About the Author - Mat Keep
Mat is a director within the MongoDB product marketing team, responsible for building the vision, positioning and content for MongoDB’s products and services, including the analysis of market trends and customer requirements. Prior to MongoDB, Mat was director of product management at Oracle Corp. with responsibility for the MySQL database in web, telecoms, cloud and big data workloads. This followed a series of sales, business development and analyst / programmer positions with both technology vendors and end-user companies.
How DataSwitch And MongoDB Atlas Can Help Modernize Your Legacy Workloads
Data modernization is here to stay, and DataSwitch and MongoDB are leading the way forward.
Research strongly indicates that the future of the Database Management System (DBMS) market is in the cloud, and the ideal way to shift from an outdated, legacy DBMS to a modern, cloud-friendly data warehouse is through data modernization. There are a few key factors driving this shift. Increasingly, companies need to store and manage unstructured data in a cloud-enabled system, as opposed to a legacy DBMS which is only designed for structured data. Moreover, the amount of data generated by a business is increasing at a rate of 55% to 65% every year and the majority of it is unstructured.
A modernized database that can improve data quality and availability provides tremendous benefits in performance, scalability, and cost optimization. It also provides a foundation for improving business value through informed decision-making. Additionally, cloud-enabled databases support greater agility so you can upgrade current applications and build new ones faster to meet customer demand.
Gartner predicts that by 2022, 75% of all databases will be on the cloud – either by direct deployment or through data migration and modernization. But research shows that over 40% of migration projects fail. This is due to challenges such as:
● Inadequate knowledge of legacy applications and their data design
● Complexity of code and design from different legacy applications
● Lack of automation tools for transforming from legacy data processing to cloud-friendly data and processes
It is essential to harness a strategic approach and choose the right partner for your data modernization journey. We’re here to help you do just that.
Why MongoDB?
MongoDB is built for modern application developers and for the cloud era. As a general purpose, document-based, distributed database, it facilitates high productivity and can handle huge volumes of data.
The document database stores data in JSON-like documents and is built on a scale-out architecture that is optimal for any kind of developer who builds scalable applications through agile methodologies. Ultimately, MongoDB fosters business agility, scalability and innovation.
Key MongoDB advantages include:
● Rich JSON Documents
● Powerful query language
● Multi-cloud data distribution
● Security of sensitive data
● Quick storage and retrieval of data
● Capacity for huge volumes of data and traffic
● Design supports greater developer productivity
● Extremely reliable for mission-critical workloads
● Architected for optimal performance and efficiency
Key advantages of MongoDB Atlas, MongoDB’s hosted database as a service, include:
● Multi-cloud data distribution
● Secure for sensitive data
● Designed for developer productivity
● Reliable for mission critical workloads
● Built for optimal performance
● Managed for operational efficiency
To be clear, JSON documents are the most productive way to work with data as they support nested objects and arrays as values. They also support schemas that are flexible and dynamic. MongoDB’s powerful query language enables sorting and filtering of any field, regardless of how nested it is in a document. Moreover, it provides support for aggregations as well as modern use cases including graph search, geo-based search and text search. Queries are in JSON and are easy to compose. MongoDB provides support for joins in queries. MongoDB supports two types of relationships with the ability to reference and embed. It has all the power of a relational database and much, much more.
Companies of all sizes can use MongoDB as it successfully operates on a large and mature platform ecosystem. Developers enjoy a great user experience with the ability to provision MongoDB Atlas clusters and commence coding instantly. A global community of developers and consultants makes it easy to get the help you need, if and when you need it.
In addition, MongoDB supports all major languages and provides enterprise-grade support.
Why DataSwitch as a partner for MongoDB?
Automated schema re-design, data migration & code conversion
DataSwitch is a trusted partner for cost-effective, accelerated solutions for digital data transformation, migration and modernization through a modern database platform. Our no-code and low-code solutions, along with cloud data expertise and unique, automated schema generation, accelerate time to market. We provide end-to-end data, schema and process migration with automated replatforming and refactoring, thereby delivering:
● 50% faster time to market
● 60% reduction in total cost of delivery
● Assured quality with built-in best practices, guidelines and accuracy
Data modernization: How “DataSwitch Migrate” helps you migrate from RDBMS to MongoDB
DataSwitch Migrate (“DS Migrate”) is a no-code and low-code toolkit that leverages advanced automation to provide intuitive, predictive and self-serviceable schema redesign from a traditional RDBMS model to MongoDB’s Document Model with built-in best practices. Based on data volume, performance, and criticality, DS Migrate automatically recommends the appropriate ETTL (Extract, Transfer, Transform & Load) data migration process. DataSwitch delivers data engineering solutions and transformations in half the timeframe of typical existing data modernization solutions.
Consider these key areas:
Schema redesign – construct a new framework for data management. DS Migrate provides automated data migration and transformation based on your redesigned schema, as well as no-touch code conversion from legacy data scripts to MongoDB Atlas APIs. Users can simply drag and drop the schema for redesign and the platform converts it to a document-based JSON structure by applying MongoDB modeling best practices. The platform then automatically migrates data to the new, re-designed JSON structure. It also converts the legacy database script for MongoDB.
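As a generic illustration of what moving from a relational model to a document model involves, the sketch below embeds child rows into their parent rows to produce JSON-like documents. This is not DataSwitch's implementation or API; the function `embed_children`, the table names, and the column names are all hypothetical and chosen only to show the idea.

```python
from collections import defaultdict


def embed_children(parent_rows, child_rows, parent_key, foreign_key, as_field):
    """Turn two relational-style row sets into documents with embedded arrays.

    Each child row is attached to the parent whose `parent_key` equals the
    child's `foreign_key`; the foreign key column is dropped from the embedded
    copy because the nesting already expresses the relationship.
    """
    grouped = defaultdict(list)
    for row in child_rows:
        embedded = {k: v for k, v in row.items() if k != foreign_key}
        grouped[row[foreign_key]].append(embedded)
    return [
        {**parent, as_field: grouped.get(parent[parent_key], [])}
        for parent in parent_rows
    ]


# Usage: customers and orders stored as separate "tables" become one document per customer.
customers = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Grace"},
]
orders = [
    {"order_id": 10, "customer_id": 1, "total": 25.0},
    {"order_id": 11, "customer_id": 1, "total": 40.0},
]
for document in embed_children(customers, orders, "customer_id", "customer_id", "orders"):
    print(document)
```

Whether to embed or to keep a reference is a modeling decision that depends on access patterns and data volume, which is the kind of choice a schema redesign step has to make for each relationship.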
This automated, user-friendly data migration is faster than anything you’ve ever seen. Here’s a look at how the schema designer works.
Refactoring – change the data structure to match the new schema. DS Migrate handles this through auto code generation for migrating the data. This is far beyond a mere lift and shift. DataSwitch takes care of refactoring and replatforming (moving from the legacy platform to MongoDB) automatically. It is a game-changing unique capability to perform all these tasks within a single platform.
Security – mask and tokenize data while moving the data from on-premise to the cloud. As the data is moving to a potentially public cloud, you must keep it secure. DataSwitch’s tool has the capability to configure and apply security measures automatically while migrating the data.
Data Quality – ensure that data is clean, complete, trustworthy, consistent. DataSwitch allows you to configure your own quality rules and automatically apply them during data migration.
In summary: first, the DataSwitch tool automatically extracts the data from an existing database, like Oracle. It then exports the data and stores it locally before zipping and transferring it to the cloud. Next, DataSwitch transforms the data by altering the data structure to match the re-designed schema, and applying data security measures during the transform step. Lastly, DS Migrate loads the data and processes it into MongoDB in its entirety.
Process Conversion
Process conversion, where scripts and process logic are migrated from legacy DBMS to a modern DBMS, is made easier thanks to a high degree of automation. Minimal coding and manual intervention are required and the journey is accelerated.
It involves:
● DML – Data Manipulation Language
● CRUD – typical application functionality (Create, Read, Update & Delete)
● Converting to the equivalent of MongoDB Atlas API
Degree of automation DataSwitch provides during Migration
| Schema Migration Activities | DS Automation Capabilities |
| Application Data Usage Analysis | 70% |
| 3NF to NoSQL Schema Recommendation | 60% |
| Schema Re-Design Self Services | 50% |
| Predictive Data Mapping | 60% |

| Process Migration Activities | DS Automation Capabilities |
| CRUD based SQL conversion (Oracle, MySQL, SQLServer, Teradata, DB2) to MongoDB API | 70% |

| Data Migration Activities | DS Automation Capabilities |
| Migration Script Creation | 90% |
| Historical Data Migration | 90% |
| 2 Catch Load | 90% |
DataSwitch Legacy Modernization as a Service (LMaas)
Our consulting expertise combined with the DS Migrate tool allows us to harness the power of the cloud for data transformation of RDBMS legacy data systems to MongoDB. Our solution delivers legacy transformation in half the time frame through pay-per-usage. Key strengths include:
● Data Architecture Consulting
● Data Modernization Assessment and Migration Strategy
● Specialized Modernization Services
DS Migrate Architecture Diagram
Contact us to learn more.