We recently introduced streamlined data sources in Atlas Charts, which eliminate the manual steps involved in adding data sources to Charts. With MongoDB Atlas project data automatically available in Charts, your visualization workflow can become quicker and simpler than ever.
For those unfamiliar with serverless instances and Online Archive, the data sources covered here, here’s a quick summary:
A serverless instance is an Atlas deployment model that lets you seamlessly scale usage based on workload demand and ensures you are only charged for resources you need.
Online Archive enables automated data tiering of Atlas data, helping you scale your storage and optimize costs while keeping data accessible.
These data sources serve two distinct use cases. Whether you are trying to eliminate upfront resource provisioning with a serverless instance or archiving high-volume workloads, such as time-series or log data, to reduce costs with Online Archive, Charts makes these sources natively available for visualization with zero ETL, just as it always has with your other Atlas clusters.
To see how easy it is to visualize these new data sources, let’s create a serverless instance called “ServerlessInstance0” and, separately, activate an Online Archive that runs daily on a cluster called “Cluster0” in Atlas (Figure 1).
When you set up an Online Archive, Atlas creates two instances of your data (Figure 2). One instance includes only your archived data. The second instance contains both your archived data and your live cluster data. This setup gives you additional flexibility to query data as your use case demands.
On the Data Sources page in Charts (Figure 3), all of the data sources are shown, including serverless instances and Atlas cluster data archived in Online Archive, neatly categorized by instance type and ready for use in charts and dashboards. (Note that project owners maintain full control of these data sources.) For more details about connecting and disconnecting data sources, review our documentation.
With these additions, Charts now supports all the cluster configurations you can create in Atlas, and we are excited to see how you achieve your visualization goals using these new data sources.
New to Atlas Charts? Get started today by logging into or signing up for MongoDB Atlas, deploying or selecting a cluster, and activating Charts for free.
MongoDB & IIoT: Turning Data into Business Intelligence
Manufacturing companies leverage business intelligence (BI) to sift through and analyze manufacturing and supply chain data in order to become more efficient and productive organizations. Often, the real hurdle with analytics is ensuring reliable access to relevant data sets. This article describes how to prepare data to yield strategic and operational insights through a combination of data tiering, federation, querying, and visualization.
Consider the scenario of a car manufacturer looking to implement a predictive maintenance program to reduce maintenance costs for its car assembly machines. Establishing an optimal data storage infrastructure is critical to allow them to find correlations between live IoT sensor data and historical maintenance records, thereby gaining insights into maintenance trends and correlating sensor data. As shown in Figure 1, such a challenge falls under step 3 of our IIoT end-to-end data integration framework: Compute.
Figure 1: Step 3 in end-to-end data integration framework for IIoT.
Read the first, second, and third articles in this series on end-to-end data integration in the context of IIoT.
Figure 2: Architecture overview of data visualization and analytics enabled by MongoDB.
The proposed methodology leverages the different data tiering capabilities of MongoDB covering the full data lifecycle to create a single API access for BI/analytics. Figure 2 summarizes the different MongoDB features and third-party integrations available to take advantage of the volumes of data generated over time for data-driven business insights.
The challenge of data tiering
The car manufacturer in our example would most likely need to differentiate between the different types of data needed for its predictive maintenance model. Here we make a distinction between operational and analytical workloads.
Operational workload: Refers to latency-sensitive data that affects functioning of equipment or powers critical applications/processes.
Analytical workload: Refers to live and historical data that does not power mission-critical applications but is readily stored and queried for the purpose of reporting, analytics, or training of AI/ML models.
Figure 3 provides a basic illustration of how MongoDB handles workload isolation, leveraging MongoDB replica sets to support real-time BI and analytical workloads without additional ETL jobs.
Figure 3: Illustration of workload isolation in MongoDB.
More advanced architecture patterns for workload isolation or data tiering can be achieved through sharding. Although these approaches are suitable for many scenarios, they still amount to hot/warm data tiers because storage and compute remain tightly coupled. For maximum cost efficiency at the expense of latency, we must consider newer cloud storage options, such as Amazon S3 or other blob stores, which decouple storage and compute and are well suited to storing so-called cold data. The challenge, however, is how to extract the data from hot stores (such as MongoDB) and bring it into cold storage (such as S3) while maintaining the ability to query the data through a single API. MongoDB provides several options to facilitate fully automated data tiering, including:
Online Archive
Atlas Data Lake
Online Archive: Rule-based data archiving
Online Archive in MongoDB Atlas provides an automated rule-based mechanism for moving data out of live/hot clusters to more cost-effective/cold storage (for example, Amazon S3 buckets). This feature removes the burden of building and maintaining potentially complex ETL jobs and data purging functionality while allowing users to configure data offloading within a few simple configuration steps.
Online Archive moves data based on criteria specified in archival rules (as shown in Figure 4). In our example of an auto manufacturing company, sensor data is an excellent use case for this type of data tiering.
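To make the idea of a date-based archival rule more concrete, here is a minimal Python sketch. It only simulates, on local data, how a rule of the form "archive documents whose date field is older than N days" would partition documents into hot and archivable sets. It does not call Atlas and is not how Online Archive is implemented. The function name `split_by_age`, the field name `ts`, and the machine names are hypothetical and chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def split_by_age(documents, date_field, keep_days, now):
    """Partition documents into (hot, archivable) lists.

    A document is considered archivable when its date_field value is
    older than keep_days relative to `now`. This mimics the semantics of
    a date-based archival rule; it is an illustration, not the Atlas API.
    """
    cutoff = now - timedelta(days=keep_days)
    hot, archivable = [], []
    for doc in documents:
        if doc[date_field] < cutoff:
            archivable.append(doc)
        else:
            hot.append(doc)
    return hot, archivable


# Hypothetical sensor readings from two assembly machines.
now = datetime(2024, 1, 31, tzinfo=timezone.utc)
readings = [
    {"machine": "press-1", "ts": datetime(2024, 1, 30, tzinfo=timezone.utc)},
    {"machine": "press-2", "ts": datetime(2023, 12, 1, tzinfo=timezone.utc)},
]

# Keep 30 days of data "hot"; anything older qualifies for archiving.
hot, archivable = split_by_age(readings, "ts", keep_days=30, now=now)
print([d["machine"] for d in hot])
print([d["machine"] for d in archivable])
```

In Atlas itself, the equivalent rule is configured declaratively (a date field plus an age threshold) rather than written as application code, so a function like this is useful mainly for reasoning about which documents a given threshold would move.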
Sensor data is “hot” when it's created and cools down over time, with less need for real-time queries. Our car manufacturer can easily configure an archival rule based on the timestamp in combination with the number of days they want to keep the data in the MongoDB cluster.
Figure 4: Animation showing how Online Archive works.
A broad set of MongoDB Atlas customers across industries already uses Online Archive to save storage costs while maintaining query ability across hot and cold data.
“With Online Archive, we were able to save an astounding 60% in data storage costs and 70% in cloud backup costs, reducing our overall database spend by 35%.” Martin Löper, Cloud Solutions Architect, Nesto Software
Although offloading data already provides major cost savings, there is also potential for more efficient data processing on the consumer side by optimizing the data structures and file formats toward more column-oriented analytical queries. For this purpose, MongoDB has recently released a Data Lake feature set (currently in Preview) that allows users to take advantage of new features such as columnar indexing and an optimized analytical file format.
Data Lake: Columnar indexing of database snapshots
Data Lake is MongoDB’s offering of a fully managed analytical storage solution that provides the economics of cloud object storage and is optimized for high-performing analytical queries. It works by reformatting data from a backup snapshot of the Atlas cluster and creating partitioned indexes (illustrated in Figure 5).
Figure 5: Diagram showing how Data Lake works.
Fully integrated as part of MongoDB Atlas, Atlas Data Lake is provisioned alongside Atlas clusters with no infrastructure to set up or manage, and no storage capacity to predict, making the user experience, administration, and support easy.
Returning to our example of predictive maintenance model development, performing columnar indexing on the collected data will result in high gains for analytical query performance.
Data Federation: Data virtualization made simple
Rarely do business analysts have all the required data in the same place. Often, it’s distributed among different domains and data stores as well as in different formats, like JSON, tabular, CSV, Parquet, Avro, and others. This leads to quite a complex landscape with different API languages, which makes it hard to get easy access to data across all these sources. That's where MongoDB's Atlas Data Federation comes in.
Data Federation allows bridging of these data silos by consolidating all the discussed data sources behind a single API without the need for data duplication (Figure 6). Users can group different data sources to virtual databases and collections and query the data with MQL or SQL across the various sources just like talking to a single DBMS. This approach reduces the effort, time-sink, and complexity of pipelines and ETL tools when working with data in different formats. It also allows users to seamlessly query, transform, and aggregate data from one or more data stores (i.e., Atlas cluster, Atlas Data Lake, Amazon S3 buckets, Online Archive, and HTTP endpoints) to create a single virtual database using the full power of the aggregation pipeline (Figure 7).
Figure 6: Diagram showing how Data Federation works in MongoDB Atlas.
Figure 7: Creating a virtual database in the MongoDB Atlas GUI.
Please refer to the documentation for a more detailed description of the process of creating a Federated Database Instance in MongoDB Atlas.
Data Federation endpoints are not just read-only APIs. Results of querying a federated database instance can be stored back in MongoDB clusters or as files in S3 buckets to power other real-time enterprise or end-user applications, or for performing other analytical tasks and visualizations.
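As a rough illustration of what an MQL query against a federated database might look like, the sketch below builds an aggregation pipeline as plain Python data. The collection layout and the field names (`ts`, `machineId`, `temperature`) are assumptions made for this example and do not come from any real schema. The pipeline uses only standard aggregation stages (`$match`, `$group`, `$sort`).

```python
from datetime import datetime, timezone


def build_sensor_summary_pipeline(since, machine_ids):
    """Build an aggregation pipeline that summarizes sensor readings.

    The pipeline filters readings by time and machine, then computes the
    average temperature and the number of readings per machine. All field
    names are hypothetical.
    """
    return [
        # Keep only readings for the selected machines since the cutoff.
        {"$match": {"ts": {"$gte": since},
                    "machineId": {"$in": list(machine_ids)}}},
        # One output document per machine.
        {"$group": {"_id": "$machineId",
                    "avgTemperature": {"$avg": "$temperature"},
                    "readings": {"$sum": 1}}},
        # Stable ordering for dashboards and reports.
        {"$sort": {"_id": 1}},
    ]


pipeline = build_sensor_summary_pipeline(
    since=datetime(2024, 1, 1, tzinfo=timezone.utc),
    machine_ids=["press-1", "press-2"],
)
print([next(iter(stage)) for stage in pipeline])
```

With a driver such as PyMongo, a pipeline like this could be passed to `collection.aggregate(pipeline)` on a connection to the federated database instance. The connection details are left out because they depend on your deployment, and because the same pipeline works unchanged whether the virtual collection is backed by a live cluster, an archive, or files in object storage.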
In the case of our car manufacturer, real-time sensor data and maintenance history can be queried together and made available to an analytical engine training ML models for remaining useful life prediction. The fastest way to start building compelling visualizations and gaining insight into the data across MongoDB clusters and file-based data sources through federated instances is through the use of Charts, which comes fully integrated in the Atlas product suite.
Data visualization with Charts
Charts provides a quick, simple, and yet powerful way to visualize data with multiple widgets, dynamic filters, and automatic data refresh like you know it from traditional BI tools. Atlas users can connect dashboards created in Atlas Charts with federated databases and perform correlation analytics in a no-code environment. Charts is fully integrated with the MongoDB Atlas product suite, which means that data sources in Atlas are immediately accessible from the interface, allowing users to add federated databases as a source for a variety of dashboard visualizations.
From displaying device sensor data to calculated values for more sophisticated insights, Charts provides widgets and custom fields calculations to achieve effective and insightful visualizations. Figures 8 and 9 show two examples of dashboards created in Charts showing time series sensor data from a smart factory and Overall Equipment Effectiveness (OEE) along with other manufacturing performance metrics information. Through the use of these powerful visualizations, the car manufacturer can understand the effect of optimal maintenance strategies on overall factory performance.
Figure 8: Sample shop floor monitoring dashboard created in Atlas Charts.
Figure 9: Sample OEE dashboard in Atlas Charts.
To harness existing knowledge and skills around familiar and popular BI tools such as Power BI and Tableau, MongoDB has developed Atlas SQL API, which gives users the option to connect SQL-based business intelligence and analytics tools to Atlas through a variety of drivers and connectors including:
Tableau Connector
Power BI Connector
JDBC Driver
ODBC Driver
These Atlas SQL connectors and drivers leverage Data Federation functionality, thereby enabling users to query data across Atlas clusters and cloud storage (such as S3 buckets) and to maintain the comfort of existing SQL-based BI tools that they are familiar with.
Getting started is easy using the Atlas SQL API at no cost with the detailed tutorial and the documentation. Register for a free Atlas user account to try it out.
Thank you to Karolina Ruiz Rogelj for her contributions to this post.
Watch our recorded webinar to see a live demonstration of how Atlas Federated Instances are created and used as a data source for MongoDB Charts and Tableau.
They Asked, We Answered: A Q&A on Joining MongoDB’s Remote Solutions Center
Our Remote Solutions Center (RSC) team offers those with technical backgrounds interested in working with customers an opportunity to jumpstart a career in pre-sales. We asked Soheyl Rafi, Solutions Architect and former Remote Solutions Center team member, some common questions candidates have about joining the team.
What is the day-to-day like on the Remote Solutions Center team?
Working in the Remote Solutions Center is a dynamic and ever-changing experience, with each day bringing unique challenges and opportunities. You can expect a blend of calls, hands-on activities, customer interactions, and problem-solving. The variety keeps the role super interesting.
Part of the diverse work environment comes from the collaboration that this role inherently entails. Throughout the day, you’ll engage in various customer interactions, ranging from discussing project requirements and challenges to proposing tailored solutions.
Another aspect could be your involvement in Technical Feasibility Workshops with customers. This is where your deep technical knowledge comes into play. You’ll be addressing intricate technical questions, providing insights, and ensuring that our solutions align with the customer’s needs.
Enablement also plays a pivotal part in the day-to-day. You will spend a lot of time learning new technologies and features released by our Product team, understanding the competition, and staying abreast of the market as a whole.
All in all, the day-to-day work is extremely diverse, and you’ll need to both enjoy and be comfortable wearing multiple hats.
What can I expect during the onboarding phase?
When you join the RSC, you can expect a comprehensive onboarding experience that covers both technical and sales aspects. Our onboarding plan is designed to provide hands-on training, ensuring that you become familiar with MongoDB technology and business processes.
From a technical standpoint, you’ll have access to detailed training sessions and resources.
You’ll be guided through the intricacies of our products and services, allowing you to build a strong foundation in your technical knowledge.
On the sales front, we have a tailored onboarding plan that focuses on honing the skills required for successful client engagement. This includes understanding our market positioning, customer needs, and effective sales strategies. You will be exposed to real-world scenarios and practical exercises to enhance your sales acumen.
To complement your onboarding experience, each new team member is paired with a more senior peer within the RSC. This mentorship helps you begin to build relationships within the team and provides valuable insights and guidance on navigating the role effectively.
In addition, RSC leadership recognizes the importance of learning from industry veterans. That’s why each new hire is also paired with a seasoned professional from outside the RSC who acts as a Solution Architecture buddy. This experienced mentor with tenure in the industry will offer a unique perspective and share valuable insights to accelerate your learning.
To gain practical exposure, you will have the opportunity to shadow calls and workshops conducted by experienced team members. This hands-on approach allows you to observe firsthand how we conduct business, manage client interactions, and collaborate within the team.
In summary, our onboarding program within the RSC is a holistic approach that combines technical training, sales development, mentorship, and practical experience.
How often can I expect to collaborate with Solutions Architects (SAs) in the field?
As an SA within the RSC, collaboration with SAs in the field is consistently high. We are strategically positioned around account activities and proactively engage with our counterparts in the field teams.
Depending on the opportunity during an engagement, you might assist in the early stages of the sales cycle, such as discovery and demos, before passing the deal onto the field SA. Collaboration in these engagements is pivotal, requiring alignment to ensure a seamless handover to the field teams.
What internal career development opportunities are there for me?
Upon joining the RSC team, you’ll immerse yourself in a dynamic learning environment. Unlike traditional career structures, the RSC team fosters an environment where you are in the driver’s seat, dictating the pace of your internal career development. No artificial roadblocks hinder you from taking on more challenging and senior-level tasks. Your dedication, skills, and initiative are the primary determinants of how quickly you progress.
While there are specific tasks you’ll work on in your day-to-day, you’ll also have a wide range of internal projects available to enhance your skill sets and advance your career within MongoDB. In a team with diverse interests and skills, you can choose projects that align with your passion and interests. You’ll find yourself in a situation with structured learning paths, mentorship programs, project leadership opportunities, and a career trajectory limited only by your aspirations.
In my career journey, I’ve achieved my goal of growing from an Associate SA to being promoted to a Solutions Architect for our enterprise customers. I am now a dedicated Solutions Architect to one of our biggest financial customers, helping them in their digital transformation and expanding their MongoDB footprint.
What new things will I learn if I join the team?
The question should be, “What will you not learn?” Databases are at the center of every tech stack. You will be exposed to and gain an understanding of the entire tech stack, including the underlying infrastructure and the application that will be built on top of MongoDB.
In your role, you’ll find yourself engaging with customers to discuss the various technologies used in application development, the infrastructure decisions made, and other database solutions that are either part of their tech stack or under consideration. Each customer employs different methodologies for developing software and utilizes various programming languages and solutions. It is pivotal as an SA in the RSC to comprehend these diverse solutions and effectively communicate them to our customers.
Beyond the technical aspects, you will start to see things from a macro perspective. Understanding your customer’s business is crucial. You will need to learn how to align technical solutions with business objectives, considering factors such as budget, timelines, and return on investment.
Learn more about applying your technical skills and engaging with customers as part of our Remote Solutions Center.