How to Rapidly Build a Globally Scalable & Secured eCommerce SaaS Platform
12 PM EST Join this Tech Talk as we sit down with Sanjay Garje, Co-Founder and CTO of Ribbon, the company behind the hybrid trade show SaaS platform that makes B2B Commerce fun, simple, and accessible to everyone across the globe. Prior to Ribbon, at Amazon Web Services (AWS), Sanjay was a Global leader helping companies like Netflix, Intuit, GoDaddy with their journey to AWS Cloud. So when a seasoned Cloud technology leader like Sanjay decided to build his own technology SaaS platform, he chose MongoDB Atlas as the database engine. Why? Listen in as Sanjay shares his insights into: Leading a growing eCommerce SaaS platform technology company Scaling the databases design strategy to meet global demand: Crawl, walk, run Delighting customers with the real-time business intelligence without costing an arm and a leg Making the switch from internally managing the database to a self-managed database Better alignment for Center for Internet Security (CIS v8) Data Protection and Recovery Controls Register today and save your spot for our live virtual session!
Safe Software Deployment: The 180 Rule
In my last post , I talked about the anxiety developers feel when they deploy software, and the negative impact that fear has on innovation. Today, I’m offering the first of four methods I’ve used to help teams overcome that fear: The 180 Rule. Developers need to be able to get software into production, and if it doesn’t work, back it out of production as quickly as possible and return the system to its prior working state. If they have confidence that they can detect problems and fix them, they can feel more confident about deploying. All deployments have the same overall stages: Deployment: You roll the software from staging to production, either in pieces -- by directing more and more transactions to it -- or by flipping a switch. This involves getting binaries or configuration files reliably to production and having the system start using them. Monitoring: How does the system behave under live load? Do we have signals that the software is behaving correctly and performantly? It’s essential that this monitoring focuses more on the existing functionality than just the “Happy Path” of the new functionality. In other words, did we damage the system through the rollout? Rollback: If there is any hint that the system is not working correctly, the change needs to be quickly rolled back from production. In a sense, a rollback is a kind of deployment, because you’re making another change to the live system: returning it to a prior state. The “180” in the name of the rule has a double meaning. Of course, we’re referring here to the “180 degree” about-face of a rollback. But it’s also a reference to an achievable goal of any deployment. I believe that any environment should be able to deploy software to production and roll it back if it doesn’t work in three minutes, or 180 seconds. This gives 60 seconds to roll binaries to the fleet and point your customers to them, 60 seconds to see if the transaction loads or your canaries see problems, and then 60 seconds to roll back the binaries or configurations if needed. Of course, in your industry or for your product, you might need this to be shorter. But the bottom line is that a failed software deployment should not live in production for more than three minutes. Developers follow these three stages all the time, and they often do it manually. I know what you’re thinking: “How can any human being deploy, monitor, and roll back software that fast?” And that is the hidden beauty of the 180 Rule. The only way to meet this requirement is by automating the process. Instead of making the decisions, we must teach the computers how to gather the information and make the decisions themselves. Sadly, this is a fundamental change for many companies. But it’s a necessary change. Because the alternative is hoping things will work while fearing that they will not. And that makes developers loath to deploy software. Sure, there are a lot of tools out there that help with deployments. But this is not an off-the-shelf, set-it-and-forget-it scenario. You, as the developer, must provide those tools with the right metrics to monitor and the right scripts to both deploy the software and possibly roll it back. The 180 Rule does not specify which tools to use. Instead it forces developers to create rigorous scripts and metrics, and ensure they can reliably detect and fix problems quickly. There’s a gotcha that many of you are thinking of: The 180 Rule is not applicable if the deployment is not reversible. For example, deploying a refactored relational schema can be a big problem, because a new schema might introduce information loss that prevents a roll-back. Or the deployment might delete some old config files that aren’t used by the new software. I’ll talk more about how to avoid wicked problems like these in my subsequent posts. But for now, I’m interested to hear what you think of The 180 Rule, and whether you’re using any similar heuristics in your approach to safe deployment.
Safe Software Deployment: Overcoming the Fear and Loathing of Pushing to Prod
Over the course of my career, I’ve had the privilege of deploying many different types of software. I’ve shipped CDs. I’ve pushed customer software over the web. I’ve updated database instances and control planes. And I’ve live-updated large, running, mission-critical systems. I call this a privilege because getting software into the hands of end users is what software engineers love most. But deployments are not all fun and games. And while each deployment presents its own unique challenges, there is one thing they all have in common: fear. Those of you responsible for significant software deployments know exactly what I’m talking about. You work, you prepare, you test. But when the day finally comes for your software to set sail, you are left hoping and praying it proves seaworthy on the Ocean of Production. In most companies, production is so different from your development and staging environments, that it’s almost impossible to know whether the code that worked in staging is going to succeed in production. Yet one thing is certain: if your software fails, everybody is going to know about it. Hence the fear. When it comes to understanding the effects of fear on the developer, I think Frank Herbert, author of the epic science-fiction saga Dune, said it best: “Fear is the mind-killer.” Fear undermines experimentation and the entrepreneurial spirit. It discourages risk-taking and leads to bad habits, like avoiding deployment for months. And worst of all, fear slows down the innovation process. (See my post on the Innovation Tax many organizations are paying, and don’t know it.) Pushing to production is undeniably scary. But over the last 30 years, working with my peers, I’ve developed a few methods for creating the conditions for safe, confident deployments. And my next four blogs in this series will unpack each of them in turn: · The 180 Rule - Enabling fast, automated, easily reversible deployments · Z Deployments - Limiting downtime from failed rollbacks · The Goldilocks Gauge - Making the size and frequency of deployments just right . Through the Looking Glass - Ensuring alignment between Dev, Stage, and Prod environments These methodologies aren’t perfect and they won’t guarantee you a bug-free deployment. But they’re the best practices I’ve seen. And they help create a culture of confidence within an engineering team, which is the foundation of meaningful innovation. To get started, my next blog will explain the “180 Rule” to help you reduce outage minutes in production. In the meantime, feel free to share your own tips and techniques for safe deployments with @MarkLovesTech .
Let’s .explain() Aggregations
EMEA/APAC: Nov 18, 9am BST | 2:30pm IST AMER: Nov 18, 9am PT | 12pm ET MongoDB’s aggregation pipeline makes it easy for you to transform and analyze data of any structure, right inside the database. You can leverage this functionality to easily create and run incredibly complex data pipelines. However, you must also take into consideration its performance characteristics compared to more traditional operations. Join us for our upcoming webinar to gain a better understanding of how aggregations are optimized and executed — whether you are writing a new one or looking to tune something that is already in production. We’ll cover: An overview on the structure of explain plans in MongoDB Various optimizations that MongoDB executes when parsing aggregation pipelines Best practices you can leverage to avoid commonly observed anti-patterns Additional tips & tricks to improve the query performance You’ll also have an opportunity to get your questions answered in the live Q&A session with MongoDB experts, so save your spot now!
Take Advantage of Low-Latency Innovation with MongoDB Atlas, Realm, and AWS Wavelength
Build a Single View of Your Customers with MongoDB Atlas and Cogniflare's Customer 360
The key to successful, long-lasting commerce is knowing your customers. If you truly know your customers, then you understand their needs and wants and can identify the right product to deliver to them—at the right time and in the right way. However, for most B2C enterprises, building a single view of the customer poses a major hurdle due to copious amounts of fragmented data. Businesses gather data from their customers in multiple locations, such as ecommerce platforms, CRM, ERP, loyalty programs, payment portals, web apps, mobile apps and more. Each data set can be structured, semi-structured or unstructured, delivered as stream or require batch processing, which makes compiling already fragmented customer data even more complex. This has led some organizations to bespoke solutions, which still only provide a partial view of the customer. Siloed data sets make running operations like customer service, targeted marketing and advanced analytics—such as churn prediction and recommendations—highly challenging. Only with a 360 degree view of the customer can an organization deeply understand their needs, wants and requirements, as well as how to satisfy them. A single view of that 360 data is therefore vital for a lasting relationship. In this blog, we’ll walk through how to build a single view of the customer using MongoDB’s database and Cogniflare’s Calledio Customer 360 tool. We’ll also explore a real-world use case focused on sentiment analysis. Building a single view with Calleido's Customer 360 With a Customer 360 database, organizations can access and analyze various individual interactions and touchpoints to build a holistic view of the customer. This is achieved by acquiring data from a number of disparate sources. However, routing and transforming this data is a complex and time-consuming process. Many of the existing Big Data tools often aren’t compatible with cloud environments. These challenges inspired Cogniflare to create Calleido . Figure 1: Calleido Customer 360 Use Case Architecture Calleido is a data processing platform built on top of battle-tested open source tools such as Apache NiFi. Calleido comes with over 300 processors to move structured and unstructured data from and to anywhere. It facilitates batch and real-time updates, and handles simple data transformations. Critically, Calleido seamlessly integrates with Google Cloud and offers one-click deployment. It uses Google Kubernetes Engine to scale up and down based on the demand, and provides an intuitive and slick low-code development environment. Figure 2: Calleido Data Pipeline to Copy Customers From PostgreSQL to MongoDB A real-world use case: Sentiment analysis of customer emails To demonstrate the power of Cogniflare’s Calleido , MongoDB Atlas , and the Customer 360 view, consider the use case of conducting a sentiment analysis on customer emails. To streamline the build of a Customer 360 database, the team at Cogniflare created flow templates for implementing data pipelines in seconds. In the upcoming sections, we’ll walk through some of the most common data movement patterns for this Customer 360 use case and showcase a sample dashboard. Figure 3: Sample Customer Dashboard The flow commences with a processor pulling IMAP messages from an email server (ConsumeIMAP). Each new email that arrives into the chosen inbox (e.g. customer service), triggers an event. Next, the process extracts email headers to determine topline details about the email content (ExtractEmailHeaders). Using the sender's email, Calleido identifies the customer (UpdateAttribute) and extracts the full email body by executing a script (ExecuteScript). Now, with all the data collected, a message payload is prepared and published through Google Cloud Platform (GCP) Pub/Sub (Kafka can also be used) for consumption by downstream flows and other services. Figure 4: Translating Emails to Cloud PubSub Messages The GCP Pub/Sub messages from the previous flow are then consumed (ConsumeGCPPubSub). This is where the power of MongoDB Atlas integration comes in as we verify each sender in the MongoDB database (GetMongo). If a customer exists in our system, we pass the email data to the next flow. Other emails are ignored. Figure 5: Validating Customer Email with MongoDB and Calleido Analysis of the email body copy is then conducted. For this flow, we use a processor to prepare a request body, which is then sent to Google Cloud Natural Language AI to assess the tone and sentiment of the message. The results from the Language Processing API then go straight to MongoDB Atlas so they can be pulled through into the dashboard. Figure 6: Making Cloud AutoML Call with Calleido End result in the dashboard: The Customer 360 database can be used in internal back-office systems to supplement and inform customer support. With a single view, it’s quicker and more effective to troubleshoot issues, handle returns and resolve complaints. Leveraging information from previous client conversations ensures each customer is given the most appropriate and effective response. These data sets can then be fed into analytics systems to generate learnings and optimizations, such as associating negative sentiment with churn rate. How MongoDB's document database helps In the example above, Calleido takes care of copying and routing data from the business source system into MongoDB Atlas, the operational data store (ODS). Thanks to MongoDB’s flexible data structure, we can transfer data in its original format, and subsequently implement necessary schema transformations in an iterative manner. There is no need to run complex schema migrations. This allows for the quick delivery of a single view database. Figures 7 & 8: Calleido Data Pipelines to Copy Products and Orders From PostgreSQL to MongoDB Atlas Calleido allows us to make this transition in just a few simple steps. The tool runs a custom SQL query (ExecuteSQL) that will join all the required data from outer tables and compile the results in order to parallelize the processing. The data arrives in Avro format, then Calleido converts it into JSON (ConvertAvroToJSON) and transforms it to the schema designed for MongoDB (JoltTransformJSON). End result in the Customer 360 dashboard: MongoDB Atlas is the market-leading choice for the Customer 360 database. Here are the core reasons for its world-class standard: MongoDB can efficiently handle non-standardized schema coming from legacy systems and efficiently store any custom attributes. Data models can include all the related data as nested documents. Unlike SQL databases, MongoDB avoids complicated join queries, which are difficult to write and not performant. MongoDB is rapid. The current view of a customer can be served in milliseconds without the need to introduce a caching layer. The MongoDB flexible schema model enables agility with an iterative approach. In the initial extraction, the data can be copied nearly exactly as its original shape. This drastically reduces latency. In subsequent phases, the schema can be standardized and the quality of the data can be improved without complex SQL migrations. MongoDB can store dozens of terabytes of data across multiple data centers and easily scale horizontally. Data can be shared across multiple regions to help navigate compliance requirements. Separate analytics nodes can be set up to avoid impacting performance of production systems. MongoDB has a proven record of acting as a single view database, with legacy and large organizations up and running with prototypes in two weeks and into production within a business quarter. MongoDB Atlas can autoscale out of the box, reducing costs and handling traffic peaks. The data can be encrypted both in transit and at rest, helping to accomplish compliance with security and privacy standards, including GDPR, HIPAA, PCI-DSS, and FERPA. Upselling the customer: Product recommendations Upselling customers is a key part of modern business, but the secret to doing it successfully is that it’s less about selling and more about educating. It’s about using data to identify where the customer is in the customer journey, what they may need, and which product or service can meet that need. Using a customer's purchase history, Calleido can help prepare product recommendations by routing data to the appropriate tools such as BigQuery ML. These recommendations can then be promoted through the call center and marketing teams for both online or mobile app recommendations. There are two flows to achieve this: preparing training data and generating recommendations: Preparing training data First, appropriate data from PostgreSQL to BigQuery is transferred using the ExecuteSQL processor. The data pipeline is scheduled to execute periodically. In the next step, appropriate data is fetched from PostgreSQL, dividing it to 1K row chunks with the ExecuteSQLRecord processor. These files are then passed to the next processor which uses load balancing enabled to utilize all available nodes. All that data then gets inserted to a BigQuery table using the PutBigQueryStreaming processor. Figure 9: Copying Data from PostgreSQL to BigQuery with Calleido Generating product recommendations Next, we move on to generating product recommendations. First, you must purchase Big Query capacity slots, which offer the most affordable way to take advantage of BigQuery ML features. Here, Calleido invokes an SQL procedure with the ExecuteSQL processor, then ensures that the requested BigQuery capacity is ready to use. The next processor (ExecuteSQL) executes an SQL query responsible for creating and training the Matrix Factorization ML model using the data copied by the first flow. Next in the queue, Calleido uses the ExecuteSQL processor to query our trained model to acquire all the predictions and store them in a dedicated BigQuery table. Finally, the Wait processor waits for both capacity slots to be removed, as they are no longer required. Figure 10 & 11: Generating Product Recommendations with Calleido Then, we remove old recommendations through the power of two processors. First, the ReplaceText processor updates the content of incoming flow files, setting the query body. This is then used by the DeleteMongo processor to perform the removal action. Figure 12: Remove Old Recommendations The whole flow ends with copying Recommendations to MongoDB. The ExecuteSQL processor fetches and aggregates the top 10 recommendations per user, all in chunks of 1k rows. Then, the following two processors (ConvertAvroToJSON and ExecuteScript) prepare data to be inserted into the MongoDB collection, by the PutMongoRecord processor. Figure 13: Copy Recommendations to MongoDB End result in the Customer 360 dashboard (the data used here in this example is autogenerated): Benefits of Calleido's 360 customer database on MongoDB Atlas Once the data is available in a centralized operational data store like MongoDB, Calleido can be used to sync it with an analytics data store such as Google BigQuery. Thanks to the Customer 360 database, internal stakeholders can then use the data to: Improve customer satisfaction through segmentation and targeted marketing Accurately and easily access compliance audits Build demand planning forecasts and analyses of market trends Reward customer loyalty and reduce churn Ultimately, a single view of the customer enables organizations to deliver the right message to prospective buyers, funneling those at the brand awareness stage into the conversion stage and ensures retention and post sales mechanics are working effectively. Historically, a 360 view of the customer was a complex and fragmented process, but with Cogniflare’s Calleido and MongoDB Atlas, a Customer 360 database has become the most powerful and cost efficient data management stack that an organization can harness.
Real-Time Analytics with MongoDB Change Streams
Querying over a million data points to compute analytics takes time — which is a problem when you need real-time updates. With MongoDB, you can meet the challenges of modern analytics by delivering insights directly on fresh, operational data without time-consuming and fragile ETL. In this 20-minute presentation, you’ll learn how organizations like Rent the Runway, Verizon, and Pitney Bowes are using MongoDB for analytics to help them visualize unstructured data, deliver real-time intelligence, scale diverse workloads in the cloud, and harness the power of AI. Shrey Batra, MongoDB User Group lead, will discuss how MongoDB creates patterns of data modelling and application architecture, so any new use case can be solved in a plug-and-play type architecture, without affecting your application code. Topics include: What is Real-Time Analytics? Modelling High Volume Metrics MongoDB Change Streams [Plug-n-Play] MongoDB Atlas Triggers Solving a Real Life Example
TECH REMIX WITH MONGODB
MongoDB hosted a virtual session, specifically for 17LIVE. MongoDB product experts and your dedicated Account Manager will share best practices for working on Replica Set, Sharing and provide examples of how you can innovate through MongoDB. What we will cover during the session: Replica Set What is ReplicaSet Write and writeConcern Read and readConcern Sharding What is Sharding The Components - mongos, config server Shard Key Strategy Advanced Topic about Sharding
MongoDB’s Application Data Platform via Edge through Cloud to Mobile
Industry of Things World 2021 This session provides an update on why MongoDB powers some of the most popular IoT platforms today and how our application data platform just got even better with native support for time series data. Watch the recording of this presentation by Felix Reichenbach, Princ. Industry Solutions at MongoDB, speaking at this year's Industry of Things World! What you'll learn: How to reduce complexity and data duplication with MongoDB’s application data platform Learn about our new native time series support Popular IoT Use Cases with MongoDB