Why I Wrote the New MongoDB Aggregations Book
In early May 2021, I published my book, Practical MongoDB Aggregations, which I released electronically and free for anyone to read.
I love the MongoDB database and the uniqueness and power of its aggregation framework to analyse and manipulate massive amounts of data intuitively and efficiently. The opportunity to share this passion with others spurred me to write the book, with which I aim to support developers, architects, data analysts, data engineers, and data scientists to better understand how to maximise their productivity and effectiveness when building aggregation pipelines, as well as how to optimise these pipelines.
Like many people over the past year during the pandemic, I’ve struggled to keep myself occupied when not busy doing my day job. Hence, my book was born not just from a desire to improve people’s knowledge but as my pandemic project, written over many weekends, to stave off the boredom.
I believe aggregation pipelines provide a powerful domain-specific language for data processing in a way I’ve not seen before in other data-oriented tools, languages, or standards. SQL is a good data query language that caters to some analytical use cases via “group-by/having” statements. However, it typically has to be paired with a procedural language (e.g., Oracle’s PL/SQL) to encompass an ordered set of complex data transformation rules. In the big data world of Hadoop, I find the MapReduce approach is too complex to develop with efficiently. Higher-level tools like Spark help alleviate some of this. However, by the necessity of still having to be general-purpose and versatile, the amount of Spark code required to process data sitting in any type of database is still too high for my liking. Many ETL tools provide proprietary data transformation capabilities, but these have to cater to the lowest common denominator capabilities across all the different types of databases they interact with.
For these reasons and from experience, I consider MongoDB Aggregations to be the best tool for processing large data sets because it combines performance with productivity. Nevertheless, I sense the aggregation framework is shrouded in mystery for many people, hence my desire to demystify it with this book.
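To make the comparison with SQL's "group-by/having" concrete, here is a minimal sketch of an aggregation pipeline expressed as an ordered list of stages. The collection name (`orders`) and the field names (`status`, `customer_id`, `value`) are hypothetical and chosen only for illustration. The second `$match` stage plays roughly the role of SQL's HAVING clause, because it filters the grouped results.

```python
import json

# Hypothetical "orders" collection with assumed fields: status, customer_id, value.
# Each stage is a plain dictionary, and the stages run in list order.
pipeline = [
    # Keep only completed orders (similar to a SQL WHERE clause)
    {"$match": {"status": "complete"}},
    # Total the order value and count the orders per customer (similar to GROUP BY)
    {"$group": {
        "_id": "$customer_id",
        "total_value": {"$sum": "$value"},
        "order_count": {"$sum": 1},
    }},
    # Keep only customers whose total is at least 100 (similar to HAVING)
    {"$match": {"total_value": {"$gte": 100}}},
    # Show the highest totals first (similar to ORDER BY ... DESC)
    {"$sort": {"total_value": -1}},
]

# The ordered stage names make the shape of the transformation easy to read.
stage_names = [next(iter(stage)) for stage in pipeline]

print(json.dumps(pipeline, indent=2))
```

With the PyMongo driver, a list like this would typically be passed to `db.orders.aggregate(pipeline)`, assuming a running deployment and a collection of that name.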
I believe I identified a knowledge gap that many users wanted to be filled. MongoDB Inc. provides excellent reference documentation about aggregations in the MongoDB Manual, and MongoDB University provides a tremendous free online training course on aggregations. What I felt was still to be addressed was an opinionated yet informed perspective on how best to assemble aggregation pipelines from the well-documented parts—something that points the way to achieve optimal productivity and performance, accompanied by fully formed example pipelines to help put these approaches into practice.
I hope readers of my book will learn some new things of value and enjoy reading it. A good test of the relevance of my book, in time, will be if people come back to it repeatedly as they continue with their journey of developing aggregations.
Meet Some of MongoDB's Working Parents
In honor of International Family Day, I sat down with a few MongoDB parents to learn more about their experiences as working parents, how they’ve utilized MongoDB’s family benefits, what this past year has been like for them, and their advice to others.
Javier Molina, SVP of Global Corporate & Cloud Sales, Austin
I was born and raised in Austin, Texas, where I still live today with my family. I have been married to my college sweetheart for 13 years and we have three beautiful kids together. I am the SVP of Global Corporate & Cloud Sales and have been with MongoDB since August 2017, pre-IPO. I have teams all over the world, mainly located in Austin, Dublin, New Delhi, and Mexico City.
Separating work from home life due to COVID-19 has been challenging. Early in my career, my wife and I established that my drive home would be my opportunity to unwind and mentally unplug from the day. Now, it’s become difficult to find that time for myself to reflect on the day and put it behind me. However, the additional time at home with my family has been very much welcomed. With growing responsibilities in my career, time had been moving extremely fast prior to the pandemic. I’m grateful that over the last 15 months I’ve had the opportunity to spend more time than ever with my kids. I taught my oldest to ride his bike, I was there to potty train my daughter, and with my youngest born in May of 2020, I’ve been able to spend every day with him, the first time I’ve been able to do so with any of my children.
One of the things that I love about being a parent is seeing the joy of life and learning about the world through my children’s eyes.
We’ve had two of my three children during my time at MongoDB, and outside of the extremely generous parental leave policy itself, my leadership team and direct reports have been extremely supportive. They’ve allowed me to take as much time as I needed without feeling guilty that I wasn’t fulfilling my responsibilities.
Additionally, being in sales, sometimes it can be tough to take time off. However, with the benefits that come with our parental leave policy, I felt extremely comfortable taking the time I needed over several months.
Whether both parents work, you’re co-parenting, or one parent stays at home, being more intentional with your time is extremely important. If you’re not paying attention, you can find yourself working during family time or not applying yourself at work due to family obligations. Finding the balance between the two and being open with both your partner and your manager about your obligations helps align your support system to better support you and your family.
Sinead Mcniel, Enterprise Territory Management Specialist, Austin
I came to MongoDB three years ago as a sales rep and transitioned to our sales operations team in 2020. I live in Austin, Texas with my family, which includes my partner Conner, my 9-month-old daughter Isla, and our two dogs and two cats.
Being pregnant and having a child during the COVID-19 pandemic was interesting, to say the least. The experience has been far from normal. During the beginning of the pandemic when I was pregnant, it was definitely stressful and scary not knowing much about the virus. Once Isla arrived and I went back to work, there was a whole new challenge. Working from home brings a lot of distractions without a baby, so you can imagine what it is like with one!
Although there have been challenges, there have also been a lot of positives as a work-from-home parent. The time I get to spend with my daughter is a huge positive. Between meetings, I can run downstairs and love on her or eat with her during my lunch break. A less obvious benefit was not having to worry about going into a room multiple times a day to pump milk or worry about my milk supply decreasing. I’m really grateful that I’ve been able to have this time at home with Isla.
MongoDB has been incredibly supportive throughout my pregnancy and journey to becoming a new parent. I could not ask for more supportive or understanding leadership and colleagues. On top of that, MongoDB provides amazing benefits to new parents like a 20-week parental leave, a new moms Slack channel, and an awesome app called Cleo. Cleo has been one of the most valuable benefits to us, as it offers virtual birthing classes, lactation consultants, and parenting guides and tips. This was especially helpful in a virtual environment. They even sent us a mini MongoDB hoodie for Isla!
I also utilized our parental leave, which was invaluable. The first few months of your child’s life are really demanding, and juggling that plus work would have been an incredible challenge. Having 20 weeks to bond with my daughter and learn how to be the best mom I could be was so helpful.
Leanna Lewis, Customer Success Manager, Sydney
I’ve lived in Sydney, Australia for the past five years and joined MongoDB as the first Customer Success Manager (CSM) in APAC. I have a wonderful partner, Bryan, and a beautiful 1-year-old daughter, Marceline. Outside of work, I enjoy travelling and skydiving (yes, I skydive for fun and have roughly 630 jumps).
Marceline was born in April 2020, and I was fortunate to have 20 weeks of paid parental leave to bond with her. I also gave birth at an amazing private hospital under the care of a specialist obstetrician because we have full coverage private medical insurance through MongoDB.
When I returned to work, I received fantastic support from MongoDB, and my manager implemented a re-ramping plan to ensure I had a gentle transition back into the role. I was given plenty of time to train and re-familiarise myself with the technology and catch up on what had changed. I genuinely feel like the break reignited my passion for my role, and I became a much better CSM for it.
A colleague also added me to a mums-only Slack channel where we could share ideas and anecdotes of being a working mum, and it helped me connect to colleagues across the globe who were on a similar journey.
The biggest challenge was the initial mental struggle of returning to work. I was torn because I was desperate for non-mum-related conversations, and I needed the mental stimulation of work. As much as I loved being on parental leave, the 24/7 mum life doesn’t suit me, but I felt guilty, as though I was abandoning my daughter every day.
Prior to COVID-19, I spent a lot of time in the office. Removing the commute has doubled the time I get with my daughter on work days, which means the absolute world to me. Now I can be flexible, predominantly working from home and only going into the office when necessary.
What I love most about being a parent is the overwhelming sense of love and connection to someone new in the world. I live life through Marceline’s eyes and love watching her grow, learn, and develop. It’s everything my partner and I could ever want!
If you are a working parent, I cannot stress enough how important it is to take time for yourself. No matter how much guilt you may feel for working full-time, you need to set the right example for your kids so that they also put their health and happiness first.
Eoin Brazil, Staff Curriculum Engineer, Dublin
I have worked at MongoDB for around seven and a half years in various roles, from supporting our customers as an engineer, to developing software for internal use, to most recently joining the Education team, where I teach and create content to help people learn MongoDB. I live in the lovely Dublin suburb of Ranelagh with my wife, Gemma, and our two daughters, Clodagh and Bronagh.
In Ireland, the first wave of COVID-19 presented a real issue for childcare. Ireland had one of the most stringent lockdowns in Europe, and childcare facilities stayed closed for months.
My wife is a community pharmacist who has gone to work as normal throughout the pandemic. The lack of childcare and balancing both of our jobs was the single biggest challenge we faced as working parents.
MongoDB really helped with emergency leave, which allowed me to look after the children whilst my wife ran her pharmacy. Without this help, things would have been so much more stressful and difficult to manage. My manager was very supportive and understanding of the entire situation, as he too had a family and encountered several similar challenges. I also have to give a huge shout-out to the MongoDB-Babies Slack channel. Even if it was just a cute baby photo every few days, it really did help to hear from colleagues who were facing the same challenges regardless of where in the world they were.
A year before COVID-19, I utilized our parental leave and was lucky to have spent 20 weeks bonding with my youngest, Bronagh. Working from home has helped deepen the bond with both of my daughters, and the flexibility around scheduling has allowed me to spend more time with them. The curiosity of a young mind is amazing, as are the questions without boundaries. I look forward to continuing to watch them experience the world.
My wife and I met later in life and have been incredibly fortunate to have our daughters after encountering many difficulties trying to start a family. It turns out that more people than you think have challenges on the path to parenthood, so if you’re hoping to start a family, don’t be afraid to reach out to others for support. You will likely find that a difficulty shared is a difficulty halved. Any troubles you encounter will be rewarded a thousandfold by the simple smile and hand holding of a child who believes you are the center of their universe.
MongoDB supports all employees on their journey to starting a family, regardless of age, sexual orientation, gender identity, or marital status.
Our partnership with Carrot provides employees with customized fertility benefits including IVF treatments, genetic testing, egg freezing, donor eggs, donor sperm, surrogacy, adoption, and more. Learn more about our employee benefits.
Interested in pursuing a career at MongoDB? We have several open roles on our teams across the globe and would love for you to transform your career with us!
Congratulations to the 2023 APAC Innovation Award Winners
I’m thrilled to announce the nine winners of the 2023 MongoDB APAC Innovation Awards. The MongoDB Innovation Awards honor projects and people who dream big. They celebrate the groundbreaking use of data to build compelling applications and the creativity of professionals expanding the limits of technology with MongoDB. This year, we have broken the awards down regionally to celebrate organizations in APAC, from startups to industry-leading enterprises, across a wide variety of industries, who are delivering big results. We are delighted to announce the winners below.
2023 MongoDB APAC Innovation Award Winners
Positive Impact: Open Government Products
Open Government Products (OGP) is an in-house team of engineers, designers, and product managers that is part of the Singapore Government and is responsible for building technologies for the public good. OGP used MongoDB’s developer data platform, MongoDB Atlas, to create its digital form builder, FormSG. Used by the Singapore government and public healthcare institutions, FormSG securely collects data from residents and businesses and helps public officers to create digital government forms in minutes. It eliminates the use of paper forms and the manual process of transcribing physical documents, which had raised concerns around data privacy and protection. During the pandemic, FormSG enabled public officers to collect more than 100,000 daily temperature declarations nationwide. Today, FormSG has served more than 120,000 public officers from 155 agencies and has created more than 500,000 digital forms to help the government collect data on travel and health declarations by visitors to the country, applications for COVID-19 swab tests, and applications for financial assistance.
Organization Transformation: Bendigo and Adelaide Bank
Bendigo and Adelaide Bank is one of Australia’s largest banks, with around 7,000 employees helping more than 2.2 million customers achieve their financial goals.
The bank has been on a multi-year journey of transformation using MongoDB's developer data platform to improve efficiency and deliver a better customer experience as it fulfills its vision to become Australia’s bank of choice. Recently, the cloud team launched Ready-Set-MongoDB (or RSM). This event-driven framework allows developers to streamline the consumption of internal or external APIs, and applies data transformations and storage automatically within a MongoDB collection of their choice. Using MongoDB Atlas Search, the bank also enabled developers to gain insights across its multi-cloud deployments, identifying cost savings and providing inventory information to account owners and technical stakeholders. Within the first 18 months of launching these programmes, the automation had saved the organization more than 1,100 developer days. It also helped reduce human involvement, removed stale data, and allowed engineers to focus on the things that matter. The development of Ready-Set-MongoDB is ongoing, and it continues to improve as new Bendigo multi-cloud challenges arise and new MongoDB products are released. The application is a perfect representation of how Bendigo's Technology Department is using modern technology, rapid development, and innovation-led problem solving to drive organizational transformation.
Heroes in Health: Redcliffe Lifetech Private Limited
Over the last few years, Redcliffe Labs has become India's fastest growing technology-driven diagnostics service provider. Redcliffe Labs is on a mission to serve 500 million Indians by 2030 with a fusion of technology and world-class laboratories. The company already serves thousands of people daily, with more than 73 labs and close to 1,500 walk-in centers across 180 cities. Redcliffe Labs has relied on MongoDB Atlas’ flexible document model to power its innovative Smart Health Report, a patient resource that provides a number of indicators and trackers to gauge holistic health.
The MongoDB developer data platform's best-in-class security, compliance, and privacy controls allow Redcliffe's team to confidently handle even the most sensitive patient data. MongoDB Atlas takes care of many of the traditional database management challenges, which means that developers can spend their time building diagnostics for patients, rather than managing databases. Redcliffe Labs is focusing on incorporating next-generation technologies in the diagnostics space with an AI platform that will deliver Interactive Diagnostics reports, Advanced Health Profiling, and more detailed Diagnostics and Health Alerts.
Industry Disruptor: Cathay Pacific
Cathay Pacific, Hong Kong’s home carrier serving more than 60 destinations worldwide, has been on an impressive journey to become one of the very first airlines to create a truly paperless flight deck. Until recently, a flight from Hong Kong to New York would require a crew to review more than 150 pages of finely printed text and charts before their flight and make ongoing updates throughout the trip. In 2019, Cathay Pacific conducted the first zero paper flight, removing 50kg of manuals, charts, maps, and flight briefing paperwork. They achieved this enormous feat with the help of one seamless and highly customized iPad application: Flight Folder. Built on MongoDB Atlas, Flight Folder is designed to improve the pilot briefing experience. MongoDB helped consolidate dozens of different information sources into one place, and made it possible for flight crews to easily share their experiences with others. It also includes a digital refueling feature that helps crews become much more efficient with fueling strategies, saving significant flight time and costs. The use of MongoDB Device Sync enables seamless syncing and no data loss even when the app goes on- and offline mid-flight. Since the Flight Folder launch, Cathay Pacific has completed more than 340,000 flights with full digital integration in the flight deck.
In addition to the greatly improved flight crew experience, flight times have been reduced, and digital refueling saves eight minutes of ground time on average. All these efficiencies have helped the company avoid the release of 15,000 tons of carbon.
From Batch to Real-Time: Adani Digital Labs
Adani Digital Labs is the India-based digital innovation arm of the larger Adani group. The lab team's mission is to create one single platform, a SuperApp called AdaniOne, to empower a billion stories in India. To address several use cases and the huge scale that will be required by the SuperApp, the Adani Digital team selected MongoDB Atlas as its main transactional database that will further enhance the application. A key component of the app is how it can bring together disparate data in order to provide a single view of activity across the application. In the initial process, developers extracted the data in batches and sent it to their database. However, this was too slow and unpredictable for the business requirements. Also, delays to the consolidated view of customer history, orders, inventory, and supply chain network updates were likely to impact their customers' ability to generate revenue. Therefore, in order to find a better solution, Adani Digital Labs built a more modern architecture in line with MongoDB. Using MongoDB's Change Streams and the data platform's native Kafka connector, they created an event-based architecture that pushes the data out in real time for analysis. Adani Digital Labs is still in the early phases of the SuperApp's rollout, and collaborating with MongoDB as its developer data platform continues to help the firm grow and deliver insights in real time.
Industry 4.0: Dongwha
Founded in 1948, the Dongwha Group has evolved from a singular focus on the wood and timber industry into a global leader across a number of sectors including building materials, chemicals, and media.
As part of its wider digital transformation strategy, Dongwha required smarter factories that would improve and optimize their production efficiency. Dongwha built an innovative Smart Factory Software platform that collects and analyzes data to enhance quality and production management capabilities. Originally, the platform was built with the community version of MongoDB. However, in order to scale and adapt, the team recently migrated to MongoDB Atlas in the cloud. This enabled them to store large volumes of data in the fastest and most secure way, optimize their solution for time series data, and make it easy to run machine learning across their data. Dongwha completed the migration seamlessly, without any disruption or downtime to their factories, and the platform has now been launched across five different sites. Over the last year, the application has significantly increased its availability and reliability while performance has improved by as much as 6x. As they look to the future, Dongwha plans to roll out the software to more of its international factories.
Digital Native: myBillBook
India is home to more than 60 million small and medium-sized businesses (SMBs), but only a small portion of those SMBs are taking advantage of digitization and many still operate using pen and paper. In addition, many businesses in India still struggle with fluctuations in internet services, outages, and latency. FloBiz is on a mission to change that with myBillBook, a one-stop solution that helps SMBs create professional invoices, manage stock, collect payments, automate reminders through smart banking, engage with their customers, manage staff attendance and payroll, and generate more than 25 business reports for accounting and decision making. The app is also mobile-first, so businesses can access it from their mobile devices, and it allows users to manage billing and inventory in both online and offline environments.
The myBillBook app is powered by MongoDB Atlas, providing the flexible and scalable foundation for the business to do everything from building new features to performing complex analytical queries. In addition, MongoDB Realm, the mobile database within the data platform, supports offline usage and syncing to ensure users never lose data or functionality due to a poor internet connection. Because of its success in supporting customers with business critical operations, more than 6.5 million business owners in India are now using myBillBook for their billing, accounting, collection, and business growth.
Customer Focused: KASIKORN Business-Technology Group
Established in 1945, Kasikornbank (KBank) is one of the largest and oldest banks in Thailand. Their mission is to strive towards service excellence and empower every customer’s life and business. One of KBank’s subsidiaries, KASIKORN Business-Technology Group (KBTG), developed a mobile banking application, MAKE by KBank. MongoDB Atlas’ flexibility and ease of development enabled MAKE’s development team to choose the best type of database for its tasks, to automate data tiering with Atlas Online Archive, and to reduce hours spent on operational maintenance. With more time to focus on delivering new innovations to customers, they created unique features like Cloud Pocket, which can allocate funds into unlimited customizable pockets for separate usage. They also built Pop Pay, a feature that allows users to easily search for nearby friends and transfer money by clicking their profile picture, as well as “Expense Summary,” a spending analysis service that helps inform and manage users’ financial habits. As of January 2023, MAKE has acquired more than 1 million users, and increased the number of transactions in MAKE from 900,000 to more than 7.5 million in a span of one year.
Massive Scale: China Mobile
China Mobile provides mobile voice and multimedia services via its nationwide mobile telecommunications network across mainland China and Hong Kong. It is the world's largest mobile network operator by total number of subscribers. The telecommunications leader is using MongoDB to support one of its largest and most critical push services, which sends out billing details to more than 1 billion users every month. Prior to MongoDB, the tech team relied on Oracle, but as the user numbers increased, performance degraded. Despite large investments, it was still taking too long to do basic requests like finalize and deliver bills to users. In 2019, after comprehensive testing, China Mobile migrated to MongoDB. By taking advantage of MongoDB's native sharding, they were able to improve performance by 80% and go from 50 Oracle machines to just 12 machines for the same workload. The service now handles all current requirements and is set up to scale with future growth. With the support of MongoDB, China Mobile is growing steadily, with more than 168 million monthly users, and has one of the highest customer satisfaction scores in the China Mobile group.