When it comes to building charts, we know that details matter. Small differences in layout, styling or composition can make a big difference in how well your chart communicates the story behind your data. That’s why we’ve just released a whole bunch of new capabilities in MongoDB Charts, giving you more control than ever. Here’s what’s new:
- Secondary Y Axis: Charts can be a great way to show correlation between two different datasets, but when their scales differ greatly it can be hard to see the correlation. By plotting one or more series on a secondary Y axis, you let each series make the most of the available space, highlighting any interesting relationships. Secondary Y Axis can be enabled on Grouped Column, Discrete Line, Continuous Line and Continuous Area charts.
- Legend Position: Chart legends can now be moved to the top, right or bottom of your chart, or hidden altogether.
- “All Others” Group: Charts has long allowed you to limit a chart to show, say, just the top 10 values. The new “All Others” option allows you to add an additional bar or donut segment that shows the value of all other categories not included in the limit.
- “Count by Value” aggregation: Building multi-series charts is now easier than ever, with the new “Count by Value” aggregation option. This automatically creates a series for each distinct value found in a field.
- String binning with Regular Expressions: Last month we introduced binning of string values, allowing you to choose the exact values to go into each bin. This month we’ve extended this further by allowing you to use Regular Expressions to assign values to bins based on powerful patterns (see the sketch after this list).
- Scatter Mark formatting: We’ve ramped up the customization options available on Scatter charts, allowing you to control the size, border thickness and opacity of each plotted mark.
- Line Dash Styles: A new option on Discrete and Continuous Line charts results in a different dash style for each series, making it easier to differentiate the series and improve the accessibility of your charts.
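To make the Regular Expression binning concrete, here’s a small Python sketch of the kind of pattern-to-bin assignment the feature performs in the chart builder. It’s a conceptual illustration only: the patterns, labels, and values are invented, and this is not Charts’ internal implementation.

    import re

    # Each bin pairs a regular expression with a label (both invented here).
    bins = [
        (re.compile(r"^2\d\d$"), "Success (2xx)"),
        (re.compile(r"^4\d\d$"), "Client error (4xx)"),
        (re.compile(r"^5\d\d$"), "Server error (5xx)"),
    ]

    def assign_bin(value: str) -> str:
        # Return the label of the first bin whose pattern matches the value.
        for pattern, label in bins:
            if pattern.match(value):
                return label
        return "All others"  # fall-through bucket, like the "All Others" group

    print(assign_bin("404"))  # Client error (4xx)
    print(assign_bin("200"))  # Success (2xx)
    print(assign_bin("abc"))  # All others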
Here’s one example of a chart that shows off the secondary Y axis, custom legend position and line dash styles:
And here’s another, showing the effect you can get by customizing your scatter chart’s mark style:
We hope you enjoy these new charting capabilities, but we’re not done yet! Over the next couple of months, we’ll be moving our focus to Table charts, adding options like conditional formatting, text wrapping and column pinning. If you have any other ideas for new customization features, please let us know using the MongoDB Feedback Engine. If you haven’t tried Charts yet, you can get started for free by signing up for MongoDB Atlas and deploying a free tier cluster.
Meet the MongoDB Sharding Team’s New Barcelona Division
I sat down with Kaloian Manassiev (Kal), Lead Engineer on MongoDB’s Barcelona-based Sharding team, to better understand what the team does and how it plans to grow. The Sharding team started in our New York City headquarters and expanded to Barcelona in the summer of 2019. Here, we explore who they are recruiting in the growing Spanish market and why someone would be excited to join their team.

Ashley Perez: First, can you give a quick overview of what the Sharding team does?

Kal Manassiev: The Sharding team builds frameworks and tools that abstract away difficult distributed systems problems for database users. This frees developers to focus on working with the data itself, without having to worry about where it resides, whether there is some network problem, or whether a data center catches fire. As a result, the projects delivered by the Sharding team are highly visible and are predominantly flagship features of each major MongoDB release.

AP: Let’s dive in a little more. What projects has your team taken on?

KM: In the past, we’ve delivered projects such as Distributed Transactions and Retryable Writes. Retryable Writes, for example, makes it much easier to implement scenarios where, if your browser crashes when you click the Pay button, you will not be charged twice when you try again. Just recently, we completed a project to assign vector or scalar clocks to all the distributed objects we manage, so that our system is easier to reason about and can be proven correct via theoretical proof models and correctness checkers such as TLA+. This project also makes it easier to add more distributed systems features and be confident in their correctness.
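To make Retryable Writes a bit more concrete for readers, here’s a minimal PyMongo sketch. The connection string, database, and collection names are placeholders invented for this example; the retryWrites option itself is standard MongoDB driver behavior.

    from pymongo import MongoClient

    # retryWrites lets the driver safely retry a write once after a transient
    # network error or failover; the server recognizes the retry, so the write
    # is applied exactly once. The URI below is a placeholder.
    client = MongoClient(
        "mongodb+srv://user:password@cluster0.example.mongodb.net",
        retryWrites=True,
    )

    payments = client["shop"]["payments"]

    # Even if this insert is interrupted mid-flight and retried, the customer
    # is charged only once.
    payments.insert_one({"order_id": 1234, "amount": 99.90, "status": "paid"})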
AP: Very interesting! What are some projects on the horizon?

KM: The biggest upcoming effort is to make sharding even more transparent (invisible) to developers so they can focus on working with data. Behind the scenes, we will analyze their workload patterns and apply balancing techniques to relocate data in order to squeeze the maximum performance out of the hardware and offer the best possible throughput and latency to users. There is a myriad of technical challenges we will need to solve. For example: how to decide the best placement for workloads that might change dynamically, how to ensure consistency while we are reshuffling data in the background, and how to minimize the impact on customers’ workloads so they are not aware of what is going on behind the scenes.

AP: I’m sure our customers will be excited when you roll this out. Now, let’s talk a little bit about you. Why did you join MongoDB?

KM: Before joining MongoDB, I worked in Seattle on Microsoft SQL Server and at AWS, where I was thrown in the deep end, working on a service running on thousands of nodes across the globe. One day, while I was on vacation, a recruiter from MongoDB randomly reached out to me. After learning about the document model and how MongoDB is essentially taking the best things from the good old relational databases and making them more scalable and available, I was convinced that this was “the future.” So, I made the jump and moved to New York City. I have been at MongoDB for more than seven years because I still believe the direction we are going is the future. In addition, I love the company culture of giving engineers responsibility for input into their teams’ roadmaps and entrusting them with features of critical importance to the business.

AP: You went from Seattle to New York City, and now Spain. Can you tell me more about your move and how that sparked a new Sharding team in Barcelona?

KM: After living in the United States for roughly 15 years, I decided to move to Europe. It had always been my dream to live in Barcelona because of the Mediterranean climate and lifestyle, which pair very well with a good education and technology environment. For example, Universitat Politècnica de Catalunya is a well-known school here that hosts the Barcelona Supercomputing Centre and the MareNostrum supercomputer. They conduct research very closely related to what my team does, and a good portion of my team spent time there at some point in their careers. Because I was very familiar with the company culture and could mentor this team on our technological base, company values, and processes, MongoDB gave me the opportunity to build a small team in Barcelona to see how things would work out. Initially, we started with just two people. After the first eight months (which included the COVID-19 lockdown), it was obvious that the team was very strong and that there is very good talent in Barcelona. Therefore, we decided to scale up, and now we have eight people.

AP: I hear you’re planning to hire a few more to the Sharding team in Barcelona. What are the career opportunities for your team?

KM: That’s correct. Our team is growing. Since sharding is at the forefront of the company’s products, there are many interesting projects to choose from that solve difficult distributed systems problems. With respect to career growth in general, it’s not much different from our North American teams; our career growth guidelines are universal. Currently, there are two career paths: individual contributor (IC) and manager. On the Barcelona Sharding team, we have career growth opportunities mostly on the IC path. However, we have discovered that it is best to promote leads from within the team, because they have already established rapport with the team members and can work well with them. So while we lean on the IC path more as we grow, there are lead opportunities too.

AP: How do you mentor individual contributors so they can move up on the team?

KM: It’s a cliché, but the best way to build skills for new engineers is to “throw them in the deep end” and let them figure out how to swim. When people join, we generally let them ease into the team’s processes for a few weeks and train them on how to use MongoDB as a customer. Then they spend the next month or so fixing small bugs, investigating failures, and so on. After that, they typically join an ongoing project and, little by little, become responsible for some aspect of it. Mentorship comes as a byproduct of working together with engineers who have been on the team a long time, and consists of providing feedback and explaining the internals of the system and why things historically work the way they do. I also encourage people to read papers, see what other products are doing, and so forth.

AP: What’s your proudest moment leading this team?

KM: Realizing, about five to six months after the first two engineers started and after we hired our third engineer, that we had become a proper team and not just a little group of people working out of Europe. Our team members were participating in discussions with the bigger team in New York, defending their ideas and proposing new ones. I believe this helped MongoDB see the value in our team and why we’re able to continue adding hires in Barcelona.
Interested in pursuing a career at MongoDB? We have several open roles on our teams across the globe, and would love for you to build your career with us!
How to Get Started with MongoDB Atlas and Confluent Cloud
Every year, more and more applications are leveraging the public cloud and reaping the benefits of elastic scale and rapid provisioning. Forward-thinking companies such as MongoDB and Confluent have embraced this trend, building cloud-based solutions such as MongoDB Atlas and Confluent Cloud that work across all three major cloud providers.

Companies across many industries have been leveraging Confluent and MongoDB to drive their businesses forward for years. From insurance providers gaining a customer-360 view for a personalized experience to global retail chains optimizing logistics with a real-time supply chain application, the connected technologies have made it easier to build applications with event-driven data requirements. The latest iteration of this technology partnership simplifies getting started with a cloud-first approach, ultimately improving developers’ productivity when building modern cloud-based applications with data in motion.

Today, the MongoDB Atlas source and sink connectors are generally available within Confluent Cloud. With Confluent’s cloud-native service for Apache Kafka® and these fully managed connectors, setting up your MongoDB Atlas integration is simple. There is no need to install Kafka Connect or the MongoDB Connector for Apache Kafka, or to worry about scaling your deployment. All the infrastructure provisioning and management is taken care of for you, enabling you to focus on what brings you the most value: developing and releasing your applications rapidly.

Let’s walk through a simple example of taking data from a MongoDB cluster in Virginia and writing it into a MongoDB cluster in Ireland. We will use a Python application to write fictitious data into our source cluster.

Step 1: Set Up Confluent Cloud

First, if you’ve not done so already, sign up for a free trial of Confluent Cloud. You can then use the Quick Start for Apache Kafka using Confluent Cloud tutorial to create a new Kafka cluster. Once the cluster is created, you need to enable egress IPs and copy the list of IP addresses. This list of IPs will be used as an IP Allow list in MongoDB Atlas. To locate this list, select “Cluster Settings” and then the “Networking” tab. Keep this tab open for future reference: you will need to copy these IP addresses into the Atlas cluster in Step 2.

Step 2: Set Up the Source MongoDB Atlas Cluster

For a detailed guide on creating your own MongoDB Atlas cluster, see the Getting Started with Atlas tutorial. For the purposes of this article, we have created an M10 MongoDB Atlas cluster using the AWS cloud in the us-east-1 (Virginia) data center to be used as the source, and an M10 MongoDB Atlas cluster using the AWS cloud in the eu-west-1 (Ireland) data center to be used as the sink.

Once your clusters are created, you will need to configure two settings in order to make a connection: database access and network access.

Network Access

You have two options for allowing secure network access from Confluent Cloud to MongoDB Atlas: you can use AWS PrivateLink, or you can secure the connection by allowing only specific IP connections from Confluent Cloud to your Atlas cluster. In this article, we cover securing via IPs. For information on setting up using PrivateLink, read the article Using the Fully Managed MongoDB Atlas Connector in a Secure Environment.

To accept external connections in MongoDB Atlas via specific IP addresses, launch the “IP Access List” entry dialog under the Network Access menu. Here you add all the IP addresses that were listed in Confluent Cloud from Step 1. Once all the egress IPs from Confluent Cloud are added, you can configure the user account that will be used to connect from Confluent Cloud to MongoDB Atlas. Configure user authentication in the Database Access menu.

Database Access

You can authenticate to MongoDB Atlas using username/password, certificates, or AWS identity and access management (IAM) authentication methods. To create a username and password that will be used for the connection from Confluent Cloud, select the “+ Add New Database User” option from the Database Access menu. Provide a username and password and make a note of this credential, because you will need it in Step 3 and Step 4 when you configure the MongoDB Atlas source and sink connectors in Confluent Cloud.

Note: In this article we are creating one credential and using it for both the MongoDB Atlas source and MongoDB Atlas sink connectors, because both of the clusters used in this article belong to the same Atlas project.

Now that the Atlas clusters are created, the Confluent Cloud egress IPs are added to the MongoDB Atlas allow list, and the database access credentials are defined, you are ready to configure the MongoDB Atlas source and MongoDB Atlas sink connectors in Confluent Cloud.

Step 3: Configure the Atlas Source

Now that you have two clusters up and running, you can configure the MongoDB Atlas connectors in Confluent Cloud. To do this, select “Connectors” from the menu, and type “MongoDB Atlas” in the Filters textbox.

Note: When configuring the MongoDB Atlas source and MongoDB Atlas sink, you will need the connection hostname of your Atlas clusters. You can obtain this hostname from the MongoDB connection string. An easy way to do this is by clicking the “Connect” button for your cluster. This launches the Connect dialog, and you can choose any of the Connect options. For purposes of illustration, if you click “Connect using MongoDB Compass,” you will see the following:

The highlighted part in the above figure is the connection hostname you will use when configuring the source and sink connectors in Confluent Cloud.

Configuring the MongoDB Atlas Source Connector

Selecting “MongoDbAtlasSource” from the list of Confluent Cloud connectors presents you with several configuration options. The “Kafka Cluster credentials” choice is an API-based authentication that the connector will use for authentication with the Kafka broker. You can generate a new API key and secret by using the hyperlink. Recall that the connection host is obtained from the MongoDB connection string; details on how to find it are described at the beginning of this section.

The “Copy existing data” choice tells the connector, upon initial startup, to copy all the existing data in the source collection into the desired topic. Any changes to the data that occur during the copy process are applied once the copy is completed.

By default, messages from the MongoDB source are sent to the Kafka topic as strings. The connector supports outputting messages in formats such as JSON and AVRO. Recall that the MongoDB source connector reads change stream data as events, and change stream event metadata is wrapped in the message sent to the Kafka topic. If you want just the message contents, you can set the “Publish full document only” output option to true.
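For context, here is a rough sketch of what those change stream events look like if you watch the source collection yourself with PyMongo. The event layout follows MongoDB’s change stream documentation; the connection string and document values below are placeholders invented for this example.

    from pymongo import MongoClient

    # Placeholder URI; point this at the source Atlas cluster.
    client = MongoClient("mongodb+srv://user:password@source.example.mongodb.net")

    # Watching Stocks.StockData surfaces the same change events the
    # source connector reads.
    with client["Stocks"]["StockData"].watch() as stream:
        for event in stream:
            print(event)
            # An insert event looks roughly like this (values invented):
            # {
            #     "_id": {"_data": "8262..."},  # resume token
            #     "operationType": "insert",
            #     "ns": {"db": "Stocks", "coll": "StockData"},
            #     "documentKey": {"_id": ObjectId("...")},
            #     "fullDocument": {"company_symbol": "DEMO", "price": 92.15, ...}
            # }
            # With "Publish full document only" enabled, only the
            # "fullDocument" portion is published to the Kafka topic.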
Note: For source connectors, the number of tasks will always be “1”; otherwise you run the risk of duplicate data being written to the topic, because multiple workers would effectively be reading from the same change stream. To scale the source, you could create multiple source connectors and define a pipeline that looks at only a portion of the collection; however, this capability for defining a pipeline is not yet available in Confluent Cloud.

Step 4: Generate Test Data

At this point, you can run your Python data generator application and start inserting data into the Stocks.StockData collection at your source. This will cause the connector to automatically create the topic “demo.Stocks.StockData.” To use the generator, git-clone the stockgenmongo folder in the above-referenced repository and launch the data generation as follows:

    python stockgen.py -c "< >"

where the MongoDB connection URL is the full connection string obtained from the Atlas source cluster. An example connection string is as follows:

    mongodb+srv://kafkauser:email@example.com

Note: You might need to pip-install pymongo and dnspython first.
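If you would rather not clone the repository, the sketch below gives a rough idea of what such a generator can look like. It is an illustrative stand-in, not the actual stockgen.py: the Stocks.StockData namespace comes from this article, while the field names and the random-walk pricing are assumptions made for the example.

    import random
    import sys
    import time

    from pymongo import MongoClient


    def main(connection_uri: str) -> None:
        # Connect to the Atlas source cluster (URI passed on the command line).
        client = MongoClient(connection_uri)
        collection = client["Stocks"]["StockData"]

        price = 100.0
        while True:
            # Random-walk the price and insert one fictitious tick per second.
            price = max(1.0, price + random.uniform(-1.0, 1.0))
            collection.insert_one({
                "company_symbol": "DEMO",
                "price": round(price, 2),
                "tx_time": time.strftime("%Y-%m-%dT%H:%M:%S"),
            })
            time.sleep(1)


    if __name__ == "__main__":
        main(sys.argv[1])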
If you do not wish to use this data generator, you will need to create the Kafka topic before configuring the MongoDB Atlas sink. You can do this by using the Add a Topic dialog in the Topics tab of the Confluent Cloud administration portal.

Step 5: Configure the MongoDB Atlas Sink

Selecting “MongoDB Atlas Sink” from the list of Confluent Cloud connectors presents you with several configuration options. After you pick the topic to source data from, you will be presented with additional configuration options. Because you chose to write your data in the source by using JSON, you need to select “JSON” as the input message format. The Kafka API key is an API key and secret used for connector authentication with Confluent Cloud. Recall that you obtain the connection host from the MongoDB connection string; details on how to find it are described at the beginning of Step 3.

The “Connection details” section allows you to define behavior such as creating a new document for every topic message or updating an existing document based upon a value in the message. These behaviors are known as document ID and write model strategies. For more information, check out the MongoDB Connector for Apache Kafka sink documentation. If the order of the data in the sink collection is not important, you can spin up multiple tasks to increase write performance.

Step 6: Verify Your Data Arrived at the Sink

You can verify that the data has arrived at the sink via the Atlas web interface: navigate to the collection data via the Collections button. Now that your data is in Atlas, you can leverage many of the Atlas platform capabilities, such as Atlas Search, Atlas Online Archive for easy data movement to low-cost storage, and MongoDB Charts for point-and-click data visualization. Here is a chart created in about one minute using the data generated in the sink cluster.

Summary

Apache Kafka and MongoDB help power many strategic business use cases, such as modernizing legacy monolithic systems, single views, batch processing, and event-driven architectures, to name a few. Today, Confluent Cloud and MongoDB Atlas provide fully managed solutions that enable you to focus on the business problem you are trying to solve rather than spinning your wheels on infrastructure configuration and maintenance. Register for our joint webinar to learn more!