We hope you enjoyed part 1 of our Back to Basics series where we introduced you to NoSQL and MongoDB.
In part 2, we will get hands-on with code and build a blogging application. We will start in the Mongo Shell to show you how to interact directly with MongoDB on the command line, then move to an IDE to show you how to build a complete web application with the MongoDB Python driver.
Once your application is built, we will show you how to add indexes to improve the performance of your queries. As in most databases, indexes can dramatically improve query performance in MongoDB. MongoDB also includes a query analyzer, exposed through the explain tool, that gives detailed insight into how a query executes.
At the end of this webinar you will know how to:
- Run MongoDB
- Create a basic database and set of collections using the MongoDB Shell
- Create a basic database and collection using one of our language drivers
- Add an index to a collection to improve query performance
- Review the efficiency of your queries using the explain framework
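As a preview of those last two steps, here is a minimal mongo shell sketch (the `posts` collection and its fields are hypothetical examples, and a running MongoDB deployment is assumed):

```javascript
// Insert a few sample documents into a hypothetical "posts" collection
db.posts.insertMany([
    { title: "Hello MongoDB", author: "alice", tags: ["intro"] },
    { title: "Indexing 101", author: "bob", tags: ["performance"] }
]);

// Create an index on the "author" field to speed up queries that filter on it
db.posts.createIndex({ author: 1 });

// Ask the query planner how the query executes; an "IXSCAN" stage in the
// winning plan confirms the index is being used
db.posts.find({ author: "alice" }).explain("executionStats");
```

Without the index, the same explain output would show a "COLLSCAN" stage, meaning every document in the collection was examined.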
Register now for this webinar and all the remaining webinars in the series.
MongoDB Connector for Apache Spark: Announcing Early Access Program & New Spark Course
**Update: August 4th, 2016** Since this original post, the connector has been declared generally available for production usage. Click through for a tutorial on using the new MongoDB Connector for Apache Spark.

We live in a world of “big data”. But it isn’t only the data itself that is valuable – it’s the insight it can generate. How quickly an organization can unlock and act on that insight has become a major source of competitive advantage. Collecting data in operational systems and then relying on nightly batch ETL (Extract Transform Load) processes to update the Enterprise Data Warehouse (EDW) is no longer sufficient. Speed-to-insight is critical, and so analytics against live operational data to drive real-time action is fast becoming a necessity, enabled by a new generation of technologies like MongoDB and Apache Spark.

The new native MongoDB Connector for Apache Spark provides higher performance, greater ease of use, and access to more advanced Spark functionality than any connector available today. The new MongoDB University course for Apache Spark provides a fast-track introduction for developers and data scientists building new generations of operational applications that incorporate sophisticated real-time analytics.

The Rise of Apache Spark

Apache Spark is one of the fastest-growing big data projects in the history of the Apache Software Foundation. With its memory-oriented architecture, flexible processing libraries, and ease of use, Spark has emerged as a leading distributed computing framework for real-time analytics.

As a general-purpose framework, Spark is used for many types of data processing – it comes packaged with support for machine learning, interactive queries (SQL), statistical queries with R, graph processing, ETL, and streaming. Spark allows programmers to develop complex, multi-step data pipelines using a directed acyclic graph (DAG) pattern.
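As a rough illustration of that pipeline style, here is a minimal PySpark sketch (a local Spark installation is assumed, and the data and names are purely illustrative):

```python
from pyspark import SparkContext

sc = SparkContext("local[*]", "pipeline-sketch")

# Each transformation below adds a node to Spark's DAG; nothing actually
# executes until an action (collect) is called at the end.
lines = sc.parallelize(["spark builds a dag", "a dag of transformations"])
counts = (lines.flatMap(lambda line: line.split())
               .map(lambda word: (word, 1))
               .reduceByKey(lambda a, b: a + b))

counts.cache()       # keep the result in memory so later jobs can reuse it
print(counts.collect())
```

The `cache()` call is what enables the in-memory data sharing described above: subsequent jobs built on `counts` read the materialized partitions rather than recomputing the whole DAG.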
It supports in-memory data sharing across DAGs, so that different jobs can work with the same data. Additionally, Spark supports a variety of popular programming languages including Scala, Java, and Python. Sign up for the new Spark course at MongoDB University.

For loading and storing data, Spark integrates with a number of storage and messaging platforms including Amazon S3, Kafka, HDFS, machine logs, relational databases, NoSQL datastores, MongoDB, and more.

MongoDB and Spark Today

While MongoDB natively offers rich real-time analytics capabilities, there are use cases where integrating the Spark engine can extend the processing of operational data managed by MongoDB. This allows users to operationalize results generated from Spark within real-time business processes supported by MongoDB. Examples of users already using MongoDB and Spark to build modern, data-driven applications include:

- A multinational banking group operating in 31 countries with 51 million clients has implemented a unified real-time monitoring application with Apache Spark and MongoDB. The platform enables the bank to improve customer experience by continuously monitoring client activity across its online channels to check service response times and identify potential issues.
- A global manufacturing company estimates warranty returns by analyzing material samples from production lines. The collected data enables them to build predictive failure models using Spark machine learning and MongoDB.
- A video sharing website is using Spark with MongoDB to place relevant advertisements in front of users as they browse, view, and share videos.
- A global airline has consolidated customer data scattered across more than 100 systems into a single view stored in MongoDB. Spark processes are run against the live operational data in MongoDB to update customer classifications and personalize offers in real time, as the customer is live on the web or speaking with the call center.
- Artificial intelligence personal assistant company x.ai uses MongoDB and Spark for distributed machine learning problems.

There are a number of ways users integrate MongoDB with Spark. For example, the MongoDB Connector for Hadoop provides a plug-in for Spark. There are also multiple 3rd-party connectors available. Today we are announcing early access to a new native Spark connector for MongoDB.

Introducing the MongoDB Connector for Apache Spark

The new MongoDB Connector for Apache Spark provides higher performance, greater ease of use, and access to more advanced Spark functionality than the MongoDB Connector for Hadoop. The following table compares the capabilities of both connectors.

| Capability | MongoDB Connector for Spark | MongoDB Connector for Hadoop with Spark Plug-In |
| --- | --- | --- |
| Written in Scala, Spark’s native language | Yes | No, Java |
| Support for Scala, Java, Python & R APIs | Yes | Yes |
| Support for the Spark interactive shell | Yes | Yes |
| Support for native Spark RDDs | Yes | No, Java RDDs; more verbose and complex to work with |
| Support for Spark DataFrames and Datasets | Yes | DataFrames only; schema must be manually inferred |
| Automated MongoDB schema inference | Yes | No |
| Support for Spark core | Yes | Yes |
| Support for Spark SQL | Yes | Yes |
| Support for Spark Streaming | Yes | Yes |
| Support for Spark Machine Learning | Yes | Yes |
| Support for Spark GraphX | Yes | No |
| Data locality awareness | Yes, the Spark connector is aware which MongoDB partitions are storing data | No |
| Support for MongoDB secondary indexes to filter input data | Yes | Yes |
| Support for MongoDB aggregation pipeline to filter input data | Yes | No |
| Compatibility with MongoDB replica sets and sharded clusters | Yes | Yes |
| Support for MongoDB 2.6 and higher | Yes | Yes |
| Support for Spark 1.6 and above | Yes | Yes |
| Supported for production usage | Not currently; available for early access evaluation | Yes |

Written in Spark’s native language, the new connector provides a more natural development experience for Spark users, as they are quickly able to apply their Scala expertise.
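To give a feel for the developer experience, here is a sketch of loading MongoDB data with the connector from pyspark. The URI, database, collection, and field names are placeholders, the option keys follow the connector documentation as we understand it, and a running Spark and MongoDB deployment is assumed:

```python
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("mongo-connector-sketch")
         # Placeholder connection string: database "test", collection "customers"
         .config("spark.mongodb.input.uri", "mongodb://localhost/test.customers")
         .getOrCreate())

# Push an aggregation $match stage down to MongoDB so only the matching
# documents are extracted, instead of the whole collection.
df = (spark.read
      .format("com.mongodb.spark.sql.DefaultSource")
      .option("pipeline", '[{"$match": {"region": "EMEA"}}]')
      .load())

df.printSchema()  # the DataFrame schema is inferred automatically from the documents
```

The same read without the `pipeline` option would pull every document into Spark before filtering, which is exactly the overhead the connector's pushdown support avoids.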
The connector provides access to the Spark interactive shell for data exploration and rapid prototyping. The connector exposes all of Spark’s libraries, enabling MongoDB data to be materialized as DataFrames and Datasets for analysis with SQL (benefiting from automatic schema inference), streaming, machine learning, and graph APIs.

The Spark connector can take advantage of MongoDB’s aggregation pipeline and rich secondary indexes to extract, filter, and process only the range of data it needs – for example, analyzing all customers located in a specific geography. This is very different from simpler NoSQL datastores that do not offer either secondary indexes or in-database aggregations. In these cases, Spark would need to extract all data based on a simple primary key, even if only a subset of that data is required for the Spark process. This means more processing overhead, more hardware, and longer time-to-insight for the analyst.

To maximize performance across large, distributed data sets, the Spark connector is aware of data locality in a MongoDB cluster. RDDs are automatically co-located with the associated MongoDB shard to minimize data movement across the cluster. The nearest read preference can be used to route Spark queries to the closest physical node in a MongoDB replica set, thus reducing latency.

Review the MongoDB Connector for Spark documentation to learn how to get started with the connector, and view code snippets for different APIs and libraries.

Fast Track to Apache Spark: New MongoDB University Course

To get the most out of any technology, you need more than documentation and code. Over 350,000 students have registered for developer and operations courses from MongoDB University. Now developers and budding data scientists can get a quick-start introduction to Apache Spark and the MongoDB connector with early access to our new online course.
Getting Started with Spark and MongoDB provides an introduction to Spark and teaches students how to use the new connector to build data analytics applications. In this course, we provide an overview of the Spark Scala and Java APIs with plenty of sample code and demonstrations. Upon completing this course, students will be able to:

- Outline the roles of major components in the Spark framework
- Connect Spark to MongoDB
- Source data from MongoDB for processing in Spark
- Write data from Spark into MongoDB

The course does not assume prior knowledge of Spark, but it does require an intermediate level of expertise with MongoDB. The course is free. Sign up at MongoDB University.

Next Steps

To wrap up, we are very excited about the possibilities Spark and MongoDB present together, and we hope that with the new connector and course, you will be well on your way to building modern, data-driven applications. We would love to hear from you as you explore this new connector and put it through its paces – you can provide feedback and file bugs under the MongoDB Spark Jira project.

Here’s a summary of how to get started:

- Read the MongoDB Connector for Spark documentation and download the connector
- If you have any questions, please send them to the MongoDB user mailing list
- Sign up for the new Spark course at MongoDB University
How Thoughtful Illustration Is Setting MongoDB Apart: Meet Champa Lo
I sat down with Champa Lo, Technical Illustrator based in our New York headquarters, to learn more about her role as the first full-time illustrator at MongoDB. We talked about her passion for illustration, what she does, and how she’s shaping the future of design within the company.

Ashley Perez: Welcome to the team! Can you tell me about your role?

Champa Lo: Sure. I joined MongoDB right before COVID-19 hit. I came into the headquarters twice for an interview but ended up being one of the first new hires who had to start at home, on top of being the first person in a brand-new role.

Technical illustration is a first for MongoDB. The company has never had an illustrator on hand. Although we have talented designers who can illustrate within a design, that’s not their main focus: the overall design is. The difference with my role is that I work specifically on illustration. I also work to define the illustration style and help create a style guide.

The most important aspect of my job is building good relationships with people throughout the company. I need to understand their goals and what they’re looking for so I can tell a purely visual story.

AP: How did you get into illustration?

CL: I guess you can say I fell into it (at least the illustration part). I always knew early on that I wanted to be a graphic designer. I was a mentee for a graphic designer in high school and absolutely fell in love with the profession. I even have a cute clipping from my senior-year high school paper where I talk about my dreams of being a designer.

Interview excerpt from Champa's senior-year high school newspaper

After high school, I studied graphic design at the University of Colorado Denver. When I was in the design program, I always found ways to incorporate fun illustrations into my projects. A year after I graduated, I moved to New York City because there were more design jobs there, and I landed a job that allowed me to put my illustrating skills to good use.
My first job was working with an incredible Creative Director at a small startup who built an amazing brand using illustrations to convey the company’s goals and messages. This was a part-time job: for four hours a day, I would concentrate on illustrating bespoke email banners for marketing prompts the team created that morning. With her guidance, I saw my illustration skills grow. It showed me the possibility of being a full-time illustrator. Here’s an example of a design I did while I was there:

Email banner Champa created for ThinkEco during her first job as illustrator

I love to illustrate (especially this type of illustration) because I’m a designer by trade, and the core of designing is to problem-solve. Illustration is no different. As a Technical Illustrator, I simplify and visualize complicated theories and concepts. Also, it’s fun! If I’m not having fun while illustrating, I’m very unmotivated. My creativity relies on avoiding boredom.

I’m always working to improve my artistic skills. I’m a lover of learning, so I subscribe to tutorial sites such as Skillshare; follow artists on YouTube who share tutorials; and subscribe to a monthly art box that sends paints, brushes, pens, and so forth so I can try new mediums.

Champa's illustration for a Google Local Guides social media post

AP: How do you make your illustrations purposeful, engaging, and memorable?

CL: Having thoughtful conversations about the subject matter is how you get good designs and illustrations. If you don’t understand the subject to the best of your ability, how can you be successful at visualizing it? In school, I was taught to always research your subject matter and not design blindly. Putting in the extra work makes a huge difference. That’s also why 1:1 meetings are so important. It’s a time for me to learn, and it’s also a creative process for the stakeholders, because they find creative ways to help me understand.
GIF Champa created for a MongoDB University page

We want to understand the goal. For example, should the illustration be futuristic or nostalgic? Recently, we had a conversation about cars and how we wanted to present them for a project. We decided to design the cars as compact or electric to show MongoDB as forward thinking and environmentally conscious, because those are the kinds of people we want to hire and work with.

Or take COVID-19, for instance. The pandemic has changed the way people illustrate office environments. No longer do you have teams sitting in conference rooms. Instead, you have people working at home. So, I had to think of things to illustrate such as a sofa, home desk, and desk lamp. Maybe even a dog or a child. We thought about how we could incorporate this into the Zoom interface. Before, we didn’t have to think about it. Now, Zoom can be a way to add some personality to everyone’s digital space as we work remotely. That’s what I’m here for: to have those conversations and get deeper behind the meaning of everything we create.

AP: Let’s talk a little more about your role at MongoDB. What projects do you work on?

CL: I’m part of the Visual Design Team, which supports the whole company. It’s fun to meet and talk to many different people at MongoDB. It gives us a lot of diversity in the projects we work on. Along with illustrations, I also work on diagrams and small animations. Projects include campaigns, web illustrations, and events.

Because I’ve joined the team, we’re able to have fuller discussions about illustration. Our designers work in a fast-paced world, but my process is slower because I make more bespoke illustrations and have to talk to people to understand the technicalities so we can go beyond generic illustrations. I have to be more thoughtful of what we’re presenting to the audience.
Even though these conversations slow down how quickly the designers move, I'm striving to build stronger relationships on the team through this practice.

Top left: Champa’s illustration for MongoDB's new multi-cloud feature. Bottom right: An illustration for MongoDB's vendors page.

I have found that by showing and explaining my illustration process and inviting people into it, they seem to trust me more. For example, I always share my sketches with stakeholders before digitizing the work. My sketches aren’t perfect, but by showing them not-so-perfect work, we can build the relationship and align on ideas. My hope is that the sketches allow people to see I’m open for collaboration and conversation.

Example of a project working with MongoDB's Web Design team, from initial sketch through final illustration

AP: How does having these conversations help your design?

CL: Great question! Working with such a diversity of people and projects helps me gain an immense amount of knowledge and insight. Past conversations and concerns help inform my design decisions. I’m almost like a liaison for all these different departments, and it's nice to transfer the information so we’re all aligned. For example, I’ve been working closely with Product Marketing on diagrams, and soon I’ll be working on diagrams with a member of the Docs team, too. Each team has taken its own path for diagrams, but I would love to eventually create a holistic style that works for all teams beyond just these two.

I believe having a good process to follow leads to meaningful and engaging illustrations. However, it’s important to find balance. You can’t overengineer it, because that can easily turn unproductive and formulaic. I always want an open dialogue and strive to show there’s room to collaborate. The process we have created has been successful so far, but it’s not set in stone. Further along we can add another step, or we may find certain things aren’t needed.
AP: What’s your creative vision for MongoDB?

CL: My goal for illustration is that we are inclusive, diverse, and thoughtful. What I’ve seen here is a global company full of people who are very passionate and kind. As designers, we have the power to show who and what MongoDB is. For me, that’s showing off who we are. One of our company’s values is “Own What You Do.” I think it’s such an important one for designers, because we should always add our personal experiences and perspectives to our work and translate the rest of the company’s perspectives and experiences, too.

For the team, my goal is to continue streamlining a process so we’re transparent and support a collaborative spirit when it comes to working with us.

Champa’s illustration for the MongoDB Atlas onboarding experience

My goal is to create a unified vision between our two audiences: developer and enterprise customers. My hope is that the illustrations bring joy and delight, and that our audiences see MongoDB has a personality. A really effective illustration system is memorable, and our research is starting to show that our audiences are beginning to remember our visuals. This is a huge brand lift, creating a personal experience versus the cold one people may experience with other tech brands.

Interested in pursuing a career at MongoDB? We have several open roles on our teams across the globe, and we would love for you to build your career with us!