On June 1-2, over 2,000 developers, sysadmins, and DBAs will converge in New York City for MongoDB World, our annual user conference. It’s your chance to get inspired, share ideas, and get the latest insights on using MongoDB.
If you’re active on social media, you may have a chance to win a free conference pass. Simply share your enthusiasm for MongoDB World on Twitter during the month of March.
## How to Enter
- Follow @mongodb or @mongodbinc on Twitter
- Tell us why you’re excited about MongoDB World in a tweet that mentions @mongodb or @mongodbinc and uses the #MongoDBWorld hashtag. Get creative! Start conversations, use humor, share photos, or tell us which speakers you look forward to hearing.
Click to tweet: I can't wait for #MongoDBWorld w/ @MongoDB because...
## Prizes
- 1st Prize: Ticket to MongoDB World 2015
- 2nd Prize: Fully loaded MongoDB World swag pack
- 3rd Prize: MongoDB T-shirt
## How Winners Will Be Selected
MongoDB will pick the winners by April 3rd and will notify them via Twitter direct message. Winners will be chosen based on a combination of how widely their content is shared and the creativity they use to express their excitement about MongoDB World.
Call for Feedback: The New PHP and HHVM Drivers
In the beginning Kristina created the MongoDB PHP driver. Now the PECL mongo extension was new and untested, write operations tended to be fire-and-forget, and Boolean parameters made more sense than $options arrays. And Kristina said, "Let there be MongoCollection," and there was basic functionality.
Since the PHP driver first appeared on the scene, MongoDB has gone through many changes. Replica sets and sharding arrived early on, but things like the aggregation framework and command cursors were little more than a twinkle in Eliot's eye at the time. The early drivers were designed with many assumptions in mind: write operations and commands were very different; the largest replica set would have no more than a dozen nodes; cursors were only returned by basic queries. In 2015, we know that these assumptions no longer hold true.
Beyond MongoDB's features, our ecosystem has also changed. When the PHP driver, a C extension, was first implemented, there wasn't yet a C driver that we could utilize. Therefore, the 1.x PHP driver contains its own BSON and connection management C libraries. HHVM, an alternative PHP runtime with its own C++ extension API, also did not exist years ago, nor was PHP 7.0 on the horizon. Lastly, methods of packaging and distributing libraries have changed. Composer has superseded PEAR as the de facto standard for PHP libraries, and support for extensions (currently handled by PECL) is forthcoming.
During the spring of 2014, we worked with a team of students from Facebook's Open Academy program to prototype an HHVM driver modeled after the 1.x API. The purpose of that project was twofold: research HHVM's extension API and determine the feasibility of building a driver atop libmongoc (our then-new C driver) and libbson. Although the final result was not feature complete, the project was a valuable learning experience.
The C driver proved quite up to the task, and HNI, which allows an HHVM extension to be written with a combination of PHP and C++, highlighted critical areas of the driver for which we'd want to use C.
This all leads up to the question of how best to support PHP 5.x, HHVM, and PHP 7.0 with our next-generation driver. Maintaining three disparate, monolithic extensions is not sustainable. We also cannot eschew the extension layer for a pure PHP library, like mongofill, without sacrificing performance. Thankfully, we can compromise! Here is a look at the architecture for our next-generation PHP driver:
At the top of this stack sits a pure PHP library, which we will distribute as a Composer package. This library will provide an API similar to what users have come to expect from the 1.x driver (e.g. CRUD methods, database and collection objects, command helpers), and we expect it to be a common dependency for most applications built with MongoDB. This library will also implement common specifications, in the interest of improving API consistency across all of the drivers maintained by MongoDB (and hopefully some community drivers, too).
Sitting below that library we have the lower-level drivers (one per platform). These extensions will effectively form the glue between the PHP and HHVM runtimes and our system libraries (libmongoc and libbson). These extensions will expose an identical public API for the most essential and performance-sensitive functionality:
- Connection management
- BSON encoding and decoding
- Object document serialization (to support ODM libraries)
- Executing commands and write operations
- Handling queries and cursors
By decoupling the driver internals and a high-level API into extensions and PHP libraries, respectively, we hope to reduce our maintenance burden and allow for faster iteration on new features. As a welcome side effect, this also makes it easier for anyone to contribute to the driver.
Additionally, an identical public API for these extensions will make it that much easier to port an application across PHP runtimes, whether the application uses the low-level driver directly or a higher-level PHP library.
GridFS is a great example of why we chose this direction. Although we implemented GridFS in C for our 1.x driver, it is actually quite a high-level specification. Its API is just an abstraction for accessing two collections: files (i.e. metadata) and chunks (i.e. blocks of data). Likewise, all of the syntactic sugar found in the 1.x driver, such as processing uploaded files or exposing GridFS files as PHP streams, can be implemented in pure PHP. Provided we have performant methods for reading from and writing to GridFS' collections – and thanks to our low-level extensions, we will – shifting this API to PHP is a win-win.
Earlier I mentioned that we expect the PHP library to be a common dependency for most applications, but not all. Some users may prefer to stick to the no-frills API offered by the extensions, or create their own high-level abstraction (akin to Doctrine MongoDB for the 1.x driver), and that's great! Hannes has talked about creating a PHP library geared for MongoDB administration, which provides an API for various user management and ops commands. I'm looking forward to building the next major version of Doctrine MongoDB ODM directly atop the extensions.
While we will continue to maintain and support the 1.x driver and its users for the foreseeable future, we invite everyone to check out our next-generation driver and consider it for any new projects going forward.
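To make the GridFS point above concrete, here is a minimal sketch of how a high-level file API can be layered over nothing more than insert and find operations on two collections. It is written in Python purely for illustration and is not the API of any MongoDB driver: `InMemoryCollection` and `SimpleGridFS` are hypothetical names, and the in-memory list stands in for whatever the low-level extension would provide. Only the document field names (`length`, `chunkSize`, `files_id`, `n`, `data`) follow the GridFS convention.

```python
from itertools import count
from typing import List


class InMemoryCollection:
    """Hypothetical stand-in for the low-level layer: insert and find only."""

    def __init__(self) -> None:
        self._docs: List[dict] = []

    def insert(self, doc: dict) -> None:
        self._docs.append(dict(doc))

    def find(self, query: dict) -> List[dict]:
        # Exact-match filter on every key in the query.
        return [d for d in self._docs
                if all(d.get(k) == v for k, v in query.items())]


class SimpleGridFS:
    """High-level helper built only from the two collections it is given."""

    def __init__(self, files: InMemoryCollection, chunks: InMemoryCollection,
                 chunk_size: int = 4) -> None:
        self.files = files      # one metadata document per stored file
        self.chunks = chunks    # one document per block of data
        self.chunk_size = chunk_size
        self._ids = count(1)    # simple integer ids, an assumption of this sketch

    def put(self, filename: str, data: bytes) -> int:
        file_id = next(self._ids)
        self.files.insert({"_id": file_id, "filename": filename,
                           "length": len(data), "chunkSize": self.chunk_size})
        # Split the payload into fixed-size blocks, numbered from zero.
        for n, start in enumerate(range(0, len(data), self.chunk_size)):
            self.chunks.insert({"files_id": file_id, "n": n,
                                "data": data[start:start + self.chunk_size]})
        return file_id

    def get(self, file_id: int) -> bytes:
        if not self.files.find({"_id": file_id}):
            raise KeyError(file_id)
        # Reassemble the blocks in chunk-number order.
        parts = sorted(self.chunks.find({"files_id": file_id}),
                       key=lambda c: c["n"])
        return b"".join(c["data"] for c in parts)


fs = SimpleGridFS(InMemoryCollection(), InMemoryCollection())
file_id = fs.put("hello.txt", b"hello world")
print(fs.get(file_id))
```

Nothing in `SimpleGridFS` touches storage directly, which is the property that lets this kind of helper live in a library while the performance-sensitive reads and writes stay in the layer below.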
You can find all of the essential components across GitHub and JIRA:

| Project | GitHub | JIRA |
| --- | --- | --- |
| PHP Library | mongodb/mongo-php-library | PHPLIB |
| PHP 5.x Driver (phongo) | mongodb/mongo-php-driver | PHPC |
| HHVM Driver (hippo) | mongodb/mongo-hhvm-driver | HHVM |

The existing PHP project in JIRA will remain open for reporting bugs against the 1.x driver, but we would ask that you use the new projects above for anything pertaining to our next-generation drivers.
If you're interested in hearing more about our upcoming PHP and HHVM drivers, Derick Rethans is presenting a new talk entitled "One Extension, Two Engines" at php[tek] 2015 in May.
About the Author - Jeremy
Jeremy Mikola is a software engineer at MongoDB's NYC office. As a member of the driver and evangelism team, he helps develop the PHP driver and contributes to various OSS projects, such as Doctrine ODM and React PHP. Jeremy lives in Hoboken, NJ and is known to enjoy a good sandwich.
Engineering, Done DIRT Cheap: How an Outdated Data Architecture Becomes a Tax on Innovation
In March 2021, I wrote about The Innovation Tax: the idea that clunky processes and outdated technologies make it harder for engineering teams to produce excellent tech that delights customers. In the months since then, my thinking has evolved even further. I couldn’t have guessed how many technology leaders would immediately recognize these problems in their own organizations and share their own deep frustrations with me. This article puts that evolved thought together with the massive feedback that piece received. It will give you actionable ways to decrease your tax burden -- and who wouldn’t want that?
The innovation tax, like income tax, is real. Of course, it saps morale (with resulting attrition and churn), but it also has other financial and opportunity costs. Taxed organizations see their pace of innovation suffer as people and resources are locked into maintaining rather than innovating.
We named this tax DIRT. Why? Well, it’s rooted in data (D), because it so often springs from the difficulty of using legacy databases to support modern applications that require access to real-time data to create rich user experiences. It affects innovation (I), because your teams have little time to innovate if they’re constantly trying to figure out how to support a complex and rickety architecture. It’s recurring (R), because it’s not as if you pay the tax (T) once and get it over with. Quite the opposite. DIRT makes each new project ever more difficult because it introduces so many components, frameworks, and protocols that need to be managed by different teams of people.
In retrospect, it’s clear that technology leaders would recognize this tax and immediately grasp the degree to which it’s caused -- or cured -- by their data architecture. Data is sticky, strategic, heavy, intricate -- and the core of the modern digital company. Modern applications have much more sophisticated data requirements than the applications we were building only 10 years ago.
Obviously, there is more data, but it’s more complicated than that: Companies are expected to react more quickly and more cleverly to all of the signals in that data. Legacy technologies, including rigid, single-model, inefficient, and hard-to-program relational databases, just don’t cut it. In over 300 CxO conversations I've had since joining MongoDB in 2020, fewer than a handful of CTOs disputed this statement.
When your tech stack can’t handle the demands of new applications, engineering teams will often bolt on single-purpose niche databases to do the job (think time series, text, graph, etc.). Then they’ll build a series of pipelines to move data back and forth. And everything will get slow and complicated -- and even political. Time to polish up that LinkedIn profile.
If this were rare, it wouldn’t be such a big deal. But large enterprises can have hundreds or thousands of applications, each with its own sources of data and its own pipelines. Over time, as data stores and pipelines multiply, an organization’s data architecture starts to look like a plate of spaghetti. Soon you’re operating and maintaining an entire middleware layer of ETL, ELT, and streaming. The variety of technologies, each with its own frameworks, protocols, and sometimes languages, makes it harder for developers to collaborate. It makes it extremely difficult to scale, because every architecture is bespoke and brittle. Developers spend their precious “flow” hours doing integration work instead of building new applications and features that the business needs and customers will love. Enterprise architects often end up spending their time on all the wrong things.
It’s clear to me that most customers are ready for a new approach to data architecture.
One of the best parts of my job is listening to and learning from other CxOs.
Since the pandemic made it impossible to do that in person, MongoDB moved these discussions online, inviting technology leaders to hash out some of their biggest problems 1:1 and in groups with me. In one of those sessions, a CTO commented, “Technical debt should be carried on your CFO's balance sheet.” Even on Zoom, the power of that statement was clear.
We also started looking at slide decks about data architecture from some of the best-known venture capital firms. Certainly VCs must position each of their portfolio companies as a critical player in the data architecture of the future. But the overall vision was not compelling. One technology leader said, “When I look at 20 net-new technologies I need to learn, it’s terrifying.” Others commented that just looking at these architecture diagrams was a little off-putting, because they knew their own organization’s data architecture was at least that complicated already. They knew they needed to simplify their data architecture, but more than one admitted to postponing this work -- indefinitely -- because it was just too daunting.
I recently met with a major health care company whose executives think it’s just barely possible, but they are bravely diving in anyway, knowing that they must do it and that they’ll learn along the way as they tear down their monoliths.
In many cases, the innovation tax manifests as the inability to even consider new technology because the underlying architecture is too complex and difficult to maintain, much less understand and transform. This is why a lot of senior people at enterprise companies are sitting with their fingers in the transformation dike, waiting for retirement -- they think they can’t modernize.
It won’t surprise you that we also saw how MongoDB, as a general purpose database able to handle all types of data at speed and scale, could help solve this problem. Let me be clear.
I’ve been working on or with databases for my entire 35-year career, and I joined MongoDB for a reason: I believe we can build the database and application-building environment that I’ve wanted to create and use for at least 30 of those years.
Our vision of MongoDB goes beyond our namesake database to a broader, more versatile application data platform that allows you to accelerate and simplify how you build any type of application. It represents significant progress toward our larger goal, which remains the same as ever: to make data stunningly easy to work with. We want to see data become an enabler of innovation, not a blocker. And we want to finally allow technology teams to start to untangle their sprawl and get rid of their DIRT.
Where to start? It’s good to have a better understanding of just how DIRT might be holding your teams back. Do your developers have trouble collaborating because the development environment is so fragmented? Do schema changes take longer to roll out than the application changes they’re designed to support? Do you have trouble building 360-degree views of your customers? And if so, why? These are all good places to start digging in the DIRT.
You might also take a hard look at your applications and data sources, as well as what it would take to move your data onto an application data platform. That could mean identifying the objects in your applications and all the applications that interact with them. You could then assign a complexity score to each object based on characteristics such as its properties, methods, collections, and attributes. Now take a step back and identify each application that connects to each of those objects and rank it based on how mission-critical it is, how many people rely on it, how many tasks it has to perform, and the complexity of those tasks.
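The scoring exercise described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a MongoDB tool or a prescribed method: the `DataObject` and `Application` types, the unweighted sum used as a complexity score, the 1-to-5 scales, and the tie-breaking order are all hypothetical choices you would replace with your own metrics.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class DataObject:
    name: str
    properties: int
    methods: int
    collections: int
    attributes: int

    def complexity(self) -> int:
        # Unweighted sum (an assumption); a real assessment would likely weight these.
        return self.properties + self.methods + self.collections + self.attributes


@dataclass(frozen=True)
class Application:
    name: str
    criticality: int         # hypothetical scale: 1 = nice to have, 5 = mission-critical
    users: int               # how many people rely on it
    tasks: int               # how many tasks it performs
    task_complexity: int     # hypothetical scale: 1 = simple, 5 = very complex
    objects: Tuple[str, ...] # names of the data objects it touches


def rank_applications(apps: List[Application]) -> List[Application]:
    """Most mission-critical first; ties broken by users, tasks, task complexity."""
    return sorted(
        apps,
        key=lambda a: (a.criticality, a.users, a.tasks, a.task_complexity),
        reverse=True,
    )


def migration_order(objects: List[DataObject],
                    apps: List[Application]) -> List[DataObject]:
    """Least complex first; ties broken by how many applications touch the object."""
    def integration(obj: DataObject) -> int:
        return sum(obj.name in app.objects for app in apps)

    return sorted(objects, key=lambda o: (o.complexity(), integration(o)))


# Made-up inventory, purely for illustration.
audit_log = DataObject("audit_log", properties=3, methods=1, collections=0, attributes=0)
customer = DataObject("customer", properties=12, methods=6, collections=3, attributes=2)
billing = Application("billing", 5, 400, 12, 4, ("customer",))
crm = Application("crm", 3, 150, 30, 3, ("customer", "audit_log"))

for obj in migration_order([customer, audit_log], [billing, crm]):
    print(obj.name, obj.complexity())
```

Even a crude ordering like this gives teams a shared, explicit list to argue about, which is usually more useful than the exact numbers.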
Once you have a better handle on all this complexity, you’ll be better positioned to create a plan to move off your legacy systems, perhaps starting with the least complex and least integrated data sources. Of course, your metrics and your mileage will vary, but the point is to start.
I don’t pretend any of this is easy. Like many of you, I’ve spent most of my career working on problems just like these. But that also means I know progress when I see it, and I see the beginning of a way for organizations to start to clean up their DIRT. I’ll continue to write about these challenges and, I hope, add some perspective.
If you’re curious to learn more about DIRT, you can download our white paper. As always, I’m eager to have you tweet your alignment, lack thereof, or other thoughts at @MarkLovesTech. You can also reach out to me on marklovestech.com, where you will find a compilation of my latest musings related to MongoDB and otherwise.