
MongoDB’s First Test Drive now on AWS

MongoDB is excited to announce the first Test Drive on Amazon Web Services (AWS), “MongoDB Replica Set.” MongoDB has partnered with AWS and EPAM to create this lab for users, free of charge. The Test Drive is an introduction to MongoDB and teaches users how to quickly launch and interact with a MongoDB replica set. The Test Drive platform was developed by AWS to enable customers to rapidly deploy and evaluate enterprise solutions created by partners in the AWS Partner Network (APN).

Overview of the Test Drive

Through the MongoDB Replica Set Test Drive, customers will learn how to do the following:

- Create a MongoDB replica set
- Set up the appropriate parameters for optimal performance
- Check the performance of the cluster
- Deploy a replica set as part of a disaster recovery plan

To initiate the lab, click the “Create and Download” button for unique keys, then click “Launch”. The Test Drive includes a step-by-step video that makes it easy to follow along, as well as links to introductory MongoDB documentation. The lab includes complimentary AWS server time, and you can return to it at any time!

What's Next for MongoDB Test Drives

MongoDB plans to make additional Test Drives available with other partners experienced with MongoDB, including independent software vendors (ISVs), system integrators (SIs) and value-added resellers (VARs). If you are a partner interested in working with us on Test Drives, please contact us. Customers can email EPAM directly to learn more about their solutions and services for integrating MongoDB at your organization.

Additional Resources

- MongoDB Partner Program
- MongoDB on AWS Marketplace
- Documentation: MongoDB on AWS EC2
- EPAM Systems, Inc.
- Cloud AWS Test Drive Program
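As a rough illustration of the lab's first step, here is a minimal sketch of the configuration document you would pass to rs.initiate in the mongo shell. The host names are hypothetical placeholders, not the Test Drive's own, and this is not the lab's actual setup:

```javascript
// A minimal, hypothetical sketch of a three-member replica set
// configuration of the kind the Test Drive walks through.
// Host names below are placeholders, not the lab's real instances.
var config = {
  _id: "rs0", // replica set name; matches the --replSet option given to mongod
  members: [
    { _id: 0, host: "node0.example.com:27017" },
    { _id: 1, host: "node1.example.com:27017" },
    { _id: 2, host: "node2.example.com:27017" }
  ]
};

// In the mongo shell, connected to one of the members, you would then run:
//   rs.initiate(config);
//   rs.status();  // confirm that a primary has been elected
console.log(config.members.length + " members configured");
```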

January 22, 2014

10gen at DrupalCon Munich

DrupalCon Munich came together last week with over 1800 Drupal fans from around the world. The 10gen booth had stickers, t-shirts, and — of course — MongoDB mugs aplenty to share with attendees. From 10gen's perspective, we are interested in how Drupal and MongoDB work together, and what we can do to help make their integration better.

Derick Rethans gave a Birds of a Feather session on Wednesday on Practical MongoDB with Drupal, co-hosted by Drupal expert Károly Négyesi (known to the community as chx). Together, chx and Derick were able to give advice on the MongoDB module for Drupal, as well as the EntityFieldQuery module.

Wednesday evening had Derick doing more work directly with the MongoDB community, when he gave a talk for approximately 30 people at the München MongoDB User Group on “Indexing and Query Optimisation” in MongoDB, covering what indexing is, the different types of indexes, and how to work with them. It was well received, and feedback on the talk was complimentary. You can see the slides from the talk here:

Thursday afternoon was the highlight of DrupalCon for us. Anyone who visited the 10gen stand over the previous two days, or who had come along to the previous day's BoF, had heard about Derick's Introduction to MongoDB talk, which gave attendees a firm grounding in getting started with MongoDB. The slides from this talk can be seen here:

DrupalCon 2013 will be held in Prague, and we plan to have a MongoDB presence at the conference again. With the improvements heralded by Drupal 8, along with increased awareness in the community of how MongoDB can be used with Drupal, we expect next year to be an even bigger success.

10gen returns to Munich this October for a full-day conference dedicated to MongoDB. MongoDB Munich comes to the city on October 16 — tickets are available here.
Tagged with: Drupal, Munich, DrupalCon, DrupalCon Munich, MongoDB Munich, Károly Négyesi, Derick Rethans, München, MongoDB, Mongo, NoSQL, Polyglot persistence, 10gen

August 30, 2012

Forward Intel uses Mongo for “causal” analytics

This was originally posted to the Forward Intel blog.

Forward Intelligence Systems has chosen MongoDB to serve as the backend datastore to support DataPoint, a new analytics system designed to reveal deeper meaning behind typical business intelligence data. Endless tables, graphs and charts are replaced with simple, decision-aiding “plain English analytics” to help make the job of the business analyst and company owner quicker and easier.

DataPoint is designed to run on top of existing analytics systems like Google Analytics, Piwik, Open Web Analytics and HubSpot. Using the hosted solution, users simply connect their analytics account to their DataPoint profile and the application does the rest. DataPoint will import the data from multiple sources and identify trends and patterns, taking the guesswork out of why a web site may have seen a decrease in traffic in the month of July. Using Bayesian math, DataPoint determines the causal relationship between an event and its most likely cause.

MongoDB is a powerful semi-structured database engine built to withstand the traffic that today's web applications endure. It is fast, light-weight and extremely scalable, making it a clear and convincing choice for large-scale business intelligence and analytics systems. Mongo stores data as key/value pairs within entities known as “documents”, which are queried using a simple, straightforward syntax similar to that of Structured Query Language (SQL). Mongo is schema-less, which means database developers are not confined to the typical column-and-row structure of relational databases. Dynamic data structures are essential for managing big data applications. Further - and critical to its power and flexibility - Mongo supports MapReduce, an engine that allows for rapid processing of large amounts of data.
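The “Bayesian math” mentioned above can be sketched in a few lines of code. This is an illustrative sketch only, not DataPoint's implementation; all of the probabilities below are made-up example values:

```javascript
// Illustrative-only sketch of Bayes' rule as an analytics system might
// apply it: given an observed event (a traffic drop), score a candidate
// cause. The numbers are invented for the example.
function posterior(pEventGivenCause, pCause, pEvent) {
  // P(cause | event) = P(event | cause) * P(cause) / P(event)
  return (pEventGivenCause * pCause) / pEvent;
}

// Example: how likely is "seasonal dip" to explain a July traffic drop?
var pDropGivenSeason = 0.6; // drops are common in July when traffic is seasonal
var pSeason = 0.5;          // prior: half of sites show seasonality
var pDrop = 0.4;            // overall chance of seeing a drop in any month

var score = posterior(pDropGivenSeason, pSeason, pDrop);
console.log(score.toFixed(2)); // 0.6 * 0.5 / 0.4 = 0.75
```

A system like the one described would compute such a score for each candidate cause and report the highest-scoring one in plain English.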
Implementing algorithms designed to chug through incredibly large volumes of data simply would not be feasible without Mongo's batch processing support. “At Forward Intel, we're incredibly excited to start using MongoDB,” said the company's CEO, Steve Adcock. “Mongo's recent award of over $40 million to further its development ensures that it is here to stay, and we are confident that Mongo will serve us quite well.”

Tagged with: analytics, production uses, mongodb good, MongoDB, Mongo, NoSQL, Polyglot persistence, 10gen

August 28, 2012

MongoDB Masters in the Spotlight: Flavio Percoco Premoli

10gen has a number of core contributors: MongoDB User Group organizers, evangelists, and contributors to the core server, connecting libraries and support forum. Last year, 10gen launched the MongoDB Masters program to encourage the exchange of knowledge and expertise amongst MongoDB community evangelists and open source contributors. To introduce you to these core contributors, we're launching the MongoDB Masters in the Spotlight series on our blog.

Flavio Percoco Premoli works in the Research and Development department at The Net Planet Europe and has been an avid MongoDB community contributor for over three years. His host of contributions includes PyMongo, the Django MongoDB Engine (co-author and maintainer), the MongoDB plugin for Eclipse, Half-Static (a distributed, GridFS-based blog engine) and the Python virtual machine for MongoDB. He lives in Milan, Italy and is a frequent speaker at MongoDB and European technology conferences.

What was it like getting started with MongoDB?

It was a great experience. It was ~3 years ago when I first looked at MongoDB, and I was also just starting to dig into NoSQL technologies. It was easy to set up, fast and impressive, even if the project was still very young.

What advice do you have for other MongoDB users?

Try to change the way you think about data and the well-known data model paradigms. Models were created to “model” data of a given structure, but models can be re-modeled too. Do not try to change the way MongoDB data management works, and forget about db-managed joins :) Oh, btw, give GridFS a try. You've no idea how useful and powerful it is, I just love it!

What has been your greatest accomplishment?

I think that one of my biggest accomplishments so far has been making my way in this world, and mostly in my professional life. I love what I do, and every little goal I've reached is as important as the other ones. That's why making my way and keeping to the right path is the most important / difficult one.
What is your daily inspiration?

“Make sure you do what you're passionate about and smile while you're doing it; you're born to be happy.”

What do you do in your spare time?

I code most of the time. I'm always reading, studying and coding on new projects, trying to find new things to do and to contribute to. If I'm not coding, I'm sure you'll find me hanging around with my family and friends.

What has been your greatest accomplishment with MongoDB?

Every time I get started with MongoDB on a project is an accomplishment because, even for a young project, it has everything I need for that particular project I'm going into. I've done many things with MongoDB (private and public), and each one of them has been an amazing experience. I can't say much about the private projects, but I can say that I managed to handle TBs of data; more important than that, all of this required hundreds of operations per second. I stared at the process monitor, amazed at what MongoDB is capable of. I must say that it was running on really powerful hardware, but that makes things even better ;)

How has MongoDB helped you the most?

In my case, it was helpful when choosing the right “schema” to use in our system. Its schema-less capabilities allowed me to build more flexible, reusable and richer data structures. GridFS has been really helpful too; it allowed me to share big content between nodes with a single operation, without replicating the information or sacrificing its consistency.

Tagged with: mongodb masters, community, contributors, MongoDB, Mongo, NoSQL, Polyglot persistence, 10gen

August 21, 2012

MongoDB at DrupalCon Munich 2012

Fans of MongoDB and Drupal have the chance to flock together at DrupalCon in Munich this week. Running from August 20 - 24, the official conference of the Drupal community features “Birds of a Feather” sessions alongside formally scheduled presentations. Included among these is a session organised by 10gen on “Practical MongoDB and Drupal”.

Birds of a Feather workshops are informal and openly scheduled workshops, giving like-minded individuals a chance to talk about a common problem and discuss topical issues. “Practical MongoDB and Drupal” will look at Drupal 7’s pluggable class architecture, and how we can easily swap Drupal’s underlying data storage for MongoDB's faster performance for reads and writes. Furthermore, Microsoft will be joining the session and have 50 pre-allocated Azure passes to share with attendees.

DrupalCon will also feature 10gen's own Derick Rethans presenting an “Introduction to MongoDB”. The session will introduce how to get the most out of MongoDB and explain how MongoDB offers viable alternatives to the standard, normalized, relational model. Derick's expertise and experience are well known in both the NoSQL and PHP communities. He will explain how to get the most out of MongoDB, why the technology fits well with Drupal, and not only how to set it up, but also how to get going with it.

Places are still available for DrupalCon here. “Practical MongoDB and Drupal” will be held in the Chamonix room at 11.45 on Wednesday, Aug 22. Space is very limited, so arrive promptly! Derick Rethans gives his Introduction to MongoDB on Thursday, Aug 23 at 13.00.

Tagged with: MongoDB, Mongo, NoSQL, Polyglot persistence, 10gen

August 20, 2012

Pig as Hadoop Connector, Part One: Pig, MongoDB and Node.js

This post was originally published on the Hortonworks blog.

Series Introduction

Apache Pig is a dataflow-oriented scripting interface to Hadoop. Pig enables you to manipulate data as tuples in simple pipelines without thinking about the complexities of MapReduce. But Pig is more than that. Pig has emerged as the 'duct tape' of Big Data, enabling you to send data between distributed systems in a few lines of code. In this series, we're going to show you how to use Hadoop and Pig to connect different distributed systems, to enable you to process data from wherever and to wherever you like.

Working code for this post, as well as setup instructions for the tools we use, is available at and you can download the Enron emails we use in the example in Avro format here. You can run our example Pig scripts in local mode (without Hadoop) with the -x local flag: pig -x local. This enables new Hadoop users to try out Pig without a Hadoop cluster.

Introduction

In this post we'll be using Hadoop, Pig, mongo-hadoop, MongoDB and Node.js to turn Avro records into a web service. We do so to illustrate Pig's ability to act as glue between distributed systems, and to show how easy it is to publish data from Hadoop to the web.

Pig and Avro

Pig's Avro support is solid in Pig 0.10.0. To use AvroStorage, we need only load piggybank.jar and the jars for avro and json-simple. A shortcut to AvroStorage is convenient as well. Note that all paths are relative to your Pig install path.
We load Avro support into Pig like so:

```pig
/* Load Avro jars and define shortcut */
register /me/pig/build/ivy/lib/Pig/avro-1.5.3.jar
register /me/pig/build/ivy/lib/Pig/json-simple-1.1.jar
register /me/pig/contrib/piggybank/java/piggybank.jar
define AvroStorage; /* Shortcut */
```

MongoDB's Java Driver

To connect to MongoDB, we'll need the MongoDB Java Driver. You can download it here: . We'll load this jar in our Pig script.

Mongo-Hadoop

The mongo-hadoop project provides integration between MongoDB and Hadoop. You can download the latest version at . Once you download and unzip the project, you'll need to build it with sbt:

```shell
./sbt package
```

This will produce the following jars:

```shell
$ find . | grep jar
./core/target/mongo-hadoop-core-1.1.0-SNAPSHOT.jar
./pig/target/mongo-hadoop-pig-1.1.0-SNAPSHOT.jar
./target/mongo-hadoop-1.1.0-SNAPSHOT.jar
```

We load these MongoDB libraries in Pig like so:

```pig
/* MongoDB libraries and configuration */
register /me/mongo-hadoop/mongo-2.7.3.jar /* MongoDB Java Driver */
register /me/mongo-hadoop/core/target/mongo-hadoop-core-1.1.0-SNAPSHOT.jar
register /me/mongo-hadoop/pig/target/mongo-hadoop-pig-1.1.0-SNAPSHOT.jar

/* Set speculative execution off so we don't have the chance of duplicate records in Mongo */
set false
set mapred.reduce.tasks.speculative.execution false

define MongoStorage com.mongodb.hadoop.pig.MongoStorage(); /* Shortcut */
set default_parallel 5 /* By default, lets have 5 reducers */
```

Writing to MongoDB

Loading Avro data and storing records to MongoDB are one-liners in Pig.
```pig
avros = load 'enron.avro' using AvroStorage();
store avros into 'mongodb://localhost/enron.emails' using MongoStorage();
```

From Avro to Mongo in One Line

I've automated loading Avros and storing them to MongoDB in the script at , using Pig's parameter substitution:

```pig
avros = load '$avros' using AvroStorage();
store avros into '$mongourl' using MongoStorage();
```

We can then call our script like this, and it will load our Avros to Mongo:

```shell
pig -l /tmp -x local -v -w -param avros=enron.avro \
    -param mongourl='mongodb://localhost/enron.emails' avro_to_mongo.pig
```

We can verify our data is in MongoDB like so:

```shell
$ mongo enron
MongoDB shell version: 2.0.2
connecting to: enron
> show collections
emails
system.indexes
> db.emails.findOne({message_id: "%3C3607504.1075843446517.JavaMail.evans@thyme%3E"})
{
  "_id" : ObjectId("502b4ae703643a6a49c8d180"),
  "message_id" : "",
  "date" : "2001-04-25T12:35:00.000Z",
  "from" : { "address" : "", "name" : "Jeff Dasovich" },
  "subject" : null,
  "body" : "Breathitt's hanging tough, siding w/Hebert, standing for markets. Jeff",
  "tos" : [ { "address" : "", "name" : null } ],
  "ccs" : [ ],
  "bccs" : [ ]
}
```

To the Web with Node.js

We've come this far, so we may as well publish our data on the web via a simple web service. Let's use Node.js to fetch a record from MongoDB by message ID, and then return it as JSON. To do this, we'll use Node's mongodb package. Installation instructions are available in our github project. Our node application is simple enough. We listen for an http request on port 1337, and use the messageId parameter to query an email by message id.
```javascript
// Dependencies
var mongodb = require("mongodb"),
    http = require('http'),
    url = require('url');

// Set up Mongo
var Db = mongodb.Db,
    Server = mongodb.Server;

// Connect to the MongoDB 'enron' database and its 'emails' collection
var db = new Db("enron", new Server("localhost", 27017, {})); n_db) { db = n_db; });
var collection = db.collection("emails");

// Set up a simple API server returning JSON
http.createServer(function (req, res) {
  var inUrl = url.parse(req.url, true);
  var messageId = inUrl.query.messageId;

  // Given a message ID, find one record that matches in MongoDB
  collection.findOne({message_id: messageId}, function(err, item) {
    // Return 404 on error
    if (err) {
      console.log("Error: " + err);
      res.writeHead(404);
      res.end();
      return;
    }
    // Return 200/json on success
    if (item) {
      res.writeHead(200, {'Content-Type': 'application/json'});
      res.end(JSON.stringify(item));
    }
  });
}).listen(1337, 'localhost');
console.log('Server running at http://localhost:1337/');
```

Navigating to http://localhost:1337/?messageId=%3C3607504.1075843446517.JavaMail.evans@thyme%3E returns an Enron email as JSON.

We'll leave the CSS as an exercise for your web developer, or you might try Bootstrap if you don't have one.

Conclusion

The Hadoop filesystem serves as a dumping ground for aggregating events. Apache Pig is a scripting interface to Hadoop MapReduce. We can manipulate and mine data on Hadoop, and when we're ready to publish it to an application, we use mongo-hadoop to store our records in MongoDB. From there, creating a web service is a few lines of JavaScript with Node.js - or your favorite web framework. MongoDB is a popular NoSQL database for web applications. Using Hadoop and Pig, we can aggregate and process logs at scale and publish new data-driven features back to MongoDB - or whatever our favorite database is.

Note: we should ensure that there is sufficient I/O between our Hadoop cluster and our MongoDB cluster, lest we overload Mongo with writes from Hadoop. Be careful out there!
I have, however, verified that writing from an Elastic MapReduce Hadoop cluster to a replicated MongoHQ cluster (on Amazon EC2) works well.

About the Author

Russell Jurney is a data scientist and the author of the book Agile Data (O'Reilly, Dec 2012), which teaches a flexible toolset and methodology for building effective analytics applications using Apache Hadoop and cloud computing.

About Hortonworks

Hortonworks is a leading commercial vendor of Apache Hadoop, the preeminent open source platform for storing, managing and analyzing big data. Our distribution, Hortonworks Data Platform powered by Apache Hadoop, provides an open and stable foundation for enterprises and a growing ecosystem to build and deploy big data solutions. Hortonworks is the trusted source for information on Hadoop, and together with the Apache community, Hortonworks is making Hadoop more robust and easier to install, manage and use. Hortonworks provides unmatched technical support, training and certification programs for enterprises, systems integrators, and technology vendors. For more information, visit .

Tagged with: MongoDB, Mongo, NoSQL, Polyglot persistence, 10gen

August 17, 2012