February 7, 2013 by MongoDB
This is a guest post by Yash Badiani, Practice Head - Big Data, CIGNEX Datamatics.
Record keeping and document archiving are such common practices within enterprises that their importance often goes unrecognized. An efficient archivist was someone who preserved records so systematically that archives filed decades ago could be retrieved in a matter of minutes. But once computers took on a central role in enterprise operations, the volume of data to be managed grew beyond what an archivist could handle. The digital data explosion has paved the way for content archival applications that can seamlessly manage operational data.
As a first step towards data archival solutions, enterprises used turnkey applications to archive emails, legal documents, invoices, and other important documents, leveraging the raw disk capacity and the security footprint of those applications. But the world of data archiving was set for another leap, as the data to be archived grew not only in VOLUME but also in VARIETY: enterprises recognized that their data archiving policies weren't limited to storing emails and invoices but also included log management, enterprise videos, audio, images on the web, social media feeds, audit trails, data from online transactions, and so on.
In addition, the immense metadata associated with all this content wasn't fully leveraged, which made content retrieval difficult. Suddenly, the resident enterprise application was under scrutiny, much like our pedantic data archivist, because the incoming data exceeded what it was designed to handle. But the challenge was not just limited to massive volume and variety:
The advent of Big Data has given us a new way to address these challenges. Big Data technologies can store large volumes of structured and unstructured data, arriving at high rates, all at low cost, which makes them the most suitable candidates for a data archival solution.
Here are the key requirements of a data archival solution:
Going through the wish list above, it doesn't take long to recognize that MongoDB passes the litmus test. Given below is one design we proposed for leveraging MongoDB as a scalable back end for an enterprise-class data archival solution:
Beyond these design features, MongoDB offers numerous advantages for designing applications and integrating them with front-end technologies, thanks to its rich driver support. Its replicated setup allows us to keep systems up to date with no downtime. The application can be deployed in the cloud as SaaS, and it allows analytics on stored objects.
Among other things, a MongoDB-based data archival solution offers the following benefits:
A data archival solution leveraging MongoDB would offer tremendous value for various enterprise use cases. For example, consider the Media and Publishing market. A news website might produce a huge amount of content each day, including news articles, feeds for readers, related videos and audio content, images, logs, user comments and chat transcripts. Not only would such an organization produce such varied content, but it would also need to archive the content for long-term retention and future reference. In addition, archival of articles is becoming standard procedure for compliance, auditability, and litigation support purposes.
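To make the news-website example a little more concrete, here is a minimal sketch of how such varied content might be shaped as documents that share one metadata envelope. The function name `build_archive_record` and every field name in it are hypothetical, chosen only for illustration, and are not part of the design described in this post. The PyMongo calls appear only in comments, since they assume a running MongoDB instance and a collection of your choosing.

```python
from datetime import datetime, timezone


def build_archive_record(content_type, title, payload, tags, source):
    """Wrap any kind of content in a common, searchable metadata envelope.

    All field names here are hypothetical, chosen only for illustration.
    """
    return {
        "content_type": content_type,      # e.g. "article", "comment", "video"
        "title": title,
        "payload": payload,                # shape varies by content type
        "metadata": {
            "tags": sorted(set(tags)),     # de-duplicated for consistent lookup
            "source": source,
            "archived_at": datetime.now(timezone.utc),
        },
    }


article = build_archive_record(
    "article", "Election results", {"text": "Full article text"},
    ["politics", "election", "politics"], "newsroom",
)
comment = build_archive_record(
    "comment", "Re: Election results",
    {"user": "reader42", "text": "Great coverage"},
    ["election"], "website",
)

# With the PyMongo driver, each dict could be stored as-is, for example:
#     collection.insert_one(article)
# and later retrieved by metadata, for example:
#     collection.find({"metadata.tags": "election"})
```

Because documents in a MongoDB collection need not share a schema, the `payload` can differ from one content type to the next while the metadata stays uniform, which is what keeps retrieval simple.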
By designing a data archival solution leveraging MongoDB, the data archivist not only gains business agility but also benefits from a broader scope for analysis and less dependence on IT for organizing her files.
The data archival space has come a long way. With an enterprise data archival solution leveraging MongoDB, we can be assured that the challenges around VOLUME, VARIETY & VELOCITY of data can be handled in an agile and elegant way.
CIGNEX Datamatics Inc. (a subsidiary of Datamatics Global Services Ltd.) is the global leader in Commercial Open Source Enterprise solutions and a global partner of 10gen (MongoDB), offering advisory consulting, implementation, and support services around MongoDB. Since 2000, CIGNEX Datamatics has implemented over 400 Open Source enterprise solutions addressing enterprise requirements ranging from Portals and Content to Big Data solutions.
For more details, contact: Yash Badiani at yash dot badiani at cignex dot com.