Clear: Enabling Seamless Tax Management for Millions of People with MongoDB Atlas



Building India's largest tax and financial services software platform trusted by more than six million Indians

With India’s large population and growing middle class, the country’s tax-paying population has been rising steadily. At the end of the financial year 2021-22, about 5.83 crore (58.3 million ) individuals filed tax returns with the Indian Income Tax department. In addition, India also has about 13.8 million registered Goods and Services Tax (GST) taxpayers.

When juxtaposed with growing digitization in India, this opens up massive demand for a convenient and effective platform to manage tax returns. Clear realized this need early on and launched as a SaaS offering for ITR filing to individuals in 2011 that is currently trusted by more than six million Indians. It is second only to the Indian IT Department’s portal in terms of registered users.

More recently, Clear has been focused on expanding its B2B portfolio, including launching an e-invoicing system. Today, the system supports about 50,000 tax professionals, one million small businesses, and 4000 enterprises in GST filing.

How to ensure a seamless experience for all users at scale

Clear built the initial version of its B2B e-invoicing system on MySQL. However, as adoption grew, the team started to see the limits of the systems tested.

Certain batches of invoices were taking upwards of 25 minutes to process, an issue for the time-sensitive nature of tax filing. If any Clear customer failed to file in time, that customer could be given a penalty and labeled as non-compliant by the Indian government. The team knew they needed to take a step back and reevaluate the core structure of their system.

The Clear team started the system rework by outlining a set of required capabilities. The new database system would need to be able to scale up quickly to handle periods of peak demand and down when traffic was low to save on costs. Tax professionals need to be able to see multiple cuts of the data at different levels, so the database would need to be able to support quick and complex aggregations. Lastly, the team knew that didn’t want to be accountable for the management of the system themselves. They needed a fully-managed option.

MongoDB Atlas chosen for best in class scale and performance

The company ran a proof of concept (POC) study comparing MySQL’s performance with other competitive offerings, including MongoDB. It found that, in terms of the time taken to execute different batch sizes of data, MongoDB was considerably faster in all instances. For example, MongoDB’s processing time was 122% faster than the closest competitor and 767% faster than the farthest competitor.

Comparison of performance among databases

Given the document-based nature of invoices, the results of the POC made sense. With MongoDB, the Clear team could store invoice data together instead of splitting it across tables. This minimized the number of costly joins required to obtain data, leading to faster reads. MongoDB also allowed the team to easily split reads and writes in use cases where the system experienced high volumes of reads and where reading slightly stale data was permissible.

Clear’s aggregation needs were also easily met with MongoDB’s aggregation pipeline. The combination of aggregation support and MongoDB’s full-text search capabilities meant that the Clear team could easily build filterable and searchable dashboards on top of their invoice data.

Lastly, the team also loved the easy-to-use nature of MongoDB Atlas, MongoDB’s fully-managed developer data platform. With Atlas, the team could easily scale up and down their clusters on a schedule to match fluctuations in user traffic.

Achieving a 2900% jump in processing speed along with cost savings

After Clear replatformed from MySQL to MongoDB Atlas on AWS, their customers were shocked by the improvement. Pranesh Vittal, Director Of Engineering, ClearTax India said, “We have achieved considerable optimization with MongoDB. Our customers are often surprised by the pace of execution. There is a significant improvement in the performance, with as much as a 2900% jump in processing speed in some instances.”

Comparing the performance of the new MongoDB powered platform

On top of increased speeds, the team is also saving money. “We’ve generated over 20 crore invoices to date running on a single sharded cluster with a 4TB disc,” said Pranesh Vittal. “The ability to store older data in cold storage [with Online Archive] helped us achieve this.”

Atlas Triggers also help the team automatically scale down their clusters each night and scale them up in the morning. The triggers are fully-managed and schedule-based, so it’s as easy as setting them up and letting them run. This automatic right-sizing is saving the team upwards of $7000 each month ($700 per cluster for 10 clusters).

After seeing such positive results, the team has since decided to replatform multiple other products onto MongoDB. “Here, MongoDB’s live support and consultation have proved very useful,” said Pranesh Vittal. Now, Clear manages 25+ clusters and over 10TB of data on MongoDB Atlas.