MongoDB Connector for BI in Tableau

Hi everyone!

I’m working with some collections hosted in an Atlas cluster, which I access from Tableau Prep through the MongoDB Connector for BI.

I’m having a lot of problems when doing aggregations in Tableau because of the memory limit on the $group stage.

My main idea is to create views of these collections and load them into Tableau, but I can’t find a way to set the { allowDiskUse: true } option in the view’s pipeline. This would not be a problem when running from the mongo shell, but I want to load these views directly from Tableau so I can run the flows automatically.
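For reference, this is roughly what I am doing from the shell (the collection and field names here are just examples):

```javascript
// Running the pipeline directly works, because aggregate() accepts the
// allowDiskUse option:
db.machine_errors.aggregate(
  [ { $group: { _id: "$errorType", total: { $sum: 1 } } } ],
  { allowDiskUse: true }
)

// But the equivalent view definition only takes the pipeline itself,
// so there is nowhere to put allowDiskUse:
db.createView("machine_errors_by_type", "machine_errors", [
  { $group: { _id: "$errorType", total: { $sum: 1 } } }
])
```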

I have successfully tried creating a new collection with the already aggregated data, but this is not the best solution because it duplicates the data and stores it twice.

My questions are: Is there any way to set the { allowDiskUse: true } option when creating these views? Is it possible to remove the memory limit on aggregations by default? And how can I query these views from Tableau Prep through the BI connector without running into this memory problem in the aggregations?

Thanks in advance, I hope you can help me with this problem.

Hi @Fernando_Lumbreras and welcome to the community!!

There are two possible methods that might help with the above problem, but you will need to trade off memory usage against the response time of the aggregation query.

The first is to use the find method alongside { allowDiskUse: true }. Please refer to the documentation on cursor.allowDiskUse().
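A minimal sketch of this from mongosh, assuming a hypothetical machine_errors collection:

```javascript
// cursor.allowDiskUse() (MongoDB 4.4+) lets a find() with a large,
// non-indexed sort spill to temporary files on disk instead of failing
// on the in-memory sort limit. Calling it with no argument enables it.
db.machine_errors.find({}).sort({ errorType: 1 }).allowDiskUse()
```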

The other method is to make use of on-demand materialised views, available from MongoDB 4.2 onwards, which write the aggregated results to a collection and refresh the contents on each pipeline run.

This could be a good option if you are willing to trade some extra storage space for faster queries.
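As a rough sketch (the collection and field names are only placeholders), an on-demand materialised view is built by ending the pipeline with a $merge stage, and re-running the pipeline refreshes it:

```javascript
// Aggregate the source collection and write the results into
// machine_errors_by_type; re-running this pipeline refreshes its contents.
db.machine_errors.aggregate(
  [
    { $group: { _id: "$errorType", total: { $sum: 1 } } },
    { $merge: { into: "machine_errors_by_type", whenMatched: "replace", whenNotMatched: "insert" } }
  ],
  { allowDiskUse: true }
)
```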

Let us know if you have further questions.

Thanks
Aasawari


Hi @Aasawari! First of all, thank you very much for your reply.

The first method works correctly for querying the views from the mongo shell, but my problem comes when querying these views from Tableau Prep Builder, where I don’t have the mongo shell. The only things I can do there are select tables and views (without being able to add find and { allowDiskUse: true }) or write custom MySQL queries.

As for the second method, the problem is that materialized views increase the size of the database. I would end up with duplicate data and a higher cost for the service.

Is there no way to configure the MongoDB BI Connector to avoid these memory problems?

Thanks

Hi @Fernando_Lumbreras

The allowDiskUse option is used by default in the BI Connector. The documentation on how-are-queries-processed will hopefully be useful for you.

Could you elaborate on how you determined that the issue was caused by the $group stage? Is there any error message that you see?

Please share the above information so we can help you further.

Thanks
Aasawari


Hi @Aasawari

Yes, I have read that the BI Connector has this parameter active by default, but in practice I don’t see it working.

I am working with a collection that contains about 55 million documents (machine error records).

Once the collection is opened in Tableau Prep, what I am trying to do is group all this data by error type. Everything seems to work fine, but at the end of the execution I get the following error on the Tableau server.

[Screenshot of the Tableau Server error message]

Thanks

Hello @Fernando_Lumbreras, and thank you for sharing the information above.

The error notice "GroupByStage - an allocation of 1318 bytes pushes overall allocation over the limit of 2147483647" is raised by the BI Connector because the $group stage it uses internally exceeded the 2 GB maximum allocation enforced by Atlas. Notably, allowDiskUse allows a stage to grow beyond 100 MB, so this message was not caused by a missing allowDiskUse option.

Unfortunately, this limit is currently not customisable. Even if it were, I suspect that as your database grows you would hit the raised limit again sooner rather than later. Thus, the only workaround at the moment is to limit the amount of data that the Tableau query pulls.
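For example, one way to keep the grouped data small (the collection, field names, and the 90-day window below are only assumptions for illustration) is to expose a pre-filtered view to Tableau instead of the full collection:

```javascript
// A view that only exposes recent documents, so the $group generated by
// the BI Connector runs over far fewer documents.
db.createView("machine_errors_recent", "machine_errors", [
  { $match: { createdAt: { $gte: new Date(Date.now() - 90 * 24 * 60 * 60 * 1000) } } }
])
```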

Please let us know if you have any further questions.

Thanks
Aasawari


This topic was automatically closed 5 days after the last reply. New replies are no longer allowed.