PyMongoArrow 0.6.2 Released

We are pleased to announce the 0.6.2 release of PyMongoArrow, a PyMongo extension containing tools for loading MongoDB query result sets as Apache Arrow tables, Pandas DataFrames, or NumPy arrays.

This is a minor release that adds support for PyArrow 10.0. We did not publish 0.6.0 or 0.6.1 due to technical errors.

See the changelog for a high-level summary of what’s new and improved, or see the 0.6.2 release notes in JIRA for the complete list of resolved issues.

Documentation: PyMongoArrow 0.6.2 Documentation
Changelog: Changelog
Source: GitHub

Thank you to everyone who contributed to this release!

I took MongoDB University’s PyMongoArrow course yesterday, and then realised that support for many types is still missing.

On the other hand, the same functionality (with support for all Python types) is already provided by Pandas through one of its DataFrame constructors. The Python “dict” objects returned by PyMongo’s “find()” method (see MongoDB Python Connection | MongoDB) can be passed directly to the DataFrame constructor.
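
As a minimal sketch of that approach: the sample documents below are made up and stand in for what `list(collection.find())` might return, so no database connection is involved, and the field names are purely illustrative.

```python
import pandas as pd

# Hypothetical sample documents standing in for list(collection.find());
# a real query would yield dicts of a similar shape.
docs = [
    {"_id": 1, "item": "apple", "qty": 5, "price": 0.5},
    {"_id": 2, "item": "pear", "qty": 3, "price": 0.75},
    {"_id": 3, "item": "plum", "qty": 8},  # "price" is absent in this document
]

# The DataFrame constructor accepts a list of dicts directly:
# keys become columns, and fields missing from a document become NaN.
df = pd.DataFrame(docs)
print(df)
```

Note that every document is first materialised as a Python dict before Pandas sees it, which is the step the question below is really asking about.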

So, what is the need for, or advantage of, using PyMongoArrow?

Hi @Sanjay_Dasgupta, thank you for the question, and for opening the GitHub issue “Documentation should describe advantages over DataFrame constructor (of Pandas)” (Issue #107, mongodb-labs/mongo-arrow).

For completeness, we’re tracking this issue in https://jira.mongodb.org/browse/ARROW-129, summarized as:

We should list the pros and cons of using this library versus using the PyMongo API directly, highlighting the benchmarks as well as the limitations.

We should give examples showing how the same tasks could be accomplished with each.
