I have 1 million documents, each with 256 bytes of bit flags (so 2048 bits) stored in a binary field and used as a custom search filter. Searching for all records with certain bits set (essentially a bitwise operation across all records) is slow, taking tens of seconds, compared to a fraction of a second in a relational database. Is there a good way of implementing this kind of data and query in MongoDB?
The key operation, in algebraic form, is a check that the field in the document has at least all the bits set that are in the query (it may also have many more set):

field & query == query
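In Python terms, the check could be sketched like this (a minimal illustration of the condition, not the MongoDB query itself):

```python
def matches(field: bytes, query: bytes) -> bool:
    """True if every bit set in `query` is also set in `field`."""
    return all(f & q == q for f, q in zip(field, query))

# Example: field has bits 0 and 2 set (0b101); a query for bit 2 (0b100)
# matches, but a query for bit 1 (0b010) does not.
assert matches(b"\x05", b"\x04")
assert not matches(b"\x05", b"\x02")
```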
I tried adding an index on the bit-flag field, since that should allow the query to run as an index scan, but MongoDB doesn't use the index for bitwise operations - at least it didn't the last time I tried; I haven't rerun my tests against the latest versions.
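For reference, the condition above maps onto MongoDB's `$bitsAllSet` operator (available since 3.2), which accepts a numeric bitmask, a BinData bitmask, or a list of bit positions. The `flags` field name and the positions below are illustrative; this sketch only builds the filter document:

```python
# Hypothetical filter: match documents whose "flags" binary field has
# bits 3, 17, and 2040 all set (positions chosen for illustration).
filter_doc = {"flags": {"$bitsAllSet": [3, 17, 2040]}}

# Would be used as e.g.: db.graphs.find(filter_doc)
```

In my tests the query plan for this kind of filter was a collection scan, consistent with the index not being used.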
Some background for those wondering why one would implement something like this: the documents contain complex scientific graph data that needs to be sub-graph matched, which is an NP-complete problem. Each bit represents a complex composite property of the nodes and edges within the graph, covering common cases of the underlying problem and giving a speed-up of several orders of magnitude. Filtering on the bits is used as a pre-filter when fetching documents, to avoid fetching the entire database.
Fetching everything will also be a fairly common use case, which I assume would be better addressed by caching, but that is a separate issue. The effect is similar to a Bloom filter.