Limiting query results

I’m putting together a small document management solution based on Node.js and MongoDB. MongoDB can store metadata and files pretty easily. I’ll need to use GridFS to break up larger documents, but beyond that everything I need is there.

I need to control user access to documents (meaning what they can do on a document-by-document basis) via the equivalent of an ACL. The most stringent limit is where a document isn’t even visible in query results. The next level up is seeing that the document is there, but with no ability to read it; then read; then update; then delete. I’m looking for ideas on how to accomplish this.

The challenge is how to limit visibility of search results in the first place, so that users who aren’t authorized never see that a document is even present. I can check a user’s rights for a single document without issue, and the overhead is negligible for a single document. The problem comes if I have millions of documents. I need to limit search results with something that can be combined into the search itself with minimal overhead.

Is there anything native to MongoDB like this (it is something available in Oracle and SQL Server)? If it isn’t built in, is there an approach available as an add-on or as custom code?

The one thing I thought of was a bit field on each document, with bit positions representing groups and enough bits to cover every group. A user’s group memberships (generated when they log on) would be turned into a mask and combined with that field in a bitwise operation, with a bit set to 1 (or 0) when that group is excluded from seeing the object in a query result. I’m worried about the overhead, though, since this would not be indexable. It might be acceptable as long as the other conditions of the search are applied first.
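For illustration, here is a rough sketch of how a bit-field check like that could be folded into the query itself using MongoDB's `$bitsAnySet` operator. It is only one possible reading of the idea: it assumes a hypothetical integer field `visibleTo` where a set bit means the group is allowed (rather than excluded), and the group names and bit positions are made up. As far as I know, the bitwise part of such a query cannot use an index, so the other search conditions still need to do the narrowing.

```javascript
// Assumed mapping from group name to bit position (hypothetical values).
const GROUP_BITS = new Map([
  ["staff", 0],
  ["engineering", 1],
  ["legal", 2],
  ["compliance", 3],
]);

// Convert a user's group names into sorted bit positions.
// Groups without an assigned bit are ignored.
function groupBitPositions(groups) {
  return groups
    .filter((g) => GROUP_BITS.has(g))
    .map((g) => GROUP_BITS.get(g))
    .sort((a, b) => a - b);
}

// Combine the normal search conditions with the visibility check.
// `visibleTo` is a hypothetical integer field on each document.
function visibleDocsFilter(searchFilter, groups) {
  const positions = groupBitPositions(groups);
  if (positions.length === 0) {
    // No recognized groups: every document has an _id, so this matches nothing.
    return { _id: { $exists: false } };
  }
  return {
    $and: [searchFilter, { visibleTo: { $bitsAnySet: positions } }],
  };
}
```

With the Node.js driver this would be used roughly as `db.collection("documents").find(visibleDocsFilter({ customer: "Acme" }, user.groups))`, where the collection name and `user.groups` are again assumptions. Passing bit positions as an array avoids building a large numeric mask in JavaScript, which loses precision above 53 bits.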

Thanks,
Gene

What does it mean to see that a document is there but not be able to read it?

Is that seeing things like “title” but not content?

Sorry, it’s a fussy thing to describe. The example I have is Documentum (a traditional document management product), which demonstrates this functionality. As I described, we’re talking about a small document management solution, where the documents are traditional documents, not the data structures that are often called documents in NoSQL DBs. Most of the time these are PDFs, but there are also MS Office files. These can be stored via BSON, with the majority of the other information being metadata such as customer, consignee, etc.

In cases of sensitive documents (HIPAA, ITAR, highly valuable IP), only a limited set of people can actually view the documents, but it is useful for other users of the system to know that they are there. At that point they can ask someone who is allowed to view the documents whether they can be given access as well.

So, in Documentum, this is called browse access. It’s a step above no awareness of the documents, but a step below being able to actually look at them. A person with browse access is able to see metadata, but not the PDF/Word/etc. document itself.

Sorry, but that’s a bit long-winded.

Sorry, I missed your reply. This can be handled by defining a read-only view that excludes the field(s) containing the content. Then give those users the ability to read the view but not the underlying collection, and they will only see a subset of the fields of each document.

See more details here: https://docs.mongodb.com/manual/core/views/
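A minimal sketch of such a view with the Node.js driver might look like the following. The collection, view, and field names (`documents`, `documents_browse`, `content`, `fileId`) are assumptions for illustration only. The view is created through `createCollection` with the `viewOn` and `pipeline` options, and the `$project` stage drops the content fields.

```javascript
// Build the options for a view that hides the given content fields.
function browseViewOptions(sourceCollection, contentFields) {
  const projection = {};
  for (const field of contentFields) {
    projection[field] = 0; // 0 excludes the field from the view's output
  }
  return { viewOn: sourceCollection, pipeline: [{ $project: projection }] };
}

// Create the view; `db` is assumed to be a connected Db instance
// from the official `mongodb` driver.
async function createBrowseView(db, viewName, sourceCollection, contentFields) {
  return db.createCollection(
    viewName,
    browseViewOptions(sourceCollection, contentFields)
  );
}
```

For example, `await createBrowseView(db, "documents_browse", "documents", ["content", "fileId"])` would define the view. Users with browse access would then get a role that allows `find` on the view only, not on the source collection.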
