Understanding CSFLE strategies for "forgetting a user's data"

Hi team,

I hope you’re all well.

I was wondering if any of you had implemented CSFLE (Client Side Field Level Encryption)?

I’ve finally got a use case for it, and am implementing it into our framework, but have seen an interesting scenario that I would love your advice on.

Consider a scenario such as GDPR.

Client A has their data encrypted with Data Encryption Key A, and Client B has their data encrypted with Data Encryption Key B. We chose this strategy so we could guarantee clients that, once their Data Encryption Key is deleted, we are unable to read their sensitive data from our database or, indeed, from our backups (the keys are kept in a different database, on a different backup schedule).

So that all works well. Client A now wants to exercise their right to be forgotten. Given that their sensitive data is all encrypted, we’re exploring just deleting their Data Encryption Key. This would mean that we can’t decrypt any of client A’s sensitive data within the database. Let’s assume that that’s OK for all non-technical reasons, as this is a technical question.

Now, suppose some documents for Client A are not physically deleted from our databases and still contain encrypted data for which we no longer have the Data Encryption Key. If we run a query that happens to cover some of those records (perhaps to read some non-personal data, like telemetry), along with many records from other clients whose keys have not been deleted, that query throws an error.

The specific error, for the nerds in the group, is {"name":"MongoCryptError","message":"not all keys requested were satisfied. Verify that key vault DB/collection name was correctly specified."}

I understand why this is happening - I just don’t want it to happen. What are my mitigation strategies?

Has anyone experienced this, and more specifically … how did you handle it, or how would you handle it?

10 points for the best answer :wink:

One solution to the above is to not project any encrypted fields in queries that might cross DEK boundaries. If you’re within one DEK boundary, it would make sense to throw the exception.
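As a rough illustration of the projection idea, here is a minimal Python sketch. The field names (`ssn`, `email`) and the helper `exclusion_projection` are hypothetical, not taken from our schema, and the sketch assumes that fields which never reach the client never trigger a key lookup.

```python
from typing import Dict, Iterable

# Hypothetical names of the fields encrypted with a per-client DEK.
ENCRYPTED_FIELDS = ("ssn", "email")


def exclusion_projection(encrypted_fields: Iterable[str]) -> Dict[str, int]:
    """Build a projection document that excludes every encrypted field."""
    return {field: 0 for field in encrypted_fields}


projection = exclusion_projection(ENCRYPTED_FIELDS)
print(projection)
```

With PyMongo, the resulting document would be passed as the projection argument, for example `collection.find({"type": "telemetry"}, projection)`, where the filter is again only an assumed example.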

Of course, another solution is to delete the data itself, but that could be easier said than done on a large micro-serviced platform [anyway, that was out of scope for this discussion].

I’d be interested to hear any other proposals.

Hey @Erich, I’m not a subject matter expert on this topic, but I wanted to share “Implementing Right to Erasure with CSFLE” (MongoDB) in case you hadn’t seen it yet, as you may find it useful.

Thank you @alexbevi. That’s exactly what we’ve implemented and what the issue above relates to. CSFLE and key shredding is a neat solution to a real problem, but the issue above really does complicate matters a bit.

So far, I think the mitigation is that cross-DEK queries will have to exclude encrypted fields through projection, and where services need access to encrypted field content, they will have to process DEK boundaries individually. This will work fine, although it is a bit inefficient - but acceptable.
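Processing DEK boundaries individually could look roughly like the Python sketch below. Everything in it is an assumption for illustration: `KeyUnavailableError` stands in for whatever decryption error the driver raises, and `fetch_for_client` stands in for a query filtered to a single client's documents.

```python
from typing import Callable, Dict, Iterable, List, Tuple


class KeyUnavailableError(Exception):
    """Hypothetical stand-in for the driver's 'keys not satisfied' error."""


def read_per_client(
    client_ids: Iterable[str],
    fetch_for_client: Callable[[str], List[dict]],
) -> Tuple[Dict[str, List[dict]], List[str]]:
    """Run one query per client so a deleted key only fails that client's batch."""
    results: Dict[str, List[dict]] = {}
    forgotten: List[str] = []
    for client_id in client_ids:
        try:
            results[client_id] = fetch_for_client(client_id)
        except KeyUnavailableError:
            # This client's DEK is gone; skip them and carry on with the rest.
            forgotten.append(client_id)
    return results, forgotten


def fake_fetch(client_id: str) -> List[dict]:
    """Fake query: pretends the key for 'client-a' has been deleted."""
    if client_id == "client-a":
        raise KeyUnavailableError("not all keys requested were satisfied")
    return [{"client": client_id, "email": "user@example.com"}]


results, forgotten = read_per_client(["client-a", "client-b"], fake_fetch)
```

The cost is one round trip per client rather than one query overall, which is the inefficiency mentioned above.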

I really appreciate you commenting on this. The link you provided is very useful for anyone diving into this topic.
