Client-Side Field Level Encryption (CSFLE) in MongoDB with Golang
One of the many great things about MongoDB is how secure you can make your data in it. In addition to network and user-based rules, you have encryption of your data at rest, encryption over the wire, and, more recently, client-side encryption known as client-side field level encryption (CSFLE).
So, what exactly is client-side field level encryption (CSFLE) and how do you use it?
With field level encryption, you can choose to encrypt certain fields within a document, client-side, while leaving other fields as plain text. This is particularly useful because when a CSFLE document is viewed outside of your application, the encrypted fields will not be human readable. If the documents should get into the wrong hands, those fields will be useless to the malicious user. However, when using the MongoDB language drivers with the same encryption keys, those fields can be decrypted and are queryable within the application.
There are a few requirements that must be met prior to attempting to use CSFLE with the Go driver.
To use field level encryption, you're going to need a little more than just having an appropriate version of MongoDB and the MongoDB Go driver. We'll need libmongocrypt, which is a companion library for encryption in the MongoDB drivers, and mongocryptd, which is a binary for parsing automatic encryption rules based on the extended JSON format.
Because we want to do automatic encryption with the driver using an extended JSON schema, we need mongocryptd, a binary that ships with MongoDB Enterprise Edition. The mongocryptd binary needs to exist on the computer or server where the Go application intends to run. It is not a development dependency like libmongocrypt, but a runtime dependency.
By this point, all the appropriate components for field level encryption should be installed or available.
Before we can start encrypting and decrypting fields within our documents, we need to establish keys to do the bulk of the work. This means defining our key vault location within MongoDB and the Key Management System (KMS) we wish to use for decrypting the data encryption keys.
The key vault is a collection that we'll create within MongoDB for storing encrypted keys for our document fields. The primary key within the KMS will decrypt the keys within the key vault.
On your computer, create a new Go project with the following main.go file:
In the above code, we have a few variables defined as well as a few functions. We're going to focus on the kmsProviders variable and the createDataKey function for this particular part of the tutorial.
Take a look at the following createDataKey function:
In the above createDataKey function, we are first connecting to MongoDB. The MongoDB connection string is defined by the environment variable ATLAS_URI in the above code. While you could hard-code this connection string or store it in a configuration file, for security reasons, it makes a lot of sense to use environment variables instead.
If the connection was successful, we need to define the key vault namespace and the KMS provider as part of the encryption configuration options. The namespace is composed of the database name followed by the collection name. This is where the key information will be stored. The kmsProviders map, which will be defined later, will have local key information. The CreateDataKey function will create the key information within MongoDB as a document.
We are choosing to specify an alternate key name of example so that we don't have to refer to the data key by its _id when using it with our documents. Instead, we'll be able to use the unique alternate name, which could follow a special naming convention. It is important to note that the alternate key name is only useful when using the AEAD_AES_256_CBC_HMAC_SHA_512-Random algorithm, something we'll explore later in this tutorial.
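The steps described above could be sketched roughly as follows. This is a reconstruction, not the tutorial's original listing: it assumes version 1 of the MongoDB Go driver (import path go.mongodb.org/mongo-driver), a build with the cse tag, and a key vault namespace of keyvault.datakeys. The error wrapping and the throwaway random key in main are choices made for this sketch.

```go
package main

import (
	"context"
	"crypto/rand"
	"fmt"
	"log"
	"os"

	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

// The key vault namespace is the database name, a dot, then the collection name.
const keyVaultNamespace = "keyvault.datakeys"

// createDataKey connects to MongoDB and stores a new data key, encrypted with
// the local master key in kmsProviders, under the alternate name "example".
func createDataKey(ctx context.Context, kmsProviders map[string]map[string]interface{}) error {
	client, err := mongo.Connect(ctx, options.Client().ApplyURI(os.Getenv("ATLAS_URI")))
	if err != nil {
		return fmt.Errorf("connect to MongoDB: %w", err)
	}
	defer client.Disconnect(ctx)

	encryptionOpts := options.ClientEncryption().
		SetKeyVaultNamespace(keyVaultNamespace).
		SetKmsProviders(kmsProviders)
	clientEncryption, err := mongo.NewClientEncryption(client, encryptionOpts)
	if err != nil {
		return fmt.Errorf("create client encryption: %w", err)
	}
	defer clientEncryption.Close(ctx)

	dataKeyOpts := options.DataKey().SetKeyAltNames([]string{"example"})
	// The returned _id is ignored here because the key is referenced by name.
	if _, err := clientEncryption.CreateDataKey(ctx, "local", dataKeyOpts); err != nil {
		return fmt.Errorf("create data key: %w", err)
	}
	return nil
}

func main() {
	// A local master key is 96 random bytes. It is regenerated on every run
	// here, which is only acceptable for a first experiment.
	localKey := make([]byte, 96)
	if _, err := rand.Read(localKey); err != nil {
		log.Fatal(err)
	}
	kmsProviders := map[string]map[string]interface{}{
		"local": {"key": localKey},
	}
	if err := createDataKey(context.Background(), kmsProviders); err != nil {
		log.Fatal(err)
	}
}
```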
To use the createDataKey function, we can make some modifications to the main function:
In the above code, we are generating a random key. This random key is added to the kmsProviders map that we were using within the createDataKey function.
It is insecure to have your local key stored within the application or on the same server. In production, consider using AWS KMS or accessing your local key through a separate request before adding it to the Local Key Provider.
If you ran the code so far, you'd end up with a keyvault database and a datakeys collection which has a document of a key with an alternate name. That document would look something like this:
There are a few important things to note with our code so far:
- The localKey is random and does not persist beyond the runtime, which will result in key mismatches upon consecutive runs of the application. Either specify a non-random key or store it somewhere after generation.
- We're using a Local Key Provider with a key that exists locally. This is not recommended in a production scenario due to security concerns. Instead, use a provider like AWS KMS or store the key externally.
- The createDataKey function should only be executed when a particular key needs to be created, not every time the application runs.
- There is no strict naming convention for the key vault and the keys that reside in it. Name your database and collection however makes sense to you.
After we run our application the first time, we'll probably want to comment out the createDataKey line in the main function.
With the data key created, we're at a point in time where we need to figure out what fields should be encrypted in a document and what fields should be left as plain text. The easiest way to do this is with a schema map.
A schema map for encryption is extended JSON and can be added directly to the Go source code or loaded from an external file. From a maintenance perspective, loading from an external file is easier to maintain.
Take a look at the following schema map for encryption:
Let's assume the above JSON exists in a schema.json file which sits relative to our Go files or binary. In the above JSON, we're saying that the map applies to the people collection within the fle-example database. The keyId field within the encryptMetadata object says that documents within the people collection must have a string field called keyAltName. The value of this field will reflect the alternate key name that we defined when creating the data key. Notice the / that prefixes the value. That is not an error. It is a requirement for this particular value since it is a pointer.
The properties field lists fields within our document and in this example lists the fields that should be encrypted along with the encryption algorithm to use. In our example, only the ssn field will be encrypted while all other fields will remain as plain text.
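To make the description concrete, the sketch below embeds a schema map of the shape described and lists which fields it marks for encryption. The JSON is reconstructed from the description rather than copied from the original file, so the exact layout (including the bsonType values) is an assumption. It is parsed here with the standard library's encoding/json purely for illustration, and the encryptedFields helper is hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// schemaJSON is an assumed reconstruction of schema.json.
const schemaJSON = `{
  "fle-example.people": {
    "bsonType": "object",
    "encryptMetadata": {
      "keyId": "/keyAltName"
    },
    "properties": {
      "ssn": {
        "encrypt": {
          "bsonType": "string",
          "algorithm": "AEAD_AES_256_CBC_HMAC_SHA_512-Random"
        }
      }
    }
  }
}`

// collectionSchema models only the parts of the schema this sketch inspects.
type collectionSchema struct {
	EncryptMetadata struct {
		KeyID string `json:"keyId"`
	} `json:"encryptMetadata"`
	Properties map[string]struct {
		Encrypt *struct {
			BSONType  string `json:"bsonType"`
			Algorithm string `json:"algorithm"`
		} `json:"encrypt"`
	} `json:"properties"`
}

// encryptedFields maps each encrypted field in the namespace to its algorithm.
func encryptedFields(schema []byte, namespace string) (map[string]string, error) {
	var schemaMap map[string]collectionSchema
	if err := json.Unmarshal(schema, &schemaMap); err != nil {
		return nil, err
	}
	coll, ok := schemaMap[namespace]
	if !ok {
		return nil, fmt.Errorf("no schema for namespace %q", namespace)
	}
	fields := make(map[string]string)
	for name, prop := range coll.Properties {
		if prop.Encrypt != nil {
			fields[name] = prop.Encrypt.Algorithm
		}
	}
	return fields, nil
}

func main() {
	fields, err := encryptedFields([]byte(schemaJSON), "fle-example.people")
	if err != nil {
		log.Fatal(err)
	}
	for name, algorithm := range fields {
		fmt.Printf("%s is encrypted with %s\n", name, algorithm)
	}
}
```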
There are two algorithms currently supported: AEAD_AES_256_CBC_HMAC_SHA_512-Random and AEAD_AES_256_CBC_HMAC_SHA_512-Deterministic.
In short, the AEAD_AES_256_CBC_HMAC_SHA_512-Random algorithm is best used on fields that have low cardinality or don't need to be used within a filter for a query. The AEAD_AES_256_CBC_HMAC_SHA_512-Deterministic algorithm should be used for fields with high cardinality or for fields that need to be used within a filter.
If we wanted to, we could change the schema map to the following:
The change made in the above example has to do with the keyId field. Rather than declaring it as part of the encryptMetadata, we've declared it as part of a particular field. This could be useful if you want to use different keys for different fields.
Remember, the pointer used for the keyId will only work with the AEAD_AES_256_CBC_HMAC_SHA_512-Random algorithm. You can, however, use the actual key id for both algorithms.
With a schema map for encryption available, let's get it loaded in the Go application. Change the readSchemaFromFile function to look like the following:
In the above code, we are reading the file, which will be the schema.json file soon enough. If it is read successfully, we use the UnmarshalExtJSON function to load it into a bson.M object that is more pleasant to work with in Go.
By this point, you should have the code in place for creating a data key and a schema map defined to be used with the automatic client encryption functionality that MongoDB supports. It's time to bring it together to actually encrypt and decrypt fields.
We're going to start with the createEncryptedClient function within our project:
In the above code, we are making use of the readSchemaFromFile function that we had just created to load our schema map for encryption. Next, we are defining our auto encryption options and establishing a connection to MongoDB. This will look somewhat familiar to what we did in the createDataKey function. When defining the auto encryption options, not only are we specifying the KMS for our key and vault, but we're also supplying the schema map for encryption.
You'll notice that we are using mongocryptdBypassSpawn as an extra option. We're doing this so that the client doesn't try to automatically start the mongocryptd daemon if it is already running. You may or may not want to use this in your own application.
If the connection was successful, the client is returned.
It's time to revisit the main function within the project:
In the above code, we are creating our Local Key Provider using a local key that was randomly generated. Remember, this key should match what was used when creating the data key, so a random key may not be the best choice long-term. Likewise, a local key shouldn't be used in production for security reasons.
Once the KMS providers are established, the createEncryptedClient function is executed. Remember, this particular function will set the automatic encryption options and establish a connection to MongoDB.
To match the database and collection used in the schema map definition, we are using fle-example as the database and people as the collection. The operations that follow, such as FindOne, can be used as if field level encryption wasn't even a thing. Because the schema map marks the ssn field for encryption, the ssn field will be encrypted client-side and saved to MongoDB. When doing a lookup operation, the encrypted field will be decrypted.
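Putting the last few steps together, an encrypted client and a round trip might be sketched as follows. This is a reconstruction under assumptions, not the original listing: it targets version 1 of the MongoDB Go driver, must be built with the cse tag, expects schema.json next to the binary, and reads the local key from a hypothetical master-key.bin file holding the same 96 bytes that were used when the data key was created. The sample document values are made up.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"os"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

const keyVaultNamespace = "keyvault.datakeys"

// readSchemaFromFile loads an extended JSON schema map into a bson.M.
func readSchemaFromFile(file string) (bson.M, error) {
	content, err := os.ReadFile(file)
	if err != nil {
		return nil, fmt.Errorf("read schema: %w", err)
	}
	var schema bson.M
	if err := bson.UnmarshalExtJSON(content, false, &schema); err != nil {
		return nil, fmt.Errorf("parse schema: %w", err)
	}
	return schema, nil
}

// createEncryptedClient connects with automatic encryption configured.
func createEncryptedClient(ctx context.Context, kmsProviders map[string]map[string]interface{}) (*mongo.Client, error) {
	schemaMap, err := readSchemaFromFile("schema.json")
	if err != nil {
		return nil, err
	}
	autoEncryptionOpts := options.AutoEncryption().
		SetKeyVaultNamespace(keyVaultNamespace).
		SetKmsProviders(kmsProviders).
		SetSchemaMap(schemaMap).
		// Skip spawning mongocryptd; this assumes it is already running.
		SetExtraOptions(map[string]interface{}{"mongocryptdBypassSpawn": true})
	clientOpts := options.Client().
		ApplyURI(os.Getenv("ATLAS_URI")).
		SetAutoEncryptionOptions(autoEncryptionOpts)
	return mongo.Connect(ctx, clientOpts)
}

func main() {
	ctx := context.Background()

	// Hypothetical file holding the key used when the data key was created.
	localKey, err := os.ReadFile("master-key.bin")
	if err != nil {
		log.Fatal(err)
	}
	kmsProviders := map[string]map[string]interface{}{
		"local": {"key": localKey},
	}

	client, err := createEncryptedClient(ctx, kmsProviders)
	if err != nil {
		log.Fatal(err)
	}
	defer client.Disconnect(ctx)

	people := client.Database("fle-example").Collection("people")

	// keyAltName names the data key, matching the pointer in the schema map.
	doc := bson.M{"name": "Jane Doe", "ssn": "123-45-6789", "keyAltName": "example"}
	if _, err := people.InsertOne(ctx, doc); err != nil {
		log.Fatal(err)
	}

	// The filter uses a plain text field; ssn is decrypted on the way back.
	var person bson.M
	if err := people.FindOne(ctx, bson.M{"name": "Jane Doe"}).Decode(&person); err != nil {
		log.Fatal(err)
	}
	fmt.Println("decrypted ssn:", person["ssn"])
}
```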
When looking at the data in Atlas, for example, the encrypted fields will not be human readable as seen in the above screenshot.
When field level encryption is included in the Go application, a special tag must be included in the build or run process, depending on the route you choose. You should already have mongocryptd and libmongocrypt, so to build your Go application, you'd pass the -tags cse flag, for example go build -tags cse.
If you use the above command to build your binary, you can use it as normal. However, if you're running your application without building, you can pass the same flag to go run, for example go run -tags cse main.go.
The above command will run the application with client-side encryption enabled.
If you've run the example so far, you'll probably notice that while you can automatically encrypt fields and decrypt fields, you'll get an error if you try to use a filter that contains an encrypted field.
To use the deterministic approach, we need to make a few revisions to our project. These changes are a result of the fact that we won't be able to use alternate key names within our schema map.
First, let's change the schema.json file to the following:
The two changes in the above JSON reflect the new algorithm and the keyId using the actual _id value rather than an alias. For the base64 field, notice the use of the %s placeholder. If you know the base64 string version of your key, then swap it out and save yourself a bunch of work. Since this tutorial is an example and the data changes pretty much every time we run it, we probably want to swap out that field after the file is loaded.
Starting with the createDataKey function, find the following line with the CreateDataKey function call:
What we didn't see in the previous parts of this tutorial is that this function returns the _id of the data key. We should probably update our createDataKey function to return primitive.Binary and then return that dataKeyId value.
We need to move that dataKeyId value around until it reaches where we load our JSON file. We're doing a lot of work for the following reasons:
- We're in the scenario where we don't know the _id of our data key prior to runtime. If we know it, we can add it to the schema and be done.
- We designed our code to jump around with functions.
The schema map requires a base64 value to be used, so when we pass around dataKeyId, we need to have first encoded it.
In the main function, we might have something that looks like this:
This means that the createEncryptedClient function needs to receive a string argument. Update the createEncryptedClient function to accept a string and then change how we're reading our JSON file:
Remember, we're just passing the base64 encoded value through the pipeline. By the end of this, in the readSchemaFromFile function, we can update our code to look like the following:
Not only are we receiving the base64 string, but we are using an Sprintf function to swap our %s placeholder with the actual value.
Again, these changes were based around how we designed our code. At the end of the day, we were really only changing the keyId in the schema map and the algorithm used for encryption. By doing this, we are not only able to decrypt fields that had been encrypted, but we're also able to filter for documents using encrypted fields.
While it might seem like we wrote a lot of code, the reality is that the code was far simpler than the concepts involved. To get a better look at the code, you can find it below:
Try to set the ATLAS_URI in your environment variables and give the code a spin.
If you ran the above code and found some encrypted data in your database, fantastic! However, if you didn't get so lucky, I want to address a few of the common problems that come up.
Let's start with the following runtime error:
If you see the above error, it is likely because you forgot to use the -tags cse flag when building or running your application. To get beyond this, just build your application with the following:
Assuming there aren't other problems, you won't receive that error anymore.
When you build or run with the -tags cse flag, you might stumble upon the following error:
Now, what if you encounter the following?
You just saw how to use MongoDB client-side field level encryption (CSFLE) in your Go application. This is useful if you'd like to encrypt fields within MongoDB documents client-side before they reach the database.
There are a few things that I want to reiterate:
- Using a local key is a security risk in production. Either use something like AWS KMS or load your Local Key Provider with a key that was obtained through an external request.
- The mongocryptd binary must be available on the computer or server running the Go application. This is easily installed through the MongoDB Enterprise Edition installation.
- The libmongocrypt library must be available to add compatibility to the Go driver for client-side encryption and decryption.
- Don't lose your client-side key. Otherwise, you lose the ability to decrypt your fields.
In a future tutorial, we'll explore how to use AWS KMS and similar for key management.