GIANT Stories at MongoDB

How to create Dynamic Custom Roles with MongoDB Stitch

None of us want to write lots of code to control what data each user can access. In most cases, you can set up your data access controls in seconds by using Stitch's built-in templates. Stitch also lets you create custom rules that you tailor to your application and schema – this post creates such a rule, querying a second collection when deciding whether to allow a user to insert a document.

JSON Schema Validation - Locking down your model the smart way

If you've read through the Building with Patterns series, you've likely seen the flexibility of MongoDB's document data model. This provides many advantages when it comes to building applications quickly, as we aren't locked into a rigid data structure like we are in a legacy, tabular database. Once your schema design has settled, however, it is often useful to "lock" it into place. In MongoDB, we can use JSON Schema validation to accomplish this task. Since MongoDB 3.6, we have supported schema validation based on the JSON Schema draft specification.

This ability to lock down the document model with a strictly designed schema means you can, for example, introduce concrete milestones in the evolution of your data model which you can test against. One potential scenario is an application that has gone through the development cycle and whose data structure has become more rigid. At this point, defining a structure for the data may be desirable to ensure there are no unintended changes to the schema, or unexpected data being put into a specific field. For example, someone storing an image in a password field is not a desirable experience.

Schema validation can be set up in MongoDB both when creating a new collection and on an existing collection. The validation process runs during document inserts and updates, so when rules are added to an existing collection, the documents already in it are only validated when they are next modified. The syntax for implementing validation is similar in either case:

New Collection

db.createCollection("recipes",
    validator: { $jsonSchema: {
         <<Validation Rules>>
        }
    }
)

Existing Collection

db.runCommand( {
    collMod: "recipes",
    validator: { $jsonSchema: {
         <<Validation Rules>>
        }
    }
} )

Inside the validator section of the document, we can explicitly state the fields and field types the document must have. We can define the values that a field may accept, a minimum and/or maximum number of items an array field may contain, and whether we are allowed to add additional fields to the document. An example should help clarify some of these features.

JSON Schema Validation

For the example schema, let's think about a collection of cooking recipes. The basic information we need in each recipe will be the name, the number of servings, and a list of ingredients. We'll make those required. We'll allow for an optional cooking_method field should we want to be able to find all recipes for items that are sauteed, for example. We'll create a new collection and set up our validation rules.

db.createCollection("recipes", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["name", "servings", "ingredients"],
      additionalProperties: false,
      properties: {
        _id: {},
        name: {
          bsonType: "string",
          description: "'name' is required and is a string"
        },
        servings: {
          bsonType: ["int", "double"],
          minimum: 0,
          description:
            "'servings' is required and must be an integer with a minimum of zero."
        },
        cooking_method: {
          enum: [
            "broil",
            "grill",
            "roast",
            "bake",
            "saute",
            "pan-fry",
            "deep-fry",
            "poach",
            "simmer",
            "boil",
            "steam",
            "braise",
            "stew"
          ],
          description:
            "'cooking_method' is optional but, if used, must be one of the listed options."
        },
        ingredients: {
          bsonType: ["array"],
          minItems: 1,
          maxItems: 50,
          items: {
            bsonType: ["object"],
            required: ["quantity", "measure", "ingredient"],
            additionalProperties: false,
            description: "'ingredients' must contain the stated fields.",
            properties: {
              quantity: {
                bsonType: ["int", "double", "decimal"],
                description:
                  "'quantity' is required and is of double or decimal type"
              },
              measure: {
                enum: ["tsp", "Tbsp", "cup", "ounce", "pound", "each"],
                description:
                  "'measure' is required and can only be one of the given enum values"
              },
              ingredient: {
                bsonType: "string",
                description: "'ingredient' is required and is a string"
              },
              format: {
                bsonType: "string",
                description:
                  "'format' is an optional field of type string, e.g. chopped or diced"
              }
            }
          }
        }
      }
    }
  }
});

If we look at what's been defined here, we have our required fields: name, servings, and ingredients. The additionalProperties: false rule prevents fields from being added beyond those that we explicitly state in our validation rule. We've also allowed an _id field in our document, which is important. If we did not list it in the schema, no document could be inserted at all, since _id is autogenerated and no document can exist in the database without it, as it is our primary key.

The name field is a required string. The servings field is required and must be an integer or double. Next is the optional cooking_method field; if it is included in the document, only the listed values are acceptable.

The ingredients field adds some complexity to the validation process. It has been defined as an array of items, each of which has a required quantity, measure, and ingredient. There is an optional format field as well to handle descriptions such as whole, diced, chopped, etc. The accepted data types have also been set for each ingredient field: integer, double, or decimal for quantity; one of the predefined enum values for measure; and string values for ingredient and format.

With the validation rules in place, let's try to insert some sample documents into the collection and see what would happen:

Document 1

db.recipes.insertOne({
  name: "Chocolate Sponge Cake Filling",
  servings: 4,
  ingredients: [
    {
      quantity: 7,
      measure: "ounce",
      ingredient: "bittersweet chocolate",
      format: "chopped"
    },
    { quantity: 2, measure: "cup", ingredient: "heavy cream" }
  ]
});

This insert succeeds because all of the required fields are present and have the expected types.

Document 2

db.recipes.insertOne({
  name: "Chocolate Sponge Cake Filling",
  servings: 4,
  ingredients: [
    {
      quantity: 7,
      measure: "ounce",
      ingredient_name: "bittersweet chocolate",
      format: "chopped"
    },
    { quantity: 2, measure: "cup", ingredient: "heavy cream" }
  ],
  directions:
    "Boil cream and pour over chocolate. Stir until chocolate is melted."
});

This insert fails with a WriteError: Document failed validation error. The additional directions field violates additionalProperties: false at the top level, and the first ingredient uses ingredient_name in place of the required ingredient field.

There are other rules that can be applied to the document as well, and schema validation can be carried out not only on sub-documents, as we've seen with ingredients, but also on arrays. Additionally, schema dependencies can be set to move application logic natively into the database. Validation strictness can also be tuned, so that a write which fails validation is either rejected outright or merely logged as a warning, as the example below shows.
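
As a quick illustration of tuning strictness, the validationAction and validationLevel options can be changed on the recipes collection with collMod; the combination below is just one possible choice:

db.runCommand({
  collMod: "recipes",
  validationAction: "warn",     // log a warning instead of rejecting the write
  validationLevel: "moderate"   // only validate inserts and updates to already-valid documents
});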

Conclusion

JSON Schema validation is a powerful tool for maintaining your data structure, and it pairs well with MongoDB's document model. We can rapidly try out different schema designs in an application and then, once the model has solidified, enforce some standards. We get to take advantage of both the flexibility of the document model and the safety of data validation.

Calling the MongoDB Atlas API - How to do it from Go

After last week's article on how to access the Atlas API with Node, Python, and Ruby, I was asked why I didn't mention Go (among other languages). Well, no need to worry. Here's an extra slab of Atlas API access in Go.

Stitching Sheets: Using MongoDB Stitch To Create An API For Data In Google Sheets

Thanks to MongoDB Stitch, it is easier than ever to integrate web services with MongoDB. In this example, we are going to use it to make calendar data flow between Google Sheets and MongoDB, complete with Google Sheets menus and an optional Slack bot to access the data in MongoDB.

Calling the MongoDB Atlas API - How to do it from Node, Python, and Ruby

The real power of a cloud-hosted, fully managed service like MongoDB Atlas is that you can create whole new database deployment architectures automatically, using the service's API. Getting to the MongoDB Atlas API is relatively simple and, once unlocked, it opens up a massive opportunity to integrate and automate the management of database deployments from creation to deletion. The API itself is an extensive REST API with role-based access control, and you can have user- or app-specific credentials to access it.

There is one tiny thing that can trip people up, though. The credentials have to be passed over using the digest authentication mechanism, not the more common basic authentication or an issued token. Digest authentication, at its simplest, starts with the client receiving an HTTP 401 (not authorized) response from the web endpoint. That response comes with a challenge, and the client then resends the request with a hashed form of the username, password, and challenge (the digest), which the server verifies.
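
To make that concrete, here is a minimal Node.js sketch, assuming the urllib npm package (which handles the 401 challenge and the hashed reply for you); the credentials are placeholders, and the /groups (projects) call is simply a convenient read-only endpoint to test against:

const urllib = require('urllib');

// Placeholder credentials: substitute your own Atlas API username and key
const username = 'ATLAS_USERNAME';
const apiKey = 'ATLAS_API_KEY';

// List the projects (groups) visible to these credentials
urllib.request('https://cloud.mongodb.com/api/atlas/v1.0/groups', {
  digestAuth: `${username}:${apiKey}`,  // urllib replays the request after the 401 challenge
  dataType: 'json'
}, (err, data, res) => {
  if (err) throw err;
  console.log(res.statusCode, data);
});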

The 2038 problem and how to solve it for MongoDB

This year, 2019, is halfway between 2000 and 2038. If you don't know, 2038 is going to be an interesting year for dates and times, much as 2000 was. 2038 is the year when the 32-bit signed integers that people have been using since the 1970s to represent time will roll over: 2,147,483,647 seconds will have passed since 1 January 1970, and rolling over means the signed value flips to its largest negative value.
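
The arithmetic is easy to check in any JavaScript runtime, the mongo shell included; the largest signed 32-bit value maps to a moment early on 19 January 2038:

// 2^31 - 1 seconds after the Unix epoch
var maxInt32 = Math.pow(2, 31) - 1;   // 2147483647
new Date(maxInt32 * 1000);            // Tue, 19 Jan 2038 03:14:07 UTC -- the rollover moment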

Build a Slack App in 10 minutes with MongoDB Stitch

Slack is not only the fastest-growing startup in history; it's also the name of that company's app, one of the most popular communication tools in use today. We use it extensively at MongoDB to foster efficient communication between teams and across the company. We're not alone. It seems like every developer I encounter uses it at their company as well.

One interesting thing about Slack (and there are many) is its extensibility. There are several ways to extend Slack: building chatbots, building applications that interface with the communication service, and adding extra commands, called "slash commands", that enable Slack users to communicate with external services. In this article, we'll build a simple slash command that enables users to store and retrieve data in a MongoDB database. I'm always finding interesting information on the internet that I want to share with my team members, so let's build an application we'll call URL Stash that stores interesting URLs for later retrieval via a Slack slash command.
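
For a rough sense of what's involved, a Stitch incoming-webhook function along these lines could receive the slash command and stash the link. This is only a sketch: the "urlstash" database and "links" collection names are placeholders, form decoding is simplified, and it assumes the webhook is configured to respond with the function's result.

// Sketch of a Stitch HTTP-service incoming webhook
exports = function(payload) {
  // Slack posts slash-command data as form-encoded text in the request body,
  // e.g. "user_name=jane&text=https%3A%2F%2Fexample.com"
  var params = {};
  payload.body.text().split("&").forEach(function(pair) {
    var parts = pair.split("=");
    params[parts[0]] = decodeURIComponent((parts[1] || "").replace(/\+/g, " "));
  });

  // "urlstash" and "links" are placeholder names for this sketch
  var links = context.services.get("mongodb-atlas").db("urlstash").collection("links");

  return links.insertOne({ url: params.text, user: params.user_name, added: new Date() })
    .then(function() {
      return { text: "Stashed: " + params.text };  // Slack displays the "text" field back to the user
    });
};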

Five Minute MongoDB - Change Streams and MongoDB 4.x

Change Streams are a powerful tool in MongoDB for monitoring changes in a collection's documents. They got even more powerful in MongoDB 4.0, enabling you to act on changes to any document in any collection in any database in your MongoDB deployment. Read this Five Minute MongoDB to find out how.
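
For a flavour of the shell syntax, here is a collection-level stream (using the recipes collection from earlier purely as an example); the database- and deployment-wide forms noted in the comments are the MongoDB 4.0 additions:

// Open a change stream on a single collection (available since MongoDB 3.6)
var watchCursor = db.recipes.watch([ { $match: { operationType: "insert" } } ]);
while (!watchCursor.isExhausted()) {
  if (watchCursor.hasNext()) {
    printjson(watchCursor.next());   // one change event per inserted document
  }
}

// MongoDB 4.0 widens the scope:
//   db.watch()              -- every collection in the current database
//   db.getMongo().watch()   -- every database in the deployment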

Five Minute MongoDB: Why Documents?

The document is the natural representation of data. We only broke data up into rows and columns back in the 70s as a way to optimize data access. Back then, storage and compute power were expensive, so it made sense to spend developer time reducing the data set into a schema of rows and columns, interlinked by relationships and then normalized between tables to reduce duplication. This process was cost-effective then, and so it came to dominate database thinking.

That domination means that many people accept the burden of defining rows and columns as an essential part of using databases. In many ways, though, relational databases still expect the designer and developer to pre-chew the data for easier processing by the database.

The Document Alternative

MongoDB Q&A: What's the deal with data integrity in relational databases vs MongoDB?

Previously in MongoDB Q&A, we looked at agile development and MongoDB. This time, it's all about data integrity...

I've been doing a lot of reading lately on relational vs non-relational databases, investigating the typical reasons why you might pick one over the other. A quick Google search of "relational vs non-relational databases" returns over 18 million results. Digging into that massive pile of results brought up a few key themes around why you would select a non-relational database: horizontal scaling, performance, managing unstructured and polymorphic data, and minimal upfront planning.