Structuring Data with Serde in Rust

Published: Feb 19, 2021

  • Atlas
  • Rust

By Isabel Atkinson

#Introduction

This post details new upgrades in the Rust MongoDB Driver and BSON library to improve our integration with Serde. In the Rust Quick Start blog post, we discussed the trickiness of working with BSON, which has a dynamic schema, in Rust, which uses a static type system. The MongoDB Rust driver and BSON library use Serde to make the conversion between BSON and Rust structs and enums easier. In the 1.2.0 releases of these two libraries, we've included new Serde integration to make working directly with your own Rust data types more seamless and user-friendly.

#Prerequisites

This post assumes that you have a recent version of the Rust toolchain installed (v1.44+), and that you're comfortable with Rust syntax. It also assumes you're familiar with the Rust Serde library.

#Driver Changes

The 1.2.0 Rust driver release introduces a generic type parameter to the Collection type. The generic parameter represents the type of data you want to insert into and find from your MongoDB collection. Any Rust data type that derives/implements the Serde Serialize and Deserialize traits can be used as a type parameter for a Collection.

For example, I'm working with the following struct that defines the schema of the data in my students collection:

use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize)]
struct Student {
    name: String,
    grade: u32,
    test_scores: Vec<u32>,
}

I can create a generic Collection by using the Database::collection_with_type method and specifying Student as the data type I'm working with.

let students: Collection<Student> = db.collection_with_type("students");

Prior to the introduction of the generic Collection, the various CRUD Collection methods accepted and returned the Document type. This meant I would need to serialize my Student structs to Documents before inserting them into the students collection. Now, I can insert a Student directly into my collection:

let student = Student {
    name: "Emily".to_string(),
    grade: 10,
    test_scores: vec![98, 87, 100],
};
let result = students.insert_one(student, None).await;

I can also find a Student directly from my Collection:

// student is of type Option<Student>
let student = students.find_one(doc! { "name": "Emily" }, None).await?;

I may decide that I want to insert a different type of data into the students collection at some point. Although my students collection is restricted to the Student data type, I can easily create a clone of the students collection with a new type parameter:

let students: Collection<CollegeStudent> = students.clone_with_type();

The default generic type for Collection is Document. This means that any Collection created without a generic type will continue to find and return the Document type, and any existing code that uses Collection will be unaffected by these changes.
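To illustrate why a default type parameter keeps existing code compiling, here is a minimal, std-only sketch. The `Collection`, `Document`, and `Student` definitions below are hypothetical stand-ins written for this example, not the driver's real types; only the shape of the generics mirrors what is described above.

```rust
use std::collections::HashMap;
use std::marker::PhantomData;

// Hypothetical stand-in for the BSON library's Document type.
type Document = HashMap<String, String>;

// Hypothetical stand-in for the Student struct used in this post.
#[allow(dead_code)]
struct Student {
    name: String,
    grade: u32,
}

// A collection handle that is generic over the data type it stores.
// `T = Document` is a default, so plain `Collection` means `Collection<Document>`.
struct Collection<T = Document> {
    name: String,
    _marker: PhantomData<T>,
}

impl<T> Collection<T> {
    fn new(name: &str) -> Self {
        Collection { name: name.to_string(), _marker: PhantomData }
    }

    // Returns a handle to the same collection, typed for a different data type.
    fn clone_with_type<U>(&self) -> Collection<U> {
        Collection { name: self.name.clone(), _marker: PhantomData }
    }

    fn name(&self) -> &str {
        &self.name
    }
}
```

With these definitions, `let untyped: Collection = Collection::new("students");` still type-checks without naming a type parameter, and `let typed: Collection<Student> = untyped.clone_with_type();` produces a handle to the same collection with a different type.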

#BSON Changes

The 1.2.0 release also includes changes to the Rust BSON library that improve usability when working with Serde.

#Serde Helper Functions

Sometimes you may want to serialize or deserialize data in your structs or enums differently than the default behavior. Serde provides serialize_with and deserialize_with attributes that allow you to specify functions to use for serialization and deserialization on specific fields and variants.

The BSON library now includes a set of functions that implement common strategies for custom serialization and deserialization when working with BSON. You can use these functions by importing them from the serde_helpers module in the bson crate and using the serialize_with and deserialize_with attributes. A few of these functions are detailed below.

Some users prefer to represent the object ID field in their data with a hexadecimal string rather than the BSON library ObjectId type:

#[derive(Serialize, Deserialize)]
struct Item {
    oid: String,
    // rest of fields
}

We've introduced a function in the serde_helpers module, serialize_hex_string_as_object_id, that serializes a hex string as an ObjectId. I can apply it to my oid field using the serialize_with attribute:

#[derive(Serialize, Deserialize)]
struct Item {
    #[serde(serialize_with = "serialize_hex_string_as_object_id")]
    oid: String,
    // rest of fields
}

Now, if I serialize an instance of the Item struct into BSON, the oid field will be represented by an ObjectId rather than a string.
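For a sense of what this conversion involves: an ObjectId is 12 bytes, so its hex string form is 24 hexadecimal characters. The following std-only sketch parses such a string into raw bytes. The function name `object_id_bytes_from_hex` is made up for this example, and this is not the helper's actual implementation.

```rust
// Parses a 24-character hexadecimal string into the 12 raw bytes of an ObjectId.
// Returns None if the length is wrong or any character is not a hex digit.
fn object_id_bytes_from_hex(hex: &str) -> Option<[u8; 12]> {
    if hex.len() != 24 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let mut bytes = [0u8; 12];
    for (i, byte) in bytes.iter_mut().enumerate() {
        // Each pair of hex digits encodes one byte.
        *byte = u8::from_str_radix(&hex[2 * i..2 * i + 2], 16).ok()?;
    }
    Some(bytes)
}
```

A string that is not valid ObjectId hex cannot be converted this way, so a conversion like this has to be able to fail on such input.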

We've also introduced modules that take care of both serialization and deserialization. For instance, I might want to represent binary data using the Uuid type in the Rust uuid crate:

#[derive(Serialize, Deserialize)]
struct Item {
    uuid: Uuid,
    // rest of fields
}

Since BSON doesn't have a specific UUID type, I'll need to convert this data into binary if I want to serialize into BSON. I'll also want to convert back to Uuid when deserializing from BSON. The uuid_as_binary module in the serde_helpers module can take care of both of these conversions. I'll add the following attribute to use this module:

#[derive(Serialize, Deserialize)]
struct Item {
    #[serde(with = "uuid_as_binary")]
    uuid: Uuid,
    // rest of fields
}

Now, I can work directly with the Uuid type without needing to worry about how to convert it to and from BSON!
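Conceptually, a module used with the with attribute supplies both directions of the conversion. The std-only sketch below shows the two directions for a UUID's 16 bytes. The `Binary` struct and both function names are hypothetical stand-ins for illustration; they are not the bson or uuid crate types, and the real module works through Serde's serializer and deserializer traits instead.

```rust
use std::convert::TryFrom;

// Hypothetical stand-in for a BSON binary value: a subtype tag plus raw bytes.
#[derive(Debug, Clone, PartialEq)]
struct Binary {
    subtype: u8,
    bytes: Vec<u8>,
}

// Subtype 0x04 is the BSON specification's tag for UUID binary data.
const UUID_SUBTYPE: u8 = 0x04;

// Serialization direction: wrap the 16 UUID bytes in a binary value.
fn uuid_to_binary(uuid: &[u8; 16]) -> Binary {
    Binary { subtype: UUID_SUBTYPE, bytes: uuid.to_vec() }
}

// Deserialization direction: recover the 16 bytes, rejecting other lengths.
fn binary_to_uuid(binary: &Binary) -> Result<[u8; 16], String> {
    <[u8; 16]>::try_from(binary.bytes.as_slice())
        .map_err(|_| format!("expected 16 bytes, got {}", binary.bytes.len()))
}
```

Round-tripping a value through `uuid_to_binary` and `binary_to_uuid` returns the original 16 bytes, which is the property the paired serialize and deserialize functions need to preserve.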

The serde_helpers module introduces functions for several other common strategies; you can find the full list in the serde_helpers module documentation.

#Unsigned Integers

The BSON specification defines two integer types: a signed 32 bit integer and a signed 64 bit integer. This can present challenges when you attempt to insert data with unsigned integers into your collections.

My Student struct from the previous example contains unsigned integers in the grade and test_scores fields. Previous versions of the BSON library would return an error if I attempted to serialize an instance of this struct into a Document, since there isn't always a clear mapping between unsigned and signed integer types. However, many unsigned integers can fit into signed types! For example, I might want to create the following student:

let student = Student {
    name: "Alyson".to_string(),
    grade: 11,
    test_scores: vec![89, 92, 99],
};

While the numbers in the grade and test_scores fields are technically unsigned 32 bit integers, they can be converted to signed 32 bit integers without losing any data.

The 1.2.0 release of the BSON library no longer returns an error when attempting to serialize unsigned integers into BSON, and instead tries to perform a lossless conversion into one of the signed integer BSON types. If the unsigned integer cannot be converted to a signed 32 bit or 64 bit integer, serialization will still return an error.
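The idea of a lossless conversion can be sketched with the standard library's TryFrom, which only succeeds when the value fits in the target type. The `BsonInt` enum and `unsigned_to_bson_int` function below are hypothetical names for this example. This is an illustration of the approach, not the library's implementation, and the library's choice of target type for each unsigned width may differ.

```rust
use std::convert::TryFrom;

// Hypothetical stand-in for the two BSON integer types.
#[derive(Debug, PartialEq)]
enum BsonInt {
    Int32(i32),
    Int64(i64),
}

// Tries the smaller signed type first, then falls back to the larger one.
// Returns an error when the value fits in neither.
fn unsigned_to_bson_int(value: u64) -> Result<BsonInt, String> {
    if let Ok(small) = i32::try_from(value) {
        return Ok(BsonInt::Int32(small));
    }
    i64::try_from(value)
        .map(BsonInt::Int64)
        .map_err(|_| format!("{} does not fit in a signed 64 bit integer", value))
}
```

Under this sketch, a grade of 11 maps to a 32 bit integer, a value such as 3,000,000,000 needs the 64 bit type, and u64::MAX produces an error because no signed BSON integer can hold it.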

Now, I can convert my student struct into a Document:

let doc = to_document(&student)?;

Printing the resulting Document with println! shows the following:

Document({
    "name": String("Alyson"),
    "grade": Int32(11),
    "test_scores": Array([Int32(89), Int32(92), Int32(99)])
})

#Conclusion

Serde is a powerful tool that provides lots of functionality for customizing the way you convert between different data formats. In the 1.2.0 releases of the Rust driver and BSON library, we've made it even easier to work directly with your Rust data types. If you're interested in more complex mapping capabilities, it's worth reading the Serde documentation on attributes. For more details on working with MongoDB in Rust, you can check out the documentation for the Rust driver and BSON library. We also happily accept contributions in the form of GitHub pull requests. Please see the section in our README for info on how to run our tests.

If you have questions, please head to our developer community website where the MongoDB engineers and the MongoDB community will help you build your next big idea with MongoDB.
