Blog
{Blog}  Join us at AWS re:Invent 2022 Nov. 28 - Dec. 2 to learn how to build the next big thing on MongoDB and AWS

An Introduction to Data Persistence

An application creates data. Then, that application quits. What happens to the data? This is where data persistence comes into play. This introductory article will provide you with a brief and very basic introduction to help you get started and understand the basic vocabulary used in regard to data persistence.

What is data persistence?

Data persistence is the longevity of data after the application that created it has been closed. In order for this to happen, the data must be written to non-volatile storage — a type of memory that can retain that information long-term, even if the application is no longer running.

When data is persisted, it means that the exact same data can be retrieved by an application when it’s opened again. The data remains the same, and nothing is lost in between sessions.

There are different ways to persist data, which brings us to an important distinction: local storage versus remote storage. Think of a Word document that you open to write an article like this one. When you’re done writing for the day, you click on Save, choose a location on your computer, and you’re done. The data — in this case, your Word document — gets saved on your disk and can be accessed the next time you start your computer again.

Now, what if you wanted to edit this Word document on your laptop rather than on your desktop? Therein lies the difference between local and remote storage, because remote persisted storage would allow this. Let’s say the location you saved your Word document in is actually not just any folder but your Google Drive folder on your disk. The Google Drive application will then automatically send this file to your Google Drive Cloud to save it remotely and securely, and not locally.

Long story short, data persistence is the difference between data getting lost between sessions and data remaining between sessions.

Persist data at the user level vs. application level

The Word document we talked about in the previous paragraph is persisted on what we call a user level. Only you have access to this file. On the other hand, we have the application level. This is data that is not just saved for a single user but for everyone using the same application.

Think of a comment that you put below a YouTube video. Anyone can access the comment box and anyone can see the comment. Comments are saved on the application level.

Why is data persistence important?

Apart from keeping your own data safe, you might still wonder why using persistence in your application is important.

Imagine using an app that asks you for some information about yourself when you start it the first time, like your name and your email address, and maybe more. If you had to close the application for the moment and open it again only to have to re-enter your data, you wouldn’t be very happy about the user experience. Persisting the already-provided data of a user and making it visible the next time they want to continue the process is one of the most basic use cases of data persistence, and it can help improve the UX greatly.

Performance is another important factor in this equation. Imagine using an email app on your smartphone. And every time you open the app, you have to download each and every email. By persisting those emails locally on your device, you avoid having to use up your data plan every time you want to open your inbox. This is another example of local storage persistence.

There are many different solutions out there, and it depends on your specific use case which approach is the best. From relational databases over object-based databases to document databases like MongoDB, the world of data persistence is waiting for you!

Why is persistence important in programming?

Persistence is important in programming because it means that data that is persisted to storage that can be accessed from different applications, different devices, and different operating systems. Think about trying to access your data from your web browser as well as your Xbox. Both access mechanisms were programmed differently but are working with the same data.

Let’s talk again more specifically about the client-side benefits. From a UX perspective, data persistence means that you don’t have to ask the user for the same information more than once. In terms of performance, the app doesn’t need to download the same data over and over again. Server-side persistence is also beneficial. For example, Google Drive saves everything in the cloud and only downloads it when needed. This saves disk space on the client side.

Persistent vs non-persistent data

Persistent data always maintains the last version (or even multiple versions) of itself even after it’s been modified. If data is persistent, then it’s sustained even if the process, cluster, node, or container is changed or removed.

If data is non-persistent, this means that it can be altered, removed, or lost if the process, cluster, node, or container is changed or removed.

How do you ensure data persistence?

There are various ways you can make sure that data persists. One of the more common methods is to take periodic snapshots to your disk at a configurable interval. Here are a few other approaches:

  • Saving data in a file.
  • Saving your data to a local device database like MongoDB Realm.
  • Sending data to a server.
  • Saving data to external storage, like a USB stick.
  • Saving your data in a remote database like MongoDB Atlas.
  • Printing out your data.

What are the four device states for persistent storage acquisition?

The four device states for persistent storage acquisition are pure in-memory, in-memory, disk-based, and commitlog-based.

  • Pure in-memory storage: This potentially offers the least amount of persistence.
  • In-memory storage: This takes periodic snapshots in order to maintain persistence.
  • Disk-based storage: This triggers update-in-place writes.
  • Commitlog-based storage: This includes all traditional OLTP databases.

Can data persistence be achieved using a database system?

Yes! Databases are one of the most common ways of achieving data persistence. Whether you’re using a document-based database, like MongoDB, or a relational database system, you can achieve data persistence.