What is Time Series Data Management?

Time series data is a collection of data points that are indexed and queried by time.

Relational and non-relational databases have timestamp data types to store time-related data. Time series databases are specifically designed for time series data management. In this article, we discuss the importance of a time series database and how it works.

Time series data

If you think about it, almost all the data that we store has a timestamp attached to it. For example, log files, customer login times, sensor data from IoT devices, traffic data, weather data, and browser history all have timestamps attached.

CUSTOMER_TRANSACTIONS
2022-04-14 11:25:25 Login attempt
2022-04-14 11:25:26 Login success
2022-04-14 11:26:03 Browse category accessories
2022-04-14 11:27:04 Added 2 items in cart
2022-04-14 11:28:02 Browse category electronics

Time series data can be measured in seconds and minutes (like sensor-based devices), hourly (like phone usage), daily (petrol price), weekly (timesheets), monthly (electricity consumption), quarterly (performance reports), half-yearly (company growth), or annually (profits and revenue). Time series data can be at regular intervals or event-driven (irregular):

Event-driven time series data

Date          Diesel price (in US$)
16-04-2022    5.00
17-04-2022    6.00
18-04-2022    5.60
18-04-2022    5.12
21-04-2022    5.10
22-04-2022    5.34
24-04-2022    5.45
25-04-2022    5.32
26-04-2022    6.17
26-04-2022    6.08
29-04-2022    5.95
30-04-2022    5.13
02-05-2022    4.99

Line graph depicting event-driven time-series data.

Regular time series data

Date          Diesel price (in US$)
16-04-2022    5.00
17-04-2022    6.00
18-04-2022    5.36
19-04-2022    5.12
20-04-2022    5.12
21-04-2022    5.10
22-04-2022    5.34
23-04-2022    5.34
24-04-2022    5.45
25-04-2022    5.32
26-04-2022    6.13
27-04-2022    6.13
28-04-2022    6.13
29-04-2022    5.95
30-04-2022    5.13
01-05-2022    5.09
02-05-2022    4.99

Line graph depicting regular time-series data.

In event-driven time series data, a new row is inserted only if there is a change in the price of diesel (event). In regular time series data, the price is checked at regular intervals. When plotted, time series data will always have one time axis.
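To make the difference concrete, here is a minimal Python sketch that turns event-driven rows into a regular daily series by carrying the last known price forward. The function name to_regular_daily and the carry-forward policy are assumptions made for illustration. They show one possible way to resample and are not meant to reproduce the regular table above.

```python
from datetime import date, timedelta

# Event-driven observations (date, price): the first rows of the diesel table.
events = [
    (date(2022, 4, 16), 5.00),
    (date(2022, 4, 17), 6.00),
    (date(2022, 4, 18), 5.60),
    (date(2022, 4, 18), 5.12),
    (date(2022, 4, 21), 5.10),
]

def to_regular_daily(events):
    """Resample event-driven points to one value per day (hypothetical helper).

    Policy (an assumption): keep the last event of each day and carry the
    most recent value forward over days that have no event.
    """
    events = sorted(events, key=lambda e: e[0])  # stable sort keeps same-day order
    last_by_day = {}
    for day, price in events:
        last_by_day[day] = price  # later events on the same day overwrite earlier ones
    start, end = events[0][0], events[-1][0]
    regular, current, day = [], None, start
    while day <= end:
        current = last_by_day.get(day, current)
        regular.append((day, current))
        day += timedelta(days=1)
    return regular

regular = to_regular_daily(events)
for day, price in regular:
    print(day.strftime("%d-%m-%Y"), f"{price:.2f}")
```

Other policies, such as averaging the events of a day or interpolating between events, are equally valid. The right choice depends on what the series is used for.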

The above examples are of linear time series data, where each point can be modeled as a linear combination of other points in the series, and which can be analyzed using regression, autocorrelation, and other methods.
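As an illustration of one such method, the following sketch computes the sample autocorrelation of the regular diesel series at a given lag. The function name autocorrelation is our own, and the formula is the common sample estimator (covariance at the lag divided by the total variance), which is an assumption about which variant you would want.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given lag (0 <= lag < len(series))."""
    n = len(series)
    mean = sum(series) / n
    # Total variation around the mean.
    denom = sum((x - mean) ** 2 for x in series)
    # Co-variation between each point and the point `lag` steps earlier.
    numer = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return numer / denom

# Daily diesel prices from the regular time series table (16-04-2022 to 02-05-2022).
diesel = [5.00, 6.00, 5.36, 5.12, 5.12, 5.10, 5.34, 5.34, 5.45,
          5.32, 6.13, 6.13, 6.13, 5.95, 5.13, 5.09, 4.99]

r1 = autocorrelation(diesel, 1)
print(f"lag-1 autocorrelation: {r1:.3f}")
```

A value close to 1 would suggest that today's price is a good predictor of tomorrow's, while a value near 0 would suggest little linear dependence between consecutive days.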

Time series database

Databases that provide special features to efficiently handle (store, manipulate, and retrieve) time series data are called time series databases. Some popular time series databases are Prometheus, InfluxDB, and TimescaleDB. Databases like MongoDB provide time series collections to handle time series data, so you can get the benefits of both a time series and a non-relational database in one.

How is data stored in a time series database?

As shown in the above example, data in a time series database has a timestamp and at least one metric related to it. For example, the diesel price was $5.45 (metric) on 24-04-2022. We can add more metrics as well—for example, petrol price, stock prices, or the number of cars visiting the state museum.

Date          Diesel price    Petrol price    Stock price    Number of cars visiting the state museum
24-04-2022    5.45            7.13            100            780

This way, we can store any amount of data that changes with time. Time series data is almost always appended rather than updated or deleted. That means databases can face very heavy write workloads, and indexes alone may not be enough for optimization. Also, more often than not, you would want statistics or aggregates collected over a time period, such as the average diesel price from 24-04-2022 to 01-05-2022. Time series databases are optimized for this kind of workload and provide specialized functions for it.
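The storage model and the aggregate mentioned above can be sketched in a few lines of Python. This is only an in-memory illustration, not how any particular database stores data. The names measurements, append, and average are hypothetical, and the diesel values are taken from the regular time series table.

```python
from datetime import date

# Append-only store: each entry has one timestamp plus any number of metrics.
measurements = []

def append(day, **metrics):
    """Add a new measurement; existing entries are never updated or deleted."""
    measurements.append({"timestamp": day, **metrics})

def average(metric, start, end):
    """Average of one metric over an inclusive time window, or None if no data."""
    values = [m[metric] for m in measurements
              if start <= m["timestamp"] <= end and metric in m]
    return sum(values) / len(values) if values else None

append(date(2022, 4, 24), diesel=5.45, petrol=7.13, stock=100, museum_cars=780)
append(date(2022, 4, 25), diesel=5.32)
append(date(2022, 4, 26), diesel=6.13)
append(date(2022, 4, 27), diesel=6.13)
append(date(2022, 4, 28), diesel=6.13)
append(date(2022, 4, 29), diesel=5.95)
append(date(2022, 4, 30), diesel=5.13)
append(date(2022, 5, 1), diesel=5.09)
append(date(2022, 5, 2), diesel=4.99)

avg_diesel = average("diesel", date(2022, 4, 24), date(2022, 5, 1))
print(f"average diesel price: {avg_diesel:.4f}")
```

The linear scan in average is fine for a handful of rows. A time series database avoids it by organizing data around the time field.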

How does a time series database work?

Time series databases store data as time-value pairs for easy analysis and querying. Time series databases can efficiently handle concurrent series (that is, multiple metrics in parallel), which makes them well suited for banking and financial transactions.

There are three aspects of a time series database: database features, time series features, and data features.

Database features

This includes the basic CRUD (Create, Read, Update, and Delete) features, as well as features like high availability, scalability, and reliability. The database should be able to handle a large volume of writes, while reads and updates typically target particular time windows.

Time-series features

The time is stored as a timestamp, with a precision of seconds or milliseconds. Dates can be stored in various formats using the DateTime data type. Timestamps support calendar and time zone adaptation. Time series databases also provide support for getting aggregations and statistics about the data based on time.
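Here is a small Python sketch of what millisecond precision and time zone adaptation look like in practice. The helper names to_epoch_ms and from_epoch_ms are hypothetical, the .123 millisecond part is made up for the example, and the +05:30 offset is just an illustrative local zone. Storing timestamps in UTC and converting only for display is a common convention, not a requirement of any specific database.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def to_epoch_ms(ts):
    """Convert an aware datetime to whole milliseconds since the Unix epoch."""
    return (ts - EPOCH) // timedelta(milliseconds=1)

def from_epoch_ms(ms):
    """Convert milliseconds since the Unix epoch back to a UTC datetime."""
    return EPOCH + timedelta(milliseconds=ms)

# The "Login attempt" event from the log above, with an assumed millisecond part.
event = datetime(2022, 4, 14, 11, 25, 25, 123000, tzinfo=timezone.utc)

stored = to_epoch_ms(event)      # integer, safe to store and index
restored = from_epoch_ms(stored)  # same instant, still in UTC

# Adapt to a local time zone only when presenting the value.
local = restored.astimezone(timezone(timedelta(hours=5, minutes=30)))
print(stored, restored.isoformat(), local.isoformat())
```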

Data features

Data is appended in time order and is stored as time, value, and events. Data can have many dimensions. The data often does not require relationships between entries of different tables, and older data is purged or compressed and archived.
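The idea of compressing older data can be sketched as a simple retention policy: recent points are kept as they are, while older points are rolled up into one average per day. The function name apply_retention, the seven-day window, and the times of day in the sample data are all assumptions made for this example. Real databases use their own compression and expiry mechanisms.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def apply_retention(points, now, keep_raw_for):
    """Keep recent points as-is; roll older points up into one average per day."""
    cutoff = now - keep_raw_for
    recent = [(ts, value) for ts, value in points if ts >= cutoff]
    by_day = defaultdict(list)
    for ts, value in points:
        if ts < cutoff:
            by_day[ts.date()].append(value)
    archived = [(day, sum(vs) / len(vs)) for day, vs in sorted(by_day.items())]
    return recent, archived

# Diesel price events; the dates come from the event-driven table,
# the times of day are invented for illustration.
points = [
    (datetime(2022, 4, 18, 9, 0), 5.60),
    (datetime(2022, 4, 18, 17, 0), 5.12),
    (datetime(2022, 4, 21, 9, 0), 5.10),
    (datetime(2022, 4, 30, 9, 0), 5.13),
    (datetime(2022, 5, 2, 9, 0), 4.99),
]

recent, archived = apply_retention(
    points, now=datetime(2022, 5, 3), keep_raw_for=timedelta(days=7)
)
print("raw:", len(recent), "points; archived:", archived)
```

Purging instead of archiving would simply mean discarding the archived list.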

Benefits of time series databases

A time series database (TSD) contains special tools and features to handle huge loads of time-series data. Some major benefits are:

Efficient, consistent, and cost-effective data storage

A TSD provides tools and features to ingest and store data at very high speed. It also provides compression algorithms for older data, which can still be retrieved when needed.

Faster data querying

TSD is indexed on time, making it easy to get data based on a certain period of time. This is particularly useful to analyze IoT data, financial data, weather forecasting, and many other real-life use cases.
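To see why an index on time helps, consider this minimal sketch. Because the timestamps are kept sorted, a binary search finds both ends of a time window without scanning every row. The function name query_range is hypothetical, and a sorted Python list stands in for a real index structure, which is a simplification.

```python
from bisect import bisect_left, bisect_right
from datetime import date

# Timestamps in ascending order, with prices in a parallel list
# (values from the regular diesel table).
days = [date(2022, 4, 24), date(2022, 4, 25), date(2022, 4, 26),
        date(2022, 4, 27), date(2022, 4, 28), date(2022, 4, 29),
        date(2022, 4, 30), date(2022, 5, 1), date(2022, 5, 2)]
prices = [5.45, 5.32, 6.13, 6.13, 6.13, 5.95, 5.13, 5.09, 4.99]

def query_range(start, end):
    """Return (day, price) pairs with start <= day <= end using binary search."""
    lo = bisect_left(days, start)   # first position with day >= start
    hi = bisect_right(days, end)    # first position with day > end
    return list(zip(days[lo:hi], prices[lo:hi]))

window = query_range(date(2022, 4, 26), date(2022, 4, 28))
for day, price in window:
    print(day.strftime("%d-%m-%Y"), f"{price:.2f}")
```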

Real-time streaming

As data is sent at regular intervals and writes are fast and consistent, data can be sent to a streaming engine to perform real-time analytics and visualization. TSD also allows for data mining as it can scale, and huge amounts of data can be stored as the requirement grows.

Computing and processing features

TSD contains many functions like aggregation, grouping, comparison, machine learning, and other similar functions to perform complex analysis on the data. These functions are optimized for performance and help in faster decision-making.
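As a rough idea of what time-based grouping and aggregation involve, the sketch below buckets the daily diesel prices by month and computes a few statistics per bucket. The function name summarize_by_month and the choice of monthly buckets are assumptions made for illustration. A time series database would run this kind of operation inside the engine rather than in application code.

```python
from collections import defaultdict
from datetime import date, timedelta

# Regular daily diesel prices, 16-04-2022 to 02-05-2022.
values = [5.00, 6.00, 5.36, 5.12, 5.12, 5.10, 5.34, 5.34, 5.45,
          5.32, 6.13, 6.13, 6.13, 5.95, 5.13, 5.09, 4.99]
series = [(date(2022, 4, 16) + timedelta(days=i), v) for i, v in enumerate(values)]

def summarize_by_month(series):
    """Group (day, value) pairs by (year, month) and aggregate each group."""
    buckets = defaultdict(list)
    for day, value in series:
        buckets[(day.year, day.month)].append(value)
    return {
        key: {
            "count": len(vs),
            "min": min(vs),
            "max": max(vs),
            "mean": sum(vs) / len(vs),
        }
        for key, vs in buckets.items()
    }

summary = summarize_by_month(series)
for (year, month), stats in sorted(summary.items()):
    print(f"{year}-{month:02d}", stats)
```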

Efficient data lifecycle management

It’s easier to pull out reports and summaries for a period of time, as TSD is already optimized for getting precise reports calculated over a period of time, particularly if some metrics like percentile, max, min, and trends are needed.

Use cases for time series database

Initially, TSDs were intended mainly for financial purposes. However, with digitization and the popularity of smart devices, the number of use cases for time series databases has grown.

  • IoT (Internet of Things): Smart home and wearable devices, mobile phones, and inventory management systems keep track of every activity and keep sending data for generating alerts and patterns to track usage and set goals.

  • Sales forecasting: Based on data for a period of time, sales teams can generate reports and summaries, predict the performance and trends for the next quarter or year, and suggest improvements.

  • Financial trends: Making financial predictions, such as stock market predictions, is easier with a time series database, as it stores a lot of contextual data that can be cross-referenced later for analysis.

  • Data summary and reporting: Using time series features of a TSD, you can get a summary of data for different times in a more optimized way. You can get accurate reports, based on the smallest measurements of time (like milliseconds).

Time series database requirements

A time series database should satisfy the following requirements:

  • High capacity to handle huge volumes of data.
  • In-memory interactive dashboards for live updates, trends, events, and alerts.
  • Quick querying capabilities and access to Machine Learning and Artificial Intelligence algorithms for analytics.
  • Support for standard queries and custom statistical functions for working with time series data.
  • Ability to handle multiple logins, multiple queries, and high load.

MongoDB’s data platform provides all of these features and is quite suitable for handling time series data. You can access MongoDB data from anywhere using MongoDB Atlas, MongoDB’s cloud-based application data platform.

Ready to get started?

MongoDB provides advanced query capabilities as well as time series collections, which makes it suitable for handling time-series data. Being a non-relational database, MongoDB offers better scalability and performance. With MongoDB Atlas, you can access and query data from anywhere in the world.

FAQs

How is time series data treated?

Time series data always has one of the axes as time. The other metrics can be anything, like stock price, diesel price, number of users that visited a museum, and so on. Time series data is queried based on time to retrieve data for a period of time.

What is time series data structure?

Time series data is time-stamped data arranged as a sequence of data points indexed in the order of time. Examples include the daily petrol price, average monthly wages of employees, and hourly Facebook logins. Mathematically, a time series can be represented as a variable x = f(t), where f is a function of time t.

What are the 4 main components of time series?

The four components of time series are based on different aspects of movement of time series:

  • Trends: Trends show the increase and decrease over a period of time—for example, population, items in inventory, and number of schools opened.

  • Seasonal variations: These are regular periodic variations observed during one year—for example, the sale of geysers and air coolers, and the number of weddings in a particular period.

  • Cyclical fluctuations: These are time series variations that span more than a year. Cyclic variations form a complete cycle and return to the start state, with oscillations. Examples include business cycles and weather cycles.

  • Irregular variations: These are unforeseen variations in a regular time series—for example, the impact of floods on crop production and the sudden collapse of a warehouse.
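The components above can be separated with a classical additive decomposition. The following is a simplified sketch under several assumptions: the series is synthetic, the model is additive (value = trend + seasonal + irregular), the seasonal period is odd so a plain centered moving average works, and cyclical movements are not separated from the trend. The function name decompose_additive is our own.

```python
def decompose_additive(series, period):
    """Split a series into trend, seasonal, and residual parts (odd period only)."""
    assert period % 2 == 1, "this sketch only handles odd periods"
    half = period // 2
    n = len(series)

    # Trend: centered moving average over one full period (undefined at the edges).
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half : t + half + 1]
        trend[t] = sum(window) / period

    # Seasonal: average detrended value for each position within the period.
    buckets = [[] for _ in range(period)]
    for t in range(n):
        if trend[t] is not None:
            buckets[t % period].append(series[t] - trend[t])
    seasonal = [sum(b) / len(b) for b in buckets]

    # Irregular: whatever remains after removing trend and seasonal parts.
    residual = [
        None if trend[t] is None else series[t] - trend[t] - seasonal[t % period]
        for t in range(n)
    ]
    return trend, seasonal, residual

# Synthetic series: a rising trend plus a repeating three-step pattern, no noise.
pattern = [1.0, 0.0, -1.0]
series = [10 + 0.5 * t + pattern[t % 3] for t in range(12)]

trend, seasonal, residual = decompose_additive(series, period=3)
print("seasonal pattern:", [round(s, 6) for s in seasonal])
```

On real data the residual would not be zero. Large residual values point to irregular variations such as the flood example above.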

What is time series and what are its uses?

Time series is a collection of data stored in sequence of time—for example, the daily diesel price. Time series data finds use in many domains. By analyzing past and current time series data, businesses can determine metrics, find patterns and trends, and forecast business performance for a period of time. Time series analysis is also useful in ecommerce applications, finance, IoT, and sales.

What is the meaning of time series data management?

Time series data management refers to storing, querying, and maintaining time series data. While relational databases have timestamp options, databases with time series support, like MongoDB and InfluxDB, provide special features to optimally handle time series data and give better performance.