Distributed database system communication
While nodes are able to fully function on their own, it is necessary for them to communicate with other nodes as well since, unlike centralized databases, they do not share the same physical components or even the same data sets. There are three types of distributed database communication:
- Broadcast communication: One message is sent to all other nodes within the distributed database system.
- Multicast communication: One message is sent to some but not all other nodes within the distributed database system.
- Unicast communication: A message is sent from an individual node to one other individual node.
Transaction management
Distributed databases must often support distributed transactions, where one transaction can involve more than one node. This support methodology is highlighted in the ACID properties (atomicity, consistency, isolation, durability) of transactions across distributed database systems. Key elements of ACID properties include:
(Source: Dev.to, 2020)Atomicity means that a transaction is treated as a single unit. This also means that either a complete transaction is available for storage or it's rejected as an error which ensures data integrity.
Consistency is maintained in distributed database systems by enforcing predefined rules and data constraints. If the state, nature, or content of a transaction violates these rules, the transaction will not be ingested and stored in the distributed system.
Isolation involves the separation of each transaction from the other transactions to prevent data conflicts and maintain data integrity. In addition, this benefits operations when managing multiple distributed data records that may exist across local data stores, virtual machines via cloud computing, and multiple database nodes which may be located across multiple sites.
Durability ensures that stored data is preserved in the event of a system failure. There are a variety of ways that a transactional distributed database management system accomplishes this task, including:
Fault tolerance
Because distributed database systems are more likely to experience failures or operations interruptions than centralized databases (e.g., due to multiple sites or a suboptimal file system), strong fault tolerance processes are essential to maintain access reliability and effective database operations. With that said, the number of individual components that distributed systems are able to preserve removes the risk of a single point of failure.
Some common fault tolerance processes include data replication, backup protocols, continuous failure detection, data checksums, load balancing, and query optimization.
Data replication
Data replication is the process by which multiple copies of data are maintained across different nodes, servers, or sites. There are different types of database replication schema to choose from, including:
Full replication: In full replication, a complete, functional copy of the entire database is sent to all sites within the distributed database system. Database copy updates are provided on a routine schedule. There are two subtypes of full replication, as well.
Transactional replication: In transactional replication, a full and complete database copy is provided to each node, and then data changes are updated to that copy as transaction processing occurs, often in real-time.
Snapshot replication: Using snapshot replication, a copy of the database at a specific point in time is captured. This snapshot is then distributed across nodes and the user base as needed but does not consistently monitor for data changes. For this reason, snapshot database replication is only recommended for infrequently changing content.
Partial replication: In some cases, certain nodes only require specific portions of the database, so a defined portion of the database is replicated to a select group. In this type of data replication, any number of nodes or sites can receive the replication.
Merge replication: As its name indicates, merge database replication is the merging of two databases into one. This is the most complex of the database replication types.
Backup protocols
Through a consistent program of automated data backups, data integrity and database systems availability can be maintained without overburdening organizational employees. Some of the most common solutions in the marketplace include backup software from Veeam, Druva, and Commvault.
Three of the most used types of backup for distributed databases include:
Full backup: The entire database is copied and stored every time a database backup is executed.
Differential backup: Only the changes made since the last full backup are copied and stored.
Incremental backup: Incremental backups do not require a previous full backup — they can save changes since the previous differential or incremental backup.
Continuous failure detection
As with any system, it is critical for distributed database systems to be continuously monitored for system failures — whether they be technical issues, natural disasters, or cyberattacks. Just a few of the ways this monitoring is accomplished include:
Heartbeating: In heartbeating, each node sends out a signal (heartbeat) to other nodes to verify it's operational. If that signal isn't received, a failure message is created and further investigation of that node's operations by system administration is undertaken.
Watchdog timers: Individual nodes will have watchdog timers that are focused on a specific activity or process. If the timer expires without the activity or process being completed, a failure message is generated indicating further investigation is required.
Data checksums: In order to identify data tampering or other issues with data transmission, when a data transmission is sent, it is assigned a certain value (or checksum). When that transmission is received, it is also assigned a checksum. By using software to verify that both the sender and receiver have equivalent checksums for that transmission, issues with data transmission integrity can be quickly identified.
Load Balancing
Load balancing techniques distribute user requests and queries evenly across database nodes. This not only improves performance but also ensures that the failure of one node does not cause an overload on others.
Usually, load balancing software is deployed as the intermediary between the applications or database users. When a query is received, the load balancer will evaluate the request and determine which node(s) are best equipped to respond. During this evaluation, such factors as proximity, current load, and other predetermined system rules will be considered. This evaluation and assignment helps the system avoid system overload and system inefficiency which can result in long wait times for users.
Query optimization
Distributed databases use query optimization techniques to distribute queries efficiently across nodes while minimizing data transfer traffic between nodes. One of the ways this is accomplished is through cost-based query optimization. This form of query optimization considers the most efficient execution for the query, with such factors as query complexity, available data, and the location of the site containing that data.