From ACID to CAP/BASE

2016/11/29 17:00

As systems have moved from centralized to distributed architectures, the principles governing transactions have shifted from ACID to CAP/BASE.

ACID
A transaction is a logical unit of program execution made up of a series of operations that access and update data in the system. In the narrow sense, "transaction" refers specifically to a database transaction.
Transactions have four properties: Atomicity, Consistency, Isolation and Durability, collectively known as the ACID properties.

Atomicity: A transaction is an indivisible sequence of operations. Either all of them succeed or all of them fail.
Consistency: Executing a transaction must not violate the integrity and consistency of the data. The database must be in a consistent state both before and after the transaction runs.
Isolation: Concurrent transactions are isolated from each other, so one transaction's execution cannot be interfered with by another. The SQL standard defines four transaction isolation levels: read uncommitted, read committed, repeatable read, and serializable.
Durability: Once a transaction is committed, its changes to the data in the database are permanent.
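
Atomicity and consistency are easy to see in a small experiment. The following is a minimal sketch using Python's built-in sqlite3 module; the accounts table, the names alice and bob, and the transfer and balances helpers are made up for illustration. The credit is deliberately applied before the debit, so that when the debit violates the CHECK constraint the already-applied credit has to be rolled back.

```python
import sqlite3

# Hypothetical schema: the CHECK constraint is the consistency rule
# (no account may go below zero).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction.

    Returns True on commit, False if the transaction was rolled back.
    """
    try:
        # The connection context manager commits if the body succeeds
        # and rolls back if it raises.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit broke the CHECK constraint; the credit was undone too.
        return False


def balances(conn):
    """Return {name: balance} for all accounts."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


print(transfer(conn, "alice", "bob", 30))   # True
print(transfer(conn, "alice", "bob", 500))  # False
print(balances(conn))                       # {'alice': 70, 'bob': 80}
```

The second transfer fails as a whole: bob never keeps the 500 that was briefly credited, which is exactly the "all succeed or all fail" behavior described above.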

CAP and BASE theory
Transactions in a centralized system can easily satisfy the ACID properties. For a high-traffic, highly concurrent distributed Internet system, however, implementing distributed transactions that strictly satisfy ACID is likely to create a conflict between availability and strict consistency: when we require a distributed system to be strictly consistent, we may have to sacrifice its availability. Yet availability and consistency are both indispensable attributes of a distributed system. No solution can ever give the best of both worlds, which is why classic distributed-systems theories such as CAP and BASE emerged.

CAP theorem
A distributed system cannot satisfy all three basic requirements of consistency, availability, and partition tolerance at the same time. It can satisfy at most two of them.

Consistency: All nodes see the same data at the same time. Note: consistency here is different from the C in ACID.
Availability: The service is always available, and requests are answered within a normal response time.
Partition tolerance: When a node fails or the network partitions, the distributed system can still provide service that meets its consistency and availability guarantees.

CAP proof
Here is a simple argument for why only two of the three CAP properties can be satisfied at the same time:
Premise: For a distributed system, partition tolerance is the most basic requirement. Because the system is distributed, its components must be deployed on different nodes; otherwise it is not a distributed system.

Suppose there are two nodes, N1 and N2, in the network. Databases D1 (active) and D2 (standby) are installed on N1 and N2 respectively, forming an active/standby pair. D1 (active) handles writes and reads, and D2 (standby) takes a share of the reads.
Under normal conditions: D1 (active) writes the data and synchronizes it to D2 (standby), so a read from D2 returns the latest data.
Under abnormal conditions: The biggest difference between a distributed system and a stand-alone system is the network. Now suppose that, in an extreme case, the network between N1 and N2 is disconnected. D1 (active) has finished writing the data, but D2 (standby) has not received the update. At this point there are two options. First, sacrifice consistency and return the old data to the user. Second, sacrifice availability: block until the network connection is restored and the update has been applied, then return the latest data to the user.
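
The two options in the abnormal case can be sketched as a toy simulation. This is only an illustration of the argument above, not a real replication protocol: the Cluster and Replica classes, the "CP" and "AP" mode names, and the partitioned flag are all invented here, and the CP branch raises a TimeoutError instead of actually blocking so that the example terminates.

```python
class Replica:
    """One database node holding a single value."""

    def __init__(self):
        self.value = None


class Cluster:
    """Hypothetical active/standby pair: d1 takes writes, d2 serves reads."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.d1 = Replica()  # active
        self.d2 = Replica()  # standby
        self.partitioned = False  # True while the N1-N2 link is down

    def write(self, value):
        """Write to the active node; replicate only if the link is up."""
        self.d1.value = value
        if not self.partitioned:
            self.d2.value = value

    def heal(self):
        """Restore the link and bring the standby up to date."""
        self.partitioned = False
        self.d2.value = self.d1.value

    def read_from_standby(self):
        """Read from d2, applying the chosen trade-off during a partition."""
        if self.partitioned and self.mode == "CP":
            # Sacrifice availability: refuse rather than risk stale data.
            raise TimeoutError("standby cannot confirm it has the latest data")
        # AP (or healthy network): answer with whatever d2 has.
        return self.d2.value


for mode in ("CP", "AP"):
    cluster = Cluster(mode)
    cluster.write("v1")          # replicated to the standby
    cluster.partitioned = True   # network between N1 and N2 goes down
    cluster.write("v2")          # only the active node sees this write
    try:
        print(mode, "read during partition:", cluster.read_from_standby())
    except TimeoutError as exc:
        print(mode, "read during partition failed:", exc)
    cluster.heal()
    print(mode, "read after heal:", cluster.read_from_standby())
```

In AP mode the read during the partition succeeds but returns the old value v1; in CP mode it fails until the link is healed, after which both modes return v2.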

BASE theory
BASE is short for Basically Available, Soft state and Eventually consistent. BASE theory clearly leans toward the AP side of CAP: a system that guarantees availability and partition tolerance can usually accept weaker consistency requirements.
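
The three terms can be made concrete with a small in-memory sketch. It assumes a made-up EventuallyConsistentStore in which writes are acknowledged by one replica immediately and shipped to the others later through a queue; real systems use far more elaborate replication, so treat this purely as an illustration.

```python
from collections import deque


class EventuallyConsistentStore:
    """Hypothetical key-value store with asynchronous replication."""

    def __init__(self, replica_count=3):
        self.replicas = [{} for _ in range(replica_count)]
        # Updates accepted by replica 0 but not yet applied elsewhere.
        # While this queue is non-empty the system is in a "soft state".
        self.pending = deque()

    def put(self, key, value):
        """Basically available: acknowledge after writing to one replica."""
        self.replicas[0][key] = value
        for index in range(1, len(self.replicas)):
            self.pending.append((index, key, value))

    def get(self, key, replica=0):
        """Read from one replica; the result may be stale."""
        return self.replicas[replica].get(key)

    def replicate_once(self):
        """Apply one queued update. Returns False when nothing is left."""
        if not self.pending:
            return False
        index, key, value = self.pending.popleft()
        self.replicas[index][key] = value
        return True

    def converged(self):
        """Eventually consistent: all replicas hold the same data."""
        return not self.pending and all(
            replica == self.replicas[0] for replica in self.replicas
        )


store = EventuallyConsistentStore()
store.put("x", 1)
print(store.get("x", replica=0), store.get("x", replica=2))  # 1 None
while store.replicate_once():
    pass
print(store.get("x", replica=2), store.converged())  # 1 True
```

Right after the write, replica 2 still returns nothing for x; once the queued updates have been applied, every replica agrees.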

Relationship among CAP, ACID and BASE
I found a picture online that describes the relationship among the three clearly, as shown in the following figure:

Depending on which two CAP properties a system favors:
CA – A single-point cluster: a system that meets consistency and availability, but usually does not scale well (ACID)
AP – A system that meets availability and partition tolerance, and generally has lower requirements for consistency (BASE)
CP – A system that meets consistency and partition tolerance. Generally, its performance is not very high (possibly also BASE)

From the database perspective:
Relational and non-relational databases can also be placed within CAP theory, as shown in the figure below (source: online):

Relational databases (RDBMS) generally follow the ACID principle, while non-relational databases generally follow the BASE principle.

Summary
ACID is a strong consistency model, while BASE trades strong consistency for availability and only requires that the system eventually reach a consistent state. In real distributed scenarios, different business units and components have different requirements for data consistency, so in the design of a distributed architecture the ACID properties and BASE theory are often used together.
