
Distributed database system

DDBS
A distributed database system (DDBS) comprises a distributed database management system (DDBMS) and a distributed database (DDB). In a distributed database system, an application program can operate on the database transparently, even though the data are stored in different local databases, managed by different DBMSs, run on different machines, supported by different operating systems, and connected by different communication networks.
Chinese name: Distributed database system
Foreign name: Distributed Database System
Classification: Database management system; distributed database
R&D sequence: Bottom up

System definition

A distributed database is logically a unified whole but is physically stored on different nodes. An application can access databases located in different geographical places over a network connection. The database is distributed in the sense that its data are not stored in one place; more precisely, they are not stored on the storage devices of a single computer. This is what distinguishes it from a centralized database. From the user's perspective, a distributed database system is logically the same as a centralized one: users can run global applications from any site, as if the data were stored on one computer and managed by a single database management system (DBMS), and they notice no difference.
The distributed database system developed on the basis of the centralized database system and is the product of combining computer technology with network technology. It suits organizations whose units are geographically scattered: each department can store and use its data locally, which improves response speed and reduces communication costs. Compared with a centralized database system, a distributed database system is scalable, and adding an appropriate amount of data redundancy improves system reliability.
In a centralized database, one goal of the system is to reduce redundancy as much as possible, because redundant data wastes storage space and easily leads to inconsistency between copies, so the system has to pay a maintenance cost to keep the data consistent. Redundancy is reduced there through data sharing. In a distributed database, by contrast, redundancy is desirable: multiple copies of the same data are stored at different sites, for two reasons. ① It improves the reliability and availability of the system: when a site fails, the system can operate on a copy of the same data at another site, so a single failure does not paralyze the entire system. ② It improves performance: the system can choose the nearest copy to operate on, which reduces communication cost and improves the performance of the whole system.
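The two reasons for keeping multiple copies can be illustrated with a minimal sketch. It assumes each copy has a known communication cost and a reachable flag; the `Replica` class, the `choose_replica` function, and the site names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    site: str        # hypothetical site name
    distance: int    # hypothetical communication cost from the client to the site
    up: bool = True  # whether the site is currently reachable


def choose_replica(replicas):
    """Return the reachable replica with the lowest communication cost."""
    live = [r for r in replicas if r.up]
    if not live:
        raise RuntimeError("no copy of the data is reachable")
    return min(live, key=lambda r: r.distance)


replicas = [
    Replica("site_a", distance=1),
    Replica("site_b", distance=5),
    Replica("site_c", distance=9),
]

first_choice = choose_replica(replicas).site  # nearest copy is used while all sites are up
replicas[0].up = False                        # simulate a failure of the nearest site
fallback = choose_replica(replicas).site      # the next-nearest copy takes over
print(first_choice, fallback)
```

A real system would measure distance and detect failures itself; here both are set by hand to keep the example small.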

Main features


Independence and transparency

Data independence is one of the main goals of the database approach. Distribution transparency means that users need not care about how the data are logically partitioned, where they are physically located, or how the consistency of duplicate copies (redundant data) is maintained; nor need they care which data model the database at a local site supports. The advantage is clear: the user's application is written as if the data were not distributed. When data are moved from one site to another, or when duplicate copies of some data are added, the application does not have to be rewritten. Information about data distribution is stored by the system in the data dictionary, and a user's request for non-local data is interpreted, converted, and transmitted by the system according to the data dictionary.
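The role of the data dictionary can be sketched as follows. This is a toy model, not any particular product's design: the `DistributedDatabase` class, its key-range fragments, and the table and site names are all assumptions made for illustration. The point is that the application calls `read` the same way before and after a fragment is moved.

```python
class DistributedDatabase:
    """Toy model: a catalog (data dictionary) maps key ranges of a table to sites."""

    def __init__(self):
        self.catalog = {}  # table -> list of [low, high, site]
        self.sites = {}    # site -> {(table, key): row}

    def add_fragment(self, table, low, high, site):
        self.catalog.setdefault(table, []).append([low, high, site])
        self.sites.setdefault(site, {})

    def _locate(self, table, key):
        # Consult the data dictionary to find the site that holds the key.
        for low, high, site in self.catalog[table]:
            if low <= key <= high:
                return site
        raise KeyError(f"no fragment of {table} holds key {key}")

    def write(self, table, key, row):
        self.sites[self._locate(table, key)][(table, key)] = row

    def read(self, table, key):
        return self.sites[self._locate(table, key)][(table, key)]

    def move_fragment(self, table, low, high, new_site):
        # Relocate the rows, then update the dictionary; callers are unaffected.
        for fragment in self.catalog[table]:
            if fragment[0] == low and fragment[1] == high:
                old_site = fragment[2]
                self.sites.setdefault(new_site, {})
                for (t, k), row in list(self.sites[old_site].items()):
                    if t == table and low <= k <= high:
                        self.sites[new_site][(t, k)] = row
                        del self.sites[old_site][(t, k)]
                fragment[2] = new_site
                return
        raise KeyError(f"no fragment {low}-{high} of {table}")


db = DistributedDatabase()
db.add_fragment("accounts", 0, 999, "site_a")
db.add_fragment("accounts", 1000, 1999, "site_b")
db.write("accounts", 42, {"balance": 100})

before = db.read("accounts", 42)
db.move_fragment("accounts", 0, 999, "site_c")  # data moves to another site
after = db.read("accounts", 42)                 # same call, same result
```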

Replication Transparency

Users need not be concerned with how the database is replicated across the nodes of the network; updates to replicated data are carried out automatically by the system. In a distributed database system, data at one site can be copied to other sites, and an application can use the copy held at its local site to complete a distributed operation locally. This avoids transmitting data over the network and improves the efficiency of operations and queries. An update to replicated data, however, must be applied to every copy.
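The trade-off described above, cheap local reads against updates that touch every copy, can be shown with a small sketch. The `ReplicatedItem` class and its message counter are hypothetical, and the write-to-all-copies policy is only one simple way a system might keep copies consistent.

```python
class ReplicatedItem:
    """Toy model of one data item copied to several sites."""

    def __init__(self, sites, value):
        self.copies = {site: value for site in sites}
        self.messages_sent = 0  # simulated network messages

    def read(self, local_site):
        # A read uses the local copy, so no network message is needed.
        return self.copies[local_site]

    def write(self, origin_site, value):
        # An update must reach every copy; each remote copy costs one message.
        for site in self.copies:
            if site != origin_site:
                self.messages_sent += 1
            self.copies[site] = value


item = ReplicatedItem(["site_a", "site_b", "site_c"], value=10)

item.read("site_a")
item.read("site_b")
messages_after_reads = item.messages_sent   # reads are local

item.write("site_a", 11)
messages_after_write = item.messages_sent   # one message per remote copy
```

With three copies, the two reads send no messages while the single write sends two, which is why replication favors read-heavy workloads.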

Easy scalability

In most network environments, a single database server will eventually fail to meet demand. If the server software supports transparent horizontal scaling, more servers can be added to spread the data further and share the processing load.
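As a rough illustration of horizontal scaling, the sketch below places rows on servers by hashing their keys and compares the load per server before and after servers are added. The placement rule (`server_for`) and the key names are assumptions for illustration; real systems use a variety of partitioning schemes.

```python
import hashlib


def server_for(key, server_count):
    """Map a key to a server index with a stable hash (hypothetical placement rule)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % server_count


def distribute(keys, server_count):
    """Return one list of keys per server."""
    servers = [[] for _ in range(server_count)]
    for key in keys:
        servers[server_for(key, server_count)].append(key)
    return servers


keys = [f"row-{i}" for i in range(1000)]
two_servers = distribute(keys, 2)
four_servers = distribute(keys, 4)

load_before = [len(s) for s in two_servers]   # rows held by each of 2 servers
load_after = [len(s) for s in four_servers]   # rows held by each of 4 servers
print(load_before, load_after)
```

Simple modulo placement like this relocates many rows whenever the server count changes; schemes such as consistent hashing are often used to reduce that movement.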

Main advantages

(1) It has a flexible architecture.
(2) It suits distributed management and control structures.
(3) It is cost-effective.
(4) The system is highly reliable and highly available.
(5) Local applications respond quickly.
(6) It scales well and makes it easy to integrate existing systems.

Main disadvantages

(1) System overhead is high, most of it spent on communication.
(2) The access structure is complex; techniques that access data efficiently in a centralized system no longer apply in a distributed system.
(3) Data security and confidentiality are difficult to handle.

System objectives

The objectives of a distributed database system are the goals and motivations behind developing it; they are mainly technical and organizational.

Adaptability

Organizations that use databases are often distributed both organizationally (into divisions, departments, workshops, and so on) and geographically. The structure of a distributed database system matches this distributed organizational structure: each department can store the data it commonly uses locally, enter, query, and maintain those data locally, and exercise local control. Because computing resources are close to the users, communication costs fall and response speed improves, making the database more convenient and economical for these departments to use.

Reliability and availability

Improving the reliability and availability of the system is the main goal of a distributed database. Distributing data across multiple sites and adding appropriate redundancy provides better reliability. This is especially important for systems with high reliability requirements, because a failure at one site does not bring down the whole system. Users at the failed site can enter the system through other sites, and for users at other sites the system automatically selects access paths that avoid the failed site and uses other replicas to carry out operations, so normal business is not affected.
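The effect of redundancy on availability can be estimated with a short calculation. It assumes that sites fail independently and that the data are usable as long as at least one copy is reachable, in which case the availability with n copies is 1 - (1 - p)^n for a per-site availability p. The 0.9 figure and the `availability` function are illustrative assumptions, not measurements.

```python
def availability(site_availability, copies):
    """Probability that at least one copy is reachable, assuming independent site failures."""
    return 1 - (1 - site_availability) ** copies


# Hypothetical per-site availability of 0.9 (each site is down 10% of the time).
one_copy = availability(0.9, 1)
two_copies = availability(0.9, 2)
three_copies = availability(0.9, 3)
print(one_copy, two_copies, three_copies)
```

Under these assumptions each added copy cuts the probability of total unavailability by a factor of ten; correlated failures, such as a shared network outage, would make the real gain smaller.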

Adequacy

Improve utilization of existing centralized databases
When a large enterprise or department has already built several databases and wants to share their resources and develop global applications, it needs to develop a distributed database system. This can be called building a distributed system bottom-up. Although this approach also requires some changes to, and restructuring of, the existing local database systems, it is a better choice than merging those databases into a newly built centralized database, for both economic and organizational reasons.

Scalability

When an organization needs to expand by adding new units (such as new branches in a banking system, or new departments and workshops in a factory), the structure of a distributed database system offers a better way to expand the system's processing capacity: add a new node to the distributed database system. This is more convenient, more flexible, and much more economical than expanding a centralized system.
There are two common ways to expand a centralized system. One is to leave a large margin in the initial design, which easily leads to waste and, because future needs are hard to predict, may still fail to accommodate later changes. The other is to upgrade the system, which disrupts the normal operation of existing applications; when the upgrade involves major changes to incompatible hardware or system software, the cost of upgrading the application software already developed is very high and often makes upgrading infeasible. A distributed database system can easily incorporate a new node without affecting the structure or normal operation of the existing system, which provides a better way, and sometimes the only way, to expand system capacity gradually. [1-2]