Data segmentation

Data segmentation (data partitioning) refers to dividing logically unified data into smaller physical units that can be stored and managed independently. This makes restructuring, reorganization and recovery easier, and it improves the efficiency of index creation and sequential scans. Data segmentation gives data warehouse developers and users more flexibility. [1]
Name: Data segmentation
English name: data partitioning; data partition

Advantages

The overall purpose of segmenting current detailed data is to divide it into small physical units, which gives operators and designers greater flexibility in managing the data. Small physical units are easier to restructure, index, scan sequentially, reorganize, recover and monitor. Flexible access to data is one of the essential properties of a data warehouse, and large monolithic blocks of data cannot provide it.

Criteria

The criteria for data segmentation are chosen according to the actual situation. Data can be segmented by date, region, business area or organizational unit, or by a combination of several criteria.
The criteria are selected by the developer, but in a data warehouse they should almost always include the date.
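Segmenting by date combined with a second criterion can be sketched as follows. This is a minimal illustration rather than a prescribed method: the record fields (`sale_date`, `region`, `amount`) and the helper names `partition_key` and `segment` are hypothetical, and the partitions are plain in-memory lists instead of physical storage units.

```python
from collections import defaultdict
from datetime import date


def partition_key(record):
    """Combine a date criterion (year, month) with a region criterion."""
    d = record["sale_date"]
    return (d.year, d.month, record["region"])


def segment(records):
    """Group records into independent partitions keyed by (year, month, region)."""
    partitions = defaultdict(list)
    for record in records:
        partitions[partition_key(record)].append(record)
    return dict(partitions)


# Hypothetical sample data, used only for illustration.
sales = [
    {"sale_date": date(2023, 1, 15), "region": "north", "amount": 120},
    {"sale_date": date(2023, 1, 20), "region": "south", "amount": 80},
    {"sale_date": date(2023, 2, 3), "region": "north", "amount": 200},
    {"sale_date": date(2023, 1, 28), "region": "north", "amount": 50},
]

partitions = segment(sales)
for key in sorted(partitions):
    print(key, len(partitions[key]))
```

Because each partition holds only one month of one region, a query restricted to a date range needs to touch only the matching partitions, and an old month can be archived or rebuilt without affecting the others.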

Levels

Segmentation is generally carried out at one of two levels: the system level or the application level. System-level segmentation is performed by the database management system and the operating system. Application-level segmentation is performed by the application system, and it is the more meaningful of the two. [2]

Segmentation method

Horizontal Split

Horizontal splitting divides the tuples (rows) of a global relation into subsets, which are called data fragments. The tuples in a fragment are grouped together because they share some common property, such as geography or ownership. The fragments of a relation are usually disjoint. Each fragment can be placed on a single site or replicated across several sites. [1]
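A horizontal split can be sketched with one predicate per fragment. The example below is an illustrative assumption, not a reference implementation: the `customers` relation, the `city` attribute and the function `horizontal_split` are all hypothetical. It enforces that every row lands in exactly one fragment, so the fragments are disjoint and their union reconstructs the original relation.

```python
def horizontal_split(rows, predicates):
    """Assign each row to the single fragment whose predicate it satisfies."""
    fragments = {name: [] for name in predicates}
    for row in rows:
        matches = [name for name, pred in predicates.items() if pred(row)]
        if len(matches) != 1:
            # Zero matches means the split is incomplete; more than one
            # means the fragments would overlap.
            raise ValueError(
                f"row {row!r} matches {len(matches)} fragments, expected exactly 1"
            )
        fragments[matches[0]].append(row)
    return fragments


# Hypothetical global relation.
customers = [
    {"id": 1, "name": "Ann", "city": "Beijing"},
    {"id": 2, "name": "Bo", "city": "Shanghai"},
    {"id": 3, "name": "Chen", "city": "Beijing"},
]

# One predicate per fragment, here based on geography.
predicates = {
    "beijing": lambda r: r["city"] == "Beijing",
    "other": lambda r: r["city"] != "Beijing",
}

fragments = horizontal_split(customers, predicates)

# The union of all fragments gives back the global relation.
rebuilt = sorted(
    (row for frag in fragments.values() for row in frag),
    key=lambda r: r["id"],
)
```

Every fragment keeps the full set of attributes; only the set of rows differs from fragment to fragment.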

Vertical Split

Vertical splitting divides a global relation into data fragments by groups of attributes (columns). The attributes in a fragment are grouped together because they are convenient to use together or are commonly accessed together. The vertical fragments of a relation usually overlap only on the key attributes, and their other attributes do not intersect. Each vertical fragment can be placed on a single site or replicated across several sites. [1]
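A vertical split can be sketched in the same style. Again this is only a sketch under assumptions: the `employees` relation, the attribute groups and the helpers `vertical_split` and `rejoin` are hypothetical. Each fragment carries the key attribute plus its own attribute group, so the fragments overlap only on the key and can be joined back together on it.

```python
def vertical_split(rows, key, attribute_groups):
    """Project each row onto the key plus one group of attributes per fragment."""
    fragments = {}
    for name, attrs in attribute_groups.items():
        fragments[name] = [
            {key: row[key], **{a: row[a] for a in attrs}} for row in rows
        ]
    return fragments


def rejoin(fragments, key):
    """Rebuild the global relation by joining all fragments on the key."""
    merged = {}
    for rows in fragments.values():
        for row in rows:
            merged.setdefault(row[key], {}).update(row)
    return [merged[k] for k in sorted(merged)]


# Hypothetical global relation.
employees = [
    {"id": 1, "name": "Ann", "dept": "sales", "salary": 5000},
    {"id": 2, "name": "Bo", "dept": "ops", "salary": 6000},
]

# Attributes that are commonly accessed together go into the same fragment.
groups = {"profile": ["name", "dept"], "payroll": ["salary"]}

fragments = vertical_split(employees, "id", groups)
rebuilt = rejoin(fragments, "id")
```

Keeping the key in every fragment is what makes the split lossless: without it, rows from different fragments could not be matched up again.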