Apache CarbonData 2.0 Online Release
- CarbonData 1.4 (early 2017): CarbonData became an Apache top-level project, and a number of large customers in China and abroad tried it out. In customer performance tests at the time, Spark on CarbonData ran 1.5 to 2 times faster than Spark on Parquet. This led to CarbonData 1.0 being released as an official product, and CarbonData became the first Apache top-level project contributed by a Chinese company.
- CarbonData 1.5, 1.6 (early 2019): ACID capabilities for the Hadoop ecosystem, including transactions, fault tolerance, and metadata management.
- CarbonData 2.0 (current release): the system architecture has been redesigned for cloud environments, with dozens of advanced features, including optimizations for storage-compute separation, index and materialized view capabilities, data lake capabilities, and real-time data synchronization and update.
  - Storage optimization: optimized metadata management for object storage, avoiding the high cost of moving and listing objects during data management
  - Compute ecosystem: support for Spark 2.4.5, Flink, Hive, Alluxio, Presto, PyTorch, and TensorFlow
  - Detail queries: secondary index, spatial index, and segment-level MinMax index, enabling responses within seconds for PB-scale detail queries
  - Complex queries: materialized views, time-series pre-aggregation, and bucket index, enabling responses within seconds for complex queries
  - Data lake index management: distributed index cache (IndexServer), with support for preloading indexes into memory
  - Insert, Update, and Delete performance enhancements, plus support for Merge syntax
  - Hive support for reading and writing CarbonData transactional tables, with deep optimization of read and write performance
  - Machine learning library supporting annotation and training analysis
Live broadcast information
Time: 19:30-21:00, Wednesday, June 3, 2020
Live broadcast room: https://www.slidestalk.com/w/191
Co-sponsors: Apache CarbonData Community, Kaiyuanshe
Partner community: Shishuo.com http://huaweicloud.ai/
Special Guests
- Chen Liang (Huawei; Apache CarbonData PMC & Committer)
- Li Kun (Huawei; Apache CarbonData PMC & Committer)
- Kunal Kapoor (Apache CarbonData PMC & Committer)
- Ravindra Pesala (Development Bank of Singapore; Apache CarbonData PMC & Committer)
- Vimal Das (Uber; Apache CarbonData PMC & Committer)
- Zhichao Zhang (Kyligence; Apache CarbonData PMC & Committer)
- Cao Lu (big data architect, SAIC Data Business Department; Apache CarbonData Committer)
- He Xiaoqiao (data platform engineer, Meituan-Dianping; Apache CarbonData Committer)
- Hao Xingjun (core contributor to Apache CarbonData)
- Lin Lvqiang, Richard Lin (director of Kaiyuanshe; host of this launch event)
Agenda
How to start the first open source project?