Baidu Data Warehouse Palo for Apache Doris

Baidu Data Warehouse Palo is an MPP-based cloud data warehouse built on Apache Doris, an industry-leading OLAP database. It supports efficient import and real-time updates of massive data sets, serves both reporting and OLAP analysis workloads, and helps enterprises build a fast, easy-to-use cloud data analysis platform quickly and at low cost.

Product Overview

Baidu Data Warehouse Palo for Apache Doris is a high-performance parallel database for online reporting and multidimensional analysis applications. Its distributed architecture is simple and is easy to use, operate, and maintain, helping enterprises build a cloud data analysis platform quickly and at low cost. It supports efficient import of massive data: users can import data from RDS, BOS object storage, Baidu MapReduce, and other sources for multidimensional analysis of big data.

Popular specifications

Entry type

Suitable for personal learning and production testing.

  • CPU: 4 cores
  • Memory: 16 GB
  • Disk storage: 200 GB
  • Annual payment of 8.3%
  • Price: from 1,500

Standard (popular recommendation)

Suitable for small and medium-sized enterprises and Internet businesses.

  • CPU: 8 cores
  • Memory: 32 GB
  • Disk storage: 1024 GB
  • Annual payment of 8.3%
  • Price: from 3,503

Enhanced

Suitable for large-scale e-commerce, gaming, and financial platforms.

  • CPU: 16 cores
  • Memory: 64 GB
  • Disk storage: High performance
  • Annual payment of 8.3%
  • Price: from 5,411

Product advantages

Data warehouse query acceleration

Millisecond-to-second query latency on PB-scale data; seamless application to massive data sets; greatly improved query efficiency and effectiveness.

Multi-source federated query

Unifies the query entry point and filter conditions across multiple data sources, significantly improving query performance and meeting the diverse query needs of business users.

Interactive data analysis

Build interactive data analysis applications with BI tools for self-service exploration and multidimensional analysis of massive data, supporting in-depth business exploration and rapid decision-making.

Real-time data warehouse construction

Efficient import of streaming data and real-time insight into business data, with a unified big data platform architecture and data flow.

Product Functions

  • Efficient query

    Rich query methods
    Industry-leading MPP query engine with columnar storage, intelligent indexing, and vectorized execution.
    Analysis functions
    Highly compatible with standard SQL, with advanced analysis features such as in-database analytics and window functions.
  • Data import

    Batch import of data
    Atomic import of batch data is supported, and different data sources can be selected.
    Low-latency import
    Low-latency import of streaming data is supported, with rich column mapping, transformation, and filtering operations.
  • Online data management

    Online changes
    You can create materialized views and change the table structure without stopping the service.
    Data recovery
    The backup and recovery function is mainly used to quickly back up cluster snapshots to remote storage.
  • Materialized view

    Automatic materialized view updates
    Supports analysis of the original detail data along any dimension as well as fast analysis and queries on fixed dimensions.
    Optimal materialized view selection
    DDL syntax for materialized views is provided, and the most appropriate materialized view is selected automatically.
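
The idea of automatically choosing the most appropriate materialized view can be sketched in a few lines of Python. This is a toy illustration only, not Palo's or Apache Doris's actual planner: the `MaterializedView` class, the `choose_view` function, the view names, and the row counts are all hypothetical. The sketch assumes that a view can answer a query when it stores every column the query references, and that among such views the one with the fewest rows is the cheapest to scan.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class MaterializedView:
    """Hypothetical description of a pre-aggregated view (not a Palo API)."""
    name: str
    columns: FrozenSet[str]  # dimension and metric columns the view stores
    row_count: int           # rows remaining after pre-aggregation


def choose_view(query_columns: Iterable[str],
                views: Iterable[MaterializedView]) -> Optional[MaterializedView]:
    """Return the smallest view that covers every column the query needs."""
    needed = frozenset(query_columns)
    candidates = [v for v in views if needed <= v.columns]
    if not candidates:
        return None  # no view covers the query; the caller must handle this
    return min(candidates, key=lambda v: v.row_count)


# Made-up catalog: the detail data plus two coarser pre-aggregations.
views = [
    MaterializedView("detail", frozenset({"day", "city", "user_id", "pv"}), 1_000_000),
    MaterializedView("by_day_city", frozenset({"day", "city", "pv"}), 20_000),
    MaterializedView("by_day", frozenset({"day", "pv"}), 365),
]

best = choose_view({"day", "pv"}, views)
print(best.name)  # by_day
```

A query that also references `user_id` is only covered by the detail data in this sketch, which mirrors the point above: arbitrary dimensions are answered from the original detail data, fixed dimensions from a smaller pre-aggregated view.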

Application scenarios

Real time insight into user behavior
Business analysis and decision-making
Real time data warehouse

Application case

Baidu Statistics, the largest Chinese-language website analytics platform, provides users with fine-grained data analysis as a SaaS service. It uses Baidu Data Warehouse Palo to deliver richer analysis of user behavior and real-time report output.

We can provide

  • Near-real-time data insight at large data volumes.
  • Stable operation under high concurrency, with low-latency, efficient queries.

Customer Stories

Product Dynamics

  • New feature (2020.05.19)
    Partitions can be created automatically by scheduled tasks, which reduces the user maintenance cost of partition operations.
  • New feature (2020.07.23)
    Support for the INTERSECT and EXCEPT operators.
  • New feature (2020.10.27)
    PALO UI is online. Users can log in to the web UI from the console to quickly connect to a cluster and run queries.
  • Function optimization (2020.11.25)
    Query pushdown on the VALUE column of UNIQUE tables improves query performance by 2 to 100 times.
  • Function optimization (2020.12.15)
    Query performance optimization: multi-table join query performance improved by more than 100 times, and memory consumption reduced by 5 to 10 times.
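
For readers unfamiliar with the INTERSECT and EXCEPT operators mentioned in the updates above, the following minimal sketch shows their standard SQL semantics. It runs against an in-memory SQLite database from Python's standard library as a stand-in, not against a Palo cluster, and the table names and data are made up for illustration.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE visitors_mon (user_id INTEGER);
    CREATE TABLE visitors_tue (user_id INTEGER);
    INSERT INTO visitors_mon VALUES (1), (2), (3);
    INSERT INTO visitors_tue VALUES (2), (3), (4);
""")

# INTERSECT keeps rows that appear in both result sets.
both_days = [row[0] for row in conn.execute(
    "SELECT user_id FROM visitors_mon "
    "INTERSECT SELECT user_id FROM visitors_tue ORDER BY user_id")]

# EXCEPT keeps rows from the first result set that are absent from the second.
monday_only = [row[0] for row in conn.execute(
    "SELECT user_id FROM visitors_mon "
    "EXCEPT SELECT user_id FROM visitors_tue ORDER BY user_id")]

print(both_days)    # [2, 3]
print(monday_only)  # [1]
```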

Video Introduction

Documentation and Tools