Hologres

Hologres is an integrated real-time data warehouse platform: one copy of data, one computation, and one service, greatly improving the efficiency of data development and application.

Product Introduction

Hologres is an integrated real-time data warehouse platform developed in-house by Alibaba Cloud. Through a unified data platform, it combines warehouse storage, multi-modal computing, analytics, and serving, with Data+AI integration. It connects seamlessly to mainstream BI tools and supports scenarios such as OLAP queries, ad hoc analysis, online serving, and vector computing. It has set a TPC-H world record for analytical performance and is deeply integrated with MaxCompute, Flink, and DataWorks to provide a full-stack data warehouse solution that integrates offline and online workloads.

Product advantages
  •  Unified data platform

    Addresses data silos and inconsistent metric definitions

    One copy of data serves multiple scenarios and can replace both OLAP engines (Greenplum, Presto, Impala, ClickHouse, etc.) and KV databases (HBase, Redis, etc.).

  •  Fast data processing

    Addresses analysis efficiency

    Ranks first worldwide on the TPC-H 30,000 GB benchmark, 23% ahead of the runner-up. Supports high-throughput real-time writes and updates at 1 billion+ per second, and analyzes petabyte-scale data in seconds.

  •  End-to-end real time

    Addresses data timeliness

    Supports high-performance real-time writes, real-time updates, and real-time queries, keeping data fresh and helping enterprises analyze data in real time.

Product Functions
  • Sub-second interactive analysis

    A scalable MPP architecture, together with compute and storage index optimizations, delivers sub-second analysis of petabyte-scale data.

  • High-performance primary key queries

    Serves point queries at hundreds of thousands of QPS, supports high-throughput updates, and delivers more than 10 times the performance of open source alternatives.

  • Vector computing

    Integrates the Proxima vector engine and works with the PAI machine learning platform to connect seamlessly to various large models.

  • Federated query

    Accelerates queries on MaxCompute offline data and reads and writes OSS data through external tables, with no data movement required.

  • High-throughput real-time writes and updates

    Native integration with compute frameworks such as Flink and Spark supports high-throughput real-time writes and updates.

  • End-to-end real time

    Data is queryable as soon as it is written. Binlog delivery of table update events is supported to reduce data processing latency.

  • Load isolation

    Resource groups isolate workloads that would otherwise compete for resources, such as different businesses, different query types, and reads versus writes, to ensure stability.

  • High-reliability design

    Multiple computing group instances form a high-availability deployment, share a single copy of storage, and support fast recovery of failed nodes.

  • Enterprise-grade operations and maintenance

    Provides rich monitoring and alerting, supports hot system upgrades, and meets a wide range of enterprise-grade O&M requirements.

Product selection
Getting Started and Trying Out
Quick start
  • 01 Get a free trial

    1. Get a free trial of Hologres

    2. Get a free trial of DataWorks

  • 02 Synchronize RDS data in real time

    1. Create a Hologres database

    2. Create a DataWorks real-time synchronization task

  • 03 Analyze data in real time

    1. Analyze the data in real time

    2. Build a data visualization dashboard

Free trial
Real-time synchronization and analysis of GitHub public datasets
This tutorial synchronizes a GitHub public dataset stored in an RDS data source to Hologres in real time, then completes real-time data analysis and visualization.
30 minutes
Technical solutions
Unified data warehouse solution

Build a centralized data warehouse serving layer for the company that provides a unified data query interface and consistent metric definitions, meets the demand for real-time data, supports data access across multiple scenarios, and reduces data fragmentation. This addresses the core problems of traditional big data warehouses: too many components, complex O&M, long development cycles, and inconsistent metric definitions.

  • Unified storage

    Unified data storage and unified metric definitions, with no data silos, a simplified architecture, and guaranteed data consistency.

  • Standard SQL

    Comprehensive SQL support, including complex multi-table, nested, and window queries, lowers learning costs and shortens the development cycle.

  • Unified data service layer

    One copy of data supports multiple scenarios, such as large-scale multidimensional analysis and high-QPS online serving, with millisecond-level responses and interactive analysis.

Integrated offline and real-time solution

Big data warehouse systems have evolved from the complex Lambda architecture to a simplified data warehouse that integrates real-time and offline processing. The core idea is to connect the MaxCompute offline warehouse and the Hologres real-time warehouse through a stream computing engine, and to implement layered warehouse processing through metadata and data interoperability.

  • Automatic metadata discovery

    MaxCompute and Hologres support two-way automatic metadata discovery and refresh, along with comprehensive data type support.

  • Data sharing and interoperability

    Direct reads from storage are more than 10 times faster than access through ordinary external tables, and two-way data synchronization reaches millions of rows per second, simplifying data publishing and backfill scenarios.

  • Unified serving endpoint

    Hologres directly accelerates queries on MaxCompute data without data movement, reducing data redundancy and accelerating BI.

Streaming data warehouse solution

As enterprises demand ever more timely data, the challenges of real-time processing, real-time storage, and real-time analysis have become increasingly prominent. Based on the Streaming Warehouse concept, Hologres enables real-time data to flow efficiently between data warehouse layers and solves the layering problem of real-time data warehouses.

  • One-stop

    The entire pipeline can be expressed in SQL, and the Hologres data at each layer is reusable and queryable, which makes it easier to build the layering and reuse system of a real-time data warehouse.

  • High performance

    Flink's powerful real-time computing combines with Hologres's real-time write and update capabilities, multidimensional OLAP, and high-concurrency point queries.

  • Enterprise-grade operations and maintenance

    O&M is simpler, observability is better, and security is stronger. A range of high-availability capabilities makes it easier to build an enterprise-grade streaming warehouse.

Integrated lakehouse solution

Through seamless integration with DLF and OSS, Hologres uses external tables to directly accelerate reads and writes of data stored on OSS in various formats and types. An external table is used only for field mapping and does not actually store data, so no data is moved. This reduces development and O&M costs, breaks down data silos, and delivers business insight.

  • High performance

    Uses the vector engine to accelerate queries on OSS, DLF, and MaxCompute data.

  • Openness

    Data is easy to import and export, so lake and warehouse data flows freely.

  • Cost-effectiveness

    On exclusive instances, lakehouse workloads reuse existing resources at no additional compute cost; the shared-cluster serverless mode is billed by usage.

Vector Retrieval Solution

Large models can be applied across many industries, but their ability to answer specialized questions in vertical domains is limited. Hologres provides high-concurrency, low-latency vector retrieval that can be combined with large models and PAI to build enterprise-specific question-and-answer knowledge bases.

  • Extreme performance

    It supports efficient index construction and vector retrieval with high concurrency and low latency.

  • Real-time capability

    Vector data can be written and updated in real time with high performance, and is queryable as soon as it is written.

  • Easy to use

    Vector computation can be performed using standard syntax.

High availability solutions

For high-availability scenarios in online production environments, Hologres provides a primary/secondary multi-instance deployment mode based on shared storage and computing group instances. This mode supports fault isolation and load isolation, which effectively supports high-availability scenarios.

  • On-demand scale-out and scale-in

    Warehouses can be started on a schedule or on demand (scale out) and resized dynamically without downtime (scale up). Compute and storage are both highly scalable and elastic.

  • Lower cost, higher efficiency

    Users consume resources only as needed, which minimizes cost. Physical replication fully reuses physical files, reducing cost and improving efficiency.

  • Computing group resource isolation

    Computing groups are physically isolated from one another, which prevents them from interfering with each other and reduces service jitter.

Related products

  • Hologres (this product)
Product pricing

Hologres offers new customers a free trial of 5,000 CU-hours (CUh) of compute plus 20 GB of storage, with a choice of 8C or 32C compute specifications. For example, an 8C instance running for 24 hours consumes 8 × 24 = 192 CU-hours. To continue testing after the free trial, you can purchase a prepaid 32C128G instance at a discounted first-month price of 888 yuan; the discount can be used only once.
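
The consumption arithmetic in the example above can be checked with a short Python sketch. It assumes, following the 8C example, that an instance with N cores consumes N CU-hours per hour of runtime. The function names are illustrative and not part of any Hologres API.

```python
# Free compute quota stated in the pricing text, in CU-hours.
FREE_TRIAL_CU_HOURS = 5000


def cu_hours_used(cores: int, hours: float) -> float:
    """CU-hours consumed, assuming one CU-hour per core per hour."""
    return cores * hours


def trial_duration_hours(cores: int, quota: float = FREE_TRIAL_CU_HOURS) -> float:
    """Hours of continuous runtime the quota covers for a given spec."""
    return quota / cores


if __name__ == "__main__":
    for cores in (8, 32):
        daily = cu_hours_used(cores, 24)
        hours = trial_duration_hours(cores)
        print(f"{cores}C: {daily:.0f} CU-hours per day, "
              f"quota lasts {hours:.2f} hours ({hours / 24:.1f} days)")
```

Under this assumption, an instance that runs continuously exhausts the quota four times faster at 32C than at 8C, so the smaller specification suits longer evaluations.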

Billing method

Hologres provides flexible billing methods to help you reduce costs.

  • Subscription (prepaid)

    Supports upgrading and downgrading specifications, so you can scale Hologres resources up or down according to business requirements.
  • Pay-as-you-go

    Billed hourly. One primary instance with multiple secondary instances can be used to ensure load isolation.
  • Storage and compute resource plans

    Compute and storage fees are deducted from the plan according to usage, at a lower cost than pay-as-you-go.
Security Compliance

Hologres has passed an independent third-party audit of Alibaba Cloud's description of its controls against the security, availability, and confidentiality principles of the AICPA Trust Services Criteria, and has obtained PCI DSS certification. PCI DSS is currently the most stringent and highest-level financial data security standard in the world.

  •  Data security

    • Storage and transmission encryption: Storage supports visible, controllable, semi-managed encryption (BYOK), and separate encryption rules can be set for each table. SSL can be enabled to encrypt network connections at the transport layer.

    • Data masking: Masking is supported at the column level, with masking policies set for specified users. Multiple masking rules are supported, such as IP address masking, email address masking, and hash masking.

  •  System security

    • Permission management: Alibaba Cloud RAM authentication is supported, with an AccessKey created for identity authentication. Multiple permission models are supported, including the simple permission model, the expert permission model, and the schema-level simple permission model.

    • Operation audit: Through the Alibaba Cloud ActionTrail console, OpenAPI, developer tools, and other channels, you can query instance operation events from the past 90 days. Query log information is also provided.

  •  Network security

    • Access isolation: The classic network, VPC, and public network of each instance are isolated from one another, and each can be accessed only through its corresponding endpoint and internal virtual IP address (VIP).

    • IP whitelist: On top of access authentication, when the whitelist feature is enabled, only devices on the whitelist can access Hologres instances. Devices not on the whitelist are denied access even if they pass authentication.
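
To make two of the controls listed above more concrete, the following Python sketch illustrates the kinds of masking rules named under data security (IP address, email address, hash) and a basic IP whitelist check. It is a conceptual illustration only: the function names, the output formats, and the use of SHA-256 are assumptions chosen for the example, not the rules or formats that Hologres actually applies.

```python
import hashlib
import ipaddress


def mask_ip(ip: str) -> str:
    """Keep the first two octets of an IPv4 address and hide the rest."""
    parts = ip.split(".")
    if len(parts) != 4:
        raise ValueError(f"not an IPv4 address: {ip!r}")
    return ".".join(parts[:2] + ["*", "*"])


def mask_email(email: str) -> str:
    """Keep the first character of the local part and the full domain."""
    local, sep, domain = email.partition("@")
    if not sep or not local:
        raise ValueError(f"not an email address: {email!r}")
    return f"{local[0]}***@{domain}"


def mask_hash(value: str, salt: str = "") -> str:
    """Replace a value with the hex SHA-256 digest of salt + value."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()


def is_whitelisted(client_ip: str, whitelist: list[str]) -> bool:
    """Return True if client_ip matches any address or CIDR block listed."""
    addr = ipaddress.ip_address(client_ip)
    return any(
        addr in ipaddress.ip_network(entry, strict=False)
        for entry in whitelist
    )


if __name__ == "__main__":
    print(mask_ip("192.168.1.10"))
    print(mask_email("alice@example.com"))
    print(mask_hash("alice@example.com")[:16])
    print(is_whitelisted("10.1.2.3", ["10.0.0.0/8", "192.168.1.10"]))
```

Hash masking keeps equal inputs equal after masking, so masked columns can still be grouped or joined, while the IP and email rules trade that property for partial readability.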

Customer Stories
Frequently asked questions
Q: Should a real-time data warehouse use the Lambda or the Kappa architecture?
A: In the Lambda architecture, storage is fragmented, which leads to inconsistent data and metric definitions, while the Kappa architecture cannot meet the need for frequent data corrections and updates. Hologres proposes the HSAP (Hybrid Serving and Analytical Processing) architecture, which unifies offline and real-time data analysis and serving.
Q: How does a streaming warehouse ensure real-time performance?
A: Hologres combined with Flink can directly replace a Flink + Kafka setup, delivering high-throughput real-time writes and updates of 1 billion+ per second and solving the layering problem of real-time data warehouses.
Q: How do you tune performance with Hologres?
A: Hologres lets you optimize steps such as table design and data queries. In practice, Alimama reduced the time spent on analyzing data for 600 million people by 72%.
Q: How does Hologres optimize query performance for semi-structured data?
A: Hologres upgraded JSONB to columnar storage, which improved query performance by 400%+, reduced storage by 45%, and saved thousands of cores (an estimated cost saving of millions of yuan) in Taobao's Double 11 search scenario.
Q: How can Hologres be used for self-service diagnostics and O&M?
A: Hologres exposes worker-level monitoring metrics that help businesses locate problems more precisely and check resource usage, improving the overall availability of the system.
Q: How do you troubleshoot OOM problems in Hologres?
A: OOM problems usually occur during queries, data import and export, and similar scenarios. The main cause is excessive memory consumption. Hologres offers several ways to progressively reduce high memory usage.
Community
Experiments and courses
Technical exchange