Big Data Basic Suite (Baidu Luban) Solution
Building on years of technical experience and a commitment to careful craftsmanship, Baidu has launched the big data basic suite (Baidu Luban) to help enterprises step into a new era of data intelligence and artificial intelligence.
Solution overview
Baidu Luban is Baidu's basic suite for big data analysis and processing. It mainly provides data warehouse, log analysis, and data mining services, together with the basic services that support them. Underneath, Baidu Luban relies on infrastructure services made up of compute and storage, so it can run in private cloud, public cloud, and hybrid cloud environments. An all-in-one appliance option is also available.
Solution components
Baidu Big Data Transmission Minos
Minos provides a general-purpose big data transmission service and handles data transfer between heterogeneous media such as MySQL, HDFS, Kafka, and LocalFileSystem. Business teams can use Minos to collect this data for offline and online analysis and so get the full value out of their data.
Baidu Offline Processing Pingo
Pingo is a unified batch and streaming data processing system. It runs an optimized Spark computing engine on top of an elastic compute resource management layer and an optimized data access management layer. It provides SQL analysis and DataFrame APIs, supports low-latency streaming data processing, and exposes a REST service interface for task execution.
Baidu Data Warehouse Palo
Palo is a fully managed, petabyte-scale data warehouse service built on an MPP architecture. It provides multi-dimensional analysis and report queries over large data sets at a lower cost. Once structured data has been loaded into Palo through ETL, industry-leading BI tools can be used for real-time analysis and visualization to explore the value of the data.
Baidu Elasticsearch
Baidu Elasticsearch is a managed service for Elasticsearch, the open source full-text search and analytics engine. It automates operations, maintenance, and tuning, which reduces the cost of running your own infrastructure. It is fully compatible with the open source interfaces, so existing workloads can be migrated at no cost.
Baidu Online Computing IntelliS
IntelliS is a one-stop online intelligent computing platform for big data. It helps users quickly build model prediction services, online feature computation, online data services, and more. It supports data sources such as HDFS, the key-value database SimpleDB, and the graph database GraphDB, and covers the whole path from the raw data source to online cluster queries.
Baidu Big Data Visualization Habo
Habo is a one-stop business intelligence visualization platform for big data. It combines multi-user collaboration, data exploration, interactive visual analysis, and flexible job scheduling to help users find the business value hidden in massive data sets faster.
Scenario solutions
Data warehouse
This solution provides one-stop big data platform services for data storage, processing, analysis, and visualization. It embeds the Apache Spark computing framework and Impala's high-performance MPP query engine. Users can access, process, explore, and visualize data in a simple way, without going through a tedious ETL/ELT process.
Log analysis
This solution provides complete log analysis, covering log collection, storage, analysis, and visual presentation. It quickly converts log data stored in different systems into searchable events, and helps enterprises handle operations monitoring, risk alerting, security auditing, and troubleshooting in one unified system.
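The idea of turning raw log lines into searchable events can be sketched in a few lines of Python. This is a minimal illustration only, not the Baidu Luban implementation: the log format, the field names, and the helper functions `parse_line`, `build_index`, and `search` are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Hypothetical log format: "<date> <time> <LEVEL> <service>: <message>"
LOG_PATTERN = re.compile(
    r"(?P<timestamp>\S+ \S+) (?P<level>[A-Z]+) (?P<service>[\w-]+): (?P<message>.*)"
)


def parse_line(line):
    """Turn one raw log line into a structured event, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None


def build_index(events):
    """Build an inverted index: lowercase message token -> positions of matching events."""
    index = defaultdict(set)
    for pos, event in enumerate(events):
        for token in re.findall(r"\w+", event["message"].lower()):
            index[token].add(pos)
    return index


def search(events, index, term, level=None):
    """Return events whose message contains the term, optionally filtered by level."""
    hits = sorted(index.get(term.lower(), ()))
    return [events[i] for i in hits if level is None or events[i]["level"] == level]


# Made-up sample lines, as if collected from different machines.
RAW_LOGS = [
    "2024-01-01 10:00:00 INFO web-1: request served in 12 ms",
    "2024-01-01 10:00:05 ERROR db-1: connection timeout to replica",
    "2024-01-01 10:00:09 ERROR web-2: upstream timeout after 3 retries",
    "not a structured line",
]

events = [e for e in map(parse_line, RAW_LOGS) if e is not None]
index = build_index(events)
timeouts = search(events, index, "timeout", level="ERROR")
for event in timeouts:
    print(event["timestamp"], event["service"], event["message"])
```

A production system would typically delegate indexing and querying to a search engine such as Elasticsearch; the in-memory index here only shows the shape of the data flow.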
Data mining
This solution has built-in data preprocessing and feature engineering, and supports a rich set of advanced algorithms and frameworks for machine learning and deep learning. It integrates knowledge graphs, user profiles, personalized recommendation, and other functions, and covers the whole process of model training, prediction, and deployment, providing a one-stop, interactive, and visual data mining solution.
Application scenarios
Report platform construction
Habo supports a variety of underlying data sources, offers many kinds of visual charts, and can turn data sources into reports, so users can quickly build a report platform.
Text retrieval
Text retrieval lets enterprises add search over unstructured data quickly and easily. Baidu provides both privately deployed and cloud-based text retrieval solutions to guide enterprises in building their own text retrieval systems.
Real-time feature computation
With IntelliS, enterprises can easily run online computations such as feature extraction and joins. Teams can focus on developing the computing logic, save on deployment, operations, and service costs, and reach their business goals faster.
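As a rough illustration of what online feature extraction with a join looks like, the sketch below enriches each incoming event with a profile lookup and a sliding-window count. It does not use the IntelliS API: the class `OnlineFeatureStore`, the profile table, the event fields, and the 60-second window are all hypothetical, and events are assumed to arrive in timestamp order for each user.

```python
from collections import defaultdict, deque

# Hypothetical static profile table, standing in for a key-value store lookup.
USER_PROFILES = {"u1": {"tier": "gold"}, "u2": {"tier": "basic"}}


class OnlineFeatureStore:
    """Joins each event with a user profile and keeps a sliding-window event count."""

    def __init__(self, profiles, window_seconds=60):
        self.profiles = profiles
        self.window_seconds = window_seconds
        self.recent = defaultdict(deque)  # user_id -> timestamps inside the window

    def process(self, event):
        user_id, ts = event["user_id"], event["ts"]
        window = self.recent[user_id]
        window.append(ts)
        # Evict timestamps that have fallen out of the window (ts - window, ts].
        while window and window[0] <= ts - self.window_seconds:
            window.popleft()
        # Join step: look up the profile; unknown users get a default tier.
        profile = self.profiles.get(user_id, {})
        return {
            "user_id": user_id,
            "tier": profile.get("tier", "unknown"),
            "events_last_window": len(window),
        }


store = OnlineFeatureStore(USER_PROFILES, window_seconds=60)
stream = [
    {"user_id": "u1", "ts": 0},
    {"user_id": "u2", "ts": 10},
    {"user_id": "u3", "ts": 20},
    {"user_id": "u1", "ts": 30},
    {"user_id": "u1", "ts": 70},
]
features = [store.process(event) for event in stream]
for row in features:
    print(row)
```

Keeping the window state per user in a deque makes each update cheap, which is the property an online feature service needs; a real deployment would also have to handle out-of-order events and state persistence.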
Online model prediction
IntelliS includes a variety of built-in deep learning models and a convenient version management mechanism, so enterprises can quickly launch online prediction services, iterate on versions efficiently, and realize business value sooner.
Database synchronization
With Minos, enterprises can easily transfer data between heterogeneous media and keep databases such as MySQL synchronized in real time with Hive or other MPP data warehouses.
Intelligent operation and maintenance
The log analysis solution centralizes logs scattered across thousands of machines and provides a powerful query language for locating faults quickly. Problems get solved faster, downtime is reduced, and service quality improves.
Business analysis
By collecting and analyzing user access logs and business process logs with the log analysis solution, enterprises can understand user behavior more clearly, get a fuller picture of business operations, build user and business models, and make data-driven decisions about business direction.
Model development
With the data mining solution, enterprises can select the features they need from the feature library, try different models, and review each model's evaluation results. Along the way, the solution supports frequently adding or removing features, adjusting model parameters, and comparing multiple versions, so that the best model can be selected in the end.
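The select, evaluate, and compare loop described above can be sketched as follows. This is only an assumed illustration, not the product's workflow: the feature names `signal` and `noise`, the tiny data set, and the nearest-centroid classifier are all made up to keep the example self-contained.

```python
from itertools import combinations


def fit_centroids(rows, labels):
    """Nearest-centroid 'training': average the feature vectors of each class."""
    sums, counts = {}, {}
    for row, label in zip(rows, labels):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, row):
    """Predict the class whose centroid is closest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda label: sum((c - x) ** 2 for c, x in zip(centroids[label], row)),
    )


def evaluate(feature_names, train, validation):
    """Train on the chosen features and return accuracy on the validation set."""
    def project(data):
        return [[record[name] for name in feature_names] for record, _ in data]

    centroids = fit_centroids(project(train), [label for _, label in train])
    predictions = [predict(centroids, row) for row in project(validation)]
    correct = sum(p == label for p, (_, label) in zip(predictions, validation))
    return correct / len(validation)


# Made-up data: "signal" separates the classes, "noise" does not.
TRAIN = [
    ({"signal": 0.1, "noise": 5.0}, 0),
    ({"signal": 0.2, "noise": -4.0}, 0),
    ({"signal": 0.9, "noise": -5.0}, 1),
    ({"signal": 1.0, "noise": 4.0}, 1),
]
VALIDATION = [
    ({"signal": 0.15, "noise": -6.0}, 0),
    ({"signal": 0.95, "noise": 6.0}, 1),
    ({"signal": 0.3, "noise": 3.0}, 0),
    ({"signal": 0.8, "noise": -3.0}, 1),
]

FEATURE_LIBRARY = ["signal", "noise"]
results = {}
for size in range(1, len(FEATURE_LIBRARY) + 1):
    for subset in combinations(FEATURE_LIBRARY, size):
        results[subset] = evaluate(subset, TRAIN, VALIDATION)

best_features = max(results, key=results.get)
print("accuracy per feature set:", results)
print("selected feature set:", best_features)
```

Evaluating every candidate on the same held-out validation set is what makes the versions comparable; with a larger feature library an exhaustive search like this becomes too expensive and a greedy or guided search would be used instead.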