Research
Data Governance Innovation Lab adheres to the win-win development concept and welcomes cooperation with experts in academia and industry in the following research areas. For any queries, contact us at longjiang4@huawei.com.
-
Intelligent Data Value Exploration Platform
Traditional data analysis is driven by specific service requirements and covers data integration, governance, development, and analysis. The future is an era of data-driven innovation, in which mining data value and new service scenarios from massive data through uncertain, random data exploration will become the norm. We are therefore exploring an intelligent data exploration platform that supports random, information-rich exploration to help customers discover value.
-
Next-Generation Intelligent Data Lake Computing Mode Powered by Vector Calculation
AI factors such as feature vectorization, confidence, and probability pose new requirements on data computing and storage. Combining vector calculation with statistical analysis can guide exploration of the next generation of big data computing.
-
Intelligent Data Detection, Repair, Association, and Sampling
Intelligent data quality detection and repair, association, entity merging, sampling, and comprehensive profiling
-
Intelligent Data Asset Management Engine
Federated metadata management of data assets across public cloud, private cloud, and local data sources; tens of millions of metadata entries and their relationships, with millisecond-level query performance; unstructured metadata governance, with fuzzy retrieval and recommendation of images, video, and text; real-time data lake metadata system for unified metadata management of a big data cluster with more than 20,000 nodes
-
Intelligent Data Security Management Engine
Full-link security governance: algorithms for various GDPR-compliant data classification and masking scenarios, including data labeling and watermarking
-
Intelligent Data Quality Engine
Intelligent data quality algorithms: abnormal data detection and repair algorithm, entity merging algorithm, and data column association algorithm; accuracy and recall above 90% on all datasets; high-performance data quality engine: TB-level data quality checks in seconds, with distributed memory caching and automatic scaling
-
Intelligent Model-Driven Engine
Model-driven intelligent data pipeline construction and data asset generation
-
High-Performance Cross-Source Query Optimizer
Cross-region and cross-engine scheduling and optimization across multiple computing engines, such as Hive, Spark, HBase, and MySQL, improving performance by over 10 times compared with open-source Rheem and Calcite
-
Intelligent Hybrid Data Lake Scheduling Engine
Cross-region data resource scheduling, cross-public cloud and HCS hybrid cloud data resource scheduling, and AI operator scheduling; concurrent scheduling of millions of nodes during peak hours
-
Visualized Development Recommendation Engine Based on Machine Learning
Intelligent industry module recommendation on visualized screens: intelligent template recommendation based on users' industry background; smart assistance optimization on visualized screens: intelligent one-click optimization (intelligent color matching and layout) through machine learning; scenario-based visualized modeling and development platforms, such as 3D city and 3D campus, as well as device-edge-cloud big data input and visualized interaction and presentation
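
To make the "abnormal data detection and repair" topic listed under the Intelligent Data Quality Engine more concrete, the following is a minimal sketch of one common textbook approach: flagging outliers in a numeric column with a median/MAD-based modified z-score and repairing them by substituting the median. It is not the lab's algorithm. The function name `detect_and_repair`, the threshold of 3.5, and the sample values are all assumptions chosen for illustration.

```python
from statistics import median


def detect_and_repair(values, threshold=3.5):
    """Flag outliers by modified z-score and replace them with the median.

    Illustrative only: the name, threshold, and repair strategy are
    assumptions, not a description of any production engine.
    Returns (repaired_values, indices_of_flagged_values).
    """
    med = median(values)
    # Median absolute deviation: a spread estimate that is robust to outliers.
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # No measurable spread; nothing can be flagged with this method.
        return list(values), []

    repaired, flagged = [], []
    for i, v in enumerate(values):
        # 0.6745 rescales MAD so the score is comparable to a standard z-score.
        score = 0.6745 * abs(v - med) / mad
        if score > threshold:
            flagged.append(i)
            repaired.append(med)  # simple repair: substitute the column median
        else:
            repaired.append(v)
    return repaired, flagged


if __name__ == "__main__":
    column = [10.0, 11.0, 9.0, 10.5, 250.0, 9.5]
    repaired, flagged = detect_and_repair(column)
    print("flagged indices:", flagged)
    print("repaired column:", repaired)
```

The median and MAD are used instead of the mean and standard deviation because a single extreme value would otherwise inflate the spread estimate and hide itself. A real data quality engine would also need to handle non-numeric columns, missing values, and distributed execution, none of which this sketch covers.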