Benefits for Serverless Infrastructure

  • Independent Scaling

    Save money by scaling storage and compute resources independently of each other.
  • Auto Scaling

    Make sure you always have enough capacity on hand no matter what sort of traffic spikes come your way.
  • Pay-per-Use

    Pay only for what you actually use (scanned data volume/CUH packages). When no jobs are running, you pay nothing.
  • HA Dual-AZ Deployment

    Free yourself from the hassle of complicated O&M and upgrade operations while you enjoy high data availability with dual-AZ deployment.

Functions

  • Full SQL Compatibility

    You don't need to have a background in big data to do big data analysis. If you know SQL, you are good to go. The SQL syntax is fully compatible with the standard ANSI SQL 2003.
  • Serverless Spark/Flink/openLooKeng

    Seamlessly migrate your offline applications to the cloud with serverless technology. DLI is fully compatible with the Apache Spark, Apache Flink, and Presto ecosystems and APIs.
  • Cross-source Analysis

    Analyze your data across databases. No migration required. A unified view of your data gives you a comprehensive understanding of your data and helps you innovate faster. There are no restrictions on data formats, cloud data sources, or whether the database is created online or off.
  • Enterprise Multi-tenant

    Manage compute- or resource-related permissions by project or by user. Enjoy fine-grained control that makes it easy to maintain data independence for separate tasks.

Application Scenarios

  • Database Analysis

  • E-commerce

  • Gaming

  • Large Enterprises

  • Genetics

  • Finance

  • Government

  • Geography

Database Analysis

Application data (such as registration details) is stored in a relational database and needs analysis. 


Pain Points

• Complicated queries are not supported for larger relational databases.

• Comprehensive analysis is not possible because database and table partitions are spread across multiple relational databases.

• Business data analysis might overload available resources and impact business operations.

Advantages

A familiar SQL experience

Hit the ground running with new services. DLI supports standard ANSI SQL 2003 relational database syntax so there is almost no learning curve. 

The ultimate performance

Use DLI, powered by distributed in-memory computing models, to easily process massive data.

E-commerce

Precision Marketing

Information from multiple channels needs to be combined for associative analysis to improve the conversion rate.  

Advantages

Cross-source analysis

When advertisement CTR data is stored in OBS and user registration data in RDS, you can query and analyze both directly, with no need to migrate them to DLI.

Pure SQL

With multiple data sources interconnected, you can map them together by creating a table using just SQL statements.
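
As a rough illustration of the idea (this is not DLI syntax), the sketch below uses Python's built-in sqlite3 module with two separate in-memory databases standing in for the two data sources. The table names, columns, and sample rows are hypothetical. The point is only that a single SQL statement can join data held in two different stores without copying it first.

```python
import sqlite3

# Two separate in-memory databases stand in for two data sources.
# "main" plays the role of the ad click store (OBS in the text above);
# "reg" plays the role of the registration store (RDS in the text above).
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS reg")

# Hypothetical schemas and sample rows, for illustration only.
conn.execute("CREATE TABLE main.ad_clicks (user_id INTEGER, channel TEXT)")
conn.execute("CREATE TABLE reg.users (user_id INTEGER, registered INTEGER)")
conn.executemany(
    "INSERT INTO main.ad_clicks VALUES (?, ?)",
    [(1, "search"), (2, "search"), (3, "social"), (4, "social")],
)
conn.executemany(
    "INSERT INTO reg.users VALUES (?, ?)",
    [(1, 1), (2, 0), (3, 1), (4, 1)],
)

# One SQL statement joins both sources and computes a per-channel
# conversion rate (registered users divided by users who clicked).
query = """
    SELECT c.channel,
           COUNT(*) AS clicks,
           SUM(u.registered) AS registrations,
           ROUND(1.0 * SUM(u.registered) / COUNT(*), 2) AS conversion_rate
    FROM main.ad_clicks AS c
    JOIN reg.users AS u ON u.user_id = c.user_id
    GROUP BY c.channel
    ORDER BY c.channel
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

In a real deployment the two sources would be mapped through whatever table definitions the service provides; the join itself stays ordinary SQL.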

Gaming

Log Analysis

Running a gaming company calls for a quality data analysis platform to improve ad placements, new player retention, and operations, and to get better feedback for future game iterations.

Pain Points

• Log analysis is usually performed periodically. During the idle periods between tasks, resources are wasted.

Advantages

Pay-per-use billing

Reduce costs by more than 50% compared with purchasing exclusive clusters. We only bill you for the resources actually used for scheduling.
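
To see why periodic workloads benefit from pay-per-use billing, here is a minimal arithmetic sketch. Every number in it (the daily job hours, the billing period, and the hourly rate) is a hypothetical placeholder, not an actual DLI price, and both billing models are assumed to share the same hourly rate for simplicity.

```python
def pay_per_use_cost(busy_hours: float, rate_per_hour: float) -> float:
    """Cost when only the hours in which jobs actually run are billed."""
    return busy_hours * rate_per_hour


def exclusive_cluster_cost(total_hours: float, rate_per_hour: float) -> float:
    """Cost when a dedicated cluster is paid for around the clock."""
    return total_hours * rate_per_hour


# Hypothetical workload: log analysis runs 6 hours a day for 30 days.
HOURS_IN_PERIOD = 24 * 30   # 720 hours in the billing period
BUSY_HOURS = 6 * 30         # 180 hours of actual job time
RATE_PER_HOUR = 1.0         # hypothetical price unit per hour, same for both

on_demand = pay_per_use_cost(BUSY_HOURS, RATE_PER_HOUR)
dedicated = exclusive_cluster_cost(HOURS_IN_PERIOD, RATE_PER_HOUR)
savings = 1 - on_demand / dedicated

print(f"pay-per-use: {on_demand}, exclusive cluster: {dedicated}")
print(f"savings: {savings:.0%}")
```

With these made-up inputs the jobs occupy a quarter of the period, so the pay-per-use bill is a quarter of the always-on bill. The actual saving depends on how idle the workload is and on real pricing.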

Converged analysis

You only need a single copy of metadata for real-time cleaning, offline ETL processing, and interactive analysis. You can directly use the data processing result for data mining.

Large Enterprises

Log Analysis

If you run a large enterprise, different departments in the company may need to manage resources independently. You need fine-grained permissions management for data security and improved management efficiency.  

Advantages

Fine-grained permission management

Grant permissions by column or by specific operation, such as insert and overwrite, and control read and write permissions for the metadata.

Unified management

Use a single IAM account to manage multiple employees in your company.

Genetics

Gene Data Management

Third-party analysis libraries based on the Spark distributed framework, such as ADAM and Hail, are necessary for genome analysis.


Pain Points

• Advanced technical skills are required to install analysis libraries such as ADAM and Hail.

• Every time you create a cluster, you have to install these analysis libraries again.

Advantages

Custom images

You can package third-party analysis libraries, like ADAM and Hail, into custom images and upload them directly to the Software Repository for Container (SWR). DLI automatically pulls custom images from SWR when you use them.

Built-in base images

Huawei-enhanced Spark and Flink images in multiple versions and open-source AI images (TensorFlow/Keras/PyTorch) are built into DLI for your convenience.

Finance

Real-time Risk Control

Almost every aspect of financial services involves some sort of risk control. A comprehensive system is required.


Pain Points

• There is very little tolerance for excessive latency when it comes to risk control.

Advantages

High throughput/Low latency

You can perform real-time data analysis in DLI using the Apache Flink dataflow model. A single CPU can process 1,000 to 20,000 messages per second.
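
The per-CPU throughput range can be turned into a rough capacity estimate. The sketch below is only back-of-the-envelope arithmetic: the target message rate is a hypothetical example, and real throughput depends on message size and job logic.

```python
def cpus_needed(target_msgs_per_sec: int, per_cpu_msgs_per_sec: int) -> int:
    """Smallest whole number of CPUs that covers the target message rate."""
    if target_msgs_per_sec < 0 or per_cpu_msgs_per_sec <= 0:
        raise ValueError("rates must be positive")
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-target_msgs_per_sec // per_cpu_msgs_per_sec)


# Per-CPU throughput range quoted in the text above.
LOW_PER_CPU = 1_000
HIGH_PER_CPU = 20_000

# Hypothetical peak load for a risk-control stream.
TARGET_RATE = 150_000

pessimistic = cpus_needed(TARGET_RATE, LOW_PER_CPU)
optimistic = cpus_needed(TARGET_RATE, HIGH_PER_CPU)
print(f"CPUs needed: between {optimistic} and {pessimistic}")
```

The wide gap between the two bounds is a reminder to benchmark with representative messages before sizing a job.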

Cloud ecosystem

You can save real-time data streams to multiple cloud services such as CloudTable and SMN.

Government

Real-time Large Screen

With epidemics like COVID-19 raging across the globe, governments need to be able to monitor key data in real time on a large screen display, so they can manage epidemic control scientifically.


Pain Points

• Government employees do not necessarily have a background in big data. SQL is usually far more familiar.

Advantages

Query within milliseconds

Based on the powerful in-memory computing framework, the built-in openLooKeng engine optimizes query performance to achieve interactive analysis in milliseconds.

Easy to use

You only need SQL syntax for DLI queries. The syntax is fully compatible with standard ANSI SQL 2003.

Geography

Geographic Big Data Analysis

Geographic big data can involve massive volumes of data, for example petabytes of satellite imaging data. It also includes many different types of data, such as structured remote sensing image raster data, vector data, unstructured spatial location data, and 3D modeling data. An efficient mining tool is a must for geographic data analysis.

Advantages

Spatial data analysis operators

Spark spatial data analysis algorithm operators in DLI enable real-time stream processing and offline batch processing. You can import massive volumes of many data types into DLI, including structured remote sensing image data, unstructured 3D modeling data, and laser point cloud data.

CEP SQL

SQL statements are all that is needed for yaw detection and geo-fencing.

Big data processing

You can quickly migrate terabytes, or even exabytes, of remote sensing images to the cloud and slice the images into data sources for distributed batch processing.

Comparison Between DLI and Self-built Hadoop

  • Cost

    Data Lake Insight: Billing is based on the amount of data actually scanned or the CUHs used. Save up to 50% on costs.
    Self-built Hadoop system: High cost due to long-term resource occupation; resources are wasted.

  • Elastic scalability

    Data Lake Insight: Container-based Kubernetes, intelligent elastic scaling.
    Self-built Hadoop system: N/A

  • O&M availability

    Data Lake Insight: Out-of-the-box, serverless architecture, and cross-AZ DR.
    Self-built Hadoop system: Strong technical capabilities are required for configuration and O&M.

  • Learning cost

    Data Lake Insight: Low. The optimization parameters are standardized based on 10 years' experience in thousands of projects. In addition, DLI provides a GUI for intelligent optimization.
    Self-built Hadoop system: High. Hundreds of tuning parameters need to be learned.

  • Supported data sources

    Data Lake Insight: Cloud: OBS/RDS/DWS/CSS/MongoDB/Redis; on-premises: self-built database/MongoDB/Redis.
    Self-built Hadoop system: Cloud: OBS; on-premises: HDFS.

  • Ecosystem compatibility

    Data Lake Insight: Data Lake Visualization (DLV), Tableau, Yonghong BI, and Fanruan BI.
    Self-built Hadoop system: Big data ecosystem tools.

  • Custom image

    Data Lake Insight: Supported. Dependencies can be added as required to meet service diversity requirements.
    Self-built Hadoop system: Not supported.

  • Workflow scheduling

    Data Lake Insight: Scheduling through Data Lake Factory (DLF) in DAYU.
    Self-built Hadoop system: Self-built scheduling tools, such as Airflow.

  • Multiple enterprise-level tenants

    Data Lake Insight: Table-based permission management, with column-level permission granularity.
    Self-built Hadoop system: File-based permission management.

  • Performance

    Data Lake Insight: Higher performance with in-depth software and hardware optimization.
    Self-built Hadoop system: Performance is the same as that of open-source Hadoop versions.

Success Stories
Chengdu Longyuan Network works with HUAWEI CLOUD to query and analyze gaming data in an efficient manner. The analysis provides support for different departments launching new services. Data applications are integrated, benefitting the entire organization.