Public Images Supported by AI-accelerated ECSs

Table 2 Public images

Type                                       Series  Public Images
Enhanced AI inference-accelerated (type I)  Ai1s    Ubuntu Server 18.04 64bit, CentOS 7.6 64bit
AI inference-accelerated (type I)           Ai1     Ubuntu Server 16.04 64bit, CentOS 7.4 64bit

Enhanced AI Inference-accelerated
Manually Deploying a DeepSeek-R1 or DeepSeek-V3 Model Using SGLang and Docker on Multi-GPU
Model Name                Minimum Flavor   GPU                Nodes
DeepSeek-R1, DeepSeek-V3  p2s.16xlarge.8   V100 (32 GiB) × 8  8
                          p2v.16xlarge.8   V100 (16 GiB) × 8  16
                          pi2.4xlarge.4    T4 (16 GiB) × 8    16

Contact Huawei Cloud technical support to select GPU ECSs suitable for your deployment.
Flavor            vCPUs  Memory (GiB)  Max/Assured Bandwidth (Gbit/s)  Max PPS (10,000)  NIC Multi-Queue  Ascend 310  Ascend RAM (GiB)  Virtualization
kai1s.9xlarge.2   36     72            12/8                            200               8                6           12                KVM
kai1s.12xlarge.2  48     96            12/8                            200               16               6           12                KVM

Features
Ascend 310 processors, four of which fit in one Atlas 300I accelerator card
8 TFLOPS of half-precision (FP16) compute on one processor
16 TOPS of integer-precision (INT8) compute on one processor
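Assuming the per-processor figures above, the aggregate throughput of one Atlas 300I accelerator card (four Ascend 310 processors) works out as follows; this is a back-of-the-envelope sketch, not a benchmarked figure:

```javascript
// Per-processor throughput of the Ascend 310, from the table above.
const fp16TeraFlopsPerProcessor = 8;  // half-precision (FP16)
const int8TeraOpsPerProcessor = 16;   // integer precision (INT8)
const processorsPerCard = 4;          // Ascend 310 chips per Atlas 300I card

// Aggregate throughput of a single Atlas 300I accelerator card.
const cardFp16 = fp16TeraFlopsPerProcessor * processorsPerCard; // 32 TFLOPS
const cardInt8 = int8TeraOpsPerProcessor * processorsPerCard;   // 64 TOPS

console.log(`Atlas 300I: ${cardFp16} TFLOPS FP16, ${cardInt8} TOPS INT8`);
```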
General-purpose web server: By default, inbound traffic is allowed on the SSH (22), RDP (3389), HTTP (80), and HTTPS (443) ports, as well as ICMP (all).
All ports open: Inbound traffic is allowed on any port. This poses security risks, so exercise caution when selecting this template.
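To make the template's behavior concrete, here is a minimal sketch that models the general-purpose web server template as a rule list and checks whether a given inbound flow would be admitted. The rule shape and helper name are this sketch's own, not a Huawei Cloud API:

```javascript
// Illustrative only: the "general-purpose web server" template expressed as
// a rule list. A null port means the rule matches any port.
const webServerTemplate = [
  { protocol: 'tcp', port: 22 },    // SSH
  { protocol: 'tcp', port: 3389 },  // RDP
  { protocol: 'tcp', port: 80 },    // HTTP
  { protocol: 'tcp', port: 443 },   // HTTPS
  { protocol: 'icmp', port: null }, // ICMP (all)
];

// Inbound traffic is admitted if any rule matches its protocol and port.
function isInboundAllowed(rules, protocol, port) {
  return rules.some(r =>
    r.protocol === protocol && (r.port === null || r.port === port));
}

console.log(isInboundAllowed(webServerTemplate, 'tcp', 443));  // true
console.log(isInboundAllowed(webServerTemplate, 'tcp', 8080)); // false
```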
Modify the file as follows:

const http = require('http');

const hostname = '0.0.0.0';
const port = 3000;

const server = http.createServer((req, res) => {
  res.statusCode = 200;
  res.setHeader('Content-Type', 'text/plain');
  res.end('Hello World\n');
});

server.listen(port, hostname, () => {
  console.log(`Server running at http://${hostname}:${port}/`);
});
Deploying a Quantized DeepSeek Model with Ollama on a Single Server (Linux)

Scenarios

Quantization reduces the precision of model parameters, for example by converting 32-bit floating-point numbers into 8-bit or 4-bit integers, trading some accuracy for a smaller, faster model.
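As a minimal sketch of the idea, the snippet below applies symmetric linear quantization to int8: the largest absolute weight is mapped to 127 and every other weight is rounded to the nearest step. This is one common textbook scheme, not necessarily the one a given DeepSeek quantization uses:

```javascript
// Symmetric linear quantization of float weights to int8.
// The scale maps the largest absolute weight to 127.
function quantizeInt8(weights) {
  const scale = Math.max(...weights.map(Math.abs)) / 127;
  const q = weights.map(w => Math.round(w / scale)); // values in [-127, 127]
  return { q, scale };
}

// Dequantize to recover approximate float values.
function dequantize(q, scale) {
  return q.map(v => v * scale);
}

const weights = [0.1, -0.5, 0.25, 1.0];
const { q, scale } = quantizeInt8(weights);
console.log(q); // [13, -63, 32, 127]
console.log(dequantize(q, scale)); // values close to the originals
```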
Deploying a Distilled DeepSeek Model with Ollama on a Single Server (Linux)

Scenarios

Distillation is a technique that transfers the knowledge of a large pre-trained model (the teacher) into a smaller model (the student).
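A minimal sketch of the core mechanism: the teacher's temperature-softened output distribution serves as a soft label, and the student is trained to match it, here via cross-entropy. This is one common formulation, not the exact recipe used to distill DeepSeek models:

```javascript
// Softmax with temperature T: higher T produces softer probabilities.
function softmax(logits, T) {
  const exps = logits.map(z => Math.exp(z / T));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map(e => e / sum);
}

// Distillation loss: cross-entropy of the student's softened distribution
// against the teacher's softened distribution.
function distillLoss(teacherLogits, studentLogits, T) {
  const p = softmax(teacherLogits, T); // teacher "soft labels"
  const q = softmax(studentLogits, T);
  return -p.reduce((acc, pi, i) => acc + pi * Math.log(q[i]), 0);
}

const teacher = [4.0, 1.0, 0.2];
// The loss is minimized when the student matches the teacher exactly.
const identical = distillLoss(teacher, teacher, 2.0);
const diverged = distillLoss(teacher, [0.2, 1.0, 4.0], 2.0);
console.log(identical < diverged); // true
```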
Such an ECS functions as the client (TX end) or server (RX end) in netperf tests. Auxiliary ECS: an ECS used to exchange test data with the tested ECS. The auxiliary ECS also functions as the client (TX end) or server (RX end) in netperf tests.
Scenarios

System disk: When a server is created, its system disk is automatically initialized with a Master Boot Record (MBR).

New data disk: If a data disk is created together with a server, EVS automatically attaches it to the server.
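The MBR mentioned above occupies the first 512-byte sector of the disk and ends with the boot signature 0x55 0xAA. A minimal sketch of recognizing that signature, operating on an in-memory buffer rather than a real device:

```javascript
// Build a blank 512-byte "sector 0" and stamp the MBR boot signature,
// mirroring what initializing a disk with MBR writes at offset 510.
const SECTOR_SIZE = 512;
const sector0 = Buffer.alloc(SECTOR_SIZE);
sector0[510] = 0x55;
sector0[511] = 0xaa;

// A disk is MBR-initialized if sector 0 ends with the 0x55AA signature.
function hasMbrSignature(sector) {
  return sector[510] === 0x55 && sector[511] === 0xaa;
}

console.log(hasMbrSignature(sector0));           // true
console.log(hasMbrSignature(Buffer.alloc(512))); // false
```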