For example, you can choose a cluster management scale of 50 or 200 nodes when creating the cluster. The flavors of the master nodes are influenced by the cluster scale. Higher cluster scales require higher flavors of the master nodes.
BMS nodes can be created in a CCE cluster (when the tunnel network model is used). Ascend-accelerated nodes (powered by HiSilicon Ascend 310 AI processors) apply to scenarios such as image recognition, video processing, inference computing, and machine learning.
Configuring Cluster Logs
Function: This API is used to select the master node components whose logs are reported to LTS.
Calling Method: For details, see Calling APIs.
Master Node AZs You can check how many master nodes are supported in a cluster. To check data such as the resource usage of the master nodes, click View Monitoring in the upper right corner to go to the Monitoring (Monitoring Center) page.
Pod IP Addresses Reserved for Each Node (supported by clusters that use VPC networks): The number of container IP addresses that can be allocated to each node (alpha.cce/fixPoolMask). This value is set during cluster creation.
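As a rough illustration of how a mask value translates into a per-node address count, the sketch below assumes that alpha.cce/fixPoolMask is an IPv4 prefix length (an assumption not stated in the snippet above). The function name pod_ips_per_node is hypothetical and is not part of any CCE API.

```python
def pod_ips_per_node(mask_bits: int) -> int:
    """Return how many IPv4 addresses a block with the given prefix length holds.

    Assumption: the per-node pool is a single IPv4 block whose size is
    determined only by the prefix length (for example, 25 for a /25 block).
    """
    if not 0 <= mask_bits <= 32:
        raise ValueError("an IPv4 prefix length must be between 0 and 32")
    return 2 ** (32 - mask_bits)


if __name__ == "__main__":
    # Print the block size for a few common prefix lengths.
    for bits in (24, 25, 26, 27):
        print(f"/{bits}: {pod_ips_per_node(bits)} addresses")
```

The number of pods that can actually run on a node may be lower than this figure, since some addresses in a block can be reserved.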
Number of master nodes: For example, a non-HA cluster (with one master node) cannot be changed to an HA cluster (with three master nodes).
Locate the row that contains the security group (starting with {CCE cluster name}-cce-control) of the master node and click Manage Rules in the Operation column.
Scheduling Policy (Affinity/Anti-affinity) Negative example: For application A, nodes 1 and 2 are set as affinity nodes, and nodes 3 and 4 are set as anti-affinity nodes. Application A exposes a Service through the ELB, and the ELB listens to node 1 and node 2.
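The affinity and anti-affinity settings in this negative example can be expressed as a Kubernetes nodeAffinity stanza. The sketch below builds that stanza as a plain Python dictionary. The node names (node-1 to node-4) and the helper node_affinity are hypothetical, and using the kubernetes.io/hostname label to select nodes is an assumption about how the nodes are identified.

```python
def node_affinity(affinity_nodes, anti_affinity_nodes):
    """Build a nodeAffinity stanza that requires some nodes and excludes others."""
    expressions = []
    if affinity_nodes:
        # "In" restricts scheduling to the listed nodes.
        expressions.append({
            "key": "kubernetes.io/hostname",
            "operator": "In",
            "values": list(affinity_nodes),
        })
    if anti_affinity_nodes:
        # "NotIn" keeps the workload off the listed nodes.
        expressions.append({
            "key": "kubernetes.io/hostname",
            "operator": "NotIn",
            "values": list(anti_affinity_nodes),
        })
    return {
        "nodeAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": {
                "nodeSelectorTerms": [{"matchExpressions": expressions}],
            }
        }
    }


if __name__ == "__main__":
    import json

    # Application A: nodes 1 and 2 are affinity nodes, nodes 3 and 4 are anti-affinity nodes.
    stanza = node_affinity(["node-1", "node-2"], ["node-3", "node-4"])
    print(json.dumps(stanza, indent=2))
```

Expressions inside one matchExpressions list are combined with AND, so a pod with this stanza can only be scheduled onto node-1 or node-2.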
# java -version
openjdk version "1.8.0_382"
OpenJDK Runtime Environment (build 1.8.0_382-b05)
OpenJDK 64-Bit Server VM (build 25.382-b05, mixed mode)
Add environment variables.
With Kubeflow 1.0, you first develop a model using Jupyter, and then set up containers using tools such as Fairing (SDK). Next, you create Kubernetes resources to train the model. After the training is complete, you create and deploy servers for inference using KFServing.
GPU) Exceptions Nodes' System Parameters Residual Package Version Data Node Commands Node Swap NGINX Ingress Controller Upgrade of Cloud Native Cluster Monitoring containerd Pod Restart Risks Key CCE AI Suite (NVIDIA GPU) Parameters GPU or NPU Pod Rebuild Risks ELB Listener Access
Table 2 Supported GPU drivers
Columns: GPU Model; Supported Cluster Type; Specification; OS (Huawei Cloud EulerOS 2.0 (GPU Virtualization Supported), Ubuntu 22.04, CentOS Linux release 7.6, EulerOS release 2.9, EulerOS release 2.5, Ubuntu 18.04 (EOM), EulerOS release 2.3 (EOM))
Tesla T4; CCE standard
Support for Kubernetes pods (a group of containers). Cross-AZ deployment of master nodes. Automatic bare-metal server deployment. Support for SFS Turbo.
2018-10-18 Changes: Kubernetes resource quota management. Users can create Services using YAML files.
Changing the node IP address. Impact: The master node will be unavailable. Solution: Change the IP address back to the original one.
Modifying parameters of core components (such as etcd, kube-apiserver, and docker). Impact: The master node may be unavailable.
the configuration of CCE AI Suite (NVIDIA GPU) in a cluster has been intrusively modified.
Configuring Intra-VPC Access This section describes how to access an intranet from a container (outside the cluster in a VPC), including intra-VPC access and cross-VPC access.
is used to transmit VXLAN packets in container networking (involved when the container tunnel network model is used).
Dynamic port (30000-32767), TCP: Listening port of kube-proxy for layer-4 load balancing.
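When auditing security group rules, it can help to test whether a given port falls inside the dynamic range listed above. The following is a minimal sketch. The names NODE_PORT_RANGE and is_dynamic_port are hypothetical, and the bounds are taken directly from the 30000-32767 range in the text.

```python
# Dynamic port range from the text; range() excludes its upper bound, hence 32768.
NODE_PORT_RANGE = range(30000, 32768)


def is_dynamic_port(port: int) -> bool:
    """Return True if the port lies in the 30000-32767 dynamic range."""
    return port in NODE_PORT_RANGE


if __name__ == "__main__":
    for candidate in (5443, 30000, 31234, 32767, 32768):
        print(candidate, is_dynamic_port(candidate))
```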
You are advised to harden the inbound rule of port 5443 for the master node security group. For details, see How Can I Configure a Security Group Rule in a Cluster? This operation will restart kube-apiserver and update the cluster access certificate (kubeconfig).
Failed to delete the server group of master (Major): Check whether the master node (ECS) is deleted from the cluster.
Failed to delete the virtual IP for the master (Major): Check whether the virtual IP address is deleted from the cluster.