Leading Cloud Native | HUAWEI CLOUD Volcano Batch Computing System Officially Becomes a CNCF Project
Apr 24, 2020
On April 10 (Beijing time), the Cloud Native Computing Foundation (CNCF) officially announced that HUAWEI CLOUD Volcano (https://github.com/volcano-sh/volcano) would become the first container batch computing project of CNCF. CNCF stated that with Volcano, its cloud native landscape extends to batch computing domains such as AI, big data, and genomic sequencing. Huawei has created a foundation for cloud native batch computing platforms.
"HUAWEI CLOUD has been dedicated to promoting cloud native technologies for many years. We were the first in China to launch commercial container products like Cloud Container Engine (CCE) and Cloud Container Instance (CCI) using the Kubernetes cloud native container technology. In addition, HUAWEI CLOUD has initiated and led multiple ecosystem projects in the Kubernetes community, making it easier for enterprises to adopt and profit from cloud native technologies," said Zhang Yuxin, CTO of HUAWEI CLOUD. Zhang continued, "Volcano is a cloud native batch computing engine based on Kubernetes. With Huawei's profound service experience in AI and big data, Volcano can overcome the shortcomings of Kubernetes in scheduling and orchestrating batch computing tasks for AI, big data, and high-performance computing scenarios. By using multi-architecture computing that includes Kunpeng, Ascend, and x86, Volcano enables common industry computing frameworks, such as TensorFlow, Spark, and Huawei MindSpore, to work more efficiently. Volcano delivers more efficient computing and the ultimate experience to data scientists and algorithm engineers."
Introduction to Volcano
As Kubernetes matures, more and more enterprises are using it as next-generation infrastructure for AI, big data, and high-performance batch computing. Kubernetes provides application consistency, convenient cross-cloud migration, and flexible task scheduling, which make it popular in these scenarios.
However, as a universal container-based solution, Kubernetes is sometimes not the best choice for specific domains such as big data, AI, and high-performance batch computing. For example:
- Kubernetes native scheduling cannot meet batch computing requirements.
- Kubernetes cannot manage complex AI training jobs.
- There are no advanced data management functions, such as data caching, on the compute side, and there is no data location awareness.
- Kubernetes does not support time-based resource sharing, so resource utilization is low.
- Kubernetes does not support heterogeneous hardware very well.
To address these problems, the HUAWEI CLOUD Container Team has launched a Kubernetes-native batch computing solution. In addition, HUAWEI CLOUD open-sourced Volcano, the core engine of the solution, in 2019 to promote the use of cloud native technologies in various industries. HUAWEI CLOUD has also optimized Volcano's scheduling, job management, data management, and resource management. Volcano provides:
- Enhanced task scheduling capabilities, such as fair-share and gang scheduling
- Optimized job management, for example, multiple pod templates and a more flexible error handling mechanism
- Support for data caching on the compute side to improve data transmission and read efficiency
- A multi-dimensional comprehensive scoring mechanism to manage and allocate resources efficiently
- Support for multi-architecture computing, so x86, Kunpeng, and Ascend can all work together
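To make the gang scheduling idea concrete, the following is a minimal Python sketch of all-or-nothing placement: a job starts only if every pod in its gang can be placed at once, so a partially placed job never holds resources that other jobs could use. This is an illustration only and is not Volcano's implementation; the names Job, min_available, cpu_per_pod, and gang_schedule are assumptions made for this example, and the first-fit node choice is a simplification.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    min_available: int  # hypothetical: number of pods that must start together
    cpu_per_pod: int    # hypothetical: CPU units requested by each pod


def gang_schedule(jobs, free_cpu_per_node):
    """Place each job only if all of its pods fit at the same time."""
    free = dict(free_cpu_per_node)
    placements = {}
    for job in jobs:
        trial = dict(free)  # work on a copy so a failed attempt leaves no trace
        assigned = []
        for _ in range(job.min_available):
            # First-fit: pick the first node with enough free CPU.
            node = next(
                (n for n, cpu in trial.items() if cpu >= job.cpu_per_pod), None
            )
            if node is None:
                break
            trial[node] -= job.cpu_per_pod
            assigned.append(node)
        if len(assigned) == job.min_available:
            free = trial                    # commit the whole gang
            placements[job.name] = assigned
        else:
            placements[job.name] = []       # reject the whole gang
    return placements


if __name__ == "__main__":
    nodes = {"n1": 4, "n2": 4}
    jobs = [Job("train", 3, 2), Job("big", 2, 2), Job("small", 1, 2)]
    print(gang_schedule(jobs, nodes))
```

In this example the job "big" is rejected as a whole because only one of its two pods would fit, which leaves that capacity free for "small" instead of stranding it in a job that cannot make progress.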
Volcano has completed official integration with multiple common computing frameworks, including Kubeflow, Spark, PaddlePaddle, Horovod (MPI), Cromwell, and MindSpore, to address different service scenarios.
"Volcano makes up for the shortcomings of Kubernetes in AI scenarios and provides better support for interconnecting PaddlePaddle's distributed deep learning with Kubernetes. The PaddlePaddle on Volcano solution significantly simplifies the deployment of our recommendation system solution ElasticCTR. We are looking forward to a more mature and complete open-source deployment solution using Kubernetes+Volcano+PaddlePaddle to bring more convenience to AI developers."
-By Yu Dianhai, Chief Architect of PaddlePaddle
"MindSpore, open-sourced by Huawei, is a deep learning training and inference framework that supports all device-edge-cloud scenarios. It is mainly used in AI scenarios such as computer vision and natural language processing to provide data scientists and algorithm engineers with an efficient, design-friendly development experience. MindSpore natively supports Ascend AI processors and optimizes software and hardware synergy. Volcano improves the Kubernetes scheduling capability for AI tasks, which facilitates the deployment of deep learning frameworks such as MindSpore and creates a solid foundation for AI and cloud native domains to jointly create a prosperous open-source ecosystem."
-By Professor Chen Lei, Chairman of the MindSpore Community Technology Committee and Chief Scientist of Huawei MindSpore
Volcano has received much attention and support since it was open-sourced in June 2019. More than 80 major developers from 15 large-scale enterprises and institutions have participated in and contributed to community development.
Currently, Volcano is being used commercially in a HUAWEI CLOUD Kubernetes-native batch computing solution and applied to computing scenarios such as AI, big data, and genomic sequencing by multiple industry-leading enterprises in China and around the world.
Volcano can provision up to 1,000 containers per second and provides advanced batch scheduling functions such as fair-share and gang scheduling. In addition, Volcano can be combined with Huawei Kunpeng and Ascend chips to build a high-performance, cost-effective container batch computing solution.
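As a rough illustration of what fair-share scheduling means, the sketch below splits a fixed capacity among competing queues using max-min fairness: each queue receives an equal share, and any share a queue does not need is redistributed to the queues that still want more. This is a generic textbook technique shown under simplifying assumptions (a single resource, equal weights) and is not a description of Volcano's own algorithm; the function name max_min_fair_share and the queue names are hypothetical.

```python
def max_min_fair_share(capacity, demands):
    """Split capacity among queues by max-min fairness (one resource, equal weights)."""
    allocation = {queue: 0.0 for queue in demands}
    remaining = float(capacity)
    unsatisfied = {q for q, d in demands.items() if d > 0}
    while unsatisfied and remaining > 1e-9:
        share = remaining / len(unsatisfied)  # equal split of what is left
        for queue in sorted(unsatisfied):
            # Never give a queue more than it still needs.
            grant = min(share, demands[queue] - allocation[queue])
            allocation[queue] += grant
            remaining -= grant
        # Queues whose demand is met drop out; their unused share is
        # redistributed among the rest in the next round.
        unsatisfied = {
            q for q in unsatisfied if demands[q] - allocation[q] > 1e-9
        }
    return allocation


if __name__ == "__main__":
    print(max_min_fair_share(12, {"a": 2, "b": 5, "c": 10}))
```

With a capacity of 12 and demands of 2, 5, and 10, the small queue is fully served and the two larger queues split the rest evenly.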
CNCF's endorsement of Volcano as its only container batch computing project in the cloud native field will give Volcano ecosystem development a tremendous boost as cooperation along the industry chain attracts more cloud native enterprise users. Volcano will play an increasingly important role in enterprise digitalization and cloud native transformation. HUAWEI CLOUD will continue to innovate in the cloud native field and help the ecosystem prosper, accelerating the intelligent development of various industries.