Yang Xiaohu began teaching in the College of Computer Science and Technology in 1994 and has served as Deputy Director of the Software Institute, Vice President of the Computer Software Institute, and Deputy Director of the Internet Finance Research Institute of Zhejiang University. His research interests over the decades have spanned software engineering, FinTech, and cloud computing. Established in 2001, the Daofu Technology Center of Zhejiang University has become a model of successful cooperation in international research and development. In recent years, Yang has orchestrated research efforts in open source technologies for cloud computing. His teams have contributed more than 10 million lines of source code to key open source projects, including Kubernetes and Docker. Under Yang's leadership, the center has gained great influence in open source container cloud technologies worldwide.
Introduction to the College of Computer Science and Technology at Zhejiang University
The College of Computer Science and Technology at Zhejiang University places the highest premium on developing talent for the benefit of socioeconomic development. One of its missions is to cultivate computer professionals with an international vision and a passion to make the world better through technology. The College has produced many academicians who now serve in the Chinese Academy of Engineering and the Chinese Academy of Sciences. The College comprises five departments, four research institutes, and two centers. It offers four nationally recognized top-tier programs in computer science and technology, software engineering, cyberspace security, and design. According to data released in the May 2018 edition of the Essential Science Indicators (ESI) database, the College of Computer Science and Technology at Zhejiang University ranked No. 25 in the world. In December 2017, the Ministry of Education published its fourth assessment of academic programs, in which the computer science and software engineering disciplines at Zhejiang University earned an A+ rating – the top ranking in the country.
Open source cloud computing technologies and communities have gained tremendous momentum in adoption and prestige in recent years. A large number of institutions, including Zhejiang University, along with private individuals, have collaborated on platforms such as GitHub and Stack Overflow to create an unprecedented open source cloud native ecosystem encompassing containers, microservices, and AI technologies.
Concerted Efforts from Enterprises, Universities, and Individuals Spurring Development of Open Source Cloud Native Ecosystem
The mission of the Cloud Native Computing Foundation (CNCF), established in 2015 by such entities as Zhejiang University, Google, Red Hat, and Huawei, is to enable and encourage participants to build sustainable ecosystems around a set of high-quality open source projects that orchestrate containers as part of a microservices architecture. CNCF is one of the most active communities in the container, microservices, and cloud native fields. It promotes a series of open source technologies and standards for building agile applications that are scalable, operable, and observable in the dynamic, distributed environments of the cloud native era.
Figure 1: Cloud native ecosystem in the CNCF community
A look at the code contribution statistics for Kubernetes, the first open source container orchestration project in the CNCF community, makes it apparent that close collaboration among enterprises, universities, and individuals is the cornerstone of the community's development. Since the community's establishment, Zhejiang University has invested in it continuously and enthusiastically, with contributions ranking in the top tier. Along with Huawei, Zhejiang University is helping lead the Kubernetes ecosystem.
Standardization Ensures Healthy Development of Open Source Communities
In the same way that biodiversity plays an important role in maintaining vibrant ecosystems, technological diversity guarantees progress. CNCF currently manages more than 20 open source projects, and to date more than 500 open source technologies (Figure 1) are included in the organization's cloud native landscape. Technical standards ensure seamless interconnection between different types of technologies and prevent destructive competition among similar tools – an important guarantee for the healthy development of the ecosystem.
When CNCF was established in 2015, the founding members contributed constructive ideas on technical standardization and laid out the overall architecture of the community's future scope of work (see Figure 2), including interconnection standards between components such as resource scheduling, distributed system services, and application definition and orchestration. While many technical standards are still being conceptualized, others had come into being by 2018, including the Container Runtime Interface (CRI), Container Storage Interface (CSI), and Container Network Interface (CNI) standards. In today's rich ecosystem, technical standards that were not considered at the foundation's formation are also being incorporated, including the Open Service Broker API and CloudEvents standards.
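The value of interface standards such as CRI, CSI, and CNI is that an orchestrator programs against the interface rather than against any one implementation, so conforming implementations can be swapped freely. A minimal Go sketch of that idea follows; all names are illustrative, and the real CRI is a gRPC service, not a Go interface.

```go
package main

import "fmt"

// ContainerRuntime is an illustrative stand-in for a runtime interface
// standard such as CRI: the orchestrator depends only on the interface,
// so any conforming runtime can be plugged in.
type ContainerRuntime interface {
	Name() string
	RunContainer(image string) string
}

// Two hypothetical conforming runtimes.
type runtimeA struct{}

func (runtimeA) Name() string { return "runtime-a" }
func (runtimeA) RunContainer(image string) string {
	return "runtime-a started " + image
}

type runtimeB struct{}

func (runtimeB) Name() string { return "runtime-b" }
func (runtimeB) RunContainer(image string) string {
	return "runtime-b started " + image
}

// launch is the "orchestrator" side: it never references a concrete runtime.
func launch(rt ContainerRuntime, image string) string {
	return rt.RunContainer(image)
}

func main() {
	// The same orchestration code works with either implementation.
	for _, rt := range []ContainerRuntime{runtimeA{}, runtimeB{}} {
		fmt.Println(launch(rt, "nginx:latest"))
	}
}
```

The design point is the same one the standards make: agreement on the boundary, not on the implementation behind it.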
Zhejiang University fully promotes the development of community standards. Team members have contributed to the runC open source project, the reference implementation of the Open Container Initiative (OCI) runtime specification; serve as maintainers of the cri-tools open source project; and have participated in developing the Container Runtime Interface (CRI) standard in the community.
Figure 2: Architectural conceptualization at inception of CNCF in 2015
Infrastructure Stability and Ecosystem Prosperity at the Upper Layer
"Infrastructure should be boring" is often heard in cloud native open source circles, reflecting the accelerated maturation of IT infrastructure technologies. The emergence of OCI and the extensive adoption of basic container runtimes such as containerd and runC have helped shore up stability in the first layer of infrastructure in the cloud native era. Kubernetes' victory in the container orchestration field has likewise helped steady the second layer of the infrastructure.
The stability of these underlying technologies enables vendors in the ecosystem to invest in related technologies that build upon the foundation. Redundant, repeated investments can be avoided, putting cloud native and microservice technologies in users' hands early as a host of utilities matures quickly in the ecosystem. This openness delivers a powerful boost for upper-layer technologies in the cloud native ecosystem.
There are a large number of Kubernetes-native upper-layer technologies in the CNCF community, including Istio and Linkerd for service mesh, the Rook project for cloud native storage, the Fission project for serverless/function computing, the Kubeflow project for fast deployment and improved management of deep learning frameworks, Spark on Kubernetes for big data framework management, the ksonnet and Helm projects for application definition and management, and many more. The emergence of these upper-layer technologies enables cloud native applications to be applied more extensively: in addition to common stateless/stateful applications, cloud native applications now include serverless, AI, and big data workloads. All of these applications are constantly improving, and feedback constantly flows back to Kubernetes and other cloud native communities to drive development further and faster.
The trend toward infrastructure stability has also prompted exciting new developments at the underlying layers from vendors. Google recently released the gVisor runtime technology, bringing all-new concepts to container operation while remaining compliant with the OCI runtime standards.
The Emergence of Serverless Computing and Improvements in the Abstraction Level
Serverless technology is thriving in the cloud native ecosystem, and CNCF has defined a landscape just for this segment (Figure 3). We should not equate serverless computing with a specific technology (such as Amazon's Lambda), nor with a specific type of technology (such as function computing). Serverless technology represents an improvement in the abstraction level of cloud computing services: users no longer need to be concerned with the underlying technologies (such as the specification, definition, and management of virtual machine clusters); they need only focus on developing applications at higher levels of abstraction.
From this perspective, serverless computing is both old and new. It is old because it includes Mobile-Backend-as-a-Service (MBaaS), which has existed in IT for quite some time. It also includes the open source PaaS technology Cloud Foundry, which came out in 2011: although serverless was not a consideration at the time, Cloud Foundry's BOSH tool automatically invokes the IaaS-layer interfaces, implementing transparent management of the infrastructure and dynamically adjusting the scale of IaaS virtual machine clusters in line with Cloud Foundry's concepts. Therefore, although PaaS, FaaS, and other serverless technologies differ, we place Cloud Foundry's BOSH-based IaaS management in the serverless column. It is new because function computing, represented by AWS Lambda, and more recently services such as AWS Fargate, Azure Container Instances (ACI), and Huawei Cloud Container Instance (CCI), are expanding the connotation of serverless computing. Zhejiang University has invested heavily in the R&D of new cloud computing technologies: it began participating in the Cloud Foundry open source project in 2011 and has recently participated in open source FaaS projects such as Fission.
Serverless computing aligns well with the core concepts of cloud computing, including a precise division of labor for improved productivity. We can predict that serverless computing will evolve into multiple forms in the future, not limited to function computing alone. Security and monitoring technologies for serverless computing will also continue to flourish and evolve (currently, the serverless ecosystem in CNCF contains only a few types of tools and frameworks; see Figure 3). The complexity and ongoing evolution of application architectures mean that serverless computing cannot be adopted overnight. Instead, it will be combined with microservice technology and gradually promoted over the next few years. In this process, the Virtual Kubelet technology in the CNCF community will serve as a bridge between the development and O&M models of new and old application architectures.
Figure 3: Serverless cloud native landscape in the CNCF community.
The New Wave: Integrative Cloud - Edge - Device Computing
According to data released by IDC, with the advent of 5G and the development of IoT, more than 50 billion devices will be connected to the Internet by 2020. Considering the challenges in bandwidth consumption, network delay, and data privacy protection, more than half of the data generated by terminals needs to be analyzed and processed near the device or at the network edge, especially in Smart City, Smart Healthcare, Smart Manufacturing, Smart Home, and other privacy- and latency-sensitive scenarios. Time-insensitive tasks requiring large amounts of computing resources, such as AI model training, are processed in the central cloud. The future of computing is not limited to large-scale data centers; it will extend from the cloud to the edge to the device, accelerating the entire spectrum.
From the perspective of the computing platform, the cloud-edge-device integrative strategy poses huge challenges for edge operating systems and for device-cloud integrated management platforms.
If a cluster consisting of terminal devices and access gateways is viewed as a small data center, each edge node no longer handles a single type of task. Instead, it becomes a general-purpose computing node that can dynamically execute multiple types of tasks scheduled onto it. The edge operating system therefore not only needs to handle traditional operating system responsibilities such as scheduling and storage and network management on edge devices, it also needs to provide a complete security isolation mechanism to prevent tasks scheduled on the same edge device from interfering with one another.
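Treating an edge cluster as a small data center means tasks of different types are placed dynamically onto whichever node currently has capacity. The sketch below shows a first-fit placement in Go; the node names, task names, and CPU units are invented for illustration, and real edge schedulers also weigh isolation, affinity, and latency.

```go
package main

import "fmt"

// Task is any workload an edge node may be asked to run; edge nodes are
// general-purpose, so the kind of work is not fixed in advance.
type Task struct {
	Name string
	CPU  int // required CPU, in illustrative units
}

// EdgeNode tracks remaining capacity. A real edge OS would also enforce
// isolation between the tasks it admits.
type EdgeNode struct {
	Name string
	Free int
}

// schedule places each task on the first node with enough free CPU:
// a first-fit sketch of dynamic multi-task scheduling at the edge.
func schedule(nodes []*EdgeNode, tasks []Task) map[string]string {
	placement := make(map[string]string)
	for _, t := range tasks {
		for _, n := range nodes {
			if n.Free >= t.CPU {
				n.Free -= t.CPU
				placement[t.Name] = n.Name
				break
			}
		}
	}
	return placement
}

func main() {
	nodes := []*EdgeNode{{"gateway-1", 4}, {"gateway-2", 2}}
	tasks := []Task{{"video-infer", 3}, {"sensor-agg", 2}, {"fw-update", 1}}
	for task, node := range schedule(nodes, tasks) {
		fmt.Printf("%s -> %s\n", task, node)
	}
}
```

Here the inference task fills most of one gateway, so the aggregation task spills to the second, which is exactly the "general-purpose node" behavior the text describes.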
The container, as a lightweight operating system isolation technology, plays a significant role here. Depending on the resources and functions a scenario requires, one can deploy a complete Docker solution, a more lightweight IoT platform such as Eliot built on containerd/runC, or a platform such as Baidu IoT Intelligent Edge, which uses the Linux kernel's namespace and cgroup technologies to construct a customized container isolation solution. Research at Zhejiang University focuses on unikernel technology: rumpkernel, OSv, and other unikernel technologies can further shrink the attack surface, reduce resource occupation, and accelerate response compared with containerd/runC, providing a more secure computing environment for edge devices.
The device-cloud integrated management platform is responsible for managing the large number of small data centers formed by edge devices. The open source community has already applied container orchestration engines such as Kubernetes to these small data centers. In May 2018, during the KubeCon + CloudNativeCon conference in Copenhagen, Denmark, a breakout session was held to discuss Kubernetes and edge computing topics. The EdgeX Foundry community, which like CNCF operates under the Linux Foundation, initiated an EdgeX Foundry on Kubernetes effort in which the edge computing platform runs on Kubernetes for resource scheduling and management. Microsoft's IoT Edge Virtual Kubelet open source project (Figure 4) explores how to use Kubernetes to build an integrated platform able to manage both traditional data centers and edge computing nodes. In the CNCF-sponsored Frakti project, Zhejiang University uses a unikernel as a runtime for Kubernetes in an attempt to apply Kubernetes to edge computing scenarios.
Figure 4: Microsoft's IoT Edge Virtual Kubelet project architecture uses Kubernetes to build an integrated hybrid management platform containing traditional data center and new edge computing capabilities
Image source: github.com/azure/iot-edge-virtual-kubelet-provider
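The Virtual Kubelet idea that the IoT Edge project builds on can be summarized as a provider abstraction: the control plane sees a single node, while a pluggable provider forwards workloads to a backing system such as an edge site. The Go sketch below is a loose illustration of that pattern; all types and names are invented and do not mirror the actual virtual-kubelet provider API.

```go
package main

import "fmt"

// Provider is an illustrative analogue of the virtual-kubelet idea: the
// orchestrator schedules onto one "node", and the provider decides what
// actually runs the workload (an edge site, a serverless service, etc.).
type Provider interface {
	CreatePod(name string) string
}

// edgeSiteProvider pretends to run pods on a fleet of edge devices.
type edgeSiteProvider struct{ site string }

func (p edgeSiteProvider) CreatePod(name string) string {
	return fmt.Sprintf("pod %s dispatched to edge site %s", name, p.site)
}

// virtualNode is the single node the control plane would see; it simply
// delegates every pod to its provider.
type virtualNode struct {
	name     string
	provider Provider
}

func (n virtualNode) Run(pod string) string {
	return n.provider.CreatePod(pod)
}

func main() {
	node := virtualNode{
		name:     "vk-edge-01",
		provider: edgeSiteProvider{site: "hangzhou"},
	}
	fmt.Println(node.Run("edgex-core"))
}
```

Swapping the provider swaps the backing infrastructure without changing anything the control plane sees, which is what lets one Kubernetes cluster span data centers and edge sites.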
Promotion of Cloud Native in Academia
In the past five years, containers and microservices have become some of the most talked-about technologies in the cloud native wave, and academia has paid plenty of attention to these new concepts. Industry experts at Google released white papers (including "Large-scale Cluster Management at Google with Borg" and "Design Patterns for Container-Based Distributed Systems") explaining how these technologies support the high reliability and scalability requirements of global service systems at large Internet companies. Researchers across industries are finding ways to creatively apply cloud native technologies such as containers to improve resource scheduling and to advance edge computing, IoT, big data, and AI. Scholars in the software engineering field are also paying close attention to the new technology wave on open source collaborative development platforms. They have begun to study the iteration of open source projects, the formation of open source communities, and how internal communication within these communities promotes the R&D of high-quality open source software.