Related topics and links:

Virtio_user for Container Networking
DPDK in Containers Hands-on Lab
Accelerate Clear Container Network performance

Microsoft Research released FreeFlow on GitHub:
https://www.microsoft.com/en-us/research/blog/high-performance-container-networking/

Introduction to Kubernetes network components (Flannel, Open vSwitch, Calico):
https://blog.51cto.com/michaelkang/2344724

7.2. Sample Usage

Here we use Docker as the container engine; the same steps also apply to LXC and Rocket with minor changes.
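
The commands below assume that hugepages are already reserved and mounted on the host; a minimal sketch of that setup (the page count here is only illustrative) is:

    # Reserve 2 MB hugepages on the host (adjust the count to your workload)
    # and mount hugetlbfs where the container run below expects it.
    echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
    mkdir -p /dev/hugepages
    mount -t hugetlbfs nodev /dev/hugepages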

  1. Write a Dockerfile like below.

    cat <<EOT >> Dockerfile
    FROM ubuntu:latest
    WORKDIR /usr/src/dpdk
    COPY . /usr/src/dpdk
    ENV PATH "$PATH:/usr/src/dpdk/<build_dir>/app/"
    EOT
    
  2. Build a Docker image.

    docker build -t dpdk-app-testpmd .
    
  3. Start a testpmd on the host with a vhost-user port.

    $(testpmd) -l 0-1 -n 4 --socket-mem 1024,1024 \
        --vdev 'eth_vhost0,iface=/tmp/sock0' \
        --file-prefix=host --no-pci -- -i
    
  4. Start a container instance with a virtio-user port.

    docker run -i -t -v /tmp/sock0:/var/run/usvhost \
        -v /dev/hugepages:/dev/hugepages \
        dpdk-app-testpmd testpmd -l 6-7 -n 4 -m 1024 --no-pci \
        --vdev=virtio_user0,path=/var/run/usvhost \
        --file-prefix=container \
        -- -i
    

Note: if all of the above runs on the same host, the virtio-user/vhost-user link is effectively shared-memory-based IPC.
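
As a quick sanity check (assuming both testpmd instances started cleanly), you can start forwarding on both sides and watch the port counters; a minimal sketch of the interactive commands:

    # Inside the host's testpmd prompt: start forwarding so the vhost-user
    # port is polled, then look at the counters.
    testpmd> start
    testpmd> show port stats all

    # Inside the container's testpmd prompt: start forwarding and transmit an
    # initial burst so traffic loops over the virtio-user/vhost-user link.
    testpmd> start tx_first

    # Back on the host: the RX/TX counters should now be increasing.
    testpmd> show port stats all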

Recently, a lightweight and portable application-sandboxing mechanism called containers has become popular among developers who build applications for a wide variety of targets, ranging from IoT Edge to planet-scale distributed web applications for multi-national enterprises. A container is an isolated execution environment on a Linux host. It has its own file system, processes, and network stack. A single machine (the host) can support a significantly larger number of containers than standard virtual machines, providing attractive cost savings. Running an application inside a container isolates it from the host and from applications running in other containers. Even when the applications are running with superuser privileges, they cannot access or modify the files, processes, or memory of the host or of other containers. There is more to say, but this is not intended to be a tutorial on containers. Let's instead talk about networking between containers.
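
As a small, concrete illustration of that isolation (a sketch assuming Docker and the alpine image, whose busybox tools include ps and ip):

    # The container sees only its own processes and its own network namespace
    # (typically just lo plus a veth-backed eth0), not the host's.
    docker run --rm alpine ps
    docker run --rm alpine ip addr show

    # Compare with the host's own view of processes and interfaces.
    ps aux | head
    ip addr show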

As it turns out, many container-based applications are developed, deployed, and managed as groups of containers that communicate with one another to deliver the desired service. Unfortunately, until recently, container networking solutions had either poor performance or poor portability, which undermined some of the advantages of containerization.

Enter Microsoft FreeFlow. Jointly developed by researchers at Microsoft Research and Carnegie Mellon University, Freeflow is an inter-container networking technology that achieves high performance and good portability by using a new software element we call the Orchestrator. The Orchestrator is aware of the location of each container, and by leveraging the fact that containers for the same application do not require strict isolation, Orchestrator is able to speed things up. FreeFlow uses a variety of cool techniques such as shared memory and Remote Direct Memory Access (RDMA) to improve network performance—that is, higher throughput, lower latency, and less CPU overhead. This is accomplished while maintaining full portability and in a manner that is transparent to application developers. It’s said a picture is worth a thousand words and the following figure does a nice job of illustrating Freeflow’s capabilities.

[Figure: FreeFlow is a high-performance container overlay networking solution that takes advantage of RDMA and accelerates TCP sessions between containers used by the same applications.]

Yibo Zhu, a former colleague from Microsoft Research and a co-inventor of this technology, neatly summed up some unique advantages of FreeFlow. “One of the nice features of FreeFlow is that it works on top of popular technologies including Flannel and Weave,” he said. “Containers have their individual virtual network interfaces and IP addresses. They do not need direct access to the hardware network interface. A lightweight FreeFlow library inside containers intercepts RDMA and TCP socket calls, and a FreeFlow router outside the containers helps accelerate the flow of data.”
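
The interception approach Zhu describes is the kind of thing a preload shim enables in general. Purely as a hypothetical illustration (the library path and peer address below are invented and are not FreeFlow's actual interface), an unmodified socket application can be pointed at an interception library like this:

    # Hypothetical illustration only: preload an interception library so the
    # unmodified application's socket calls can be redirected through a local
    # router process. The library path and peer address are made up.
    LD_PRELOAD=/usr/lib/libfreeflow-intercept.so iperf3 -c 10.32.0.5 -p 5201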

You can read all the important details in our NSDI paper, “FreeFlow: Software-based Virtual RDMA Networking for Containerized Clouds.”

Contribution to Open Source

We believe this technology is important for everyone working in this space, so on June 12, 2018, Microsoft Research released FreeFlow on GitHub. Our release is based on the Linux RDMA project and carries the MIT license. The technology supports three modes of operation: fully isolated RDMA, semi-isolated RDMA, and TCP. Fully isolated RDMA works very well in multi-tenant environments such as the Azure cloud. Most RDMA applications should run with no or very little modification and outperform traditional TCP socket-based implementations. We have tested it with RDMA-enabled Spark, HERD, TensorFlow, and rsocket.


While the paper includes a solid evaluation, the following table shows some throughput results for a single TCP connection over a container overlay network across different VMs (28 Gbps models), measured with iperf3.

[Table not reproduced in this copy.]

Supercharging the Kubernetes network with FreeFlow.
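
For context, a single-stream iperf3 measurement of the kind reported in that table is typically run with one container acting as the server and another, on a different VM, as the client (the overlay address below is hypothetical):

    # In one container: run the iperf3 server.
    iperf3 -s -p 5201

    # In a container on another VM: run a single-stream TCP test against the
    # server's overlay IP address for 30 seconds.
    iperf3 -c 10.244.1.15 -p 5201 -t 30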

These results were measured just before we released FreeFlow to open source in June 2018. We will continue to find ways to improve performance (bet on it), but here is the takeaway: before FreeFlow, it was not possible to run RDMA on a container overlay network. FreeFlow enables it with minimal overhead.

Speeding up container overlay networks

In addition to FreeFlow, I’d like to call attention to another related paper, “Slim: OS Kernel Support for a Low-Overhead Container Overlay Network.” Slim was jointly developed by researchers at the University of Washington and Microsoft Research.

A container overlay network enables a set of containers, potentially distributed over several machines, to communicate with one another using their own independently assigned IP addresses and port numbers. For an application running in these containers, the overlay network coordinates the connections between the various ports and IP addresses. Popular container orchestrators, such as Docker Swarm, require an overlay network for hosting containerized applications.
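
As a concrete example of what such an overlay provides (a sketch assuming a Docker Swarm cluster has already been initialized with docker swarm init; the network and container names are illustrative):

    # Create an attachable overlay network spanning the swarm nodes.
    docker network create --driver overlay --attachable demo-overlay

    # Containers on different hosts joined to this network each get their own
    # IP address on it and can reach one another by name.
    docker run -d --name web --network demo-overlay nginx
    docker run --rm --network demo-overlay alpine ping -c 3 web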

Before we introduced Slim, container overlay networks imposed significant overhead because of the multiple packet transformations within the operating system needed to implement network virtualization. Specifically, every packet had to traverse the network stack twice, in both the sender's and the receiver's host OS kernel. To avoid this, researchers developed Slim, which implements network virtualization by manipulating connection-level metadata while maintaining compatibility with existing containerized applications; packets go through the OS kernel's network stack only once. The performance improvements are substantial. For example, the throughput of an in-memory key-value store improved by 66%, while latency was reduced by 42% and CPU utilization by 54%. Check out the paper for a thorough description and an evaluation with additional results. If you are wondering what the difference between the two systems is: FreeFlow creates a fast container RDMA network, while Slim provides an end-to-end low-overhead container overlay network for TCP traffic.

I will end by saying how proud I am that this is the fifteenth time since its inception that Microsoft Research has sponsored NSDI. Over the years, our researchers and engineers have contributed well over a hundred papers to the technical program and helped organize this symposium, its sessions, and co-located workshops. We are deeply committed to sharing knowledge and supporting our academic colleagues and the networking research community at large.

I am also delighted to announce that our paper, “Sora: High Performance Software Radio Using General Purpose Multi-core Processors,” has been awarded the NSDI ’19 Test of Time Award for research results we presented over ten years ago.

If you plan to attend NSDI ’19, I urge you to attend the presentations of Microsoft papers. I also encourage you to meet our researchers; they will be happy to discuss their latest research and new ideas. And who knows, maybe that leads to some future collaborations and more great NSDI papers!