Streamlined, High-Speed Virtualized Packet I/O

An overview of the performance optimizations.

Looking for Xen examples? Check out the Getting Started section for more details.

Architecture


Our goal is to build a multi-tenant, high-performance software middlebox platform on commodity hardware.

To achieve isolation and multi-tenancy, we must rely on hypervisor virtualization, which adds an extra software layer between the hardware and the middlebox software and can potentially hurt throughput or increase delay.

To minimize these effects, paravirtualization is preferable to full virtualization: it requires only minor changes to the guest OSes, greatly reducing overheads inherent in full virtualization such as VM exits or the need for instruction emulation.

We base our work on Xen, since its support for paravirtualized modes makes it possible to build a low-delay, high-throughput platform.

High-level architecture overview.

Xen is split into a privileged virtual machine or domain called Domain-0 (typically running Linux) and a set of guest domains comprising the users’ virtual machines (also known as DomUs). In addition, Xen includes the notion of a driver domain, a VM that hosts the device drivers; in most cases, however, Dom0 acts as the driver domain.
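For concreteness, a guest that plugs into this setup can be described with an ordinary xl configuration along the lines of the sketch below. The VM name, kernel path and bridge name are placeholders, not values from our platform; the point is simply that each DomU gets a paravirtualized network interface whose backend lives in the driver domain (Dom0 by default).

    # Hypothetical DomU configuration -- names and paths are placeholders.
    name   = "mbox0"
    kernel = "/path/to/guest-kernel"
    memory = 128
    vcpus  = 1
    # One paravirtualized NIC; its backend runs in the driver domain
    # (Dom0 by default) and attaches to the software switch/bridge there.
    vif    = [ 'bridge=xenbr0' ]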

Our platform consists of a fast backend switch, a new netback driver, and corresponding new netfront drivers for MiniOS and Linux.

Xen Network I/O Optimizations


The Xen network I/O pipe has a number of components and mechanisms that add overhead but are not fundamental to the task of getting packets in and out of VMs. To optimize this, we would ideally have a more direct path between the backend NIC/switch and the VMs themselves. Conceptually, we would like to map ring packet buffers from the device driver or backend switch directly into the VMs’ memory space, much like fast packet I/O frameworks do between kernel and user space in non-virtualized environments.
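To make the analogy concrete, the sketch below shows the kind of kernel-to-user-space ring mapping that a fast packet I/O framework such as netmap provides on a plain, non-virtualized host: one mmap() exposes the rings and packet buffers, and the application then reads packets directly from the mapped memory. The interface name eth0 is only an example; this illustrates the model and is not part of our Xen code.

    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <sys/mman.h>
    #include <net/if.h>
    #include <net/netmap.h>
    #include <net/netmap_user.h>

    int main(void)
    {
        struct nmreq req;
        int fd = open("/dev/netmap", O_RDWR);
        if (fd < 0)
            return 1;

        memset(&req, 0, sizeof(req));
        strncpy(req.nr_name, "eth0", sizeof(req.nr_name) - 1);
        req.nr_version = NETMAP_API;
        if (ioctl(fd, NIOCREGIF, &req) < 0)   /* put the NIC into netmap mode */
            return 1;

        /* A single mmap() maps all rings and packet buffers of this port. */
        void *mem = mmap(NULL, req.nr_memsize, PROT_READ | PROT_WRITE,
                         MAP_SHARED, fd, 0);
        if (mem == MAP_FAILED)
            return 1;
        struct netmap_if *nifp = NETMAP_IF(mem, req.nr_offset);
        struct netmap_ring *rx = NETMAP_RXRING(nifp, 0);

        ioctl(fd, NIOCRXSYNC, NULL);          /* sync receive state with the kernel */
        while (!nm_ring_empty(rx)) {
            struct netmap_slot *slot = &rx->slot[rx->cur];
            char *buf = NETMAP_BUF(rx, slot->buf_idx);
            printf("got %u-byte frame at %p\n", slot->len, (void *)buf);
            rx->head = rx->cur = nm_ring_next(rx, rx->cur);
        }
        return 0;
    }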

More specifically, we replaced the standard but sub-optimal Open vSwitch backend with the high-speed VALE switch; this switch exposes per-port ring packet buffers, which are what we map into VM memory space. In our model, the VALE switch and netfront driver transfer packets to each other directly, so the netback driver becomes a redundant component of the data plane. As a result, we remove it from the pipe but keep it as a control-plane driver for tasks such as communicating ring buffer addresses (grants) to the netfront driver. Finally, we revamp the netfront driver to map the ring buffers into its memory space.
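For reference, this is how a regular host process attaches to a VALE port using netmap's standard user-space helpers; the switch instance vale0 and port name p0 below are arbitrary placeholders, and the sketch is only meant to show the per-port ring model. In our platform, the equivalent per-port rings are mapped into the guest by the netfront driver rather than into a host process.

    #define NETMAP_WITH_LIBS
    #include <stdio.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <net/netmap_user.h>

    int main(void)
    {
        /* Opening "valeX:port" both creates the port (if needed) and maps
         * its TX/RX rings into this process. */
        struct nm_desc *d = nm_open("vale0:p0", NULL, 0, NULL);
        if (d == NULL) {
            perror("nm_open");
            return 1;
        }

        unsigned char frame[60];
        memset(frame, 0, sizeof(frame));           /* dummy minimum-size frame */

        if (nm_inject(d, frame, sizeof(frame)) == 0)  /* copy into a free TX slot */
            fprintf(stderr, "TX ring full\n");
        ioctl(NETMAP_FD(d), NIOCTXSYNC, NULL);     /* ask the switch to forward it */

        nm_close(d);
        return 0;
    }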

Backend driver

As mentioned, we redesigned the netback driver to turn it (mostly) into a control-plane-only driver. Our modified driver is in charge of allocating memory for the receive and transmit packet rings and their buffers, and of setting up the corresponding memory grants.
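As a rough illustration of the mechanism involved (not our actual driver code), the sketch below shows how a Linux backend could allocate one ring page and grant a guest access to it using the standard Xen grant-table API; the function name share_ring_page and the otherend_id parameter are hypothetical, and in practice the guest's domain id and the resulting grant reference would be exchanged via XenStore.

    /* Hedged sketch of backend-side ring allocation and granting. */
    #include <linux/errno.h>
    #include <linux/gfp.h>
    #include <linux/mm.h>
    #include <xen/grant_table.h>
    #include <xen/page.h>

    static int share_ring_page(domid_t otherend_id, struct page **pagep)
    {
        struct page *page;
        int gref;

        page = alloc_page(GFP_KERNEL | __GFP_ZERO);  /* backing memory for the ring */
        if (!page)
            return -ENOMEM;

        /* Grant the frontend read/write access to this frame; the returned
         * grant reference is what gets advertised to the guest. */
        gref = gnttab_grant_foreign_access(otherend_id, xen_page_to_gfn(page), 0);
        if (gref < 0) {
            __free_page(page);
            return gref;
        }

        *pagep = page;
        return gref;   /* caller publishes this reference over XenStore */
    }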