Optimized virtualization for embedded computing

October 11, 2019

Rich Jaenicke

Green Hills Software, Inc.

Many embedded systems already employ system virtualization, decoupling virtual and physical systems through the use of virtual machines (VMs). Virtualization in mission-critical embedded systems can be implemented using technologies similar to those used for enterprise systems, but the different use cases of embedded virtualization open the door to additional solutions that align more closely with the priorities of embedded systems.

Virtualization has been deployed widely in enterprise servers since the early 2000s. That initial drive for server virtualization was all about server consolidation, which combines services from multiple, underutilized servers running different applications onto one computer. Reducing the number of servers resulted in savings for both capital and operating costs. Such consolidation requires workload isolation that separates applications from each other and the rest of the system, thereby providing some level of security and application autonomy. Because virtual machines (VMs) pair applications with the operating system they rely upon, virtualization can also allow the migration of VMs from one server to another, enabling high availability, load balancing, and additional power savings in a variety of mission-critical military applications.

A secondary drive for server virtualization was the ability to run applications designed for a different operating system (OS) or different versions of the OS. For example, it is common for an engineering workstation running Linux to also run Microsoft Windows to interact with business applications. This was particularly useful for supporting legacy systems, such as when migrating from a mainframe to a server.

Use cases for embedded virtualization

Embedded virtualization has much overlap with enterprise use cases but with different priorities and additional requirements. The primary use cases for embedded virtualization are supporting heterogeneous OSes and increased security. Secondary use cases can include workload consolidation, software license segregation, and facilitating the move to multicore processors. A common driver for supporting heterogeneous OSes is the need to support general OSes such as Linux and Windows for some applications while the critical and trusted applications run on a real-time OS (RTOS). Increased security is particularly important in systems with mixed criticality to isolate the less-critical applications from the ones with more critical real-time, safety, or security requirements.

When assessing a security solution, a key concept is the size of the trusted computing base (TCB), which comprises the hardware, software, and controls that enforce the security policy. The general goal is to minimize the size of the TCB and the number of interfaces so that it can be verified more easily. The larger the TCB and the number of interfaces, the larger the attack surface. Minimizing the TCB requires moving many noncritical services out of the TCB, which in turn requires the ability both to isolate those services and to provide secure communication between trusted and nontrusted components. Note that minimizing the TCB is not the end goal but only a means to ease verification. For systems requiring high security, the end goal is certification to applicable security assurance requirements.

Unlike VMs in server virtualization, the applications in an embedded system often are highly integrated and need to cooperate. Consequently, part of the solution needs to include predictable low-latency, high-bandwidth communication paths with permissions enforced by a secure TCB. For embedded real-time systems in particular, meeting the virtualization goals of heterogeneous OSes and increased security cannot come at the expense of determinism of the system or greatly increased latencies. That is doubly true for safety-critical systems. Maintaining determinism presents a challenge for any virtualization solution because efficient virtualization implementations generally use heuristics to recognize variations in code sequences across different OSes and different versions of a given OS.

Hardware support for virtualization

Early virtualization of x86 processors was notoriously low-performance because of the lack of hardware support for virtualization, including a virtualized memory management unit (MMU) and an input/output memory management unit (IOMMU). Modern processors provide support for hardware-assisted virtualization. Examples include Intel VT-x and Intel VT-d.

Intel VT-x provides instructions for entering and exiting a virtual execution mode where the guest OS sees itself as running with full privilege, while the host OS remains protected. Memory virtualization actually requires two levels of virtualization. First, the guest OS stores the mapping from virtual to physical address space in its page tables. Second, because the guest OS does not have direct access to physical memory, the addresses it treats as physical must themselves be mapped to actual machine memory, so the virtual machine monitor (VMM) needs to provide virtualization of those page tables. For Intel processors, acceleration of page table virtualization is called Extended Page Tables (EPT).
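The two levels of address translation described above can be illustrated with a minimal sketch. It is not Intel's EPT mechanism: real hardware walks multi-level page tables, while this toy model uses flat dictionaries. The 4 KiB page size, the table contents, and every function name are assumptions chosen purely for illustration.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages, for illustration only


def translate(page_table, address):
    """Map an address through a flat {page_number: frame_number} table."""
    page, offset = divmod(address, PAGE_SIZE)
    if page not in page_table:
        raise LookupError(f"page fault at {address:#x}")
    return page_table[page] * PAGE_SIZE + offset


def guest_virtual_to_host_physical(guest_table, vmm_table, guest_virtual):
    """Apply both levels: the guest's mapping first, then the VMM's."""
    guest_physical = translate(guest_table, guest_virtual)  # level 1: guest OS
    return translate(vmm_table, guest_physical)             # level 2: VMM


# Hypothetical tables: the guest believes page 1 lives in "physical" page 7,
# and the VMM actually backs guest-physical page 7 with host frame 13.
guest_table = {0: 5, 1: 7}
vmm_table = {5: 42, 7: 13}

host_address = guest_virtual_to_host_physical(guest_table, vmm_table, 0x1010)
print(hex(host_address))
```

Hardware acceleration performs the second lookup in the MMU itself, so the VMM does not need to intercept guest page-table updates to keep the two levels consistent.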

Intel Virtualization Technology for Directed I/O (Intel VT-d) provides a hardware assist for remapping direct memory access (DMA) transfers and device-generated interrupts. The IOMMU keeps track of which physical memory regions are mapped to which I/O devices. An I/O device assigned to a particular VM is not accessible to other VMs nor can the I/O device access the other VMs.
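As a rough sketch of the isolation rule in the preceding paragraph, the toy model below tracks which memory regions belong to which VM and which device is assigned to which VM, then checks each DMA request against that bookkeeping. It is a simplification, not the VT-d programming interface: the class name, method names, addresses, and device names are all hypothetical.

```python
class ToyIommu:
    """Toy model of DMA remapping checks; not a real IOMMU interface."""

    def __init__(self):
        self._device_owner = {}  # device name -> VM name
        self._vm_regions = {}    # VM name -> list of (start, end) ranges

    def grant_memory(self, vm, start, length):
        """Record that [start, start + length) belongs to the given VM."""
        self._vm_regions.setdefault(vm, []).append((start, start + length))

    def assign_device(self, device, vm):
        """Assign an I/O device exclusively to one VM."""
        self._device_owner[device] = vm

    def dma_allowed(self, device, address, length):
        """Allow a transfer only if it lies fully inside the owner's memory."""
        vm = self._device_owner.get(device)
        if vm is None:
            return False  # unassigned devices get no access at all
        return any(
            start <= address and address + length <= end
            for start, end in self._vm_regions.get(vm, [])
        )


iommu = ToyIommu()
iommu.grant_memory("vm_a", 0x1000_0000, 0x10_0000)
iommu.grant_memory("vm_b", 0x2000_0000, 0x10_0000)
iommu.assign_device("nic0", "vm_a")

print(iommu.dma_allowed("nic0", 0x1000_0400, 1500))  # inside vm_a's memory
print(iommu.dma_allowed("nic0", 0x2000_0400, 1500))  # inside vm_b's memory
```

The second request is refused because the device is assigned to a different VM than the one that owns the target memory, which is the property that keeps a misbehaving device from reaching other VMs.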

Even with IOMMU support, the VMM still needs to copy data from the network interface controller (NIC) to the virtual machine or vice versa. The Single Root I/O Virtualization (SR-IOV) standard from PCI-SIG removes the VMM from the process of moving data to and from the VM. Data is transferred by DMA directly to and from the VM without the software switch in the VMM ever touching it.

Although the key technologies for hardware acceleration of virtualization are implemented at the chip level, board-level decisions also affect the system performance. For example, processors with the most virtualization features often are the ones consuming the most power, so there is often a tradeoff decision for optimizing size, weight, and power (SWaP). Selection of the NIC affects which I/O virtualization features are accelerated. The amount of memory on the board is also an important consideration, as virtualization can consume large amounts of memory.

Embedded virtualization technologies

Once the need for virtualization has been established and supported by the underlying hardware, the next question is what software virtualization technology to use. In the enterprise space, the main choices are Type 1 (Figure 1) and Type 2 hypervisors, where Type 1 runs on bare metal and Type 2 runs on top of another OS. For embedded systems there is a third choice: microkernels with a virtualization layer. Although it is convenient to put any given solution into one of those three buckets, the reality is that there is a gray zone between Type 1 and Type 2, and Type 1 hypervisors can be implemented using microkernel technology. Even with some degree of overlap, it is useful to look at defining characteristics and capabilities.

Figure 1 | With a Type 1 hypervisor, all applications run on two layers of software, the hypervisor and a guest OS, which incurs extra latency and variability in execution time.

Hypervisors, also called virtual machine monitors (VMMs), got their start in enterprise systems with little in the way of resource constraints. As such, many hypervisors and their VMs are heavyweight constructs that often include capabilities such as device drivers and sometimes even networking stacks and file systems. All that functionality requires a large TCB. Networking stacks are a particularly high security risk, as seen with the recent “URGENT/11” vulnerabilities. For both Type 1 and Type 2 hypervisors, a guest OS runs inside the VMs along with the applications. Although Type 1 hypervisors running on bare metal are generally more efficient, Type 2 hypervisors can be the right solution if only a small percentage of the applications need a guest OS. In an enterprise context, one example is an engineering environment (for example, Linux) or a creative environment (like macOS) that needs to run a business application that runs only on Windows. Similarly, embedded systems often have a mix of real-time and non-real-time requirements. Using a Type 2 hypervisor, the larger set of real-time applications would rely only on the base RTOS, instead of an RTOS and a hypervisor, while only the non-real-time applications would incur the virtualization overhead of a guest OS, hypervisor, and host OS.

Microkernels came from a different direction, aiming to reduce the amount of code executing in the kernel by moving services, including virtualization, to user-mode servers. This also minimizes the TCB to improve both safety and security. A virtualization layer providing guest OS support can be implemented in user space, similar to a Type 2 hypervisor, along with the network stack and file system. Note that the isolation foundation is implemented in the microkernel, including use of the hardware virtualization features.

Getting the virtualization layer out of the trusted computing base is a significant advantage for both security and safety, as virtualization code can be huge. To enable a guest OS to think it is running on bare metal, every part of the system must be virtualized. Although the hardware technologies accelerate memory virtualization, only recently are some processors beginning to accelerate some portions of I/O. Some examples of needed virtualization include device emulation, bus emulation, and interrupt emulation and routing. The code for all that emulation is quite large and also creates a performance penalty. Every call to the kernel from the guest OS needs to be trapped and examined to determine whether the guest OS is permitted that access. In order for a hypervisor to be efficient, it needs to virtualize sequences of instructions instead of single instructions. Such look-ahead capability is just one example of increasing the already large code base of a hypervisor in pursuit of minimizing the virtualization performance penalty.
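The trap, examine, and permit-or-deny cycle described above can be sketched in a few lines. This toy model is an assumption-laden illustration, not how any particular hypervisor is implemented: the class name, the operation names, and the register name are hypothetical, and a real VMM would decode trapped instructions rather than receive tidy arguments.

```python
class AccessDenied(Exception):
    """Raised when a guest attempts an operation it was not granted."""


class ToyVmm:
    """Toy trap handler that emulates per-guest device registers."""

    def __init__(self, grants):
        self.grants = grants         # guest name -> set of permitted operations
        self.trap_log = []           # every trap is recorded before the check
        self.virtual_registers = {}  # (guest, register) -> emulated value

    def on_trap(self, guest, operation, register, value=None):
        """Examine a trapped access, then emulate it or refuse it."""
        self.trap_log.append((guest, operation, register))
        if operation not in self.grants.get(guest, set()):
            raise AccessDenied(f"{guest} may not {operation} {register}")
        key = (guest, register)
        if operation == "write":
            self.virtual_registers[key] = value
            return None
        return self.virtual_registers.get(key, 0)  # unwritten registers read 0


vmm = ToyVmm({"linux_guest": {"read", "write"}, "legacy_guest": {"read"}})
vmm.on_trap("linux_guest", "write", "uart_ctrl", 0x3)
print(vmm.on_trap("linux_guest", "read", "uart_ctrl"))
print(vmm.on_trap("legacy_guest", "read", "uart_ctrl"))
```

Each guest sees only its own emulated copy of the register, and every access pays for a trap plus a permission lookup, which hints at why emulation adds both code size and latency.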

One specific type of microkernel is the separation kernel, which allocates all exported resources under its control into partitions, and those partitions are isolated except for explicitly allowed information flows. Separation kernels that are designed for the highest security meet the Separation Kernel Protection Profile (SKPP) defined by the U.S. National Security Agency (NSA), which was created for the most hostile threat environments.
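The idea that partitions are isolated except for explicitly allowed information flows can be shown with a small default-deny policy check. This is only a sketch under stated assumptions: the partition names, the class, and its methods are hypothetical and do not represent the interface of any separation kernel or the requirements of the SKPP.

```python
class ToySeparationPolicy:
    """Toy model: messages move only along explicitly allowed, one-way flows."""

    def __init__(self, allowed_flows):
        self._allowed = frozenset(allowed_flows)  # {(source, destination), ...}
        self._inbox = {}                          # destination -> [(source, msg)]

    def send(self, source, destination, message):
        """Deliver a message only if the (source, destination) flow is allowed."""
        if (source, destination) not in self._allowed:
            raise PermissionError(f"flow {source} -> {destination} is not allowed")
        self._inbox.setdefault(destination, []).append((source, message))

    def receive(self, partition):
        """Return and clear the messages queued for a partition."""
        return self._inbox.pop(partition, [])


# Hypothetical mixed-criticality layout: flows are directional, so the
# display partition cannot write back into flight control.
policy = ToySeparationPolicy({
    ("sensor", "flight_control"),
    ("flight_control", "display"),
})

policy.send("sensor", "flight_control", "altitude=1200")
print(policy.receive("flight_control"))
```

Because the policy is default-deny, adding a partition grants it no communication rights until a flow is listed explicitly, which keeps the set of interfaces to verify small.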

Comparison of hypervisor and microkernel technology

Today, there is a great deal of overlap between the broad set of features in hypervisors and microkernels with a virtualization layer. Both technologies utilize the underlying hardware features such as multiple privilege modes/levels, MMUs, and IOMMUs to provide hardware-enforced isolation and give separate address spaces to different applications. Both hypervisors and microkernels with a virtualization layer provide the ability to run multiple OSes in a virtualized environment, including mixing RTOSes and non-RTOSes. Even with those similarities, the two technologies can have significant differences in levels of determinism and security.

Microkernel-based RTOSes were designed from the beginning for low latency and high determinism. Running an RTOS on top of a hypervisor adds latency for every system call that has to be intercepted and virtualized. The result is increased latency and lower determinism. To address this, some hypervisors claim to allow running on bare metal, but that is really a misnomer. Even when there is no guest OS, applications still have to run on the hypervisor, which is typically larger than a microkernel. Running on just a hypervisor without a guest OS also means there are no tasking services, no semaphores, and no message passing.

In the case of safety-critical systems, a hypervisor-based solution needs both the safety-critical OS and the hypervisor certified to the highest level of criticality of any of the hosted applications. The total size of that codebase creates a substantial certification burden compared to a microkernel and presents an unnecessary risk.

Alternatively, microkernels with a virtualization layer achieve higher performance by limiting the virtualization side-effects of higher latency and decreased determinism to only the applications that do not run the host microkernel RTOS. In a safety-critical system, the noncritical applications can run on top of the virtualization layer without increasing the size of the codebase required for certification. (Figure 2.)

Figure 2 | With a microkernel, the real-time applications do not incur the overhead of the virtualization layer but still benefit from the isolation provided by the separation kernel.

Security is often the most cited reason for considering a hypervisor. It is a common misconception that hypervisors are inherently secure because they utilize hardware to enforce virtual address spaces and virtual I/O to isolate VMs. First, other technologies, such as partitioning operating systems and separation kernels, also use the same hardware features to enforce isolation. However, the primary consideration for security is that the full solution is only as secure as the underlying software. Hypervisors have been shown to be susceptible to flaws that could allow code execution through buffer overflows and other exploits. For example, the Spectre vulnerability revealed in early 2018 can trick a hypervisor into leaking secrets to a guest application. Because hypervisors run below the guest operating system, a compromised hypervisor is not detectable by the VM. Such exploits even have a catchy name: hyperjacking.

Microkernels have a smaller TCB, and those using separation kernel technology can have the highest levels of security and isolation. The proof of that security level is certification to the SKPP published by the NSA or similar security standards such as Common Criteria EAL6. Some hypervisors include some separation kernel principles to improve security, but no hypervisor has been certified to the SKPP or similar security standards such as Common Criteria EAL6. For systems that require isolation but not virtualization, a microkernel-based separation kernel provides the highest level of security without the overhead and extended code base of a hypervisor.

Optimizing for performance, security

An example of a virtualization solution optimized for both the highest real-time performance and the highest security is the INTEGRITY-178 tuMP RTOS from Green Hills Software, a microkernel-based separation kernel with full virtualization services including the ability to run multiple guest operating systems without modification. As opposed to hypervisor-based virtualization solutions, real-time applications can run directly on this RTOS without a virtualization layer penalty in terms of latency or determinism.

As a separation kernel, the RTOS fully isolates multiple applications/partitions and controls the information flow between applications/partitions and external resources. In part, that includes protection of all resources from unauthorized access, isolation of partitions except for explicitly allowed information flows, and a set of audit services. The result is that a separation kernel provides high-assurance partitioning and information flow control that satisfy the NEAT [nonbypassable, evaluatable, always invoked, and tamperproof] security policy attributes.

INTEGRITY-178 is the only commercial OS or hypervisor that has ever achieved certification to the SKPP published by the NSA as well as Common Criteria EAL6+. That security pedigree has been extended to the multicore INTEGRITY-178 tuMP RTOS.

Rich Jaenicke is director of marketing for safety and security-critical products at Green Hills Software. Prior to Green Hills, he served as director of strategic marketing and alliances at Mercury Systems, and held marketing and technology positions at XCube, EMC, and AMD. Rich earned an MS in computer systems engineering from Rensselaer Polytechnic Institute and a BA in computer science from Dartmouth College.

Green Hills Software
www.ghs.com