Book Review: Virtual Machines: Versatile Platforms for Systems and Processes
By Vineet Chadha
published: Friday, September 02 2005


James E. Smith, Ravi Nair
Morgan Kaufmann, Published June 2005, 638 pages, ISBN 1558609105

Introduction

During my graduate studies and current research work in virtualization, I have felt the need for a book that explicitly relates the fundamentals of computer systems. This book by Smith and Nair forms that bridge between three fundamental areas of systems in computer science: operating systems, computer architecture and programming languages.

The authors have leveraged years of research work in systems and architecture to explore the concepts of virtual machines. The book goes where other books do not, exploring systems research into virtual machines and virtualization technology. Virtual machines are classified into different categories, and the pros and cons of various virtualization approaches are discussed.

The book is very well structured. The virtual machine is a complex topic to explain, but the authors have depicted the inner workings and architecture of VMs (virtual machines) with simple and comprehensive diagrams. They introduce and classify virtual machines from a process and system point of view: process virtual machines and system virtual machines, and explain interpretation and binary translation as two kinds of emulation techniques, emulation being the primary mechanism used in the implementation of virtual machines.

The descriptions of different virtual machines cover historical background and implementation architecture, and touch on the virtualization of multiprocessor machines. The authors explain the potential uses and impact of virtual machines in computing. Case studies at the end of virtually every chapter map virtual machine concepts and ideas to commercial products and related research work in the field of virtualization. The transitions from simple topics to more complex ones are easy to follow, such as from the process VM to the HLL (High Level Language) VM and from the HLL VM to the system VM. The book explains the workings of the VM down to the level of CPU primitives such as register state and the program counter (PC).


Chapter Reviews

Chapter 1 explains the difference between virtualization and abstraction, noting that virtualization doesn’t necessarily hide details. This difference is important, as the term “virtualization” is often used interchangeably with “abstraction”. The text discusses virtualization as an isomorphism between guest and host systems, and the interfaces between different layers in the computer system. The virtual machine taxonomy is based on whether the virtualizing software is placed at the application binary interface (process virtual machine) or between the underlying hardware machine and conventional software (system virtual machine). Each category is further divided according to whether the guest and host use the same or different instruction set architectures. The chapter wraps up with a realistic system comprising a Java application running on a Linux guest VM with Windows as the host VM on the Crusoe architecture, which supports a legacy ISA (instruction set architecture) through a combination of hardware and software in concealed memory (an arrangement known as a co-designed virtual machine).

Virtual machines emulate a source ISA on a target ISA. Chapter 2 explains the working of virtual machines at instruction-level granularity, describing two kinds of emulation techniques (interpretation and binary translation) and the challenges in emulating instructions such as conditional branches. The chapter ends with an excellent summary of emulation methods based on system memory requirements, start-up performance, steady-state performance and code portability. The explanation of emulation is well supported by a look at the workings of the Shade simulation tool.
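To make the difference between the two emulation techniques concrete, here is a minimal sketch of my own (not taken from the book). It assumes a made-up source ISA with five hypothetical opcodes (`li`, `add`, `addi`, `bnez`, `halt`), and it uses Python closures as a stand-in for generated target code. The interpreter decodes every instruction each time it executes; the translator decodes each instruction once, and the run loop then only calls the pre-built closures.

```python
# Toy source "ISA": each instruction is a tuple (opcode, operands...).
# Opcode names and register names are hypothetical, for illustration only.
PROGRAM = [
    ("li", "r1", 5),      # r1 = 5
    ("li", "r2", 0),      # r2 = 0
    ("add", "r2", "r1"),  # r2 += r1
    ("addi", "r1", -1),   # r1 -= 1
    ("bnez", "r1", 2),    # if r1 != 0, branch to instruction 2
    ("halt",),
]


def interpret(program):
    """Decode-and-dispatch interpretation: decode on every execution."""
    regs = {"r1": 0, "r2": 0}
    pc = 0
    while True:
        op, *args = program[pc]
        if op == "li":
            regs[args[0]] = args[1]
        elif op == "add":
            regs[args[0]] += regs[args[1]]
        elif op == "addi":
            regs[args[0]] += args[1]
        elif op == "bnez":
            if regs[args[0]] != 0:
                pc = args[1]
                continue
        elif op == "halt":
            return regs
        else:
            raise ValueError(f"unknown opcode: {op}")
        pc += 1


def translate(program):
    """Translate each source instruction once into a host closure."""
    translated = []
    for op, *args in program:
        if op == "li":
            def step(regs, pc, d=args[0], imm=args[1]):
                regs[d] = imm
                return pc + 1
        elif op == "add":
            def step(regs, pc, d=args[0], s=args[1]):
                regs[d] += regs[s]
                return pc + 1
        elif op == "addi":
            def step(regs, pc, d=args[0], imm=args[1]):
                regs[d] += imm
                return pc + 1
        elif op == "bnez":
            def step(regs, pc, r=args[0], target=args[1]):
                return target if regs[r] != 0 else pc + 1
        elif op == "halt":
            def step(regs, pc):
                return None  # None signals the end of execution
        else:
            raise ValueError(f"unknown opcode: {op}")
        translated.append(step)
    return translated


def run_translated(translated):
    """Execute pre-translated code; no decoding happens in this loop."""
    regs = {"r1": 0, "r2": 0}
    pc = 0
    while pc is not None:
        pc = translated[pc](regs, pc)
    return regs
```

Both paths compute the same final register state for `PROGRAM`. The trade-off the chapter summarizes shows up even here: translation pays a one-time cost and needs extra memory for the translated code, while interpretation starts immediately but repeats the decoding work on every loop iteration.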

Why are computer programs compiled for a certain instruction set and operating system not portable to other combinations of ISA and OS? Chapter 3 stresses the need for a process virtual machine, looking at the implementation details and the interaction between guest and host processes. Compatibility is a critical issue in the implementation of any virtual machine, and extrinsic and intrinsic compatibility are well explained. The compatibility diagram, which maps back to the formal depiction of a virtual machine, is the best description I have found of how virtual machine state maps onto the host physical machine. Memory emulation and issues such as self-referencing code, self-modifying code and memory protection are addressed. Exception handling emulation is a challenge in virtual machines because of its asynchronous nature, and this is explained with a sequence of events between translated and source code. The distinction between Windows and Linux OS emulation is succinctly explained with user/kernel mechanisms such as callbacks, asynchronous procedure calls and exceptions. The performance of a virtual machine depends primarily on the efficiency of the binary translation system, and code caching is suggested as a mechanism for fast binary translation, since translated code can be reused rather than regenerated. Traditional memory management algorithms such as LRU, flush when full, preemptive flush and FIFO are explained in the context of code caching.
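As a rough illustration of how these replacement policies differ in a code cache, here is a small sketch of my own. It is a simplification under stated assumptions: the class `CodeCache`, the helper `count_translations` and the policy names are hypothetical, every translated block is treated as the same size, and preemptive flush is not modelled. The cache maps a source PC to a translated block and counts how often the translator has to run.

```python
from collections import OrderedDict


class CodeCache:
    """Bounded map from source PC to translated block (hypothetical sketch)."""

    def __init__(self, capacity, policy="lru"):
        if policy not in ("lru", "fifo", "flush"):
            raise ValueError(f"unknown policy: {policy}")
        self.capacity = capacity
        self.policy = policy
        self.blocks = OrderedDict()  # insertion order doubles as age order
        self.translations = 0        # number of times the translator ran

    def lookup(self, source_pc, translate):
        """Return the translated block for source_pc, translating on a miss."""
        if source_pc in self.blocks:
            if self.policy == "lru":
                # A hit refreshes the block's position only under LRU.
                self.blocks.move_to_end(source_pc)
            return self.blocks[source_pc]

        # Miss: make room if the cache is full.
        if len(self.blocks) >= self.capacity:
            if self.policy == "flush":
                self.blocks.clear()              # flush when full
            else:
                self.blocks.popitem(last=False)  # oldest (FIFO) or least recent (LRU)

        self.translations += 1
        block = translate(source_pc)
        self.blocks[source_pc] = block
        return block


def count_translations(policy, accesses, capacity=2):
    """Replay a sequence of source PCs and report how many translations occurred."""
    cache = CodeCache(capacity, policy)
    for pc in accesses:
        cache.lookup(pc, lambda p: f"translated block for pc {p}")
    return cache.translations
```

Replaying the same access sequence, for example `count_translations("lru", [1, 2, 1, 3, 1, 2])` against the `"fifo"` and `"flush"` policies, shows how the choice of policy changes the amount of retranslation. A real code cache holds variable-sized blocks, which this sketch ignores.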

Chapter 4 is completely devoted to dynamic binary optimization, the most important factor in getting a virtual machine to perform reasonably in comparison to a physical machine. Look for the authors’ spectrum of emulation techniques and performance tradeoffs. Profiling is explained as a means to provide feedback information and statistics for future code optimization, especially profiles such as the execution frequency of certain sections of code and control flow. Other topics considered are code reordering, optimization based on spatial or temporal locality, procedure inlining, and ways to rearrange blocks: traces, superblocks and tree groups. The chapter ends with the specialized case of same-ISA optimization.
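The idea of frequency profiling can be sketched in a few lines. This is my own illustrative example, assuming a hypothetical trace of basic-block labels and a hypothetical threshold parameter; it is not the book's algorithm. Each block has an execution counter, and a block is reported as "hot" (a candidate for optimization) the moment its counter reaches the threshold.

```python
from collections import Counter


def find_hot_blocks(block_trace, threshold):
    """Return blocks in the order their execution count first reaches threshold.

    block_trace is a sequence of basic-block labels in execution order;
    both the labels and the threshold value are illustrative assumptions.
    """
    counts = Counter()
    hot = []
    for block in block_trace:
        counts[block] += 1
        if counts[block] == threshold:
            hot.append(block)  # reported once, when the threshold is first reached
    return hot
```

For instance, `find_hot_blocks(["A", "B", "A", "C", "A", "B", "A"], threshold=3)` singles out block `A`, the only block executed at least three times. An optimizer would typically spend its effort on such blocks and leave rarely executed code alone.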

Before the advent of VMware as a virtualization platform, virtual machines were already prevalent as execution platforms for high-level languages such as Java. Chapter 5 reviews high-level virtual machine architecture, gives a historical introduction to p-code Pascal virtual machines, and moves on to present-day object-oriented virtual machines like Java VMs. Chapters 5 and 6 discuss programming languages, including an excellent description of the Java virtual machine architecture and the Java API, the view available to developers and users. Look for the excellent distinction between virtual ISA and conventional ISA at the end of the chapter.

Chapter 6 presents the components and implementation details of high-level language virtual machines such as the JVM. I liked the authors’ pointing out the limitations of the security manager in the JVM through comparison with the classic Turing machine halting problem. Garbage collection, an important module of the JVM, is explained, specifically the pros and cons of mark-and-sweep, compacting, copying, generational and incremental collectors. This chapter cleared up my misconception that high-level language machines are slow, and the just-in-time compiler is succinctly explained. The chapter ends with an explanation of Java performance improvements through code optimizations such as code re-layout, method inlining, on-stack replacement, multi-versioning and specialization.
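Of the collectors listed, mark-and-sweep is the simplest to sketch. The following is a toy model of my own, not JVM code: the `Obj` class, the explicit `heap` list and the `roots` list are all assumptions made for illustration. The mark phase follows references from the roots, and the sweep phase keeps only the objects that were marked.

```python
class Obj:
    """A toy heap object that may reference other objects."""

    def __init__(self, name):
        self.name = name
        self.refs = []       # outgoing references to other Obj instances
        self.marked = False  # mark bit used by the collector


def mark(roots):
    """Mark phase: flag every object reachable from the roots."""
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if not obj.marked:
            obj.marked = True
            stack.extend(obj.refs)


def sweep(heap):
    """Sweep phase: keep marked objects and clear their mark bits."""
    live = [obj for obj in heap if obj.marked]
    for obj in live:
        obj.marked = False  # reset for the next collection cycle
    return live


def collect(heap, roots):
    """Run one mark-and-sweep cycle and return the surviving heap."""
    mark(roots)
    return sweep(heap)
```

Because reachability is computed from the roots, a cycle of objects that nothing else points to is reclaimed, which plain reference counting would miss. The sketch leaves survivors where they are, which is the fragmentation problem that compacting and copying collectors address.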

In the past, hardware was expensive and slow, and legacy ISAs used to be customized to specific hardware implementations. Chapter 7 focuses on backward compatibility of legacy ISAs in current microprocessors. For example, a modern IA-32 (Intel Architecture, 32 bits) implementation converts CISC (complex instruction set computer) instructions into RISC (reduced instruction set computer) instructions; the conversion is performed entirely in hardware. The text stresses that compatibility is a major obstacle to implementing new ISAs that are better suited to current technology, and that virtual machine technology addresses the problem of compatibility through co-designed VMs. A co-designed VM supports a legacy ISA through a combination of hardware and software running in concealed memory; Transmeta Crusoe is an example of a co-designed virtual machine.

Look for potential applications for system virtual machines in Chapter 8. System virtualization is explained in the context of three resource virtualizations: processors, memory and I/O. What are the conditions of ISA virtualizability? The authors take us back to a classic research paper by Popek and Goldberg. The relationship between a virtual machine monitor (VMM) and a virtual machine is analogous to the relationship between an operating system and application programs. The functions of a VMM are explained in three parts: dispatcher, allocator and interpreter routines. Look for the classification of ISA instructions: privileged, non-privileged, sensitive and innocuous. The limitations of recursive virtualization are succinctly explained, and memory virtualization is described through the mapping between virtual, real and physical memory.
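The Popek and Goldberg condition can be stated as a simple set check: an ISA is virtualizable in their sense if every sensitive instruction is also privileged, so that it traps when executed in user mode and the VMM gets a chance to intervene. Here is a minimal sketch of that check, assuming two made-up instruction sets; the instruction names and the `Instruction` class are hypothetical and do not describe any real ISA.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instruction:
    name: str
    privileged: bool  # traps when executed in user mode
    sensitive: bool   # reads or changes machine configuration or resource state


def unvirtualizable_instructions(isa):
    """Return names of sensitive instructions that do not trap in user mode."""
    return sorted(i.name for i in isa if i.sensitive and not i.privileged)


def is_virtualizable(isa):
    """Popek-Goldberg condition: sensitive instructions are a subset of privileged ones."""
    return not unvirtualizable_instructions(isa)


# Hypothetical ISA in which every sensitive instruction is privileged.
CLEAN_ISA = [
    Instruction("add", privileged=False, sensitive=False),
    Instruction("load", privileged=False, sensitive=False),
    Instruction("set_page_table", privileged=True, sensitive=True),
]

# Same ISA plus one sensitive instruction that runs silently in user mode.
LEAKY_ISA = CLEAN_ISA + [
    Instruction("read_mode_bits", privileged=False, sensitive=True),
]
```

Here `is_virtualizable(CLEAN_ISA)` holds, while `unvirtualizable_instructions(LEAKY_ISA)` names the one offender. An instruction of that kind is what forces a VMM to fall back on techniques such as binary translation or paravirtualization.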

What are the common categories of devices in computer systems? The authors explain virtualization techniques for dedicated, partitioned and shared devices. Performance is an important issue for virtual machine technology, and paravirtualization is presented as a way to modify guest operating systems for performance enhancement. Reasons for performance degradation and possible remedies are discussed.

Finally, the most popular virtualization platform, VMware, is explained. A case study of the recently released Intel Vanderpool (VT-x) architecture shows that the IA-32 architecture is not efficiently virtualizable. VT-x technology addresses these limitations by adding a second dimension to the privilege rings. Specifically, VT-x technology eliminates the need to run all guest code in user mode, reduces the overhead of maintaining and changing state, and also eliminates the need for paravirtualization. I liked the details of using the VMCS data structure to manage the state of a virtual machine. The authors provide an excellent preview of virtual machine development in the future.

Multiprocessors are omnipresent these days, and Chapter 9 discusses the motivation behind the virtualization of multiprocessors. The legacy IBM mainframe system is used to explain the concepts of logical and physical partitioning. Issues related to virtualization with different host and guest ISAs are explained in the context of multiprocessors.

In Chapter 10, security, grid computing and the migration of computing environments are presented as potential and emerging applications of virtual machines. Research projects such as the Livewire intrusion detection system are used to explain the role of virtual machines in securing computer systems.


Summary

In the appendix the authors have laid out the prerequisites necessary to understand virtual machine concepts and challenges. Appendix A explains the hardware components of a real computer: processor, memory and I/O. Even though virtual machines have been present for decades, there is revived interest because of the successful virtualization of the Intel IA-32 ISA. The authors explain the privilege rings and register architecture of IA-32, material usually found in Intel manuals. In addition, the organization of the Linux operating system is presented. The appendix is rounded off with a basic introduction to multiprocessor systems, memory coherency issues and common consistency models. Finally, the authors discuss the PowerPC and IA-32 ISAs as examples of the virtualization of instruction set architectures.

I would have added a timeline of important dates in virtualization history, and an initial chapter on terminology related to virtualization. For example, my understanding is that virtualization is also the interposition or interception of a normal course of events. In the context of systems and architecture engineering, the interposition and interception of system calls, firewall security (again interception) and proxy-based web caching have been prevalent from the early days. Another example is that emulation is also interception during the normal course of an event. In fact, in places, the authors do talk about instruction interception.

Also, the authors could have addressed other open source projects like User Mode Linux, an open source alternative to VMware, or the virtualization of the IA-64 (Intel architecture, 64 bits) ISA. The Xen virtual machine platform is gaining popularity, and I would have liked to see it covered in detail. But in fact, virtualization is such a dynamic field that it is very difficult to incorporate all emerging trends. The book stresses the software aspect of virtual machines and architecture. I believe there is a lot more to explore related to hardware virtualization and the impact of virtualization on the hardware industry.

This book is a good candidate as a primary text for courses related to virtual machines and advanced operating systems. It was written for the research community and successfully connects with that audience, and it is a good addition to the collections of performance and compiler engineers.

*****

Vineet Chadha is pursuing his PhD in computer information science and engineering at the University of Florida. His research interests include virtualization, operating systems and distributed computing.