Pre-migration compatibility checks

It's not possible, or at least not wise, to allow migration from any host to any other host. There are a number of things that need to be set up precisely the same on both the source and destination sides for migration to have any chance of working. Libvirt should have some sort of pre-migration call that, given a destination host, will tell you whether a migration from here to there is likely to succeed.

For such a call to be widely useful, the libvirt API would have to allow the caller to specify which of the available checks it is interested in, and then return some indication of whether there are show-stopper problems or just problems that may make things sub-optimal on the remote side. The caller can then decide what action it wants to take.

The corollary to this "will migration succeed" call is a call which, given two hosts A and B, figures out the lowest common denominator at which a guest needs to be run so that a future migration is likely to be successful.
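
As a rough illustration, the two calls might look something like the sketch below. The names, flags, and result codes are invented for this sketch; none of them exist in the libvirt API today.

    /* Hypothetical API sketch -- these names, flags, and result codes are
     * made up for illustration and are not part of libvirt. */
    #include <libvirt/libvirt.h>

    /* Which classes of checks the caller is interested in. */
    typedef enum {
        VIR_MIGRATE_CHECK_HYPERVISOR = (1 << 0),
        VIR_MIGRATE_CHECK_ARCH       = (1 << 1),
        VIR_MIGRATE_CHECK_CPU_FLAGS  = (1 << 2),
        VIR_MIGRATE_CHECK_VCPUS      = (1 << 3),
        VIR_MIGRATE_CHECK_MEMORY     = (1 << 4),
        VIR_MIGRATE_CHECK_NETWORK    = (1 << 5),
        VIR_MIGRATE_CHECK_DISKS      = (1 << 6),
    } virMigrateCheckFlags;

    /* Worst outcome across the requested checks. */
    typedef enum {
        VIR_MIGRATE_CHECK_OK      = 0,  /* migration should succeed */
        VIR_MIGRATE_CHECK_WARNING = 1,  /* likely to work, may be sub-optimal */
        VIR_MIGRATE_CHECK_FATAL   = 2,  /* migration will almost certainly fail */
    } virMigrateCheckResult;

    /* "Will migrating this domain to the host behind dconn likely succeed?" */
    int virDomainMigrateCheck(virDomainPtr dom,
                              virConnectPtr dconn,
                              unsigned int checkFlags,
                              virMigrateCheckResult *worst);

    /* "Given hosts A and B, what is the lowest common denominator guest
     * configuration that both can run, so a future migration is likely to
     * succeed?"  Returns a description the caller would use at guest start. */
    char *virConnectBaselineGuestConfig(virConnectPtr connA,
                                        virConnectPtr connB,
                                        unsigned int checkFlags);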

BASIC CHECKS

  1. Matching hypervisors - Make sure migration between Xen and KVM, for instance, is disallowed. Trying to migrate from a "new" hypervisor (say, Xen 3.2) to an "old" hypervisor (say, Xen 3.1) would generate a warning. That should work, in theory, but maybe the caller would prefer not to do that if possible. KVM and Xen are currently *accidentally* incompatible, but this should be explicit.
  2. Matching CPU architectures - Make sure the architecture of the destination can run a guest of the type that is currently running on the source. For instance, an x86_64 hypervisor might be able to run i386 guests (subject to hypervisor versioning, of course), but an ia64 hypervisor can't run an x86_64 guest.
  3. CPU flags - The CPU flags of the destination must be a superset of the CPU flags that were presented to the guest at startup. Many OSes and applications check for CPU flags once at startup to choose optimized routines, and then never check again. If the guest happens to select sse3, and sse3 is not there on the destination, the guest will take an unrecoverable GPF the next time it executes one of those instructions, and crash.

    This is where CPU masking technology and the "lowest common denominator" API can make a difference. If a guest is started with some CPU flags masked off, it widens the potential migration pool for that guest. (The first sketch after this list shows how a flag-superset check like this might be implemented.)
  4. Number of CPUs - For performance reasons, the destination must usually have one physical CPU for each guest virtual CPU. However, for temporary or emergency situations, this may not be a hard requirement, so the caller would choose whether to check for and act on this.
  5. Memory
    1. non-NUMA -> non-NUMA - The destination should have enough memory to fit the guest memory. With certain hypervisors that support memory overcommit, this might be a little tricky, but the caller should be warned that they may either OOM the destination, or cause poor performance over there by dipping into swap.
    2. non-NUMA -> NUMA - For best performance, the guest should be put into a single NUMA node on the destination side. This may be a bad idea, though, if that node is already very busy. This is a NUMA placement problem.
    3. NUMA -> non-NUMA - Like 1, the destination just needs to have enough memory to fit the guest memory.
    4. NUMA -> NUMA - Like 2, in that this is a NUMA placement problem.
  6. Networking
    1. The destination must have the same bridges (with the same names) that the guest is currently using. Additional checks to see whether those bridges are hooked to the same physical network as the source side would be nice, but may be too difficult/out-of-bounds.
    2. The device model on the destination must support the network devices that are currently in use in the guest. That is, if the guest is using an emulated e1000 NIC and the destination's device model doesn't support it, the migration is going to fail. (The second sketch after this list shows the presence checks for bridges and device models.)
  7. Disks
    1. All of the disks that the guest is currently using must be available on the destination side at the same paths. The file on the destination must actually be the same file as on the source side, not just a file with the same path. For traditional file-based disks, path names may be the only viable check. For devices (like LVM volumes, disk partitions, and full disks), it may be possible to validate that the device on the destination is exactly the same one (by checking UUIDs, etc).
    2. The device model on the destination must support the disk devices that are currently in use in the guest. That is, if the guest is using a virtio drive, then the destination device model must also support virtio.
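
To make the comparison logic behind checks 1-5 concrete, here is a small, self-contained sketch. The structs, field names, and sample values are invented for illustration; a real implementation would pull this information from the capabilities reported by each host.

    /* Stand-alone sketch of the basic capability comparison; the structs and
     * values are invented and are not libvirt data structures. */
    #include <stdio.h>
    #include <string.h>

    struct host_caps {
        const char *hv_type;          /* e.g. "Xen", "KVM" */
        const char *arch;             /* e.g. "x86_64" */
        unsigned long long cpu_flags; /* bitmask of CPU feature flags */
        unsigned int pcpus;
        unsigned long long free_mem_kb;
    };

    struct guest_reqs {
        const char *hv_type;
        const char *arch;
        unsigned long long cpu_flags; /* flags presented to the guest at startup */
        unsigned int vcpus;
        unsigned long long mem_kb;
    };

    enum check_result { CHECK_OK, CHECK_WARNING, CHECK_FATAL };

    static enum check_result check_basic(const struct guest_reqs *g,
                                         const struct host_caps *dst)
    {
        enum check_result worst = CHECK_OK;

        /* Hypervisor and architecture mismatches are show-stoppers.  A real
         * arch check would also allow compatible pairs such as an i386 guest
         * on an x86_64 hypervisor. */
        if (strcmp(g->hv_type, dst->hv_type) != 0 ||
            strcmp(g->arch, dst->arch) != 0)
            return CHECK_FATAL;

        /* Destination flags must be a superset of what the guest saw at boot;
         * a missing flag (e.g. sse3) can crash the guest after migration. */
        if ((g->cpu_flags & dst->cpu_flags) != g->cpu_flags)
            return CHECK_FATAL;

        /* Fewer physical CPUs than guest vCPUs is allowed but sub-optimal. */
        if (dst->pcpus < g->vcpus)
            worst = CHECK_WARNING;

        /* Too little free memory risks OOM or swapping on the destination. */
        if (dst->free_mem_kb < g->mem_kb)
            worst = CHECK_WARNING;

        return worst;
    }

    int main(void)
    {
        struct guest_reqs guest = { "KVM", "x86_64", 0x7, 4, 4096 * 1024ULL };
        struct host_caps dest   = { "KVM", "x86_64", 0xF, 2, 8192 * 1024ULL };
        static const char *names[] = { "ok", "warning", "fatal" };

        printf("basic checks: %s\n", names[check_basic(&guest, &dest)]);
        return 0;
    }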
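
The networking and disk checks (6 and 7) largely reduce to "everything the guest depends on must be present on the destination". A minimal sketch of that presence check, with example bridge and device-model names standing in for data that would really be queried from both sides:

    /* Sketch of the bridge / device-model presence checks; the name lists
     * are invented examples rather than data from a real host. */
    #include <stdbool.h>
    #include <stdio.h>
    #include <string.h>

    /* True if every string in `needed` also appears in `available`
     * (both lists are NULL-terminated). */
    static bool all_present(const char *const *needed,
                            const char *const *available)
    {
        for (; *needed; needed++) {
            bool found = false;
            for (const char *const *a = available; *a; a++)
                if (strcmp(*needed, *a) == 0)
                    found = true;
            if (!found)
                return false;
        }
        return true;
    }

    int main(void)
    {
        /* Bridges and device models the guest currently uses. */
        const char *guest_bridges[] = { "br0", NULL };
        const char *guest_models[]  = { "e1000", "virtio", NULL };

        /* What the destination host and its device model provide. */
        const char *dst_bridges[] = { "br0", "br1", NULL };
        const char *dst_models[]  = { "rtl8139", "e1000", "virtio", NULL };

        bool ok = all_present(guest_bridges, dst_bridges) &&
                  all_present(guest_models, dst_models);
        printf("network/disk checks: %s\n", ok ? "ok" : "fatal");
        return 0;
    }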

ADDITIONAL CHECKS

  1. Time sources - When a guest starts, it does some initial checks to determine the frequency of the processor it is running on. If it is then migrated to a machine with a different clock frequency, time can drift in the guest. Guests with paravirtual interfaces can be told to re-synchronize their clocks on certain events (like migration), but unmodified guests cannot. For this reason, the API may want to warn if the guest is being migrated from a host with one clock frequency to a host with a different clock frequency. (The sketch after this list shows one way such a difference might be flagged.)
  2. PCI-passthrough - If the guest is using PCI passthrough support, it usually doesn't make sense to migrate it. However, this may become possible in the future (by bonding a PCI-passthrough device to a PV NIC), so this isn't a hard failure.
  3. MSRs - Model Specific Registers are handled in a hodge-podge fashion in virtualization; some are emulated for guests, some are directly controlled by guests (xenoprofile), and some aren't emulated at all. This may or may not be a real problem in practice, and given all of the possible permutations, it's probably best to ignore it for now.
  4. CPUID - There is a lot of model-specific information encoded in the various CPUID calls (cache size, cache line size, etc). If this changes underneath a guest, it might get unhappy.
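
A couple of these additional checks could feed into the same warning-versus-fatal scheme as the basic checks. The sketch below uses invented fields and an arbitrary 1 MHz tolerance purely for illustration.

    /* Sketch of the time-source and PCI-passthrough checks; the fields and
     * tolerance are arbitrary, not values measured from a real host. */
    #include <stdbool.h>
    #include <stdio.h>

    enum check_result { CHECK_OK, CHECK_WARNING, CHECK_FATAL };

    struct extra_info {
        double src_cpu_mhz, dst_cpu_mhz;  /* host clock frequencies */
        bool guest_has_pv_timesource;     /* PV guests can resync after migration */
        bool guest_uses_pci_passthrough;
    };

    static enum check_result check_extras(const struct extra_info *x)
    {
        enum check_result worst = CHECK_OK;

        /* Unmodified guests may see time drift if the clock frequency changes. */
        double delta = x->src_cpu_mhz - x->dst_cpu_mhz;
        if (delta < 0)
            delta = -delta;
        if (!x->guest_has_pv_timesource && delta > 1.0)
            worst = CHECK_WARNING;

        /* PCI passthrough usually can't follow the guest; treated as a warning
         * rather than a hard failure, per item 2 above. */
        if (x->guest_uses_pci_passthrough)
            worst = CHECK_WARNING;

        return worst;
    }

    int main(void)
    {
        struct extra_info x = { 2400.0, 2666.0, false, false };
        static const char *names[] = { "ok", "warning", "fatal" };

        printf("additional checks: %s\n", names[check_extras(&x)]);
        return 0;
    }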