Linux kvm guest freeze under Debian

If you run uniprocessor Linux guests under kvm virtualization on a Debian Linux host, you may hit an issue where the kvm-clock clocksource causes freezes, as described here and here.

The issue is difficult to track down. Search engines do not help with the most obvious query, “kvm linux guest freeze”, and the bugs cited above are filed in the Red Hat Bugzilla rather than in any of the three separate bug trackers referenced by the kvm project (the kernel’s bug tracker, Qemu’s Launchpad tracker and the obsolete SourceForge bug tracker). To complicate things further, there is a seemingly unrelated issue with SMP guests.

For all these reasons, I have put together here a thorough description of the symptoms and the cure.

The issue: the guests freeze some time after boot (1 h – 48 h), even if nothing is going on in any of them. Launching more than one guest in quick succession makes it more likely that they all hang within 1 to 60 min. The kvm process corresponding to a frozen guest takes 100% CPU on the host, and the guest does not respond to ssh or to ACPI shutdown/restart; the only way to get rid of it is to kill the process on the host. Even the libvirt daemon stops responding and needs to be force-restarted, for example with this script:

service libvirt-bin stop
# Force-kill any libvirt processes left behind, but not the kvm guests themselves
ps aux | grep libvirt | grep -v kvm | grep -v grep | awk '{ print $2 }' | xargs -r kill -9
service libvirt-bin start

Host: The problem occurred on a single-CPU server with an Intel Xeon 2-core CPU and on a similar machine with a 4-core Intel Xeon, both with hardware virtualization support (VT-x) enabled.

Host OS: Linux Debian 6.0 Squeeze AMD64 with 2.6.32-5-amd64 kernel.

Host kvm version: The kvm package from Debian 6.0 Squeeze (version 0.12.5).

Guest OS: Linux Debian 5 Lenny to Debian 7 Wheezy, both x86 and AMD64. Windows guests do not suffer from this problem.

Qemu command line used to start the guests, something like:

/usr/bin/kvm -S -M pc-0.12 -cpu qemu32 -enable-kvm -m 1024 -smp 1,sockets=1,cores=1,threads=1 -name deb6_32_dev -uuid ... -nodefaults -chardev socket,id=monitor,path=/var/lib/libvirt/qemu/deb6_32_dev.monitor,server,nowait -mon chardev=monitor,mode=readline -rtc base=utc -boot c -drive file=/dev/...,if=none,id=drive-virtio-disk0,boot=on,format=raw -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -drive if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -device virtio-net-pci,vlan=0,id=net0,mac=...,bus=pci.0,addr=0x3 -net tap,fd=48,vlan=0,name=hostnet0 -chardev pty,id=serial0 -device isa-serial,chardev=serial0 -usb -device usb-tablet,id=input0 -vnc 127.0.0.1:17 -k it -vga vmware -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5

Troubleshooting tricks you could try (with no success): The problem does not go away if kvm is started with the -no-kvm-irqchip switch, and it is still there if kvm is started with the -no-kvm switch (freezes only take longer to occur). There is nothing relevant in the libvirt logs, nothing on the guest console, nothing in the guest dmesg, and no particular process is active when the freeze occurs. The strace output for the kvm processes always shows something like this as the last system call:

futex(0x858560, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>

The solution: Set the guest clock source to acpi_pm rather than kvm-clock. This is best accomplished at boot via kernel parameters passed by grub. With grub2 (the default for Debian 6 Squeeze and 7 Wheezy), modify the /etc/default/grub file replacing this line:

GRUB_CMDLINE_LINUX=""

with:

GRUB_CMDLINE_LINUX="clocksource=acpi_pm"

Then run update-grub to update the actual configuration file /boot/grub/grub.cfg and reboot.
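If you have many guests to fix, the grub2 edit above can be scripted. The following is only a sketch: the function name set_grub_clocksource is made up for this example, and it assumes the GRUB_CMDLINE_LINUX value is enclosed in double quotes, as in the stock Debian file. It appends the parameter to whatever is already on that line and leaves the line alone if a clocksource= setting is already present.

```shell
#!/bin/sh
# Sketch: add clocksource=acpi_pm to GRUB_CMDLINE_LINUX in a grub2 defaults file.
# set_grub_clocksource is a hypothetical helper name, not part of grub.
# Assumes the value is enclosed in double quotes.
set_grub_clocksource() {
    file="${1:-/etc/default/grub}"
    tmp="$(mktemp)" || return 1
    # Only the GRUB_CMDLINE_LINUX= line is touched (not ..._LINUX_DEFAULT=).
    # 1) if no clocksource= is set yet, append it before the closing quote;
    # 2) drop the leading space left behind when the value was empty.
    if sed -e '/^GRUB_CMDLINE_LINUX=/{' \
           -e '/clocksource=/!s/"$/ clocksource=acpi_pm"/' \
           -e 's/=" /="/' \
           -e '}' "$file" > "$tmp"; then
        # Write back through cat so the file keeps its owner and permissions.
        cat "$tmp" > "$file"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}

# Usage inside a guest, as root:
#   set_grub_clocksource && update-grub && reboot
```

Take a copy of /etc/default/grub before running it, and check the resulting line by eye before rebooting.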

With legacy grub (the default with Debian 5 Lenny), edit the /boot/grub/, changing the following line in the Default Options section:

# kopt=root=/dev/hda1 ro

to:

# kopt=root=/dev/hda1 ro clocksource=acpi_pm

Do this without uncommenting the line! update-grub reads the kopt setting from the commented line. Then run update-grub and reboot.
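The legacy grub edit can be scripted in the same spirit. Again this is just a sketch under assumptions: append_kopt_clocksource is a made-up name, and the path of the legacy grub configuration file is left as an argument that you must supply. The line stays commented, only the parameter is appended, and lines starting with "## " (the examples in the Default Options section) are not touched.

```shell
#!/bin/sh
# Sketch: append clocksource=acpi_pm to the commented "# kopt=" line of a
# legacy grub configuration file. append_kopt_clocksource is a hypothetical
# helper name; the file path must be passed as the first argument.
append_kopt_clocksource() {
    file="$1"
    [ -n "$file" ] || return 1
    tmp="$(mktemp)" || return 1
    # Match only "# kopt=" (single hash), and skip it if clocksource= is
    # already there so that running the function twice is harmless.
    if sed -e '/^# kopt=/{' \
           -e '/clocksource=/!s/$/ clocksource=acpi_pm/' \
           -e '}' "$file" > "$tmp"; then
        # Write back through cat so the file keeps its owner and permissions.
        cat "$tmp" > "$file"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}

# Usage inside a guest, as root (replace the placeholder with your file):
#   append_kopt_clocksource <legacy grub config file> && update-grub && reboot
```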

After doing these changes, verify that the correct clock source is active in each guest, by issuing:

cat /sys/devices/system/clocksource/clocksource0/current_clocksource

you should get:

acpi_pm
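With several guests it is handy to wrap this check in a small function that can be run over ssh or from a monitoring script. A minimal sketch follows; the function name check_clocksource and its two optional arguments (expected name and sysfs directory) are my own invention for this example, only the sysfs path comes from the command above.

```shell
#!/bin/sh
# Sketch: verify that the active clock source is the expected one.
# check_clocksource is a hypothetical helper name.
#   $1: expected clock source (default: acpi_pm)
#   $2: sysfs directory to read from (default: the real clocksource0 directory)
# Returns 0 on match, 1 on mismatch, 2 if the sysfs file cannot be read.
check_clocksource() {
    expected="${1:-acpi_pm}"
    dir="${2:-/sys/devices/system/clocksource/clocksource0}"
    current="$(cat "$dir/current_clocksource")" || return 2
    if [ "$current" = "$expected" ]; then
        echo "ok: clock source is $current"
        return 0
    fi
    echo "warning: clock source is $current, expected $expected" >&2
    return 1
}

# Usage inside a guest:
#   check_clocksource || echo "this guest may still freeze"
```

The second argument exists only so the function can be tried against a scratch directory; in a guest you would call it without arguments.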

Good luck!