KVM migration fails when tunnelled due to parsing error in qemu monitor

Bug #869590 reported by Simon Déziel
This bug affects 1 person
Affects            Status         Importance   Assigned to    Milestone
libvirt (Ubuntu)   Fix Released   High         Serge Hallyn
Lucid              Fix Released   Medium       Serge Hallyn

Bug Description

While attempting a tunnelled live migration like this:

virsh migrate --live --tunnelled --p2p guest1 qemu+ssh://192.168.99.3/system

the source host (node1) logs this error :

Oct 6 17:44:06 node2 libvirtd: 17:44:06.513: error : qemuMonitorTextGetMigrationStatus:982 : internal error cannot parse migration data transferred statistic 1052 kbytes#015#012remaining ram: 539972 kbytes#015#012total ram: 541024 kbytes#015#012

The guest VM then freezes and starts consuming 100% CPU indefinitely.
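
To make the failure mode concrete, here is a minimal Python sketch of the parsing problem. It is not libvirt's code (that is C, in src/qemu/qemu_monitor_text.c); the helper names are hypothetical, and it assumes the text quoted in the error is what followed the "transferred ram: " prefix of the monitor reply. In the syslog line, #015 and #012 are octal escapes for CR and LF. A parser that expects the field to be nothing but a number rejects the string, while one that reads only the leading digits accepts it.

```python
import re

# Payload copied from the syslog line above. syslog writes control
# characters as '#' plus three octal digits (#015 = CR, #012 = LF).
LOGGED = ("1052 kbytes#015#012remaining ram: 539972 kbytes#015#012"
          "total ram: 541024 kbytes#015#012")


def unescape_syslog(text):
    """Turn '#NNN' octal escapes back into the characters they stand for."""
    return re.sub(r"#([0-7]{3})", lambda m: chr(int(m.group(1), 8)), text)


def parse_strict(field):
    """Accept the field only if it is nothing but a number."""
    return int(field)


def parse_leading_number(field):
    """Read the leading digits and ignore whatever follows them."""
    match = re.match(r"\s*(\d+)", field)
    if match is None:
        raise ValueError("no leading number in %r" % field)
    return int(match.group(1))


def parse_stats(reply):
    """Collect the three counters (in kbytes) from the reply text."""
    stats = {"transferred": parse_leading_number(reply)}
    for name, value in re.findall(r"(remaining|total) ram: (\d+) kbytes", reply):
        stats[name] = int(value)
    return stats


reply = unescape_syslog(LOGGED)

# The strict parser fails on the trailing " kbytes\r\n..." text.
try:
    parse_strict(reply)
    strict_ok = True
except ValueError:
    strict_ok = False

stats = parse_stats(reply)
```

The lenient variant is only meant to show the general idea of tolerating trailing text after the number; the actual upstream change should be read from the commit referenced in the comments.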

=========================================================
SRU Justification:
1. Impact: tunnelled migration fails
2. Development fix: consists of a patch which fixes the parsing of the 'info migration' reply.
3. Stable Fix: a simple cherrypick of the development fix
4. Test case:
 1. Set up two hosts with libvirt-bin.
 2. Work around bug 869553. You can do this by simply turning off apparmor on both.
 3. Have them share storage (I install nfs-kernel-server on one, nfs-client on the other, place /mnt on the server's /etc/exports, and mount it on the client)
 4. Create a VM with disks on the shared storage. I basically follow https://wiki.ubuntu.com/SergeHallyn_libvirtnest but replace the disk and cdrom images under /mnt.
 5. Create an SSH key for root on the client; place it in root's .ssh/authorized_keys on the server. Test that root on the client can ssh to the server. Restart libvirt-bin.
 6. Start the VM on the client.
 7. Migrate using:
virsh migrate --live --tunnelled --p2p cdboot qemu+ssh://server/system
 8. With the patch, the VM should now be running on the server and not the client.
5. Regression potential: this patch only affects the migration code, so if there is a regression it should only affect users of (non-tunnelled, as tunnelled is broken anyway) migration.

=========================================================
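
For anyone scripting the test case, the migration step can be wrapped in a few lines of Python. This is a sketch under assumptions: the function names are hypothetical, the domain and host names ("cdboot", "server") are the ones from the test case, and the only virsh options used are those shown in the command above.

```python
import subprocess


def build_migrate_command(domain, dest_host):
    """Return the argv for the tunnelled peer-to-peer live migration."""
    return [
        "virsh", "migrate", "--live", "--tunnelled", "--p2p",
        domain, "qemu+ssh://%s/system" % dest_host,
    ]


def migrate(domain, dest_host, runner=subprocess.run):
    """Run the migration and return True when virsh exits with status 0.

    `runner` is injectable so the function can be exercised without
    a real pair of hosts.
    """
    result = runner(build_migrate_command(domain, dest_host))
    return result.returncode == 0
```

Calling migrate("cdboot", "server") on the client corresponds to step 7; a True result only means virsh exited cleanly, so step 8 (checking where the VM is running) still has to be verified separately.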

Revision history for this message
Simon Déziel (sdeziel) wrote :

When looking at the upstream git I found that Daniel P. Berrange had already fixed that in commit 0d3eee7fe8bcaa49. I have extracted only the portion of this commit that touches src/qemu/qemu_monitor_text.c and created a debdiff with it.

Revision history for this message
Simon Déziel (sdeziel) wrote :

This bug does not affect Maverick and higher, as they are based on a version that already includes the upstream fix.

I forgot to mention it earlier, but here is the information about my affected system:

# lsb_release -rd
Description: Ubuntu 10.04.3 LTS
Release: 10.04

# apt-cache policy libvirt-bin
libvirt-bin:
  Installed: 0.7.5-5ubuntu27.16
  Candidate: 0.7.5-5ubuntu27.16
  Version table:
 *** 0.7.5-5ubuntu27.16 0
        500 http://archive.ubuntu.com/ubuntu/ lucid-updates/main Packages
        500 http://archive.ubuntu.com/ubuntu/ lucid-security/main Packages
        100 /var/lib/dpkg/status
     0.7.5-5ubuntu27 0
        500 http://archive.ubuntu.com/ubuntu/ lucid/main Packages

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Thanks very much for submitting this bug and the patch. I will test these and submit for SRU.

Changed in libvirt (Ubuntu):
importance: Undecided → High
assignee: nobody → Serge Hallyn (serge-hallyn)
Revision history for this message
Simon Déziel (sdeziel) wrote :

Thanks Serge, I appreciate your responsiveness.

Revision history for this message
Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "Fix parsing of 'info migration' reply" of this bug report has been identified as being a patch in the form of a debdiff. The ubuntu-sponsors team has been subscribed to the bug report so that they can review and hopefully sponsor the debdiff. In the event that this is in fact not a patch, you can resolve this situation by removing the tag 'patch' from the bug report and editing the attachment so that it is not flagged as a patch. Additionally, if you are a member of the ubuntu-sponsors team, please also unsubscribe the team from this bug report.

[This is an automated message performed by a Launchpad user owned by Brian Murray. Please contact him regarding any issues with the action taken in this bug report.]

tags: added: patch
Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Hmm, I'm failing to get tunnelled migration working at all on lucid (tried two different environments). Without --tunnelled, migration succeeds.

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Actually it may be a problem with my debdiff application; I'll give it another shot.

Revision history for this message
Dave Walker (davewalker) wrote :

Marking main task as Fix Released, and opened a Lucid tracking task.

Thanks.

Changed in libvirt (Ubuntu):
status: New → Fix Released
Changed in libvirt (Ubuntu Lucid):
importance: Undecided → Medium
status: New → In Progress
assignee: nobody → Serge Hallyn (serge-hallyn)
milestone: none → lucid-updates
Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

My past (failed) attempts at getting it to work with this patch were due to libvirt-bin running as root, and root not sharing a valid ssh key with root on the remote host. With virsh migrate --tunnelled, the remote connection is made by libvirt-bin, not by the user; with 'virsh -c qemu+ssh:///', it is virsh, running as the user, that makes the connection.

The attached debdiff works for me. I'm going to write up the SRU justification for both this bug and bug 869553 and push this to proposed.

Thank you very much for the patches!
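
A quick way to catch the ssh key problem described above before attempting a tunnelled migration is to check, as root on the source host, that a non-interactive login to the destination works. The sketch below is an assumption about how one might script that check, not part of the fix; the function names are hypothetical. It relies on the OpenSSH option BatchMode=yes, which makes ssh fail instead of prompting for a password.

```python
import subprocess


def build_ssh_check_command(dest_host, user="root"):
    """Return an argv that succeeds only if key-based login works."""
    return ["ssh", "-o", "BatchMode=yes", "%s@%s" % (user, dest_host), "true"]


def can_login_noninteractively(dest_host, user="root", runner=subprocess.run):
    """Return True when ssh logs in and runs `true` without prompting.

    Run this as the same local user the libvirt daemon runs as (root in
    the setup described above), since that is the identity that opens
    the tunnelled connection.
    """
    result = runner(build_ssh_check_command(dest_host, user))
    return result.returncode == 0
```

If this returns False for root while 'virsh -c qemu+ssh://...' works for an ordinary user, the setup matches the situation described in this comment.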

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Ah, no, the apparmor part has to wait until a fix is in precise.

description: updated
Revision history for this message
Stéphane Graber (stgraber) wrote :

Unsubscribing ubuntu-sponsors from the bug as there's nothing ready for upload just yet and Serge has the required upload rights.

Revision history for this message
Martin Pitt (pitti) wrote : Please test proposed package

Hello Simon, or anyone else affected,

Accepted libvirt into lucid-proposed; the package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you in advance!

Changed in libvirt (Ubuntu Lucid):
status: In Progress → Fix Committed
tags: added: verification-needed
Revision history for this message
Simon Déziel (sdeziel) wrote :

Serge, after installing the -proposed package and working around LP: #869553, I can confirm that the tunnel migration works well. I've migrated back and forth several times. Thank you for this SRU, it's much appreciated.

tags: added: verification-done
removed: verification-needed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package libvirt - 0.7.5-5ubuntu27.20

---------------
libvirt (0.7.5-5ubuntu27.20) lucid-proposed; urgency=low

  * Fix parsing of 'info migration' from upstream git. (LP: #869590)
 -- Simon Deziel <email address hidden> Thu, 06 Oct 2011 23:14:08 +0000

Changed in libvirt (Ubuntu Lucid):
status: Fix Committed → Fix Released