
Thread: Updates break xen + ceph RBD

  1. #1

    Updates break xen + ceph RBD

    After updating, xen domains with ceph RBD disks are unable to attach their disks.

    disk = [ "backendtype=qdisk,vdev=xvda,target=rbd:rbd/test1", ...


    for LOG in `ls -1t /var/log/xen/*test1* | head -2`; do echo "== $LOG =="; cat $LOG; done
    == /var/log/xen/qemu-dm-test1.log ==
    VNC server running on 127.0.0.1:5910
    xen be: qdisk-51712: xen be: qdisk-51712: error: Parameter 'locking' is unexpected
    error: Parameter 'locking' is unexpected
    xen be: qdisk-51712: xen be: qdisk-51712: initialise() failed
    initialise() failed
    == /var/log/xen/xl-test1.log ==
    Waiting for domain test1 (domid 35) to die [pid 12605]

    Anyone know if the "locking" parameter can be dropped?
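
    For reference, the parameters the toolstack hands to the qdisk backend land in xenstore, so they should be visible there while the attach is failing. Something like this (domid 35 and device 51712 are the values from the logs above):

    # dump the backend nodes for the failing qdisk (domid 35, device 51712)
    xenstore-ls /local/domain/0/backend/qdisk/35/51712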

    --
    Karl

  2. #2

    Re: Updates break xen + ceph RBD

    There is "can" and there is "should."

    Does the lock name, or anything else, provide a clue as to why the lock was applied? And, depending on the situation, is there a broken shared lock that might have provided overriding access?

    Only you know what you created, so only you can really know for sure whether this breaks your design or not.

    TSU
    Beginner Wiki Quickstart - https://en.opensuse.org/User:Tsu2/Quickstart_Wiki
    Solved a problem recently? Create a wiki page for future personal reference!
    Learn something new?
    Attended a computing event?
    Post and Share!

  3. #3

    Re: Updates break xen + ceph RBD

    Well, an RBD can be locked/unlocked:

    # rbd lock ls rbd/test1
    # rbd lock add rbd/test1 1
    # rbd lock ls rbd/test1
    There is 1 exclusive lock on this image.
    Locker           ID  Address
    client.54470026  1   192.168.XXX.XXX:0/597590677
    # rbd lock remove rbd/test1 1 client.54470026
    # rbd lock ls rbd/test1
    #

    But it would be qemu that would manage that while the rbd is being used as a qdisk by a xen domain.

    It looks like the xen backend is calling qemu with a "locking" parameter, and the updated qemu no longer supports it for the qdisk that the xen backend is trying to create.

    Seems to me that the xen and qemu updates are out of sync.
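
    A quick way to check whether the two sides really are in sync (the package names here are the openSUSE ones on my dom0's; adjust if yours differ):

    # compare what the last update installed on the xen and qemu sides
    rpm -q xen-tools qemu-x86 qemu-block-rbd librbd1
    # or, for repo and version details:
    zypper info xen-tools qemu-x86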

  4. #4

    Re: Updates break xen + ceph RBD

    If it doesn't disrupt your storage, you might test your theory by mounting your disk file as a loop device and then accessing it as a raw disk, thereby bypassing the normal virtualized disk access. And yes, AFAIK it's pretty common for virtualized I/O to place a lock on configured storage devices.

    If you don't mount loop devices often, you might find my wiki writeup useful:
    https://en.opensuse.org/User:Tsu2/loop_devices
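
    As a minimal sketch, assuming a raw image file (the path is just an example, substitute your own):

    # attach the image read-only, scanning for partitions (-P);
    # losetup prints the device it picked, e.g. /dev/loop0
    losetup --find --show --read-only -P /path/to/disk.img
    # mount one partition to inspect it
    mount -o ro /dev/loop0p1 /mnt
    # ... inspect ...
    umount /mnt
    losetup -d /dev/loop0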

    If you confirm what you theorize and your problem just happened with the latest update, it's worth submitting a bug to https://bugzilla.opensuse.org.

    Personally, I hadn't heard of anyone running Ceph virtualized... and I'd have to think about why someone would do so. My first instinct is that Ceph is best deployed on bare metal as the underlying storage for other use... but I'll look into this further and may find some interesting things I hadn't thought of.

    TSU

  5. #5

    Re: Updates break xen + ceph RBD

    Thought more about what I just posted; my suggestion about RBD and mounting as a loop device doesn't make much sense (of course).
    Although you weren't specific, I'm assuming you're configuring block device access rather than presenting a file system or some other type of storage to Xen.

    I assume you've installed librbd and followed the Xen/Ceph guidelines:
    https://wiki.xenproject.org/wiki/Ceph

    Your description does sound like a problem with librbd, and accessing the image as a raw file system could be a test.
    But in any case, if this was the result of a recent update, it should be reported to the bugzilla.
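
    One way to exercise librbd without Xen in the loop at all, assuming qemu-img and the qemu rbd module are installed (the image name is taken from your first post):

    # qemu-img goes through the same qemu rbd block driver (librbd underneath)
    # that the qdisk backend would use
    qemu-img info rbd:rbd/test1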

    TSU

  6. #6

    Re: Updates break xen + ceph RBD

    Using RBDs as physical disks passed to xen domains is on my list of things to play with.

    I'm working with two xen dom0's that are ceph clients. It is really simple to "map" an rbd as a block device on each dom0:

    xen0:~ # rbd lock ls test1
    xen0:~ # rbd map rbd/test1
    /dev/rbd0
    xen0:~ # ls -l /dev/rbd
    rbd/ rbd0 rbd0p1 rbd0p2
    xen0:~ # ls -l /dev/rbd/rbd/test1
    lrwxrwxrwx 1 root root 10 Nov 19 21:50 /dev/rbd/rbd/test1 -> ../../rbd0

    The test1 domain's xl config is on a cephfs that both dom0's share:

    xen1:/cephfs/space/etc/xen/vm # cat test1
    < ... snip ... >
    disk = [ "vdev=xvda,target=/dev/rbd/rbd/test1",
    "file:/cephfs/space/etc/xen/images/openSUSE-Leap-15.1-DVD-x86_64.iso,xvdb:cdrom,r" ]
    < ... snip ... >

    xen1:/cephfs/space/etc/xen/vm # rbd map rbd/test1
    /dev/rbd0
    xen1:/cephfs/space/etc/xen/vm # ls -l /dev/rbd/rbd/test1
    lrwxrwxrwx 1 root root 10 Nov 19 21:51 /dev/rbd/rbd/test1 -> ../../rbd0

    xen1:/cephfs/space/etc/xen/vm # xl create test1
    Parsing config from test1
    xen1:/cephfs/space/etc/xen/vm # xl migrate test1 xen0
    migration target: Ready to receive domain.
    Saving to migration stream new xl format (info 0x3/0x0/1686)
    Loading new save file <incoming migration stream> (new xl fmt info 0x3/0x0/1686)
    Savefile contains xl domain config in JSON format
    Parsing config from <saved>
    xc: info: Saving domain 13, type x86 HVM
    xc: info: Found x86 HVM domain from Xen 4.12
    xc: info: Restoring domain
    xc: info: suse_precopy_policy: domU 13, too many iterations (6/5)
    xc: info: Restore successful
    xc: info: XenStore: mfn 0xfeffc, dom 1, evt 1
    xc: info: Console: mfn 0xfefff, dom 0, evt 2
    migration target: Transfer complete, requesting permission to start domain.
    migration sender: Target has acknowledged transfer.
    migration sender: Giving target permission to start.
    migration target: Got permission, starting domain.
    migration target: Domain started successsfully.
    migration sender: Target reports successful startup.
    Migration successful.

    test1:~ # lsblk
    NAME            MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
    sr0              11:0    1  3.8G  0 rom
    xvda            202:0    0   36G  0 disk
    ├─xvda1         202:1    0    8M  0 part
    └─xvda2         202:2    0   36G  0 part
      ├─system-swap 254:0    0  1.8G  0 lvm  [SWAP]
      ├─system-root 254:1    0 19.8G  0 lvm  /
      └─system-home 254:2    0 14.5G  0 lvm  /home
    test1:~ #

    xen0:~ # xl migrate test1 xen1
    migration target: Ready to receive domain.
    Saving to migration stream new xl format (info 0x3/0x0/1686)
    Loading new save file <incoming migration stream> (new xl fmt info 0x3/0x0/1686)
    Savefile contains xl domain config in JSON format
    Parsing config from <saved>
    xc: info: Saving domain 55, type x86 HVM
    xc: info: Found x86 HVM domain from Xen 4.12
    xc: info: Restoring domain
    xc: info: suse_precopy_policy: domU 55, too many iterations (6/5)
    xc: info: Restore successful
    xc: info: XenStore: mfn 0xfeffc, dom 1, evt 1
    xc: info: Console: mfn 0xfefff, dom 0, evt 2
    migration target: Transfer complete, requesting permission to start domain.
    migration sender: Target has acknowledged transfer.
    migration sender: Giving target permission to start.
    migration target: Got permission, starting domain.
    migration target: Domain started successsfully.
    migration sender: Target reports successful startup.
    Migration successful.


    It's not a bad setup, but I have little experience with it. The complex part of this approach is having to keep the "rbdmap" sync'd across all the xen dom0's:

    https://docs.ceph.com/docs/master/ma...ghlight=rbdmap
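
    The file itself is simple enough; the tedious part is keeping identical copies on every dom0. Something like this, with the id and keyring path as examples only:

    # /etc/ceph/rbdmap -- images to map at boot via rbdmap.service
    # format: poolname/imagename id=client,keyring=path
    rbd/test1 id=admin,keyring=/etc/ceph/ceph.client.admin.keyring

    # then, on each dom0:
    systemctl enable --now rbdmap.service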

    I'm thinking the qcow2 issue is in the qemu code ... I found "is unexpected" in the qemu source but don't have time to run it down.
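
    If anyone wants to chase it, the string is easy to find in an unpacked source tree:

    # locate where qemu rejects the option; the exact file varies by version
    grep -rn "is unexpected" qemu-*/ --include='*.c'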

  7. #7

    Re: Updates break xen + ceph RBD

    Very nice.
    It's intended only as a test, though, and not something that should be deployed regularly: the locks that would normally be applied by Xen, which you've bypassed, can be useful to avoid file version contention and corruption. In general, direct physical access to shared storage should be read-only, not shared, or otherwise managed "intelligently." As the common platform, Xen is in a position, by using virtual I/O, to properly adjudicate contention and avoid problems.

    Based on your test, it does look like something likely happened to librbd; I recommend submitting a bug to https://bugzilla.opensuse.org.
    You can reference this forum thread, and you should probably also include the version number of librbd (you can get that various ways, including "zypper info <package>").
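
    For example (librbd1 is the package name on my openSUSE systems; verify on yours):

    # version details for the bug report
    zypper info librbd1
    # rpm gives just the version string, handy for pasting
    rpm -q librbd1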

    TSU
