[ClusterLabs] Live migration not working on shutdown

Ken Gaillot kgaillot at redhat.com
Tue Nov 8 21:16:15 UTC 2016


On 11/04/2016 05:51 AM, IT Nerb GmbH wrote:
> Quoting Klaus Wenninger <kwenning at redhat.com>:
> 
>> On 11/02/2016 06:32 PM, Ken Gaillot wrote:
>>> On 10/26/2016 06:12 AM, Rainer Nerb wrote:
>>>> Hello all,
>>>>
>>>> we're currently testing a 2-node-cluster with 2 vms and live migration
>>>> on CentOS 7.2 and Pacemaker 1.1.13-10 with disks on iSCSI-targets and
>>>> migration via ssh-method.
>>>>
>>>> Live migration works if we issue "pcs resource move ...", "pcs cluster
>>>> standby", "pcs cluster stop" and even "systemctl rescue".
>>>> The latter only worked after we added the following additional
>>>> dependencies to pacemaker.service and left the management of those
>>>> services to systemd:
>>>>
>>>>   * After/Requires=systemd-machined.service
>>>>   * After/Requires=systemd-machine-id-commit.service
>>>>   * After/Requires=remote-fs.target
>>>>   * After/Requires=libvirtd.service
>>>>   * After/Requires=iscsi.service
>>>>   * After/Requires=iscsid.service
>>>>   * After/Requires=sshd.service
>>> This makes sense when clustered resources depend on services that aren't
>>> themselves managed by the cluster. It's dependent on your situation, so
>>> it's not something that pacemaker can solve generically.
> Our first approach was to use systemd resources, as there are no ocf:
> resource agents for iSCSI initiators or libvirtd in our distribution.
> But then migration failed even on "systemctl rescue".
>>>
>>> You may already be aware, but the easiest way to add such requirements
>>> is to put them in a systemd unit override, e.g.
>>> /etc/systemd/system/pacemaker.service.d/dependencies.conf.
> Yes, that's how we implemented the additional dependencies.
>>>
>>>> When shutting down or rebooting, migration fails and not even the
>>>> regular shutdown of the VMs succeeds. Systemd seems to tear down the
>>>> VMs by terminating something they depend on.
>>>>
>>>> Is this a known issue? Did we miss any further dependencies?
>>> There was a shutdown issue when using systemd-class cluster resources
>>> (systemd: instead of ocf:), but I believe that was fixed in the package
>>> you're using, and it's probably not relevant here anyway.
>> Speaking of
>> https://github.com/ClusterLabs/pacemaker/pull/887/commits/6aae8542abedc755b90c8c49aa5c429718fd12f1?
>>
>>
>> It shouldn't be in CentOS 7.2, but I agree: unless there are
>> systemd resources involved, it wouldn't matter.
>>
>>>
>>> It does sound like there's another dependency, but I don't know what.
>>>
>>> What log messages do you see on the failure?
> See attached log files.

The line that stands out to me is:

Nov  4 11:11:39 kvm02 systemd: Stopping Virtual Machine qemu-2-samba2.

Systemd is stopping the VM before pacemaker is able to migrate it. I'm
guessing that is due to this line in the libvirt unit:

Before=libvirt-guests.service

It appears systemd feels free to do that part in parallel, even though
libvirt itself has to wait until pacemaker finishes stopping. Try adding
libvirt-guests to your pacemaker override.
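
As a rough sketch of what that could look like, assuming the drop-in path
mentioned earlier in this thread and leaving the entries already there
untouched: only the libvirt-guests line is new. Whether a matching Requires=
is also wanted is a judgment call; this sketch assumes ordering alone is
enough, since Requires= would additionally pull libvirt-guests in at startup.

```ini
# /etc/systemd/system/pacemaker.service.d/dependencies.conf
# (add to the existing [Unit] section; other After=/Requires= lines omitted)
[Unit]
# systemd stops units in the reverse of their start order, so ordering
# pacemaker after libvirt-guests means pacemaker is stopped first at
# shutdown and libvirt-guests has to wait for it.
After=libvirt-guests.service
```

After editing the file, "systemctl daemon-reload" makes systemd pick up the
change, and "systemctl show -p After pacemaker.service" should then list
libvirt-guests.service among the ordering dependencies.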

>>>
>>>> Tia
>>>> Rainer
>>>>
>>>>
>>>> ------------------------------------------------------------------------
>>>>
>>>> IT Nerb GmbH
>>>> Lessingstraße 8
>>>> 85098 Großmehring
>>>>
>>>> Telefon     :     +49 700 ITNERBGMBH
>>>> Telefax     :     +49 8407 939 284
>>>> email     :     info at it-nerb.de
>>>> Internet     :     www.it-nerb.de <http://www.it-nerb.de>
>>>> Geschäftsführer    :    Rainer Nerb
>>>> Handelsregister    :    HRB 2592
>>>> HR-Gericht    :    Ingolstadt
>>>>
>>>> ------------------------------------------------------------------------



