[ClusterLabs] SOLVED: Antw: Re: Antw: can't live migrate VirtualDomain which is part of a group
Lentes, Bernd
bernd.lentes at helmholtz-muenchen.de
Mon May 8 10:56:38 EDT 2017
----- On Apr 25, 2017, at 1:37 PM, Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de wrote:
>>>> "Lentes, Bernd" <bernd.lentes at helmholtz-muenchen.de> wrote on 25.04.2017 at
> 11:02 in message
> <406563603.26964612.1493110931994.JavaMail.zimbra at helmholtz-muenchen.de>:
>
>>
>> ----- On Apr 25, 2017, at 8:08 AM, Ulrich Windl
>> Ulrich.Windl at rz.uni-regensburg.de
>> wrote:
>>
>>> Bernd,
>>>
>>> you have been on this list long enough to know that the reason for your failure is
>>> most likely to be found in the logs, which you did not provide. Couldn't you
>>> find it out yourself from the logs?
>>>
>>> Regards,
>>> Ulrich
>>>
>>>
>>
>> Hi Ulrich,
>>
>>
>> If I had found something in the log, I would not have asked.
>> From what I understand from Ken, the cause is the IPaddr resource,
>> which by default cannot be live-migrated.
>>
>> Just a few minutes ago I tried again to live migrate the VirtualDomain
>> resource, and again it shut down on one node
>> and rebooted on the other.
>>
>> Here is the relevant excerpt from the log. Maybe you can point out to me
>> where I can find the reason for the problem:
>>
>
> Usually there is a kind of action summary that is logged before the first action
> is executed. If any of these actions fail, the outcome could be different from
> what was intended. In your case there does not seem to be an error in any
> action, so the outcome is what was planned (by crm). So (as we learned) the
> plans have to be changed.
> I see a migration of prim_vnc_ip_mausdb via restart, and some operation with
> prim_vnc_ip_mausdb is already in progress...
>
>> Apr 25 10:54:18 ha-idg-2 crmd[8587]: notice: te_rsc_command: Initiating
>> action 52: stop prim_vnc_ip_mausdb_stop_0 on ha-idg-1
>> Apr 25 10:54:18 ha-idg-2 crmd[8587]: notice: te_rsc_command: Initiating
>> action 53: start prim_vnc_ip_mausdb_start_0 on ha-idg-2 (local)
>> Apr 25 10:54:18 ha-idg-2 IPaddr(prim_vnc_ip_mausdb)[25724]: INFO: Using
>> calculated netmask for 146.107.235.161: 255.255.255.0
>> Apr 25 10:54:18 ha-idg-2 IPaddr(prim_vnc_ip_mausdb)[25724]: INFO: eval
>> ifconfig br0:0 146.107.235.161 netmask 255.255.255.0 broadcast
>> 146.107.235.255
>> Apr 25 10:54:18 ha-idg-2 crmd[8587]: notice: process_lrm_event: Operation
>> prim_vnc_ip_mausdb_start_0: ok (node=ha-idg-2, call=283, rc=0, cib-update=1567,
>> confirmed=true)
>> Apr 25 10:54:18 ha-idg-2 crmd[8587]: notice: te_rsc_command: Initiating
>> action 55: start prim_vm_mausdb_start_0 on ha-idg-2 (local)
>> Apr 25 10:54:19 ha-idg-2 kernel: [583994.652325] device vnet0 entered
>> promiscuous mode
>> Apr 25 10:54:19 ha-idg-2 kernel: [583994.718044] br0: port 2(vnet0) entering
>> forwarding state
>> Apr 25 10:54:19 ha-idg-2 kernel: [583994.718049] br0: port 2(vnet0) entering
>> forwarding state
>> Apr 25 10:54:20 ha-idg-2 crmd[8587]: notice: handle_request: Current ping
>> state: S_TRANSITION_ENGINE
>> Apr 25 10:54:21 ha-idg-2 crmd[8587]: notice: handle_request: Current ping
>> state: S_TRANSITION_ENGINE
>> Apr 25 10:54:22 ha-idg-2 crmd[8587]: notice: process_lrm_event: Operation
>> prim_vm_mausdb_start_0: ok (node=ha-idg-2, call=284, rc=0, cib-update=1568,
>> confirmed=true)
>> Apr 25 10:54:22 ha-idg-2 crmd[8587]: notice: te_rsc_command: Initiating
>> action 56: monitor prim_vm_mausdb_monitor_30000 on ha-idg-2 (local)
>> Apr 25 10:54:22 ha-idg-2 crmd[8587]: notice: process_lrm_event: Operation
>> prim_vm_mausdb_monitor_30000: ok (node=ha-idg-2, call=285, rc=0,
>> cib-update=1569, confirmed=false)
>> Apr 25 10:54:22 ha-idg-2 crmd[8587]: notice: run_graph: Transition 817
>> (Complete=10, Pending=0, Fired=0, Skipped=0, Incomplete=0,
>> Source=/var/lib/pacemaker/pengine/pe-input-1601.bz2): Complete
>> Apr 25 10:54:22 ha-idg-2 crmd[8587]: notice: do_state_transition: State
>> transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS
>> cause=C_FSA_INTERNAL origin=notify_crmd ]
>> Apr 25 10:54:24 ha-idg-2 crmd[8587]: notice: handle_request: Current ping
>> state: S_IDLE
>> Apr 25 10:54:25 ha-idg-2 crmd[8587]: notice: handle_request: Current ping
>> state: S_IDLE
>>
>>
>>
For the sake of completeness:
I changed the RA, but still couldn't live migrate the complete group.
What I then found out is that I had mapped the start/stop of the IPaddr primitive to the migrate_to and migrate_from actions the wrong way round.
First I mapped migrate_to to ip_start and migrate_from to ip_stop.
But then I could only live migrate in one direction, not the other.
When I mapped migrate_to to ip_stop and migrate_from to ip_start, everything went fine.
I had also forgotten to set the monitor operations in the resource definition.
I thought they were added by default. They aren't! And they are very important :-)
This is now my resource:
primitive prim_vnc_ip_mausdb ocf:lentes:IPaddr \
        params ip=146.107.235.161 nic=br0 cidr_netmask=24 \
        op migrate_from interval=0 timeout=30 \
        op migrate_to interval=0 timeout=30 \
        op monitor interval=10 timeout=20 \
        meta allow-migrate=true is-managed=true
And here are my changes to the RA:
ha-idg-1:~ # diff /usr/lib/ocf/resource.d/lentes/IPaddr /usr/lib/ocf/resource.d/heartbeat/IPaddr
5d4
< # modified by Bernd Lentes, 25042017, Livemigration added (migrate_to, migrate_from)
41c40
< USAGE="usage: $0 {start|stop|status|monitor|migrate_to|migrate_from|validate-all|meta-data}";
---
> USAGE="usage: $0 {start|stop|status|monitor|validate-all|meta-data}";
70d68
< Live-Migration added Bernd Lentes 25042017
202,203d199
< <action name="migrate_to" timeout="20s" />
< <action name="migrate_from" timeout="20s" />
889,890d884
< migrate_to) ip_stop;;
< migrate_from) ip_validate_all && ip_start;;
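To illustrate the mapping that finally worked, here is a minimal, self-contained sketch of the dispatch logic. Only the two migrate_* branches are taken from the diff above; the dispatch wrapper and the stub bodies of ip_start, ip_stop and ip_validate_all are placeholders for illustration, not the real IPaddr code. Pacemaker calls migrate_to on the node the resource is leaving and migrate_from on the node it is moving to, which is why the address is released in the first and brought up in the second.

```shell
#!/bin/sh
# Stubs standing in for the real IPaddr functions (hypothetical bodies,
# they only report which function was called).
ip_validate_all() { return 0; }
ip_start()        { echo "ip_start"; }
ip_stop()         { echo "ip_stop"; }

# Hypothetical wrapper around the two case branches from the diff.
dispatch() {
    case "$1" in
        # Runs on the source node: release the address there.
        migrate_to)   ip_stop ;;
        # Runs on the target node: validate, then bring the address up.
        migrate_from) ip_validate_all && ip_start ;;
        # Anything else is not covered by this sketch.
        *)            echo "unsupported action: $1" >&2
                      return 3 ;;   # 3 is OCF_ERR_UNIMPLEMENTED
    esac
}
```

Calling dispatch migrate_to prints ip_stop and dispatch migrate_from prints ip_start. Swapping the two branches reproduces the first attempt described above, where the address is started on the node the resource is leaving and stopped on the node it is arriving at.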
I can now live migrate the group consisting of a VirtualDomain and an IPaddr resource in both directions (two-node cluster).
Thanks for any help.
Bernd