[ClusterLabs] Antw: how to always promote current slave on master death

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Mon Apr 27 05:58:51 UTC 2015


Hi!

To me it seems your problem is that "monitor" returns "unknown error" (rc=1) instead of "not running" (rc=7).
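
A rough sketch of what such a monitor action could look like follows. The thread does not show the actual script, so the pidfile path, the ".master" marker file and the helper epttd_is_master are assumptions made up for illustration. The point is only that a dead process is reported as OCF_NOT_RUNNING (7) rather than as a generic error (1):

```shell
#!/bin/sh
# Standard OCF exit codes. A real agent normally gets these by sourcing
# ocf-shellfuncs; they are defined here so the sketch is self-contained.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_NOT_RUNNING=7
OCF_RUNNING_MASTER=8

# ASSUMPTION: hypothetical pidfile location, not taken from the thread.
: "${EPTTD_PIDFILE:=/var/run/epttd.pid}"

# ASSUMPTION: hypothetical role check. How the real daemon exposes its
# role is unknown; a marker file stands in for it here.
epttd_is_master() {
    [ -f "${EPTTD_PIDFILE}.master" ]
}

epttd_monitor() {
    # No pidfile at all: treat the resource as stopped.
    [ -r "$EPTTD_PIDFILE" ] || return $OCF_NOT_RUNNING

    pid=$(cat "$EPTTD_PIDFILE" 2>/dev/null)
    # Empty pidfile: nothing we can signal, so report "not running".
    [ -n "$pid" ] || return $OCF_NOT_RUNNING

    # kill -0 only tests whether the process can be signalled.
    # A vanished process is "not running", not a generic error.
    if ! kill -0 "$pid" 2>/dev/null; then
        return $OCF_NOT_RUNNING
    fi

    # Process is alive: distinguish the master role from the slave role.
    if epttd_is_master; then
        return $OCF_RUNNING_MASTER
    fi
    return $OCF_SUCCESS
}
```

In this sketch OCF_ERR_GENERIC is deliberately never returned for a missing process; it would be reserved for cases where the daemon is alive but misbehaving.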

Regards,
Ulrich

>>> Brett Moser <brett.moser at slacorp.com> wrote on 24.04.2015 at 01:49 in
message
<CAO3aD9McSd6i81aVzZPVAQheDReaVz8SOk1LjbjhiTkV4jaywQ at mail.gmail.com>:
> Hello,
> 
> I'm writing an OCF script for a m/s resource and I am having a bit of
> trouble achieving what I desire.
> 
> When the master dies (e.g. I am killing it from the command line to test) I
> want the current slave to always be promoted. First off -- I am assuming
> that this can be achieved in Pacemaker, please correct me if I am wrong.
> 
> In order to force this I have tried to alter the crm_master level inside
> the OCF script during the demote() action (e.g. crm_master -l reboot -v 0).
> 
> However, it doesn't seem to have any effect; the failed master resource is
> still promoted:
> 
> Apr 23 23:25:06 server-dmz-b attrd[1668]:   notice: attrd_perform_update:
> Sent update 5034: master-epttd=0
> Apr 23 23:25:06 server-dmz-b crmd[1670]:   notice: process_lrm_event:
> Operation epttd_monitor_2000: unknown error (node=server-dmz-b, call=100,
> rc=1, cib-update=4667, confirmed=false)
> Apr 23 23:25:06 server-dmz-b attrd[1668]:   notice: attrd_cs_dispatch:
> Update relayed from server-dmz-a
> Apr 23 23:25:06 server-dmz-b attrd[1668]:   notice: attrd_trigger_update:
> Sending flush op to all hosts for: fail-count-epttd (2)
> Apr 23 23:25:06 server-dmz-b attrd[1668]:   notice: attrd_perform_update:
> Sent update 5036: fail-count-epttd=2
> Apr 23 23:25:06 server-dmz-b attrd[1668]:   notice: attrd_cs_dispatch:
> Update relayed from server-dmz-a
> Apr 23 23:25:06 server-dmz-b attrd[1668]:   notice: attrd_trigger_update:
> Sending flush op to all hosts for: last-failure-epttd (1429831506)
> Apr 23 23:25:06 server-dmz-b attrd[1668]:   notice: attrd_perform_update:
> Sent update 5038: last-failure-epttd=1429831506
> Apr 23 23:25:06 server-dmz-b epttd(epttd)[28692]: INFO:  not running
> Apr 23 23:25:06 server-dmz-b crmd[1670]:   notice: process_lrm_event:
> Operation epttd_demote_0: ok (node=server-dmz-b, call=102, rc=0,
> cib-update=4669, confirmed=true)
> Apr 23 23:25:07 server-dmz-b attrd[1668]:   notice: attrd_trigger_update:
> Sending flush op to all hosts for: master-epttd (<null>)
> Apr 23 23:25:07 server-dmz-b attrd[1668]:   notice: attrd_perform_update:
> Sent delete 5040: node=server-dmz-b, attr=master-epttd, id=<n/a>,
> set=(null), section=status
> Apr 23 23:25:07 server-dmz-b attrd[1668]:   notice: attrd_perform_update:
> Sent delete 5042: node=server-dmz-b, attr=master-epttd, id=<n/a>,
> set=(null), section=status
> Apr 23 23:25:07 server-dmz-b epttd(epttd)[28717]: INFO:  is not running.
> Apr 23 23:25:07 server-dmz-b crmd[1670]:   notice: process_lrm_event:
> Operation epttd_stop_0: ok (node=server-dmz-b, call=103, rc=0,
> cib-update=4670, confirmed=true)
> Apr 23 23:25:07 server-dmz-b epttd(epttd)[28742]: INFO:  not running
> Apr 23 23:25:07 server-dmz-b epttd(epttd)[28742]: INFO:  not running
> Apr 23 23:25:08 server-dmz-b attrd[1668]:   notice: attrd_trigger_update:
> Sending flush op to all hosts for: master-epttd (100)
> Apr 23 23:25:08 server-dmz-b attrd[1668]:   notice: attrd_perform_update:
> Sent update 5044: master-epttd=100
> Apr 23 23:25:08 server-dmz-b crmd[1670]:   notice: process_lrm_event:
> Operation epttd_start_0: ok (node=server-dmz-b, call=104, rc=0,
> cib-update=4671, confirmed=true)
> Apr 23 23:25:08 server-dmz-b crmd[1670]:   notice: process_lrm_event:
> Operation epttd_promote_0: ok (node=server-dmz-b, call=105, rc=0,
> cib-update=4672, confirmed=true)
> Apr 23 23:25:08 server-dmz-b crmd[1670]:   notice: process_lrm_event:
> Operation epttd_monitor_2000: master (node=server-dmz-b, call=106, rc=8,
> cib-update=4673, confirmed=false)
> 
> 
> Any advice would be greatly appreciated,
> -Brett Moser







