[Pacemaker] Pacemaker fails to transition single node master/slave resource to master

Eliot Gable egable+pacemaker at gmail.com
Sun Aug 26 11:31:12 EDT 2012


node node1
primitive FreeSWITCH ocf:fssolutions:FreeSWITCH \
        params ips="bond2/212.163.22.155/26:bond2/212.163.22.156/26" user="freeswitch" group="freeswitch" \
        op monitor interval="3s" role="Master" depth="0" \
        op monitor interval="10s" role="Slave" depth="0" \
        op start interval="0" timeout="65" \
        op stop interval="0" timeout="60"
ms FreeSWITCH-MS FreeSWITCH \
        meta master-max="1" master-node-max="1" clone-max="1" clone-node-max="1" notify="false" target-role="Master"
location FreeSWITCH-MS-on-node1 FreeSWITCH-MS 50: node1
property $id="cib-bootstrap-options" \
        dc-version="1.1.7-6.el6-148fccfd5985c5590cc601123c6c16e966b85d14" \
        cluster-infrastructure="corosync" \
        expected-quorum-votes="2" \
        stonith-enabled="false" \
        no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
        resource-stickiness="100"


============
Last updated: Sun Aug 26 16:13:05 2012
Last change: Sun Aug 26 16:11:18 2012 via crm_resource on node1
Stack: corosync
Current DC: node1 - partition WITHOUT quorum
Version: 1.1.7-6.el6-148fccfd5985c5590cc601123c6c16e966b85d14
1 Nodes configured, 2 expected votes
1 Resources configured.
============

Online: [ node1 ]

 Master/Slave Set: FreeSWITCH-MS [FreeSWITCH]
     Slaves: [ node1 ]


I run this:

crm resource promote FreeSWITCH-MS

Resulting log:

Aug 26 16:19:14 node1 cib[18166]:     info: cib_process_request: Operation
complete: op cib_modify for section resources (origin=local/crm_resource/4,
version=0.7.2): ok (rc=0)


That's all it does. So I try:

crm resource demote FreeSWITCH-MS

And I get this:

Aug 26 16:20:24 node1 cib[18166]:     info: cib:diff: - <cib
admin_epoch="0" epoch="7" num_updates="2" >
Aug 26 16:20:24 node1 cib[18166]:     info: cib:diff: -   <configuration >
Aug 26 16:20:24 node1 cib[18166]:     info: cib:diff: -     <resources >
Aug 26 16:20:24 node1 cib[18166]:     info: cib:diff: -       <master
id="FreeSWITCH-MS" >
Aug 26 16:20:24 node1 cib[18166]:     info: cib:diff: -
<meta_attributes id="FreeSWITCH-MS-meta_attributes" >
Aug 26 16:20:24 node1 cib[18166]:     info: cib:diff: -           <nvpair
value="Master" id="FreeSWITCH-MS-meta_attributes-target-role" />
Aug 26 16:20:24 node1 cib[18166]:     info: cib:diff: -
</meta_attributes>
Aug 26 16:20:24 node1 cib[18166]:     info: cib:diff: -       </master>
Aug 26 16:20:24 node1 cib[18166]:     info: cib:diff: -     </resources>
Aug 26 16:20:24 node1 crmd[18171]:     info: abort_transition_graph:
te_update_diff:126 - Triggered transition abort (complete=1, tag=diff,
id=(null), magic=NA, cib=0.8.1) : Non-status change
Aug 26 16:20:24 node1 cib[18166]:     info: cib:diff: -   </configuration>
Aug 26 16:20:24 node1 crmd[18171]:   notice: do_state_transition: State
transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL
origin=abort_transition_graph ]
Aug 26 16:20:24 node1 cib[18166]:     info: cib:diff: - </cib>
Aug 26 16:20:24 node1 cib[18166]:     info: cib:diff: + <cib epoch="8"
num_updates="1" admin_epoch="0" validate-with="pacemaker-1.2"
crm_feature_set="3.0.6" update-origin="node1" update-client="crm_resource"
cib-last-written="Sun Aug 26 16:11:18 2012" have-quorum="0" dc-uuid="node1"
>
Aug 26 16:20:24 node1 cib[18166]:     info: cib:diff: +   <configuration >
Aug 26 16:20:24 node1 cib[18166]:     info: cib:diff: +     <resources >
Aug 26 16:20:24 node1 cib[18166]:     info: cib:diff: +       <master
id="FreeSWITCH-MS" >
Aug 26 16:20:24 node1 cib[18166]:     info: cib:diff: +
<meta_attributes id="FreeSWITCH-MS-meta_attributes" >
Aug 26 16:20:24 node1 cib[18166]:     info: cib:diff: +           <nvpair
id="FreeSWITCH-MS-meta_attributes-target-role" name="target-role"
value="Slave" />
Aug 26 16:20:24 node1 cib[18166]:     info: cib:diff: +
</meta_attributes>
Aug 26 16:20:24 node1 cib[18166]:     info: cib:diff: +       </master>
Aug 26 16:20:24 node1 cib[18166]:     info: cib:diff: +     </resources>
Aug 26 16:20:24 node1 cib[18166]:     info: cib:diff: +   </configuration>
Aug 26 16:20:24 node1 cib[18166]:     info: cib:diff: + </cib>
Aug 26 16:20:24 node1 cib[18166]:     info: cib_process_request: Operation
complete: op cib_modify for section resources (origin=local/crm_resource/4,
version=0.8.1): ok (rc=0)
Aug 26 16:20:24 node1 pengine[18170]:   notice: unpack_config: On loss of
CCM Quorum: Ignore
Aug 26 16:20:24 node1 crmd[18171]:   notice: do_state_transition: State
transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS
cause=C_IPC_MESSAGE origin=handle_response ]
Aug 26 16:20:24 node1 crmd[18171]:     info: do_te_invoke: Processing graph
5 (ref=pe_calc-dc-1345994424-17) derived from
/var/lib/pengine/pe-input-5.bz2
Aug 26 16:20:24 node1 crmd[18171]:   notice: run_graph: ==== Transition 5
(Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0,
Source=/var/lib/pengine/pe-input-5.bz2): Complete
Aug 26 16:20:24 node1 crmd[18171]:   notice: do_state_transition: State
transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS
cause=C_FSA_INTERNAL origin=notify_crmd ]
Aug 26 16:20:24 node1 pengine[18170]:   notice: process_pe_message:
Transition 5: PEngine Input stored in: /var/lib/pengine/pe-input-5.bz2


Then I try:

crm resource promote FreeSWITCH-MS


This time I get:

Aug 26 16:21:39 node1 cib[18166]:     info: cib:diff: - <cib
admin_epoch="0" epoch="8" num_updates="1" >
Aug 26 16:21:39 node1 cib[18166]:     info: cib:diff: -   <configuration >
Aug 26 16:21:39 node1 cib[18166]:     info: cib:diff: -     <resources >
Aug 26 16:21:39 node1 cib[18166]:     info: cib:diff: -       <master
id="FreeSWITCH-MS" >
Aug 26 16:21:39 node1 cib[18166]:     info: cib:diff: -
<meta_attributes id="FreeSWITCH-MS-meta_attributes" >
Aug 26 16:21:39 node1 cib[18166]:     info: cib:diff: -           <nvpair
value="Slave" id="FreeSWITCH-MS-meta_attributes-target-role" />
Aug 26 16:21:39 node1 cib[18166]:     info: cib:diff: -
</meta_attributes>
Aug 26 16:21:39 node1 cib[18166]:     info: cib:diff: -       </master>
Aug 26 16:21:39 node1 cib[18166]:     info: cib:diff: -     </resources>
Aug 26 16:21:39 node1 cib[18166]:     info: cib:diff: -   </configuration>
Aug 26 16:21:39 node1 crmd[18171]:     info: abort_transition_graph:
te_update_diff:126 - Triggered transition abort (complete=1, tag=diff,
id=(null), magic=NA, cib=0.9.1) : Non-status change
Aug 26 16:21:39 node1 crmd[18171]:   notice: do_state_transition: State
transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL
origin=abort_transition_graph ]
Aug 26 16:21:39 node1 cib[18166]:     info: cib:diff: - </cib>
Aug 26 16:21:39 node1 cib[18166]:     info: cib:diff: + <cib epoch="9"
num_updates="1" admin_epoch="0" validate-with="pacemaker-1.2"
crm_feature_set="3.0.6" update-origin="node1" update-client="crm_resource"
cib-last-written="Sun Aug 26 16:20:24 2012" have-quorum="0" dc-uuid="node1"
>
Aug 26 16:21:39 node1 cib[18166]:     info: cib:diff: +   <configuration >
Aug 26 16:21:39 node1 cib[18166]:     info: cib:diff: +     <resources >
Aug 26 16:21:39 node1 cib[18166]:     info: cib:diff: +       <master
id="FreeSWITCH-MS" >
Aug 26 16:21:39 node1 cib[18166]:     info: cib:diff: +
<meta_attributes id="FreeSWITCH-MS-meta_attributes" >
Aug 26 16:21:39 node1 cib[18166]:     info: cib:diff: +           <nvpair
id="FreeSWITCH-MS-meta_attributes-target-role" name="target-role"
value="Master" />
Aug 26 16:21:39 node1 cib[18166]:     info: cib:diff: +
</meta_attributes>
Aug 26 16:21:39 node1 cib[18166]:     info: cib:diff: +       </master>
Aug 26 16:21:39 node1 cib[18166]:     info: cib:diff: +     </resources>
Aug 26 16:21:39 node1 cib[18166]:     info: cib:diff: +   </configuration>
Aug 26 16:21:39 node1 cib[18166]:     info: cib:diff: + </cib>
Aug 26 16:21:39 node1 cib[18166]:     info: cib_process_request: Operation
complete: op cib_modify for section resources (origin=local/crm_resource/4,
version=0.9.1): ok (rc=0)
Aug 26 16:21:39 node1 pengine[18170]:   notice: unpack_config: On loss of
CCM Quorum: Ignore
Aug 26 16:21:39 node1 crmd[18171]:   notice: do_state_transition: State
transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS
cause=C_IPC_MESSAGE origin=handle_response ]
Aug 26 16:21:39 node1 crmd[18171]:     info: do_te_invoke: Processing graph
6 (ref=pe_calc-dc-1345994499-18) derived from
/var/lib/pengine/pe-input-6.bz2
Aug 26 16:21:39 node1 crmd[18171]:   notice: run_graph: ==== Transition 6
(Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0,
Source=/var/lib/pengine/pe-input-6.bz2): Complete
Aug 26 16:21:39 node1 crmd[18171]:   notice: do_state_transition: State
transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS
cause=C_FSA_INTERNAL origin=notify_crmd ]
Aug 26 16:21:39 node1 pengine[18170]:   notice: process_pe_message:
Transition 6: PEngine Input stored in: /var/lib/pengine/pe-input-6.bz2

However, crm status shows:

============
Last updated: Sun Aug 26 16:22:04 2012
Last change: Sun Aug 26 16:21:39 2012 via crm_resource on node1
Stack: corosync
Current DC: node1 - partition WITHOUT quorum
Version: 1.1.7-6.el6-148fccfd5985c5590cc601123c6c16e966b85d14
1 Nodes configured, 2 expected votes
1 Resources configured.
============

Online: [ node1 ]

 Master/Slave Set: FreeSWITCH-MS [FreeSWITCH]
     Slaves: [ node1 ]
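
One thing the logs above do show: every run_graph entry reports
Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0. I read that as the
policy engine producing a transition graph with no actions in it, so the
resource agent was never asked to promote and had no chance to log an
error. Below is a minimal Python sketch for pulling those counters out of a
log line. The function names are made up for illustration, and the sample
line is the Transition 6 entry from above with the mail line wrapping
undone.

```python
import re

# Matches the counter list inside a crmd run_graph summary, e.g.
# "run_graph: ==== Transition 6 (Complete=0, Pending=0, ... Source=...)".
RUN_GRAPH = re.compile(r"run_graph: ==== Transition (\d+)\s+\(([^)]*?)Source=")


def parse_run_graph(line):
    """Return (transition_id, counters) for a run_graph line, or None."""
    match = RUN_GRAPH.search(line)
    if match is None:
        return None
    counters = {
        key: int(value)
        for key, value in re.findall(r"(\w+)=(\d+)", match.group(2))
    }
    return int(match.group(1)), counters


def is_empty_transition(counters):
    """True when every counter is zero, i.e. the graph held no actions."""
    return all(count == 0 for count in counters.values())


# The run_graph entry from the second promote attempt, unwrapped.
LINE = (
    "Aug 26 16:21:39 node1 crmd[18171]:   notice: run_graph: ==== Transition 6 "
    "(Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, "
    "Source=/var/lib/pengine/pe-input-6.bz2): Complete"
)

transition_id, counters = parse_run_graph(LINE)
print(transition_id, counters, is_empty_transition(counters))
```

Running the same check over the Transition 5 entry from the demote gives
the same result, so neither target-role change led to any scheduled action.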


So I thought my resource agent (which I wrote myself) might be broken and
ran ocf-tester on it again, but it passes all tests. Further, I don't
get any errors in the log file.

I have also tried doing a resource cleanup on FreeSWITCH-MS, restarting
pacemaker and corosync, putting the node in standby and bringing it back
out, upgrading pacemaker and corosync (from 1.1.6, where the problem first
appeared, to the version you see in the output in this e-mail), and
completely wiping out everything in the /var/lib/corosync,
/var/lib/heartbeat, etc. directories and recreating the entire config from
scratch (which, incidentally, DID clear up a warning I was getting about a
bad UUID). Nothing I have tried has resolved it. The odd thing is, this
single-node setup had been working for nearly a year with this exact
configuration. The box was rebooted last week, and now it just seems to sit
there playing dead.

Thanks in advance for any suggestions on how I might solve this!