[Pacemaker] [Problem] The cluster fails in the stop of the node.

Andrew Beekhof andrew at beekhof.net
Thu Mar 29 02:12:54 EDT 2012


This appears to be resolved in 1.1.7. Perhaps look for a patch to backport?

On Tue, Mar 27, 2012 at 4:46 PM,  <renayama19661014 at ybb.ne.jp> wrote:
> Hi All,
>
> When we configure a group resource inside a Master/Slave resource, we found that a node cannot be stopped.
>
> This problem occurs in Pacemaker 1.0.11.
>
> We confirmed the problem with the following procedure.
>
> Step1) Start all nodes.
>
> ============
> Last updated: Tue Mar 27 14:35:16 2012
> Stack: Heartbeat
> Current DC: test2 (b645c456-af78-429e-a40a-279ed063b97d) - partition WITHOUT quorum
> Version: 1.0.12-unknown
> 2 Nodes configured, unknown expected votes
> 4 Resources configured.
> ============
>
> Online: [ test1 test2 ]
>
>  Master/Slave Set: msGroup01
>     Masters: [ test1 ]
>     Slaves: [ test2 ]
>  Resource Group: testGroup
>     prmDummy1  (ocf::pacemaker:Dummy): Started test1
>     prmDummy2  (ocf::pacemaker:Dummy): Started test1
>  Resource Group: grpStonith1
>     prmStonithN1       (stonith:external/ssh): Started test2
>  Resource Group: grpStonith2
>     prmStonithN2       (stonith:external/ssh): Started test1
>
> Migration summary:
> * Node test2:
> * Node test1:
>
> Step2) Stop Slave node.
>
> [root@test2 ~]# service heartbeat stop
> Stopping High-Availability services: Done.
>
> Step3) Stop the Master node. However, the Master node goes into a loop and does not stop.
>
> (snip)
> Mar 27 14:38:06 test1 crmd: [21443]: WARN: run_graph: Transition 3 (Complete=7, Pending=0, Fired=0, Skipped=0, Incomplete=23, Source=/var/lib/pengine/pe-input-3.bz2): Terminated
> Mar 27 14:38:06 test1 crmd: [21443]: ERROR: te_graph_trigger: Transition failed: terminated
> Mar 27 14:38:06 test1 crmd: [21443]: WARN: print_graph: Graph 3 (30 actions in 30 synapses): batch-limit=30 jobs, network-delay=60000ms
> Mar 27 14:38:06 test1 crmd: [21443]: WARN: print_graph: Synapse 0 is pending (priority: 0)
> Mar 27 14:38:06 test1 crmd: [21443]: WARN: print_elem:     [Action 12]: Pending (id: testMsGroup01:0_stop_0, type: pseduo, priority: 0)
> Mar 27 14:38:06 test1 crmd: [21443]: WARN: print_elem:      * [Input 14]: Completed (id: testMsGroup01:0_demote_0, type: pseduo, priority: 0)
> Mar 27 14:38:06 test1 crmd: [21443]: WARN: print_elem:      * [Input 32]: Pending (id: msGroup01_stop_0, type: pseduo, priority: 0)
> Mar 27 14:38:06 test1 crmd: [21443]: WARN: print_graph: Synapse 1 is pending (priority: 0)
> Mar 27 14:38:06 test1 crmd: [21443]: WARN: print_elem:     [Action 13]: Pending (id: testMsGroup01:0_stopped_0, type: pseduo, priority: 0)
> Mar 27 14:38:06 test1 crmd: [21443]: WARN: print_elem:      * [Input 8]: Pending (id: prmStateful1:0_stop_0, loc: test1, priority: 0)
> Mar 27 14:38:06 test1 crmd: [21443]: WARN: print_elem:      * [Input 9]: Pending (id: prmStateful2:0_stop_0, loc: test1, priority: 0)
> Mar 27 14:38:06 test1 crmd: [21443]: WARN: print_elem:      * [Input 12]: Pending (id: testMsGroup01:0_stop_0, type: pseduo, priority: 0)
> Mar 27 14:38:06 test1 crmd: [21443]: WARN: print_graph: Synapse 2 was confirmed (priority: 0)
> (snip)
>
> I have attached the hb_report data.
>
> Best Regards,
> Hideo Yamauchi.
> _______________________________________________
> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>
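
For readers trying to follow the quoted print_graph output, a small script can summarise which inputs each pending action is still waiting on. This is only a rough sketch: the regular expression is inferred from the handful of log lines quoted above, not from any documented crmd log format, and the names ELEM and pending_inputs are made up for illustration. It restates what the log excerpt already says and does not diagnose the cause.

```python
import re

# Log lines copied verbatim from the quoted report (only the print_* lines).
LOG = """\
Mar 27 14:38:06 test1 crmd: [21443]: WARN: print_graph: Synapse 0 is pending (priority: 0)
Mar 27 14:38:06 test1 crmd: [21443]: WARN: print_elem:     [Action 12]: Pending (id: testMsGroup01:0_stop_0, type: pseduo, priority: 0)
Mar 27 14:38:06 test1 crmd: [21443]: WARN: print_elem:      * [Input 14]: Completed (id: testMsGroup01:0_demote_0, type: pseduo, priority: 0)
Mar 27 14:38:06 test1 crmd: [21443]: WARN: print_elem:      * [Input 32]: Pending (id: msGroup01_stop_0, type: pseduo, priority: 0)
Mar 27 14:38:06 test1 crmd: [21443]: WARN: print_graph: Synapse 1 is pending (priority: 0)
Mar 27 14:38:06 test1 crmd: [21443]: WARN: print_elem:     [Action 13]: Pending (id: testMsGroup01:0_stopped_0, type: pseduo, priority: 0)
Mar 27 14:38:06 test1 crmd: [21443]: WARN: print_elem:      * [Input 8]: Pending (id: prmStateful1:0_stop_0, loc: test1, priority: 0)
Mar 27 14:38:06 test1 crmd: [21443]: WARN: print_elem:      * [Input 9]: Pending (id: prmStateful2:0_stop_0, loc: test1, priority: 0)
Mar 27 14:38:06 test1 crmd: [21443]: WARN: print_elem:      * [Input 12]: Pending (id: testMsGroup01:0_stop_0, type: pseduo, priority: 0)
Mar 27 14:38:06 test1 crmd: [21443]: WARN: print_graph: Synapse 2 was confirmed (priority: 0)
"""

# Assumed line shape, based only on the excerpt above:
#   print_elem: [Action N]: <State> (id: <id>, ...)
#   print_elem:  * [Input N]: <State> (id: <id>, ...)
ELEM = re.compile(
    r"print_elem:\s+(?:\*\s+)?\[(Action|Input) (\d+)\]: (\w+) \(id: ([^,]+),"
)


def pending_inputs(log_text):
    """Map each pending action id to the ids of its inputs that are still pending."""
    blocked = {}
    current = None  # id of the pending action whose inputs we are reading
    for line in log_text.splitlines():
        match = ELEM.search(line)
        if not match:
            continue  # print_graph lines and anything else are skipped
        kind, _number, state, ident = match.groups()
        if kind == "Action":
            current = ident if state == "Pending" else None
            if current is not None:
                blocked.setdefault(current, [])
        elif current is not None and state == "Pending":
            blocked[current].append(ident)
    return blocked


if __name__ == "__main__":
    for action, inputs in pending_inputs(LOG).items():
        print(action, "waits on", ", ".join(inputs) or "(nothing pending)")
```

Run against the excerpt, this reports that testMsGroup01:0_stop_0 is waiting on msGroup01_stop_0, and that testMsGroup01:0_stopped_0 is waiting on prmStateful1:0_stop_0, prmStateful2:0_stop_0 and testMsGroup01:0_stop_0. The same function could be pointed at a full log file, assuming the remaining lines follow the same shape.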



