[Pacemaker] [Question] About the stop order at the time of the Probe error.
renayama19661014 at ybb.ne.jp
Wed Aug 22 02:44:43 EDT 2012
Hi All,
We found a problem that occurs at the time of a probe error.
The resource configuration is the following simple one.
============
Last updated: Wed Aug 22 15:19:50 2012
Stack: Heartbeat
Current DC: drbd1 (6081ac99-d941-40b9-a4a3-9f996ff291c0) - partition with quorum
Version: 1.0.12-c6770b8
1 Nodes configured, unknown expected votes
1 Resources configured.
============
Online: [ drbd1 ]
Resource Group: grpTest
resource1 (ocf::pacemaker:Dummy): Started drbd1
resource2 (ocf::pacemaker:Dummy): Started drbd1
resource3 (ocf::pacemaker:Dummy): Started drbd1
resource4 (ocf::pacemaker:Dummy): Started drbd1
Node Attributes:
* Node drbd1:
Migration summary:
* Node drbd1:
Depending on which resources the probe error occurs for, the resources are not stopped in inverse order.
I confirmed this with the following procedure.
Step 1) Create the state files so that resource2 and resource4 appear to be already started.
[root@drbd1 ~]# touch /var/run/Dummy-resource2.state
[root@drbd1 ~]# touch /var/run/Dummy-resource4.state
Step 2) Start the node and load the CIB.
Step 3) resource2 and resource4 are stopped, but not in inverse order.
(snip)
Aug 22 15:19:47 drbd1 pengine: [32722]: notice: group_print: Resource Group: grpTest
Aug 22 15:19:47 drbd1 pengine: [32722]: notice: native_print: resource1#011(ocf::pacemaker:Dummy):#011Stopped
Aug 22 15:19:47 drbd1 pengine: [32722]: notice: native_print: resource2#011(ocf::pacemaker:Dummy):#011Started drbd1
Aug 22 15:19:47 drbd1 pengine: [32722]: notice: native_print: resource3#011(ocf::pacemaker:Dummy):#011Stopped
Aug 22 15:19:47 drbd1 pengine: [32722]: notice: native_print: resource4#011(ocf::pacemaker:Dummy):#011Started drbd1
(snip)
Aug 22 15:19:47 drbd1 crmd: [32719]: info: te_rsc_command: Initiating action 6: stop resource2_stop_0 on drbd1 (local)
Aug 22 15:19:47 drbd1 crmd: [32719]: info: do_lrm_rsc_op: Performing key=6:2:0:5c924067-0d20-48fd-9772-88e530661270 op=resource2_stop_0 )
Aug 22 15:19:47 drbd1 lrmd: [32716]: info: rsc:resource2 stop[6] (pid 32745)
Aug 22 15:19:47 drbd1 crmd: [32719]: info: te_rsc_command: Initiating action 11: stop resource4_stop_0 on drbd1 (local)
Aug 22 15:19:47 drbd1 crmd: [32719]: info: do_lrm_rsc_op: Performing key=11:2:0:5c924067-0d20-48fd-9772-88e530661270 op=resource4_stop_0 )
Aug 22 15:19:47 drbd1 lrmd: [32716]: info: rsc:resource4 stop[7] (pid 32746)
Aug 22 15:19:47 drbd1 lrmd: [32716]: info: operation stop[6] on resource2 for client 32719: pid 32745 exited with return code 0
(snip)
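To make the order easier to see, the stop actions can be pulled out of the log in the sequence crmd initiated them. The following is only a minimal sketch, assuming the log lines have the same form as the excerpt above; stop_order is a hypothetical helper name and is not part of Pacemaker.

```shell
#!/bin/sh
# Print resource names in the order crmd initiated their stop actions.
# Reads syslog-style lines on stdin; lines that are not
# "te_rsc_command: Initiating action N: stop <rsc>_stop_0 on <node>" are ignored.
# stop_order is a hypothetical helper name used only for this illustration.
stop_order() {
    sed -n 's/.*te_rsc_command: Initiating action [0-9]*: stop \([^ ]*\)_stop_0 on .*/\1/p'
}

# Usage (the log file location depends on the local setup):
#   stop_order < your-log-file
```

For the excerpt above this prints resource2 first and then resource4, which is the opposite of the order our user expects.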
I understand that the cause of this stop order lies in how ordering within a group is handled.
Even in this case, our user needs the resources to be stopped strictly in inverse order.
* resource4_stop -> resource2_stop
The stop order is important for our user's resources.
I have the following questions.
Question 1) Is there a correct setting in cib.xml to avoid this problem?
Question 2) Does this problem also occur in Pacemaker 1.1?
Question 3) I added the following order constraints.
<rsc_order id="order-2" first="resource1" then="resource3" />
<rsc_order id="order-3" first="resource1" then="resource4" />
<rsc_order id="order-5" first="resource2" then="resource4" />
Adding these order constraints seems to solve the problem.
Is adding such order constraints also a correct way to solve it?
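For a group with more members, the same kind of constraints could be generated rather than written by hand. This is only a sketch under assumptions: extra_orders is a hypothetical helper, adjacent members are skipped on the assumption that the group already orders them, and the id numbering (one number per pair, counted over all pairs) is just a guess that happens to reproduce order-2, order-3 and order-5 above.

```shell
#!/bin/sh
# Emit an rsc_order constraint for every non-adjacent pair of group members,
# given the member names in their start order.
# extra_orders is a hypothetical helper name used only for this illustration.
extra_orders() {
    n=0   # running pair counter, used for the constraint id
    i=0   # position of the "first" resource
    for first in "$@"; do
        i=$((i + 1))
        j=0   # position of the later resource
        for later in "$@"; do
            j=$((j + 1))
            # Only consider pairs where the second resource comes later.
            [ "$j" -gt "$i" ] || continue
            n=$((n + 1))
            # Skip adjacent members; assumed to be ordered by the group itself.
            [ "$j" -gt $((i + 1)) ] || continue
            printf '<rsc_order id="order-%d" first="%s" then="%s" />\n' \
                "$n" "$first" "$later"
        done
    done
}

# Usage:
#   extra_orders resource1 resource2 resource3 resource4
```

With the four members of grpTest this prints the same three constraints as above. Whether this is the right approach is exactly what Question 3 asks, so the sketch only shows how the set would grow with the group size.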
Best Regards,
Hideo Yamauchi.