[ClusterLabs] [Question] About the timing of the stop of the monitor of the Slave resource.

Fri Oct 7 21:29:14 UTC 2016

Hi All,

(Sorry, the formatting of my previous mail was broken, so I am sending it again.)

I have a question about the behavior of a Master/Slave resource.

Is the following behavior correct?

Step 1) Set up a cluster.
-----
[root@rh72-01 ~]# crm_mon -1
Stack: corosync
Current DC: rh72-01 (version 1.1.15-e174ec8) - partition with quorum
Last updated: Fri Oct  7 22:12:31 2016          Last change: Fri Oct  7 22:12:29 2016 by root via cibadmin on rh72-01

2 nodes and 3 resources configured

Online: [ rh72-01 rh72-02 ]

 prmDummy       (ocf::pacemaker:Dummy): Started rh72-01
 Master/Slave Set: msStateful [prmStateful]
     Masters: [ rh72-01 ]
     Slaves: [ rh72-02 ]
-----

Step 2) Inject a simulated failure into the start action of prmDummy on the rh72-02 node.
-----
dummy_start() {
    # Injected for this test: make the start action fail unconditionally.
    return $OCF_ERR_GENERIC
    local RETVAL

    dummy_monitor
(snip)
-----
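
As a side note on Step 2, the same failure can be injected without re-editing the agent for every test run. The following is only a minimal sketch under assumptions: the marker file path and the helper name inject_start_failure are hypothetical and are not part of the stock Dummy agent, and the return codes are defined locally only so that the fragment runs standalone (in a real agent they come from ocf-shellfuncs).

```shell
#!/bin/sh
# Normally provided by ocf-shellfuncs; defined here only for a standalone run.
: "${OCF_SUCCESS:=0}"
: "${OCF_ERR_GENERIC:=1}"

# Hypothetical marker file (assumption, not part of the stock agent).
FAIL_MARKER="${FAIL_MARKER:-/var/run/prmDummy.fail_start}"

# Report a generic error only while the marker file exists.
inject_start_failure() {
    if [ -e "$FAIL_MARKER" ]; then
        return "$OCF_ERR_GENERIC"
    fi
    return "$OCF_SUCCESS"
}

dummy_start() {
    # Fail early when the failure is switched on.
    inject_start_failure || return $?
    # (the agent's original start logic would continue here)
    return "$OCF_SUCCESS"
}
```

With this variant, creating the marker file on rh72-02 switches the failure on, and removing it restores normal behavior.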

Step 3) Stop the rh72-01 node.
The monitor of msStateful on rh72-02 is cancelled.
However, the promote has not yet been carried out.
-----
[root@rh72-01 ~]# systemctl stop pacemaker

[root@rh72-02 ~]# crm_mon -1
Stack: corosync
Current DC: rh72-02 (version 1.1.15-e174ec8) - partition WITHOUT quorum
Last updated: Fri Oct  7 22:14:30 2016          Last change: Fri Oct  7 22:14:11 2016 by root via cibadmin on rh72-01

2 nodes and 3 resources configured

Online: [ rh72-02 ]
OFFLINE: [ rh72-01 ]

 Master/Slave Set: msStateful [prmStateful]
     Slaves: [ rh72-02 ]

Failed Actions:
* prmDummy_start_0 on rh72-02 'unknown error' (1): call=14, status=complete, exitreason='none',
    last-rc-change='Fri Oct  7 22:14:27 2016', queued=0ms, exec=36ms

Oct  7 22:14:27 rh72-02 lrmd[2772]:    info: Cancelling ocf operation prmStateful_monitor_20000
Oct  7 22:14:27 rh72-02 crmd[2775]:    info: Result of monitor operation for prmStateful on rh72-02: Cancelled

-----
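
To check for this cancellation on another system, the log can be filtered for the lrmd message. This is only a rough sketch: the function names cancelled_ops and was_cancelled are made up for illustration, and it assumes that the operation name is the last field of the "Cancelling ocf operation" line, as in the excerpt above when each log entry is on a single line.

```shell
#!/bin/sh
# Print the operation name from every lrmd "Cancelling ocf operation"
# line read on stdin. Assumes the name is the last field of the line.
cancelled_ops() {
    awk '/Cancelling ocf operation/ { print $NF }'
}

# Succeed if the recurring operation named in $1 was cancelled
# according to the log read on stdin.
was_cancelled() {
    cancelled_ops | grep -qxF -- "$1"
}
```

For example, piping the pacemaker log of rh72-02 into "was_cancelled prmStateful_monitor_20000" would return success for the situation shown above. Where that log is stored depends on the system.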

crm_mon also still shows msStateful as Slave on rh72-02.

Because the Slave resource has not been promoted, I thought that its
monitor should not be stopped.

Sorry, I may simply have forgotten a past discussion.

Was there a reason, agreed on in a past discussion, for cancelling the
monitor of the Slave resource first?

I have registered this issue in Bugzilla and attached the crm_report file there.
 * http://bugs.clusterlabs.org/show_bug.cgi?id=5302

Best Regards,
Hideo Yamauchi.