[ClusterLabs] [Problem] Remote resource does not move when bundle resource moves.

renayama19661014 at ybb.ne.jp renayama19661014 at ybb.ne.jp
Sat Dec 8 15:29:22 EST 2018


Hi All,

Sorry, I made a mistake with the line breaks in my previous message.
I am sending it again.

---

Hi All,


We have found a problem with a slightly unusual bundle configuration.
There is only one bundle resource, and it has an association with a group resource.
The behavior was confirmed with Pacemaker 1.1.19.

Step1) Configure the cluster.
--------
[root@cent7-host1 ~]# crm_mon -R
Defaulting to one-shot mode
You need to have curses available at compile time to enable console mode
Stack: corosync
Current DC: cent7-host2 (3232262829) (version 1.1.19-c3c624ea3d) - partition with quorum
Last updated: Thu Dec  6 13:20:21 2018
Last change: Thu Dec  6 13:20:05 2018 by root via cibadmin on cent7-host1

4 nodes configured
10 resources configured

Online: [ cent7-host1 (3232262828) cent7-host2 (3232262829) ]
GuestOnline: [ httpd-bundle1-0@cent7-host1 httpd-bundle2-0@cent7-host2 ]

Active resources:

Resource Group: group1
dummy1     (ocf::pacemaker:Dummy): Started cent7-host1
Resource Group: group2
dummy2     (ocf::pacemaker:Dummy): Started cent7-host2
Docker container: httpd-bundle1 [pcmktest:http]
httpd-bundle1-ip-192.168.20.188   (ocf::heartbeat:IPaddr2):       Started cent7-host1
httpd-bundle1-docker-0    (ocf::heartbeat:docker):        Started cent7-host1
httpd-bundle1-0   (ocf::pacemaker:remote):        Started cent7-host1
httpd1    (ocf::heartbeat:apache):        Started httpd-bundle1-0
Docker container: httpd-bundle2 [pcmktest:http]
httpd-bundle2-ip-192.168.20.190   (ocf::heartbeat:IPaddr2):       Started cent7-host2
httpd-bundle2-docker-0    (ocf::heartbeat:docker):        Started cent7-host2
httpd-bundle2-0   (ocf::pacemaker:remote):        Started cent7-host2
httpd2    (ocf::heartbeat:apache):        Started httpd-bundle2-0
--------

Step2) Put cent7-host1 into standby so that its resources move to cent7-host2.

--------
[root@cent7-host1 ~]# crm_standby -v on
[root@cent7-host1 ~]# crm_mon -R
Defaulting to one-shot mode
You need to have curses available at compile time to enable console mode
Stack: corosync
Current DC: cent7-host2 (3232262829) (version 1.1.19-c3c624ea3d) - partition with quorum
Last updated: Thu Dec  6 13:21:36 2018
Last change: Thu Dec  6 13:21:17 2018 by root via crm_attribute on cent7-host1

4 nodes configured
10 resources configured

Node cent7-host1 (3232262828): standby
Online: [ cent7-host2 (3232262829) ]
GuestOnline: [ httpd-bundle1-0@cent7-host2 httpd-bundle2-0@cent7-host2 ]

Active resources:

Resource Group: group1
dummy1     (ocf::pacemaker:Dummy): Started cent7-host2
Resource Group: group2
dummy2     (ocf::pacemaker:Dummy): Started cent7-host2
Docker container: httpd-bundle1 [pcmktest:http]
httpd-bundle1-ip-192.168.20.188   (ocf::heartbeat:IPaddr2):       Started cent7-host2
httpd-bundle1-docker-0    (ocf::heartbeat:docker):        Started cent7-host2
httpd-bundle1-0   (ocf::pacemaker:remote):        Started cent7-host2
httpd1    (ocf::heartbeat:apache):        Started httpd-bundle1-0
Docker container: httpd-bundle2 [pcmktest:http]
httpd-bundle2-ip-192.168.20.190   (ocf::heartbeat:IPaddr2):       Started cent7-host2
httpd-bundle2-docker-0    (ocf::heartbeat:docker):        Started cent7-host2
httpd-bundle2-0   (ocf::pacemaker:remote):        Started cent7-host2
httpd2    (ocf::heartbeat:apache):        Started httpd-bundle2-0
--------

Step3) Take cent7-host1 out of standby.
--------
[root@cent7-host1 ~]# crm_standby -v off
[root@cent7-host1 ~]# crm_mon -R
Defaulting to one-shot mode
You need to have curses available at compile time to enable console mode
Stack: corosync
Current DC: cent7-host2 (3232262829) (version 1.1.19-c3c624ea3d) - partition with quorum
Last updated: Thu Dec  6 13:21:59 2018
Last change: Thu Dec  6 13:21:56 2018 by root via crm_attribute on cent7-host1

4 nodes configured
10 resources configured

Online: [ cent7-host1 (3232262828) cent7-host2 (3232262829) ]
GuestOnline: [ httpd-bundle1-0@cent7-host2 httpd-bundle2-0@cent7-host2 ]

Active resources:

Resource Group: group1
dummy1     (ocf::pacemaker:Dummy): Started cent7-host2
Resource Group: group2
dummy2     (ocf::pacemaker:Dummy): Started cent7-host2
Docker container: httpd-bundle1 [pcmktest:http]
httpd-bundle1-ip-192.168.20.188   (ocf::heartbeat:IPaddr2):       Started cent7-host2
httpd-bundle1-docker-0    (ocf::heartbeat:docker):        Started cent7-host2
httpd-bundle1-0   (ocf::pacemaker:remote):        Started cent7-host2
httpd1    (ocf::heartbeat:apache):        Started httpd-bundle1-0
Docker container: httpd-bundle2 [pcmktest:http]
httpd-bundle2-ip-192.168.20.190   (ocf::heartbeat:IPaddr2):       Started cent7-host2
httpd-bundle2-docker-0    (ocf::heartbeat:docker):        Started cent7-host2
httpd-bundle2-0   (ocf::pacemaker:remote):        Started cent7-host2
httpd2    (ocf::heartbeat:apache):        Started httpd-bundle2-0
--------

Step4) Move the group1 resource to cent7-host1, which also returns the bundle resource to cent7-host1.
--------
[root@cent7-host1 ~]# crm_resource -M -r group1 -H cent7-host1 -f -Q
[root@cent7-host1 ~]# crm_mon -R
Defaulting to one-shot mode
You need to have curses available at compile time to enable console mode
Stack: corosync
Current DC: cent7-host2 (3232262829) (version 1.1.19-c3c624ea3d) - partition with quorum
Last updated: Thu Dec  6 13:22:56 2018
Last change: Thu Dec  6 13:22:36 2018 by root via crm_resource on cent7-host1

4 nodes configured
10 resources configured

Online: [ cent7-host1 (3232262828) cent7-host2 (3232262829) ]
GuestOnline: [ httpd-bundle1-0@cent7-host2 httpd-bundle2-0@cent7-host2 ]

Active resources:

Resource Group: group1
dummy1     (ocf::pacemaker:Dummy): Started cent7-host1
Resource Group: group2
dummy2     (ocf::pacemaker:Dummy): Started cent7-host2
Docker container: httpd-bundle1 [pcmktest:http]
httpd-bundle1-ip-192.168.20.188   (ocf::heartbeat:IPaddr2):       Started cent7-host1
httpd-bundle1-docker-0    (ocf::heartbeat:docker):        Started cent7-host1
httpd-bundle1-0   (ocf::pacemaker:remote):        Started cent7-host2
httpd1    (ocf::heartbeat:apache):        Started httpd-bundle1-0
Docker container: httpd-bundle2 [pcmktest:http]
httpd-bundle2-ip-192.168.20.190   (ocf::heartbeat:IPaddr2):       Started cent7-host2
httpd-bundle2-docker-0    (ocf::heartbeat:docker):        Started cent7-host2
httpd-bundle2-0   (ocf::pacemaker:remote):        Started cent7-host2
httpd2    (ocf::heartbeat:apache):        Started httpd-bundle2-0
--------

Step5) Remove the constraint added in Step4.
At this point, the output shows that httpd-bundle1-0 has not moved to cent7-host1.

--------
[root@cent7-host1 ~]# crm_resource -U -r group1
[root@cent7-host1 ~]# crm_mon -R
Defaulting to one-shot mode
You need to have curses available at compile time to enable console mode
Stack: corosync
Current DC: cent7-host2 (3232262829) (version 1.1.19-c3c624ea3d) - partition with quorum
Last updated: Thu Dec  6 13:23:21 2018
Last change: Thu Dec  6 13:23:17 2018 by root via crm_resource on cent7-host1

4 nodes configured
10 resources configured

Online: [ cent7-host1 (3232262828) cent7-host2 (3232262829) ]
GuestOnline: [ httpd-bundle1-0@cent7-host2 httpd-bundle2-0@cent7-host2 ]

Active resources:

Resource Group: group1
dummy1     (ocf::pacemaker:Dummy): Started cent7-host1
Resource Group: group2
dummy2     (ocf::pacemaker:Dummy): Started cent7-host2
Docker container: httpd-bundle1 [pcmktest:http]
httpd-bundle1-ip-192.168.20.188   (ocf::heartbeat:IPaddr2):       Started cent7-host1
httpd-bundle1-docker-0    (ocf::heartbeat:docker):        Started cent7-host1
httpd-bundle1-0   (ocf::pacemaker:remote):        Started cent7-host2
httpd1    (ocf::heartbeat:apache):        Started httpd-bundle1-0
Docker container: httpd-bundle2 [pcmktest:http]
httpd-bundle2-ip-192.168.20.190   (ocf::heartbeat:IPaddr2):       Started cent7-host2
httpd-bundle2-docker-0    (ocf::heartbeat:docker):        Started cent7-host2
httpd-bundle2-0   (ocf::pacemaker:remote):        Started cent7-host2
httpd2    (ocf::heartbeat:apache):        Started httpd-bundle2-0
--------
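
The mismatch seen in Step5 (the docker container on cent7-host1, its remote connection still on cent7-host2) can be spotted mechanically. The following is only a minimal sketch: the function name check_bundle_placement is hypothetical, and it assumes the crm_mon -R layout of 1.1.19 and the "<bundle>-docker-<n>" / "<bundle>-<n>" naming visible in the output above.

```shell
#!/bin/sh
# Hypothetical helper: reads one-shot "crm_mon -R" output on stdin and reports
# every bundle replica whose docker container and remote connection resource
# are started on different nodes. Exits 1 if any mismatch is found, else 0.
check_bundle_placement() {
    awk '
        # Container line: "<bundle>-docker-<n> (ocf::heartbeat:docker): Started <node>"
        $2 == "(ocf::heartbeat:docker):" && $3 == "Started" {
            name = $1
            sub(/-docker-/, "-", name)   # httpd-bundle1-docker-0 -> httpd-bundle1-0
            container[name] = $4
        }
        # Remote connection line: "<bundle>-<n> (ocf::pacemaker:remote): Started <node>"
        $2 == "(ocf::pacemaker:remote):" && $3 == "Started" {
            remote[$1] = $4
        }
        END {
            status = 0
            for (name in remote) {
                if ((name in container) && container[name] != remote[name]) {
                    printf "MISMATCH %s: container on %s, remote connection on %s\n",
                           name, container[name], remote[name]
                    status = 1
                }
            }
            exit status
        }
    '
}
```

It could be used as, for example, "crm_mon -1 -R | check_bundle_placement". With the Step5 output it would be expected to flag httpd-bundle1-0 only.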

Step6) Connect to httpd-bundle1-docker-0 and kill pacemaker_remoted to simulate a failure.
--------
[root@cent7-host1 ~]# docker exec -it httpd-bundle1-docker-0 /bin/bash
[root@httpd-bundle1-0 /]# ps -ef | grep remote
root         5     1  0 04:22 ?        00:00:00 /usr/sbin/pacemaker_remoted
root       133   120  0 04:23 ?        00:00:00 grep --color=auto remote
[root@httpd-bundle1-0 /]# kill -9 5; exit
--------

In the end, the cluster looks like this.

- When pacemaker_remoted is killed, the bundle should fail over to cent7-host2, but it does not.
- Also, the failure caused on cent7-host1 in Step6 is reported as a failure on cent7-host2.

--------
[root@cent7-host1 ~]# crm_mon -R
Defaulting to one-shot mode
You need to have curses available at compile time to enable console mode
Stack: corosync
Current DC: cent7-host2 (3232262829) (version 1.1.19-c3c624ea3d) - partition with quorum
Last updated: Thu Dec  6 13:24:03 2018
Last change: Thu Dec  6 13:23:17 2018 by root via crm_resource on cent7-host1

4 nodes configured
10 resources configured

Online: [ cent7-host1 (3232262828) cent7-host2 (3232262829) ]
GuestOnline: [ httpd-bundle1-0@cent7-host1 httpd-bundle2-0@cent7-host2 ]

Active resources:

Resource Group: group1
dummy1     (ocf::pacemaker:Dummy): Started cent7-host1
Resource Group: group2
dummy2     (ocf::pacemaker:Dummy): Started cent7-host2
Docker container: httpd-bundle1 [pcmktest:http]
httpd-bundle1-ip-192.168.20.188   (ocf::heartbeat:IPaddr2):       Started cent7-host1
httpd-bundle1-docker-0    (ocf::heartbeat:docker):        Started cent7-host1
httpd-bundle1-0   (ocf::pacemaker:remote):        Started cent7-host1
httpd1    (ocf::heartbeat:apache):        Started httpd-bundle1-0
Docker container: httpd-bundle2 [pcmktest:http]
httpd-bundle2-ip-192.168.20.190   (ocf::heartbeat:IPaddr2):       Started cent7-host2
httpd-bundle2-docker-0    (ocf::heartbeat:docker):        Started cent7-host2
httpd-bundle2-0   (ocf::pacemaker:remote):        Started cent7-host2
httpd2    (ocf::heartbeat:apache):        Started httpd-bundle2-0

Failed Actions:
* httpd-bundle1-0_monitor_60000 on cent7-host2 'unknown error' (1): call=9, status=Error, exitreason='',
last-rc-change='Thu Dec  6 13:23:49 2018', queued=0ms, exec=0ms
--------
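
The second symptom (a failed action recorded on a node other than the one running the container) can be checked in a similar way. This is again only a sketch under the same layout and naming assumptions as the output above; the function name failed_actions_vs_container is hypothetical, and a mismatch is only a hint, because the container may legitimately have moved after the failure was recorded.

```shell
#!/bin/sh
# Hypothetical helper: reads one-shot "crm_mon -R" output on stdin and, for each
# entry under "Failed Actions:", compares the node named in the failure with the
# node on which the matching docker container is currently started.
failed_actions_vs_container() {
    awk '
        # Container line: "<bundle>-docker-<n> (ocf::heartbeat:docker): Started <node>"
        $2 == "(ocf::heartbeat:docker):" && $3 == "Started" {
            name = $1
            sub(/-docker-/, "-", name)   # map the container to its remote resource name
            container[name] = $4
        }
        # Failed action line: "* <rsc>_<operation>_<interval> on <node> ..."
        $1 == "*" && $3 == "on" {
            rsc = $2
            sub(/_(monitor|start|stop|promote|demote|notify|migrate_to|migrate_from)_[0-9]+$/, "", rsc)
            failed_on[rsc] = $4
        }
        END {
            for (rsc in failed_on) {
                if ((rsc in container) && container[rsc] != failed_on[rsc]) {
                    printf "%s: failure recorded on %s, container currently on %s\n",
                           rsc, failed_on[rsc], container[rsc]
                }
            }
        }
    '
}
```

For example, "crm_mon -1 -R | failed_actions_vs_container" prints nothing when every recorded failure names the node that hosts the corresponding container.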

The problem appears to be that the remote resource does not move when the bundle resource is moved in Step4.

- On the latest master (a3bf7116d2), we could not confirm this because the scheduler process went down.

(I have also attached the crm_report file.)


* This problem has been registered in Bugzilla:
- https://bugs.clusterlabs.org/show_bug.cgi?id=5373


Best Regards,
Hideo Yamauchi.


