[ClusterLabs] [Problem] Remote resource does not move when bundle resource moves.
renayama19661014 at ybb.ne.jp
Thu Dec 6 06:54:07 EST 2018
Hi All,

We have tested a slightly unusual bundle configuration.
There is only one bundle resource, and it is associated with a group resource.
We confirmed the behavior with Pacemaker 1.1.19.

Step1) Configure the cluster.
--------
[root at cent7-host1 ~]# crm_mon -R
Defaulting to one-shot mode
You need to have curses available at compile time to enable console mode
Stack: corosync
Current DC: cent7-host2 (3232262829) (version 1.1.19-c3c624ea3d) - partition with quorum
Last updated: Thu Dec 6 13:20:21 2018
Last change: Thu Dec 6 13:20:05 2018 by root via cibadmin on cent7-host1

4 nodes configured
10 resources configured

Online: [ cent7-host1 (3232262828) cent7-host2 (3232262829) ]
GuestOnline: [ httpd-bundle1-0 at cent7-host1 httpd-bundle2-0 at cent7-host2 ]

Active resources:

 Resource Group: group1
     dummy1 (ocf::pacemaker:Dummy): Started cent7-host1
 Resource Group: group2
     dummy2 (ocf::pacemaker:Dummy): Started cent7-host2
 Docker container: httpd-bundle1 [pcmktest:http]
   httpd-bundle1-ip-192.168.20.188 (ocf::heartbeat:IPaddr2): Started cent7-host1
   httpd-bundle1-docker-0 (ocf::heartbeat:docker): Started cent7-host1
   httpd-bundle1-0 (ocf::pacemaker:remote): Started cent7-host1
   httpd1 (ocf::heartbeat:apache): Started httpd-bundle1-0
 Docker container: httpd-bundle2 [pcmktest:http]
   httpd-bundle2-ip-192.168.20.190 (ocf::heartbeat:IPaddr2): Started cent7-host2
   httpd-bundle2-docker-0 (ocf::heartbeat:docker): Started cent7-host2
   httpd-bundle2-0 (ocf::pacemaker:remote): Started cent7-host2
   httpd2 (ocf::heartbeat:apache): Started httpd-bundle2-0
--------

Step2) Put cent7-host1 into standby so that the resources move to cent7-host2.
--------
[root at cent7-host1 ~]# crm_standby -v on
[root at cent7-host1 ~]# crm_mon -R
Defaulting to one-shot mode
You need to have curses available at compile time to enable console mode
Stack: corosync
Current DC: cent7-host2 (3232262829) (version 1.1.19-c3c624ea3d) - partition with quorum
Last updated: Thu Dec 6 13:21:36 2018
Last change: Thu Dec 6 13:21:17 2018 by root via crm_attribute on cent7-host1

4 nodes configured
10 resources configured

Node cent7-host1 (3232262828): standby
Online: [ cent7-host2 (3232262829) ]
GuestOnline: [ httpd-bundle1-0 at cent7-host2 httpd-bundle2-0 at cent7-host2 ]

Active resources:

 Resource Group: group1
     dummy1 (ocf::pacemaker:Dummy): Started cent7-host2
 Resource Group: group2
     dummy2 (ocf::pacemaker:Dummy): Started cent7-host2
 Docker container: httpd-bundle1 [pcmktest:http]
   httpd-bundle1-ip-192.168.20.188 (ocf::heartbeat:IPaddr2): Started cent7-host2
   httpd-bundle1-docker-0 (ocf::heartbeat:docker): Started cent7-host2
   httpd-bundle1-0 (ocf::pacemaker:remote): Started cent7-host2
   httpd1 (ocf::heartbeat:apache): Started httpd-bundle1-0
 Docker container: httpd-bundle2 [pcmktest:http]
   httpd-bundle2-ip-192.168.20.190 (ocf::heartbeat:IPaddr2): Started cent7-host2
   httpd-bundle2-docker-0 (ocf::heartbeat:docker): Started cent7-host2
   httpd-bundle2-0 (ocf::pacemaker:remote): Started cent7-host2
   httpd2 (ocf::heartbeat:apache): Started httpd-bundle2-0
--------

Step3) Release standby of cent7-host1.
--------
[root at cent7-host1 ~]# crm_standby -v off
[root at cent7-host1 ~]# crm_mon -R
Defaulting to one-shot mode
You need to have curses available at compile time to enable console mode
Stack: corosync
Current DC: cent7-host2 (3232262829) (version 1.1.19-c3c624ea3d) - partition with quorum
Last updated: Thu Dec 6 13:21:59 2018
Last change: Thu Dec 6 13:21:56 2018 by root via crm_attribute on cent7-host1

4 nodes configured
10 resources configured

Online: [ cent7-host1 (3232262828) cent7-host2 (3232262829) ]
GuestOnline: [ httpd-bundle1-0 at cent7-host2 httpd-bundle2-0 at cent7-host2 ]

Active resources:

 Resource Group: group1
     dummy1 (ocf::pacemaker:Dummy): Started cent7-host2
 Resource Group: group2
     dummy2 (ocf::pacemaker:Dummy): Started cent7-host2
 Docker container: httpd-bundle1 [pcmktest:http]
   httpd-bundle1-ip-192.168.20.188 (ocf::heartbeat:IPaddr2): Started cent7-host2
   httpd-bundle1-docker-0 (ocf::heartbeat:docker): Started cent7-host2
   httpd-bundle1-0 (ocf::pacemaker:remote): Started cent7-host2
   httpd1 (ocf::heartbeat:apache): Started httpd-bundle1-0
 Docker container: httpd-bundle2 [pcmktest:http]
   httpd-bundle2-ip-192.168.20.190 (ocf::heartbeat:IPaddr2): Started cent7-host2
   httpd-bundle2-docker-0 (ocf::heartbeat:docker): Started cent7-host2
   httpd-bundle2-0 (ocf::pacemaker:remote): Started cent7-host2
   httpd2 (ocf::heartbeat:apache): Started httpd-bundle2-0
--------

Step4) Move the group1 resource back to cent7-host1 so that the bundle resource also returns to cent7-host1.
--------
[root at cent7-host1 ~]# crm_resource -M -r group1 -H cent7-host1 -f -Q
[root at cent7-host1 ~]# crm_mon -R
Defaulting to one-shot mode
You need to have curses available at compile time to enable console mode
Stack: corosync
Current DC: cent7-host2 (3232262829) (version 1.1.19-c3c624ea3d) - partition with quorum
Last updated: Thu Dec 6 13:22:56 2018
Last change: Thu Dec 6 13:22:36 2018 by root via crm_resource on cent7-host1

4 nodes configured
10 resources configured

Online: [ cent7-host1 (3232262828) cent7-host2 (3232262829) ]
GuestOnline: [ httpd-bundle1-0 at cent7-host2 httpd-bundle2-0 at cent7-host2 ]

Active resources:

 Resource Group: group1
     dummy1 (ocf::pacemaker:Dummy): Started cent7-host1
 Resource Group: group2
     dummy2 (ocf::pacemaker:Dummy): Started cent7-host2
 Docker container: httpd-bundle1 [pcmktest:http]
   httpd-bundle1-ip-192.168.20.188 (ocf::heartbeat:IPaddr2): Started cent7-host1
   httpd-bundle1-docker-0 (ocf::heartbeat:docker): Started cent7-host1
   httpd-bundle1-0 (ocf::pacemaker:remote): Started cent7-host2
   httpd1 (ocf::heartbeat:apache): Started httpd-bundle1-0
 Docker container: httpd-bundle2 [pcmktest:http]
   httpd-bundle2-ip-192.168.20.190 (ocf::heartbeat:IPaddr2): Started cent7-host2
   httpd-bundle2-docker-0 (ocf::heartbeat:docker): Started cent7-host2
   httpd-bundle2-0 (ocf::pacemaker:remote): Started cent7-host2
   httpd2 (ocf::heartbeat:apache): Started httpd-bundle2-0
--------

Step5) Remove the constraint added in Step 4.
At this point the display shows that httpd-bundle1-0 has not moved to cent7-host1.
--------
[root at cent7-host1 ~]# crm_resource -U -r group1
[root at cent7-host1 ~]# crm_mon -R
Defaulting to one-shot mode
You need to have curses available at compile time to enable console mode
Stack: corosync
Current DC: cent7-host2 (3232262829) (version 1.1.19-c3c624ea3d) - partition with quorum
Last updated: Thu Dec 6 13:23:21 2018
Last change: Thu Dec 6 13:23:17 2018 by root via crm_resource on cent7-host1

4 nodes configured
10 resources configured

Online: [ cent7-host1 (3232262828) cent7-host2 (3232262829) ]
GuestOnline: [ httpd-bundle1-0 at cent7-host2 httpd-bundle2-0 at cent7-host2 ]

Active resources:

 Resource Group: group1
     dummy1 (ocf::pacemaker:Dummy): Started cent7-host1
 Resource Group: group2
     dummy2 (ocf::pacemaker:Dummy): Started cent7-host2
 Docker container: httpd-bundle1 [pcmktest:http]
   httpd-bundle1-ip-192.168.20.188 (ocf::heartbeat:IPaddr2): Started cent7-host1
   httpd-bundle1-docker-0 (ocf::heartbeat:docker): Started cent7-host1
   httpd-bundle1-0 (ocf::pacemaker:remote): Started cent7-host2
   httpd1 (ocf::heartbeat:apache): Started httpd-bundle1-0
 Docker container: httpd-bundle2 [pcmktest:http]
   httpd-bundle2-ip-192.168.20.190 (ocf::heartbeat:IPaddr2): Started cent7-host2
   httpd-bundle2-docker-0 (ocf::heartbeat:docker): Started cent7-host2
   httpd-bundle2-0 (ocf::pacemaker:remote): Started cent7-host2
   httpd2 (ocf::heartbeat:apache): Started httpd-bundle2-0
--------

Step6) Connect to httpd-bundle1-docker-0 and kill pacemaker_remoted to cause a failure.
--------
[root at cent7-host1 ~]# docker exec -it httpd-bundle1-docker-0 /bin/bash
[root at httpd-bundle1-0 /]# ps -ef |grep remote
root 5 1 0 04:22 ? 00:00:00 /usr/sbin/pacemaker_remoted
root 133 120 0 04:23 ? 00:00:00 grep --color=auto remote
[root at httpd-bundle1-0 /]# kill -9 5;exit
--------

Finally, the cluster looks like this.
- If pacemaker_remoted is killed, the resource should fail over to cent7-host2, but it does not.
- Also, the fault injected on cent7-host1 in Step 6 is reported as a fault on cent7-host2.
--------
[root at cent7-host1 ~]# crm_mon -R
Defaulting to one-shot mode
You need to have curses available at compile time to enable console mode
Stack: corosync
Current DC: cent7-host2 (3232262829) (version 1.1.19-c3c624ea3d) - partition with quorum
Last updated: Thu Dec 6 13:24:03 2018
Last change: Thu Dec 6 13:23:17 2018 by root via crm_resource on cent7-host1

4 nodes configured
10 resources configured

Online: [ cent7-host1 (3232262828) cent7-host2 (3232262829) ]
GuestOnline: [ httpd-bundle1-0 at cent7-host1 httpd-bundle2-0 at cent7-host2 ]

Active resources:

 Resource Group: group1
     dummy1 (ocf::pacemaker:Dummy): Started cent7-host1
 Resource Group: group2
     dummy2 (ocf::pacemaker:Dummy): Started cent7-host2
 Docker container: httpd-bundle1 [pcmktest:http]
   httpd-bundle1-ip-192.168.20.188 (ocf::heartbeat:IPaddr2): Started cent7-host1
   httpd-bundle1-docker-0 (ocf::heartbeat:docker): Started cent7-host1
   httpd-bundle1-0 (ocf::pacemaker:remote): Started cent7-host1
   httpd1 (ocf::heartbeat:apache): Started httpd-bundle1-0
 Docker container: httpd-bundle2 [pcmktest:http]
   httpd-bundle2-ip-192.168.20.190 (ocf::heartbeat:IPaddr2): Started cent7-host2
   httpd-bundle2-docker-0 (ocf::heartbeat:docker): Started cent7-host2
   httpd-bundle2-0 (ocf::pacemaker:remote): Started cent7-host2
   httpd2 (ocf::heartbeat:apache): Started httpd-bundle2-0

Failed Actions:
* httpd-bundle1-0_monitor_60000 on cent7-host2 'unknown error' (1): call=9, status=Error, exitreason='', last-rc-change='Thu Dec 6 13:23:49 2018', queued=0ms, exec=0ms
--------

The problem appears to be that when the bundle resource is moved in Step 4, the remote resource does not move with it.
- We could not confirm this on the latest master (a3bf7116d2) because the scheduler process went down.
* This problem is registered in the following Bugzilla.
- https://bugs.clusterlabs.org/show_bug.cgi?id=5373
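
For anyone who wants to check for the same condition, the mismatch seen in the Step 4 and Step 5 output (container on cent7-host1, connection resource on cent7-host2) can be detected mechanically. The following is only a rough sketch, not a Pacemaker tool. It assumes one resource per line in the crm_mon text, and it assumes the naming pattern seen in the output above, where the container is "<bundle>-docker-<n>" and the connection resource is "<bundle>-<n>". The names find_split_bundles and STEP4_SAMPLE are made up for this example.

```python
import re

# Matches resource lines such as:
#   httpd-bundle1-docker-0 (ocf::heartbeat:docker): Started cent7-host1
RESOURCE_LINE = re.compile(
    r"^\s*(?P<name>\S+)\s+\((?P<agent>[^)]+)\):\s+Started\s+(?P<node>\S+)\s*$"
)


def find_split_bundles(crm_mon_text):
    """Return (connection, container_node, connection_node) tuples for bundle
    replicas whose container and connection resource are on different nodes.

    Assumption: the container is named "<bundle>-docker-<n>" and the
    connection resource "<bundle>-<n>", as in the output quoted above.
    """
    containers = {}   # connection name -> node running the docker container
    connections = {}  # connection name -> node running ocf::pacemaker:remote
    for line in crm_mon_text.splitlines():
        m = RESOURCE_LINE.match(line)
        if not m:
            continue
        name, agent, node = m.group("name", "agent", "node")
        if agent == "ocf::heartbeat:docker":
            # httpd-bundle1-docker-0 -> httpd-bundle1-0
            containers[name.replace("-docker-", "-", 1)] = node
        elif agent == "ocf::pacemaker:remote":
            connections[name] = node
    return sorted(
        (name, containers[name], node)
        for name, node in connections.items()
        if name in containers and containers[name] != node
    )


# Resource lines copied from the Step 4 output.
STEP4_SAMPLE = """\
 Docker container: httpd-bundle1 [pcmktest:http]
   httpd-bundle1-ip-192.168.20.188 (ocf::heartbeat:IPaddr2): Started cent7-host1
   httpd-bundle1-docker-0 (ocf::heartbeat:docker): Started cent7-host1
   httpd-bundle1-0 (ocf::pacemaker:remote): Started cent7-host2
   httpd1 (ocf::heartbeat:apache): Started httpd-bundle1-0
 Docker container: httpd-bundle2 [pcmktest:http]
   httpd-bundle2-ip-192.168.20.190 (ocf::heartbeat:IPaddr2): Started cent7-host2
   httpd-bundle2-docker-0 (ocf::heartbeat:docker): Started cent7-host2
   httpd-bundle2-0 (ocf::pacemaker:remote): Started cent7-host2
   httpd2 (ocf::heartbeat:apache): Started httpd-bundle2-0
"""

if __name__ == "__main__":
    for name, container_node, connection_node in find_split_bundles(STEP4_SAMPLE):
        print(f"{name}: container on {container_node}, "
              f"connection on {connection_node}")
```

With the Step 4 lines as input, only httpd-bundle1-0 is reported. httpd-bundle2-0 is not, because its container and connection are both on cent7-host2.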
Best Regards,
Hideo Yamauchi.