[ClusterLabs] [Problem] Remote resource does not move when bundle resource moves.
renayama19661014 at ybb.ne.jp
Sat Dec 8 15:29:22 EST 2018
Hi All,
Sorry, I made a mistake with the line breaks in my previous message, so I am sending it again.
---
Hi All,
We tested a slightly unusual bundle configuration.
There is only one bundle resource, and it has an association with a group resource.
The behavior was confirmed with Pacemaker 1.1.19.
Step1) Configure the cluster.
--------
[root@cent7-host1 ~]# crm_mon -R
Defaulting to one-shot mode
You need to have curses available at compile time to enable console mode
Stack: corosync
Current DC: cent7-host2 (3232262829) (version 1.1.19-c3c624ea3d) - partition with quorum
Last updated: Thu Dec 6 13:20:21 2018
Last change: Thu Dec 6 13:20:05 2018 by root via cibadmin on cent7-host1
4 nodes configured
10 resources configured
Online: [ cent7-host1 (3232262828) cent7-host2 (3232262829) ]
GuestOnline: [ httpd-bundle1-0@cent7-host1 httpd-bundle2-0@cent7-host2 ]
Active resources:
Resource Group: group1
dummy1 (ocf::pacemaker:Dummy): Started cent7-host1
Resource Group: group2
dummy2 (ocf::pacemaker:Dummy): Started cent7-host2
Docker container: httpd-bundle1 [pcmktest:http]
httpd-bundle1-ip-192.168.20.188 (ocf::heartbeat:IPaddr2): Started cent7-host1
httpd-bundle1-docker-0 (ocf::heartbeat:docker): Started cent7-host1
httpd-bundle1-0 (ocf::pacemaker:remote): Started cent7-host1
httpd1 (ocf::heartbeat:apache): Started httpd-bundle1-0
Docker container: httpd-bundle2 [pcmktest:http]
httpd-bundle2-ip-192.168.20.190 (ocf::heartbeat:IPaddr2): Started cent7-host2
httpd-bundle2-docker-0 (ocf::heartbeat:docker): Started cent7-host2
httpd-bundle2-0 (ocf::pacemaker:remote): Started cent7-host2
httpd2 (ocf::heartbeat:apache): Started httpd-bundle2-0
--------
Step2) Put cent7-host1 into standby so that its resources move to cent7-host2.
--------
[root@cent7-host1 ~]# crm_standby -v on
[root@cent7-host1 ~]# crm_mon -R
Defaulting to one-shot mode
You need to have curses available at compile time to enable console mode
Stack: corosync
Current DC: cent7-host2 (3232262829) (version 1.1.19-c3c624ea3d) - partition with quorum
Last updated: Thu Dec 6 13:21:36 2018
Last change: Thu Dec 6 13:21:17 2018 by root via crm_attribute on cent7-host1
4 nodes configured
10 resources configured
Node cent7-host1 (3232262828): standby
Online: [ cent7-host2 (3232262829) ]
GuestOnline: [ httpd-bundle1-0@cent7-host2 httpd-bundle2-0@cent7-host2 ]
Active resources:
Resource Group: group1
dummy1 (ocf::pacemaker:Dummy): Started cent7-host2
Resource Group: group2
dummy2 (ocf::pacemaker:Dummy): Started cent7-host2
Docker container: httpd-bundle1 [pcmktest:http]
httpd-bundle1-ip-192.168.20.188 (ocf::heartbeat:IPaddr2): Started cent7-host2
httpd-bundle1-docker-0 (ocf::heartbeat:docker): Started cent7-host2
httpd-bundle1-0 (ocf::pacemaker:remote): Started cent7-host2
httpd1 (ocf::heartbeat:apache): Started httpd-bundle1-0
Docker container: httpd-bundle2 [pcmktest:http]
httpd-bundle2-ip-192.168.20.190 (ocf::heartbeat:IPaddr2): Started cent7-host2
httpd-bundle2-docker-0 (ocf::heartbeat:docker): Started cent7-host2
httpd-bundle2-0 (ocf::pacemaker:remote): Started cent7-host2
httpd2 (ocf::heartbeat:apache): Started httpd-bundle2-0
--------
Step3) Release standby of cent7-host1.
--------
[root@cent7-host1 ~]# crm_standby -v off
[root@cent7-host1 ~]# crm_mon -R
Defaulting to one-shot mode
You need to have curses available at compile time to enable console mode
Stack: corosync
Current DC: cent7-host2 (3232262829) (version 1.1.19-c3c624ea3d) - partition with quorum
Last updated: Thu Dec 6 13:21:59 2018
Last change: Thu Dec 6 13:21:56 2018 by root via crm_attribute on cent7-host1
4 nodes configured
10 resources configured
Online: [ cent7-host1 (3232262828) cent7-host2 (3232262829) ]
GuestOnline: [ httpd-bundle1-0@cent7-host2 httpd-bundle2-0@cent7-host2 ]
Active resources:
Resource Group: group1
dummy1 (ocf::pacemaker:Dummy): Started cent7-host2
Resource Group: group2
dummy2 (ocf::pacemaker:Dummy): Started cent7-host2
Docker container: httpd-bundle1 [pcmktest:http]
httpd-bundle1-ip-192.168.20.188 (ocf::heartbeat:IPaddr2): Started cent7-host2
httpd-bundle1-docker-0 (ocf::heartbeat:docker): Started cent7-host2
httpd-bundle1-0 (ocf::pacemaker:remote): Started cent7-host2
httpd1 (ocf::heartbeat:apache): Started httpd-bundle1-0
Docker container: httpd-bundle2 [pcmktest:http]
httpd-bundle2-ip-192.168.20.190 (ocf::heartbeat:IPaddr2): Started cent7-host2
httpd-bundle2-docker-0 (ocf::heartbeat:docker): Started cent7-host2
httpd-bundle2-0 (ocf::pacemaker:remote): Started cent7-host2
httpd2 (ocf::heartbeat:apache): Started httpd-bundle2-0
--------
Step4) Move the group1 resource back to cent7-host1, so that the bundle resource also returns to cent7-host1.
--------
[root@cent7-host1 ~]# crm_resource -M -r group1 -H cent7-host1 -f -Q
[root@cent7-host1 ~]# crm_mon -R
Defaulting to one-shot mode
You need to have curses available at compile time to enable console mode
Stack: corosync
Current DC: cent7-host2 (3232262829) (version 1.1.19-c3c624ea3d) - partition with quorum
Last updated: Thu Dec 6 13:22:56 2018
Last change: Thu Dec 6 13:22:36 2018 by root via crm_resource on cent7-host1
4 nodes configured
10 resources configured
Online: [ cent7-host1 (3232262828) cent7-host2 (3232262829) ]
GuestOnline: [ httpd-bundle1-0@cent7-host2 httpd-bundle2-0@cent7-host2 ]
Active resources:
Resource Group: group1
dummy1 (ocf::pacemaker:Dummy): Started cent7-host1
Resource Group: group2
dummy2 (ocf::pacemaker:Dummy): Started cent7-host2
Docker container: httpd-bundle1 [pcmktest:http]
httpd-bundle1-ip-192.168.20.188 (ocf::heartbeat:IPaddr2): Started cent7-host1
httpd-bundle1-docker-0 (ocf::heartbeat:docker): Started cent7-host1
httpd-bundle1-0 (ocf::pacemaker:remote): Started cent7-host2
httpd1 (ocf::heartbeat:apache): Started httpd-bundle1-0
Docker container: httpd-bundle2 [pcmktest:http]
httpd-bundle2-ip-192.168.20.190 (ocf::heartbeat:IPaddr2): Started cent7-host2
httpd-bundle2-docker-0 (ocf::heartbeat:docker): Started cent7-host2
httpd-bundle2-0 (ocf::pacemaker:remote): Started cent7-host2
httpd2 (ocf::heartbeat:apache): Started httpd-bundle2-0
--------
Step5) Remove the constraint that was added in Step4.
At this point, the display shows that httpd-bundle1-0 has still not moved to cent7-host1.
--------
[root@cent7-host1 ~]# crm_resource -U -r group1
[root@cent7-host1 ~]# crm_mon -R
Defaulting to one-shot mode
You need to have curses available at compile time to enable console mode
Stack: corosync
Current DC: cent7-host2 (3232262829) (version 1.1.19-c3c624ea3d) - partition with quorum
Last updated: Thu Dec 6 13:23:21 2018
Last change: Thu Dec 6 13:23:17 2018 by root via crm_resource on cent7-host1
4 nodes configured
10 resources configured
Online: [ cent7-host1 (3232262828) cent7-host2 (3232262829) ]
GuestOnline: [ httpd-bundle1-0@cent7-host2 httpd-bundle2-0@cent7-host2 ]
Active resources:
Resource Group: group1
dummy1 (ocf::pacemaker:Dummy): Started cent7-host1
Resource Group: group2
dummy2 (ocf::pacemaker:Dummy): Started cent7-host2
Docker container: httpd-bundle1 [pcmktest:http]
httpd-bundle1-ip-192.168.20.188 (ocf::heartbeat:IPaddr2): Started cent7-host1
httpd-bundle1-docker-0 (ocf::heartbeat:docker): Started cent7-host1
httpd-bundle1-0 (ocf::pacemaker:remote): Started cent7-host2
httpd1 (ocf::heartbeat:apache): Started httpd-bundle1-0
Docker container: httpd-bundle2 [pcmktest:http]
httpd-bundle2-ip-192.168.20.190 (ocf::heartbeat:IPaddr2): Started cent7-host2
httpd-bundle2-docker-0 (ocf::heartbeat:docker): Started cent7-host2
httpd-bundle2-0 (ocf::pacemaker:remote): Started cent7-host2
httpd2 (ocf::heartbeat:apache): Started httpd-bundle2-0
--------
Step6) Connect to httpd-bundle1-docker-0 and kill pacemaker_remoted to cause a failure.
--------
[root@cent7-host1 ~]# docker exec -it httpd-bundle1-docker-0 /bin/bash
[root@httpd-bundle1-0 /]# ps -ef |grep remote
root 5 1 0 04:22 ? 00:00:00 /usr/sbin/pacemaker_remoted
root 133 120 0 04:23 ? 00:00:00 grep --color=auto remote
[root@httpd-bundle1-0 /]# kill -9 5;exit
--------
Finally, the cluster looks like this.
- When pacemaker_remoted is killed, the bundle should fail over to cent7-host2, but it does not.
- Also, the failure caused on cent7-host1 in Step 6 is reported as a failure on cent7-host2.
--------
[root@cent7-host1 ~]# crm_mon -R
Defaulting to one-shot mode
You need to have curses available at compile time to enable console mode
Stack: corosync
Current DC: cent7-host2 (3232262829) (version 1.1.19-c3c624ea3d) - partition with quorum
Last updated: Thu Dec 6 13:24:03 2018
Last change: Thu Dec 6 13:23:17 2018 by root via crm_resource on cent7-host1
4 nodes configured
10 resources configured
Online: [ cent7-host1 (3232262828) cent7-host2 (3232262829) ]
GuestOnline: [ httpd-bundle1-0@cent7-host1 httpd-bundle2-0@cent7-host2 ]
Active resources:
Resource Group: group1
dummy1 (ocf::pacemaker:Dummy): Started cent7-host1
Resource Group: group2
dummy2 (ocf::pacemaker:Dummy): Started cent7-host2
Docker container: httpd-bundle1 [pcmktest:http]
httpd-bundle1-ip-192.168.20.188 (ocf::heartbeat:IPaddr2): Started cent7-host1
httpd-bundle1-docker-0 (ocf::heartbeat:docker): Started cent7-host1
httpd-bundle1-0 (ocf::pacemaker:remote): Started cent7-host1
httpd1 (ocf::heartbeat:apache): Started httpd-bundle1-0
Docker container: httpd-bundle2 [pcmktest:http]
httpd-bundle2-ip-192.168.20.190 (ocf::heartbeat:IPaddr2): Started cent7-host2
httpd-bundle2-docker-0 (ocf::heartbeat:docker): Started cent7-host2
httpd-bundle2-0 (ocf::pacemaker:remote): Started cent7-host2
httpd2 (ocf::heartbeat:apache): Started httpd-bundle2-0
Failed Actions:
* httpd-bundle1-0_monitor_60000 on cent7-host2 'unknown error' (1): call=9, status=Error, exitreason='',
last-rc-change='Thu Dec 6 13:23:49 2018', queued=0ms, exec=0ms
--------
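To make the second point easier to check, here is a minimal Python sketch (not part of Pacemaker; the function name failed_actions and the regular expression are only my assumptions about the one-shot crm_mon text format shown above). It extracts the resource, operation, interval and node from each "Failed Actions" entry, so the reported node can be compared with the node where the container is actually running.

```python
import re

# Assumed format of a failed-action line in one-shot crm_mon output, e.g.
#   "* httpd-bundle1-0_monitor_60000 on cent7-host2 'unknown error' (1): ..."
FAILED_ACTION = re.compile(
    r"^\*\s+(?P<resource>\S+?)_(?P<operation>[a-z][a-z-]*)_(?P<interval>\d+)"
    r"\s+on\s+(?P<node>\S+)"
)


def failed_actions(crm_mon_text):
    """Return a list of (resource, operation, interval_ms, node) tuples."""
    found = []
    for line in crm_mon_text.splitlines():
        match = FAILED_ACTION.match(line.strip())
        if match:
            found.append((
                match.group("resource"),
                match.group("operation"),
                int(match.group("interval")),
                match.group("node"),
            ))
    return found


# The failed-action entry from the output above.
SAMPLE = """\
Failed Actions:
* httpd-bundle1-0_monitor_60000 on cent7-host2 'unknown error' (1): call=9, status=Error, exitreason='',
last-rc-change='Thu Dec 6 13:23:49 2018', queued=0ms, exec=0ms
"""

if __name__ == "__main__":
    for resource, operation, interval, node in failed_actions(SAMPLE):
        print(resource, operation, interval, node)
```

For the output above, this reports the monitor failure of httpd-bundle1-0 on cent7-host2, although httpd-bundle1-docker-0 was running on cent7-host1 when pacemaker_remoted was killed.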
The problem appears to be that when the bundle resource is moved in Step 4, the remote resource does not move with it.
- With the latest master (a3bf7116d2), we could not confirm this because the scheduler process went down.
(I have also attached the crm_report file.)
* This problem has been registered in the following Bugzilla entry.
- https://bugs.clusterlabs.org/show_bug.cgi?id=5373
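For anyone who wants to detect this state automatically, the following is a rough Python sketch, assuming the one-shot crm_mon text format shown in the steps above and the naming pattern where "<bundle>-docker-<N>" pairs with the remote resource "<bundle>-<N>". The function name find_split_bundles is hypothetical and this is not a Pacemaker tool. It lists bundle instances whose remote connection resource is running on a different node than its docker container.

```python
import re

# Matches one-shot crm_mon resource lines such as
#   "httpd-bundle1-docker-0 (ocf::heartbeat:docker): Started cent7-host1"
RESOURCE_LINE = re.compile(
    r"^\s*(?P<name>\S+)\s+\((?P<agent>[^)]+)\):\s+Started\s+(?P<node>\S+)\s*$"
)


def find_split_bundles(crm_mon_text):
    """Return {instance: (container_node, remote_node)} for mismatched pairs."""
    containers = {}  # "httpd-bundle1-0" -> node running the docker resource
    remotes = {}     # "httpd-bundle1-0" -> node running the remote connection
    for line in crm_mon_text.splitlines():
        match = RESOURCE_LINE.match(line)
        if not match:
            continue
        name, agent, node = match.group("name", "agent", "node")
        if agent == "ocf::heartbeat:docker":
            # Assumed naming: "<bundle>-docker-<N>" pairs with "<bundle>-<N>".
            containers[name.replace("-docker-", "-", 1)] = node
        elif agent == "ocf::pacemaker:remote":
            remotes[name] = node
    return {
        name: (containers[name], remotes[name])
        for name in containers
        if name in remotes and containers[name] != remotes[name]
    }


# Resource lines taken from the Step5 output.
SAMPLE = """\
httpd-bundle1-docker-0 (ocf::heartbeat:docker): Started cent7-host1
httpd-bundle1-0 (ocf::pacemaker:remote): Started cent7-host2
httpd-bundle2-docker-0 (ocf::heartbeat:docker): Started cent7-host2
httpd-bundle2-0 (ocf::pacemaker:remote): Started cent7-host2
"""

if __name__ == "__main__":
    print(find_split_bundles(SAMPLE))
    # prints {'httpd-bundle1-0': ('cent7-host1', 'cent7-host2')}
```

Applied to the Step5 output, only httpd-bundle1-0 is reported, which matches the state described in this report.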
Best Regards,
Hideo Yamauchi.