[ClusterLabs] “pcs cluster stop -all” hangs and

Casey & Gina caseyandgina at icloud.com
Fri May 11 16:16:16 UTC 2018


I don't know why this happens, but I encounter this often.  My workaround is this:

killall -9 pacemakerd; killall pengine; killall lrmd; killall cib; killall corosync
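
The same workaround could be wrapped in a small script so the commands can be previewed before anything is killed. This is only a rough sketch: the function names and the DRY_RUN variable are made up for illustration, and the daemon list is simply the one from the command above, so it may need adjusting for other Pacemaker versions.

```shell
#!/bin/sh
# Sketch only: same daemons and same order as the one-line workaround.
# DRY_RUN is a hypothetical switch; DRY_RUN=1 prints commands instead of running them.
DAEMONS="pacemakerd pengine lrmd cib corosync"

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

force_stop_cluster() {
    for d in $DAEMONS; do
        if [ "$d" = "pacemakerd" ]; then
            # pacemakerd gets SIGKILL, as in the original command
            run killall -9 "$d"
        else
            # the remaining daemons get the default SIGTERM
            run killall "$d"
        fi
    done
}
```

Calling `DRY_RUN=1 force_stop_cluster` only lists the killall commands; calling `force_stop_cluster` as root actually runs them on the local node.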

> On May 10, 2018, at 11:26 PM, 范国腾 <fanguoteng at highgo.com> wrote:
> 
> Hi,
> 
> When I run "pcs cluster stop --all", it sometimes hangs with no response. The logs are below. Can we tell from the logs why it hangs, and how can we make the cluster stop immediately?
> 
> [root at node2 pg_log]# pcs status
> Cluster name: hgpurog
> Stack: corosync
> Current DC: sds1 (version 1.1.16-12.el7-94ff4df) - partition with quorum
> Last updated: Fri May 11 01:11:26 2018
> Last change: Fri May 11 01:09:24 2018 by hacluster via crmd on sds1
> 
> 2 nodes configured
> 3 resources configured
> 
> Online: [ sds1 sds2 ]
> 
> Full list of resources:
> 
> Master/Slave Set: pgsql-ha [pgsqld]
>     Stopped: [ sds1 sds2 ]
> Resource Group: mastergroup
>     master-vip (ocf::heartbeat:IPaddr2):       Started sds1
> 
> Daemon Status:
>  corosync: active/enabled
>  pacemaker: active/enabled
>  pcsd: active/enabled
> [root at node2 pg_log]# pcs cluster stop --all
> 
> 
> The /var/log/messages output is as follows:
> May 11 01:07:50 node2 crmd[5365]:  notice: State transition S_PENDING -> S_NOT_DC
> May 11 01:07:50 node2 crmd[5365]:  notice: State transition S_NOT_DC -> S_PENDING
> May 11 01:07:50 node2 crmd[5365]:  notice: State transition S_PENDING -> S_NOT_DC
> May 11 01:07:51 node2 pgsqlms(pgsqld)[5371]: INFO: Execute action monitor and the result 7
> May 11 01:07:51 node2 pgsqlms(undef)[5408]: INFO: Execute action meta-data and the result 0
> May 11 01:07:51 node2 crmd[5365]:  notice: Result of probe operation for pgsqld on sds2: 7 (not running)
> May 11 01:07:51 node2 crmd[5365]:  notice: sds2-pgsqld_monitor_0:6 [ /tmp:5866 - no response\n ]
> May 11 01:07:51 node2 crmd[5365]:  notice: Result of probe operation for master-vip on sds2: 7 (not running)
> May 11 01:10:02 node2 systemd: Started Session 16 of user root.
> May 11 01:10:02 node2 systemd: Starting Session 16 of user root.
> May 11 01:11:33 node2 pacemakerd[5357]:  notice: Caught 'Terminated' signal
> May 11 01:11:33 node2 systemd: Stopping Pacemaker High Availability Cluster Manager...
> May 11 01:11:33 node2 pacemakerd[5357]:  notice: Shutting down Pacemaker
> May 11 01:11:33 node2 pacemakerd[5357]:  notice: Stopping crmd
> May 11 01:11:33 node2 crmd[5365]:  notice: Caught 'Terminated' signal
> May 11 01:11:33 node2 crmd[5365]:  notice: Shutting down cluster resource manager
> May 11 01:12:49 node2 systemd: Started Session 17 of user root.
> May 11 01:12:49 node2 systemd-logind: New session 17 of user root.
> May 11 01:12:49 node2 gdm-launch-environment]: AccountsService: ActUserManager: user (null) has no username (object path: /org/freedesktop/Accounts/User0, uid: 0)
> May 11 01:12:49 node2 journal: ActUserManager: user (null) has no username (object path: /org/freedesktop/Accounts/User0, uid: 0)
> May 11 01:12:49 node2 systemd: Starting Session 17 of user root.
> May 11 01:12:49 node2 dbus[648]: [system] Activating service name='org.freedesktop.problems' (using servicehelper)
> May 11 01:12:49 node2 dbus-daemon: dbus[648]: [system] Activating service name='org.freedesktop.problems' (using servicehelper)
> May 11 01:12:49 node2 dbus[648]: [system] Successfully activated service 'org.freedesktop.problems'
> May 11 01:12:49 node2 dbus-daemon: dbus[648]: [system] Successfully activated service 'org.freedesktop.problems'
> May 11 01:12:49 node2 journal: g_dbus_interface_skeleton_unexport: assertion 'interface_->priv->connections != NULL' failed
> 
> Here is the log from the peer node:
> May 11 01:09:08 node1 pgsqlms(pgsqld)[28599]: WARNING: No secondary connected to the master
> May 11 01:09:08 node1 pgsqlms(pgsqld)[28599]: WARNING: "sds2" is not connected to the primary
> May 11 01:09:08 node1 pgsqlms(pgsqld)[28599]: INFO: Execute action monitor and the result 8
> May 11 01:09:18 node1 pgsqlms(pgsqld)[28679]: WARNING: No secondary connected to the master
> May 11 01:09:18 node1 pgsqlms(pgsqld)[28679]: WARNING: "sds2" is not connected to the primary
> May 11 01:09:18 node1 pgsqlms(pgsqld)[28679]: INFO: Execute action monitor and the result 8
> May 11 01:09:24 node1 crmd[1111]:  notice: sds1-pgsqld_monitor_10000:19 [ /tmp:5866 - accepting connections\n ]
> May 11 01:09:24 node1 crmd[1111]:  notice: Transition aborted by deletion of lrm_resource[@id='pgsqld']: Resource state removal
> May 11 01:10:02 node1 systemd: Started Session 17 of user root.
> May 11 01:10:02 node1 systemd: Starting Session 17 of user root.
> May 11 01:11:33 node1 pacemakerd[1042]:  notice: Caught 'Terminated' signal
> May 11 01:11:33 node1 systemd: Stopping Pacemaker High Availability Cluster Manager...
> May 11 01:11:33 node1 pacemakerd[1042]:  notice: Shutting down Pacemaker
> May 11 01:11:33 node1 pacemakerd[1042]:  notice: Stopping crmd
> May 11 01:11:33 node1 crmd[1111]:  notice: Caught 'Terminated' signal
> May 11 01:11:33 node1 crmd[1111]:  notice: Shutting down cluster resource manager
> May 11 01:11:33 node1 crmd[1111]: warning: Input I_SHUTDOWN received in state S_TRANSITION_ENGINE from crm_shutdown
> 
> 


