[ClusterLabs] “pcs cluster stop -all” hangs and

范国腾 fanguoteng at highgo.com
Fri May 11 01:26:31 EDT 2018


Hi,

When I run "pcs cluster stop --all", it sometimes hangs and gives no response. The logs are below. Can we tell from the logs why it hangs, and how can I make the cluster stop immediately?

[root@node2 pg_log]# pcs status
Cluster name: hgpurog
Stack: corosync
Current DC: sds1 (version 1.1.16-12.el7-94ff4df) - partition with quorum
Last updated: Fri May 11 01:11:26 2018
Last change: Fri May 11 01:09:24 2018 by hacluster via crmd on sds1

2 nodes configured
3 resources configured

Online: [ sds1 sds2 ]

Full list of resources:

 Master/Slave Set: pgsql-ha [pgsqld]
     Stopped: [ sds1 sds2 ]
 Resource Group: mastergroup
     master-vip (ocf::heartbeat:IPaddr2):       Started sds1

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[root@node2 pg_log]# pcs cluster stop --all


The /var/log/messages on this node (node2) is as follows:
May 11 01:07:50 node2 crmd[5365]:  notice: State transition S_PENDING -> S_NOT_DC
May 11 01:07:50 node2 crmd[5365]:  notice: State transition S_NOT_DC -> S_PENDING
May 11 01:07:50 node2 crmd[5365]:  notice: State transition S_PENDING -> S_NOT_DC
May 11 01:07:51 node2 pgsqlms(pgsqld)[5371]: INFO: Execute action monitor and the result 7
May 11 01:07:51 node2 pgsqlms(undef)[5408]: INFO: Execute action meta-data and the result 0
May 11 01:07:51 node2 crmd[5365]:  notice: Result of probe operation for pgsqld on sds2: 7 (not running)
May 11 01:07:51 node2 crmd[5365]:  notice: sds2-pgsqld_monitor_0:6 [ /tmp:5866 - no response\n ]
May 11 01:07:51 node2 crmd[5365]:  notice: Result of probe operation for master-vip on sds2: 7 (not running)
May 11 01:10:02 node2 systemd: Started Session 16 of user root.
May 11 01:10:02 node2 systemd: Starting Session 16 of user root.
May 11 01:11:33 node2 pacemakerd[5357]:  notice: Caught 'Terminated' signal
May 11 01:11:33 node2 systemd: Stopping Pacemaker High Availability Cluster Manager...
May 11 01:11:33 node2 pacemakerd[5357]:  notice: Shutting down Pacemaker
May 11 01:11:33 node2 pacemakerd[5357]:  notice: Stopping crmd
May 11 01:11:33 node2 crmd[5365]:  notice: Caught 'Terminated' signal
May 11 01:11:33 node2 crmd[5365]:  notice: Shutting down cluster resource manager
May 11 01:12:49 node2 systemd: Started Session 17 of user root.
May 11 01:12:49 node2 systemd-logind: New session 17 of user root.
May 11 01:12:49 node2 gdm-launch-environment]: AccountsService: ActUserManager: user (null) has no username (object path: /org/freedesktop/Accounts/User0, uid: 0)
May 11 01:12:49 node2 journal: ActUserManager: user (null) has no username (object path: /org/freedesktop/Accounts/User0, uid: 0)
May 11 01:12:49 node2 systemd: Starting Session 17 of user root.
May 11 01:12:49 node2 dbus[648]: [system] Activating service name='org.freedesktop.problems' (using servicehelper)
May 11 01:12:49 node2 dbus-daemon: dbus[648]: [system] Activating service name='org.freedesktop.problems' (using servicehelper)
May 11 01:12:49 node2 dbus[648]: [system] Successfully activated service 'org.freedesktop.problems'
May 11 01:12:49 node2 dbus-daemon: dbus[648]: [system] Successfully activated service 'org.freedesktop.problems'
May 11 01:12:49 node2 journal: g_dbus_interface_skeleton_unexport: assertion 'interface_->priv->connections != NULL' failed

Here is the log on the peer node (node1):
May 11 01:09:08 node1 pgsqlms(pgsqld)[28599]: WARNING: No secondary connected to the master
May 11 01:09:08 node1 pgsqlms(pgsqld)[28599]: WARNING: "sds2" is not connected to the primary
May 11 01:09:08 node1 pgsqlms(pgsqld)[28599]: INFO: Execute action monitor and the result 8
May 11 01:09:18 node1 pgsqlms(pgsqld)[28679]: WARNING: No secondary connected to the master
May 11 01:09:18 node1 pgsqlms(pgsqld)[28679]: WARNING: "sds2" is not connected to the primary
May 11 01:09:18 node1 pgsqlms(pgsqld)[28679]: INFO: Execute action monitor and the result 8
May 11 01:09:24 node1 crmd[1111]:  notice: sds1-pgsqld_monitor_10000:19 [ /tmp:5866 - accepting connections\n ]
May 11 01:09:24 node1 crmd[1111]:  notice: Transition aborted by deletion of lrm_resource[@id='pgsqld']: Resource state removal
May 11 01:10:02 node1 systemd: Started Session 17 of user root.
May 11 01:10:02 node1 systemd: Starting Session 17 of user root.
May 11 01:11:33 node1 pacemakerd[1042]:  notice: Caught 'Terminated' signal
May 11 01:11:33 node1 systemd: Stopping Pacemaker High Availability Cluster Manager...
May 11 01:11:33 node1 pacemakerd[1042]:  notice: Shutting down Pacemaker
May 11 01:11:33 node1 pacemakerd[1042]:  notice: Stopping crmd
May 11 01:11:33 node1 crmd[1111]:  notice: Caught 'Terminated' signal
May 11 01:11:33 node1 crmd[1111]:  notice: Shutting down cluster resource manager
May 11 01:11:33 node1 crmd[1111]: warning: Input I_SHUTDOWN received in state S_TRANSITION_ENGINE from crm_shutdown



