[ClusterLabs] Resources not monitored in SLES11 SP4 (1.1.12-f47ea56)
bubble at hoster-ok.com
Tue Jun 26 03:45:14 EDT 2018
26.06.2018 09:14, Ulrich Windl wrote:
> We just observed some strange effect we cannot explain in SLES 11 SP4 (pacemaker 1.1.12-f47ea56):
> We run about a dozen Xen PVMs on a three-node cluster (plus some infrastructure and monitoring stuff). It all worked well so far, and there was no significant change recently.
> However, when a colleague stopped one VM for maintenance via a cluster command, the cluster did not notice when the PVM was actually running again (it had been started outside the cluster, which is a bad idea, I know).
To be on the safe side in such cases, you'd probably want to enable an
additional monitor for the "Stopped" role. The default one covers only
the "Started" role. It is the same as for multistate resources, where you
need several monitor ops, for the "Started/Slave" and "Master" roles.
However, this will increase the load.
Also, I believe the cluster should reprobe a resource on all nodes once you
change target-role back to "Started".
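
A minimal sketch of such a configuration in crm shell syntax, assuming a
Xen primitive. The resource name "vm-example", the xmfile path and the
timeouts are hypothetical; the two monitor operations are given different
intervals because monitor ops on the same resource must not share one.

```
# Hypothetical resource; adjust names, paths and timeouts to your setup.
primitive vm-example ocf:heartbeat:Xen \
    params xmfile="/etc/xen/vm/vm-example" \
    op monitor interval="600s" timeout="60s" \
    op monitor interval="610s" role="Stopped" timeout="60s"
```

The role="Stopped" operation is meant to run on nodes where the cluster
does not expect the resource to be active, so a VM started outside the
cluster should be noticed at the next interval instead of only after a
manual reprobe.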
> Examining the logs, it seems that the recheck timer popped periodically, but no monitor action was run for the VM (the action is configured to run every 10 minutes).
> Actually the only monitor operations found were:
> May 23 08:04:13
> Jun 13 08:13:03
> Jun 25 09:29:04
> Then a manual "reprobe" was done, and several monitor operations were run.
> Then again I see no more monitor actions in syslog.
> What could be the reasons for this? Too many operations defined?
> The other message I don't understand is like "<other-resource>: Rolling back scores from <vm-resource>"
> Could it be a new bug introduced in pacemaker, or could it be some configuration problem (The status is completely clean however)?
> According to the package changelog, there was no change since Nov 2016...