[ClusterLabs] cluster log not unambiguous about state of VirtualDomains
Ken Gaillot
kgaillot at redhat.com
Wed Aug 3 12:29:51 EDT 2022
The "found ... active" messages mean that the resource was active on that
node at some point, not necessarily that it is active there now. Newer
versions log much clearer messages, like:
info: Probe found rsc1 active on node1 at Aug 1 15:41:34 2022
so you can see that it is a historical result. The later "Started" messages
show where the cluster believes the resources are currently running.
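
To make the distinction concrete, here is a rough Python sketch that separates the historical "found ... active" results from the current "Started"/"Stopped" status lines and reports resources whose historical result names a different node. The regular expressions are assumptions modelled only on the message formats quoted in this thread, and the function names (summarize, stale_results) are made up for illustration; other Pacemaker versions word these messages differently.

```python
import re

# Assumed patterns, based only on the log lines quoted in this thread.
FOUND_RE = re.compile(
    r"determine_op_status:\s+Operation monitor found resource (\S+) active on (\S+)"
)
STATE_RE = re.compile(
    r"common_print:\s+(\S+)\s+\(\S+\):\s+(Started|Stopped)(?:\s+([\w.-]+))?"
)


def summarize(lines):
    """Split log lines into historical results and current status."""
    historical = {}  # resource -> set of nodes where it was found active at some point
    current = {}     # resource -> (state, node or None)
    for line in lines:
        m = FOUND_RE.search(line)
        if m:
            historical.setdefault(m.group(1), set()).add(m.group(2))
            continue
        m = STATE_RE.search(line)
        if m:
            current[m.group(1)] = (m.group(2), m.group(3))
    return historical, current


def stale_results(historical, current):
    """Return resources with a historical 'active' result on a node
    other than the one where the cluster currently reports them started."""
    stale = {}
    for rsc, nodes in historical.items():
        if rsc not in current:
            continue  # e.g. clone instances such as dlm:1 have no matching status line here
        state, node = current[rsc]
        current_node = node if state == "Started" else None
        others = sorted(n for n in nodes if n != current_node)
        if others:
            stale[rsc] = others
    return stale


PREFIX = "Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: "
SAMPLE = [
    PREFIX + "determine_op_status: Operation monitor found resource vm-mausdb active on ha-idg-1",
    PREFIX + "determine_op_status: Operation monitor found resource vm-seneca active on ha-idg-2",
    PREFIX + "common_print: vm-mausdb (ocf::lentes:VirtualDomain): Started ha-idg-1",
    PREFIX + "common_print: vm-seneca (ocf::lentes:VirtualDomain): Started ha-idg-1",
]

if __name__ == "__main__":
    hist, cur = summarize(SAMPLE)
    for rsc, nodes in stale_results(hist, cur).items():
        print(f"{rsc}: historical result on {', '.join(nodes)}, currently {cur[rsc]}")
```

With the sample lines, only vm-seneca is reported: it has an old "active on ha-idg-2" result, while the status line places it on ha-idg-1. That is the pattern marked with "<===" in the quoted log.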
On Wed, 2022-08-03 at 17:01 +0200, Lentes, Bernd wrote:
> Hi,
>
> I found some strange behaviour in the cluster log
> (/var/log/cluster/corosync.log).
> I KNOW that I put one node (ha-idg-2) in standby mode and stopped the
> pacemaker service on that node.
> The shell history says:
> 993 2022-08-02 18:28:25 crm node standby ha-idg-2
> 994 2022-08-02 18:28:58 systemctl stop pacemaker.service
>
> Later on I had some trouble with high load.
> I found contradictory entries in the log on the DC (ha-idg-1):
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: determine_op_status: Operation monitor found resource vm-documents-oo active on ha-idg-1
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: determine_op_status: Operation monitor found resource vm-documents-oo active on ha-idg-1
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: determine_op_status: Operation monitor found resource vm-mausdb active on ha-idg-1
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: determine_op_status: Operation monitor found resource vm-mausdb active on ha-idg-1
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: determine_op_status: Operation monitor found resource vm-photoshop active on ha-idg-1
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: determine_op_status: Operation monitor found resource vm-photoshop active on ha-idg-1
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: determine_op_status: Operation monitor found resource vm-encore active on ha-idg-1
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: determine_op_status: Operation monitor found resource vm-encore active on ha-idg-1
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: determine_op_status: Operation monitor found resource dlm:1 active on ha-idg-2
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: determine_op_status: Operation monitor found resource vm-seneca active on ha-idg-2 <===
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: determine_op_status: Operation monitor found resource vm-pathway active on ha-idg-2 <===
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: determine_op_status: Operation monitor found resource vm-dietrich active on ha-idg-2 <===
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: determine_op_status: Operation monitor found resource vm-sim active on ha-idg-2 <===
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: determine_op_status: Operation monitor found resource vm-ssh active on ha-idg-2 <===
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: determine_op_status: Operation monitor found resource vm-nextcloud active on ha-idg-2 <===
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: determine_op_status: Operation monitor found resource fs_ocfs2:1 active on ha-idg-2
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: determine_op_status: Operation monitor found resource gfs2_share:1 active on ha-idg-2
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: determine_op_status: Operation monitor found resource vm-geneious active on ha-idg-2 <===
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: determine_op_status: Operation monitor found resource gfs2_snap:1 active on ha-idg-2 <===
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: determine_op_status: Operation monitor found resource vm-geneious-license-mcd active on ha-idg-2 <===
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: determine_op_status: Operation monitor found resource clvmd:1 active on ha-idg-2
>
> The log says some VirtualDomains are running on ha-idg-2 !?!
>
> But just a few lines later the log says all VirtualDomains are
> running on ha-idg-1:
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: common_print: vm-mausdb (ocf::lentes:VirtualDomain): Started ha-idg-1
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: common_print: vm-sim (ocf::lentes:VirtualDomain): Started ha-idg-1 <===
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: common_print: vm-geneious (ocf::lentes:VirtualDomain): Started ha-idg-1 <===
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: common_print: vm-idcc-devel (ocf::lentes:VirtualDomain): Started ha-idg-1
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: common_print: vm-genetrap (ocf::lentes:VirtualDomain): Started ha-idg-1
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: common_print: vm-mouseidgenes (ocf::lentes:VirtualDomain): Started ha-idg-1
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: common_print: vm-greensql (ocf::lentes:VirtualDomain): Started ha-idg-1
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: common_print: vm-severin (ocf::lentes:VirtualDomain): Started ha-idg-1
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: common_print: ping_19216810010 (ocf::pacemaker:ping): Stopped (disabled)
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: common_print: ping_19216810020 (ocf::pacemaker:ping): Stopped (disabled)
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: common_print: vm_crispor (ocf::heartbeat:VirtualDomain): Stopped (unmanaged)
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: common_print: vm-dietrich (ocf::lentes:VirtualDomain): Started ha-idg-1 <===
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: common_print: vm-pathway (ocf::lentes:VirtualDomain): Started ha-idg-1 <===
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: common_print: vm-crispor-server (ocf::lentes:VirtualDomain): Started ha-idg-1
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: common_print: vm-geneious-license (ocf::lentes:VirtualDomain): Started ha-idg-1
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: common_print: vm-nextcloud (ocf::lentes:VirtualDomain): Started ha-idg-1 <===
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: common_print: vm-amok (ocf::lentes:VirtualDomain): Started ha-idg-1
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: common_print: vm-geneious-license-mcd (ocf::lentes:VirtualDomain): Started ha-idg-1 <===
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: common_print: vm-documents-oo (ocf::lentes:VirtualDomain): Started ha-idg-1
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: common_print: fs_test_ocfs2 (ocf::lentes:Filesystem.new): Started ha-idg-1
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: common_print: vm-ssh (ocf::lentes:VirtualDomain): Started ha-idg-1 <===
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: common_print: vm_snipanalysis (ocf::lentes:VirtualDomain): Stopped (disabled, unmanaged)
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: common_print: vm-seneca (ocf::lentes:VirtualDomain): Started ha-idg-1 <===
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: common_print: vm-photoshop (ocf::lentes:VirtualDomain): Started ha-idg-1
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: common_print: vm-check-mk (ocf::lentes:VirtualDomain): Started ha-idg-1
> Aug 03 00:14:04 [19367] ha-idg-1 pengine: info: common_print: vm-encore (ocf::lentes:VirtualDomain): Started ha-idg-1
>
> Why is the information contradictory?
>
>
> Bernd
--
Ken Gaillot <kgaillot at redhat.com>