[ClusterLabs] why is node fenced ?
Chris Walker
cwalker at cray.com
Mon Aug 12 13:47:02 EDT 2019
When ha-idg-1 started Pacemaker around 17:43, it did not see ha-idg-2. For example:
Aug 09 17:43:05 [6318] ha-idg-1 pacemakerd: info: pcmk_quorum_notification: Quorum retained | membership=1320 members=1
After ~20s (the dc-deadtime parameter), ha-idg-2 was marked 'unclean' and STONITHed as part of startup fencing.
There is nothing in ha-idg-2's HA logs around 17:43 indicating that it saw ha-idg-1 either, so it appears that there was no communication at all between the two nodes.
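To scan a longer log for the same symptom, a small script can pull the member count out of quorum notification lines like the one quoted above. This is only a sketch: the function name parse_quorum_line and the regular expression are my own, derived from that single log line, and other Pacemaker versions may word the message differently.

```python
import re

# Pattern assumed from the one log line quoted in this message; it is not
# taken from Pacemaker documentation and may not match other versions.
QUORUM_RE = re.compile(
    r"pcmk_quorum_notification:\s+Quorum (?P<state>\w+)\s+\|\s+"
    r"membership=(?P<membership>\d+)\s+members=(?P<members>\d+)"
)

def parse_quorum_line(line):
    """Return (state, membership_id, member_count), or None if no match."""
    m = QUORUM_RE.search(line)
    if m is None:
        return None
    return m.group("state"), int(m.group("membership")), int(m.group("members"))

LINE = ("Aug 09 17:43:05 [6318] ha-idg-1 pacemakerd: info: "
        "pcmk_quorum_notification: Quorum retained | membership=1320 members=1")

state, membership, members = parse_quorum_line(LINE)
# members == 1 on a two-node cluster means this node sees only itself.
print(state, membership, members)  # prints: retained 1320 1
```

A member count of 1 right after startup on a two-node cluster is the hint that the peer was never seen.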
I'm not sure exactly why the nodes did not see one another, but there are indications of network issues around this time:
2019-08-09T17:42:16.427947+02:00 ha-idg-2 kernel: [ 1229.245533] bond1: now running without any active interface!
so perhaps that's related.
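To make the timing concrete, the gap between the two quoted log lines can be computed as below. This is a rough sketch with two assumptions: the Pacemaker log line carries no year or UTC offset, so 2019 and the same +02:00 offset as the syslog line are assumed, and both nodes' clocks are taken to be in sync.

```python
from datetime import datetime, timedelta, timezone

# Timestamp copied from the kernel message on ha-idg-2 quoted above.
bond_down = datetime.fromisoformat("2019-08-09T17:42:16.427947+02:00")

# "Aug 09 17:43:05" from the pacemakerd line on ha-idg-1. Year and UTC
# offset are assumptions (2019, +02:00), since the log line omits them.
cest = timezone(timedelta(hours=2))
members_one_seen = datetime(2019, 8, 9, 17, 43, 5, tzinfo=cest)

gap = (members_one_seen - bond_down).total_seconds()
print(f"bond1 lost its last interface {gap:.1f}s before the members=1 notification")
```

If bond1 carries the corosync traffic (which the logs quoted here do not confirm), an outage that close to the Pacemaker start would be consistent with the two nodes not seeing each other.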
HTH,
Chris
On 8/12/19, 12:09 PM, "Users on behalf of Lentes, Bernd" <users-bounces at clusterlabs.org on behalf of bernd.lentes at helmholtz-muenchen.de> wrote:
Hi,
Last Friday (9th of August) I had to install patches on my two-node cluster.
I put one of the nodes (ha-idg-2) into standby (crm node standby ha-idg-2), patched it, rebooted,
started the cluster (systemctl start pacemaker) again, and brought the node back online; everything was fine.
Then I wanted to do the same procedure with the other node (ha-idg-1).
I put it in standby, patched it, rebooted, and started Pacemaker again.
But then ha-idg-1 fenced ha-idg-2, saying the node was unclean.
I know that unclean nodes need to be shut down; that's logical.
But I don't know where the conclusion that the node is unclean comes from, or why it is unclean.
I searched the logs and didn't find any hint.
I put the syslog and the Pacemaker log on a Seafile share; I'd be very thankful if you would have a look.
https://hmgubox.helmholtz-muenchen.de/d/53a10960932445fb9cfe/
Here is the CLI history of the commands:
17:03:04 crm node standby ha-idg-2
17:07:15 zypper up (install updates on ha-idg-2)
17:17:30 systemctl reboot
17:25:21 systemctl start pacemaker.service
17:25:47 crm node online ha-idg-2
17:26:35 crm node standby ha-idg-1
17:30:21 zypper up (install updates on ha-idg-1)
17:37:32 systemctl reboot
17:43:04 systemctl start pacemaker.service
17:44:00 ha-idg-2 is fenced
Thanks.
Bernd
OS is SLES 12 SP4, pacemaker 1.1.19, corosync 2.3.6-9.13.1
--
Bernd Lentes
Systemadministration
Institut für Entwicklungsgenetik
Gebäude 35.34 - Raum 208
HelmholtzZentrum münchen
bernd.lentes at helmholtz-muenchen.de
phone: +49 89 3187 1241
phone: +49 89 3187 3827
fax: +49 89 3187 2294
http://www.helmholtz-muenchen.de/idg
Perfect is whoever makes no mistakes
So the dead are perfect