[ClusterLabs] Antw: Re: why is node fenced ?

Lentes, Bernd bernd.lentes at helmholtz-muenchen.de
Tue Aug 13 10:03:58 EDT 2019



----- On Aug 13, 2019, at 3:14 PM, Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de wrote:

> You said you booted the hosts sequentially. From the logs they were starting in
> parallel.
> 

No. The output of "last" says:
ha-idg-1: 
reboot   system boot  4.12.14-95.29-de Fri Aug  9 17:42 - 15:56 (3+22:14)

ha-idg-2:
reboot   system boot  4.12.14-95.29-de Fri Aug  9 18:08 - 15:58 (3+21:49)
root     pts/0        10.35.34.70      Fri Aug  9 17:24 - crash  (00:44)
(unknown :0           :0               Fri Aug  9 17:24 - crash  (00:44)
reboot   system boot  4.12.14-95.29-de Fri Aug  9 17:23 - 15:58 (3+22:34)
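
To make the boot order explicit, here is a minimal sketch that parses the reboot records above and lists them chronologically. The year is an assumption (last does not print it; 2019 is taken from the date of this thread), and the regular expression is only fitted to the lines shown here, not to every format last can produce.

```python
import re
from datetime import datetime

# Reboot records copied verbatim from the `last` output above.
LAST_OUTPUT = {
    "ha-idg-1": [
        "reboot   system boot  4.12.14-95.29-de Fri Aug  9 17:42 - 15:56 (3+22:14)",
    ],
    "ha-idg-2": [
        "reboot   system boot  4.12.14-95.29-de Fri Aug  9 18:08 - 15:58 (3+21:49)",
        "reboot   system boot  4.12.14-95.29-de Fri Aug  9 17:23 - 15:58 (3+22:34)",
    ],
}

# Assumption: `last` prints no year, so 2019 is taken from the thread date.
ASSUMED_YEAR = 2019

# Matches only the record layout shown above: kernel, weekday, month, day, HH:MM.
BOOT_RE = re.compile(
    r"^reboot\s+system boot\s+\S+\s+\w{3}\s+(\w{3})\s+(\d{1,2})\s+(\d{2}:\d{2})"
)


def boot_times(lines):
    """Return the boot times found in `last` reboot records (English month names)."""
    times = []
    for line in lines:
        match = BOOT_RE.match(line)
        if match:
            month, day, hhmm = match.groups()
            stamp = f"{ASSUMED_YEAR} {month} {day} {hhmm}"
            times.append(datetime.strptime(stamp, "%Y %b %d %H:%M"))
    return times


# Merge the records of both hosts and sort them by time.
events = sorted(
    (when, host) for host, lines in LAST_OUTPUT.items() for when in boot_times(lines)
)
for when, host in events:
    print(when.strftime("%H:%M"), host)

# Minutes between consecutive boots across the two hosts.
gaps = [(b[0] - a[0]).total_seconds() / 60 for a, b in zip(events, events[1:])]
print("gaps in minutes:", gaps)
```

With these records the boots are 19 and 26 minutes apart, which is what I mean by sequentially.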

>> This is the initialization of the bond1 on ha-idg-1 during boot.
>> 3 seconds later bond1 is fine:
>> 
>> 2019-08-09T17:42:19.299886+02:00 ha-idg-2 kernel: [ 1232.117470] tg3 0000:03:04.0 eth2: Link is up at 1000 Mbps, full duplex
>> 2019-08-09T17:42:19.299908+02:00 ha-idg-2 kernel: [ 1232.117482] tg3 0000:03:04.0 eth2: Flow control is on for TX and on for RX
>> 2019-08-09T17:42:19.315756+02:00 ha-idg-2 kernel: [ 1232.131565] tg3 0000:03:04.1 eth3: Link is up at 1000 Mbps, full duplex
>> 2019-08-09T17:42:19.315767+02:00 ha-idg-2 kernel: [ 1232.131568] tg3 0000:03:04.1 eth3: Flow control is on for TX and on for RX
>> 2019-08-09T17:42:19.351781+02:00 ha-idg-2 kernel: [ 1232.169386] bond1: link status definitely up for interface eth2, 1000 Mbps full duplex
>> 2019-08-09T17:42:19.351792+02:00 ha-idg-2 kernel: [ 1232.169390] bond1: making interface eth2 the new active one
>> 2019-08-09T17:42:19.352521+02:00 ha-idg-2 kernel: [ 1232.169473] bond1: first active interface up!
>> 2019-08-09T17:42:19.352532+02:00 ha-idg-2 kernel: [ 1232.169480] bond1: link status definitely up for interface eth3, 1000 Mbps full duplex
>> 
>> also on ha-idg-1:
>> 
>> 2019-08-09T17:42:19.168035+02:00 ha-idg-1 kernel: [  110.164250] tg3 0000:02:00.3 eth3: Link is up at 1000 Mbps, full duplex
>> 2019-08-09T17:42:19.168050+02:00 ha-idg-1 kernel: [  110.164252] tg3 0000:02:00.3 eth3: Flow control is on for TX and on for RX
>> 2019-08-09T17:42:19.168052+02:00 ha-idg-1 kernel: [  110.164254] tg3 0000:02:00.3 eth3: EEE is disabled
>> 2019-08-09T17:42:19.172020+02:00 ha-idg-1 kernel: [  110.171378] tg3 0000:02:00.2 eth2: Link is up at 1000 Mbps, full duplex
>> 2019-08-09T17:42:19.172028+02:00 ha-idg-1 kernel: [  110.171380] tg3 0000:02:00.2 eth2: Flow control is on for TX and on for RX
>> 2019-08-09T17:42:19.172029+02:00 ha-idg-1 kernel: [  110.171382] tg3 0000:02:00.2 eth2: EEE is disabled
>>  ...
>> 2019-08-09T17:42:19.244066+02:00 ha-idg-1 kernel: [  110.240310] bond1: link status definitely up for interface eth2, 1000 Mbps full duplex
>> 2019-08-09T17:42:19.244083+02:00 ha-idg-1 kernel: [  110.240311] bond1: making interface eth2 the new active one
>> 2019-08-09T17:42:19.244085+02:00 ha-idg-1 kernel: [  110.240353] bond1: first active interface up!
>> 2019-08-09T17:42:19.244087+02:00 ha-idg-1 kernel: [  110.240356] bond1: link status definitely up for interface eth3, 1000 Mbps full duplex
>> 
>> And the cluster is started afterwards on ha-idg-1 at 17:43:04. I don't find
>> further entries for problems with bond1. So I think it's not related.
>> Time is synchronized by ntp.
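
The quoted kernel lines also allow a rough cross-check of the boot times: each line carries a syslog timestamp and the kernel uptime in brackets, so timestamp minus uptime approximates when the kernel started. The following is a minimal sketch using only the first quoted kernel line of each host. It assumes the bracketed value is seconds since kernel start and that it has not drifted noticeably against the wall clock, so the result is an estimate, not an exact boot time.

```python
from datetime import datetime, timedelta

# (syslog timestamp, kernel uptime in seconds) from the first quoted
# kernel line of each host.
SAMPLES = {
    "ha-idg-2": ("2019-08-09T17:42:19.299886+02:00", 1232.117470),
    "ha-idg-1": ("2019-08-09T17:42:19.168035+02:00", 110.164250),
}


def estimated_kernel_start(stamp, uptime_s):
    """Approximate kernel start: wall-clock log time minus kernel uptime."""
    return datetime.fromisoformat(stamp) - timedelta(seconds=uptime_s)


starts = {host: estimated_kernel_start(*sample) for host, sample in SAMPLES.items()}

# Print the hosts in the order in which their kernels started.
for host, when in sorted(starts.items(), key=lambda item: item[1]):
    print(host, when.isoformat(timespec="seconds"))

gap = starts["ha-idg-1"] - starts["ha-idg-2"]
print(f"ha-idg-1 started about {gap.total_seconds() / 60:.1f} minutes after ha-idg-2")
```

This puts the kernel start of ha-idg-2 at roughly 17:21:47 and that of ha-idg-1 at roughly 17:40:29, so the two starts are almost 19 minutes apart.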

The two bonding devices (bond1) are connected directly (point-to-point).
So if eth2 or eth3, the interfaces used for the bond, come up on one host,
the other host sees the link come up at the same moment.
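
As an illustration, here is a minimal sketch that compares the "Link is up" timestamps of both hosts from the quoted logs. It assumes the clocks really agree (ntp) and that the syslog timestamp is close to the actual link event. Which port is cabled to which is not known from the logs, so only the earliest link-up per host is compared.

```python
from datetime import datetime

# "Link is up" syslog timestamps per host and interface, from the quoted logs.
LINK_UP = {
    "ha-idg-1": {
        "eth3": "2019-08-09T17:42:19.168035+02:00",
        "eth2": "2019-08-09T17:42:19.172020+02:00",
    },
    "ha-idg-2": {
        "eth2": "2019-08-09T17:42:19.299886+02:00",
        "eth3": "2019-08-09T17:42:19.315756+02:00",
    },
}


def first_link_up(ifaces):
    """Earliest link-up time among the given interfaces."""
    return min(datetime.fromisoformat(stamp) for stamp in ifaces.values())


first = {host: first_link_up(ifaces) for host, ifaces in LINK_UP.items()}

# Difference between the first link-up seen on each side, in milliseconds.
skew_ms = abs(first["ha-idg-2"] - first["ha-idg-1"]).total_seconds() * 1000
print(f"first link-up differs by {skew_ms:.1f} ms between the hosts")
```

The difference is well below one second, which fits a direct connection where both ends see the link come up together.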


Bernd
 

Helmholtz Zentrum Muenchen
Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
Ingolstaedter Landstr. 1
85764 Neuherberg
www.helmholtz-muenchen.de
Aufsichtsratsvorsitzende: MinDir'in Prof. Dr. Veronika von Messling
Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Heinrich Bassler, Kerstin Guenther
Registergericht: Amtsgericht Muenchen HRB 6466
USt-IdNr: DE 129521671


