[ClusterLabs] Why is node fenced ?
Lentes, Bernd
bernd.lentes at helmholtz-muenchen.de
Thu Oct 10 11:22:28 EDT 2019
Hi,
I have a two-node cluster running on SLES 12 SP4.
I did some testing on it.
I put one node (ha-idg-2) into standby; the other (ha-idg-1) got fenced a few minutes later because I made a mistake.
ha-idg-2 was DC. ha-idg-1 booted fresh and I started corosync/pacemaker on it.
It seems ha-idg-1 didn't find the DC after the cluster started, elected itself DC some seconds later,
and afterwards fenced ha-idg-2.
Oct 09 18:04:43 [9550] ha-idg-1 corosync notice [MAIN ] Corosync Cluster Engine ('2.3.6'): started and ready to provide service.
Oct 09 18:04:43 [9550] ha-idg-1 corosync info [MAIN ] Corosync built-in features: debug testagents augeas systemd pie relro bindnow
Oct 09 18:04:43 [9550] ha-idg-1 corosync notice [TOTEM ] Initializing transport (UDP/IP Multicast).
Oct 09 18:04:43 [9550] ha-idg-1 corosync notice [TOTEM ] Initializing transmit/receive security (NSS) crypto: aes256 hash: sha1
Oct 09 18:04:43 [9550] ha-idg-1 corosync notice [TOTEM ] The network interface [192.168.100.10] is now up.
Oct 09 18:05:06 [9565] ha-idg-1 crmd: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped (20000ms)
Oct 09 18:05:06 [9565] ha-idg-1 crmd: warning: do_log: Input I_DC_TIMEOUT received in state S_PENDING from crm_timer_popped
Oct 09 18:05:06 [9565] ha-idg-1 crmd: info: do_state_transition: State transition S_PENDING -> S_ELECTION | input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped
Oct 09 18:05:06 [9565] ha-idg-1 crmd: info: election_check: election-DC won by local node
Oct 09 18:05:06 [9565] ha-idg-1 crmd: info: do_log: Input I_ELECTION_DC received in state S_ELECTION from election_win_cb
Oct 09 18:05:06 [9565] ha-idg-1 crmd: notice: do_state_transition: State transition S_ELECTION -> S_INTEGRATION | input=I_ELECTION_DC cause=C_FSA_INTERNAL origin=election_win_cb
Oct 09 18:05:06 [9565] ha-idg-1 crmd: info: do_te_control: Registering TE UUID: f302e1d4-a1aa-4a3e-b9dd-71bd17047f82
Oct 09 18:05:06 [9565] ha-idg-1 crmd: info: set_graph_functions: Setting custom graph functions
Oct 09 18:05:06 [9565] ha-idg-1 crmd: info: do_dc_takeover: Taking over DC status for this partition
Oct 09 18:05:07 [9564] ha-idg-1 pengine: warning: stage6: Scheduling Node ha-idg-2 for STONITH
Oct 09 18:05:07 [9564] ha-idg-1 pengine: notice: LogNodeActions: * Fence (Off) ha-idg-2 'node is unclean'
Is my understanding correct?
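To make the sequence on ha-idg-1 easier to follow, the crmd state transitions can be pulled out of the excerpt with a short script. This is only a sketch: the regular expression is fitted to the line format quoted above (other Pacemaker versions may log differently), and the name `extract_transitions` is made up for illustration.

```python
import re

# Fitted to the crmd lines quoted in this post; not a general Pacemaker log parser.
TRANSITION = re.compile(
    r"^(?P<ts>\w{3} \d{2} \d{2}:\d{2}:\d{2}) \[\d+\] (?P<node>\S+) crmd: +\w+: "
    r"do_state_transition: State transition (?P<old>\S+) -> (?P<new>\S+) "
    r"\| input=(?P<input>\S+)"
)

def extract_transitions(lines):
    """Return (timestamp, node, old_state, new_state, input) per matching line."""
    found = []
    for line in lines:
        m = TRANSITION.match(line.strip())
        if m:
            found.append((m["ts"], m["node"], m["old"], m["new"], m["input"]))
    return found

# Three lines copied from the ha-idg-1 excerpt; the first one is not a transition.
sample = [
    "Oct 09 18:05:06 [9565] ha-idg-1 crmd: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped (20000ms)",
    "Oct 09 18:05:06 [9565] ha-idg-1 crmd: info: do_state_transition: State transition S_PENDING -> S_ELECTION | input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped",
    "Oct 09 18:05:06 [9565] ha-idg-1 crmd: notice: do_state_transition: State transition S_ELECTION -> S_INTEGRATION | input=I_ELECTION_DC cause=C_FSA_INTERNAL origin=election_win_cb",
]

for ts, node, old, new, trigger in extract_transitions(sample):
    print(f"{ts} {node}: {old} -> {new} ({trigger})")
```

Run against the excerpt, this lists S_PENDING -> S_ELECTION (triggered by I_DC_TIMEOUT) followed by S_ELECTION -> S_INTEGRATION (triggered by I_ELECTION_DC), both within the same second.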
In the log of ha-idg-2 I don't find anything for this period:
Oct 09 17:58:46 [12504] ha-idg-2 stonith-ng: info: cib_device_update: Device fence_ilo_ha-idg-2 has been disabled on ha-idg-2: score=-10000
Oct 09 17:58:51 [12503] ha-idg-2 cib: info: cib_process_ping: Reporting our current digest to ha-idg-2: 59c4cfb14defeafbeb3417e222242cd9 for 2.9506.36 (0x242b110 0)
Oct 09 18:00:42 [12508] ha-idg-2 crmd: info: throttle_send_command: New throttle mode: 0001 (was 0000)
Oct 09 18:01:12 [12508] ha-idg-2 crmd: info: throttle_check_thresholds: Moderate CPU load detected: 32.220001
Oct 09 18:01:12 [12508] ha-idg-2 crmd: info: throttle_send_command: New throttle mode: 0010 (was 0001)
Oct 09 18:01:42 [12508] ha-idg-2 crmd: info: throttle_send_command: New throttle mode: 0001 (was 0010)
Oct 09 18:02:42 [12508] ha-idg-2 crmd: info: throttle_send_command: New throttle mode: 0000 (was 0001)
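To put a number on the quiet period, the gap between the last ha-idg-2 entry above and the moment ha-idg-1 scheduled the fence can be computed from the two timestamps. A minimal sketch, assuming both lines are from the same day (only the HH:MM:SS field is parsed); the helper name `time_of_day` is hypothetical.

```python
from datetime import datetime

def time_of_day(log_line):
    """Parse the HH:MM:SS field (third whitespace-separated token) of a log line."""
    return datetime.strptime(log_line.split()[2], "%H:%M:%S")

# Both lines are copied from the excerpts in this post.
last_on_idg2 = "Oct 09 18:02:42 [12508] ha-idg-2 crmd: info: throttle_send_command: New throttle mode: 0000 (was 0001)"
fence_on_idg1 = "Oct 09 18:05:07 [9564] ha-idg-1 pengine: warning: stage6: Scheduling Node ha-idg-2 for STONITH"

# Assumption: same calendar day, so subtracting times of day is enough.
gap = time_of_day(fence_on_idg1) - time_of_day(last_on_idg2)
gap_seconds = int(gap.total_seconds())
print(f"No ha-idg-2 entries in the excerpt for {gap_seconds} s before the fence was scheduled")
```

For these two lines the gap comes out at 145 seconds.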
ha-idg-2 was fenced, and after a reboot I started corosync/pacemaker on it again:
Oct 09 18:29:05 [11795] ha-idg-2 corosync notice [MAIN ] Corosync Cluster Engine ('2.3.6'): started and ready to provide service.
Oct 09 18:29:05 [11795] ha-idg-2 corosync info [MAIN ] Corosync built-in features: debug testagents augeas systemd pie relro bindnow
Oct 09 18:29:05 [11795] ha-idg-2 corosync notice [TOTEM ] Initializing transport (UDP/IP Multicast).
Oct 09 18:29:05 [11795] ha-idg-2 corosync notice [TOTEM ] Initializing transmit/receive security (NSS) crypto: aes256 hash: sha1
What do the throttle lines mean?
Thanks.
Bernd
--
Bernd Lentes
Systemadministration
Institut für Entwicklungsgenetik
Gebäude 35.34 - Raum 208
HelmholtzZentrum münchen
bernd.lentes at helmholtz-muenchen.de
phone: +49 89 3187 1241
phone: +49 89 3187 3827
fax: +49 89 3187 2294
http://www.helmholtz-muenchen.de/idg
Perfect is he who makes no mistakes
So the dead are perfect