[ClusterLabs] Fencing errors

Lopez, Francisco Javier [Global IT] franciscojavier.lopez at solera.com
Tue May 21 07:10:14 EDT 2019


Hello guys!

I need your help to understand and debug a problem I'm facing in one of my clusters.

I set up fencing as follows:

# pcs -f stonith_cfg stonith create fence_ao_pg01 fence_vmware_soap ipaddr=<IP> ssl_insecure=1 login="<User>" passwd="<Passwd>" pcmk_reboot_action=reboot pcmk_host_list="ao-pg01-p.axadmin.net" power_wait=3 op monitor interval=60s
# pcs -f stonith_cfg stonith create fence_ao_pg02 fence_vmware_soap ipaddr=<IP> ssl_insecure=1 login="<User>" passwd="<Passwd>" pcmk_reboot_action=reboot pcmk_host_list="ao-pg02-p.axadmin.net" power_wait=3 op monitor interval=60s

# pcs -f stonith_cfg constraint location fence_ao_pg01 avoids ao-pg01-p.axadmin.net=INFINITY
# pcs -f stonith_cfg constraint location fence_ao_pg02 avoids ao-pg02-p.axadmin.net=INFINITY

# pcs cluster cib-push stonith_cfg

pcs status shows everything OK for some time, and then it changes to:

[root@ao-pg01-p ~]# pcs status --full
Cluster name: ao_cl_p_01
Stack: corosync
Current DC: ao-pg01-p.axadmin.net (1) (version 1.1.19-8.el7_6.4-c3c624ea3d) - partition with quorum
Last updated: Tue May 21 12:18:46 2019
Last change: Fri May 17 18:54:32 2019 by hacluster via crmd on ao-pg01-p.axadmin.net

2 nodes configured
3 resources configured

Online: [ ao-pg01-p.axadmin.net (1) ao-pg02-p.axadmin.net (2) ]

Full list of resources:

 ao-cl-p-01-vip01    (ocf::heartbeat:IPaddr2):    Started ao-pg01-p.axadmin.net
 fence_ao_pg01    (stonith:fence_vmware_soap):    Stopped
 fence_ao_pg02    (stonith:fence_vmware_soap):    Stopped

Node Attributes:
* Node ao-pg01-p.axadmin.net (1):
* Node ao-pg02-p.axadmin.net (2):

Migration Summary:
* Node ao-pg02-p.axadmin.net (2):
   fence_ao_pg01: migration-threshold=1000000 fail-count=1000000 last-failure='Sat May 18 00:22:22 2019'
* Node ao-pg01-p.axadmin.net (1):
   fence_ao_pg02: migration-threshold=1000000 fail-count=1000000 last-failure='Fri May 17 20:52:53 2019'

Failed Actions:
* fence_ao_pg01_start_0 on ao-pg02-p.axadmin.net 'unknown error' (1): call=22, status=Timed Out, exitreason='',
    last-rc-change='Sat May 18 00:19:49 2019', queued=0ms, exec=20022ms
* fence_ao_pg02_start_0 on ao-pg01-p.axadmin.net 'unknown error' (1): call=84, status=Timed Out, exitreason='',
    last-rc-change='Fri May 17 20:52:33 2019', queued=0ms, exec=20032ms

PCSD Status:
  ao-pg02-p.axadmin.net: Online
  ao-pg01-p.axadmin.net: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled


From the output I can see that both fence device start operations ended with status 'Timed Out', but I'd like to understand
whether this is a configuration issue or something else I'm not aware of.
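
One detail that may help narrow this down: both exec values in the Failed Actions above are just over 20000 ms. Pacemaker's default operation timeout is 20s, so the start operations most likely ran into that limit rather than failing quickly. The sketch below is only an illustration, not part of the cluster setup: it copies the two Failed Actions entries into a variable and compares each exec time against an assumed 20000 ms limit. The variable names (failed_actions, timeout_ms, report) are made up for this example.

```shell
#!/bin/sh
# Sample input: the two "Failed Actions" entries copied from the pcs status output.
failed_actions="* fence_ao_pg01_start_0 on ao-pg02-p.axadmin.net 'unknown error' (1): call=22, status=Timed Out, exitreason='',
    last-rc-change='Sat May 18 00:19:49 2019', queued=0ms, exec=20022ms
* fence_ao_pg02_start_0 on ao-pg01-p.axadmin.net 'unknown error' (1): call=84, status=Timed Out, exitreason='',
    last-rc-change='Fri May 17 20:52:33 2019', queued=0ms, exec=20032ms"

# Assumption: no explicit start timeout was configured, so the 20s default applies.
timeout_ms=20000

# For each entry, print the operation name, its exec time, and whether it reached the limit.
report=$(printf '%s\n' "$failed_actions" | awk -v limit="$timeout_ms" '
    /^\* /  { op = $2 }                       # first line of an entry: remember the operation name
    /exec=/ {
        match($0, /exec=[0-9]+ms/)            # locate the exec=NNNNms field
        ms = substr($0, RSTART + 5, RLENGTH - 7) + 0
        printf "%s exec=%dms %s\n", op, ms, (ms >= limit ? "hit-timeout" : "below-timeout")
    }')
printf '%s\n' "$report"
```

If both operations are reported as having hit the limit, the next thing to check would be how long the fence agent takes to answer when run by hand against the same vCenter/ESXi address.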

I'm attaching the part of the log from 17 May that shows the problem.

Regards
Francisco Javier Lopez
IT System Engineer | Global IT
O: +34 619 728 249 | M: +34 619 728 249
franciscojavier.lopez at solera.com | Solera.com <https://www.solera.com/>
Audatex Datos, S.A. | Avda. de Bruselas, 36, Salida 16, A‑1 (Diversia), Alcobendas, Madrid, 28108, Spain


-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: Mail_21May.txt
URL: <https://lists.clusterlabs.org/pipermail/users/attachments/20190521/e4f0b6c1/attachment-0001.txt>

