[ClusterLabs] VIP monitor Timed Out

Strahil Nikolov hunter86_bg at yahoo.com
Sat Jul 3 15:51:12 EDT 2021


I would try adding 'trace_ra=1' or 'trace_ra=1 trace_file=<some_file>' to the resource to debug it further. With the first option (without trace_file), the trace file will be written to /var/lib/heartbeat/trace_ra/<resource>/*timestamp
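
A rough sketch of how that could look with pcs. The resource name VIP is taken from the status output quoted below; the helper names trace_on/trace_off and the PCS variable are made up for illustration. This assumes that pcs accepts trace_ra as a resource option (some versions may reject it as unknown unless --force is appended) and that setting an option to an empty value removes it again:

```shell
# Hypothetical helpers, not part of pcs. Set PCS="echo pcs" for a dry run
# that only prints the commands instead of changing the cluster.
PCS=${PCS:-pcs}

trace_on() {
    # $1 = resource id; any further arguments (for example --force) are passed through
    rsc=$1; shift
    $PCS resource update "$rsc" trace_ra=1 "$@"
}

trace_off() {
    # Assumption: an empty value removes the option again
    rsc=$1; shift
    $PCS resource update "$rsc" trace_ra= "$@"
}
```

Typical use would be 'trace_on VIP', waiting for the next timeout, looking under /var/lib/heartbeat/trace_ra/ for the trace of the failed monitor, and then 'trace_off VIP', since tracing a 5-second monitor produces a lot of files.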


Are you sure that the system is not overloaded and therefore unable to respond in time?
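
The timestamps in the log quoted below are worth lining up, because they show how long the monitor process stayed around after it timed out. A minimal sketch, assuming the three pacemaker-execd lines are pasted as they appear in the quoted log (hostnames already masked) and that all of them fall on the same day:

```shell
# The three pacemaker-execd lines from the quoted log.
log='Jun 24 14:41:29 ##### pacemaker-execd [1442069] (child_timeout_callback) warning: VIP_monitor_5000 process (PID 2752333) timed out
Jun 24 14:41:34 ##### pacemaker-execd [1442069] (child_timeout_callback) crit: VIP_monitor_5000 process (PID 2752333) will not die!
Jun 24 14:41:57 ##### pacemaker-execd [1442069] (operation_finished) warning: VIP_monitor_5000[2752333] timed out after 20000ms'

# Print each timestamp as an offset in seconds from the first line.
timeline=$(printf '%s\n' "$log" | awk '
{
    split($3, t, ":")                      # field 3 is HH:MM:SS
    secs = t[1] * 3600 + t[2] * 60 + t[3]  # seconds since midnight
    if (NR == 1) first = secs
    printf "+%ds %s\n", secs - first, $3
}')
printf '%s\n' "$timeline"
# +0s 14:41:29
# +5s 14:41:34
# +28s 14:41:57
```

So the process was still there 5 seconds after the timeout, and the operation was only reported as finished 28 seconds after it. A process that does not go away when killed is often blocked in the kernel, which would fit a stalled or overloaded system, although the log alone does not prove it.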

 
Best Regards,
Strahil Nikolov






On Friday, 2 July 2021, 17:53:06 GMT+3, PASERO Florent <florent.pasero at externe.bnpparibas.com> wrote:

Hi,

 

Once or twice a week, the monitor operation on our VIP fails with 'Timed Out':

~$ pcs status
Cluster name: zbx_pprod_Web_Core
Cluster Summary:
  * Stack: corosync
  * Current DC: ##### (version 2.0.5-9.el8_4.1-ba59be7122) - partition with quorum
  * Last updated: Mon Jun 28 16:32:09 2021
  * Last change:  Mon Jun 14 12:42:57 2021 by root via cibadmin on ######
  * 2 nodes configured
  * 2 resource instances configured

Node List:
  * Online: [ ##### ##### ]

Full List of Resources:
  * Resource Group: zbx_pprod_Web_Core:
    * VIP       (ocf::heartbeat:IPaddr2):        Started #####
    * ZabbixServer      (systemd:zabbix-server):         Started ######

Failed Resource Actions:
  * VIP_monitor_5000 on ##### 'error' (1): call=69, status='Timed Out', exitreason='', last-rc-change='2021-06-24 14:41:57 +02:00', queued=0ms, exec=0ms
  * VIP_monitor_5000 on ##### 'error' (1): call=11, status='Timed Out', exitreason='', last-rc-change='2021-06-17 14:18:20 +02:00', queued=0ms, exec=0ms

 

 

We have the same issue on two completely different clusters.

 

We can see the following in the log:

Jun 24 14:41:29 ##### pacemaker-execd     [1442069] (child_timeout_callback)     warning: VIP_monitor_5000 process (PID 2752333) timed out
Jun 24 14:41:34 ##### pacemaker-execd     [1442069] (child_timeout_callback)     crit: VIP_monitor_5000 process (PID 2752333) will not die!
Jun 24 14:41:57 ##### pacemaker-execd     [1442069] (operation_finished)         warning: VIP_monitor_5000[2752333] timed out after 20000ms
Jun 24 14:41:57 ##### pacemaker-controld  [1442072] (process_lrm_event)  error: Result of monitor operation for VIP on #####: Timed Out | call=69 key=VIP_monitor_5000 timeout=20000ms
Jun 24 14:41:57 ##### pacemaker-based     [1442067] (cib_process_request)        info: Forwarding cib_modify operation for section status to all (origin=local/crmd/722)
Jun 24 14:41:57 ##### pacemaker-based     [1442067] (cib_perform_op)     info: Diff: --- 0.54.443 2
Jun 24 14:41:57 ##### pacemaker-based     [1442067] (cib_perform_op)     info: Diff: +++ 0.54.444 (null)
Jun 24 14:41:57 ##### pacemaker-based     [1442067] (cib_perform_op)     info: +  /cib:  @num_updates=444

 

 

Thanks for your help.

 

