[ClusterLabs] Antw: [EXT] VIP monitor Timed Out

Klaus Wenninger kwenning at redhat.com
Mon Jul 5 03:13:30 EDT 2021


Are you using DHCP? There may be a glitch/issue during lease renewal, but
more elaborate monitoring, as suggested below, should show that.
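
A minimal sketch of such sampling, in case it helps. It assumes GNU date
(for %N) and that plain "ip" stands in for the RA's $IP2UTIL; the function
names and the defaults are made up for illustration and are not taken from
the RA:

```shell
#!/bin/sh
# Sketch only: time the address lookup that the monitor action relies on.
# Assumption: IP2UTIL/FAMILY mirror the variable names quoted from the RA;
# the defaults below are guesses for a typical IPv4 setup.
IP2UTIL="${IP2UTIL:-ip}"
FAMILY="${FAMILY:-inet}"

# Run the lookup once and print a timestamp, exit code and duration in ms.
sample_once() {
    start=$(date +%s%N)    # nanoseconds since epoch (GNU date)
    $IP2UTIL -o -f "$FAMILY" addr show >/dev/null 2>&1
    rc=$?
    end=$(date +%s%N)
    echo "$(date '+%F %T') rc=$rc elapsed_ms=$(( (end - start) / 1000000 ))"
}

# Repeat the sample every $1 seconds; $2 limits the number of samples
# (0 or unset means run until interrupted).
sample_loop() {
    interval="${1:-5}"
    count="${2:-0}"
    i=0
    while [ "$count" -eq 0 ] || [ "$i" -lt "$count" ]; do
        sample_once
        i=$((i + 1))
        sleep "$interval"
    done
}
```

Something like "sample_loop 5 >> /var/tmp/vip-timing.log" (the path is just
an example) left running over a week should show whether the lookup itself
ever gets anywhere near the 20000ms monitor timeout.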

On Mon, Jul 5, 2021 at 9:03 AM Ulrich Windl <Ulrich.Windl at rz.uni-regensburg.de> wrote:

> Hi!
>
> See "ip_served" and "find_interface" (essentially "$IP2UTIL -o -f $FAMILY
> addr show") in the RA.
> Basically it searches _all_ interfaces for $ipaddr/$netmask to locate the
> interface, when it could instead examine the interface directly and look
> at its addresses.
> With many interfaces that could make a difference performance-wise, IMHO.
> Maybe do a periodic sampling of how long the corresponding command takes
> on your setup.
> If it's not a timing issue, the interface may actually be gone temporarily,
> or the tools could have bugs.
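
The difference between scanning all interfaces and asking a single interface
can be sketched roughly as follows. These helpers are hypothetical and are
not the RA's code; they assume the usual one-line "ip -o" output, where the
interface name is the second field and addr/prefix is the fourth:

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not taken from the IPaddr2 RA.

# Read "ip -o -f inet addr show" style output on stdin and print the name
# of the first interface carrying the address $1 (given as addr/prefix).
iface_for_addr() {
    awk -v want="$1" '$3 ~ /^inet6?$/ && $4 == want { print $2; exit }'
}

# Search every interface on the host for the address.
scan_all() {
    ip -o -f inet addr show | iface_for_addr "$1"
}

# Look only at one known interface; $1 = addr/prefix, $2 = interface name.
scan_one() {
    ip -o -f inet addr show dev "$2" | iface_for_addr "$1"
}
```

Timing "scan_all" against "scan_one" on a host with many interfaces would
show whether the wider search matters in practice.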
>
> Regards,
> Ulrich
>
> >>> PASERO Florent <florent.pasero at externe.bnpparibas.com> wrote on
> 01.07.2021 at 17:29 in message
> <PR0P264MB21394030D5C5120BB885E95DB4009 at PR0P264MB2139.FRAP264.PROD.OUTLOOK.COM>:
>
> > Hi,
> >
> > Once or twice a week, we have a 'Timed out' on our VIP:
> > ~$ pcs status
> > Cluster name: zbx_pprod_Web_Core
> > Cluster Summary:
> >   * Stack: corosync
> >   * Current DC: #####(version 2.0.5-9.el8_4.1-ba59be7122) - partition with quorum
> >   * Last updated: Mon Jun 28 16:32:09 2021
> >   * Last change:  Mon Jun 14 12:42:57 2021 by root via cibadmin on ######
> >   * 2 nodes configured
> >   * 2 resource instances configured
> >
> > Node List:
> >   * Online: [ ##### #####]
> >
> > Full List of Resources:
> >   * Resource Group: zbx_pprod_Web_Core:
> >     * VIP       (ocf::heartbeat:IPaddr2):        Started #####
> >     * ZabbixServer      (systemd:zabbix-server):         Started ######
> >
> > Failed Resource Actions:
> >   * VIP_monitor_5000 on ##### 'error' (1): call=69, status='Timed Out', exitreason='', last-rc-change='2021-06-24 14:41:57 +02:00', queued=0ms, exec=0ms
> >   * VIP_monitor_5000 on ##### 'error' (1): call=11, status='Timed Out', exitreason='', last-rc-change='2021-06-17 14:18:20 +02:00', queued=0ms, exec=0ms
> >
> >
> > We have the same issue on two completely different clusters.
> >
> > We can see in the log:
> > Jun 24 14:41:29 ##### pacemaker-execd     [1442069] (child_timeout_callback)    warning: VIP_monitor_5000 process (PID 2752333) timed out
> > Jun 24 14:41:34 #####pacemaker-execd     [1442069] (child_timeout_callback)   crit: VIP_monitor_5000 process (PID 2752333) will not die!
> > Jun 24 14:41:57 ##### pacemaker-execd     [1442069] (operation_finished)    warning: VIP_monitor_5000[2752333] timed out after 20000ms
> > Jun 24 14:41:57 ##### pacemaker-controld  [1442072] (process_lrm_event) error: Result of monitor operation for VIP on #####: Timed Out | call=69 key=VIP_monitor_5000 timeout=20000ms
> > Jun 24 14:41:57 ##### pacemaker-based     [1442067] (cib_process_request)    info: Forwarding cib_modify operation for section status to all (origin=local/crmd/722)
> > Jun 24 14:41:57 ##### pacemaker-based     [1442067] (cib_perform_op) info: Diff: --- 0.54.443 2
> > Jun 24 14:41:57 ##### pacemaker-based     [1442067] (cib_perform_op) info: Diff: +++ 0.54.444 (null)
> > Jun 24 14:41:57 ##### pacemaker-based     [1442067] (cib_perform_op) info: +  /cib:  @num_updates=444
> >
> >
> > Thanks for your help
> >