[ClusterLabs] Antw: Re: Antw: [EXT] VIP monitor Timed Out
Ulrich Windl
Ulrich.Windl at rz.uni-regensburg.de
Mon Jul 5 04:05:42 EDT 2021
>>> Klaus Wenninger <kwenning at redhat.com> wrote on 05.07.2021 at 09:13 in
message
<CALrDAo39EwprjNgrdrxDn7Gfnzn6a3u=NnCp7boszJ4rVP3h7g at mail.gmail.com>:
> Using DHCP? Maybe a glitch/issue during renewal ... but elaborate
> monitoring as suggested should show that ...
>
> On Mon, Jul 5, 2021 at 9:03 AM Ulrich Windl <
> Ulrich.Windl at rz.uni-regensburg.de> wrote:
>
>> Hi!
>>
>> See "ip_served" and "find_interface" (essentially "$IP2UTIL -o -f $FAMILY
>> addr show") in the RA.
>> Basically it searches _all_ interfaces for $ipaddr/$netmask to locate the
>> interface when it could also examine the interface and look at the address.
>> For many interfaces it could make a difference performance-wise IMHO.
>> Maybe so a periodic sampling how long the corresponding command takes for
s/so/do/ # Sorry!
>> your setup.
>> If it's not a timing issue, the interface may actually be gone temporarily,
>> or the tools could have bugs.
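A minimal sketch of such periodic sampling, assuming IPv4 (so $FAMILY would be
"inet") and that $IP2UTIL resolves to "ip" on your nodes. The function names
are made up for illustration, and the 20 s limit only mirrors the 20000ms
monitor timeout shown in the log further down:

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, timeout=20.0):
    """Run cmd once; return elapsed seconds, or None if it hit the timeout."""
    start = time.monotonic()
    try:
        subprocess.run(cmd, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL,
                       timeout=timeout, check=False)
    except subprocess.TimeoutExpired:
        return None
    return time.monotonic() - start


def sample(cmd, count=5, interval=1.0, timeout=20.0):
    """Time cmd `count` times, sleeping `interval` seconds between runs."""
    durations = []
    for i in range(count):
        durations.append(time_command(cmd, timeout))
        if i + 1 < count:
            time.sleep(interval)
    return durations


def summarize(durations):
    """Condense a list of durations (None = timed out) into a few numbers."""
    finished = [d for d in durations if d is not None]
    return {
        "samples": len(durations),
        "timeouts": len(durations) - len(finished),
        "max": max(finished) if finished else None,
        "median": statistics.median(finished) if finished else None,
    }


if __name__ == "__main__":
    # Assumed stand-in for "$IP2UTIL -o -f $FAMILY addr show" (IPv4 case);
    # pass another command on the command line to time that instead.
    cmd = sys.argv[1:] or ["ip", "-o", "-f", "inet", "addr", "show"]
    try:
        print(summarize(sample(cmd)))
    except FileNotFoundError:
        print(f"command not found: {cmd[0]}", file=sys.stderr)
```

Run from cron or a systemd timer and kept for a week or two, the "max" and
"timeouts" values should show whether the command itself ever gets close to
the monitor timeout.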
>>
>> Regards,
>> Ulrich
>>
>> >>> PASERO Florent <florent.pasero at externe.bnpparibas.com> wrote on
>> 01.07.2021 at 17:29 in message
>> <PR0P264MB21394030D5C5120BB885E95DB4009 at PR0P264MB2139.FRAP264.PROD.OUTLOOK.COM>:
>>
>> > Hi,
>> >
>> > Once or twice a week, we have a 'Timed out' on our VIP:
>> > ~$ pcs status
>> > Cluster name: zbx_pprod_Web_Core
>> > Cluster Summary:
>> > * Stack: corosync
>> > * Current DC: #####(version 2.0.5-9.el8_4.1-ba59be7122) - partition with quorum
>> > * Last updated: Mon Jun 28 16:32:09 2021
>> > * Last change: Mon Jun 14 12:42:57 2021 by root via cibadmin on ######
>> > * 2 nodes configured
>> > * 2 resource instances configured
>> >
>> > Node List:
>> > * Online: [ ##### #####]
>> >
>> > Full List of Resources:
>> > * Resource Group: zbx_pprod_Web_Core:
>> > * VIP (ocf::heartbeat:IPaddr2): Started #####
>> > * ZabbixServer (systemd:zabbix-server): Started ######
>> >
>> > Failed Resource Actions:
>> > * VIP_monitor_5000 on ##### 'error' (1): call=69, status='Timed Out', exitreason='', last-rc-change='2021-06-24 14:41:57 +02:00', queued=0ms, exec=0ms
>> > * VIP_monitor_5000 on ##### 'error' (1): call=11, status='Timed Out', exitreason='', last-rc-change='2021-06-17 14:18:20 +02:00', queued=0ms, exec=0ms
>> >
>> >
>> > We have the same issue on two completely different clusters.
>> >
>> > We can see in the log:
>> > Jun 24 14:41:29 ##### pacemaker-execd [1442069] (child_timeout_callback) warning: VIP_monitor_5000 process (PID 2752333) timed out
>> > Jun 24 14:41:34 #####pacemaker-execd [1442069] (child_timeout_callback) crit: VIP_monitor_5000 process (PID 2752333) will not die!
>> > Jun 24 14:41:57 ##### pacemaker-execd [1442069] (operation_finished) warning: VIP_monitor_5000[2752333] timed out after 20000ms
>> > Jun 24 14:41:57 ##### pacemaker-controld [1442072] (process_lrm_event) error: Result of monitor operation for VIP on #####: Timed Out | call=69 key=VIP_monitor_5000 timeout=20000ms
>> > Jun 24 14:41:57 ##### pacemaker-based [1442067] (cib_process_request) info: Forwarding cib_modify operation for section status to all (origin=local/crmd/722)
>> > Jun 24 14:41:57 ##### pacemaker-based [1442067] (cib_perform_op) info: Diff: --- 0.54.443 2
>> > Jun 24 14:41:57 ##### pacemaker-based [1442067] (cib_perform_op) info: Diff: +++ 0.54.444 (null)
>> > Jun 24 14:41:57 ##### pacemaker-based [1442067] (cib_perform_op) info: + /cib: @num_updates=444
>> >
>> >
>> > Thanks for help
>> >