[ClusterLabs] Antw: [EXT] Re: what is the point of pcs status error messages while the VIP is still up and service is retained?

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Fri Oct 8 02:07:45 EDT 2021


>>> Ken Gaillot <kgaillot at redhat.com> wrote on 07.10.2021 at 22:28 in message
<597cd05761a31365b34c6b349539478a5a8b8ced.camel at redhat.com>:
> On Thu, 2021-10-07 at 12:18 +0000, Ian Diddams wrote:
>> I'm trying to find out exactly what the impact/point is of cluster
>> error messages such as this one from running "pcs status":
>> 
>> 
>> mysql_VIP01_monitor_30000 on wp-vldyn-rafeiro 'unknown error' (1):
>> call=467, status=Timed Out, exitreason='',
>> last-rc-change='Mon Oct 4 17:04:07 2021', queued=0ms, exec=0ms
>> 
>> This error may be reported for several hours until "pcs resource
>> refresh" is run, at which point it of course just goes away.
> 
> It's a historical record, which is why the last-rc-change is listed. At
> that time, the IP monitor timed out. The cluster would have reacted as
> configured, most likely restarting the IP.

Actually that's a bit confusing:
I thought that once the monitor failed, the cluster took some action based on
that, but would then neither run nor monitor the resource on that node until
the monitor failure had been cleaned up.
For everyone: I only recently learned that when using crm_resource -C to clean
up a specific action (-n action), you _must_ also specify its interval with
-I#; otherwise nothing is cleaned up, even though the messages printed may
suggest otherwise. The manual page explains this rather vaguely.
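
To make the -n/-I pairing concrete, here is a minimal Python sketch that takes
the failed-action record quoted above and derives a matching cleanup command
from it. It only builds and prints the command; it does not run anything. The
helper names (parse_failed_action, build_cleanup_command) are made up for this
example, the -r and -N options are as I recall them from crm_resource(8), and
passing the interval as "<n>ms" is an assumption, so please check all of that
against the manual page for your Pacemaker version.

```python
import re
import shlex

# Failed-action record as quoted earlier in this thread, joined onto one line.
RECORD = (
    "mysql_VIP01_monitor_30000 on wp-vldyn-rafeiro 'unknown error' (1): "
    "call=467, status=Timed Out, exitreason='', "
    "last-rc-change='Mon Oct 4 17:04:07 2021', queued=0ms, exec=0ms"
)

# The record starts with "<resource>_<action>_<interval in ms> on <node>".
PATTERN = re.compile(
    r"^(?P<resource>\S+)_(?P<action>[a-z-]+)_(?P<interval_ms>\d+)"
    r" on (?P<node>\S+)"
)


def parse_failed_action(record):
    """Split the leading part of a failed-action record into its fields."""
    match = PATTERN.match(record)
    if match is None:
        raise ValueError("not a recognised failed-action record")
    fields = match.groupdict()
    fields["interval_ms"] = int(fields["interval_ms"])
    return fields


def build_cleanup_command(fields):
    """Build (but do not run) a crm_resource cleanup for one failed action."""
    return [
        "crm_resource", "-C",
        "-r", fields["resource"],            # resource to clean up
        "-N", fields["node"],                # node the failure was recorded on
        "-n", fields["action"],              # action name, e.g. "monitor"
        # Interval of that action; the "ms" unit suffix is an assumption.
        "-I", f"{fields['interval_ms']}ms",
    ]


fields = parse_failed_action(RECORD)
print(shlex.join(build_cleanup_command(fields)))
# prints: crm_resource -C -r mysql_VIP01 -N wp-vldyn-rafeiro -n monitor -I 30000ms
```

The point is simply that the action name and the interval both come from the
same operation key (mysql_VIP01_monitor_30000), so they belong together on the
command line.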

> 
>> 
>> This server is also reporting as the master at the same time, and
>> other (xymon) monitoring checks showed:
>> 
>> * the VIP as "up" (conn check remained green, i.e. it was pingable)
>> * a mysql connection via the VIP was successful all the time (test
>> run every minute from another system: "mysqlshow -h<VIP> -uroot
>> -p<redacted> mysql")
>> 
>> So I'm not getting what the point of the error is, given that it
>> doesn’t seem to actually affect the service and is just a visual
>> obfuscation that doesn’t seem to help?
> 
> Pacemaker is just reporting that the IP agent did not respond within
> its timeout. Whether the IP was actually affected and why that happened
> is unfortunately not something it can figure out.
> -- 
> Ken Gaillot <kgaillot at redhat.com>