[ClusterLabs] pacemaker-fenced /dev/shm errors

d tbsky tbskyd at gmail.com
Tue Mar 28 01:11:50 EDT 2023


Ken Gaillot <kgaillot at redhat.com>
> I'm glad it's resolved, but for future reference, that does indicate a
> serious problem. It means the fencer is not accepting any requests, so
> any fencing attempts or even attempts to monitor a fencing device from
> that node will fail.
>

   That sounds like pacemaker-fenced became some kind of zombie.
For testing, I blocked the connection between the node and the IPMI fencing
device. The fencing resource stopped and reported an error like the one below:

Failed Resource Actions:
  * fence_ipmi start on c1.example.tw could not be executed (Timed
Out) because 'Fence agent did not complete in time' at Tue Mar 28
12:49:58 2023 after 20.004s

and it recovered when the connection was restored.
Does that mean fencing is still working?
If I see a message like "pacemaker-fenced[2405] is
unresponsive to ipc after 1 tries", does it mean a permanent failure, or did
the second try succeed, so that it no longer complains?
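
To keep an eye on this in the meantime, something like the following could
scan the log for that message and record the highest "tries" count seen per
daemon and PID. It is only a rough sketch: the pattern is taken from the single
message quoted above (the real log line may have extra prefixes), and the names
max_tries and sample are made up for illustration. It does not answer the
question itself, but it would show whether the count keeps climbing for the
same PID.

```python
import re

# Pattern derived only from the message quoted above; search() is used so that
# any leading timestamp/host prefix in a real log line is tolerated.
UNRESPONSIVE = re.compile(
    r"(?P<daemon>[\w-]+)\[(?P<pid>\d+)\] "
    r"is unresponsive to ipc after (?P<tries>\d+) tries"
)

def max_tries(lines):
    """Return {(daemon, pid): highest 'tries' count seen} for matching lines."""
    worst = {}
    for line in lines:
        m = UNRESPONSIVE.search(line)
        if m:
            key = (m["daemon"], int(m["pid"]))
            worst[key] = max(worst.get(key, 0), int(m["tries"]))
    return worst

# Hypothetical input: just the one message from this thread.
sample = [
    "pacemaker-fenced[2405] is unresponsive to ipc after 1 tries",
]
print(max_tries(sample))
```

In practice the lines would come from the Pacemaker log file or the journal
instead of the hard-coded sample list.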

