[ClusterLabs] monitor failed actions not cleared
    LE COQUIL Pierre-Yves 
    pierre-yves.lecoquil at enfrasys.fr
       
    Mon Sep 25 10:58:16 EDT 2017
    
    
  
Hi,
I am using Pacemaker 1.1.15-11.el7_3.4 / Corosync 2.4.0-4.el7 under CentOS 7.3.1611.
My question is very close to the post "clearing failed actions" started by Attila Megyeri in May 2017, but the issue discussed there doesn't fit my case.
What I want to do is:
- run 2 systemd resources on 1 of the 2 nodes of my cluster,
- when 1 resource fails (because I kill it or move it), have it restarted on the other node, while the other resource keeps running where it is.
What I have done in addition to the default parameters (see the command sketch after this list):
- for my resources:
  - migration-threshold=1
  - failure-timeout=PT1M
- for the cluster:
  - cluster-recheck-interval=120
- for the monitor operation of my resources: on-fail=restart (which is the default anyway)
I do not use fencing (stonith-enabled=false).
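For reference, this is roughly how I set it up with pcs; "svc1", "svc2" and the systemd unit names are placeholders, and the 30s monitor interval is just the value I happen to use:

    # two systemd resources, monitor with on-fail=restart (the default)
    pcs resource create svc1 systemd:my-service-1 \
        op monitor interval=30s on-fail=restart
    pcs resource create svc2 systemd:my-service-2 \
        op monitor interval=30s on-fail=restart
    # per-resource meta attributes
    pcs resource meta svc1 migration-threshold=1 failure-timeout=PT1M
    pcs resource meta svc2 migration-threshold=1 failure-timeout=PT1M
    # cluster-wide properties
    pcs property set cluster-recheck-interval=120
    pcs property set stonith-enabled=false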
What happens:
- when I kill or move 1 resource, it is restarted on the other node => OK
- the failcount is incremented to 1 for that resource => OK
- the failcount is never cleared, even after failure-timeout has expired => NOK (checked with the commands shown below)
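For what it's worth, here is how I check the failcount (svc1 is again a placeholder resource name); a manual "pcs resource cleanup" does clear it, but I expected failure-timeout together with cluster-recheck-interval to expire it automatically:

    # show the current fail counts (crm_mon -f displays them as well)
    pcs resource failcount show svc1
    # manual workaround that does clear the failcount
    pcs resource cleanup svc1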
I know my English and my Pacemaker knowledge are limited, but could you please give me some explanation of this behaviour that I am misunderstanding?
Thanks
Pierre-Yves Le Coquil