[ClusterLabs] dependency of pacemaker resources starting

lkxjtu lkxjtu at 163.com
Wed Oct 25 11:10:55 EDT 2017


My question is about Pacemaker. Suppose a Pacemaker cluster has two resources, both using OCF resource agents. One resource is starting (its OCF start function has been called) and needs, say, 1 minute to finish. If the monitor of the other resource fails during that minute, Pacemaker does not immediately call stop/start to restart it; it waits until the first resource has finished starting. Only after the first resource has started completely does the second resource begin restarting.
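
To make the timing concrete, here is a minimal sketch of the behavior as I observe it. It is only a model of the observation, not Pacemaker's actual scheduling logic. The function names are made up for illustration, and the 12-second failure time is a hypothetical value, not a measurement.

```python
def restart_begin_time(start_began, start_duration, monitor_failed_at):
    """Time at which recovery of the second resource begins.

    Assumed model: recovery cannot begin before the first resource's
    in-progress start operation has finished.
    """
    start_done = start_began + start_duration
    return max(monitor_failed_at, start_done)


def recovery_delay(start_began, start_duration, monitor_failed_at):
    """Seconds between the monitor failure and the beginning of stop/start."""
    begin = restart_begin_time(start_began, start_duration, monitor_failed_at)
    return begin - monitor_failed_at


if __name__ == "__main__":
    # First resource begins starting at t=0 and needs 60 s (the "1 minute" case).
    # Second resource's monitor fails at t=12 s (hypothetical value).
    print(recovery_delay(0, 60, 12))
    # A failure after the start has finished is handled with no extra delay.
    print(recovery_delay(0, 60, 70))
```

What I expected instead is a delay of 0 in both cases, since the two resources have no constraints between them.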
 
My cluster version: corosync 2.3.4 pacemaker 1.1.13
 
Configuration:
# crm configure show
node 168002177: 192.168.2.177
node 168002178: 192.168.2.178
node 168002179: 192.168.2.179
primitive fm_mgt fm_mgt \
        op monitor interval=20s timeout=120s \
        op stop interval=0 timeout=120s on-fail=restart \
        op start interval=0 timeout=120s on-fail=restart \
        meta target-role=Started
primitive logserver logserver \
        op monitor interval=20s timeout=120s \
        op stop interval=0 timeout=120s on-fail=restart \
        op start interval=0 timeout=120s on-fail=restart \
        meta target-role=Started
clone fm_mgt_replica fm_mgt
clone logserver_replica logserver
property cib-bootstrap-options: \
        have-watchdog=false \
        dc-version=1.1.13-10.el7-44eb2dd \
        cluster-infrastructure=corosync \
        stonith-enabled=false \
        start-failure-is-fatal=false
 
When I kill the fm_mgt service on node 177, Pacemaker stops and starts it immediately after the first monitor failure. But if I then kill the logserver service while fm_mgt is still recovering, Pacemaker does not restart it after its monitor fails; it waits until fm_mgt has recovered completely. There are 48 seconds between the logserver monitor failure and the stop/start action.
 
# crm status
Last updated: Thu Oct 26 06:40:24 2017          Last change: Thu Oct 26     06:36:33 2017 by root via crm_resource on 192.168.2.177
Stack: corosync
Current DC: 192.168.2.179 (version 1.1.13-10.el7-44eb2dd) - partition with quorum
3 nodes and 6 resources configured
Online: [ 192.168.2.177 192.168.2.178 192.168.2.179 ]
Full list of resources:
 Clone Set: logserver_replica [logserver]
     logserver  (ocf::heartbeat:logserver):     FAILED 192.168.2.177
     Started: [ 192.168.2.178 192.168.2.179 ]
 Clone Set: fm_mgt_replica [fm_mgt]
     Started: [ 192.168.2.178 192.168.2.179 ]
     Stopped: [ 192.168.2.177 ]
 
I am very confused. Is there something wrong with my configuration? Thank you very much!