<div dir="ltr">Hi<div><br></div><div>I am having a problem with a simple Active/Passive cluster which consists of the following configuration:<br clear="all"><div><br></div><div><div>Cluster Name: kamcluster</div><div>Corosync Nodes:</div><div> kam1vs3 kam2vs3</div><div>Pacemaker Nodes:</div><div> kam1vs3 kam2vs3</div><div><br></div><div>Resources:</div><div> Resource: ClusterIP (class=ocf provider=heartbeat type=IPaddr2)</div><div>  Attributes: ip=10.0.1.206 cidr_netmask=32</div><div>  Operations: start interval=0s timeout=20s (ClusterIP-start-interval-0s)</div><div>              stop interval=0s timeout=20s (ClusterIP-stop-interval-0s)</div><div>              monitor interval=10s (ClusterIP-monitor-interval-10s)</div><div> Resource: ClusterIP2 (class=ocf provider=heartbeat type=IPaddr2)</div><div>  Attributes: ip=10.0.1.207 cidr_netmask=32</div><div>  Operations: start interval=0s timeout=20s (ClusterIP2-start-interval-0s)</div><div>              stop interval=0s timeout=20s (ClusterIP2-stop-interval-0s)</div><div>              monitor interval=10s (ClusterIP2-monitor-interval-10s)</div><div> Resource: rtpproxycluster (class=systemd type=rtpproxy)</div><div>  Operations: monitor interval=10s (rtpproxycluster-monitor-interval-10s)</div><div>              stop interval=0s on-fail=block (rtpproxycluster-stop-interval-0s)</div><div> Resource: kamailioetcfs (class=ocf provider=heartbeat type=Filesystem)</div><div>  Attributes: device=/dev/drbd1 directory=/etc/kamailio fstype=ext4</div><div>  Operations: start interval=0s timeout=60 (kamailioetcfs-start-interval-0s)</div><div>              monitor interval=10s on-fail=fence (kamailioetcfs-monitor-interval-10s)</div><div>              stop interval=0s on-fail=fence (kamailioetcfs-stop-interval-0s)</div><div> Clone: fence_kam2_xvm-clone</div><div>  Meta Attrs: interleave=true clone-max=2 clone-node-max=1</div><div>  Resource: fence_kam2_xvm 
(class=stonith type=fence_xvm)</div><div>   Attributes: port=tegamjg_kam2 pcmk_host_list=kam2vs3</div><div>   Operations: monitor interval=60s (fence_kam2_xvm-monitor-interval-60s)</div><div> Master: kamailioetcclone</div><div>  Meta Attrs: master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true on-fail=fence</div><div>  Resource: kamailioetc (class=ocf provider=linbit type=drbd)</div><div>   Attributes: drbd_resource=kamailioetc</div><div>   Operations: start interval=0s timeout=240 (kamailioetc-start-interval-0s)</div><div>               promote interval=0s on-fail=fence (kamailioetc-promote-interval-0s)</div><div>               demote interval=0s on-fail=fence (kamailioetc-demote-interval-0s)</div><div>               stop interval=0s on-fail=fence (kamailioetc-stop-interval-0s)</div><div>               monitor interval=10s (kamailioetc-monitor-interval-10s)</div><div> Clone: fence_kam1_xvm-clone</div><div>  Meta Attrs: interleave=true clone-max=2 clone-node-max=1</div><div>  Resource: fence_kam1_xvm (class=stonith type=fence_xvm)</div><div>   Attributes: port=tegamjg_kam1 pcmk_host_list=kam1vs3</div><div>   Operations: monitor interval=60s (fence_kam1_xvm-monitor-interval-60s)</div><div> Resource: kamailiocluster (class=ocf provider=heartbeat type=kamailio)</div><div>  Attributes: listen_address=10.0.1.206 conffile=/etc/kamailio/kamailio.cfg pidfile=/var/run/kamailio.pid monitoring_ip=10.0.1.206 monitoring_ip2=10.0.1.207 port=5060 proto=udp kamctlrc=/etc/kamailio/kamctlrc shmem=128 pkg=8</div><div>  Meta Attrs: target-role=Stopped</div><div>  Operations: start interval=0s timeout=60 (kamailiocluster-start-interval-0s)</div><div>              stop interval=0s timeout=30 
(kamailiocluster-stop-interval-0s)</div><div>              monitor interval=5s (kamailiocluster-monitor-interval-5s)</div><div><br></div><div>Stonith Devices:</div><div>Fencing Levels:</div><div><br></div><div>Location Constraints:</div><div>Ordering Constraints:</div><div>  start fence_kam1_xvm-clone then start fence_kam2_xvm-clone (kind:Mandatory) (id:order-fence_kam1_xvm-clone-fence_kam2_xvm-clone-mandatory)</div><div>  start fence_kam2_xvm-clone then promote kamailioetcclone (kind:Mandatory) (id:order-fence_kam2_xvm-clone-kamailioetcclone-mandatory)</div><div>  promote kamailioetcclone then start kamailioetcfs (kind:Optional) (id:order-kamailioetcclone-kamailioetcfs-Optional)</div><div>  Resource Sets:</div><div>    set kamailioetcfs sequential=true (id:pcs_rsc_set_kamailioetcfs) set ClusterIP ClusterIP2 sequential=false (id:pcs_rsc_set_ClusterIP_ClusterIP2) set rtpproxycluster kamailiocluster sequential=true (id:pcs_rsc_set_rtpproxycluster_kamailiocluster) (id:pcs_rsc_order_set_kamailioetcfs_set_ClusterIP_ClusterIP2_set_rtpproxycluster_kamailiocluster)</div><div>Colocation Constraints:</div><div>  rtpproxycluster with ClusterIP2 (score:INFINITY) (id:colocation-rtpproxycluster-ClusterIP2-INFINITY)</div><div>  ClusterIP2 with ClusterIP (score:INFINITY) 
(id:colocation-ClusterIP2-ClusterIP-INFINITY)</div><div>  ClusterIP with kamailioetcfs (score:INFINITY) (id:colocation-ClusterIP-kamailioetcfs-INFINITY)</div><div>  kamailioetcfs with kamailioetcclone (score:INFINITY) (with-rsc-role:Master) (id:colocation-kamailioetcfs-kamailioetcclone-INFINITY)</div><div>  fence_kam2_xvm-clone with fence_kam1_xvm-clone (score:INFINITY) (id:colocation-fence_kam2_xvm-clone-fence_kam1_xvm-clone-INFINITY)</div><div>  kamailioetcclone with fence_kam2_xvm-clone (score:INFINITY) (id:colocation-kamailioetcclone-fence_kam2_xvm-clone-INFINITY)</div><div>  kamailiocluster with rtpproxycluster (score:INFINITY) (id:colocation-kamailiocluster-rtpproxycluster-INFINITY)</div><div><br></div><div>Resources Defaults:</div><div> migration-threshold: 2</div><div> failure-timeout: 10m</div><div> resource-stickiness: 200</div><div>Operations Defaults:</div><div> No defaults set</div><div><br></div><div>Cluster Properties:</div><div> cluster-infrastructure: corosync</div><div> cluster-name: kamcluster</div><div> dc-version: 1.1.13-10.el7_2.2-44eb2dd</div><div> have-watchdog: false</div><div> last-lrm-refresh: 1471479940</div><div> no-quorum-policy: ignore</div><div> start-failure-is-fatal: false</div><div> stonith-action: reboot</div><div> stonith-enabled: true</div></div><div><br></div><div>Now my problem is that when the rtpproxy systemd resource starts or stops, I see a 2-second delay between the moment the start/stop request is sent to 
systemd and the confirmation and interpretation of the action by crmd. I am pretty sure the service itself does not take even half that time to start or stop. I did the same thing with a dummy service as a systemd resource, and it also takes 2 seconds to start/stop the resource. Also, this does not happen with any of the OCF resources. Here is what I see in the logs about this behaviour:</div><div><br></div><div>This is after doing "pcs resource restart rtpproxycluster"</div><div><br></div><div><div>Aug 17 20:59:18 [13187] kam1       crmd:   notice: te_rsc_command:      Initiating action 14: stop rtpproxycluster_stop_0 on kam1vs3 (local)</div><div>Aug 17 20:59:18 [13184] kam1       lrmd:     info: cancel_recurring_action:     Cancelling systemd operation rtpproxycluster_status_10000</div><div>Aug 17 20:59:18 [13187] kam1       crmd:     info: do_lrm_rsc_op:       Performing key=14:30:0:8a202722-ece2-4617-b26e-8d4aa5f3522b op=rtpproxycluster_stop_0</div><div>Aug 17 20:59:18 [13184] kam1       lrmd:     info: log_execute: executing - rsc:rtpproxycluster action:stop call_id:106</div><div>Aug 17 20:59:18 [13187] kam1       crmd:     info: process_lrm_event:   Operation rtpproxycluster_monitor_10000: Cancelled (node=kam1vs3, call=104, confirmed=true)</div><div>Aug 17 20:59:18 [13184] kam1       lrmd:     info: systemd_exec_result: Call to stop passed: /org/freedesktop/systemd1/job/8302</div><div>Aug 17 20:59:20 [13187] kam1       crmd:   notice: process_lrm_event:   Operation rtpproxycluster_stop_0: ok (node=kam1vs3, call=106, rc=0, cib-update=134, confirmed=true)</div><div>Aug 17 20:59:20 [13182] kam1        cib:     info: cib_perform_op:      +  /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='rtpproxycluster']/lrm_rsc_op[@id='rtpproxycluster_last_0']:  @operation_key=rtpproxycluster_stop_0, @operation=stop, @transition-key=14:30:0:8a202722-ece2-4617-b26e-8d4aa5f3522b, 
@transition-magic=0:0;14:30:0:8a202722-ece2-4617-b26e-8d4aa5f3522b, @call-id=106, @last-run=1471481958, @last-rc-change=1471481958, @exec-time=2116</div><div><br></div><div>Aug 17 20:59:20 [13186] kam1    pengine:     info: RecurringOp:  Start recurring monitor (10s) for rtpproxycluster on kam1vs3</div><div>Aug 17 20:59:20 [13186] kam1    pengine:   notice: LogActions:  Start   rtpproxycluster (kam1vs3)</div><div>Aug 17 20:59:20 [13187] kam1       crmd:   notice: te_rsc_command:      Initiating action 13: start rtpproxycluster_start_0 on kam1vs3 (local)</div><div>Aug 17 20:59:20 [13187] kam1       crmd:     info: do_lrm_rsc_op:       Performing key=13:31:0:8a202722-ece2-4617-b26e-8d4aa5f3522b op=rtpproxycluster_start_0</div><div>Aug 17 20:59:20 [13184] kam1       lrmd:     info: log_execute: executing - rsc:rtpproxycluster action:start call_id:107</div><div>Aug 17 20:59:21 [13184] kam1       lrmd:     info: systemd_exec_result: Call to start passed: /org/freedesktop/systemd1/job/8303</div><div>Aug 17 20:59:23 [13187] kam1       crmd:   notice: process_lrm_event:   Operation rtpproxycluster_start_0: ok (node=kam1vs3, call=107, rc=0, cib-update=136, confirmed=true)</div><div>Aug 17 20:59:23 [13182] kam1        cib:     info: cib_perform_op:      +  /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='rtpproxycluster']/lrm_rsc_op[@id='rtpproxycluster_last_0']:  @operation_key=rtpproxycluster_start_0, @operation=start, @transition-key=13:31:0:8a202722-ece2-4617-b26e-8d4aa5f3522b, @transition-magic=0:0;13:31:0:8a202722-ece2-4617-b26e-8d4aa5f3522b, @call-id=107, @last-run=1471481960, @last-rc-change=1471481960, @exec-time=2068</div></div><div><br></div><div>Why is this happening? Many of my resources depend on each other, so the total delay when a failover is needed adds up quickly (a full 4 seconds of stop/start delay just for that one systemd resource).</div><div><br></div><div>Is this normal? Is there any way to solve it? Do I really need to make an OCF resource agent for that service? (That would take time that I don't really have at the moment.)</div><div><br></div><div>Regards</div><div><br></div><div>Alejandro</div>
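
P.S. For reference, the exec-time values Pacemaker records in the CIB updates above are in milliseconds, so the two operations shown account for the whole observed restart cost:

```shell
# exec-time values taken from the log excerpts above (milliseconds)
stop_ms=2116    # rtpproxycluster_stop_0
start_ms=2068   # rtpproxycluster_start_0

# Total time Pacemaker charged to restarting this one systemd resource
echo "restart cost: $(( stop_ms + start_ms )) ms"   # prints: restart cost: 4184 ms
```

Note also the matching 2-second gap between each "systemd_exec_result: Call to ... passed" line and the corresponding "process_lrm_event: ... ok" line (20:59:18 → 20:59:20 for stop, 20:59:21 → 20:59:23 for start), which suggests the wait happens inside Pacemaker's systemd handling after the DBus job has already returned, not inside the service itself.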
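
P.P.S. For anyone weighing the same trade-off, a minimal OCF agent for rtpproxy could start from a skeleton like the one below. This is only a rough sketch: the binary path, pidfile location, parameter names (OCF_RESKEY_binary, OCF_RESKEY_pidfile) and the `-p` flag are assumptions, and a real agent would also need meta-data and validate-all actions.

```shell
#!/bin/sh
# Rough sketch of an OCF resource agent for rtpproxy. Paths, flags and
# parameter names are illustrative assumptions, not the real rtpproxy CLI.

# Use the OCF shell helpers when available; otherwise fall back to the
# standard OCF return codes so the sketch can be exercised by hand.
: "${OCF_FUNCTIONS_DIR:=/usr/lib/ocf/lib/heartbeat}"
if [ -r "$OCF_FUNCTIONS_DIR/ocf-shellfuncs" ]; then
    . "$OCF_FUNCTIONS_DIR/ocf-shellfuncs"
else
    OCF_SUCCESS=0; OCF_ERR_GENERIC=1; OCF_NOT_RUNNING=7
fi

RTPPROXY_BIN="${OCF_RESKEY_binary:-/usr/bin/rtpproxy}"   # assumed location
PIDFILE="${OCF_RESKEY_pidfile:-/var/run/rtpproxy.pid}"   # assumed location

rtpproxy_monitor() {
    # Running only if the pidfile names a live process
    [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null &&
        return "$OCF_SUCCESS"
    return "$OCF_NOT_RUNNING"
}

rtpproxy_start() {
    rtpproxy_monitor && return "$OCF_SUCCESS"    # idempotent start
    "$RTPPROXY_BIN" -p "$PIDFILE" || return "$OCF_ERR_GENERIC"
    rtpproxy_monitor
}

rtpproxy_stop() {
    rtpproxy_monitor || return "$OCF_SUCCESS"    # idempotent stop
    kill "$(cat "$PIDFILE")" 2>/dev/null
    rm -f "$PIDFILE"
    return "$OCF_SUCCESS"
}

# Dispatch only when invoked with an action argument
if [ $# -gt 0 ]; then
    case "$1" in
        start)   rtpproxy_start ;;
        stop)    rtpproxy_stop ;;
        monitor) rtpproxy_monitor ;;
        *)       exit 3 ;;   # OCF_ERR_UNIMPLEMENTED
    esac
fi
```

Installed under /usr/lib/ocf/resource.d/&lt;provider&gt;/ and made executable, an agent like this would report start/stop results as soon as the process state changes, the same way the IPaddr2 and Filesystem resources above do.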
</div></div>