[Pacemaker] Decreasing failover time when running DRBD+OCFS2+XEN in dual primary mode

Thu Jun 12 19:52:57 EDT 2014

On 12 Jun 2014, at 9:15 pm, kamal kishi <kamal.kishi at gmail.com> wrote:

> Hi All,
> 
> This might be a basic question, but I'm not sure what is taking so long during failover.
> I hope someone can help figure it out.

How about looking in the logs and seeing when the various stop/start actions occur and which ones take the longest?
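
As a rough sketch of that approach (not an official Pacemaker tool), the script below lists the largest time gaps between consecutive log lines that mention an action keyword. It assumes syslog-style timestamps such as "Jun 12 19:52:57" at the start of each line; the function names, the keyword list and the default year are all illustrative assumptions to adapt to your own logs.

```python
import re
from datetime import datetime

# Assumption: each log line starts with a syslog-style timestamp such as
# "Jun 12 19:52:57". Adjust TS_PATTERN and TS_FORMAT if your logs differ.
TS_PATTERN = re.compile(r"^([A-Z][a-z]{2}\s+\d{1,2}\s\d{2}:\d{2}:\d{2})")
TS_FORMAT = "%Y %b %d %H:%M:%S"


def parse_timestamp(line, year=2014):
    """Return the datetime at the start of line, or None if there is none."""
    match = TS_PATTERN.match(line)
    if match is None:
        return None
    # Syslog timestamps omit the year, so one is supplied explicitly.
    # Collapse repeated spaces ("Jun  5") before parsing.
    stamp = " ".join(match.group(1).split())
    return datetime.strptime(f"{year} {stamp}", TS_FORMAT)


def largest_gaps(lines, keywords=("start", "stop", "promote"), top=5):
    """Return (seconds, earlier_line, later_line) tuples, biggest gap first."""
    events = []
    for line in lines:
        when = parse_timestamp(line)
        # Keep only timestamped lines that mention one of the keywords.
        if when is not None and any(k in line for k in keywords):
            events.append((when, line.rstrip()))
    gaps = []
    for (t1, l1), (t2, l2) in zip(events, events[1:]):
        gaps.append(((t2 - t1).total_seconds(), l1, l2))
    gaps.sort(key=lambda g: g[0], reverse=True)
    return gaps[:top]
```

To use it, pass an open log file (or any iterable of lines) from the surviving node to largest_gaps() and print the returned tuples. The pairs with the biggest gaps are where the failover is spending its time.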

> 
> Scenario - 
> Pacemaker running DRBD (dual-primary mode) + OCFS2 + Xen for a Windows virtual machine
> 
> Pacemaker starts the resources in this order -
> DRBD -> OCFS2 -> XEN
> Let's say that on Server1, DRBD, OCFS2 (clone) and XEN are started.
> 
> On Server2, DRBD and OCFS2 (clone) are started.
> 
> Now, if Server1 is powered off,
> 
> the XEN resource that was running on Server1 should fail over to Server2.
> 
> In my case, it's taking almost 90 to 110 seconds to do this.
> 
> Can anyone suggest ways to reduce it to 30 to 40 seconds?
> 
> My pacemaker configuration is -
> crm configure
> property no-quorum-policy=ignore
> property stonith-enabled=false
> property default-resource-stickiness=1000
> 
> primitive resDRBDr1 ocf:linbit:drbd \
> params drbd_resource="r0" \
> op start interval="0" timeout="240s" \
> op stop interval="0" timeout="100s" \
> op monitor interval="20s" role="Master" timeout="240s" \
> op monitor interval="30s" role="Slave" timeout="240s" \
> meta migration-threshold="3" failure-timeout="60s"
> primitive resOCFS2r1 ocf:heartbeat:Filesystem \
> params device="/dev/drbd/by-res/r0" directory="/cluster" fstype="ocfs2" \
> op monitor interval="10s" timeout="60s" \
> op start interval="0" timeout="90s" \
> op stop interval="0" timeout="60s" \
> meta migration-threshold="3" failure-timeout="60s"
> primitive resXen1 ocf:heartbeat:Xen \
> params xmfile="/home/cluster/xen/win7.cfg" name="xenwin7" \
> op monitor interval="20s" timeout="60s" \
> op start interval="0" timeout="90s" \
> op stop interval="0" timeout="60s" \
> op migrate_from interval="0" timeout="120s" \
> op migrate_to interval="0" timeout="120s" \
> meta allow-migrate="true" target-role="started"
> 
> ms msDRBDr1 resDRBDr1 \
> meta notify="true" master-max="2" interleave="true" target-role="Started"
> clone cloOCFS2r1 resOCFS2r1 \
> meta interleave="true" ordered="true" target-role="Started"
> 
> colocation colOCFS12-with-DRBDrMaster inf: cloOCFS2r1 msDRBDr1:Master
> colocation colXen-with-OCFSr1 inf: resXen1 cloOCFS2r1
> order ordDRBD-before-OCFSr1 inf: msDRBDr1:promote cloOCFS2r1:start
> order ordOCFS2r1-before-Xen1 inf: cloOCFS2r1:start resXen1:start
> 
> commit
> bye
> 
> -- 
> Regards,
> Kamal Kishore B V
> 
