[Pacemaker] How to tell pacemaker to start exportfs after filesystem resource

Aleksander Malaev amalaev at alt-lan.ru
Tue Jun 21 10:59:52 EDT 2011


I'm not sure that the Filesystem resource causes this behaviour. I'm running
some tests now and collecting logs.
I think it may be related to the res-nfs group. I have now found that portmap
is started by upstart before pacemaker, and maybe that is the reason for the
failure.
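If the system-level upstart job really does start portmap before Pacemaker, it can collide with the cluster-managed upstart:portmap resource. One possible workaround, sketched under the assumption of a pre-1.3 upstart without .override support (the file path and the original stanza shown are assumptions, adjust to the distribution): comment out the job's "start on" stanza so upstart no longer launches portmap at boot, while Pacemaker can still start the job on demand.

```conf
# /etc/init/portmap.conf  (path and original "start on" line assumed)
# Disable automatic startup at boot; Pacemaker's upstart:portmap
# resource can still start the job explicitly via initctl.
#start on (started dbus or local-filesystems)
stop on runlevel [06]
```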

2011/6/21 Dejan Muhamedagic <dejanmm at fastmail.fm>

> Hi Vladislav,
>
> On Tue, Jun 21, 2011 at 05:38:21PM +0300, Vladislav Bogdanov wrote:
> > 21.06.2011 17:23, Dejan Muhamedagic wrote:
> > > On Tue, Jun 21, 2011 at 06:10:16PM +0400, Aleksander Malaev wrote:
> > >> How can I check this?
> > >> If I don't add this exportfs resource then the cluster becomes fully
> > >> operational - all mounts are accessible and fail-over between nodes
> > >> works as it should. Maybe I need to add some sort of delay between
> > >> these resources?
> > >
> > > If you need to do so (there's actually start-delay, but it
> > > should be deprecated), then some RA doesn't implement start
> > > action correctly. In this case, it looks like it's Filesystem,
> > > right? Since the filesystem is ocfs2 it may be that the cluster
> > > services supporting ocfs2 are not fast enough. At any rate,
> > > Filesystem shouldn't start before the filesystem is really
> > > mounted.
> >
> > If I recall correctly from my totally failed experiments with ocfs2
> > (simultaneous kernel panic on all nodes running f13-x86_64 ;), this is
> > an ocfs2-specific problem.
> >
> > Although the mount call returns success, the ocfs2 filesystem may not be
> > ready for consumption for at least several seconds.
>
> That sounds like a plausible explanation. Before trying to fix
> ocfs2, which may take time or be impossible, we can make
> Filesystem use monitor internally to exit only once the
> filesystem has really been mounted. But please somebody first
> open a bugzilla, this needs to be tracked.
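What Dejan describes could look roughly like the following shell sketch (the function name, mountpoint, and polling criteria are illustrative, not the actual ocf:heartbeat:Filesystem code): after mount returns, keep polling until the mountpoint actually behaves like a mounted filesystem, and only then report success to the cluster.

```shell
#!/bin/sh
# Illustrative sketch only: poll after mount(8) returns until the
# mountpoint is genuinely usable, then report success.
MOUNTPOINT=/media/media0   # mountpoint from the thread's configuration
TIMEOUT=30                 # seconds to wait before giving up

wait_until_mounted() {
    t=0
    while [ "$t" -lt "$TIMEOUT" ]; do
        # A usable filesystem should appear in the mount table and
        # survive a trivial operation such as a directory listing.
        if grep -q " $MOUNTPOINT " /proc/mounts \
                && ls "$MOUNTPOINT" >/dev/null 2>&1; then
            return 0   # maps to OCF_SUCCESS
        fi
        sleep 1
        t=$((t + 1))
    done
    return 1   # still unusable: fail the start so the cluster can react
}
```

Failing the start after the timeout (rather than blocking forever) lets Pacemaker apply its normal recovery, which matches the point made above that a resource reported as started must be fully operational.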
>
> BTW, interestingly I cannot recall that anybody complained about
> this before. It obviously depends on the network, but still...
>
> Cheers,
>
> Dejan
>
> > Best,
> > Vladislav
> >
> > >
> > > If so, please file a bugzilla for it and attach hb_report of the
> > > incident.
> > >
> > > Thanks,
> > >
> > > Dejan
> > >
> > >> 2011/6/21 Dejan Muhamedagic <dejanmm at fastmail.fm>
> > >>
> > >>> On Tue, Jun 21, 2011 at 05:56:40PM +0400, Aleksander Malaev wrote:
> > >>>> Sure, I'm using order constraint.
> > >>>> But it seems that it doesn't check the monitor of the previously
> > >>>> started resource.
> > >>>
> > >>> It doesn't need to check monitor. The previous resource, if
> > >>> started, must be fully operational. If it's not, then the RA is
> > >>> broken.
> > >>>
> > >>> Thanks,
> > >>>
> > >>> Dejan
> > >>>
> > >>>> 2011/6/21 Dejan Muhamedagic <dejanmm at fastmail.fm>
> > >>>>
> > >>>>> Hi,
> > >>>>>
> > >>>>> On Mon, Jun 20, 2011 at 11:40:04PM +0400, Aleksander Malaev wrote:
> > >>>>>> Hello,
> > >>>>>>
> > >>>>>> I have configured pacemaker+ocfs2 cluster with shared storage
> > >>> connected
> > >>>>> by
> > >>>>>> FC.
> > >>>>>> Now I need to set up an NFS export in Active/Active mode, and I
> > >>>>>> added all the needed resources and configured the start order.
> > >>>>>> But when a node starts after a reboot I get a race condition
> > >>>>>> between the Filesystem resource and exportfs.
> > >>>>>> Exportfs couldn't start because the ocfs2 mountpoint isn't
> > >>>>>> mounted yet.
> > >>>>>>
> > >>>>>> How can I tell the exportfs resource to start only when the
> > >>>>>> Filesystem resource is ready?
> > >>>>>
> > >>>>> Use the order constraint? Or did I miss something? You already
> > >>>>> have some order constraints defined, so you should be able to
> > >>>>> manage.
> > >>>>>
> > >>>>> Thanks,
> > >>>>>
> > >>>>> Dejan
> > >>>>>
> > >>>>>> crm config is the following:
> > >>>>>> node msk-nfs-gw01
> > >>>>>> node msk-nfs-gw02
> > >>>>>> primitive nfs-kernel-server lsb:nfs-kernel-server \
> > >>>>>>         op monitor interval="10s" timeout="30s"
> > >>>>>> primitive ping ocf:pacemaker:ping \
> > >>>>>>         params host_list="10.236.22.35" multiplier="100" name="ping" \
> > >>>>>>         op monitor interval="20s" timeout="60s" \
> > >>>>>>         op start interval="0" timeout="60s"
> > >>>>>> primitive portmap upstart:portmap \
> > >>>>>>         op monitor interval="10s" timeout="30s"
> > >>>>>> primitive res-dlm ocf:pacemaker:controld \
> > >>>>>>         op monitor interval="120s"
> > >>>>>> primitive res-fs ocf:heartbeat:Filesystem \
> > >>>>>>         params device="/dev/mapper/mpath0" directory="/media/media0" \
> > >>>>>>         fstype="ocfs2" \
> > >>>>>>         op monitor interval="120s"
> > >>>>>> primitive res-nfs1-ip ocf:heartbeat:IPaddr2 \
> > >>>>>>         params ip="10.236.22.38" cidr_netmask="27" nic="bond0" \
> > >>>>>>         op monitor interval="30s"
> > >>>>>> primitive res-nfs2-ip ocf:heartbeat:IPaddr2 \
> > >>>>>>         params ip="10.236.22.39" cidr_netmask="27" nic="bond0" \
> > >>>>>>         op monitor interval="30s"
> > >>>>>> primitive res-o2cb ocf:pacemaker:o2cb \
> > >>>>>>         op monitor interval="120s"
> > >>>>>> primitive res-share ocf:heartbeat:exportfs \
> > >>>>>>         params directory="/media/media0/nfsroot/export1" \
> > >>>>>>         clientspec="10.236.22.0/24" \
> > >>>>>>         options="rw,async,no_subtree_check,no_root_squash" fsid="1" \
> > >>>>>>         op monitor interval="10s" timeout="30s" \
> > >>>>>>         op start interval="10" timeout="40s" \
> > >>>>>>         op stop interval="0" timeout="40s"
> > >>>>>> primitive st-null stonith:null \
> > >>>>>>         params hostlist="msk-nfs-gw01 msk-nfs-gw02"
> > >>>>>> group nfs portmap nfs-kernel-server
> > >>>>>> clone clone-dlm res-dlm \
> > >>>>>>         meta globally-unique="false" interleave="true"
> > >>>>>> clone clone-fs res-fs \
> > >>>>>>         meta globally-unique="false" interleave="true"
> > >>>>>> clone clone-nfs nfs \
> > >>>>>>         meta globally-unique="false" interleave="true"
> > >>>>>> clone clone-o2cb res-o2cb \
> > >>>>>>         meta globally-unique="false" interleave="true"
> > >>>>>> clone clone-share res-share \
> > >>>>>>         meta globally-unique="false" interleave="true"
> > >>>>>> clone fencing st-null
> > >>>>>> clone ping_clone ping \
> > >>>>>>         meta globally-unique="false"
> > >>>>>> location nfs1-ip-on-nfs1 res-nfs1-ip 50: msk-nfs-gw01
> > >>>>>> location nfs2-ip-on-nfs2 res-nfs2-ip 50: msk-nfs-gw02
> > >>>>>> colocation col-fs-o2cb inf: clone-fs clone-o2cb
> > >>>>>> colocation col-nfs-fs inf: clone-nfs clone-fs
> > >>>>>> colocation col-o2cb-dlm inf: clone-o2cb clone-dlm
> > >>>>>> colocation col-share-nfs inf: clone-share clone-nfs
> > >>>>>> order ord-dlm-o2cb 0: clone-dlm clone-o2cb
> > >>>>>> order ord-nfs-share 0: clone-nfs clone-share
> > >>>>>> order ord-o2cb-fs 0: clone-o2cb clone-fs
> > >>>>>> order ord-o2cb-nfs 0: clone-fs clone-nfs
> > >>>>>> order ord-share-nfs1 0: clone-share res-nfs1-ip
> > >>>>>> order ord-share-nfs2 0: clone-share res-nfs2-ip
> > >>>>>> property $id="cib-bootstrap-options" \
> > >>>>>>         dc-version="1.0.9-da7075976b5ff0bee71074385f8fd02f296ec8a3" \
> > >>>>>>         cluster-infrastructure="openais" \
> > >>>>>>         expected-quorum-votes="2" \
> > >>>>>>         stonith-enabled="true" \
> > >>>>>>         no-quorum-policy="ignore" \
> > >>>>>>         last-lrm-refresh="1308040111"
> > >>>>>>
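One detail worth checking in the configuration above: the order constraints are written with score 0, which the crm shell treats as advisory ordering, so the cluster is still allowed to start clone-share while clone-fs is in flight. If the export really must wait for the mount, a mandatory constraint is needed; a sketch in the same crm syntax (the constraint name is illustrative):

```conf
order ord-fs-share inf: clone-fs clone-share
```

The same consideration applies to the other 0-score chains (dlm -> o2cb -> fs -> nfs) if strict start ordering is intended.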
> > >>>>>> --
> > >>>>>> Best Regards
> > >>>>>> Alexander Malaev
> > >>>>>
> > >>>>>> _______________________________________________
> > >>>>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> > >>>>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> > >>>>>>
> > >>>>>> Project Home: http://www.clusterlabs.org
> > >>>>>> Getting started:
> > >>> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > >>>>>> Bugs:
> > >>>>>
> > >>>
> http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
> > >>>>>
> > >>>>>
> > >>>>
> > >>>>
> > >>>> --
> > >>>> Best regards,
> > >>>> Aleksander Malaev
> > >>>> +7-962-938-9323
> > >>>
> > >>>
> > >>>
> > >>
> > >>
> > >> --
> > >> Best regards,
> > >> Aleksander Malaev
> > >> +7-962-938-9323
> > >
> > >
> > >
> >
> >
>
>



-- 
Best regards,
Aleksander Malaev
+7-962-938-9323

