[ClusterLabs] Unexpected resource restart
Ken Gaillot
kgaillot at redhat.com
Wed Jan 16 10:34:12 EST 2019
On Wed, 2019-01-16 at 13:41 +0100, Valentin Vidic wrote:
> On Wed, Jan 16, 2019 at 12:41:11PM +0100, Valentin Vidic wrote:
> > This is what pacemaker says about the resource restarts:
> >
> > Jan 16 11:19:08 node1 pacemaker-schedulerd[713]: notice: * Start dlm:1 ( node2 )
> > Jan 16 11:19:08 node1 pacemaker-schedulerd[713]: notice: * Start lockd:1 ( node2 )
> > Jan 16 11:19:08 node1 pacemaker-schedulerd[713]: notice: * Restart gfs2-lvm:0 ( node1 ) due to required storage-clone running
> > Jan 16 11:19:08 node1 pacemaker-schedulerd[713]: notice: * Restart gfs2-fs:0 ( node1 ) due to required gfs2-lvm:0 start
> > Jan 16 11:19:08 node1 pacemaker-schedulerd[713]: notice: * Start gfs2-lvm:1 ( node2 )
> > Jan 16 11:19:08 node1 pacemaker-schedulerd[713]: notice: * Start gfs2-fs:1 ( node2 )
> > Jan 16 11:19:08 node1 pacemaker-schedulerd[713]: notice: * Restart ocfs2-lvm:0 ( node1 ) due to required storage-clone running
> > Jan 16 11:19:08 node1 pacemaker-schedulerd[713]: notice: * Restart ocfs2-fs:0 ( node1 ) due to required ocfs2-lvm:0 start
> > Jan 16 11:19:08 node1 pacemaker-schedulerd[713]: notice: * Start ocfs2-lvm:1 ( node2 )
> > Jan 16 11:19:08 node1 pacemaker-schedulerd[713]: notice: * Start ocfs2-fs:1 ( node2 )
>
> It seems interleave was required on the gfs2 and ocfs2 clones:
>
> interleave (default: false)
>     If this clone depends on another clone via an ordering constraint, is it
>     allowed to start after the local instance of the other clone starts,
>     rather than wait for all instances of the other clone to start?
Exactly, that's the purpose of interleave.
In retrospect, interleave=true should have been the default. I've never
seen a case where false made sense, and people get bit by overlooking
it all the time. False is the default because it's (theoretically at
least) safer when there's nothing known about the particular service's
requirements.
I should've flipped the default at 2.0.0 but didn't think of it. Now
we'll have to wait a decade for 3.0.0 :) or maybe we can justify doing
it in a minor bump in a few years.
> Now it behaves as expected when the node2 is set online:
>
> Jan 16 12:35:33 node1 pacemaker-schedulerd[564]: notice: * Start dlm:1 ( node2 )
> Jan 16 12:35:33 node1 pacemaker-schedulerd[564]: notice: * Start lockd:1 ( node2 )
> Jan 16 12:35:33 node1 pacemaker-schedulerd[564]: notice: * Start gfs2-lvm:1 ( node2 )
> Jan 16 12:35:33 node1 pacemaker-schedulerd[564]: notice: * Start gfs2-fs:1 ( node2 )
> Jan 16 12:35:33 node1 pacemaker-schedulerd[564]: notice: * Start ocfs2-lvm:1 ( node2 )
> Jan 16 12:35:33 node1 pacemaker-schedulerd[564]: notice: * Start ocfs2-fs:1 ( node2 )
>
> Clone: gfs2-clone
>  Meta Attrs: interleave=true target-role=Started
>  Group: gfs2
>   Resource: gfs2-lvm (class=ocf provider=heartbeat type=LVM-activate)
>    Attributes: activation_mode=shared vg_access_mode=lvmlockd vgname=vgshared lvname=gfs2
>   Resource: gfs2-fs (class=ocf provider=heartbeat type=Filesystem)
>    Attributes: directory=/srv/gfs2 fstype=gfs2 device=/dev/vgshared/gfs2
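And the ordering/colocation behind the "due to required storage-clone running"
messages would typically be expressed something like this (storage-clone is the
name from your log; the exact constraints in your setup may differ):

    pcs constraint order start storage-clone then gfs2-clone
    pcs constraint colocation add gfs2-clone with storage-clone

With interleave=true on gfs2-clone, each gfs2 instance only waits for the
storage-clone instance on its own node, so bringing node2 online no longer
restarts the instances already running on node1.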
>