[Pacemaker] [Linux-HA] Probably a regression of the linbit drbd agent between pacemaker 1.1.8 and 1.1.10

Lars Ellenberg lars.ellenberg at linbit.com
Mon Sep 9 10:20:33 UTC 2013


On Mon, Sep 09, 2013 at 02:42:45PM +1000, Andrew Beekhof wrote:
> 
> On 06/09/2013, at 5:51 PM, Lars Ellenberg <Lars.Ellenberg at linbit.com> wrote:
> 
> > On Tue, Aug 27, 2013 at 06:51:45AM +0200, Andreas Mock wrote:
> >> Hi Andrew,
> >> 
> >> as this is a real showstopper at the moment I invested some other
> >> hours to be sure (as far as possible) not having made an error.
> >> 
> >> Some additions:
> >> 1) I mirrored the whole mini drbd config to another pacemaker cluster.
> >> Same result: pacemaker 1.1.8 works, pacemaker 1.1.10 not 
> >> 2) When I remove the target role Stopped from the drbd ms resource
> >> and insert the config snippet related to the drbd device via crm -f <file>
> >> to a lean running pacemaker config (pacemaker cluster options, stonith
> >> resources),
> >> it seems to work. That means one of the nodes gets promoted.
> >> 
> >> Then after stopping 'crm resource stop ms_drbd_xxx' and starting again
> >> I see the same promotion error as described.
> >> 
> >> The drbd resource agent is using /usr/sbin/crm_master.
> >> Is there a possibility that feedback given through this client tool
> >> is changing the timing behaviour of pacemaker? Or the way
> >> transitions are scheduled?
> >> Any idea that may be related to a change in pacemaker?
> > 
> > I think that recent pacemaker allows for "start" and "promote" in the
> > same transition.
> 
> At least in the one case I saw logs of, this wasn't the case.
> The PE computed:
> 
> Current cluster status:
> Online: [ db05 db06 ]
> 
> r_stonith-db05	(stonith:fence_imm):	Started db06 
> r_stonith-db06	(stonith:fence_imm):	Started db05 
> Master/Slave Set: ms_drbd_fodb [r_drbd_fodb]
>     Slaves: [ db05 db06 ]
> Master/Slave Set: ms_drbd_fodblog [r_drbd_fodblog]
>     Slaves: [ db05 db06 ]
> 
> Transition Summary:
> * Promote r_drbd_fodb:0	(Slave -> Master db05)
> * Promote r_drbd_fodblog:0	(Slave -> Master db05)
> 
> and it was the promotion of r_drbd_fodb:0 that failed.

Right.

Off-list communication revealed that
DRBD came up as "Consistent" only,
which is a normal and expected state
when using resource-level fencing.

The promotion attempt then raced with the connection handshake.
The DRBD fence-peer handler is run (because the disk is only Consistent,
not UpToDate) and returns successfully. But due to that race,
its result is ignored and DRBD stays "only Consistent", which
is not good enough to be promoted ("need access to UpToDate data").

Once the handshake is done, that also results in "access to good data",
which is why the next promotion attempt succeeds.
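
To make the "access to UpToDate data" condition concrete, here is a
minimal sketch (not taken from the RA or from DRBD itself). It assumes
the "local/peer" format printed by "drbdadm dstate <resource>", and the
helper name has_uptodate_access is made up for illustration. It is a
simplification of what DRBD actually checks before allowing promotion.

```shell
#!/bin/sh
# Hypothetical helper: succeed if either side of a "local/peer" disk
# state string (as printed by "drbdadm dstate <resource>") is UpToDate.
has_uptodate_access() {
    # Wrap the string in slashes so both halves match the same pattern.
    case "/$1/" in
        */UpToDate/*) return 0 ;;
        *)            return 1 ;;
    esac
}
```

Before the handshake, a state like "Consistent/DUnknown" fails this
check; after it, something like "UpToDate/UpToDate" passes. On a live
node it could be called as has_uptodate_access "$(drbdadm dstate r0)",
where r0 is a placeholder resource name.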


Something in the timing of pacemaker actions has changed
between the affected and unaffected versions.
Apparently, there used to be enough time to complete the connection
handshake before the promote request was made.


This race is fixed with DRBD 8.3.16 and 8.4.4 (currently rc1).

You can avoid that race by not allowing Pacemaker to promote
if DRBD is only "Consistent".

Pacemaker will only attempt promotion
if there is a positive master score for the resource.

The ocf:linbit:drbd RA hardcodes the master score for
"Consistent" to 5.
So you may edit the RA and remove the master score
for the "only Consistent" case instead.

(The fixed DRBD versions mentioned above also introduce a new
"adjust_master_score" parameter, which makes this configurable.)

Or you can add a location constraint like this:
 location no-master-if-only-consistent ms_drbd_XY \
        rule $role="Master" -10: defined #uname

where "defined #uname" is a funny way to express "true":
this constraint reduces the resulting master score by 10,
always, anywhere.
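
The arithmetic behind that constraint, as a rough sketch: it treats
scores as plain integers and ignores Pacemaker's INFINITY handling and
any other constraints. The function name promotion_allowed is invented
for this example. The score 5 is the RA's hardcoded value for
"Consistent"; the 1000 below is only an assumed stand-in for "some
larger score with good data".

```shell
#!/bin/sh
# Sketch: promotion is attempted only if the resulting master score
# (RA-provided score plus role=Master constraint score) is positive.
promotion_allowed() {
    ra_score=$1          # master score set by the resource agent
    constraint_score=$2  # sum of matching $role="Master" rule scores
    [ $((ra_score + constraint_score)) -gt 0 ]
}

# "only Consistent": 5 - 10 = -5, so no promotion attempt.
promotion_allowed 5 -10 || echo "Consistent: not promoted"
# Assumed larger score: 1000 - 10 = 990, promotion still allowed.
promotion_allowed 1000 -10 && echo "good data: promotion allowed"
```

To see the scores the policy engine actually computes on a live
cluster, "crm_simulate -sL" should list them.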

If you have other $role=Master constraints, you may need to play with
the scores to achieve the desired outcome.


> > I suspect you would not be able to reproduce by:
> >  crm resource stop ms_drbd
> >  crm resource demote ms_drbd (will only make drbd Secondary stuff)
> >    ... meanwhile, DRBD will establish the connection ...
> >  crm resource promote ms_drbd (will then promote one node)

By first allowing DRBD to do the handshake in Secondary/Secondary,
and only later allowing it to promote,
this sequence also avoids the race.
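
If you want to script the "meanwhile" part instead of waiting by hand,
something along these lines might do. It is an untested sketch: the
function name wait_for_connected and the resource name r0 are
placeholders, and it assumes "drbdadm cstate <resource>" prints
"Connected" once the handshake is complete.

```shell
#!/bin/sh
# Sketch: poll a command that prints the DRBD connection state until it
# reports "Connected", or give up after a number of attempts.
#   $1    maximum number of attempts (about one second apart)
#   $2..  command to run, e.g.: drbdadm cstate r0
wait_for_connected() {
    attempts=$1
    shift
    while [ "$attempts" -gt 0 ]; do
        if [ "$("$@" 2>/dev/null)" = "Connected" ]; then
            return 0
        fi
        attempts=$((attempts - 1))
        # Sleep only if another attempt follows.
        [ "$attempts" -gt 0 ] && sleep 1
    done
    return 1
}
```

Used with the sequence quoted above, roughly:

 crm resource demote ms_drbd
 wait_for_connected 30 drbdadm cstate r0 && crm resource promote ms_drbd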

Cheers,
	Lars

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.



