[Pacemaker] Issue with ordering

Vladislav Bogdanov bubble at hoster-ok.com
Thu Mar 29 08:07:40 UTC 2012


Hi Andrew, all,

I'm continuing my experiments with Lustre on stacked DRBD, and I see the
following problem:

I have one DRBD resource (ms-drbd-testfs-mdt0000) stacked on top of
another (ms-drbd-testfs-mdt0000-left), with the following constraints
between them:

colocation drbd-testfs-mdt0000-with-drbd-testfs-mdt0000-left inf:
ms-drbd-testfs-mdt0000 ms-drbd-testfs-mdt0000-left:Master
order drbd-testfs-mdt0000-after-drbd-testfs-mdt0000-left inf:
ms-drbd-testfs-mdt0000-left:promote ms-drbd-testfs-mdt0000:start

Then I have a filesystem (the testfs-mdt0000 resource) mounted on top of
ms-drbd-testfs-mdt0000:

colocation testfs-mdt0000-with-drbd-testfs-mdt0000 inf: testfs-mdt0000
ms-drbd-testfs-mdt0000:Master
order testfs-mdt0000-after-drbd-testfs-mdt0000 inf:
ms-drbd-testfs-mdt0000:promote testfs-mdt0000:start

When I trigger an event that causes many resources to stop (including
these three), the LogActions output looks like this:

LogActions: Stop    drbd-local (lustre01-left)
LogActions: Stop    drbd-stacked (Started lustre02-left)
LogActions: Stop    drbd-testfs-local (Started lustre03-left)
LogActions: Stop    drbd-testfs-stacked (Started lustre04-left)
LogActions: Stop    lustre (Started lustre04-left)
LogActions: Stop    mgs (Started lustre01-left)
LogActions: Stop    testfs (Started lustre03-left)
LogActions: Stop    testfs-mdt0000 (Started lustre01-left)
LogActions: Stop    testfs-ost0000 (Started lustre01-left)
LogActions: Stop    testfs-ost0001 (Started lustre02-left)
LogActions: Stop    testfs-ost0002 (Started lustre03-left)
LogActions: Stop    testfs-ost0003 (Started lustre04-left)
LogActions: Stop    drbd-mgs:0 (Master lustre01-left)
LogActions: Stop    drbd-mgs:1 (Slave lustre02-left)
LogActions: Stop    drbd-testfs-mdt0000:0 (Master lustre01-left)
LogActions: Stop    drbd-testfs-mdt0000-left:0 (Master lustre01-left)
LogActions: Stop    drbd-testfs-mdt0000-left:1 (Slave lustre02-left)
LogActions: Stop    drbd-testfs-ost0000:0 (Master lustre01-left)
LogActions: Stop    drbd-testfs-ost0000-left:0 (Master lustre01-left)
LogActions: Stop    drbd-testfs-ost0000-left:1 (Slave lustre02-left)
LogActions: Stop    drbd-testfs-ost0001:0 (Master lustre02-left)
LogActions: Stop    drbd-testfs-ost0001-left:0 (Master lustre02-left)
LogActions: Stop    drbd-testfs-ost0001-left:1 (Slave lustre01-left)
LogActions: Stop    drbd-testfs-ost0002:0 (Master lustre03-left)
LogActions: Stop    drbd-testfs-ost0002-left:0 (Master lustre03-left)
LogActions: Stop    drbd-testfs-ost0002-left:1 (Slave lustre04-left)
LogActions: Stop    drbd-testfs-ost0003:0 (Master lustre04-left)
LogActions: Stop    drbd-testfs-ost0003-left:0 (Master lustre04-left)
LogActions: Stop    drbd-testfs-ost0003-left:1 (Slave lustre03-left)

For some reason demote is not run on either of the mdt DRBD resources
(should it be?), so the drbd RA prints a warning about that.

What I see next is that the cluster tries to stop
ms-drbd-testfs-mdt0000-left before ms-drbd-testfs-mdt0000.

Moreover, the testfs-mdt0000 filesystem resource is not stopped before
drbd-testfs-mdt0000 is stopped.

I have advisory ordering constraints between the mdt and ost filesystem
resources, so all OSTs are stopped before the mdt. Thus the mdt stop is
delayed a bit. Maybe this influences what happens.

I'm pretty sure the constraints are correct for at least these three
resources, so this looks like a bug: the mandatory ordering is not
preserved.
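
To make the expectation concrete, here is a minimal Python sketch of the
stop order I would expect from the two mandatory order constraints quoted
earlier. It is only an illustration, not Pacemaker's code. It assumes
that the constraints are symmetrical (so each one is reversed on the way
down), that demote is the inverse of promote and stop the inverse of
start, and that a master/slave resource is demoted before it is stopped.
The names START_ORDER, MS_RESOURCES and expected_stop_sequence are
hypothetical.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Start-side orderings from the two "order ... inf:" constraints,
# written as ((resource, action) first, (resource, action) then).
START_ORDER = [
    (("ms-drbd-testfs-mdt0000-left", "promote"),
     ("ms-drbd-testfs-mdt0000", "start")),
    (("ms-drbd-testfs-mdt0000", "promote"),
     ("testfs-mdt0000", "start")),
]

# Assumption: the action that undoes each start-side action.
INVERSE = {"start": "stop", "promote": "demote"}

# The two master/slave DRBD resources involved.
MS_RESOURCES = ["ms-drbd-testfs-mdt0000-left", "ms-drbd-testfs-mdt0000"]


def expected_stop_sequence(start_order, ms_resources):
    """Return the stop-side actions in the order the constraints imply."""
    ts = TopologicalSorter()
    # Assumption: symmetrical ordering, so "A.x then B.y" implies
    # "undo B.y first, then undo A.x" when shutting down.
    for (res_a, act_a), (res_b, act_b) in start_order:
        ts.add((res_a, INVERSE[act_a]), (res_b, INVERSE[act_b]))
    # Assumption: a master/slave resource is demoted before it is stopped.
    for res in ms_resources:
        ts.add((res, "stop"), (res, "demote"))
    return list(ts.static_order())


for resource, action in expected_stop_sequence(START_ORDER, MS_RESOURCES):
    print(f"{action:7s}{resource}")
```

Under those assumptions the filesystem stops first, then the stacked DRBD
is demoted and stopped, and only then is the lower DRBD demoted and
stopped. That is the opposite of the order I observe for the two DRBD
resources.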

I can produce a report for this if needed.

Best,
Vladislav



