[Pacemaker] Xen/DRBD cluster issue when putting a node in standby mode

Andrew Beekhof andrew at beekhof.net
Mon Aug 2 04:14:11 EDT 2010


On Mon, Jul 26, 2010 at 11:32 PM, Pierre POMES
<pierre.pomes at interface-tech.com> wrote:
> Hi All,
>
> I am using a simple two-node cluster with Xen on top of DRBD in
> primary/primary mode (necessary for live migration).  My configuration is
> quite simple:
>
> primitive appyul1 ocf:heartbeat:Xen \
>         params xmfile="/etc/xen/appyul1.cfg" shutdown_timeout="299" \
>         op monitor interval="10s" timeout="300s" \
>         op start interval="0s" timeout="180s" \
>         op stop interval="0s" timeout="300s" \
>         op migrate_from interval="0s" timeout="180s" \
>         op migrate_to interval="0s" timeout="180s" \
>         meta target-role="Started" allow-migrate="true" is-managed="true"
> primitive appyul1slash-DRBD ocf:linbit:drbd \
>         params drbd_resource="appyul1slash" \
>         operations $id="appyul1slash-DRBD-ops" \
>         op monitor interval="20s" role="Master" timeout="300s" \
>         op monitor interval="30s" role="Slave" timeout="300s"
> primitive appyul1swap-DRBD ocf:linbit:drbd \
>         params drbd_resource="appyul1swap" \
>         operations $id="appyul1swap-DRBD-ops" \
>         op monitor interval="20s" role="Master" timeout="300s" \
>         op monitor interval="30s" role="Slave" timeout="300s"
> ms appyul1slash-MS appyul1slash-DRBD \
>         meta master-max="2" notify="true" interleave="true" \
>         target-role="Started" is-managed="true"
> ms appyul1swap-MS appyul1swap-DRBD \
>         meta master-max="2" notify="true" interleave="true" \
>         target-role="Started" is-managed="true"
> order appyul1-after-drbd inf: appyul1slash-MS:promote appyul1swap-MS:promote \
>         appyul1:start
>
> So to summarize:
> - A resource for Xen
> - Two master/slave DRBD resources for the VM filesystems (/ and swap);
> master-max is set to 2 to have both nodes in the primary DRBD state.
> - An "order" directive to start the VM after DRBD has been promoted.
>
> Node startup is fine: the VM is started after DRBD is promoted.
>
> Node shutdown is problematic. Assuming the Xen VM runs on node A:
> - When putting node A in standby while node B is active, a live migration is
> started, BUT in the same second, pacemaker tries to demote the DRBD volumes
> on A (while the live migration is in progress).

You'll need to tell Pacemaker not to do that with another order constraint.
Unless you tell it otherwise, it assumes all services are unrelated to
each other.
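
Something along these lines should do it (an untested sketch; the
constraint IDs here are made up, use whatever names you like):

        order xen-before-slash-demote inf: appyul1:stop appyul1slash-MS:demote
        order xen-before-swap-demote inf: appyul1:stop appyul1swap-MS:demote

That makes the cluster finish stopping (or migrating away) appyul1
before it demotes either DRBD resource on that node.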

> - When putting node A in standby while node B is also in standby, the VM is
> stopped, BUT in the same second, pacemaker tries to demote the DRBD volumes
> on A (while the shutdown is still in progress).

As above.

>
> All this results in "failed actions" in the CRM and causes unwanted stonith
> actions (when enabled). I tried adding "symmetrical=false" to the order
> constraint, but it did not help.
>
> I do not understand why pacemaker does not wait for the Xen VM to be
> stopped/migrated before demoting the DRBD volumes.

Because you didn't tell it to.

>
> The setup uses the corosync and pacemaker packages available in standard
> Ubuntu Lucid (corosync 1.2.0 and pacemaker 1.0.8).
>
> Thanks for your help,
>
> Pierre
>
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs:
> http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
>
>
