[ClusterLabs] two node cluster with clvm and virtual machines

Thu Feb 2 13:43:44 EST 2017

Hi,

I'm implementing a two-node cluster with SLES 11 SP4. I have shared storage (FC SAN).
I'm planning to use clvm.
What I have done so far:
Connected the SAN to the hosts
Created a volume on the SAN
The volume is visible on both nodes (through multipath and device-mapper)
My pacemaker config looks like this:

crm(live)# configure show
node ha-idg-1
node ha-idg-2
primitive prim_clvmd ocf:lvm2:clvmd \
        op stop interval=0 timeout=100 \
        op start interval=0 timeout=90 \
        op monitor interval=20 timeout=20
primitive prim_dlm ocf:pacemaker:controld \
        op start interval=0 timeout=90 \
        op stop interval=0 timeout=100 \
        op monitor interval=60 timeout=60
primitive prim_stonith_ilo_ha-idg-1 stonith:external/riloe \
        params ilo_hostname=SUNHB65279 hostlist=ha-idg-1 ilo_user=root ilo_password=**** \
        op monitor interval=60m timeout=120s \
        meta target-role=Started
primitive prim_stonith_ilo_ha-idg-2 stonith:external/riloe \
        params ilo_hostname=SUNHB58820-3 hostlist=ha-idg-2 ilo_user=root ilo_password=**** \
        op monitor interval=60m timeout=120s \
        meta target-role=Started
primitive prim_vg_cluster_01 LVM \
        params volgrpname=vg_cluster_01 \
        op monitor interval=60 timeout=60 \
        op start interval=0 timeout=30 \
        op stop interval=0 timeout=30
group group_prim_dlm_clvmd_vg_cluster_01 prim_dlm prim_clvmd prim_vg_cluster_01
clone clone_group_prim_dlm_clvmd_vg_cluster_01 group_prim_dlm_clvmd_vg_cluster_01 \
        meta target-role=Started
location loc_prim_stonith_ilo_ha-idg-1 prim_stonith_ilo_ha-idg-1 -inf: ha-idg-1
location loc_prim_stonith_ilo_ha-idg-2 prim_stonith_ilo_ha-idg-2 -inf: ha-idg-2
property cib-bootstrap-options: \
        dc-version=1.1.12-f47ea56 \
        cluster-infrastructure="classic openais (with plugin)" \
        expected-quorum-votes=2 \
        no-quorum-policy=ignore \
        last-lrm-refresh=1485872095 \
        stonith-enabled=true \
        default-resource-stickiness=100 \
        start-failure-is-fatal=true \
        is-managed-default=true \
        stop-orphan-resources=true
rsc_defaults rsc-options: \
        target-role=stopped \
        resource-stickiness=100 \
        failure-timeout=0
op_defaults op-options: \
        on-fail=restart

This is the status:
crm(live)# status
Last updated: Thu Feb  2 19:14:10 2017
Last change: Thu Feb  2 19:05:26 2017 by root via cibadmin on ha-idg-2
Stack: classic openais (with plugin)
Current DC: ha-idg-2 - partition with quorum
Version: 1.1.12-f47ea56
2 Nodes configured, 2 expected votes
8 Resources configured

Online: [ ha-idg-1 ha-idg-2 ]

 Clone Set: clone_group_prim_dlm_clvmd_vg_cluster_01 [group_prim_dlm_clvmd_vg_cluster_01]
     Started: [ ha-idg-1 ha-idg-2 ]

Failed actions:
    prim_stonith_ilo_ha-idg-1_start_0 on ha-idg-2 'unknown error' (1): call=100, status=Timed Out, exit-reason='none', last-rc-change='Tue Jan 31 15:14:34 2017', queued=0ms, exec=20004ms
    prim_stonith_ilo_ha-idg-2_start_0 on ha-idg-1 'unknown error' (1): call=107, status=Error, exit-reason='none', last-rc-change='Tue Jan 31 15:14:55 2017', queued=0ms, exec=11584ms

So far everything is fine. The stonith resources currently have wrong passwords for the iLO adapters. Setting up an HA cluster for the first time is difficult enough,
and for now I don't want my hosts to reboot all the time because of errors in my configuration.

I created a VG and an LV; the LV is visible on both nodes.
My plan is to use a dedicated LV for each VM. VMs should run on both nodes, some on node A, some on node B.
If the cluster takes care of mounting the fs inside the LV (I'm planning to use btrfs), I should not need a cluster fs, right?
Because the cluster makes sure that the fs is only ever mounted on one node. That's what I've been told.
I'd like to use btrfs because of its snapshot capability, which is great.
Should I now create a resource group with the LV, the fs and the VM?
I stumbled across sfex. It seems to provide an additional layer of protection for access to shared storage (my LV?).
Does it make sense to use it? Does anyone have experience with it?
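
To make the resource group question concrete, this is roughly what I have in mind for a single VM. It is only an untested sketch: lv_vm1, the mount point /vm/vm1, the libvirt XML path and the qemu hypervisor URI are placeholders I made up. As far as I understand, the LV would not need a resource of its own, because the cloned LVM primitive already activates the whole VG on both nodes, so the group would only contain the fs and the VM.

```
# Sketch only. lv_vm1, /vm/vm1, /etc/libvirt/qemu/vm1.xml and the
# hypervisor URI are hypothetical placeholders, not my real setup.
primitive prim_fs_vm1 ocf:heartbeat:Filesystem \
        params device="/dev/vg_cluster_01/lv_vm1" directory="/vm/vm1" fstype=btrfs \
        op monitor interval=60 timeout=60
primitive prim_vm1 ocf:heartbeat:VirtualDomain \
        params config="/etc/libvirt/qemu/vm1.xml" hypervisor="qemu:///system" \
        op monitor interval=30 timeout=60
# The fs is mounted first, then the VM is started, both on the same node
group group_vm1 prim_fs_vm1 prim_vm1
# The VM group may only run where dlm/clvmd/VG are up, and only after them
order ord_clvm_before_vm1 inf: clone_group_prim_dlm_clvmd_vg_cluster_01 group_vm1
colocation col_vm1_with_clvm inf: group_vm1 clone_group_prim_dlm_clvmd_vg_cluster_01
```

Since group_vm1 is not cloned, the cluster would run it on one node at a time, which is where my assumption comes from that a non-cluster fs like btrfs is enough.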

Btw: SUSE recommends (https://www.suse.com/documentation/sle_ha/book_sleha/data/sec_ha_clvm_config.html) creating a mirrored LV.
Is that really necessary/advisable? My LVs reside on a SAN with a RAID5 configuration. I don't see the benefit of or the need for a mirrored LV,
just the disadvantage of wasting disk space. Besides the RAID we have a backup, and before making changes to the VMs I will create a btrfs snapshot.
Unfortunately I'm not able to create a snapshot inside the VMs because they are running older versions of SUSE which don't support btrfs. Of course I could
recreate the VMs with an LVM configuration inside them, if I have enough time. Then I could create snapshots with the LVM tools.

Thanks.

Bernd

-- 
Bernd Lentes 

Systemadministration 
institute of developmental genetics 
Gebäude 35.34 - Raum 208 
HelmholtzZentrum München 
bernd.lentes at helmholtz-muenchen.de 
phone: +49 (0)89 3187 1241 
fax: +49 (0)89 3187 2294 

Only when you commit to something can you be wrong
Scott Adams
