[Pacemaker] Fail-over NFS Server (need cluster configuration check) [SOLVED]

Matteo Guglielmi matteo.guglielmi at epfl.ch
Tue Feb 28 14:15:06 EST 2012


It's always fun replying to one's own emails :-)

Solution to point (1): "Fully Sequential MS Promotion"

primitive p_drbd_home ocf:linbit:drbd \
  params drbd_resource="home" \
  op start interval="0" timeout="60" \
  op stop interval="0" timeout="240" \
  op monitor interval="20"
primitive p_drbd_software ocf:linbit:drbd \
  params drbd_resource="software" \
  op start interval="0" timeout="60" \
  op stop interval="0" timeout="240" \
  op monitor interval="20"
primitive p_drbd_srv ocf:linbit:drbd \
  params drbd_resource="srv" \
  op start interval="0" timeout="60" \
  op stop interval="0" timeout="240" \
  op monitor interval="20"
ms ms_drbd_home p_drbd_home \
  meta master-max="1" master-node-max="1" \
  clone-max="2" clone-node-max="1" notify="true"
ms ms_drbd_software p_drbd_software \
  meta master-max="1" master-node-max="1" \
  clone-max="2" clone-node-max="1" notify="true"
ms ms_drbd_srv p_drbd_srv \
  meta master-max="1" master-node-max="1" \
  clone-max="2" clone-node-max="1" notify="true"
colocation co_ms_drbd_software_with_ms_drbd_srv inf: \
  ms_drbd_software:Master ms_drbd_srv:Master
order o_ms_drbd_software_after_ms_drbd_srv_promote mandatory: \
  ms_drbd_srv:promote ms_drbd_software:start
colocation co_ms_drbd_home_with_ms_drbd_software inf: \
  ms_drbd_home:Master ms_drbd_software:Master
order o_ms_drbd_home_after_ms_drbd_software_promote mandatory: \
  ms_drbd_software:promote ms_drbd_home:start
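
A possible variant, not part of the configuration above and not tested here: the two order constraints above tie the start of the next DRBD resource to the promotion of the previous one. If only the promotions need to be sequential, the same chain could be written promote-to-promote, which would let the secondaries start in parallel. The constraint names below are made up for illustration; use either these or the order constraints above, not both.

```
# Hypothetical alternative to the two "order" constraints above
order o_ms_drbd_software_promote_after_ms_drbd_srv_promote mandatory: \
  ms_drbd_srv:promote ms_drbd_software:promote
order o_ms_drbd_home_promote_after_ms_drbd_software_promote mandatory: \
  ms_drbd_software:promote ms_drbd_home:promote
```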

Solution to point (2): "Fully Sequential FS Mounting + DHCP server (after MS Promotion)"

primitive p_fs_home ocf:heartbeat:Filesystem \
  params device="/dev/drbd/by-res/home" \
  directory="/share/drbd/nfs/home" fstype="ext4" \
  options="noatime,usrjquota=aquota.user,grpjquota=aquota.group,jqfmt=vfsv0" \
  op start interval="0" timeout="60" \
  op stop interval="0" timeout="240" \
  op monitor interval="20"
primitive p_fs_software ocf:heartbeat:Filesystem \
  params device="/dev/drbd/by-res/software" \
  directory="/share/drbd/nfs/software" \
  fstype="ext4" options="noatime" \
  op start interval="0" timeout="60" \
  op stop interval="0" timeout="240" \
  op monitor interval="20"
primitive p_fs_srv ocf:heartbeat:Filesystem \
  params device="/dev/drbd/by-res/srv" \
  directory="/share/drbd/nfs/srv" \
  fstype="ext4" options="noatime" \
  op start interval="0" timeout="60" \
  op stop interval="0" timeout="240" \
  op monitor interval="20"
primitive p_ip_nfs ocf:heartbeat:IPaddr2 \
  params ip="192.168.0.50" cidr_netmask="24" iflabel="nfs" \
  op monitor interval="20"
primitive p_service_isc-dhcp-server lsb:isc-dhcp-server \
  op start interval="0" timeout="60" \
  op stop interval="0" timeout="240" \
  op monitor interval="20"
group g_service_fs_ip_dhcp p_fs_srv p_fs_software p_fs_home \
  p_ip_nfs p_service_isc-dhcp-server
colocation co_ms_drbd_home_with_g_service_fs_ip_dhcp inf: \
  g_service_fs_ip_dhcp ms_drbd_home:Master
order o_g_service_fs_ip_dhcp_after_ms_drbd_home_promote mandatory: \
  ms_drbd_home:promote g_service_fs_ip_dhcp:start

Solution to point (3): "Fully Sequential Cloned NFS Exporting + QUOTA server (after FS Mounting + DHCP server)"

primitive p_service_nfs-common lsb:nfs-common \
  op start interval="0" timeout="60" \
  op stop interval="0" timeout="240" \
  op monitor interval="20"
primitive p_service_nfs-kernel-server lsb:nfs-kernel-server \
  op start interval="0" timeout="60" \
  op stop interval="0" timeout="240" \
  op monitor interval="20"
primitive p_service_quota lsb:quota \
  op start interval="0" timeout="60" \
  op stop interval="0" timeout="240" \
  op monitor interval="20"
group g_service_nfs_quota p_service_nfs-common p_service_nfs-kernel-server \
  p_service_quota
clone cl_g_service_nfs_quota g_service_nfs_quota
order o_cl_g_service_nfs_quota_after_p_service_isc-dhcp-server_start mandatory: \
  g_service_fs_ip_dhcp:start cl_g_service_nfs_quota:start
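
One way to exercise the whole chain is to put the active node in standby and watch the resources move, which is the test that produced the errors with the original configuration. A minimal sketch in crm shell syntax (typed at the crm prompt or fed to it as a batch file); "nodeA" is a placeholder for the real node name:

```
# "nodeA" is a hypothetical node name
configure verify
node standby nodeA
status
node online nodeA
```

If the constraints behave as intended, status should show the three DRBD masters and g_service_fs_ip_dhcp on the other node while nodeA is in standby.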


Works like a charm,

--matt

On 02/27/12 03:32, Matteo Guglielmi wrote:
> On two machines (A and B) I've created three identical LVM
> volumes (DRBD backing devices) called srv, home and software.
> 
> The fs on all of them is ext4.
> 
> The home fs has quotas.
> 
> srv, home and software are exported via NFS.
> 
> Both A and B also have an extra locally mounted fs (data1 and
> data2 respectively) with quotas. data1 and data2 are exported via
> NFS too (NO DRBD backing device for them; they are just local
> file systems).
> 
> Both A and B have a dhcp server installed, but only one dhcp server
> may be running at a time: the one on the machine which has all three
> drbd fs in primary mode.
> 
> A floating IP is used for mounting srv, software and home on all
> NFS clients.
> 
> The cluster configuration I'd like to have should reproduce the
> following scenario:
> 
> 
> A: ( srv + home + software + IP + dhcp + nfs-server + quota-server )
> B: ( nfs-server + quota-server )
> 
> or
> 
> A: ( nfs-server + quota-server )
> B: ( srv + home + software + IP + dhcp + nfs-server + quota-server )
> 
> 
> ### Cluster Configuration ###
> 
> 1) All ms_drbd must be in primary mode on the same host:
> 
> primitive p_drbd_home ocf:linbit:drbd \
>    params drbd_resource="home" \
>    op start interval="0" timeout="60" \
>    op stop interval="0" timeout="240" \
>    op monitor interval="20"
> primitive p_drbd_software ocf:linbit:drbd \
>    params drbd_resource="software" \
>    op start interval="0" timeout="60" \
>    op stop interval="0" timeout="240" \
>    op monitor interval="20"
> primitive p_drbd_srv ocf:linbit:drbd \
>    params drbd_resource="srv" \
>    op start interval="0" timeout="60" \
>    op stop interval="0" timeout="240" \
>    op monitor interval="20"
> ms ms_drbd_home p_drbd_home \
>    meta master-max="1" master-node-max="1" \
>    clone-max="2" clone-node-max="1" notify="true"
> ms ms_drbd_software p_drbd_software \
>    meta master-max="1" master-node-max="1" \
>    clone-max="2" clone-node-max="1" notify="true"
> ms ms_drbd_srv p_drbd_srv \
>    meta master-max="1" master-node-max="1" \
>    clone-max="2" clone-node-max="1" notify="true"
> colocation co_ms_drbd_home_with_ms_drbd_srv_and_ms_drbd_software \
>    inf: ms_drbd_home:Master ms_drbd_srv:Master ms_drbd_software:Master
> 
> Questions:
> 
>   - is the "colocation" definition correct/enough?
> 
>   - how to enforce a sequence of events such as: promote software first,
>     then if everything went ok promote srv, then if everything went ok
>     promote home? (I would need this behavior because... see questions at
>     the end of point 2)
> 
> 2) Mounting srv, software and home fs + floating IP + dhcp server on the
>     node hosting all drbd devices in primary mode:
> 
> primitive p_fs_home ocf:heartbeat:Filesystem \
>    params device="/dev/drbd/by-res/home" \
>    directory="/share/drbd/nfs/home" fstype="ext4" \
>    options="noatime,usrjquota=aquota.user,grpjquota=aquota.group,jqfmt=vfsv0" \
>    op start interval="0" timeout="60" \
>    op stop interval="0" timeout="240" \
>    op monitor interval="20"
> primitive p_fs_software ocf:heartbeat:Filesystem \
>    params device="/dev/drbd/by-res/software" \
>    directory="/share/drbd/nfs/software" fstype="ext4" \
>    options="noatime" \
>    op start interval="0" timeout="60" \
>    op stop interval="0" timeout="240" \
>    op monitor interval="20"
> primitive p_fs_srv ocf:heartbeat:Filesystem \
>    params device="/dev/drbd/by-res/srv" \
>    directory="/share/drbd/nfs/srv" fstype="ext4" \
>    options="noatime" \
>    op start interval="0" timeout="60" \
>    op stop interval="0" timeout="240" \
>    op monitor interval="20"
> primitive p_ip_nfs ocf:heartbeat:IPaddr2 \
>    params ip="192.168.0.50" cidr_netmask="24" iflabel="nfs" \
>    op monitor interval="20"
> primitive p_service_isc-dhcp-server lsb:isc-dhcp-server \
>    op start interval="0" timeout="60" \
>    op stop interval="0" timeout="240" \
>    op monitor interval="20"
> group g_service_fs_ip_dhcp p_fs_srv p_fs_software p_fs_home \
>    p_ip_nfs p_service_isc-dhcp-server
> colocation co_ms_drbd_home_with_g_service_fs_ip_dhcp \
>    inf: g_service_fs_ip_dhcp ms_drbd_home:Master
> order o_g_service_fs_ip_dhcp_after_ms_drbd_home_promote \
>    inf: ms_drbd_home:promote g_service_fs_ip_dhcp:start
> 
> Questions:
> 
>   - If I know that home is the last drbd device promoted into
>     primary mode, then I'm ready to mount all fs, start the
>     floating IP and dhcp server on the node where drbd home is
>     in primary mode... are both colocation and order constraints
>     correct?
> 
> 3) nfs-server and quota-server must be started on both hosts
>     once all filesystems are mounted:
> 
> primitive p_service_nfs-common lsb:nfs-common \
>    op start interval="0" timeout="60" \
>    op stop interval="0" timeout="240" \
>    op monitor interval="20"
> primitive p_service_nfs-kernel-server lsb:nfs-kernel-server \
>    op start interval="0" timeout="60" \
>    op stop interval="0" timeout="240" \
>    op monitor interval="20"
> primitive p_service_quota lsb:quota \
>    op start interval="0" timeout="60" \
>    op stop interval="0" timeout="240" \
>    op monitor interval="20"
> group g_service_nfs_quota p_service_nfs-common \
>    p_service_nfs-kernel-server p_service_quota
> clone cl_g_service_nfs_quota g_service_nfs_quota
> order o_cl_g_service_nfs_quota_after_service_fs_ip_dhcp_start \
> inf: g_service_fs_ip_dhcp:start cl_g_service_nfs_quota
> 
> Questions:
> 
>    - Here I'm really lost... with this configuration my cluster
>      does not behave properly (many error messages) once I put
>      one of the two nodes in standby. Do you see anything weird
>      here?
> 
> ###
> 
> I can post the error messages but I'd first like to make sure that
> the cluster configuration is at least not that bad...
> 
> Thanks to all.
> 
> --matt
> 