[ClusterLabs] DRBD + VDO HowTo?

Andrei Borzenkov arvidjaar at gmail.com
Tue May 18 04:19:14 EDT 2021


On Tue, May 18, 2021 at 10:41 AM Eric Robinson <eric.robinson at psmnv.com> wrote:
>
> > - check that device mapper target exists - otherwise no VDO is possible at all
> > - check that backing store device is visible - otherwise no VDO is possible
> > - and only then possibly call vdo tool to check actual status
> >
>
> Sorry, it is not clear to me what you mean by device mapper target. How would I check for the existence of the device mapper target for vdo0?
>

dmsetup targets
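
As a rough sketch, an agent could wrap that check as shown here. The
function name has_dm_target is made up for illustration, and the sketch
assumes that the VDO target is registered under the name "vdo" and that
"dmsetup targets" prints one target per line with the name in the first
column.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the named device-mapper target is registered.
# Assumes "dmsetup targets" prints lines such as "vdo   v6.2.3".
has_dm_target() {
    local target=$1
    # awk exits 0 only if some line has the target name in its first column.
    dmsetup targets 2>/dev/null |
        awk -v t="$target" '$1 == t { found = 1 } END { exit !found }'
}

# Example use inside an agent:
#   has_dm_target vdo || echo "vdo target not available on this node"
```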

But really, the simplest approach would be to use a systemd service. Then you
do not really need to monitor anything: the resource is assumed to be
active once the service has been started. That is enough to quickly get it
going.
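
A per-volume unit could be generated roughly as follows. This is a
sketch under assumptions: the unit name vdo-vol@.service, the helper
write_vdo_unit and the target directory are invented for illustration,
and the only vdo calls used are "vdo start -n" and "vdo stop -n" as in
the script quoted below. Type=oneshot with RemainAfterExit=yes makes
systemd treat the unit as active after a successful start.

```shell
#!/bin/bash
# Hypothetical helper: write a systemd template unit that starts/stops ONE
# VDO volume (the instance name, %i) instead of all volumes on the node.
write_vdo_unit() {
    local dir=${1:-/etc/systemd/system}
    # No [Install] section: the unit is meant to be started by the cluster only.
    cat > "$dir/vdo-vol@.service" <<'EOF'
[Unit]
Description=VDO volume %i (started by the cluster, not at boot)

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/bin/vdo start -n %i
ExecStop=/usr/bin/vdo stop -n %i
EOF
}
```

After "systemctl daemon-reload", an instance such as vdo-vol@vdo0 should
be usable as a systemd-class resource in the cluster; ordering and
colocation with the DRBD promotion still have to be configured
separately.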

>
> > -----Original Message-----
> > From: Users <users-bounces at clusterlabs.org> On Behalf Of Andrei
> > Borzenkov
> > Sent: Tuesday, May 18, 2021 12:22 AM
> > To: users at clusterlabs.org
> > Subject: Re: [ClusterLabs] DRBD + VDO HowTo?
> >
> > On 17.05.2021 21:28, Eric Robinson wrote:
> > > Andrei --
> > >
> > > Sorry for the novels. Sometimes it is hard to tell whether people want all
> > the configs, logs, and scripts first, or if they want a description of the problem
> > and what one is trying to accomplish first. I'll send whatever you want. I am
> > very eager to get to the bottom of this.
> > >
> > > I'll start with my custom LSB RA. I can send the Pacemaker config a bit later.
> > >
> > > [root at ha09a init.d]# ll|grep vdo
> > > lrwxrwxrwx. 1 root root     9 May 16 10:28 vdo0 -> vdo_multi
> > > lrwxrwxrwx. 1 root root     9 May 16 10:28 vdo1 -> vdo_multi
> > > -rwx------. 1 root root  3623 May 16 13:21 vdo_multi
> > >
> > > [root at ha09a init.d]#  cat vdo_multi
> > > #!/bin/bash
> > >
> > > #--custom script for managing vdo volumes
> > >
> > > #--functions
> > > function isActivated() {
> > >         R=$(/usr/bin/vdo status -n $VOL 2>&1)
> > >         if [ $? -ne 0 ]; then
> > >                 #--error occurred checking vdo status
> > >                 echo "$VOL: an error occurred checking activation status on $MY_HOSTNAME"
> > >                 return 1
> > >         fi
> > >         R=$(/usr/bin/vdo status -n $VOL|grep Activate|awk '{$1=$1};1'|cut -d" " -f2)
> > >         echo "$R"
> > >         return 0
> > > }
> > >
> > > function isOnline() {
> > >         R=$(/usr/bin/vdo status -n $VOL 2>&1)
> > >         if [ $? -ne 0 ]; then
> > >                 #--error occurred checking vdo status
> > >                 echo "$VOL: an error occurred checking activation status on $MY_HOSTNAME"
> > >                 return 1
> > >         fi
> > >         R=$(/usr/bin/vdo status -n $VOL|grep "Index status"|awk '{$1=$1};1'|cut -d" " -f3)
> > >         echo "$R"
> > >         return 0
> > > }
> > >
> > > #--vars
> > > MY_HOSTNAME=$(hostname -s)
> > >
> > > #--get the volume name
> > > VOL=$(basename $0)
> > >
> > > #--get the action
> > > ACTION=$1
> > >
> > > #--take the requested action
> > > case $ACTION in
> > >
> > >         start)
> > >
> > >                 #--check current status
> > >                 R=$(isOnline "$VOL")
> > >                 if [ $? -ne 0 ]; then
> > >                         echo "error occurred checking $VOL status on $MY_HOSTNAME"
> > >                         exit 0
> > >                 fi
> > >                 if [ "$R"  == "online" ]; then
> > >                         echo "running on $MY_HOSTNAME"
> > >                         exit 0 #--lsb: success
> > >                 fi
> > >
> > >                 #--enter activation loop
> > >                 ACTIVATED=no
> > >                 TIMER=15
> > >                 while [ $TIMER -ge 0 ]; do
> > >                         R=$(isActivated "$VOL")
> > >                         if [ "$R" == "enabled" ]; then
> > >                                 ACTIVATED=yes
> > >                                 break
> > >                         fi
> > >                         sleep 1
> > >                         TIMER=$(( TIMER-1 ))
> > >                 done
> > >                 if [ "$ACTIVATED" == "no" ]; then
> > >                         echo "$VOL: not activated on $MY_HOSTNAME"
> > >                         exit 5 #--lsb: not running
> > >                 fi
> > >
> > >                 #--enter start loop
> > >                 /usr/bin/vdo start -n $VOL
> > >                 ONLINE=no
> > >                 TIMER=15
> > >                 while [ $TIMER -ge 0 ]; do
> > >                         R=$(isOnline "$VOL")
> > >                         if [ "$R" == "online" ]; then
> > >                                 ONLINE=yes
> > >                                 break
> > >                         fi
> > >                         sleep 1
> > >                         TIMER=$(( TIMER-1 ))
> > >                 done
> > >                 if [ "$ONLINE" == "yes" ]; then
> > >                         echo "$VOL: started on $MY_HOSTNAME"
> > >                         exit 0 #--lsb: success
> > >                 else
> > >                         echo "$VOL: not started on $MY_HOSTNAME (unknown problem)"
> > >                         exit 0 #--lsb: unknown problem
> > >                 fi
> > >                 ;;
> > >         stop)
> > >
> > >                 #--check current status
> > >                 R=$(isOnline "$VOL")
> > >                 if [ $? -ne 0 ]; then
> > >                         echo "error occurred checking $VOL status on $MY_HOSTNAME"
> > >                         exit 0
> > >                 fi
> > >
> > >                 if [ "$R" == "not" ]; then
> > >                         echo "not started on $MY_HOSTNAME"
> > >                         exit 0 #--lsb: success
> > >                 fi
> > >
> > >                 #--enter stop loop
> > >                 /usr/bin/vdo stop -n $VOL
> > >                 ONLINE=yes
> > >                 TIMER=15
> > >                 while [ $TIMER -ge 0 ]; do
> > >                         R=$(isOnline "$VOL")
> > >                         if [ "$R" == "not" ]; then
> > >                                 ONLINE=no
> > >                                 break
> > >                         fi
> > >                         sleep 1
> > >                         TIMER=$(( TIMER-1 ))
> > >                 done
> > >                 if [ "$ONLINE" == "no" ]; then
> > >                         echo "$VOL: stopped on $MY_HOSTNAME"
> > >                         exit 0 #--lsb:success
> > >                 else
> > >                         echo "$VOL: failed to stop on $MY_HOSTNAME (unknown problem)"
> > >                         exit 0
> > >                 fi
> > >                 ;;
> > >         status)
> > >                 R=$(isOnline "$VOL")
> > >                 if [ $? -ne 0 ]; then
> > >                         echo "error occurred checking $VOL status on $MY_HOSTNAME"
> > >                         exit 5
> > >                 fi
> >
> > Smoke test
> >
> > $  vdo status -n foo
> > vdo: ERROR - VDO volume foo not found
> > $ echo $?
> > 7
> > $
> >
> > And why, pray, is this an error? This quite clearly tells you that the requested
> > volume does not exist, so it cannot be active.
> >
> > The task of the monitor action is to return "active" or "not active". The only case
> > where it is appropriate to return another error is when the monitor action cannot
> > reliably decide whether the resource is active or not. That is not the case in my trivial
> > example, so the error indication is wrong.
> >
> > As long as you intend to stick with standard vdo tool, "vdo list" seems to be
> > more appropriate in monitor action. It gives you exactly "a list of started VDO
> > volumes" which is what you are interested in.
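
A status check along those lines might look like the sketch below. The
function name vdo_status is hypothetical, and the sketch assumes that
"vdo list" prints one started volume name per line. Only a failure to
obtain the list at all is treated as "unknown"; a volume missing from the
list is simply "not running".

```shell
#!/bin/bash
# Hypothetical LSB-style status check built on "vdo list".
# Returns 0 if the volume is started, 3 if it is not (LSB "not running"),
# and 4 (LSB "status unknown") only if the list itself cannot be obtained.
vdo_status() {
    local vol=$1 started
    started=$(vdo list 2>/dev/null) || return 4
    # -F: fixed string, -x: whole line, so "vdo1" does not match "vdo10".
    if printf '%s\n' "$started" | grep -Fxq -- "$vol"; then
        return 0
    fi
    return 3
}
```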
> >
> > >                 if [ "$R"  == "online" ]; then
> > >                         echo "$VOL started on $MY_HOSTNAME"
> > >                         exit 0 #--lsb: success
> > >                 else
> > >                         echo "$VOL not started on $MY_HOSTNAME"
> > >                         exit 3 #--lsb: not running
> > >                 fi
> > >                 ;;
> > >
> >
> > Are you sure it is the correct test in general? This reports whether
> > deduplication is enabled or disabled. It is not the same as whether device is
> > started. Device is started when corresponding device-mapper table is
> > created. Deduplication may be enabled or disabled later without actually
> > stopping it.
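
To test for the device-mapper table rather than the deduplication state,
something like the following could be used. It is a sketch: the function
name dm_vdo_active is invented, and it assumes that the target type is
called "vdo" and that the first field of each "dmsetup ls" output line is
the device name.

```shell
#!/bin/bash
# Hypothetical check: is there a device-mapper device named $1 that has a
# "vdo" target loaded? Independent of whether deduplication is enabled.
dm_vdo_active() {
    local vol=$1
    # "dmsetup ls --target vdo" lists only devices with at least one vdo target.
    dmsetup ls --target vdo 2>/dev/null |
        awk -v v="$vol" '$1 == v { found = 1 } END { exit !found }'
}
```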
> >
> > But note that vdo *tool* is not really suitable for using in resource agent for
> > at least two reasons
> >
> > - it relies on existing configuration files and does not really scan system for
> > VDO devices. "VDO volume foo not found" really means "not found in
> > configuration file". At the same time the vdo tool accepts a configuration file as
> > argument. Which means VDO device may have been started using different
> > configuration file already and your vdo invocation won't be aware of it. Also
> > these configuration files are silently updated by vdo invocation which
> > immediately means they may differ on cluster nodes.
> >
> > - it does not provide fine grained error indication to understand what
> > happens.
> >
> > vdo resource agent would
> >
> > - check that device mapper target exists - otherwise no VDO is possible at all
> > - check that backing store device is visible - otherwise no VDO is possible
> > - and only then possibly call vdo tool to check actual status
> >
> > And maybe it should use a private configuration file and actually import the VDO
> > from the device before starting
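
The order of checks listed above could be sketched as follows. All names
(vdo_monitor, backing_visible) are hypothetical, the target name "vdo"
and the one-name-per-line output of "vdo list" are assumptions, and the
backing-store test is deliberately naive: on a DRBD Secondary the device
node exists but cannot be opened, so a real agent would probably need a
stronger test there. Each failed precondition is reported as "not
running", not as an error.

```shell
#!/bin/bash
# Hypothetical layered monitor returning LSB status codes:
# 0 = running, 3 = not running, 4 = status unknown.
backing_visible() { [ -b "$1" ]; }   # assumption: a block-device node is enough

vdo_monitor() {
    local vol=$1 backing=$2 started
    # 1. No vdo device-mapper target registered: no VDO volume can be active.
    dmsetup targets 2>/dev/null |
        awk '$1 == "vdo" { f = 1 } END { exit !f }' || return 3
    # 2. Backing store not visible on this node: the volume cannot be active here.
    backing_visible "$backing" || return 3
    # 3. Only now ask the vdo tool which volumes are started.
    started=$(vdo list 2>/dev/null) || return 4
    printf '%s\n' "$started" | grep -Fxq -- "$vol" && return 0
    return 3
}
```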
> >
> > > esac
> > >
> > >
> > >
> > >> -----Original Message-----
> > >> From: Users <users-bounces at clusterlabs.org> On Behalf Of Andrei
> > >> Borzenkov
> > >> Sent: Monday, May 17, 2021 12:49 PM
> > >> To: users at clusterlabs.org
> > >> Subject: Re: [ClusterLabs] DRBD + VDO HowTo?
> > >>
> > >> On 17.05.2021 18:18, Eric Robinson wrote:
> > >>> To Strahil and Klaus –
> > >>>
> > >>> I created the vdo devices using default parameters, so ‘auto’ mode
> > >>> was
> > >> selected by default. vdostatus shows that the current mode is async.
> > >> The underlying drbd devices are running protocol C, so I assume that
> > >> vdo should be changed to sync mode?
> > >>>
> > >>> The VDO service is disabled and is solely under the control of
> > >>> Pacemaker,
> > >> but I have been unable to get a resource agent to work reliably. I
> > >> have two nodes. Under normal operation, Node A is primary for disk
> > >> drbd0, and device
> > >> vdo0 rides on top of that. Node B is primary for disk drbd1 and
> > >> device vdo1 rides on top of that. In the event of a node failure, the
> > >> vdo device and the underlying drbd disk should migrate to the other
> > >> node, and then that node will be primary for both drbd disks and both vdo
> > devices.
> > >>>
> > >>> The default systemd vdo service does not work because it uses the
> > >>> --all flag
> > >> and starts/stops all vdo devices. I noticed that there is also a
> > >> vdo-start-by-dev.service, but there is no documentation on how to
> > >> use it. I wrote my own vdo-by-dev system service, but that did not
> > >> work reliably either. Then I noticed that there is already an OCF
> > >> resource agent named vdo-vol, but that did not work either. I finally
> > >> tried writing my own OCF-compliant RA, and then I tried writing an
> > >> LSB-compliant script, but none of those worked very well.
> > >>>
> > >>
> > >> You continue to write novels instead of simply showing your resource
> > >> agent, your configuration and logs.
> > >>
> > >>> My big problem is that I don’t understand how Pacemaker uses the
> > >> monitor action. Pacemaker would often fail vdo resources because the
> > >> monitor action received an error when it ran on the standby node. For
> > >> example, when Node A is primary for disk drbd1 and device vdo1,
> > >> Pacemaker would fail device vdo1 because when it ran the monitor
> > >> action on Node B, the RA reported an error. But OF COURSE it would
> > >> report an error, because disk drbd1 is secondary on that node, and is
> > >> therefore inaccessible to the vdo driver. I DON’T UNDERSTAND.
> > >>>
> > >>
> > >> Maybe your definition of "error" does not match Pacemaker's definition
> > >> of "error". It is hard to comment without seeing code.
> > >>
> > >>> -Eric
> > >>>
> > >>>
> > >>>
> > >>> From: Strahil Nikolov <hunter86_bg at yahoo.com>
> > >>> Sent: Monday, May 17, 2021 5:09 AM
> > >>> To: kwenning at redhat.com; Klaus Wenninger <kwenning at redhat.com>;
> > >>> Cluster Labs - All topics related to open-source clustering welcomed
> > >>> <users at clusterlabs.org>; Eric Robinson <eric.robinson at psmnv.com>
> > >>> Subject: Re: [ClusterLabs] DRBD + VDO HowTo?
> > >>>
> > >>> Have you tried to set VDO in async mode ?
> > >>>
> > >>> Best Regards,
> > >>> Strahil Nikolov
> > >>> On Mon, May 17, 2021 at 8:57, Klaus Wenninger
> > >>> <kwenning at redhat.com<mailto:kwenning at redhat.com>> wrote:
> > >>> Did you try VDO in sync-mode for the case the flush-fua stuff isn't
> > >>> working through the layers?
> > >>> Did you check that VDO-service is disabled and solely under
> > >>> pacemaker-control and that the dependencies are set correctly?
> > >>>
> > >>> Klaus
> > >>>
> > >>> On 5/17/21 6:17 AM, Eric Robinson wrote:
> > >>>
> > >>> Yes, DRBD is working fine.
> > >>>
> > >>>
> > >>>
> > >>> From: Strahil Nikolov
> > >>> <hunter86_bg at yahoo.com><mailto:hunter86_bg at yahoo.com>
> > >>> Sent: Sunday, May 16, 2021 6:06 PM
> > >>> To: Eric Robinson
> > >>> <eric.robinson at psmnv.com><mailto:eric.robinson at psmnv.com>;
> > Cluster
> > >>> Labs - All topics related to open-source clustering welcomed
> > >>> <users at clusterlabs.org><mailto:users at clusterlabs.org>
> > >>> Subject: RE: [ClusterLabs] DRBD + VDO HowTo?
> > >>>
> > >>>
> > >>>
> > >>> Are you sure that the DRBD is working properly ?
> > >>>
> > >>>
> > >>>
> > >>> Best Regards,
> > >>>
> > >>> Strahil Nikolov
> > >>>
> > >>> On Mon, May 17, 2021 at 0:32, Eric Robinson
> > >>>
> > >>> <eric.robinson at psmnv.com<mailto:eric.robinson at psmnv.com>>
> > wrote:
> > >>>
> > >>> Okay, it turns out I was wrong. I thought I had it working, but I
> > >>> keep running
> > >> into problems. Sometimes when I demote a DRBD resource on Node A
> > and
> > >> promote it on Node B, and I try to mount the filesystem, the system
> > >> complains that it cannot read the superblock. But when I move the
> > >> DRBD primary back to Node A, the file system is mountable again.
> > >> Also, I have problems with filesystems not mounting because the vdo
> > >> devices are not present. All kinds of issues.
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> From: Users
> > >>> <users-bounces at clusterlabs.org<mailto:users-
> > bounces at clusterlabs.org>
> > >>> >
> > >>> On Behalf Of Eric Robinson
> > >>> Sent: Friday, May 14, 2021 3:55 PM
> > >>> To: Strahil Nikolov
> > >>> <hunter86_bg at yahoo.com<mailto:hunter86_bg at yahoo.com>>;
> > Cluster
> > >> Labs -
> > >>> All topics related to open-source clustering welcomed
> > >>> <users at clusterlabs.org<mailto:users at clusterlabs.org>>
> > >>> Subject: Re: [ClusterLabs] DRBD + VDO HowTo?
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> Okay, I have it working now. The default systemd service definitions
> > >>> did
> > >> not work, so I created my own.
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> From: Strahil Nikolov
> > >>> <hunter86_bg at yahoo.com<mailto:hunter86_bg at yahoo.com>>
> > >>> Sent: Friday, May 14, 2021 3:41 AM
> > >>> To: Eric Robinson
> > >>> <eric.robinson at psmnv.com<mailto:eric.robinson at psmnv.com>>;
> > Cluster
> > >>> Labs - All topics related to open-source clustering welcomed
> > >>> <users at clusterlabs.org<mailto:users at clusterlabs.org>>
> > >>> Subject: RE: [ClusterLabs] DRBD + VDO HowTo?
> > >>>
> > >>>
> > >>>
> > >>> There is no VDO RA according to my knowledge, but you can use
> > >>> systemd
> > >> service as a resource.
> > >>>
> > >>>
> > >>>
> > >>> Yet, the VDO service that comes with the OS is a generic one and
> > >>> controls
> > >> all VDOs - so you need to create your own vdo service.
> > >>>
> > >>>
> > >>>
> > >>> Best Regards,
> > >>>
> > >>> Strahil Nikolov
> > >>>
> > >>> On Fri, May 14, 2021 at 6:55, Eric Robinson
> > >>>
> > >>> <eric.robinson at psmnv.com<mailto:eric.robinson at psmnv.com>>
> > wrote:
> > >>>
> > >>> I created the VDO volumes fine on the drbd devices, formatted them
> > >>> as xfs
> > >> filesystems, created cluster filesystem resources, and the cluster is
> > >> using them. But the cluster won’t fail over. Is there a VDO cluster
> > >> RA out there somewhere already?
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> From: Strahil Nikolov
> > >>> <hunter86_bg at yahoo.com<mailto:hunter86_bg at yahoo.com>>
> > >>> Sent: Thursday, May 13, 2021 10:07 PM
> > >>> To: Cluster Labs - All topics related to open-source clustering
> > >>> welcomed <users at clusterlabs.org<mailto:users at clusterlabs.org>>; Eric
> > >>> Robinson
> > >> <eric.robinson at psmnv.com<mailto:eric.robinson at psmnv.com>>
> > >>> Subject: Re: [ClusterLabs] DRBD + VDO HowTo?
> > >>>
> > >>>
> > >>>
> > >>> For DRBD there is enough info, so let's focus on VDO.
> > >>>
> > >>> There is a systemd service that starts all VDOs on the system. You
> > >>> can
> > >> create the VDO once drbd is open for writes and then you can create
> > >> your own systemd '.service' file which can be used as a cluster resource.
> > >>>
> > >>>
> > >>> Best Regards,
> > >>>
> > >>> Strahil Nikolov
> > >>>
> > >>>
> > >>>
> > >>> On Fri, May 14, 2021 at 2:33, Eric Robinson
> > >>>
> > >>> <eric.robinson at psmnv.com<mailto:eric.robinson at psmnv.com>>
> > wrote:
> > >>>
> > >>> Can anyone point to a document on how to use VDO de-duplication with
> > >> DRBD? Linbit has a blog page about it, but it was last updated 6
> > >> years ago and the embedded links are dead.
> > >>>
> > >>>
> > >>>
> > >>> https://linbit.com/blog/albireo-virtual-data-optimizer-vdo-on-drbd/
> > >>>
> > >>>
> > >>>
> > >>> -Eric
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> Disclaimer : This email and any files transmitted with it are
> > >>> confidential and
> > >> intended solely for intended recipients. If you are not the named
> > >> addressee you should not disseminate, distribute, copy or alter this
> > >> email. Any views or opinions presented in this email are solely those
> > >> of the author and might not represent those of Physician Select
> > >> Management. Warning: Although Physician Select Management has taken
> > >> reasonable precautions to ensure no viruses are present in this
> > >> email, the company cannot accept responsibility for any loss or damage
> > arising from the use of this email or attachments.
> > >>>
> > >>> _______________________________________________
> > >>> Manage your subscription:
> > >>> https://lists.clusterlabs.org/mailman/listinfo/users
> > >>>
> > >>> ClusterLabs home: https://www.clusterlabs.org/
> > >>>
> > >>>
> > >>
> > >
> >

