[ClusterLabs Developers] OCF_RESKEY_CRM_meta_notify_active_* always empty

Jehan-Guillaume de Rorthais jgdr at dalibo.com
Mon Aug 1 16:18:01 UTC 2016


On Mon, 1 Aug 2016 10:27:53 -0500,
Ken Gaillot <kgaillot at redhat.com> wrote:

> On 07/29/2016 06:19 PM, Andrew Beekhof wrote:
> > Urgh. I must be confused with sles11. 
> > In any case, the first version of pacemaker was identical to the last
> > heartbeat crm. 
> > 
> > I don't recall the ocfs2 agent changing design while I was there, so 11 may
> > be broken too
> 
> I just realized *_active_* is only broken for master/slave clones.
> Filesystem is not master/slave, so it wouldn't have any issue.

Well, I'm glad we are the first RA using it :)

I wonder how other m/s RAs are doing without it. We use it (actually
"master + slave + start - stop", because of the bug) during a promotion
after a failover, to check whether the resource being promoted is the best
one among the known ones.
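
As a rough illustration of the "master + slave + start - stop" workaround
mentioned above, here is a minimal shell sketch. It assumes the notify
variables hold space-separated node names and that at most one instance runs
per node. The function name notify_active_unames is hypothetical, and this is
not the actual RA code (which lives in a Perl module).

```shell
#!/bin/sh
# Sketch only: approximate the nodes that should be active, computed as
# master + slave + start - stop from the notify variables.
# Assumption: each *_uname variable is a space-separated list of node names.
# The body runs in a subshell "( ... )" so helper variables do not leak.
notify_active_unames() (
    set -f  # no globbing while splitting the lists on whitespace

    # Candidates: nodes currently master or slave, plus nodes being started.
    candidates="${OCF_RESKEY_CRM_meta_notify_master_uname:-} ${OCF_RESKEY_CRM_meta_notify_slave_uname:-}"
    candidates="$candidates ${OCF_RESKEY_CRM_meta_notify_start_uname:-}"

    # Nodes being stopped, padded with spaces for whole-word matching.
    stopping=" ${OCF_RESKEY_CRM_meta_notify_stop_uname:-} "

    result=""
    for node in $candidates; do
        case "$stopping" in
            *" $node "*) continue ;;  # instance is being stopped
        esac
        case " $result " in
            *" $node "*) continue ;;  # node already listed
        esac
        result="${result:+$result }$node"
    done

    printf '%s\n' "$result"
)
```

With the values from the pre-demote example quoted further down (master on
hanode1, slaves on hanode3 and hanode2, nothing starting or stopping), this
would print "hanode1 hanode3 hanode2".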

> >> On 30 Jul 2016, at 8:51 AM, Ken Gaillot <kgaillot at redhat.com> wrote:
> >>
> >>> On 07/29/2016 05:41 PM, Andrew Beekhof wrote:
> >>>
> >>>> On 30 Jul 2016, at 8:32 AM, Ken Gaillot <kgaillot at redhat.com> wrote:
> >>>>
> >>>> I finally had time to investigate this, and it definitely is broken.
> >>>>
> >>>> The only existing heartbeat RA to use the *_notify_active_* variables is
> >>>> Filesystem, and it only does so for OCFS2 on SLES10, which didn't even
> >>>> ship pacemaker,
> >>>
> >>> I'm pretty sure it did
> >>
> >> All I could find was:
> >>
> >> "SLES 10 did not yet ship pacemaker, but heartbeat with the builtin crm"
> >>
> >> http://oss.clusterlabs.org/pipermail/pacemaker/2014-July/022232.html
> >>
> >> I'm sure people were compiling it, and ClusterLabs probably even
> >> provided a repo, but it looks like sles didn't ship it.
> >>
> >> The issue is that the code that builds the active list checks for role
> >> RSC_ROLE_STARTED rather than RSC_ROLE_SLAVE + RSC_ROLE_MASTER, so I
> >> don't think it ever would have worked.
> >>
> >>>
> >>>> so I'm guessing it's been broken from the beginning of
> >>>> pacemaker.
> >>>>
> >>>> The fix looks straightforward, so I should be able to take care of it
> >>>> soon.
> >>>>
> >>>> Filed bug http://bugs.clusterlabs.org/show_bug.cgi?id=5295
> >>>>
> >>>>> On 05/08/2016 04:57 AM, Jehan-Guillaume de Rorthais wrote:
> >>>>> On Fri, 6 May 2016 15:41:11 -0500,
> >>>>> Ken Gaillot <kgaillot at redhat.com> wrote:
> >>>>>
> >>>>>>> On 05/03/2016 05:30 PM, Jehan-Guillaume de Rorthais wrote:
> >>>>>>> On Tue, 3 May 2016 21:10:12 +0200,
> >>>>>>> Jehan-Guillaume de Rorthais <jgdr at dalibo.com> wrote:
> >>>>>>>
> >>>>>>>> On Mon, 2 May 2016 17:59:55 -0500,
> >>>>>>>> Ken Gaillot <kgaillot at redhat.com> wrote:
> >>>>>>>>
> >>>>>>>>>> On 04/28/2016 04:47 AM, Jehan-Guillaume de Rorthais wrote:
> >>>>>>>>>> Hello all,
> >>>>>>>>>>
> >>>>>>>>>> While testing and experimenting with our RA for PostgreSQL, I found
> >>>>>>>>>> that the meta_notify_active_* variables always seem to be empty.
> >>>>>>>>>> Here is an example of these variables as seen from our RA during a
> >>>>>>>>>> migration/switchover:
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> {
> >>>>>>>>>>   'type' => 'pre',
> >>>>>>>>>>   'operation' => 'demote',
> >>>>>>>>>>   'active' => [],
> >>>>>>>>>>   'inactive' => [],
> >>>>>>>>>>   'start' => [],
> >>>>>>>>>>   'stop' => [],
> >>>>>>>>>>   'demote' => [
> >>>>>>>>>>                 {
> >>>>>>>>>>                   'rsc' => 'pgsqld:1',
> >>>>>>>>>>                   'uname' => 'hanode1'
> >>>>>>>>>>                 }
> >>>>>>>>>>               ],
> >>>>>>>>>>
> >>>>>>>>>>   'master' => [
> >>>>>>>>>>                 {
> >>>>>>>>>>                   'rsc' => 'pgsqld:1',
> >>>>>>>>>>                   'uname' => 'hanode1'
> >>>>>>>>>>                 }
> >>>>>>>>>>               ],
> >>>>>>>>>>
> >>>>>>>>>>   'promote' => [
> >>>>>>>>>>                  {
> >>>>>>>>>>                    'rsc' => 'pgsqld:0',
> >>>>>>>>>>                    'uname' => 'hanode3'
> >>>>>>>>>>                  }
> >>>>>>>>>>                ],
> >>>>>>>>>>   'slave' => [
> >>>>>>>>>>                {
> >>>>>>>>>>                  'rsc' => 'pgsqld:0',
> >>>>>>>>>>                  'uname' => 'hanode3'
> >>>>>>>>>>                },
> >>>>>>>>>>                {
> >>>>>>>>>>                  'rsc' => 'pgsqld:2',
> >>>>>>>>>>                  'uname' => 'hanode2'
> >>>>>>>>>>                }
> >>>>>>>>>>              ],
> >>>>>>>>>>
> >>>>>>>>>> }
> >>>>>>>>>>
> >>>>>>>>>> In case this comes from our side, here is the code building this:
> >>>>>>>>>>
> >>>>>>>>>> https://github.com/dalibo/PAF/blob/6e86284bc647ef1e81f01f047f1862e40ba62906/lib/OCF_Functions.pm#L444
> >>>>>>>>>>
> >>>>>>>>>> But looking at the variable itself in debug logs, I always find it
> >>>>>>>>>> empty, in various situations (switchover, recover, failover).
> >>>>>>>>>>
> >>>>>>>>>> If I understand the documentation correctly, I would expect
> >>>>>>>>>> 'active' to list all three resources, shouldn't it? Currently,
> >>>>>>>>>> to work around this, we consider: active == master + slave
> >>>>>>>>>
> >>>>>>>>> You're right, it should. The pacemaker code that generates the
> >>>>>>>>> "active" variables is the same used for "demote" etc., so it seems
> >>>>>>>>> unlikely the issue is on pacemaker's side. Especially since your
> >>>>>>>>> code treats active etc. differently from demote etc., it seems like
> >>>>>>>>> it must be in there somewhere, but I don't see where.
> >>>>>>>>
> >>>>>>>> The code treats active, inactive, start and stop together, for any
> >>>>>>>> cloned resource. If the resource is multistate, it adds promote,
> >>>>>>>> demote, slave and master.
> >>>>>>>>
> >>>>>>>> Note that from this piece of code, the 7 other notify vars are set
> >>>>>>>> correctly: start, stop, inactive, promote, demote, slave, master.
> >>>>>>>> Only active is always missing.
> >>>>>>>>
> >>>>>>>> I'll investigate and try to find where the bug is hiding.
> >>>>>>>
> >>>>>>> So I added a piece of code to dump **all** the environment
> >>>>>>> variables to a temp file as early as possible **to avoid any
> >>>>>>> interaction with our perl module** in the code of the RA, i.e.:
> >>>>>>>
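> >>>>>>> BEGIN {
> >>>>>>>   use Time::HiRes qw(time);
> >>>>>>>   my $now = time;
> >>>>>>>   # Dump the raw environment before any other module can touch it.
> >>>>>>>   open(my $fh, ">", "/tmp/test-$now.env.txt")
> >>>>>>>     or die "cannot open env dump file: $!";
> >>>>>>>   printf($fh "%-20s = ''%s''\n", $_, $ENV{$_}) foreach sort keys %ENV;
> >>>>>>>   close $fh;
> >>>>>>> }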
> >>>>>>>
> >>>>>>> Then I started my cluster and set maintenance-mode=false while no
> >>>>>>> resources were running. So the debug files contain the probe
> >>>>>>> action, start on all nodes, one promote on the master and the first
> >>>>>>> monitors. The "*active" variables are always empty anywhere in the
> >>>>>>> cluster. Attached is the result of the following command on
> >>>>>>> the master node:
> >>>>>>>
> >>>>>>> for i in test-*; do echo "===== $i ====="; grep OCF_ "$i"; done > debug-env.txt
> >>>>>>>
> >>>>>>> I'm using Pacemaker 1.1.13-10.el7_2.2-44eb2dd under CentOS 7.2.1511.
> >>>>>>>
> >>>>>>> For completeness, I also included the Pacemaker configuration I use
> >>>>>>> for my 3-node dev/test cluster.
> >>>>>>>
> >>>>>>> Let me know if you think of further investigations or tests I could
> >>>>>>> run on this issue. I'm out of ideas for tonight (and I really would
> >>>>>>> prefer having this bug on my side).
> >>>>>>
> >>>>>> From your environment dumps, what I think is happening is that you are
> >>>>>> getting multiple notifications (start, pre-promote, post-promote) in a
> >>>>>> single cluster transition. So the variables reflect the initial state
> >>>>>> of that transition -- none of the instances are active, all three are
> >>>>>> being started (so the nodes are in the "*_start_*" variables), and one
> >>>>>> is being promoted.
> >>>>>
> >>>>>
> >>>>> Yes, this is what is happening here. It's embarrassing I didn't think
> >>>>> of that :)
> >>>>>
> >>>>>> The starts will be done before the promote. If one of the starts fails,
> >>>>>> the transition will be aborted, and a new one will be calculated. So,
> >>>>>> if you get to the promote, you can assume anything in "*_start_*" is
> >>>>>> now active.
> >>>>>
> >>>>> I did another simple test:
> >>>>>
> >>>>> * 3 ms clones are running on hanode1 hanode2 hanode3
> >>>>> * master role is on hanode1
> >>>>> * I move the master role to hanode2 using:
> >>>>>   "pcs resource move pgsql-ha hanode2 --master"
> >>>>>
> >>>>> The transition gives us:
> >>>>>
> >>>>> * demote on hanode1
> >>>>> * promote on hanode2
> >>>>>
> >>>>> I suppose all three clones on hanode1, hanode2 and hanode3 should
> >>>>> appear in the active env variables in this context, shouldn't they?
> >>>>>
> >>>>> Please find attached the environment dumps of this transition from
> >>>>> hanode1. You'll see that both
> >>>>> "OCF_RESKEY_CRM_meta_notify_active_resource" and
> >>>>> "OCF_RESKEY_CRM_meta_notify_active_uname" contain only one character:
> >>>>> a space.
> >>>>>
> >>>>> I started looking at the Pacemaker code, at least to get a better
> >>>>> understanding of where environment variables are set and when they are
> >>>>> available. I have had no luck so far, and I am short on time. Any
> >>>>> pointers would be appreciated :)
> >>>>>
> >>>>>>> On a side note, I noticed with these debug files that the notify
> >>>>>>> variables were also available outside of notify actions (start and
> >>>>>>> notify here). Are they always available during "transition
> >>>>>>> actions" (start, stop, promote, demote)? Looking at the mysql RA,
> >>>>>>> it uses OCF_RESKEY_CRM_meta_notify_master_uname during the
> >>>>>>> start action. So I suppose it's safe?
> >>>>>>
> >>>>>> Good question, I've never tried that before. I'm reluctant to say it's
> >>>>>> guaranteed; it's possible seeing them in the start action is a side
> >>>>>> effect of the current implementation and could theoretically change in
> >>>>>> the future. But if mysql is relying on it, I suppose it's
> >>>>>> well-established already, making changing it unlikely ...
> >>>>>
> >>>>> Thank you very much for this clarification. Presently we keep in a
> >>>>> private attribute what we //think// (we cannot rely on
> >>>>> active_uname :/) are the active unames for the ms resource. Since the
> >>>>> notify vars appearing outside of notify actions seems to be just a side
> >>>>> effect of the current implementation, I prefer to stay away from them
> >>>>> when we are not in a notify action and keep our current implementation.
> >>>>>
> >>>>> Thank you,



-- 
Jehan-Guillaume de Rorthais
Dalibo



