[ClusterLabs Developers] OCF_RESKEY_CRM_meta_notify_active_* always empty
Ken Gaillot
kgaillot at redhat.com
Mon Aug 1 17:00:24 UTC 2016
On 08/01/2016 11:18 AM, Jehan-Guillaume de Rorthais wrote:
> On Mon, 1 Aug 2016 10:27:53 -0500,
> Ken Gaillot <kgaillot at redhat.com> wrote:
>
>> On 07/29/2016 06:19 PM, Andrew Beekhof wrote:
>>> Urgh. I must be confused with sles11.
>>> In any case, the first version of pacemaker was identical to the last
>>> heartbeat crm.
>>>
>>> I don't recall the ocfs2 agent changing design while I was there, so 11 may
>>> be broken too
>>
>> I just realized *_active_* is only broken for master/slave clones.
>> Filesystem is not master/slave, so it wouldn't have any issue.
>
> Well, I'm glad we are the first RA using it :)
>
> I wonder how other m/s RAs are doing without it. We are using it (actually
> "master + slave + start - stop" because of the bug) to check, during a
> promotion after a failover, whether the resource being promoted is the best
> one among the known ones.
Yes, the simple workaround is to set active = master + slave; that's all
the pacemaker fix will do. You'll still need the "+ start - stop" to get
the situation after the action.
Coincidentally, we need to bump crm_feature_set to 3.0.11 anyway, so
you'll be able to test that to tell whether *_active_* is correct, if
desired. There is an ocf_version_cmp function in ocf-shellfuncs.
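A minimal shell sketch of that workaround, assuming the usual space-separated *_uname notify variables and one clone instance per node. The helper name notify_active_after is hypothetical and is not part of ocf-shellfuncs; treat this as an illustration, not as what the pacemaker fix does internally.

```shell
#!/bin/sh
# Sketch only: approximate the post-action "active" node list for a
# master/slave clone as master + slave + start - stop.
# notify_active_after is a hypothetical helper, not part of ocf-shellfuncs.
notify_active_after() {
    stopping=" ${OCF_RESKEY_CRM_meta_notify_stop_uname:-} "
    result=""
    for node in ${OCF_RESKEY_CRM_meta_notify_master_uname:-} \
                ${OCF_RESKEY_CRM_meta_notify_slave_uname:-} \
                ${OCF_RESKEY_CRM_meta_notify_start_uname:-}; do
        # Drop nodes whose instance is stopped by this transition.
        case "$stopping" in *" $node "*) continue ;; esac
        # Drop duplicates.
        case " $result " in *" $node "*) continue ;; esac
        result="${result:+$result }$node"
    done
    echo "$result"
}

# Example with the values from the dump quoted in this thread.
OCF_RESKEY_CRM_meta_notify_master_uname="hanode1"
OCF_RESKEY_CRM_meta_notify_slave_uname="hanode3 hanode2"
OCF_RESKEY_CRM_meta_notify_start_uname=""
OCF_RESKEY_CRM_meta_notify_stop_uname=""
notify_active_after    # prints: hanode1 hanode3 hanode2
```

Working on node names rather than resource instance names keeps the comparison simple, since instance numbering (pgsqld:0, pgsqld:1, ...) is not needed to decide which nodes hold an active instance.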
>
>>>> On 30 Jul 2016, at 8:51 AM, Ken Gaillot <kgaillot at redhat.com> wrote:
>>>>
>>>>> On 07/29/2016 05:41 PM, Andrew Beekhof wrote:
>>>>>
>>>>>
>>>>>> On 30 Jul 2016, at 8:32 AM, Ken Gaillot <kgaillot at redhat.com> wrote:
>>>>>>
>>>>>> I finally had time to investigate this, and it definitely is broken.
>>>>>>
>>>>>> The only existing heartbeat RA to use the *_notify_active_* variables is
>>>>>> Filesystem, and it only does so for OCFS2 on SLES10, which didn't even
>>>>>> ship pacemaker,
>>>>>
>>>>> I'm pretty sure it did
>>>>
>>>> All I could find was:
>>>>
>>>> "SLES 10 did not yet ship pacemaker, but heartbeat with the builtin crm"
>>>>
>>>> http://oss.clusterlabs.org/pipermail/pacemaker/2014-July/022232.html
>>>>
>>>> I'm sure people were compiling it, and ClusterLabs probably even
>>>> provided a repo, but it looks like sles didn't ship it.
>>>>
>>>> The issue is that the code that builds the active list checks for role
>>>> RSC_ROLE_STARTED rather than RSC_ROLE_SLAVE + RSC_ROLE_MASTER, so I
>>>> don't think it ever would have worked.
>>>>
>>>>>
>>>>>> so I'm guessing it's been broken from the beginning of
>>>>>> pacemaker.
>>>>>>
>>>>>> The fix looks straightforward, so I should be able to take care of it
>>>>>> soon.
>>>>>>
>>>>>> Filed bug http://bugs.clusterlabs.org/show_bug.cgi?id=5295
>>>>>>
>>>>>>> On 05/08/2016 04:57 AM, Jehan-Guillaume de Rorthais wrote:
>>>>>>> On Fri, 6 May 2016 15:41:11 -0500,
>>>>>>> Ken Gaillot <kgaillot at redhat.com> wrote:
>>>>>>>
>>>>>>>>> On 05/03/2016 05:30 PM, Jehan-Guillaume de Rorthais wrote:
>>>>>>>>> On Tue, 3 May 2016 21:10:12 +0200,
>>>>>>>>> Jehan-Guillaume de Rorthais <jgdr at dalibo.com> wrote:
>>>>>>>>>
>>>>>>>>>> On Mon, 2 May 2016 17:59:55 -0500,
>>>>>>>>>> Ken Gaillot <kgaillot at redhat.com> wrote:
>>>>>>>>>>
>>>>>>>>>>>> On 04/28/2016 04:47 AM, Jehan-Guillaume de Rorthais wrote:
>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>
>>>>>>>>>>>> While testing and experimenting with our RA for PostgreSQL, I found
>>>>>>>>>>>> that the meta_notify_active_* variables always seem to be empty.
>>>>>>>>>>>> Here is an example of these variables as seen from our RA during a
>>>>>>>>>>>> migration/switchover:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> {
>>>>>>>>>>>>   'type' => 'pre',
>>>>>>>>>>>>   'operation' => 'demote',
>>>>>>>>>>>>   'active' => [],
>>>>>>>>>>>>   'inactive' => [],
>>>>>>>>>>>>   'start' => [],
>>>>>>>>>>>>   'stop' => [],
>>>>>>>>>>>>   'demote' => [
>>>>>>>>>>>>     {
>>>>>>>>>>>>       'rsc' => 'pgsqld:1',
>>>>>>>>>>>>       'uname' => 'hanode1'
>>>>>>>>>>>>     }
>>>>>>>>>>>>   ],
>>>>>>>>>>>>   'master' => [
>>>>>>>>>>>>     {
>>>>>>>>>>>>       'rsc' => 'pgsqld:1',
>>>>>>>>>>>>       'uname' => 'hanode1'
>>>>>>>>>>>>     }
>>>>>>>>>>>>   ],
>>>>>>>>>>>>   'promote' => [
>>>>>>>>>>>>     {
>>>>>>>>>>>>       'rsc' => 'pgsqld:0',
>>>>>>>>>>>>       'uname' => 'hanode3'
>>>>>>>>>>>>     }
>>>>>>>>>>>>   ],
>>>>>>>>>>>>   'slave' => [
>>>>>>>>>>>>     {
>>>>>>>>>>>>       'rsc' => 'pgsqld:0',
>>>>>>>>>>>>       'uname' => 'hanode3'
>>>>>>>>>>>>     },
>>>>>>>>>>>>     {
>>>>>>>>>>>>       'rsc' => 'pgsqld:2',
>>>>>>>>>>>>       'uname' => 'hanode2'
>>>>>>>>>>>>     }
>>>>>>>>>>>>   ],
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> In case this comes from our side, here is the code that builds this:
>>>>>>>>>>>>
>>>>>>>>>>>> https://github.com/dalibo/PAF/blob/6e86284bc647ef1e81f01f047f1862e40ba62906/lib/OCF_Functions.pm#L444
>>>>>>>>>>>>
>>>>>>>>>>>> But looking at the variable itself in debug logs, I always find it
>>>>>>>>>>>> empty, in various situations (switchover, recover, failover).
>>>>>>>>>>>>
>>>>>>>>>>>> If I understand the documentation correctly, 'active' should list
>>>>>>>>>>>> all three resources, shouldn't it? Currently, to work around
>>>>>>>>>>>> this, we consider: active == master + slave
>>>>>>>>>>>
>>>>>>>>>>> You're right, it should. The pacemaker code that generates the
>>>>>>>>>>> "active" variables is the same used for "demote" etc., so it seems
>>>>>>>>>>> unlikely the issue is on pacemaker's side. Especially since your
>>>>>>>>>>> code treats active etc. differently from demote etc., it seems like
>>>>>>>>>>> it must be in there somewhere, but I don't see where.
>>>>>>>>>>
>>>>>>>>>> The code treats active, inactive, start and stop the same way, for
>>>>>>>>>> any cloned resource. If the resource is multistate, it adds
>>>>>>>>>> promote, demote, slave and master.
>>>>>>>>>>
>>>>>>>>>> Note that from this piece of code, the 7 other notify vars are set
>>>>>>>>>> correctly: start, stop, inactive, promote, demote, slave, master.
>>>>>>>>>> Only active is always missing.
>>>>>>>>>>
>>>>>>>>>> I'll investigate and try to find where the bug is hiding.
>>>>>>>>>
>>>>>>>>> So I added a piece of code to dump **all** the environment
>>>>>>>>> variables to a temp file as early as possible **to avoid any
>>>>>>>>> interaction with our perl module** in the code of the RA, i.e.:
>>>>>>>>>
>>>>>>>>> BEGIN {
>>>>>>>>>     use Time::HiRes qw(time);
>>>>>>>>>     my $now = time;
>>>>>>>>>     open my $fh, ">", "/tmp/test-$now.env.txt";
>>>>>>>>>     printf($fh "%-20s = ''%s''\n", $_, $ENV{$_}) foreach sort keys %ENV;
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> Then I started my cluster and set maintenance-mode=false while no
>>>>>>>>> resources were running. So the debug files contain the probe
>>>>>>>>> action, the start on all nodes, one promote on the master and the
>>>>>>>>> first monitors. The "*active" variables are always empty anywhere in
>>>>>>>>> the cluster. Attached is the result of the following command on
>>>>>>>>> the master node:
>>>>>>>>>
>>>>>>>>> for i in test-*; do echo "===== $i ====="; grep OCF_ "$i"; done > debug-env.txt
>>>>>>>>>
>>>>>>>>> I'm using Pacemaker 1.1.13-10.el7_2.2-44eb2dd under CentOS 7.2.1511.
>>>>>>>>>
>>>>>>>>> For completeness, I added the Pacemaker configuration I use for my
>>>>>>>>> 3-node dev/test cluster.
>>>>>>>>>
>>>>>>>>> Let me know if you think of any further investigation or tests I
>>>>>>>>> could run on this issue. I'm out of ideas for tonight (and I really
>>>>>>>>> would prefer this bug to be on my side).
>>>>>>>>
>>>>>>>> From your environment dumps, what I think is happening is that you are
>>>>>>>> getting multiple notifications (start, pre-promote, post-promote) in a
>>>>>>>> single cluster transition. So the variables reflect the initial state
>>>>>>>> of that transition -- none of the instances are active, all three are
>>>>>>>> being started (so the nodes are in the "*_start_*" variables), and one
>>>>>>>> is being promoted.
>>>>>>>
>>>>>>>
>>>>>>> Yes, this is what is happening here. It's embarrassing that I didn't
>>>>>>> think of that :)
>>>>>>>
>>>>>>>> The starts will be done before the promote. If one of the starts fails,
>>>>>>>> the transition will be aborted, and a new one will be calculated. So,
>>>>>>>> if you get to the promote, you can assume anything in "*_start_*" is
>>>>>>>> now active.
>>>>>>>
>>>>>>> I did another simple test:
>>>>>>>
>>>>>>> * 3 ms clones are running on hanode1, hanode2 and hanode3
>>>>>>> * master role is on hanode1
>>>>>>> * I move the master role to hanode2 using:
>>>>>>> "pcs resource move pgsql-ha hanode2 --master"
>>>>>>>
>>>>>>> The transition gives us:
>>>>>>>
>>>>>>> * demote on hanode1
>>>>>>> * promote on hanode2
>>>>>>>
>>>>>>> I suppose all three clones on hanode1, hanode2 and hanode3 should
>>>>>>> appear in the active env variables in this context, shouldn't they?
>>>>>>>
>>>>>>> Please find attached the environment dumps of this transition from
>>>>>>> hanode1. You'll see that both "OCF_RESKEY_CRM_meta_notify_active_resource"
>>>>>>> and "OCF_RESKEY_CRM_meta_notify_active_uname" contain only one
>>>>>>> character: a space.
>>>>>>>
>>>>>>> I started looking at the Pacemaker code, at least to get a better
>>>>>>> understanding of where the environment variables are set and when they
>>>>>>> are available. I have had no luck so far, but I am short on time. Any
>>>>>>> pointers would be appreciated :)
>>>>>>>
>>>>>>>>> On a side note, I noticed with these debug files that the notify
>>>>>>>>> variables were also available outside of notify actions (start and
>>>>>>>>> notify here). Are they always available during "transition
>>>>>>>>> actions" (start, stop, promote, demote)? Looking at the mysql RA,
>>>>>>>>> it uses OCF_RESKEY_CRM_meta_notify_master_uname during the
>>>>>>>>> start action. So I suppose it's safe?
>>>>>>>>
>>>>>>>> Good question, I've never tried that before. I'm reluctant to say it's
>>>>>>>> guaranteed; it's possible seeing them in the start action is a side
>>>>>>>> effect of the current implementation and could theoretically change in
>>>>>>>> the future. But if mysql is relying on it, I suppose it's
>>>>>>>> well-established already, making changing it unlikely ...
>>>>>>>
>>>>>>> Thank you very much for this clarification. Presently we keep in a
>>>>>>> private attribute what we //think// (we cannot rely on
>>>>>>> active_uname :/) are the active unames for the ms resource. Since the
>>>>>>> notify vars appearing outside of notify actions seems to be just a
>>>>>>> side effect of the current implementation, I prefer to stay away from
>>>>>>> them when we are not in a notify action and keep our current
>>>>>>> implementation.
>>>>>>>
>>>>>>> Thank you,
>
>
>