[Pacemaker] "crm resource restart" does not work on the DC node with crmd-transition-delay="2s"
NAKAHIRA Kazutomo
nakahira.kazutomo at oss.ntt.co.jp
Tue Aug 23 02:09:58 UTC 2011
Hi Dejan,
Thank you for reviewing my patches.
If you don't mind, could you please commit the patches to the latest repository?
Best regards,
(2011/08/17 17:37), Dejan Muhamedagic wrote:
> Hi Kazutomo-san,
>
> On Tue, Aug 02, 2011 at 06:12:12PM +0900, NAKAHIRA Kazutomo wrote:
>> Hi, Andrew
>>
>> (2011/08/01 12:13), Andrew Beekhof wrote:
>>> 2011/7/27 NAKAHIRA Kazutomo<nakahira.kazutomo at oss.ntt.co.jp>:
>>>> Hi, all
>>>>
>>>> I configured crmd-transition-delay="2s" to address the following problem.
>>>>
>>>> http://www.gossamer-threads.com/lists/linuxha/pacemaker/68504
>>>> http://developerbugs.linux-foundation.org/show_bug.cgi?id=2528
>>>>
>>>> Since then, the "crm resource restart" command can no longer
>>>> restart any resource on the DC node.
>>>> # "crm resource restart" works fine on a non-DC node.
>>>> # Please see the attached hb_report, generated in a simple environment.
>>>>
>>>> How can I use the "crm resource restart" command on the DC node
>>>> with crmd-transition-delay="2s"?
>>>
>>> Sounds like the shell isn't waiting long enough.
>>
>> I understand that this problem is hard to resolve through
>> configuration and that we need to fix the crm shell. Is that
>> right?
>>
>> If so, I have made a patch for the crm shell that waits for
>> crmd-transition-delay before checking the DC node status.
>>
>> Please see the attached patches.
>
> The patches look fine to me.
>
> Cheers,
>
> Dejan
>
>> Best regards,
>>
>>>
>>>>
>>>> I confirmed that I can avoid this problem with the following procedure:
>>>> 1. "crm resource stop rsc-ID"
>>>> 2. wait for crmd-transition-delay (2 seconds)
>>>> 3. "crm resource start rsc-ID"
>>>> but this behavior (restart does not work on the DC node)
>>>> may confuse users.
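
The three-step workaround above could be scripted roughly as follows. This is only a minimal sketch: the function name restart_resource and its run/sleep parameters are hypothetical and exist so the sequence can be tried without a cluster. The default delay of 2 seconds is assumed to match the crmd-transition-delay configured in this thread.

```python
import subprocess
import time


def restart_resource(rsc_id, delay_s=2.0,
                     run=subprocess.check_call, sleep=time.sleep):
    """Stop rsc_id, wait delay_s seconds, then start it again.

    `run` and `sleep` are injectable (hypothetical parameters) so the
    sequence can be exercised without a cluster; by default the real
    crm shell is invoked.
    """
    run(["crm", "resource", "stop", rsc_id])
    # Wait for the configured crmd-transition-delay (2s in this thread)
    # before issuing the start.
    sleep(delay_s)
    run(["crm", "resource", "start", rsc_id])
```

On a cluster node this would be called as restart_resource("rsc-ID"), with the default runner raising an exception if either crm command exits with a non-zero status.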
>>>>
>>>> Best regards,
>>>>
>>>> --
>>>> NAKAHIRA Kazutomo
>>>> Infrastructure Software Technology Unit
>>>> NTT Open Source Software Center
>>>>
>>>> _______________________________________________
>>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>>
>>>> Project Home: http://www.clusterlabs.org
>>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>> Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
>>>>
>>>>
>>>
>>
>>
>> --
>> NAKAHIRA Kazutomo
>> Infrastructure Software Technology Unit
>> NTT Open Source Software Center
>
>> # HG changeset patch
>> # User NAKAHIRA Kazutomo<nakahira.kazutomo at oss.ntt.co.jp>
>> # Date 1312274729 -32400
>> # Branch stable-1.0
>> # Node ID 2b4a64c1bb737cfce61b1eaef0dca31d903d9b2e
>> # Parent db98485d06ed3fe0fe236509f023e1bd4a5566f1
>> shell: crm_msec is deemed desirable to be located in the utils.py
>>
>> diff -r db98485d06ed -r 2b4a64c1bb73 shell/modules/ra.py.in
>> --- a/shell/modules/ra.py.in Fri May 06 13:47:43 2011 +0200
>> +++ b/shell/modules/ra.py.in Tue Aug 02 17:45:29 2011 +0900
>> @@ -224,37 +224,6 @@
>>      depth = find_value(pl, "depth") or '0'
>>      role = find_value(pl, "role")
>>      return mk_monitor_name(role,depth)
>> -def crm_msec(t):
>> -    '''
>> -    See lib/common/utils.c:crm_get_msec().
>> -    '''
>> -    convtab = {
>> -        'ms': (1,1),
>> -        'msec': (1,1),
>> -        'us': (1,1000),
>> -        'usec': (1,1000),
>> -        '': (1000,1),
>> -        's': (1000,1),
>> -        'sec': (1000,1),
>> -        'm': (60*1000,1),
>> -        'min': (60*1000,1),
>> -        'h': (60*60*1000,1),
>> -        'hr': (60*60*1000,1),
>> -    }
>> -    if not t:
>> -        return -1
>> -    r = re.match("\s*(\d+)\s*([a-zA-Z]+)?", t)
>> -    if not r:
>> -        return -1
>> -    if not r.group(2):
>> -        q = ''
>> -    else:
>> -        q = r.group(2).lower()
>> -    try:
>> -        mult,div = convtab[q]
>> -    except:
>> -        return -1
>> -    return (int(r.group(1))*mult)/div
>>  def crm_time_cmp(a, b):
>>      return crm_msec(a) - crm_msec(b)
>>
>> diff -r db98485d06ed -r 2b4a64c1bb73 shell/modules/utils.py
>> --- a/shell/modules/utils.py Fri May 06 13:47:43 2011 +0200
>> +++ b/shell/modules/utils.py Tue Aug 02 17:45:29 2011 +0900
>> @@ -199,6 +199,38 @@
>>      s = get_stdout(add_sudo(cmd), stderr_on)
>>      return s.split('\n')
>>
>> +def crm_msec(t):
>> +    '''
>> +    See lib/common/utils.c:crm_get_msec().
>> +    '''
>> +    convtab = {
>> +        'ms': (1,1),
>> +        'msec': (1,1),
>> +        'us': (1,1000),
>> +        'usec': (1,1000),
>> +        '': (1000,1),
>> +        's': (1000,1),
>> +        'sec': (1000,1),
>> +        'm': (60*1000,1),
>> +        'min': (60*1000,1),
>> +        'h': (60*60*1000,1),
>> +        'hr': (60*60*1000,1),
>> +    }
>> +    if not t:
>> +        return -1
>> +    r = re.match("\s*(\d+)\s*([a-zA-Z]+)?", t)
>> +    if not r:
>> +        return -1
>> +    if not r.group(2):
>> +        q = ''
>> +    else:
>> +        q = r.group(2).lower()
>> +    try:
>> +        mult,div = convtab[q]
>> +    except:
>> +        return -1
>> +    return (int(r.group(1))*mult)/div
>> +
>>  def wait4dc(what = "", show_progress = True):
>>      '''
>>      Wait for the DC to get into the S_IDLE state. This should be
>
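
For readers who want to try the conversion outside the shell, here is a standalone sketch of the same table-driven parsing. It is not the committed code: it assumes Python 3, moves the table to a module-level name (_CONVTAB, hypothetical), and uses floor division so the result stays an integer.

```python
import re

# Unit suffix -> (multiplier, divisor) to milliseconds, as in the patch.
_CONVTAB = {
    "ms": (1, 1), "msec": (1, 1),
    "us": (1, 1000), "usec": (1, 1000),
    "": (1000, 1), "s": (1000, 1), "sec": (1000, 1),
    "m": (60 * 1000, 1), "min": (60 * 1000, 1),
    "h": (60 * 60 * 1000, 1), "hr": (60 * 60 * 1000, 1),
}


def crm_msec(t):
    """Convert a time string such as "2s" to milliseconds, or -1 on error."""
    if not t:
        return -1
    r = re.match(r"\s*(\d+)\s*([a-zA-Z]+)?", t)
    if not r:
        return -1
    # A missing suffix maps to the '' entry, i.e. seconds.
    unit = (r.group(2) or "").lower()
    if unit not in _CONVTAB:
        return -1
    mult, div = _CONVTAB[unit]
    # Floor division keeps the result an integer under Python 3.
    return int(r.group(1)) * mult // div


if __name__ == "__main__":
    for text in ("2s", "500ms", "3", "1min", "2 fortnights", ""):
        print(repr(text), "->", crm_msec(text))
```

A value without a unit is treated as seconds, and an unknown unit or an empty string yields -1, which is why the caller in the second patch only sleeps when the result is greater than zero.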
>> # HG changeset patch
>> # User NAKAHIRA Kazutomo<nakahira.kazutomo at oss.ntt.co.jp>
>> # Date 1312275597 -32400
>> # Branch stable-1.0
>> # Node ID 422551903526667c380e39f1712b38f3c2b8f0a6
>> # Parent 2b4a64c1bb737cfce61b1eaef0dca31d903d9b2e
>> shell: waits crmd-transition-delay before DC status checking
>>
>> diff -r 2b4a64c1bb73 -r 422551903526 shell/modules/utils.py
>> --- a/shell/modules/utils.py Tue Aug 02 17:45:29 2011 +0900
>> +++ b/shell/modules/utils.py Tue Aug 02 17:59:57 2011 +0900
>> @@ -263,6 +263,13 @@
>>      if not dc:
>>          common_warn("can't find DC in: %s" % s)
>>          return False
>> +    cmd = "crm_attribute -Gq -t crm_config -n crmd-transition-delay 2> /dev/null"
>> +    delay = get_stdout(add_sudo(cmd))
>> +    if delay:
>> +        delaymsec = crm_msec(delay)
>> +        if 0< delaymsec:
>> +            common_info("The crmd-transition-delay is configured. Waiting %d msec before check DC status." % delaymsec)
>> +            time.sleep(delaymsec / 1000)
>>      cmd = "crmadmin -S %s" % dc
>>      cnt = 0
>>      output_started = 0
>
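
One detail of the waiting step may be worth illustrating. If the module uses Python 2 integer division (an assumption; the patch does not show the module's imports), delaymsec / 1000 rounds a sub-second delay such as 500 msec down to zero. A minimal sketch that keeps the fraction follows; wait_transition_delay and its sleep parameter are hypothetical names, not part of the patch.

```python
import time


def wait_transition_delay(delay_msec, sleep=time.sleep):
    """Sleep for delay_msec milliseconds and return the seconds slept.

    delay_msec is assumed to come from a parser such as crm_msec(),
    which returns -1 when the value is missing or malformed.
    """
    if delay_msec <= 0:
        return 0.0
    # Dividing by a float keeps sub-second delays such as 500 msec,
    # which integer division would round down to zero.
    seconds = delay_msec / 1000.0
    sleep(seconds)
    return seconds
```

With the 2s setting discussed in this thread both forms sleep for two seconds, so the difference only matters for delays that are not a whole number of seconds.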
--
NAKAHIRA Kazutomo
Infrastructure Software Technology Unit
NTT Open Source Software Center