[Pacemaker] Colocating with unmanaged resource

Andrew Beekhof andrew at beekhof.net
Sun Mar 29 21:11:25 EDT 2015


> On 28 Feb 2015, at 6:00 am, Покотиленко Костик <casper at meteor.dp.ua> wrote:
> 
> On Thu, 22/01/2015 at 14:59 +1100, Andrew Beekhof wrote:
>>> On 15 Jan 2015, at 12:54 am, Покотиленко Костик <casper at meteor.dp.ua> wrote:
>>> 
>>> On Tue, 06/01/2015 at 16:27 +1100, Andrew Beekhof wrote:
>>>>> On 20 Dec 2014, at 6:21 am, Покотиленко Костик <casper at meteor.dp.ua> wrote:
>>>>> Here are behaviors of different versions of pacemaker:
>>>>> 
>>>>> 1.1.12:
>>>>> 
>>>>> - stopping nginx on a node always makes the clone instance FAIL on
>>>>> that node, but FIP stays running on that node despite the INF
>>>>> colocation
>>>> 
>>>> can you attach a crm_report of the above test please?
>>> 
>>> crm_report of this test attached as
>>> pcmk-nginx-fail-Wed-14-Jan-2015.tar.bz2
>> 
>> is there a reason nginx is not managed?
>> if it were managed, then we'd have stopped it and FIP_2 would have been moved
> 
> I'm not sure I got this right.
> 
> Nginx is intentionally unmanaged (is-managed="false"), hence the subject.
> The whole point is that stopping the unmanaged nginx doesn't move away
> the FIP that is INF colocated with it (this concerns 1.1.12; 1.1.6
> works fine).

Ahhhh.
We changed how monitors that return OCF_NOT_RUNNING are handled, so that they still require a stop under most conditions.
I've added "not managed" to the list of exceptions:

diff --git a/lib/pengine/unpack.c b/lib/pengine/unpack.c
index 308258d..6dc44fd 100644
--- a/lib/pengine/unpack.c
+++ b/lib/pengine/unpack.c
@@ -2689,7 +2689,7 @@ determine_op_status(
             break;
 
         case PCMK_OCF_NOT_RUNNING:
-            if (is_probe || target_rc == rc) {
+            if (is_probe || target_rc == rc || is_not_set(rsc->flags, pe_rsc_managed)) {
                 result = PCMK_LRM_OP_DONE;
                 rsc->role = RSC_ROLE_STOPPED;
 
Look for this in 1.1.13-rc2
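
To illustrate the condition in the patch above, here is a minimal standalone sketch. It is not the Pacemaker source: the sketch_* names, the struct, and the "failed" outcome of the fall-through branch are assumptions made for illustration. Only the three-way test mirrors the patched line.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* OCF return codes: 0 is success, 7 is "not running". */
enum { SKETCH_OCF_OK = 0, SKETCH_OCF_NOT_RUNNING = 7 };

/* Hypothetical outcome type, standing in for the PCMK_LRM_OP_* values. */
enum sketch_op_status { SKETCH_OP_DONE, SKETCH_OP_FAILED };

/* Hypothetical resource: only the "managed" flag matters here. */
struct sketch_rsc {
    bool managed;
};

/*
 * Classify an operation that returned "not running".
 * The result is treated as a clean "stopped" (DONE) when:
 *   - the operation was a probe, or
 *   - "not running" was the expected result, or
 *   - the resource is unmanaged (the exception added by the patch).
 * Otherwise the sketch assumes it counts as a failure.
 */
static enum sketch_op_status
sketch_not_running_status(const struct sketch_rsc *rsc, bool is_probe,
                          int target_rc, int rc)
{
    assert(rsc != NULL);
    assert(rc == SKETCH_OCF_NOT_RUNNING);

    if (is_probe || target_rc == rc || !rsc->managed) {
        return SKETCH_OP_DONE;
    }
    return SKETCH_OP_FAILED;
}
```

With this model, a recurring monitor that expects SKETCH_OCF_OK but gets "not running" yields SKETCH_OP_FAILED for a managed resource and SKETCH_OP_DONE for an unmanaged one, which corresponds to the case reported in this thread.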

> 
>>>> 1.1.6, 1.1.10, 1.1.12:
>>>> 
>>>>> - if Nginx has started on a node after initial probe for Nginx clone
>>>>> then pacemaker will never see it running until cleanup or other
>>>>> probe trigger
>>>> 
>>>> you'll want a recurring monitor with role=Stopped
>>>> 
>>> 
>>> How is it done?
>> 
>> I don't know the crmsh syntax. Sorry
>> 
>>> 
>>> I've tried on 1.1.12 with:
>>> primitive Nginx lsb:nginx \
>>> 	op monitor interval=2s \
>>> 	op monitor interval=3s role=Stopped
>>> 
>>> This produces a warning that monitor_Stopped may be unsupported by the RA.
>> 
>> I'm not familiar with that warning.
>> Where did you see it?
> 
> The exact text is:
> WARNING: Nginx: action monitor_Stopped not advertised in meta-data, it may not be supported by the RA
> 
> This is produced by crm configure edit.

Hmmm, you'd have to take that up with the crmsh maintainers.

> 
>>> Should it?
>>> And it does not detect that nginx has started.
>> 
>> It seems role=Stopped only works for primitives (not clones)
>> I've made a note to get this fixed
> 
> This will make unmanaged resources more usable, thanks.
> 
>>> 
>>> Steps:
>>> - stop nginx on 2nd node
>>> - cleanup cl_Nginx so that pacemaker forgets nginx was running on the
>>> 2nd node
>>> - clear logs
>>> - start nginx
>>> - nothing happens
>>> - make crm_report
>>> 
>>> crm_report of this test attached as
>>> pcmk-monitor-stopped-Wed-14-Jan-2015.tar.bz2
>>> 
>>> <pcmk-monitor-stopped-Wed-14-Jan-2015.tar.bz2><pcmk-nginx-fail-Wed-14-Jan-2015.tar.bz2>




