[Pacemaker] Should monitor operations be stopped after a resource is unmanaged?

Ron Kerry rkerry at sgi.com
Sun Apr 3 14:29:36 EDT 2011


Tim Serong wrote:
> On 4/2/2011 at 09:42 PM, Ron Kerry <rkerry at sgi.com> wrote:
>  > Serge Dubrouski wrote:
>  > > On Fri, Apr 1, 2011 at 2:09 PM, Ron Kerry <rkerry at sgi.com> wrote:
>  > > > Pavel Levshin wrote:
>  > > >>
>  > > >> 01.04.2011 18:36, Ron Kerry:
>  > > >> > Folks -
>  > > >> >
>  > > >> > Consider a running cluster with all resources managed. We want to stop
>  > > >> > and quickly restart a particular resource without impacting other
>  > > >> > resources. The software stack running on the system can deal with this
>  > > >> > sort of temporary outage. We perform the following actions:
>  > > >> > * unmanage the resource
>  > > >> > * stop the resource
>  > > >> > * start the resource
>  > > >> > * manage the resource
>  > > >> >
>  > > >> > The above procedure is sometimes successful. However, we will also
>  > > >> > sometimes get a resource monitor failure after stopping the resource.
>  > > >> > It is clear that the monitor operation was not stopped (at least not
>  > > >> > immediately) by unmanaging the resource.
>  > > >>
>  > > >> An unmanaged resource cannot be started or stopped, but it can still
>  > > >> be monitored.
>  > > >
>  > > > So unmanaged really means the resource is still being managed to some
>  > > > degree?
>  > >
>  > > It means that Pacemaker still wants to know its state. What kind of
>  > > problem does it create?
>  > >
>  >
>  > An unmanaged resource whose monitor is still running will cause a monitor
>  > failure when the resource is stopped. Pacemaker then takes the 'on-fail'
>  > action defined for the monitor operation. In other words, the resource is
>  > still being managed to some degree. If the monitor operation were still
>  > running but no action were taken as a result of its outcome, there would
>  > be no issue.
>
> Try "crm configure property maintenance-mode=true". Admittedly this
> affects the entire cluster, but it will ensure no starts, stops or
> monitors...
>
> Regards,
>
> Tim

Tim -

Thanks, this does work, but it is rather like using a sledgehammer to do the work of a ball-peen
hammer. It unmanages ALL resources and stops all the monitor operations.
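
For reference, the cluster-wide workaround can be sketched roughly as below. This is only an
illustration: the run helper, the DRY_RUN switch, the maintenance_restart function, and the
RSC_RESTART_CMD placeholder (with its /etc/init.d/my_service path) are all made up for this
example. It assumes the service is restarted by hand outside of Pacemaker while the cluster is
in maintenance mode. Only the 'crm configure property maintenance-mode=...' command comes from
this thread.

```shell
#!/bin/sh
# Sketch: restart one service by hand inside a cluster-wide maintenance window.
# Hypothetical names: run, maintenance_restart, DRY_RUN, RSC_RESTART_CMD.

# DRY_RUN=1 (the default here) only prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Placeholder for whatever restarts the service outside of Pacemaker.
RSC_RESTART_CMD="${RSC_RESTART_CMD:-/etc/init.d/my_service restart}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

maintenance_restart() {
    # Stop all cluster-driven starts, stops and monitors (affects ALL resources).
    run crm configure property maintenance-mode=true || return 1

    # Restart the service manually; remember the result but keep going.
    run sh -c "$RSC_RESTART_CMD"
    rc=$?

    # Always hand control back to the cluster, even if the restart failed.
    run crm configure property maintenance-mode=false || return 1
    return $rc
}
```

Calling maintenance_restart with DRY_RUN=1 prints the three commands in order, so the sequence
can be reviewed before running it for real with DRY_RUN=0.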

How do we go about requesting a change to Pacemaker to achieve the desired behavior? As I see it,
there are two options:

   1. fix 'crm resource unmanage <rsc>' to also stop that resource's monitor

-or-

   2. create a 'crm resource maintenance <rsc>' command to unmanage the resource and stop its monitor

-- 

Ron Kerry         rkerry at sgi.com
Global Product Support - SGI Federal



