[Pacemaker] questions about the booth

Wed May 29 01:52:14 EDT 2013

Hi Yuusuke,

Merged, thanks!

Regards,
Jiaju

On Mon, 2013-05-27 at 16:26 +0900, Yuusuke Iida wrote:
> Hi, Jiaju
> 
> To solve this problem, I wrote a daemon that monitors the resources
> that depend on a ticket.
> 
> I have sent the following pull request:
> https://github.com/jjzhang/booth/pull/52
> 
> The features are as follows.
>  - The tickets to monitor are read from the booth configuration file.
>  - Monitoring starts once a ticket is granted and the resource has
> started.
>  - When a resource can no longer run at a site, booth_resource_monitord
> uses booth to move the ticket to another site.
>  - booth_resource_monitord is installed when configure is run with the
> "--enable-resource-monitor" option.
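
The feature list above can be read as a small decision procedure. The following is only a minimal sketch of that reading in Python, not the code from the pull request: the names SiteState, Action and next_action are hypothetical, and it assumes that a ticket is moved only after monitoring has started.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    IDLE = "idle"            # ticket not granted here: nothing to watch
    WAIT = "wait"            # ticket granted, resource has not started yet
    MONITOR = "monitor"      # resource is being watched
    MOVE_TICKET = "move"     # resource cannot run here: hand the ticket over


@dataclass
class SiteState:
    """Hypothetical snapshot of one site, as seen by the monitor."""
    ticket_granted: bool
    resource_started: bool   # the resource has started at this site
    resource_runnable: bool  # at least one node in the site can still run it


def next_action(state: SiteState, monitoring: bool) -> Action:
    """Decide one step; `monitoring` says whether watching already began."""
    if not state.ticket_granted:
        return Action.IDLE
    if not monitoring:
        # Monitoring begins only after the resource has started.
        return Action.MONITOR if state.resource_started else Action.WAIT
    # Assumption: the ticket moves only once monitoring is under way.
    return Action.MONITOR if state.resource_runnable else Action.MOVE_TICKET
```

For example, next_action(SiteState(True, False, False), monitoring=True) yields Action.MOVE_TICKET, which corresponds to the point where the daemon would ask booth to move the ticket.
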
> 
> How to use:
> Typically, booth_resource_monitord is added to a configuration that
> already uses booth, as follows.
> ===================================================================
> group grpBooth prmIpBooth prmApBooth prmApBooth_rsc_mond
> primitive prmIpBooth ocf:heartbeat:IPaddr2 \
>         params ip="***.***.***.***" nic="eth*" cidr_netmask="24" \
>         op start interval="0s" timeout="60s" on-fail="restart" \
>         op monitor interval="10s" timeout="60s" on-fail="restart" \
>         op stop interval="0s" timeout="60s" on-fail="fence"
> primitive prmApBooth ocf:pacemaker:booth-site \
>         op start interval="0s" timeout="90s" on-fail="restart" \
>         op monitor interval="10s" timeout="60s" on-fail="restart" \
>         op stop interval="0s" timeout="100s" on-fail="fence"
> primitive prmApBooth_rsc_mond ocf:heartbeat:anything \
>         params binfile="booth_resource_monitord" \
>         op start interval="0s" timeout="90s" on-fail="restart" \
>         op monitor interval="10s" timeout="60s" on-fail="restart" \
>         op stop interval="0s" timeout="100s" on-fail="fence"
> --------------------------------------------------------------------
> 
> Limitation:
> The target resource cannot be detected when the "rsc_ticket" constraint
> is defined with a "resource_set".
> 
> I would very much like you to merge this feature into the booth source
> tree.
> 
> Best Regards,
> Yusuke
> 
> 
> (2012/03/08 11:37), Yuusuke Iida wrote:
> > Hi, Jiaju
> > 
> > Thank you for your reply.
> > 
> > (2012/03/05 14:00), Jiaju Zhang wrote:
> >> Hi Yuusuke,
> >>
> >> On Mon, 2012-03-05 at 11:49 +0900, Yuusuke Iida wrote:
> >>> Hi, Jiaju
> >>>
> >>> I have thought about how to handle the case where a resource cannot
> >>> be moved within a site.
> >>> I am thinking of writing a daemon that runs outside booth.
> >>>
> >>> This daemon watches whether a resource can run in the site.
> >>> When it confirms that the resource can no longer run there, it
> >>> executes the revoke command against booth.
> >>> booth receives the revoke, and I expect it to move the ticket to
> >>> another site.
> >>
> >> If I understand it correctly, the daemon you mentioned automates some
> >> of the admin's actions: if the resources cannot be managed by one
> >> site, it revokes the ticket and moves it to another site. I have no
> >> objection if the admin has this requirement;)
> > Thank you for agreeing.
> > Your summary of the processing is exactly right.
> > Not every admin will need this function.
> > However, I think there are admins who want to automate as much of the
> > processing as possible.
> > 
> >> The only thing I'm not sure about is whether the admin really wants
> >> to do this. My assumption is that if the local site is alive, the
> >> admin will be inclined to keep the ticket at this site; if the site
> >> is totally down, we have no choice, and the ticket has to move to
> >> another site to keep the service available.
> >> However, that is just one usage scenario I had in mind; booth should
> >> support the usage scenario that you mentioned;)
> >>
> >>>
> >>> I think this move preserves the continuity of the resource.
> >>>
> >>> I intend to analyze the CIB and check the state of the resource
> >>> using its score.
> >>
> >> I don't quite understand this. Do you mean that if the resource
> >> usually cannot be managed by this site, we had better move it to
> >> another site, so your daemon will depend on this value to decide
> >> whether to move the ticket to another site?
> > When a resource fails, I believe its score becomes less than 0.
> > When the resource cannot start on any node in the site, I believe the
> > score becomes less than 0 on all nodes.
> > I want to use this score to judge that a resource can no longer run.
> > 
> > While a ticket is not granted, the score of the resource is also less
> > than 0.
> > Therefore, I want to monitor the resource only while the ticket is
> > granted.
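
The score-based check described above could look roughly like this. It is a sketch under stated assumptions, not the daemon's actual logic: the function names are hypothetical, and the per-node scores are assumed to have been extracted already (for instance from the output of crm_simulate with --show-scores), with -INFINITY represented as float("-inf").

```python
from typing import Mapping


def resource_unrunnable(scores: Mapping[str, float]) -> bool:
    """True when the resource's score is below 0 on every node in the site.

    `scores` maps node name to the resource's score on that node.
    An empty mapping is treated as "unknown", not as a failure.
    """
    return bool(scores) and all(score < 0 for score in scores.values())


def should_move_ticket(ticket_granted: bool, scores: Mapping[str, float]) -> bool:
    """Only judge the scores while the ticket is granted.

    Without the ticket the scores are negative anyway, so they say
    nothing about whether the resource has failed.
    """
    return ticket_granted and resource_unrunnable(scores)
```

With the ticket granted and scores of {"node1": float("-inf"), "node2": -1}, should_move_ticket returns True; if any node still has a score of 0 or more, it returns False.
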
> > 
> >>
> >> Well, I think you have raised another usage scenario which I had not
> >> thought of before;) And I agree with you about setting up such a
> >> daemon to do this work if the admin needs it.
> > I would like you to review it again once it is completed.
> > 
> > Thanks,
> > Yuusuke
> >>
> >> Thanks,
> >> Jiaju
> >>
> >>
> > 
>