[ClusterLabs] Antw: emergency stop does not honor resources ordering constraints (?)

Radoslaw Garbacz radoslaw.garbacz at xtremedatainc.com
Fri Dec 9 11:09:02 EST 2016


Thank you. Loss of quorum could indeed be intentional behavior; however,
I experience the same situation when there is a monitoring failure or when
the parameter "no-quorum-policy" is set to "ignore", i.e.
- normal pacemaker service stop, or 'crm_resource' stop for all resources:
A -> B -> C
- lost quorum (with 'no-quorum-policy=ignore'), or 'crm_resource' stop for
all resources when one of the resources reported a "monitor" error:
unordered stop

I will double-check my tests; however, it would be helpful to know whether,
by chance, this is how it is supposed to work.
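For reference, this is roughly how I exercise the two scenarios. A minimal sketch using the standard Pacemaker CLI tools, with the hypothetical resource name B standing in for any resource from the A -> B -> C example (these commands assume a running cluster, so take them as illustration only):

```shell
# Set the cluster-wide quorum policy so resources keep running
# without quorum (the setting used in the tests described above):
crm_attribute --type crm_config --name no-quorum-policy --update ignore

# Stop a single resource by setting its target-role meta attribute:
crm_resource --resource B --meta --set-parameter target-role \
    --parameter-value Stopped

# Inspect the ordering constraints currently in the CIB:
cibadmin --query --scope constraints
```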


On Wed, Dec 7, 2016 at 1:40 AM, Ulrich Windl <
Ulrich.Windl at rz.uni-regensburg.de> wrote:

> >>> Radoslaw Garbacz <radoslaw.garbacz at xtremedatainc.com> wrote on
> 06.12.2016 at
> 18:50 in message
> <CAHBw7oTJX0CZdvaOO0cc+k6TDS5PhJMjr0_rsyrpLPPEXtVSLg at mail.gmail.com>:
> > Hi,
> >
> > I have encountered a problem with Pacemaker resource shutdown: in what
> > appears to be any emergency situation, ordering constraints are not
> > honored.
> > I would be grateful for any information on whether this behavior is
> > intentional or should not happen (i.e. a testing issue rather than
> > Pacemaker behavior). It would also be helpful to know if there is a
> > configuration parameter altering this, or whether some cluster event
> > can trigger an unordered resource stop.
> >
> > Thanks,
> >
> > To illustrate the issue I provide an example below and my collected data.
> > My environment uses resources cloning feature - maybe this contributes to
> > my tests outcome.
> >
> >
> > * Example:
> > - having resources ordered with constraints: A -> B -> C
> > - when stopping with the 'crm_resource' command (all at once) resources
> > are stopped: C, B, A
> > - when stopping by terminating pacemaker resources are stopped: C, B, A
> > - when there is a monitoring error or quorum is lost: no order is
> > honored, e.g. B, C, A
>
> Hi!
>
> If the node does not have quorum, it cannot do any cluster operations
> (IMHO). Instead it will try to commit suicide, maybe with the help of
> self-fencing. So I think this case is normal for no quorum.
>
> Ulrich
>
> >
> >
> >
> > * Version details:
> > Pacemaker 1.1.15-1.1f8e642.git.el6
> > Corosync Cluster Engine, version '2.4.1.2-0da1'
> >
> >
> >
> > * My ordering constraints:
> > Ordering Constraints:
> >   dbx_first_primary then dbx_head_head (kind:Mandatory)
> >   dbx_first_primary-clone then dbx_head_head (kind:Mandatory)
> >   dbx_head_head then dbx_mounts_nodes (kind:Mandatory)
> >   dbx_head_head then dbx_mounts_nodes-clone (kind:Mandatory)
> >   dbx_mounts_nodes then dbx_bind_mounts_nodes (kind:Mandatory)
> >   dbx_mounts_nodes-clone then dbx_bind_mounts_nodes-clone
> (kind:Mandatory)
> >   dbx_bind_mounts_nodes then dbx_nfs_nodes (kind:Mandatory)
> >   dbx_bind_mounts_nodes-clone then dbx_nfs_nodes-clone (kind:Mandatory)
> >   dbx_nfs_nodes then dbx_gss_datas (kind:Mandatory)
> >   dbx_nfs_nodes-clone then dbx_gss_datas-clone (kind:Mandatory)
> >   dbx_gss_datas then dbx_nfs_mounts_datas (kind:Mandatory)
> >   dbx_gss_datas-clone then dbx_nfs_mounts_datas-clone (kind:Mandatory)
> >   dbx_nfs_mounts_datas then dbx_swap_nodes (kind:Mandatory)
> >   dbx_nfs_mounts_datas-clone then dbx_swap_nodes-clone (kind:Mandatory)
> >   dbx_swap_nodes then dbx_sync_head (kind:Mandatory)
> >   dbx_swap_nodes-clone then dbx_sync_head (kind:Mandatory)
> >   dbx_sync_head then dbx_dbx_datas (kind:Mandatory)
> >   dbx_sync_head then dbx_dbx_datas-clone (kind:Mandatory)
> >   dbx_dbx_datas then dbx_dbx_head (kind:Mandatory)
> >   dbx_dbx_datas-clone then dbx_dbx_head (kind:Mandatory)
> >   dbx_dbx_head then dbx_web_head (kind:Mandatory)
> >   dbx_web_head then dbx_ready_primary (kind:Mandatory)
> >   dbx_web_head then dbx_ready_primary-clone (kind:Mandatory)
> >
> >
> >
> > * Pacemaker stop (OK):
> > ready.ocf.sh(dbx_ready_primary)[18639]: 2016/12/06_15:40:32 INFO:
> > ready_stop: Stopping resource
> > mng.ocf.sh(dbx_mng_head)[20312]:        2016/12/06_15:40:44 INFO:
> mng_stop:
> > Stopping resource
> > web.ocf.sh(dbx_web_head)[20310]:        2016/12/06_15:40:44 INFO:
> > dbxcl_stop: Stopping resource
> > dbx.ocf.sh(dbx_dbx_head)[20569]:        2016/12/06_15:40:46 INFO:
> > dbxcl_stop: Stopping resource
> > sync.ocf.sh(dbx_sync_head)[20719]:      2016/12/06_15:40:54 INFO:
> > sync_stop: Stopping resource
> > swap.ocf.sh(dbx_swap_nodes)[21053]:     2016/12/06_15:40:56 INFO:
> > swap_stop: Stopping resource
> > nfs.ocf.sh(dbx_nfs_nodes)[21151]:       2016/12/06_15:40:58 INFO:
> nfs_stop:
> > Stopping resource
> > dbx_mounts.ocf.sh(dbx_bind_mounts_nodes)[21344]:
> 2016/12/06_15:40:59
> > INFO: dbx_mounts_stop: Stopping resource
> > dbx_mounts.ocf.sh(dbx_mounts_nodes)[21767]:     2016/12/06_15:41:01
> INFO:
> > dbx_mounts_stop: Stopping resource
> > head.ocf.sh(dbx_head_head)[22213]:      2016/12/06_15:41:04 INFO:
> > head_stop: Stopping resource
> > first.ocf.sh(dbx_first_primary)[22999]: 2016/12/06_15:41:11 INFO:
> > first_stop: Stopping resource
> >
> >
> >
> > * Quorum lost:
> > sync.ocf.sh(dbx_sync_head)[23099]:      2016/12/06_16:42:04 INFO:
> > sync_stop: Stopping resource
> > nfs.ocf.sh(dbx_nfs_nodes)[23102]:       2016/12/06_16:42:04 INFO:
> nfs_stop:
> > Stopping resource
> > mng.ocf.sh(dbx_mng_head)[23101]:        2016/12/06_16:42:04 INFO:
> mng_stop:
> > Stopping resource
> > ready.ocf.sh(dbx_ready_primary)[23104]: 2016/12/06_16:42:04 INFO:
> > ready_stop: Stopping resource
> > web.ocf.sh(dbx_web_head)[23344]:        2016/12/06_16:42:04 INFO:
> > dbxcl_stop: Stopping resource
> > dbx_mounts.ocf.sh(dbx_bind_mounts_nodes)[23664]:
> 2016/12/06_16:42:05
> > INFO: dbx_mounts_stop: Stopping resource
> > dbx_mounts.ocf.sh(dbx_mounts_nodes)[24459]:     2016/12/06_16:42:08
> INFO:
> > dbx_mounts_stop: Stopping resource
> > head.ocf.sh(dbx_head_head)[25036]:      2016/12/06_16:42:11 INFO:
> > head_stop: Stopping resource
> > swap.ocf.sh(dbx_swap_nodes)[27491]:     2016/12/06_16:43:08 INFO:
> > swap_stop: Stopping resource
> >
> >
> > --
> > Best Regards,
> >
> > Radoslaw Garbacz
> > XtremeData Incorporation
>
>
>
>
> _______________________________________________
> Users mailing list: Users at clusterlabs.org
> http://lists.clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>



-- 
Best Regards,

Radoslaw Garbacz
XtremeData Incorporation