[ClusterLabs] Two node Active/Active Asterisk+GFS2+DLM+fence_xvm Cluster
TEG AMJG
tegamjg at gmail.com
Fri Jul 15 17:32:10 UTC 2016
Hi
Thank you very much for your quick answer. I didn't include the whole
configuration because I thought it might simply be a limitation of clone
resources, since it happens on any start/restart operation and whenever a node,
or a resource on a node, has a problem. Also, all of my clone resources have
interleave=true specified.
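For reference, this is roughly how I have been checking and setting it (pcs 0.9
style syntax on these nodes; it is only a sketch from memory, so double-check
the exact subcommands for your pcs version):

  # show a clone's definition, including its meta attributes
  pcs resource show asterisk-clone

  # set interleave=true on a clone where it is missing
  pcs resource meta asterisk-clone interleave=true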
My whole configuration is this one:
Stack: corosync
Current DC: pbx2vs3 (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum
2 nodes and 10 resources configured

Online: [ pbx1vs3 pbx2vs3 ]

Full list of resources:

 Clone Set: dlm-clone [dlm]
     Started: [ pbx1vs3 pbx2vs3 ]
 Clone Set: asteriskfs-clone [asteriskfs]
     Started: [ pbx1vs3 pbx2vs3 ]
 Clone Set: asterisk-clone [asterisk]
     Started: [ pbx1vs3 pbx2vs3 ]
 fence_pbx2_xvm (stonith:fence_xvm): Started pbx2vs3
 fence_pbx1_xvm (stonith:fence_xvm): Started pbx1vs3
 Clone Set: clvmd-clone [clvmd]
     Started: [ pbx1vs3 pbx2vs3 ]

PCSD Status:
  pbx1vs3: Online
  pbx2vs3: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
[root at pbx1 ~]# pcs config show
Cluster Name: asteriskcluster
Corosync Nodes:
 pbx1vs3 pbx2vs3
Pacemaker Nodes:
 pbx1vs3 pbx2vs3

Resources:
 Clone: dlm-clone
  Meta Attrs: clone-max=2 clone-node-max=1 interleave=true
  Resource: dlm (class=ocf provider=pacemaker type=controld)
   Attributes: allow_stonith_disabled=false
   Operations: start interval=0s timeout=90 (dlm-start-interval-0s)
               stop interval=0s on-fail=fence (dlm-stop-interval-0s)
               monitor interval=60s on-fail=fence (dlm-monitor-interval-60s)
 Clone: asteriskfs-clone
  Meta Attrs: interleave=true clone-max=2 clone-node-max=1
  Resource: asteriskfs (class=ocf provider=heartbeat type=Filesystem)
   Attributes: device=/dev/vg_san1/lv_pbx directory=/mnt/asterisk fstype=gfs2
   Operations: start interval=0s timeout=60 (asteriskfs-start-interval-0s)
               stop interval=0s on-fail=fence (asteriskfs-stop-interval-0s)
               monitor interval=60s on-fail=fence (asteriskfs-monitor-interval-60s)
 Clone: asterisk-clone
  Meta Attrs: interleaved=true sipp_monitor=/root/scripts/haasterisk.sh sipp_binary=/usr/local/src/sipp-3.4.1/bin/sipp globally-unique=false ordered=false interleave=true clone-max=2 clone-node-max=1 notify=true
  Resource: asterisk (class=ocf provider=heartbeat type=asterisk)
   Attributes: user=root group=root config=/mnt/asterisk/etc/asterisk.conf sipp_monitor=/root/scripts/haasterisk.sh sipp_binary=/usr/local/src/sipp-3.4.1/bin/sipp maxfiles=65535
   Operations: start interval=0s timeout=40s (asterisk-start-interval-0s)
               stop interval=0s on-fail=fence (asterisk-stop-interval-0s)
               monitor interval=10s (asterisk-monitor-interval-10s)
 Clone: clvmd-clone
  Meta Attrs: clone-max=2 clone-node-max=1 interleave=true
  Resource: clvmd (class=ocf provider=heartbeat type=clvm)
   Operations: start interval=0s timeout=90 (clvmd-start-interval-0s)
               monitor interval=30s on-fail=fence (clvmd-monitor-interval-30s)
               stop interval=0s on-fail=fence (clvmd-stop-interval-0s)

Stonith Devices:
 Resource: fence_pbx2_xvm (class=stonith type=fence_xvm)
  Attributes: port=tegamjg_pbx2 pcmk_host_list=pbx2vs3
  Operations: monitor interval=60s (fence_pbx2_xvm-monitor-interval-60s)
 Resource: fence_pbx1_xvm (class=stonith type=fence_xvm)
  Attributes: port=tegamjg_pbx1 pcmk_host_list=pbx1vs3
  Operations: monitor interval=60s (fence_pbx1_xvm-monitor-interval-60s)
Fencing Levels:

Location Constraints:
Ordering Constraints:
  start fence_pbx1_xvm then start fence_pbx2_xvm (kind:Mandatory) (id:order-fence_pbx1_xvm-fence_pbx2_xvm-mandatory)
  start fence_pbx2_xvm then start dlm-clone (kind:Mandatory) (id:order-fence_pbx2_xvm-dlm-clone-mandatory)
  start dlm-clone then start clvmd-clone (kind:Mandatory) (id:order-dlm-clone-clvmd-clone-mandatory)
  start clvmd-clone then start asteriskfs-clone (kind:Mandatory) (id:order-clvmd-clone-asteriskfs-clone-mandatory)
  start asteriskfs-clone then start asterisk-clone (kind:Mandatory) (id:order-asteriskfs-clone-asterisk-clone-mandatory)
Colocation Constraints:
  clvmd-clone with dlm-clone (score:INFINITY) (id:colocation-clvmd-clone-dlm-clone-INFINITY)
  asteriskfs-clone with clvmd-clone (score:INFINITY) (id:colocation-asteriskfs-clone-clvmd-clone-INFINITY)
  asterisk-clone with asteriskfs-clone (score:INFINITY) (id:colocation-asterisk-clone-asteriskfs-clone-INFINITY)

Resources Defaults:
 migration-threshold: 2
 failure-timeout: 10m
 start-failure-is-fatal: false
Operations Defaults:
 No defaults set

Cluster Properties:
 cluster-infrastructure: corosync
 cluster-name: asteriskcluster
 dc-version: 1.1.13-10.el7_2.2-44eb2dd
 have-watchdog: false
 last-lrm-refresh: 1468598829
 no-quorum-policy: ignore
 stonith-action: reboot
 stonith-enabled: true
There are some meta attributes in there that don't make sense, sorry about
that; the problem is that I don't know how to delete them with pcs/pcsd :)
(I have put a rough sketch of what I plan to try a bit further down). Now, I
found something interesting about ordering constraints with clone resources
in the "Pacemaker Explained" documentation, which describes something like this:
*"<constraints><rsc_location id="clone-prefers-node1" rsc="apache-clone"
node="node1" score="500"/><rsc_colocation id="stats-with-clone"
rsc="apache-stats" with="apache-clone"/><rsc_order
id="start-clone-then-stats" first="apache-clone"
then="apache-stats"/></constraints>""Ordering constraints behave slightly
differently for clones. In the example above, apache-stats willwait until
all copies of apache-clone that need to be started have done so before
being started itself.Only if no copies can be started will apache-stats be
prevented from being active. Additionally, theclone will wait for
apache-stats to be stopped before stopping itself".*
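If I translate that XML example into pcs commands just to be sure I understand
it (a rough sketch only; apache-clone, apache-stats and node1 are the names
from the documentation, not from my cluster), I believe it would be something
like:

  pcs constraint location apache-clone prefers node1=500
  pcs constraint colocation add apache-stats with apache-clone
  pcs constraint order start apache-clone then start apache-stats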
I am not sure whether the behaviour described in that excerpt has anything to
do with my problem, but I cannot destroy the whole cluster just to test it,
probably in vain.
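Regarding the meta attributes I mentioned above, what I am thinking of trying
is to work on a copy of the CIB and simulate the result before touching the
live cluster (a rough sketch, assuming that giving a meta attribute an empty
value removes it, which is what the pcs man page seems to say):

  # dump the current CIB to a file and make changes to the copy only
  pcs cluster cib /tmp/test.xml

  # drop the meta attributes that do not belong there (name= with no value removes it)
  pcs -f /tmp/test.xml resource meta asterisk-clone interleaved= sipp_monitor= sipp_binary=

  # preview what the cluster would do with the modified CIB
  crm_simulate --simulate --xml-file /tmp/test.xml

  # only if that looks sane, push the modified CIB to the live cluster
  pcs cluster cib-push /tmp/test.xml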
Thank you very much again. Regards
Alejandro
2016-07-15 3:35 GMT-04:00 Kristoffer Grönlund <kgronlund at suse.com>:
> TEG AMJG <tegamjg at gmail.com> writes:
>
> > Dear list
> >
> > I am quite new to Pacemaker and I am configuring a two-node active/active
> > cluster which consists basically of something like this:
> >
> > I am using pcsd Pacemaker/Corosync:
> >
> > Clone Set: dlm-clone [dlm]
> > Started: [ pbx1vs3 pbx2vs3 ]
> > Clone Set: asteriskfs-clone [asteriskfs]
> > Started: [ pbx1vs3 pbx2vs3 ]
> > Clone Set: asterisk-clone [asterisk]
> > Started: [ pbx1vs3 pbx2vs3 ]
> > fence_pbx2_xvm (stonith:fence_xvm): Started pbx1vs3
> > fence_pbx1_xvm (stonith:fence_xvm): Started pbx2vs3
> > Clone Set: clvmd-clone [clvmd]
> > Started: [ pbx1vs3 pbx2vs3 ]
> >
> > Now my problem is that, for example, when I fence one of the nodes, the
> > other one restarts every clone resource and starts them back again; the
> > same thing happens when I stop pacemaker and corosync on one node only
> > (pcs cluster stop). That would mean that if I have a problem on one of my
> > Asterisk nodes (for example in the DLM resource or CLVMD) that would
> > require fencing right away, for example node pbx2vs3, the other node
> > (pbx1vs3) will restart every service, which will drop all my calls on a
> > well-functioning node.
>
> The pcsd output doesn't really give any hint as to what your
> configuration looks like, but it sounds like the issue may be not setting
> interleave=true for a clone which other resources depend on. See this
> article for more information:
>
>
> https://www.hastexo.com/resources/hints-and-kinks/interleaving-pacemaker-clones/
>
> Cheers,
> Kristoffer
>
> --
> // Kristoffer Grönlund
> // kgronlund at suse.com
>
--
-
Regards to all