[Pacemaker] Possible to colocate ms resource with standard ones?

Tue May 20 06:29:23 UTC 2014

Based on your config, the only reason I can find that the slave doesn't start is that the second node is offline.
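
To confirm whether that is the case, the commands below are a minimal sketch of what could be run on either node. They assume the standard Pacemaker command-line tools are installed alongside crmsh; nothing here was run against this cluster.

    # One-shot cluster status: check that both clustera and clusterb
    # are listed as Online
    crm_mon -1

    # Show the allocation scores computed from the live CIB, which
    # should indicate why POSTGRESQL:1 is not being placed
    crm_simulate -sL

If both nodes turn out to be online, another thing worth checking (an assumption, not something established in this thread) is whether the colocation ties every instance of MS_POSTGRESQL to the node running the applications. Pairwise constraints that name only the Master role would leave the slave free to run on the other node. The constraint IDs below are hypothetical:

    # Keep each application on the node holding the PostgreSQL master
    sudo crm configure colocation apache-with-pg-master inf: APACHE MS_POSTGRESQL:Master
    sudo crm configure colocation boum-with-pg-master inf: BOUM MS_POSTGRESQL:Master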

On 15 May 2014, at 9:34 am, Andrew Beekhof <andrew at beekhof.net> wrote:

> 
> On 14 May 2014, at 8:00 pm, Sékine Coulibaly <scoulibaly at gmail.com> wrote:
> 
>> Hi Andrew,
>> 
>> I arrived at a solution of sorts, slightly different from what I used in my first post.
>> You'll find it in the raw cibadmin output attached to this post.
>> BOUM, UFO, INGESTOR and QUOTAS are all applications that depend on ZK and PostgreSQL.
>> 
>> I'm somewhat stuck with the 6.3 release for the moment. Right now I'm considering switching to PCS to make the transition to RHEL 6.5 or 7.x easier. Moving to a newer RHEL 6.x is a global decision process, so I'm afraid I'll stay with 6.3 for now.
> 
> Ok, but bear in mind that if this turns out to be a bug, I cannot fix it for 6.3.
> Your only way to get the fix is to upgrade or build upstream yourself.
> 
>> 
>> Sekine
>> 
>>> Hi,
>>> 
>>> Let me explain my use case. I'm using RHEL 6.3
>> 
>> fwiw, there are updates to pacemaker 1.1.10 in 6.4 and 6.5.
>> It's even supported now.
>> 
>>> with Corosync + Pacemaker + PostgreSQL 9.2 + repmgr 2.0. I have two nodes named clustera and clusterb.
>>> 
>>> I have a total of 3 resources:
>>> - APACHE
>>> - BOUM
>>> - MS_POSTGRESQL
>>> 
>>> They are defined as follows:
>>> 
>>> sudo crm configure primitive APACHE ocf:heartbeat:apache \
>>>   params configfile=/etc/httpd/conf/httpd.conf \
>>>   op monitor interval=5s timeout=10s \
>>>   op start interval=0 timeout=10s \
>>>   op stop interval=0 timeout=10s
>>> 
>>> sudo crm configure primitive BOUM ocf:heartbeat:anything \
>>>   params binfile=/usr/local/boum/current/bin/boum \
>>>   workdir=/var/boum \
>>>   logfile=/var/log/boum/boum_STDOUT \
>>>   errlogfile=/var/log/boum/boum_STDERR \
>>>   pidfile=/var/run/boum.pid \
>>>   op monitor interval=5s timeout=10s \
>>>   op start interval=0 timeout=10s \
>>>   op stop interval=0 timeout=10s
>>> 
>>> sudo crm configure primitive POSTGRESQL ocf:xxxxxx:postgresql \
>>>   params repmgr_conf=/var/lib/pgsql/repmgr/repmgr.conf pgctl=/usr/pgsql-9.2/bin/pg_ctl pgdata=/opt/pgdata \
>>>   op start interval=0 timeout=90s \
>>>   op stop interval=0 timeout=60s \
>>>   op promote interval=0 timeout=120s \
>>>   op monitor interval=53s role=Master \
>>>   op monitor interval=60s role=Slave
>>> 
>>> Since PostgreSQL is in streaming replication, I need to have a master and a slave constantly running. Hence, I created a Master/Slave resource called MS_POSTGRESQL.
>>> 
>>> I want APACHE, BOUM and the PostgreSQL master to all run together on the same node. It looks like as soon as I add a colocation constraint, the PostgreSQL slave no longer starts.
>>> 
>>> I end up with:
>>> 
>>> Online: [ clusterb clustera ]
>>> 
>>> Master/Slave Set: MS_POSTGRESQL [POSTGRESQL]
>>>     Masters: [ clustera ]
>>>     Stopped: [ POSTGRESQL:1 ]
>>> APACHE  (ocf::heartbeat:apache):        Started clustera
>>> BOUM     (ocf::heartbeat:anything):   Started clustera
>>> 
>>> My configuration is as follows:
>>> 
>>> 
>>> node clustera \
>>>        attributes standby="off"
>>> node clusterb \
>>>        attributes standby="off"
>>> primitive APACHE ocf:heartbeat:apache \
>>>        params configfile="/etc/httpd/conf/httpd.conf" \
>>>        op monitor interval="5s" timeout="10s" \
>>>        op start interval="0" timeout="10s" \
>>>        op stop interval="0" timeout="10s" \
>>>        meta target-role="Started"
>>> primitive BOUM ocf:heartbeat:anything \
>>>        params binfile="/usr/local/boum/current/bin/boum" workdir="/var/boum" logfile="/var/log/boum/boum_STDOUT" errlogfile="/var/log/boum/boum_STDERR" pidfile="/var/run/boum.pid" \
>>>        op monitor interval="5s" timeout="10s" \
>>>        op start interval="0" timeout="10s" \
>>>        op stop interval="0" timeout="10s"
>>> primitive POSTGRESQL ocf:xxxxxxx:postgresql \
>>>        params repmgr_conf="/var/lib/pgsql/repmgr/repmgr.conf" pgctl="/usr/pgsql-9.2/bin/pg_ctl" pgdata="/opt/pgdata" \
>>>        op start interval="0" timeout="90s" \
>>>        op stop interval="0" timeout="60s" \
>>>        op promote interval="0" timeout="120s" \
>>>        op monitor interval="53s" role="Master" \
>>>        op monitor interval="60s" role="Slave"
>>> ms MS_POSTGRESQL POSTGRESQL \
>>>        meta clone-max="2" target-role="Started" resource-stickiness="100" notify="true"
>>> colocation link-resources inf: ZK UFO BOUM APACHE MS_POSTGRESQL
>> 
>> Could you send the raw xml (cibadmin -Ql) please?
>> I've never gotten used to crmsh's colocation syntax and don't have it installed locally (pcs is the supplied tool for configuring pacemaker on rhel)
>> 
>>> property $id="cib-bootstrap-options" \
>>>        dc-version="1.1.7-6.el6-148fccfd5985c5590cc601123c6c16e966b85d14" \
>>>        cluster-infrastructure="openais" \
>>>        expected-quorum-votes="2" \
>>>        stonith-enabled="false" \
>>>        no-quorum-policy="ignore" \
>>>        default-resource-stickiness="10" \
>>>        start-failure-is-fatal="false" \
>>>        last-lrm-refresh="1398775386"
>>> 
>>> Is this normal behaviour? If it is, is there a workaround I didn't think of?