[Pacemaker] Fwd: Problem with monitor

Tue Nov 27 18:25:58 EST 2012

----- Original Message -----
> From: "Юлия Школьникова" <shkolnikova_yuli at mail.ru>
> To: pacemaker at oss.clusterlabs.org
> Sent: Thursday, November 22, 2012 11:03:59 PM
> Subject: [Pacemaker] Fwd:  Problem with monitor
> 
> 
> Hello again. Why didn't you answer me? I really need your help!

There was a bug involving master/slave monitoring not being resumed that got fixed recently.  I'd recommend updating to 1.1.8.

Here is the related bug report.

http://bugs.clusterlabs.org/show_bug.cgi?id=5072
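
To check whether a node is still on a release older than 1.1.8, the dc-version property from the posted configuration can be compared against it. This is only a rough Python sketch: the helper names are made up for illustration, and it assumes dc-version always begins with a dotted three-part version number, as it does in the configuration below.

```python
import re


def parse_pacemaker_version(dc_version):
    """Return (major, minor, patch) from a dc-version string.

    Assumes the string starts with "X.Y.Z", e.g. "1.1.7-6.el6-<hash>".
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", dc_version)
    if match is None:
        raise ValueError("unrecognised dc-version: %r" % dc_version)
    return tuple(int(part) for part in match.groups())


def needs_update(dc_version, fixed_in=(1, 1, 8)):
    """True if the running version is older than the recommended release."""
    return parse_pacemaker_version(dc_version) < fixed_in


if __name__ == "__main__":
    # dc-version value copied from the configuration quoted in this thread.
    reported = "1.1.7-6.el6-148fccfd5985c5590cc601123c6c16e966b85d14"
    print(parse_pacemaker_version(reported), needs_update(reported))
```

The value itself can be read from the "property" section of the cluster configuration on any node.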

-- Vossel

> -------- Forwarded message --------
> From: Юлия Школьникова <shkolnikova_yuli at mail.ru>
> To: pacemaker at oss.clusterlabs.org
> Date: Mon 19 Nov 2012 16:37:21
> Subject: [Pacemaker] Problem with monitor
> 
> Hello,
> 
> I am configuring a master/slave cluster for PostgreSQL 9.1 based on
> Corosync and Pacemaker.
> I am following this presentation:
> http://schedule2012.rmll.info/IMG/pdf/postgresql-9-0-ha.pdf .
> I took the master/slave PostgreSQL resource agent (pgsql-ms) from
> https://github.com/roidelapluie/puppet-cluster .
> My nodes are node1 and node2.
> 
> My config file of pacemaker:
> 
> node node1
> node node2
> primitive DBIP ocf:heartbeat:IPaddr2 \
>     params nic="eth0" ip="10.76.112.183" cidr_netmask="22" \
>     op monitor interval="30s" \
>     meta target-role="Started" is-managed="true"
> primitive pgsql ocf:inuits:pgsql-ms \
>     op monitor interval="5s" role="Master" \
>     op monitor interval="10s" role="Slave"
> primitive ping ocf:pacemaker:ping \
>     params host_list="10.76.112.1" \
>     op monitor interval="10s" timeout="10s" \
>     op start interval="0" timeout="45s"
> group PSQL DBIP
> ms pgsql-ms pgsql \
>     params pgsqlconfig="/var/lib/pgsql/9.1/data/postgresql.conf" \
>     lsb_script="/etc/init.d/postgresql-9.1" \
>     pgsqlrecovery="/var/lib/pgsql/9.1/data/recovery.conf" \
>     meta clone-max="2" clone-node-max="1" master-max="1" \
>     master-node-max="1" notify="true"
> clone clone-ping ping \
>     meta globally-unique="false"
> location connected PSQL \
>     rule $id="connected-rule" -inf: not_defined pingd or pingd lte 0
> colocation ip_psql inf: PSQL pgsql-ms:Master
> property $id="cib-bootstrap-options" \
>     dc-version="1.1.7-6.el6-148fccfd5985c5590cc601123c6c16e966b85d14" \
>     cluster-infrastructure="openais" \
>     expected-quorum-votes="2" \
>     stonith-enabled="false" \
>     no-quorum-policy="ignore" \
>     default-resource-stickiness="INFINITY" \
>     last-lrm-refresh="1352470332"
> rsc_defaults $id="rsc_defaults-options" \
>     migration-threshold="INFINITY" \
>     failure-timeout="10" \
>     resource-stickiness="INFINITY"
> 
> 
> Then I tested my cluster:
> 1) If I switch off the master node, the slave becomes the new master as
> expected. This works fine and can be repeated many times.
> 2) But if I stop PostgreSQL with "service postgresql-9.1 stop" (to
> simulate a PostgreSQL failure), the following occurs:
> 
> Initially node1 is the master and node2 is the slave.
> On node1 I run "service postgresql-9.1 stop" and node2 becomes the
> master.
> Then on node2 I run "service postgresql-9.1 stop" and node1 becomes
> the master again.
> At this point monitoring of my resource on node1 stops, and the
> following entry appears in the log:
> 
> 
> node1 crmd[1362]: info: process_lrm_event: LRM operation
> pgsql:0_monitor_10000 (call=33, status=1, cib-update=0,
> confirmed=true) Cancelled
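
The operation key in this entry appears to follow Pacemaker's <resource>_<action>_<interval in ms> naming, so pgsql:0_monitor_10000 would be the 10s monitor, which the posted configuration assigns to role="Slave". The Python sketch below pulls those fields out of such an entry so that cancelled recurring monitors are easier to spot in a log. The regular expression is written against this one quoted line only, and the helper names and the role table are illustrative assumptions, not anything Pacemaker provides.

```python
import re

# Written against the single crmd "process_lrm_event" entry quoted above;
# other Pacemaker versions may format this message differently.
LRM_EVENT = re.compile(
    r"LRM operation\s+(?P<rsc>\S+)_(?P<op>[a-z]+)_(?P<interval_ms>\d+)\s+"
    r"\(call=(?P<call>\d+),\s*status=(?P<status>\d+),.*?\)\s+(?P<result>.+)$"
)

# Monitor intervals taken from the posted configuration (role -> milliseconds).
CONFIGURED_MONITORS = {"Master": 5000, "Slave": 10000}


def parse_lrm_event(entry):
    """Parse one log entry (possibly wrapped over several lines) or return None."""
    flat = " ".join(entry.split())  # undo the mail client's line wrapping
    match = LRM_EVENT.search(flat)
    if match is None:
        return None
    event = match.groupdict()
    for key in ("interval_ms", "call", "status"):
        event[key] = int(event[key])
    return event


def role_for_interval(interval_ms, monitors=CONFIGURED_MONITORS):
    """Return the role whose configured monitor interval matches, if any."""
    for role, configured_ms in monitors.items():
        if configured_ms == interval_ms:
            return role
    return None


if __name__ == "__main__":
    entry = """node1 crmd[1362]: info: process_lrm_event: LRM operation
    pgsql:0_monitor_10000 (call=33, status=1, cib-update=0,
    confirmed=true) Cancelled"""
    event = parse_lrm_event(entry)
    print(event["rsc"], event["op"], event["interval_ms"], event["result"])
    print("configured for role:", role_for_interval(event["interval_ms"]))
```

Read this way, the entry records the Slave-role monitor being cancelled on node1 at the moment it became master again. The log excerpt does not show whether a monitor_5000 operation was scheduled afterwards, which is the thing to look for next.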
> 
> Now if I run "service postgresql-9.1 stop" on node1, Pacemaker does
> not notice that PostgreSQL has stopped, so it neither restarts it nor
> promotes node2 to master.
> 
> If I run "crm resource reprobe", the monitor action starts working
> again.
> I cannot understand why the monitor operation stops working. Please
> help me.
> 
> Shkolnikova Yulia.