You should configure a STONITH device for each server and set stonith-enabled to true to get the expected behavior on resource failure.<br><br><div class="gmail_quote">On Fri, Nov 23, 2012 at 12:03 PM, Юлия Школьникова <span dir="ltr">&lt;<a href="mailto:shkolnikova_yuli@mail.ru" target="_blank">shkolnikova_yuli@mail.ru</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div>Hello again. Why haven't you answered me? I really need your help!<br><br>-------- Forwarded message --------<br>

From: Юлия Школьникова &lt;<a href="mailto:shkolnikova_yuli@mail.ru" target="_blank">shkolnikova_yuli@mail.ru</a>&gt;<br>

To: <a href="mailto:pacemaker@oss.clusterlabs.org" target="_blank">pacemaker@oss.clusterlabs.org</a><br>


Date: Mon 19 Nov 2012 16:37:21<br>

Subject: [Pacemaker] Problem with monitor<br>

<br>

<div>

        
        <div>

                
                        <div>

<p>Hello,</p><p>I am configuring a master/slave cluster for PostgreSQL 9.1 based on Corosync and Pacemaker.<br>I am following this presentation: <a href="http://schedule2012.rmll.info/IMG/pdf/postgresql-9-0-ha.pdf" target="_blank">http://schedule2012.rmll.info/IMG/pdf/postgresql-9-0-ha.pdf</a>.<br>

I took the resource agent (pgsql-ms) for master/slave PostgreSQL from: <a href="https://github.com/roidelapluie/puppet-cluster" target="_blank">https://github.com/roidelapluie/puppet-cluster</a>.<br>My nodes are node1 and node2.</p>

<p>My Pacemaker configuration:</p><p>node node1<br>node node2<br>primitive DBIP ocf:heartbeat:IPaddr2 \<br>params nic="eth0" ip="10.76.112.183" cidr_netmask="22" \<br>op monitor interval="30s" \<br>

meta target-role="Started" is-managed="true"<br>primitive pgsql ocf:inuits:pgsql-ms \<br>op monitor interval="5s" role="Master" \<br>op monitor interval="10s" role="Slave" <br>

primitive ping ocf:pacemaker:ping \<br>params host_list="10.76.112.1" \<br>op monitor interval="10s" timeout="10s" \<br>op start interval="0" timeout="45s"<br>group PSQL DBIP<br>

ms pgsql-ms pgsql \<br>params pgsqlconfig="/var/lib/pgsql/9.1/data/postgresql.conf" lsb_script="/etc/init.d/postgresql-9.1" pgsqlrecovery="/var/lib/pgsql/9.1/data/recovery.conf" \<br>meta clone-max="2" clone-node-max="1" master-max="1" master-node-max="1" notify="true"<br>

clone clone-ping ping \<br>meta globally-unique="false"<br>location connected PSQL \<br>rule $id="connected-rule" -inf: not_defined pingd or pingd lte 0<br>colocation ip_psql inf: PSQL pgsql-ms:Master<br>

property $id="cib-bootstrap-options" \<br>dc-version="1.1.7-6.el6-148fccfd5985c5590cc601123c6c16e966b85d14" \<br>cluster-infrastructure="openais" \<br>expected-quorum-votes="2" \<br>

stonith-enabled="false" \<br>no-quorum-policy="ignore" \<br>default-resource-stickiness="INFINITY" \<br>last-lrm-refresh="1352470332"<br>rsc_defaults $id="rsc_defaults-options" \<br>

migration-threshold="INFINITY" \<br>failure-timeout="10" \<br>resource-stickiness="INFINITY"</p><p><br>Then I tested my cluster:<br>1) If I switch off the master, the slave becomes the new master as expected. This works fine and can be repeated many times.<br>

2) But if I stop PostgreSQL (to simulate a PostgreSQL failure) with the command "service postgresql-9.1 stop", the following occurs:</p><p>Initially node1 is the master and node2 is the slave.<br>On node1 I run "service postgresql-9.1 stop" and node2 becomes the master.<br>

Now, on node2 I run "service postgresql-9.1 stop" and node1 becomes the master again.<br>At this point the monitoring of my resource on node1 stops, and the following entry appears in the log:</p><p><br>node1 crmd[1362]: info: process_lrm_event: LRM operation pgsql:0_monitor_10000 (call=33, status=1, cib-update=0, confirmed=true) Cancelled</p>

<p><br></p><p>Now if I run "service postgresql-9.1 stop" on node1, Pacemaker does not notice that PostgreSQL has stopped, so it neither tries to restart it nor promotes node2 to master.</p><p>If I run "crm resource reprobe", the monitor action starts working again.<br>

I cannot understand why the monitor operation stops working. Please help me.<br><br>Shkolnikova Yulia.</p>

</div>

                        
        </div>


</div>


<br><hr></div>

<br>_______________________________________________<br>

Pacemaker mailing list: <a href="mailto:Pacemaker@oss.clusterlabs.org">Pacemaker@oss.clusterlabs.org</a><br>

<a href="http://oss.clusterlabs.org/mailman/listinfo/pacemaker" target="_blank">http://oss.clusterlabs.org/mailman/listinfo/pacemaker</a><br>

<br>

Project Home: <a href="http://www.clusterlabs.org" target="_blank">http://www.clusterlabs.org</a><br>

Getting started: <a href="http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf" target="_blank">http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf</a><br>

Bugs: <a href="http://bugs.clusterlabs.org" target="_blank">http://bugs.clusterlabs.org</a><br>

<br></blockquote></div><br>
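<p>For reference, a minimal sketch of the STONITH suggestion at the top of this reply, written in the same crm shell syntax as the quoted configuration. It assumes IPMI-capable hardware and the external/ipmi fencing agent. The resource names (st-node1, st-node2), constraint names, IPMI addresses and credentials are hypothetical placeholders, not values taken from your cluster, so substitute whatever fencing device your servers actually provide.</p>
<pre>
# Hypothetical fencing resources: agent, addresses and credentials are placeholders.
primitive st-node1 stonith:external/ipmi \
    params hostname="node1" ipaddr="192.0.2.11" userid="admin" passwd="secret" interface="lan" \
    op monitor interval="60s"
primitive st-node2 stonith:external/ipmi \
    params hostname="node2" ipaddr="192.0.2.12" userid="admin" passwd="secret" interface="lan" \
    op monitor interval="60s"
# Keep each fencing resource off the node it is meant to fence.
location l-st-node1 st-node1 -inf: node1
location l-st-node2 st-node2 -inf: node2
# Replaces stonith-enabled="false" in cib-bootstrap-options.
property stonith-enabled="true"
</pre>
<p>The two location constraints are a common precaution so that a node is never responsible for fencing itself. Whether this alone changes the monitor behavior you describe would need to be confirmed by repeating your test.</p>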