<br><br><div class="gmail_quote">On Fri, May 20, 2011 at 3:42 AM, Eamon Roque <span dir="ltr"><<a href="mailto:Eamon.Roque@lex-com.net">Eamon.Roque@lex-com.net</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">

<tt><font size="2">Hi,</font></tt>

<br><div><div></div><div class="h5">

<br>

<br><tt><font size="2">>> On Thu, May 19, 2011 at 5:05 AM, Eamon Roque

<<a href="mailto:Eamon.Roque@lex-com.net" target="_blank">Eamon.Roque@lex-com.net</a>>wrote:<br>

<br>

>> Hi,<br>

>><br>

>> I've put together a cluster of two nodes running a databank without

shared<br>

>> storage. Both nodes replicate data between them, which is taken

care of by<br>

>> the databank itself.<br>

>><br>

>> I have a resource for the databank and ip. I then created a stateful

clone<br>

>> from the databank resource. I created colocation rules joining

the<br>

>> databank-ms-clone and ip:<br>

>><br>

>> node pgsqltest1<br>

>> node pgsqltest2<br>

>> primitive Postgres-IP ocf:heartbeat:IPaddr2 \<br>

>>         params ip="10.19.57.234"

cidr_netmask="32" \<br>

>>         op monitor interval="30s"

\<br>

>>         meta is-managed="false"<br>

>> primitive resPostgres ocf:heartbeat:pgsql \<br>

>>         params pgctl="/opt/PostgreSQL/9.0/bin/pg_ctl"<br>

>>pgdata="/opt/PostgreSQL/9.0/data" psql="/opt/PostgreSQL/9.0/bin/psql"<br>

>> pgdba="postgres" \<br>

>>         op monitor interval="1min"

\<br>

>>         meta is-managed="false"<br>

>> ms msPostgres resPostgres \<br>

>>         meta master-max="1" master-node-max="1"

clone-max="2"<br>

>> clone-node-max="1" notify="true" target-role="started"<br>

>> colocation colPostgres inf: Postgres-IP msPostgres:Master<br>

>> order ordPostgres inf: msPostgres:promote Postgres-IP:start<br>

>> property $id="cib-bootstrap-options" \<br>

>>         dc-version="1.1.2-2e096a41a5f9e184a1c1537c82c6da1093698eb5"

\<br>

>>         cluster-infrastructure="openais"

\<br>

>>         expected-quorum-votes="2"

\<br>

>>        stonith-enabled="false" \<br>

>>        no-quorum-policy="ignore"

\<br>

>>         last-lrm-refresh="1302707146"<br>

>> rsc_defaults $id="rsc-options" \<br>

>>         resource-stickiness="200"<br>

>> op_defaults $id="op_defaults-options" \<br>

>>         record-pending="false"<br>

>><br>

>> The normal postgres agent doesn't support this functionality,

but I've put<br>

>> together my own using the mysql agent as a model. Before running

the script<br>

>> through ocf-tester, I unmanage the postgres resource.<br>

>><br>

<br>

> Could you show how you implemented promote/demote for pgsql?<br>

</font></tt>

<br></div></div><tt><font size="2">Sure, let's start with the ultra-simple "promote"

function:</font></tt>

<br>

<br><tt><font size="2">#</font></tt>

<br><tt><font size="2"># These variables are higher up in the file, but they

will probably help with understanding the error of </font></tt>

<br><tt><font size="2"># my ways.</font></tt>

<br>

<br><tt><font size="2">CRM_MASTER="${HA_SBIN_DIR}/crm_master"</font></tt>

<br><tt><font size="2">ATTRD_UPDATER="${HA_SBIN_DIR}/attrd_updater"</font></tt>

<br>

<br><tt><font size="2">pgsql_promote() {</font></tt>

<br><tt><font size="2">        local output</font></tt>

<br><tt><font size="2">        local rc</font></tt>

<br><tt><font size="2">        local CHECK_PG_SQL</font></tt>

<br><tt><font size="2">        local COMPLETE_STANDBY_QUERY</font></tt>

<br><tt><font size="2">        local PROMOTE_SCORE_HIGH</font></tt>

<br><tt><font size="2">        local MOD_PSQL_M_FORMAT</font></tt>

<br>

<br>

<br><tt><font size="2">        PROMOTE_SCORE_HIGH=1000

       </font></tt>

<br><tt><font size="2">        CHECK_PG_SQL="SELECT

pg_is_in_recovery()"</font></tt>

<br><tt><font size="2">        MOD_PSQL_M_FORMAT="$OCF_RESKEY_psql

-Atc"</font></tt>

<br><tt><font size="2">        COMPLETE_STANDBY_QUERY="$MOD_PSQL_M_FORMAT

\"$CHECK_PG_SQL\""</font></tt>

<br>

<br><tt><font size="2">        output=$(su -

$OCF_RESKEY_pgdba -c "$COMPLETE_STANDBY_QUERY" 2>&1)</font></tt>

<br><tt><font size="2">        rc=$?</font></tt>

<br><tt><font size="2">        </font></tt>

<br><tt><font size="2">        echo "$output"</font></tt>

<br><tt><font size="2">        </font></tt>

<br><tt><font size="2">        case $output in</font></tt>

<br><tt><font size="2">           

    f)</font></tt>

<br><tt><font size="2">           

            ocf_log debug

"PostgreSQL Node is running in Master mode..."</font></tt>

<br><tt><font size="2">           

            return $OCF_RUNNING_MASTER</font></tt>

<br><tt><font size="2">           

    ;;</font></tt>

<br><tt><font size="2">        </font></tt>

<br><tt><font size="2">           

    t)</font></tt>

<br><tt><font size="2">           

            ocf_log debug

"PostgreSQL Node is in Hot_Standby mode..."</font></tt>

<br><tt><font size="2">           

            return $OCF_SUCCESS</font></tt>

<br><tt><font size="2">           

    ;;</font></tt>

<br>

<br><tt><font size="2">           

    *)</font></tt>

<br><tt><font size="2">           

            ocf_log err "Critical

error in $CHECK_PG_SQL: $output"</font></tt>

<br><tt><font size="2">           

            return $OCF_ERR_GENERIC

       </font></tt>

<br><tt><font size="2">           

    ;;</font></tt>

<br><tt><font size="2">        esac</font></tt>

<br>

<br><tt><font size="2">#</font></tt>

<br><tt><font size="2"># "Real" promotion is handled here.</font></tt>

<br><tt><font size="2"># The trigger file is created and we check for "recovery.conf"

on the host.</font></tt>

<br><tt><font size="2"># If we can't find it, then the file will be copied

from the HA-Config into postgres' data folder.</font></tt>

<br><tt><font size="2">#</font></tt>

<br>

<br><tt><font size="2">if ! touch $OCF_RESKEY_trigger_file; then</font></tt>

<br><tt><font size="2">        ocf_log err "$OCF_RESKEY_trigger_file

could not be created!"</font></tt>

<br><tt><font size="2">        return $OCF_ERR_GENERIC</font></tt>

<br><tt><font size="2">fi</font></tt>

<br>

<br><tt><font size="2">if [ ! -f $OCF_RESKEY_recovery_conf ]; then</font></tt>

<br><tt><font size="2">        ocf_log err "$OCF_RESKEY_recovery_conf

doesn't exist!"</font></tt>

<br><tt><font size="2">        cp $OCF_RESKEY_recovery_conf_ersatz

$OCF_RESKEY_pgdata</font></tt>

<br><tt><font size="2">        return $OCF_SUCCESS</font></tt>

<br><tt><font size="2">fi</font></tt></blockquote><div><br>Why do you need this? As far as I know, when you switch a standby database to primary using a trigger file, recovery.conf gets renamed to recovery.done. If you rename it back, the DB will be put back into standby mode after a restart. We are talking about streaming replication, right?<br>
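<br>A minimal sketch of the check I have in mind, assuming the 9.0-style layout where recovery.conf lives in the data directory and is renamed to recovery.done after promotion. The function name pgsql_recovery_state and its directory argument are hypothetical and not part of the pgsql RA:<br>

```shell
# Hypothetical helper (not part of the pgsql RA): report which recovery
# file is present in the data directory passed as $1.
# Assumption: recovery.conf is renamed to recovery.done after promotion.
pgsql_recovery_state() {
    if [ -f "$1/recovery.conf" ]; then
        # recovery.conf is present: the server enters recovery on its next start
        echo "standby"
    elif [ -f "$1/recovery.done" ]; then
        # the file was renamed after the trigger file was processed
        echo "promoted"
    else
        # neither file exists: this node was never configured as a standby
        echo "primary"
    fi
}
```

<br>Something like pgsql_recovery_state "$OCF_RESKEY_pgdata" could then decide whether anything needs to be copied at all, instead of always restoring recovery.conf.<br>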

 <br></div><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">

<br>

<br><tt><font size="2"># If both files exist or can be created, then the failover fun can start.</font></tt>

<br>

<br><tt><font size="2">ocf_log info "$OCF_RESKEY_trigger_file was created."</font></tt>

<br><tt><font size="2">ocf_log info "$OCF_RESKEY_recovery_conf exists

and can be copied to the correct location."</font></tt>

<br>

<br><tt><font size="2"># Sometimes, the master needs a bit of time to take

the reins. So...</font></tt>

<br>

<br><tt><font size="2">while :</font></tt>

<br><tt><font size="2">do</font></tt>

<br><tt><font size="2">        pgsql_monitor

warn</font></tt>

<br><tt><font size="2">        rc=$?</font></tt>

<br>

<br><tt><font size="2">        if [ $rc -eq $OCF_RUNNING_MASTER

]; then</font></tt>

<br><tt><font size="2">           

    break;</font></tt>

<br><tt><font size="2">        fi</font></tt>

<br>

<br><tt><font size="2">        ocf_log debug

"Postgres Server could not be promoted. Please wait..."</font></tt>

<br><tt><font size="2">        </font></tt>

<br><tt><font size="2">        sleep 1</font></tt>

<br>

<br><tt><font size="2">done</font></tt>

<br>

<br><tt><font size="2">ocf_log info "Postgres Server has been promoted.

Please check on the previous master."</font></tt>

<br>

<br><tt><font size="2">#################################</font></tt>

<br><tt><font size="2">#Attributes Update:          

  #</font></tt>

<br><tt><font size="2">#################################</font></tt>

<br>

<br><tt><font size="2">$ATTRD_UPDATER -n $PGSQL_STATUS_NAME -v \"PRI\" || { ocf_log err "Eh! Attrd_updater is not working!"; return $OCF_ERR_GENERIC; }</font></tt>

<br>

<br><tt><font size="2">#############################################</font></tt>

<br><tt><font size="2"># Resource stickiness pumped up to 1000 :  #</font></tt>

<br><tt><font size="2">#############################################</font></tt>

<br>

<br><tt><font size="2">$CRM_MASTER -v $PROMOTE_SCORE_HIGH || { ocf_log err "crm_master could not change the Master's status!"; return $OCF_ERR_GENERIC; }</font></tt>

<br>

<br><tt><font size="2">############</font></tt>

<br><tt><font size="2"># Success! #</font></tt>

<br><tt><font size="2">############            

   </font></tt>

<br>

<br><tt><font size="2">return $OCF_SUCCESS</font></tt>

<br>

<br><tt><font size="2">}</font></tt>

<br>

<br><tt><font size="2">######################################################################################################</font></tt>

<br>

<br><tt><font size="2">Thanks!</font></tt>

<br>

<br></blockquote><div><br>And what about demote? Switching a standby into a primary using a trigger file changes the TIMELINE in the DB, and that invalidates all other standby databases as well as the previous master database. After that you have to restore them from a fresh backup made on the new master. This particular behavior has stopped me from implementing Master/Slave functionality in the pgsql RA so far.<br>
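<br>To illustrate the timeline problem, here is a rough sketch that pulls the timeline ID out of pg_controldata output so two nodes can be compared. It assumes the output contains a line of the form "Latest checkpoint's TimeLineID: N". Both function names are hypothetical, and the pgsql RA does nothing like this today:<br>

```shell
# Hypothetical helper: read pg_controldata output on stdin and print
# the timeline ID of the latest checkpoint.
timeline_from_controldata() {
    # The "." in the pattern stands in for the apostrophe in "checkpoint's".
    awk -F: '/Latest checkpoint.s TimeLineID/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# Hypothetical check: succeeds only if both timeline IDs are equal.
# Per the behaviour described above, a node on an older timeline
# has to be rebuilt from a fresh backup of the new master.
same_timeline() {
    [ "$1" -eq "$2" ]
}
```

<br>On each node one could run pg_controldata "$OCF_RESKEY_pgdata" | timeline_from_controldata and hand the two numbers to same_timeline before letting the old master rejoin as a standby.<br>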

<br>BTW, why is pgsql set to <tt><font size="2">is-managed="false" in your configuration? With this setting the cluster will keep monitoring it but won't take any other actions, AFAIK.</font></tt><br><br><br></div>

<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"><tt><font size="2">Éamon<div class="im"><br>

<br>

<br>

>> Unfortunately, promote/demote doesn't work. ocf-tester tries to

use the<br>

>> "crm_attribute -N pgsql1 -n master-pgrql-replication-agent

-l reboot -v<br>

>> 100", but the (unmanaged) resources don't accept the score

change.<br>

>><br>

>> I'm pretty sure that I just need to be hit with a clue stick and

would be<br>

>> grateful for any help.<br>

>><br>

>> Thanks,<br>

>><br></div>

>> Éamon<br>

>><br><font color="#888888">

<br>

<br>

<br>

-- <br>

Serge Dubrouski.</font></font></tt><br>

<br></blockquote></div><br><br clear="all"><br>-- <br>Serge Dubrouski.<br>