[Pacemaker] Postgres RA won't start

Tue Oct 11 10:32:54 EDT 2011

Which version of the resource-agents package are you using?  Older versions
of the pgsql agent depended on the fuser tool being installed; otherwise the
agent could fail with that error code.
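
A quick way to check both points is sketched below. It assumes a
Debian/Ubuntu-style system, where fuser is shipped in the psmisc package
and dpkg-query is available; the check_tool helper is a hypothetical name
used only for this example, not part of the resource agent.

```shell
#!/bin/sh
# Report whether a helper tool can be found on PATH.
# check_tool is an illustrative helper, not part of resource-agents.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: MISSING"
        return 1
    fi
}

# fuser is the tool that older pgsql agents relied on.
check_tool fuser || echo "hint: on Debian/Ubuntu, fuser is in the psmisc package"

# Print the installed resource-agents version (Debian/Ubuntu assumption;
# on RPM-based systems 'rpm -q resource-agents' serves the same purpose).
if command -v dpkg-query >/dev/null 2>&1; then
    dpkg-query -W -f='${Version}\n' resource-agents 2>/dev/null \
        || echo "resource-agents: not known to dpkg"
fi
```

Run it on both nodes, since the start failed with rc=5 on each of them.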
 On Oct 11, 2011 8:12 AM, "Amar Prasovic" <amar at linux.org.ba> wrote:

> Hello everyone,
>
> I tried to configure postgres RA and I ran into some problems.
>
> I configured several resources in my cluster config where pgsql was set to
> run last, after DRBD, Filesystem, IPAddr2 and nginx.
>
> Here is how it looks in crm configure:
>
> crm(live)configure# show
> node webnode01 \
>         attributes standby="off"
> node webnode02 \
>         attributes standby="off"
> primitive ClusterIP ocf:heartbeat:IPaddr2 \
>         params ip="192.168.10.80" cidr_netmask="32" \
>         op monitor interval="30s"
> primitive drbd_res ocf:linbit:drbd \
>         params drbd_resource="yorxs" \
>         op monitor interval="60s" \
>         op start interval="0s" timeout="240s" \
>         op stop interval="0s" timeout="100s"
> primitive fs_res ocf:heartbeat:Filesystem \
>         params device="/dev/drbd1" directory="/srv" fstype="ext4" \
>         op start interval="0s" timeout="60s" \
>         op stop interval="0s" timeout="60s" \
>         op monitor interval="60s" timeout="40s"
> primitive nginx_res ocf:heartbeat:nginx \
>         params configfile="/etc/nginx/nginx.conf" httpd="/usr/local/sbin/nginx" status10url="http:/127.0.0.1" \
>         op monitor interval="10s" timeout="30s" \
>         op start interval="0" timeout="40s" \
>         op stop interval="0" timeout="60s"
> primitive postgres_res ocf:heartbeat:pgsql \
>         params psql="/bin/psql" pgdata="/var/lib/postgres/8.4/main" logfile="/var/log/postgres/postgres.log" \
>         op start interval="0" timeout="120s" \
>         op stop interval="0" timeout="120s" \
>         op monitor interval="30s" timeout="30s"
> group cluster_1 fs_res ClusterIP nginx_res postgres_res
> ms drbd_cluster drbd_res \
>         meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
> location prefer_webnode01 cluster_1 50: webnode01
> location prefer_webnode01_drbd drbd_cluster 50: webnode01
> colocation cluster_1_on_drbd inf: cluster_1 drbd_cluster:Master
> order cluster_1_after_drbd inf: drbd_cluster:promote cluster_1:start
> property $id="cib-bootstrap-options" \
>         dc-version="1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b" \
>         cluster-infrastructure="openais" \
>         expected-quorum-votes="2" \
>         stonith-enabled="false" \
>         no-quorum-policy="ignore" \
>         last-lrm-refresh="1318326771"
>
> However, when I run this config, everything except for pgsql starts without
> problems. For pgsql, I got the following error:
>
> in crm_mon
> Online: [ webnode02 webnode01 ]
>
>  Master/Slave Set: drbd_cluster
>      Masters: [ webnode01 ]
>      Slaves: [ webnode02 ]
>  Resource Group: cluster_1
>      fs_res     (ocf::heartbeat:Filesystem):    Started webnode01
>      ClusterIP  (ocf::heartbeat:IPaddr2):       Started webnode01
>      nginx_res  (ocf::heartbeat:nginx):    Started webnode01
>      postgres_res       (ocf::heartbeat:pgsql): Stopped
>
> Failed actions:
>     postgres_res_start_0 (node=webnode01, call=84, rc=5, status=complete):
> not installed
>     postgres_res_start_0 (node=webnode02, call=66, rc=5, status=complete):
> not installed
>
> in /var/log/syslog
> webnode01 log # cat syslog |grep postgres_res
> Oct 11 11:39:34 webnode01 crmd: [921]: info: do_lrm_rsc_op: Performing
> key=6:93:7:933bf2ab-00d0-435c-a24f-85897e0c9725 op=postgres_res_monitor_0 )
> Oct 11 11:39:34 webnode01 lrmd: [914]: info: rsc:postgres_res:27: probe
> Oct 11 11:39:34 webnode01 crmd: [921]: info: process_lrm_event: LRM
> operation postgres_res_monitor_0 (call=27, rc=7, cib-update=36,
> confirmed=true) not running
> Oct 11 11:39:50 webnode01 crmd: [921]: info: do_lrm_rsc_op: Performing
> key=39:96:0:933bf2ab-00d0-435c-a24f-85897e0c9725 op=postgres_res_start_0 )
> Oct 11 11:39:50 webnode01 lrmd: [914]: info: rsc:postgres_res:39: start
> Oct 11 11:39:50 webnode01 crmd: [921]: info: process_lrm_event: LRM
> operation postgres_res_start_0 (call=39, rc=5, cib-update=47,
> confirmed=true) not installed
> Oct 11 11:39:50 webnode01 attrd: [918]: info: find_hash_entry: Creating
> hash entry for fail-count-postgres_res
> Oct 11 11:39:50 webnode01 attrd: [918]: info: attrd_trigger_update: Sending
> flush op to all hosts for: fail-count-postgres_res (INFINITY)
> Oct 11 11:39:50 webnode01 attrd: [918]: info: attrd_perform_update: Sent
> update 63: fail-count-postgres_res=INFINITY
> Oct 11 11:39:50 webnode01 attrd: [918]: info: find_hash_entry: Creating
> hash entry for last-failure-postgres_res
> Oct 11 11:39:50 webnode01 attrd: [918]: info: attrd_trigger_update: Sending
> flush op to all hosts for: last-failure-postgres_res (1318325990)
> Oct 11 11:39:50 webnode01 attrd: [918]: info: attrd_perform_update: Sent
> update 66: last-failure-postgres_res=1318325990
> Oct 11 11:39:50 webnode01 crmd: [921]: info: do_lrm_rsc_op: Performing
> key=4:97:0:933bf2ab-00d0-435c-a24f-85897e0c9725 op=postgres_res_stop_0 )
> Oct 11 11:39:50 webnode01 lrmd: [914]: info: rsc:postgres_res:40: stop
> Oct 11 11:39:50 webnode01 crmd: [921]: info: process_lrm_event: LRM
> operation postgres_res_stop_0 (call=40, rc=0, cib-update=49, confirmed=true)
> ok
>
> Additional info:
>
> /etc/postgresql, /etc/postgresql-common and /var/lib/postgresql are
> symlinks on both nodes. The actual directories are on the shared DRBD disk.
> Postgres starts without any problems with the init script on both nodes.
>
> Thanks a lot in advance for any advice.
>
> --
> Amar Prasovic
> Gaißacher Straße 17
> D - 81371 München
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs:
> http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
>
>