[Pacemaker] Issues with Pacemaker / Corosync

Sat Dec 24 21:31:48 EST 2011

Quoting Arnold Krille <arnold at arnoldarts.de>:
>> Hi,
>   >
>   > On Friday 23 December 2011 16:03:37 Aravind M D wrote:
>   >>   I am facing some problems with my corosync and pacemaker implementation. I
>   >> have configured a cluster on Debian squeeze; the packages for corosync and
>   >> pacemaker are installed from backports.
>   >>   I am configuring a two-node cluster and I have configured one resource
>   >> as well. Below is my configuration.
>   >>   root@nagt02a:~# crm configure show
>   >>   node nagt02
>   >>   node nagt02a
>   >>   primitive icinga lsb:icinga \
>   >>           op start interval="0" timeout="30s" \
>   >>           op stop interval="0" timeout="30s" \
>   >>           op monitor interval="30s" \
>   >>           meta multiple-active="stop_start"
>   >>   location prefer-nagt02 icinga 10: nagt02
>   >>   property $id="cib-bootstrap-options" \
>   >>           dc-version="1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
>   >>           cluster-infrastructure="openais" \
>   >>           expected-quorum-votes="2" \
>   >>           stonith-enabled="false" \
>   >>           no-quorum-policy="ignore"
>   >>   Problem 1: When the service is active on nagt02 and I manually start
>   >> the service on cgnagt02a, the service is not stopped on nagt02a.
>   >
>   > I found that it will be stopped, but not as fast as you think it will. The
>   > monitoring action only runs on the active resource. But every now and then
>   > (I think every five to ten minutes, but that is configurable) the cluster
>   > checks the whole status and therefore also detects services running where
>   > they shouldn't. With this you will probably find that once pacemaker sees
>   > the second icinga, it will shut down both to make sure and restart it on
>   > one node.

How can I configure this check to run every minute or less? I need to
make sure that the service is running on only one node, for the
integrity of the data.
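
A minimal sketch of one way this might be done, assuming the installed
Pacemaker version supports recurring monitor operations with
role="Stopped" (not verified for 1.1.5): a second monitor op, with an
interval different from the existing one, asks the cluster to also
check the nodes where icinga should not be running. The 45s interval
is an illustrative value, not a recommendation.

    primitive icinga lsb:icinga \
            op start interval="0" timeout="30s" \
            op stop interval="0" timeout="30s" \
            op monitor interval="30s" \
            op monitor interval="45s" role="Stopped" \
            meta multiple-active="stop_start"

The two monitor operations need distinct intervals so the cluster can
tell them apart.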

>   >
>   >>   Problem 2: For testing, I stopped the service on nagt02 and made
>   >> some changes to the configuration files so the service won't start again
>   >> on nagt02. What I am testing is that when the node comes back from a
>   >> failover and the service is not able to start on nagt02, it should start
>   >> on nagt02a. But I am getting the below error.
>   >>
>   >>   root@cgnagt02:~# crm_mon --one-shot
>   >>   Online: [ cgnagt02 cgnagt02a ]
>   >>    icinga (lsb:icinga):   Started cgnagt02 (unmanaged) FAILED
>   >>   Failed actions:
>   >>       icinga_monitor_30000 (node=cgnagt02, call=4, rc=6, status=complete): not configured
>   >>       icinga_stop_0 (node=cgnagt02, call=5, rc=6, status=complete): not configured
>   >
>   > Looks as if making the service "not start" also made the service "not
>   > stop". And pacemaker won't start a service on one node if it can't
>   > definitely shut it down on another node. Unless you configure fencing and
>   > the failed host gets killed by that, I guess.

I have shut down both systems and switched them on one by one. I
switched on nagt02 first; the system came up, but it was unable to start
the service due to some problem (configuration file error, filesystem
error). Will the service then start on nagt02a automatically? If so,
how can I configure this?
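
A minimal sketch of the settings that usually govern this, offered as
an assumption rather than something confirmed in this thread: by
default Pacemaker treats a failed start as fatal for that node
(start-failure-is-fatal defaults to true) and then tries the resource
on another node, provided the stop on the failed node succeeds. The
migration-threshold and failure-timeout values below are illustrative
only.

    primitive icinga lsb:icinga \
            op start interval="0" timeout="30s" \
            op stop interval="0" timeout="30s" \
            op monitor interval="30s" \
            meta multiple-active="stop_start" \
                 migration-threshold="1" failure-timeout="60s"

Once the underlying problem on nagt02 is fixed, the recorded failure
can be cleared with:

    crm resource cleanup icinga

The rc=6 ("not configured") results quoted above may be a different
case: as far as I understand, Pacemaker treats that return code as a
fatal error rather than an ordinary start failure, so the init script
would probably have to return a generic failure code for this kind of
failover to apply.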

Rgds,
Aravind M D