[Pacemaker] corosync not able to exec services properly

Shravan Mishra shravan.mishra at gmail.com
Mon Jan 4 10:04:35 EST 2010


Hi,

I'm using corosync 1.1.1.
Should I still change it now, or leave it as is? If I change it now, I won't
have to remember to do it when I upgrade in the future.
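
To see where the quoted config stands against that rule, here is a rough
sketch in shell. It assumes the constraint is "consensus at least 1.2 * token",
which is how I read Steve's mail below; the function name consensus_ok is made
up for illustration, and 3000 and 1500 are the token and consensus values from
the totem section quoted further down.

```shell
#!/bin/sh
# Sketch only: check consensus against token using integer math.
# Both values are in milliseconds, as in corosync.conf.
# consensus_ok is a hypothetical helper, not part of corosync.
consensus_ok() {
    token=$1
    consensus=$2
    # consensus >= 1.2 * token  is the same as  consensus * 10 >= token * 12
    [ $((consensus * 10)) -ge $((token * 12)) ]
}

# Values taken from the totem section quoted in this thread.
if consensus_ok 3000 1500; then
    echo "consensus ok"
else
    echo "consensus too small: need at least $((3000 * 12 / 10))"
fi
```

With token: 3000 this reports that consensus would need to be at least 3600.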

Thanks
Shravan

On Sat, Jan 2, 2010 at 6:54 PM, Steven Dake <sdake at redhat.com> wrote:
> If you're using corosync 1.2.0, we enforced a constraint on consensus and
> token such that consensus must be at least 1.2 * token. Your consensus is
> 1/2 token, which will cause corosync to exit at start.
>
> Regards
> -steve
>
> On Mon, 2009-12-28 at 12:58 +0100, Dejan Muhamedagic wrote:
>> Hi,
>>
>> On Thu, Dec 24, 2009 at 02:35:01PM -0500, Shravan Mishra wrote:
>> > Hi Guys,
>> >
>> > I had a perfectly running system for about 3 weeks, but now, after a
>> > reboot, I see problems.
>> >
>> > Looks like the processes are being spawned and respawned but a proper exec
>> > is not happening.
>>
>> According to the logs, attrd can't start (exit code 100) for some
>> reason (perhaps there are more logs elsewhere where it says
>> what's wrong) and pengine segfaults. For the latter please
>> enable coredumps (ulimit -c unlimited) and file a bugzilla.
>>
>> > Am I missing some permissions on directories?
>> >
>> >
>> > I have a script which does the following for directories:
>>
>> Why do you need this script? It should be done by the package
>> installation scripts.
>>
>> > =============
>> > getent group haclient > /dev/null || groupadd -r haclient
>> > getent passwd hacluster > /dev/null || useradd -r -g haclient -d /var/lib/heartbeat/cores/hacluster -s /sbin/nologin -c "cluster user" hacluster
>> >
>> > if [ ! -d "/var/lib/pengine" ];then
>> >  mkdir /var/lib/pengine
>> > fi
>> > chown -R hacluster:haclient /var/lib/pengine
>> >
>> > if [ ! -d "/var/lib/heartbeat" ];then
>> >  mkdir /var/lib/heartbeat
>> > fi
>> >
>> > if [ ! -d "/var/lib/heartbeat/crm" ];then
>> >  mkdir /var/lib/heartbeat/crm
>> > fi
>> > chown -R hacluster:haclient /var/lib/heartbeat/crm/
>> > chmod 750 /var/lib/heartbeat/crm/
>> >
>> > if [ ! -d "/var/lib/heartbeat/ccm" ];then
>> >  mkdir /var/lib/heartbeat/ccm
>> > fi
>> > chown -R hacluster:haclient /var/lib/heartbeat/ccm/
>> > chmod 750 /var/lib/heartbeat/ccm/
>> >
>> > if [ ! -d "/var/run/heartbeat/" ];then
>> >  mkdir /var/run/heartbeat/
>> > fi
>> >
>> > if [ ! -d "/var/run/heartbeat/ccm" ];then
>> >  mkdir /var/run/heartbeat/ccm/
>> > fi
>> > chown -R hacluster:haclient /var/run/heartbeat/ccm/
>> > chmod 750 /var/run/heartbeat/ccm/
>>
>> You don't need ccm for corosync/openais clusters.
>>
>> > if [ ! -d "/var/run/heartbeat/crm" ];then
>> >  mkdir /var/run/heartbeat/crm/
>> > fi
>> > chown -R hacluster:haclient /var/run/heartbeat/crm/
>> > chmod 750 /var/run/heartbeat/crm/
>> >
>> > if [ ! -d "/var/run/crm" ];then
>> >  mkdir /var/run/crm
>> > fi
>> >
>> > if [ ! -d "/var/lib/corosync" ];then
>> >  mkdir /var/lib/corosync
>> > fi
>> > =============
>> >
>> >
>> > I have a very simple active-passive configuration with just 2 nodes.
>> >
>> > After starting Corosync, I see the following:
>> >
>> >
>> > [root@node2 ~]# ps -ef | grep coro
>> > root      8242     1  0 11:33 ?        00:00:00 /usr/sbin/corosync
>> > root      8248  8242  0 11:33 ?        00:00:00 /usr/sbin/corosync
>> > root      8249  8242  0 11:33 ?        00:00:00 /usr/sbin/corosync
>> > root      8250  8242  0 11:33 ?        00:00:00 /usr/sbin/corosync
>> > root      8252  8242  0 11:33 ?        00:00:00 /usr/sbin/corosync
>> > root      8393  8242  0 11:35 ?        00:00:00 /usr/sbin/corosync
>> > [root@node2 ~]# ps -ef | grep heart
>> > 82        7924     1  0 11:28 ?        00:00:00 /usr/lib64/heartbeat/pengine
>> >
>> > I'm attaching the log file.
>> >
>> > My config is:
>> >
>> >
>> > # Please read the corosync.conf.5 manual page
>> > compatibility: whitetank
>> >
>> > totem {
>> >  version: 2
>> >   token: 3000
>> >   token_retransmits_before_loss_const: 10
>> >   join: 60
>> >   consensus: 1500
>> >   vsftype: none
>> >   max_messages: 20
>> >   clear_node_high_bit: yes
>> >   secauth: on
>> >   threads: 0
>> >   rrp_mode: passive
>> > interface {
>> > ringnumber: 0
>> > bindnetaddr: 192.168.1.0
>> > # mcastaddr: 226.94.1.1
>> > broadcast: yes
>> > mcastport: 5405
>> > }
>> > interface {
>> > ringnumber: 1
>> > bindnetaddr: 172.20.20.0
>> > # mcastaddr: 226.94.1.1
>> > broadcast: yes
>> > mcastport: 5405
>> > }
>> > }
>> >
>> > logging {
>> > fileline: off
>> > to_stderr: yes
>> > to_logfile: yes
>> > to_syslog: yes
>> > logfile: /tmp/corosync.log
>>
>> Don't log to file. Can't recall exactly but there were some
>> permission problems with that, probably because Pacemaker daemons
>> don't run as root.
>>
>> Thanks,
>>
>> Dejan
>>
>> > debug: on
>> > timestamp: on
>> > logger_subsys {
>> > subsys: AMF
>> > debug: off
>> > }
>> > }
>> >
>> > service {
>> > name: pacemaker
>> > ver: 0
>> > }
>> >
>> > aisexec {
>> > user: root
>> > group: root
>> > }
>> >
>> > amf {
>> > mode: disabled
>> > }
>> >
>> >
>> > Please help.
>> >
>> > Sincerely
>> > Shravan
>>
>>
>> > _______________________________________________
>> > Pacemaker mailing list
>> > Pacemaker at oss.clusterlabs.org
>> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker

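On the directory permissions question in the quoted message, one way to see
what is actually on disk is to print owner, group and mode for each directory
the setup script touches. The following is only a sketch: report_dir is a
made-up helper name, the directory list is a subset of the paths from the
quoted script, and stat -c assumes GNU coreutils (BSD stat takes different
options).

```shell
#!/bin/sh
# Sketch only: print "owner:group mode path" for a directory,
# or "missing path" if it does not exist.
# report_dir is a hypothetical helper; stat -c requires GNU coreutils.
report_dir() {
    if [ -d "$1" ]; then
        stat -c '%U:%G %a %n' "$1"
    else
        echo "missing $1"
    fi
}

# Paths taken from the setup script quoted in this thread.
for d in /var/lib/pengine /var/lib/heartbeat/crm \
         /var/run/heartbeat/crm /var/run/crm /var/lib/corosync; do
    report_dir "$d"
done
```

For the crm directories, the quoted script sets hacluster:haclient and mode
750, so anything else in the output there would be worth a closer look.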


