[Pacemaker] Solved: [Linux-HA] SLES 11 HAE SP1 Signon to CIB Failed

Darren.Mansell at opengi.co.uk Darren.Mansell at opengi.co.uk
Thu Feb 3 05:47:32 EST 2011


On Fri, Jan 28, 2011 at 1:06 PM,  <Darren.Mansell at opengi.co.uk> wrote:
> Hi all, this seems like it should be an easy one to fix, I'll raise a 
> support call with Novell if required.
>
>
>
> Base install of SLES 11 32 bit SP1 with HAE SP1 and crm_mon gives 
> 'signon to CIB failed'. Same thing with the CRM shell etc.

Too many open file descriptors?
lsof might show something interesting



-----------


Unfortunately not.

It seems that corosync isn't spawning any child processes, which is
what causes this issue:

From a SLES 11 HAE install:

root      7342  5.6  0.2 166048 38924 ?        SLl   2010 5685:08 aisexec
root      7349  0.0  0.0  67768 10516 ?        SLs   2010   3:02  \_ /usr/lib64/heartbeat/stonithd
90        7350  0.0  0.0  65028  4656 ?        S     2010   7:43  \_ /usr/lib64/heartbeat/cib
nobody    7351  0.0  0.0  61600  1832 ?        S     2010   8:24  \_ /usr/lib64/heartbeat/lrmd
90        7352  0.0  0.0  66284  2320 ?        S     2010   0:00  \_ /usr/lib64/heartbeat/attrd
90        7353  0.0  0.0  67536  3588 ?        S     2010   1:24  \_ /usr/lib64/heartbeat/pengine
90        7354  0.0  0.0  72392  3712 ?        S     2010   6:01  \_ /usr/lib64/heartbeat/crmd
root      7355  0.0  0.0  75148  2504 ?        S     2010   2:25  \_ /usr/lib64/heartbeat/mgmtd
root      4040  0.0  0.0      0     0 ?        Z     2010   0:00  \_ [aisexec] <defunct>
root      4059  0.0  0.0      0     0 ?        Z     2010   0:00  \_ [aisexec] <defunct>

From a SLES 11 SP1 HAE install:

root      9109  0.0  0.4  13308  2288 tty1     Ss+  Feb02   0:00  \_ -bash
root      8989  0.0  0.1   4344   744 tty2     Ss+  Feb02   0:00 /sbin/mingetty tty2
root      8990  0.0  0.1   4344   752 tty3     Ss+  Feb02   0:00 /sbin/mingetty tty3
root      8991  0.0  0.1   4344   748 tty4     Ss+  Feb02   0:00 /sbin/mingetty tty4
root      8992  0.0  0.1   4344   748 tty5     Ss+  Feb02   0:00 /sbin/mingetty tty5
root      8993  0.0  0.1   4344   744 tty6     Ss+  Feb02   0:00 /sbin/mingetty tty6
root     24883  0.0  0.8  89808  4424 ?        Ssl  Feb02   0:34 /usr/sbin/corosync
lookup-01:~ # 
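
For anyone who wants to make the same comparison without eyeballing
process listings, here is a rough Python sketch. It takes the text of a
ps listing like the two above and reports which of the daemons seen on
the working node are absent. The function name missing_daemons and the
EXPECTED tuple are my own for illustration; the daemon names are simply
the ones visible in the SLES 11 listing, not an authoritative list.

```python
import os

# Daemon names taken from the working SLES 11 HAE listing above.
# This is an assumption about what "healthy" looks like, not an official list.
EXPECTED = ("stonithd", "cib", "lrmd", "attrd", "pengine", "crmd", "mgmtd")


def missing_daemons(ps_output, expected=EXPECTED):
    """Return the expected daemon names that do not appear in ps_output.

    Hypothetical helper: it compares the basename of every
    whitespace-separated token against the expected names, so
    "/usr/lib64/heartbeat/cib" counts as "cib".
    """
    seen = {os.path.basename(token) for token in ps_output.split()}
    return [name for name in expected if name not in seen]
```

Fed the SP1 listing, where only /usr/sbin/corosync is running, this
should return every name in EXPECTED; fed the SLES 11 listing it should
return an empty list. Matching on basenames is crude and could be fooled
by an unrelated argument with the same name, so treat it as a quick
check only.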

So I compared the /etc/ais/openais.conf in non-sp1 with
/etc/corosync/corosync.conf from sp1 and found this bit missing which
could be quite useful...

service {
        # Load the Pacemaker Cluster Resource Manager
        ver:       0
        name:      pacemaker
        use_mgmtd: yes
        use_logd:  yes
}

Added it and it works. Doh.

It seems the example corosync.conf that is shipped won't start
pacemaker. I'm not sure whether that's on purpose, but I found it a
bit confusing after being used to it 'just working' previously.
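
In case it helps someone else spot the same problem, below is a minimal
Python sketch that checks whether the text of a corosync.conf contains a
service block naming pacemaker, like the stanza above. The function name
has_pacemaker_service is hypothetical, and the parsing is deliberately
naive: it assumes '#' starts a comment (as in the stanza above) and that
service blocks contain no nested braces.

```python
import re


def has_pacemaker_service(conf_text):
    """Return True if conf_text has a service { ... } block with name: pacemaker.

    Hypothetical helper for illustration; not part of corosync or pacemaker.
    """
    # Drop '#' comments so a commented-out stanza is not counted.
    lines = [line.split("#", 1)[0] for line in conf_text.splitlines()]
    body = "\n".join(lines)

    # Assumption: no nested braces inside a service block, so a
    # non-greedy match up to the first '}' captures the whole block.
    for block in re.findall(r"\bservice\s*\{(.*?)\}", body, flags=re.DOTALL):
        if re.search(r"^\s*name\s*:\s*pacemaker\s*$", block, flags=re.MULTILINE):
            return True
    return False
```

You would call it with the contents of /etc/corosync/corosync.conf. A
False result on a fresh SP1 install would match what I saw here, though
I have only tried the one shipped example file.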



