[Pacemaker] OpenAIS vs. Corosync

Ryan Steele ryans at aweber.com
Thu May 21 18:39:16 EDT 2009


Well, I'm still not sure what the state of OpenAIS/Corosync is, but I seem to encounter only more problems the further I 
dig into it.  Here are the versions of the software packages I'm using:

corosync            0.92-0ubuntu3
libcorosync-dev     0.92-0ubuntu3
libcorosync2        0.92-0ubuntu3
libopenais-dev      0.91-0ubuntu3
libopenais2         0.91-0ubuntu3
libstonith0         2.99.1-1
openais             0.91-0ubuntu3
pacemaker           1.0.2-1
pacemaker-dev       1.0.2-1
pacemaker-mgmt      1.99.0-3
pacemaker-mgmt-dev  1.99.0-3
stonith             2.99.1-1


That being said, even with logging set to 'debug' I still have no idea why I'm seeing this in the logs:


$ [SERV  ] Service failed to load 'pacemaker'.


Here's the openais.conf/corosync.conf I'm using:

corosync {
    user:  root
    group: root
}

aisexec {
    user:  root
    group: root
}

service {
    name: pacemaker
    ver: 0
}

totem {
    version: 2
    token:          10000
    token_retransmits_before_loss_const: 20
    join:           60
    consensus:      4800
    max_messages:   20
    secauth: off
    threads: 0
    rrp_mode: none
    vsftype: none
    clear_node_high_bit: yes

    interface {
       ringnumber: 0
       bindnetaddr: 192.168.7.0
       mcastaddr: 226.94.1.1
       mcastport: 5405
    }
}

logging {
    fileline: on
    to_stderr: yes
    to_file: yes
    to_syslog: yes
    logfile: /var/log/openais/openais.log
    syslog_facility: daemon
    debug: on
    timestamp: on
    logger_subsys {
       subsys: AMF
       debug: on
       tags: enter|leave|trace1|trace2|trace3|trace4|trace6
    }
}

amf {
    mode: disabled
}


I know that root is a member of the proper groups:

root@ha1:~# whoami; groups
root
root haclient

And that the ownership and modes in /var/run/heartbeat are correct (as unrestrictive as possible):

root@ha1:~# ls -alt /var/run/heartbeat/*
/var/run/heartbeat/ccm:
total 0
drwxrwxrwx 4 hacluster haclient 80 2009-05-21 16:12 ..
drwxrwxrwx 2 hacluster haclient 40 2008-11-20 10:06 .

/var/run/heartbeat/crm:
total 0
drwxrwxrwx 4 hacluster haclient 80 2009-05-21 16:12 ..
drwxrwxrwx 2 hacluster haclient 40 2008-11-20 10:06 .


I straced three things: starting the daemon via the corosync init script, starting the daemon manually, and querying 
the CIB via cibadmin (commands and output are at the end of this message).  Only the cibadmin trace gave any useful 
information: it couldn't find two local sockets (/var/run/heartbeat/crm/cib_callback and 
/var/run/heartbeat/crm/cib_rw).  But that's literally all I've got to go on.  Can anyone shed light on this issue?  Is 
there any way I can get Corosync to log useful information about why a service can't be loaded?  The debug logs are far 
too terse for me to diagnose the problems I'm seeing ("Service failed to load 'pacemaker'" just doesn't cut it).  If 
any additional information is needed, please let me know and I'll do my best to provide it.


$ strace -F /etc/init.d/corosync start > initscript.out 2>&1
$ /etc/init.d/corosync stop
$ strace -F /usr/sbin/corosync > binary.out 2>&1
$ strace cibadmin -Q > cibadmin.out 2>&1



$ grep ENOENT *.out | sort -u
binary.out:access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
binary.out:access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)


cibadmin.out:access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
cibadmin.out:access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
cibadmin.out:connect(4, {sa_family=AF_FILE, path="/var/run/heartbeat/crm/cib_callback"}, 110) = -1 ENOENT (No such file or directory)
cibadmin.out:connect(4, {sa_family=AF_FILE, path="/var/run/heartbeat/crm/cib_rw"}, 110) = -1 ENOENT (No such file or directory)
cibadmin.out:open("/usr/lib/openais/libbz2.so.1.0", O_RDONLY) = -1 ENOENT (No such file or directory)
cibadmin.out:open("/usr/lib/openais/libccmclient.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
cibadmin.out:open("/usr/lib/openais/libcib.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
cibadmin.out:open("/usr/lib/openais/libcrmcommon.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
cibadmin.out:open("/usr/lib/openais/libcrypto.so.0.9.8", O_RDONLY) = -1 ENOENT (No such file or directory)
cibadmin.out:open("/usr/lib/openais/libc.so.6", O_RDONLY) = -1 ENOENT (No such file or directory)
cibadmin.out:open("/usr/lib/openais/libdl.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
cibadmin.out:open("/usr/lib/openais/libgcrypt.so.11", O_RDONLY) = -1 ENOENT (No such file or directory)
cibadmin.out:open("/usr/lib/openais/libglib-2.0.so.0", O_RDONLY) = -1 ENOENT (No such file or directory)
cibadmin.out:open("/usr/lib/openais/libgnutls.so.13", O_RDONLY) = -1 ENOENT (No such file or directory)
cibadmin.out:open("/usr/lib/openais/libgpg-error.so.0", O_RDONLY) = -1 ENOENT (No such file or directory)
cibadmin.out:open("/usr/lib/openais/libhbclient.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
cibadmin.out:open("/usr/lib/openais/libltdl.so.3", O_RDONLY) = -1 ENOENT (No such file or directory)
cibadmin.out:open("/usr/lib/openais/libm.so.6", O_RDONLY) = -1 ENOENT (No such file or directory)
cibadmin.out:open("/usr/lib/openais/libncurses.so.5", O_RDONLY) = -1 ENOENT (No such file or directory)
cibadmin.out:open("/usr/lib/openais/libpam.so.0", O_RDONLY) = -1 ENOENT (No such file or directory)
cibadmin.out:open("/usr/lib/openais/libpcre.so.3", O_RDONLY) = -1 ENOENT (No such file or directory)
cibadmin.out:open("/usr/lib/openais/libpils.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
cibadmin.out:open("/usr/lib/openais/libplumb.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
cibadmin.out:open("/usr/lib/openais/libpthread.so.0", O_RDONLY) = -1 ENOENT (No such file or directory)
cibadmin.out:open("/usr/lib/openais/librt.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
cibadmin.out:open("/usr/lib/openais/libtasn1.so.3", O_RDONLY) = -1 ENOENT (No such file or directory)
cibadmin.out:open("/usr/lib/openais/libuuid.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
cibadmin.out:open("/usr/lib/openais/libxml2.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
cibadmin.out:open("/usr/lib/openais/libxslt.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
cibadmin.out:open("/usr/lib/openais/libz.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
cibadmin.out:open("/usr/lib/openais/tls/libcrmcommon.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
cibadmin.out:open("/usr/lib/openais/tls/x86_64/libcrmcommon.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
cibadmin.out:open("/usr/lib/openais/x86_64/libcrmcommon.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
cibadmin.out:stat("/usr/lib/openais/tls", 0x7fff3f31fe80) = -1 ENOENT (No such file or directory)
cibadmin.out:stat("/usr/lib/openais/tls/x86_64", 0x7fff3f31fe80) = -1 ENOENT (No such file or directory)
cibadmin.out:stat("/usr/lib/openais/x86_64", 0x7fff3f31fe80) = -1 ENOENT (No such file or directory)


initscript.out:access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
initscript.out:access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
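
Most of the ENOENT lines above appear to be the dynamic loader probing its search path: the /etc/ld.so.* checks are 
routine, and the /usr/lib/openais/ misses are presumably fallbacks, since cibadmin still ran far enough to attempt its 
socket connections.  A minimal sketch of a filter that hides those lines follows.  The two exclusion patterns are 
assumptions drawn only from the output in this message, and filter_enoent is a hypothetical helper name, not part of 
any package listed here.

```shell
# Keep only ENOENT lines that are NOT dynamic-loader search noise.
# Assumption: anything under /etc/ld.so.* or /usr/lib/openais/ is a
# harmless loader probe, as the output in this message suggests.
# Reads the named files, or stdin when no file is given.
filter_enoent() {
    grep ENOENT "$@" \
        | grep -v -e '/etc/ld\.so\.' -e '/usr/lib/openais/' \
        | sort -u
}
```

Run as `filter_enoent *.out` in the directory holding the three trace files; on the output above it should leave only 
the two failed connect() calls on the sockets under /var/run/heartbeat/crm.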

-- 
Ryan Steele
Systems Administrator



