[Pacemaker] Remote Access not Working

Colin colin.hch at gmail.com
Mon Nov 9 11:24:21 EST 2009


Hi All,

I just tried to get remote access to the cluster up and running, but
with more errors than success...

The starting point was a working cluster installation. Then I ran

# cibadmin --modify -X '<cib remote-clear-port="6900"/>'
# /etc/init.d/corosync stop
# /etc/init.d/corosync start

to get the listener, erm, listening:

# netstat -ant | grep 6900
tcp        0      0 0.0.0.0:6900            0.0.0.0:*               LISTEN

For a first test I also changed the password of the "hacluster" user.

Then, on another machine, I set up the environment variables as follows:

# env | grep CIB
CIB_server=192.168.80.10
CIB_user=hacluster
CIB_port=6900

And issued a simple command, crm_resource --list. The crm_resource
command asks for a password and then hangs. On the cluster machine I
find the following in /var/log/daemon.log:

Nov  9 17:15:10 mz-dom0-001-4000 cib: [15698]: debug: cib_remote_listen: New clear-text connection
Nov  9 17:15:10 mz-dom0-001-4000 cib: [15698]: ERROR: crm_xml_err: XML Error: Entity: line 1: parsererror : Start tag expected, '<' not found
Nov  9 17:15:10 mz-dom0-001-4000 cib: [15698]: ERROR: crm_xml_err: XML Error: #026#003#002
Nov  9 17:15:10 mz-dom0-001-4000 cib: [15698]: ERROR: crm_xml_err: XML Error: ^
Nov  9 17:15:10 mz-dom0-001-4000 cib: [15698]: WARN: string2xml: Parsing failed (domain=1, level=3, code=4): Start tag expected, '<' not found
Nov  9 17:15:10 mz-dom0-001-4000 cib: [15698]: ERROR: string2xml: Couldn't parse 3 chars: #026#003#002
Nov  9 17:15:10 mz-dom0-001-4000 cib: [15698]: ERROR: cib_recv_remote_msg: Couldn't parse: '#026#003#002'
Nov  9 17:15:26 mz-dom0-001-4000 cib: [15698]: ERROR: cib_recv_remote_msg: Empty reply
Nov  9 17:15:27 mz-dom0-001-4000 cib: [15698]: ERROR: cib_recv_remote_msg: Empty reply
Nov  9 17:15:28 mz-dom0-001-4000 cib: [15698]: ERROR: cib_recv_remote_msg: Empty reply
Nov  9 17:15:29 mz-dom0-001-4000 cib: [15698]: ERROR: cib_recv_remote_msg: Empty reply
Nov  9 17:15:30 mz-dom0-001-4000 cib: [15698]: ERROR: cib_recv_remote_msg: Empty reply
.........
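
A note on the three bytes the cib could not parse: assuming the #NNN
sequences are the three-digit octal escapes that syslog daemons such as
rsyslog write for control characters, #026#003#002 is 0x16 0x03 0x02.
That matches the start of a TLS record header (content type 22 is
"handshake", and version 3.2 is TLS 1.1), which would suggest the client
is attempting an encrypted connection against the clear-text port. The
Python sketch below only decodes the bytes from the log; the function
names are made up for illustration and are not part of Pacemaker.

```python
import re

# Content types defined for the TLS record layer.
TLS_CONTENT_TYPES = {
    20: "change_cipher_spec",
    21: "alert",
    22: "handshake",
    23: "application_data",
}

# Minor version numbers that can follow major version 3.
TLS_MINOR_VERSIONS = {0: "SSL 3.0", 1: "TLS 1.0", 2: "TLS 1.1", 3: "TLS 1.2"}


def decode_syslog_octal(text):
    """Turn '#026#003#002' into raw bytes, assuming each #NNN is octal."""
    # [0-3][0-7]{2} keeps every value within 0..255.
    return bytes(int(digits, 8) for digits in re.findall(r"#([0-3][0-7]{2})", text))


def describe_tls_prefix(data):
    """Return a description if data starts like a TLS record, else None."""
    if len(data) < 3 or data[1] != 3:
        return None
    content_type = TLS_CONTENT_TYPES.get(data[0])
    version = TLS_MINOR_VERSIONS.get(data[2])
    if content_type is None or version is None:
        return None
    return f"{content_type} record, {version}"


raw = decode_syslog_octal("#026#003#002")
print(raw.hex(" "))              # 16 03 02
print(describe_tls_prefix(raw))  # handshake record, TLS 1.1
```

If that reading is right, the CIB_encrypted environment variable listed
in the Pacemaker documentation next to CIB_server, CIB_user and CIB_port
may be worth checking on the client side. This has not been verified
against the version in use here.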

The "Empty reply" errors continue indefinitely, one every second, and
the cib process does not exit when the cluster is stopped the normal way:

# /etc/init.d/corosync stop
Stopping corosync daemon: corosync.
# ps aux | grep cib
105      15698  0.3  0.7  13844  4588 ?        S    17:12   0:01 /usr/lib/heartbeat/cib

This seems to prevent other processes from cleanly shutting down, too.

Am I doing something obviously wrong?

Thanks, Colin


PS: AFAICS the remote access does not support anything like failover
or connections to multiple cluster hosts, so will I have to roll my own
wrapper to take care of that?



