[Pacemaker] Remote Access not Working

Colin colin.hch at gmail.com
Thu Nov 12 10:46:50 EST 2009


On Thu, Nov 12, 2009 at 3:36 PM, Andrew Beekhof <andrew at beekhof.net> wrote:
> I used it the other day.
>
> http://www.clusterlabs.org/doc/pacemaker-explained/ch-advanced-options.html#s-remote-connection
>
> Try setting CIB_encrypted to false.

Thanks, that got me a step further...

...but there are still various issues:

1) In cib/remote.c, the function check_group_membership() only checks
whether the user is explicitly listed as a member of the group in
/etc/group. It does not accept a user whose primary group in
/etc/passwd is the correct group if the explicit (and then redundant)
membership in /etc/group is missing.

2) "Configuration Explained" does not mention CIB_encrypted, which is
why my first attempts didn't work.

3) "Configuration Explained" says "remote-open-port" instead of
"remote-clear-port" in one place.

4) "Configuration Explained" says that CIB_user must be in the
"hacluster" group, rather than the "haclient" group.

5) The log message "cib: [2941]: debug: cib_remote_listen: New
clear-text connection" should include from where the connection came.

6) The log message "cib: [2941]: ERROR: cib_remote_listen: User is not
a member of the required group" might mention which user and which
group...

7) "Configuration Explained" and the page you just sent me both state
that CIB_user must be part of the hacluster group. Apart from the
mistake that the group is really haclient, the comment in cib/remote.c
and my observations show that CIB_user actually must be the user that
the cib process is running as.

8) Just tried with crm_resource: The password prompt when not setting
CIB_password is sent to stdout, rather than stderr [which makes it
nearly impossible to redirect the output elsewhere].

9) I am getting completely bogus results via the remote connection,
e.g. "crm_resource --list" shows only 2 of 8 resources, and shows them
as stopped, whereas on the cluster nodes I see the correct list with
8 resources, all of them started. With "cibadmin -Q" I get:

# cibadmin -Q | wc  # on a cluster node
    379    1895   50474

# cibadmin -Q | wc  # via the remote connection
cibadmin: Opened connection to 192.168.80.10:6900
     66     193    4731

10) It's very easy to trash the cib process, e.g. by connecting via
telnet and sending a few bytes of garbage. The result is an endless
loop of "cib: [7846]: ERROR: cib_recv_remote_msg: Empty reply"
messages, one per second, and I need to "killall -9 cib" in order to
get everything working again.
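
To illustrate the behaviour asked for in 1), here is a minimal shell
sketch of a membership test that accepts primary and supplementary
groups alike. It is not the code from cib/remote.c. The function name
user_in_group is made up for illustration, and the sketch relies on
"id -Gn", which prints a user's primary group along with the
supplementary ones. It assumes group names contain no whitespace.

```shell
#!/bin/sh
# Hypothetical helper, not part of Pacemaker: succeed if user $1 belongs
# to group $2, either as the primary group (/etc/passwd) or as an
# explicitly listed member (/etc/group).
user_in_group() {
    user=$1
    group=$2
    # id -Gn prints every group name of the user, primary one included.
    for g in $(id -Gn "$user" 2>/dev/null); do
        if [ "$g" = "$group" ]; then
            return 0
        fi
    done
    return 1
}

# Example: check the account used for the remote connection,
# falling back to the current user if CIB_user is unset.
if user_in_group "${CIB_user:-$(id -un)}" haclient; then
    echo "member of haclient"
else
    echo "not a member of haclient"
fi
```

Run on the cluster node, this reports membership for an account whose
only link to haclient is its primary group in /etc/passwd, which is the
case the current check rejects.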

Only once, out of a couple dozen attempts, did the remote access
actually yield the correct output; the other times it fails completely
without any apparent reason ... at this point I'm not quite sure what
to make of all this.
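
For reference, the remote query in 9) can be set up along the following
lines. This is a sketch only: the address and port are the ones from
the cibadmin output above, the variable names CIB_server and CIB_port
are taken from the remote-connection documentation linked at the top
and should be checked against the installed version, and the user name
is a placeholder.

```shell
#!/bin/sh
# Sketch of a clear-text remote query. Everything except host and port
# is a placeholder or an assumption.
export CIB_server=192.168.80.10   # host from the cibadmin output above
export CIB_port=6900              # port from the cibadmin output above
export CIB_user=hacluster         # assumption: account the cib runs as
export CIB_encrypted=false        # clear-text connection, as suggested

# Count lines, words and bytes of the remote CIB, for comparison with
# the result obtained locally on a cluster node.
if command -v cibadmin >/dev/null 2>&1; then
    cibadmin -Q | wc
else
    echo "cibadmin not found in PATH" >&2
fi
```

No password variable is set here, so the tool prompts for one, which
is where the stdout prompt from 8) gets in the way of the pipe.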

Regards, Colin



