[Pacemaker] Remote Access not Working

Andrew Beekhof andrew at beekhof.net
Mon Nov 16 10:42:29 EST 2009


On Mon, Nov 16, 2009 at 4:31 PM, Colin <colin.hch at gmail.com> wrote:
> Hi Andrew,
>
> thanks for your response!
>
> On Mon, Nov 16, 2009 at 3:19 PM, Andrew Beekhof <andrew at beekhof.net> wrote:
>> On Thu, Nov 12, 2009 at 4:46 PM, Colin <colin.hch at gmail.com> wrote:
>>> On Thu, Nov 12, 2009 at 3:36 PM, Andrew Beekhof <andrew at beekhof.net> wrote:
>>
>>> 5) The log message "cib: [2941]: debug: cib_remote_listen: New
>>> clear-text connection" should include from where the connection came.
>>
>> why and how?
>
> Why: It's like "file not found" without the info which file wasn't
> found ... perhaps it's just me, but I would like to see the source IP
> and port of the connection.
>
> How: You're probably not asking me how to implement the feature, so
> I'm assuming that you misunderstood what exactly I was asking for(?).

No, I'm saying that I'm pretty sure we don't have access to the IP information.
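
For reference, on a TCP listener accept() fills in the peer address when
it is given a buffer, and getpeername() returns it for a connected
socket. Whether that address is still at hand where the message is
logged has not been checked here. A minimal sketch of turning such an
address into "ip:port" for a log line; format_peer() is a hypothetical
helper, not Pacemaker code:

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper (not part of Pacemaker): render the address that
 * accept() or getpeername() filled in as "ip:port".
 * Returns 0 on success, -1 on an unsupported family or a short buffer. */
static int format_peer(const struct sockaddr *addr, char *out, size_t out_len)
{
    char ip[INET6_ADDRSTRLEN];
    unsigned port;
    int n;

    assert(addr != NULL && out != NULL && out_len > 0);

    if (addr->sa_family == AF_INET) {
        const struct sockaddr_in *a = (const struct sockaddr_in *)addr;
        if (inet_ntop(AF_INET, &a->sin_addr, ip, sizeof ip) == NULL) {
            return -1;
        }
        port = ntohs(a->sin_port);
    } else if (addr->sa_family == AF_INET6) {
        const struct sockaddr_in6 *a6 = (const struct sockaddr_in6 *)addr;
        if (inet_ntop(AF_INET6, &a6->sin6_addr, ip, sizeof ip) == NULL) {
            return -1;
        }
        port = ntohs(a6->sin6_port);
    } else {
        return -1; /* e.g. a UNIX-domain socket: no IP to report */
    }

    n = snprintf(out, out_len, "%s:%u", ip, port);
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

The result could then be appended to the "New clear-text connection"
debug message.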

>
>>> 6) The log message "cib: [2941]: ERROR: cib_remote_listen: User is not
>>> a member of the required group" might mention which user and which
>>> group...
>>
>> it doesn't do so for security reasons
>
> Hm.
>
> Security? I see, that's when you use unencrypted remote syslogging --
> anybody already on the machine could just use ps(1).
>
> How about logging it in the ERROR messages, but only when
> debug-logging is enabled?

No, because then I'll get confused emails from people wondering why
there is a stream of ERRORs in the logs.

>
>>> 8) Just tried with crm_resource: The password prompt when not setting
>>> CIB_password is sent to stdout, rather than stderr [which makes it
>>> near impossible to send the output someplace].
>>
>> we can probably change that
>
> That'd be great, also because the new behaviour would be more in-line
> with what many other command line programs do...
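
A minimal sketch of what that change could look like: the prompt goes to
a separate stream (stderr in practice) so that stdout carries only the
command output. prompt_line() is a hypothetical helper, not the actual
crm_resource code, and a real password prompt would also turn off
terminal echo, which is left out here:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not Pacemaker code): write `prompt` to
 * `prompt_out` (normally stderr) and read one line from `in` into `buf`,
 * stripping the trailing newline.
 * Returns 0 on success, -1 on EOF, read error, or an empty buffer. */
static int prompt_line(FILE *prompt_out, FILE *in, const char *prompt,
                       char *buf, size_t len)
{
    assert(prompt_out != NULL && in != NULL && prompt != NULL);

    if (buf == NULL || len == 0) {
        return -1;
    }

    fputs(prompt, prompt_out);
    fflush(prompt_out); /* make sure the prompt shows before we block */

    if (fgets(buf, (int)len, in) == NULL) {
        return -1;
    }
    buf[strcspn(buf, "\r\n")] = '\0';
    return 0;
}
```

Called as prompt_line(stderr, stdin, "Password: ", buf, sizeof buf),
this keeps something like "crm_resource --list > out.txt" free of the
prompt text.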
>
>>> 9) I am getting completely bogus results via the remote connection,
>>> e.g. "crm_resource --list" shows only 2 of 8 resources, and shows them
>>> as stopped, whereas on the cluster nodes I see the -- correct -- list
>>> with 8 resources which are all started. With "cibadmin -Q" I get:
>>>
>>> # cibadmin -Q | wc  # on a cluster node
>>>    379    1895   50474
>>>
>>> # cibadmin -Q | wc  # via the remote connection
>>> cibadmin: Opened connection to 192.168.80.10:6900
>>>     66     193    4731
>>
>> someone else mentioned that, I've not been able to reproduce it yet.
>
> Weird. I'm using the precompiled Debian packages for Pacemaker 1.0.6
> with Corosync. Anything that might help debug the problem?

add more hours to the day? :)

>
> root@cluster1:~# tail -f /var/log/daemon.log
> Nov 16 15:53:33 cluster1 cib: [24749]: debug: cib_remote_listen: New
> clear-text connection
> Nov 16 15:53:34 cluster1 cib: [24749]: info: log_data_element:
> cib_remote_listen: Login:  <cib_command op="authenticate"
> user="hacluster" password="*****" hidden="password" />
> Nov 16 15:53:34 cluster1 cib: [24749]: debug: cib_remote_listen: New
> clear-text connection
> Nov 16 15:53:35 cluster1 cib: [24749]: info: log_data_element:
> cib_remote_listen: Login:  <cib_command op="authenticate"
> user="hacluster" password="*****" hidden="password" />
> Nov 16 15:53:35 cluster1 corosync[7426]:   [TOTEM ] mcasted message
> added to pending queue
> [... more corosync messages ...]
> Nov 16 15:53:35 cluster1 corosync[7426]:   [TOTEM ] releasing messages
> up to and including 48a
> Nov 16 15:53:35 cluster1 cib: [24749]: ERROR: cib_recv_remote_msg: Empty reply
> Nov 16 15:53:35 cluster1 cib: [24749]: ERROR: cib_recv_plaintext:
> Error receiving message: -1: Connection reset by peer (104)
> Nov 16 15:53:35 cluster1 cib: [24749]: ERROR: cib_recv_remote_msg: Empty reply
> ^C
> root@cluster1:~# cibadmin -Q | wc
>    382    1943   51825
> root@cluster1:~#
>
> root@admin:~# cibadmin -Q > cib.xml
> cibadmin: Opened connection to 192.168.80.10:6900
> root@admin:~# wc cib.xml
>  86  255 6379 cib.xml
> root@admin:~#
>
>>> 10) It's very easy to trash the cib process, e.g. by connecting via
>>> telnet and sending a few bytes of garbage; result is an endless loop
>>> of "cib: [7846]: ERROR: cib_recv_remote_msg: Empty reply" messages,
>>> one per second, and that I need to "killall -9 cib" in order to get
>>> everything working again.
>>
>> ok, that's not good.
>> I think this patch should fix it though:
>>
>> diff -r 828b3329a64c cib/remote.c
>> --- a/cib/remote.c      Fri Nov 06 16:28:21 2009 +0100
>> +++ b/cib/remote.c      Mon Nov 16 15:18:41 2009 +0100
>> @@ -220,7 +220,7 @@ cib_remote_listen(int ssock, gpointer da
>>        }
>>
>>        do {
>> -               crm_debug_2("Iter: %d", lpc++);
>> +               crm_debug_2("Iter: %d", lpc);
>>                if(ssock == remote_tls_fd) {
>>  #ifdef HAVE_GNUTLS_GNUTLS_H
>>                    login = cib_recv_remote_msg(session, TRUE);
>> @@ -230,7 +230,7 @@ cib_remote_listen(int ssock, gpointer da
>>                }
>>                sleep(1);
>>
>> -       } while(login == NULL && lpc < 10);
>> +       } while(login == NULL && ++lpc < 10);
>>
>>        crm_log_xml_info(login, "Login: ");
>>        if(login == NULL) {
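
To illustrate why moving the increment can matter: if the logging call
is a macro that skips evaluating its arguments when that debug level is
off (an assumption about crm_debug_2, not verified here), then "lpc++"
inside it never runs and the loop has no bound. A minimal stand-alone
sketch; DEBUG_LOG, debug_enabled and the callbacks are hypothetical
stand-ins, not Pacemaker code:

```c
#include <assert.h>
#include <stdio.h>

static int debug_enabled = 0; /* hypothetical runtime debug switch */

/* Hypothetical logging macro: the argument is evaluated only when
 * debugging is enabled, so side effects in it may silently vanish. */
#define DEBUG_LOG(fmt, arg) \
    do { if (debug_enabled) { fprintf(stderr, fmt "\n", (arg)); } } while (0)

/* Old shape: the counter is bumped inside the macro argument.
 * Returns how many increments actually happened in `calls` iterations. */
static int side_effect_count(int calls)
{
    int lpc = 0;
    for (int i = 0; i < calls; i++) {
        DEBUG_LOG("Iter: %d", lpc++);
    }
    return lpc;
}

/* Patched shape: the counter advances in the loop condition, so the
 * loop ends after max_tries attempts whatever the log level is.
 * Returns the number of attempts made. */
static int bounded_attempts(int (*try_recv)(void *), void *ctx, int max_tries)
{
    int lpc = 0;
    int got = 0;

    assert(try_recv != NULL && max_tries > 0);
    do {
        DEBUG_LOG("Iter: %d", lpc);
        got = try_recv(ctx);
        /* the real loop sleeps for a second here; omitted in the sketch */
    } while (!got && ++lpc < max_tries);

    return got ? lpc + 1 : lpc;
}

/* Example callbacks: a peer that never sends a valid message, and one
 * whose message arrives on the third attempt (ctx points to a counter). */
static int never_ready(void *ctx) { (void)ctx; return 0; }
static int ready_on_third(void *ctx) { int *n = ctx; return ++*n >= 3; }
```

With debugging off, side_effect_count() never advances the counter,
while bounded_attempts() gives up after max_tries attempts in either
mode.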
>
> Thanks, since we have been using precompiled packages I haven't
> actually gone through the exercise of compiling Pacemaker, so it might
> take some time before I get around to testing this patch...
>
> Regards, Colin
>
> _______________________________________________
> Pacemaker mailing list
> Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>



