<div dir="ltr"><div><div>Hello Bernardo<br><br></div>I don't know if this is the problem, but try this option<br><br> clear_node_high_bit<br> This configuration option is optional and is only relevant when no nodeid is specified. Some openais clients require a signed 32 bit nodeid that is<br>
greater than zero however by default openais uses all 32 bits of the IPv4 address space when generating a nodeid. Set this option to yes to force the high<br> bit to be zero and therefor ensure the nodeid is a positive signed 32 bit integer.<br>
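
For what it's worth, the nodeids in your logs do look like raw IPv4
addresses: 168385827 is 0x0A095D23, i.e. 10.9.93.35, and 168385835 is
10.9.93.43, so corosync is auto-generating them from the interface
address. The option goes in the totem section; a minimal sketch based on
your posted config (untested, please double-check against
corosync.conf(5)):

totem {
    version: 2
    crypto_cipher: none
    crypto_hash: none
    cluster_name: fiestaha
    # Force the high bit of auto-generated nodeids to zero so the
    # nodeid is always a positive signed 32-bit integer.
    clear_node_high_bit: yes
    interface {
        ringnumber: 0
        ttl: 1
        bindnetaddr: 10.9.93.0
        mcastaddr: 226.94.1.1
        mcastport: 5405
    }
}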
    WARNING: The cluster's behavior is undefined if this option is
    enabled on only a subset of the cluster (for example during a
    rolling upgrade).
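
If that doesn't help, another route (just a sketch; the name/id
assignments below are my assumption, see the nodelist section of
corosync.conf(5)) is to stop relying on auto-generated ids and declare
the nodes explicitly:

nodelist {
    node {
        ring0_addr: 10.9.93.35
        # name is what pacemaker will use for this node
        name: selavi
        nodeid: 1
    }
    node {
        ring0_addr: 10.9.93.43
        name: turifel
        nodeid: 2
    }
}

With explicit nodeids and names, pacemaker no longer has to map a
32-bit id back to an address, which is where your crm_get_peer warnings
point.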
<div class="gmail_quote">2013/6/27 Bernardo Cabezas Serra <span dir="ltr"><<a href="mailto:bcabezas@apsl.net" target="_blank">bcabezas@apsl.net</a>></span><br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Hello,

Our cluster was working OK on the corosync stack, with corosync 2.3.0
and pacemaker 1.1.8.

After upgrading (full versions and configs below), we began to have
problems with node names.
It's a two-node cluster, with node names "turifel" (DC) and "selavi".

When selavi joins the cluster, we get this warning in the selavi log:

-----
Jun 27 11:54:29 selavi attrd[11998]: notice: corosync_node_name:
Unable to get node name for nodeid 168385827
Jun 27 11:54:29 selavi attrd[11998]: notice: get_node_name: Defaulting
to uname -n for the local corosync node name
-----

This is OK, and also happened with version 1.1.8.

At the corosync level, all seems OK:
----
Jun 27 11:51:18 turifel corosync[6725]: [TOTEM ] A processor joined or
left the membership and a new membership (10.9.93.35:1184) was formed.
Jun 27 11:51:18 turifel corosync[6725]: [QUORUM] Members[2]: 168385827
168385835
Jun 27 11:51:18 turifel corosync[6725]: [MAIN ] Completed service
synchronization, ready to provide service.
Jun 27 11:51:18 turifel crmd[19526]: notice: crm_update_peer_state:
pcmk_quorum_notification: Node selavi[168385827] - state is now member
(was lost)
-------

But when starting pacemaker on selavi (the new node), the turifel log
shows this:

----
Jun 27 11:54:28 turifel crmd[19526]: notice: do_state_transition:
State transition S_IDLE -> S_INTEGRATION [ input=I_NODE_JOIN
cause=C_FSA_INTERNAL origin=peer_update_callback ]
Jun 27 11:54:28 turifel crmd[19526]: warning: crm_get_peer: Node
'selavi' and 'selavi' share the same cluster nodeid: 168385827
Jun 27 11:54:28 turifel crmd[19526]: warning: crmd_cs_dispatch:
Recieving messages from a node we think is dead: selavi[0]
Jun 27 11:54:29 turifel crmd[19526]: warning: crm_get_peer: Node
'selavi' and 'selavi' share the same cluster nodeid: 168385827
Jun 27 11:54:29 turifel crmd[19526]: warning: do_state_transition: Only
1 of 2 cluster nodes are eligible to run resources - continue 0
Jun 27 11:54:29 turifel attrd[19524]: notice: attrd_local_callback:
Sending full refresh (origin=crmd)
----

And selavi remains in pending state. Sometimes turifel (DC) fences
selavi, but other times it remains pending forever.

On the turifel node, all resources give warnings like this one:
warning: custom_action: Action p_drbd_ha0:0_monitor_0 on selavi is
unrunnable (pending)

On both nodes, uname -n and crm_node -n give the correct node names
(selavi and turifel respectively).

Do you think it's a configuration problem?

Below I give information about versions and configurations.

Best regards,
Bernardo.

-----
Versions (git/hg compiled versions):

corosync: 2.3.0.66-615d
pacemaker: 1.1.9-61e4b8f
cluster-glue: 1.0.11
libqb: 0.14.4.43-bb4c3
resource-agents: 3.9.5.98-3b051
crmsh: 1.2.5

The cluster also has drbd, dlm and gfs2, but I think their versions are
irrelevant here.

--------
Output of pacemaker configuration:
./configure --prefix=/opt/ha --without-cman \
--without-heartbeat --with-corosync \
--enable-fatal-warnings=no --with-lcrso-dir=/opt/ha/libexec/lcrso

pacemaker configuration:
Version = 1.1.9 (Build: 61e4b8f)
Features = generated-manpages ascii-docs ncurses
libqb-logging libqb-ipc lha-fencing upstart nagios corosync-native snmp
libesmtp

Prefix = /opt/ha
Executables = /opt/ha/sbin
Man pages = /opt/ha/share/man
Libraries = /opt/ha/lib
Header files = /opt/ha/include
Arch-independent files = /opt/ha/share
State information = /opt/ha/var
System configuration = /opt/ha/etc
Corosync Plugins = /opt/ha/lib

Use system LTDL = yes

HA group name = haclient
HA user name = hacluster

CFLAGS = -I/opt/ha/include -I/opt/ha/include
-I/opt/ha/include/heartbeat -I/opt/ha/include -I/opt/ha/include
-ggdb -fgnu89-inline -fstack-protector-all -Wall -Waggregate-return
-Wbad-function-cast -Wcast-align -Wdeclaration-after-statement
-Wendif-labels -Wfloat-equal -Wformat=2 -Wformat-security
-Wformat-nonliteral -Wmissing-prototypes -Wmissing-declarations
-Wnested-externs -Wno-long-long -Wno-strict-aliasing
-Wunused-but-set-variable -Wpointer-arith -Wstrict-prototypes
-Wwrite-strings
Libraries = -lgnutls -lcorosync_common -lplumb -lpils
-lqb -lbz2 -lxslt -lxml2 -lc -luuid -lpam -lrt -ldl -lglib-2.0 -lltdl
-L/opt/ha/lib -lqb -ldl -lrt -lpthread
Stack Libraries = -L/opt/ha/lib -lqb -ldl -lrt -lpthread
-L/opt/ha/lib -lcpg -L/opt/ha/lib -lcfg -L/opt/ha/lib -lcmap
-L/opt/ha/lib -lquorum

----
Corosync config:

totem {
    version: 2
    crypto_cipher: none
    crypto_hash: none
    cluster_name: fiestaha
    interface {
        ringnumber: 0
        ttl: 1
        bindnetaddr: 10.9.93.0
        mcastaddr: 226.94.1.1
        mcastport: 5405
    }
}
logging {
    fileline: off
    to_stderr: yes
    to_logfile: no
    to_syslog: yes
    syslog_facility: local7
    debug: off
    timestamp: on
    logger_subsys {
        subsys: QUORUM
        debug: off
    }
}
quorum {
    provider: corosync_votequorum
    expected_votes: 2
    two_node: 1
    wait_for_all: 0
}

--
APSL
Bernardo Cabezas Serra
Systems Manager
Camí Vell de Bunyola 37, esc. A, local 7
07009 Polígono de Son Castelló, Palma
Mail: bcabezas@apsl.net
Skype: bernat.cabezas
Tel: 971439771

--
this is my life and I live it as long as God wills