<div dir="ltr">Oh, it seems I've found the solution! There were at least two mistakes in my corosync.conf (BTW, the logs did not report any errors, so my conclusions are based only on my own experiments).<div><br></div><div>1. nodelist.node MUST contain only IP addresses. No hostnames! They simply do not work in my setup: "crm status" shows no nodes, and the logs contain no warnings about it.</div><div>2. quorum {} MUST NOT be empty (in the sample config it IS empty). In my case, the following, together with (1), fixed the problem:</div><div><br></div><div>quorum {</div><div> provider: corosync_votequorum</div><div> two_node: 1</div><div>}</div><div><br></div><div>So, below is my final corosync.conf. Now "crm status" shows "Online: [ node1 node2 ]", the UDPu transport is used, and no virtual network exists at all (only public IP addresses are specified in corosync.conf).</div><div><br></div><div>========================</div><div><br></div><div><div># This seems to be a really WORKING configuration.</div><div># Ubuntu 14.04, corosync 2.3.3, pacemaker 1.1.10</div><div>totem {</div><div> version: 2</div><div> cluster_name: cluster</div><div> crypto_cipher: none</div><div> crypto_hash: none</div><div> clear_node_high_bit: yes</div><div> interface {</div><div> ringnumber: 0</div><div> bindnetaddr: <public-ip-address-of-the-current-machine></div><div> mcastport: 5405</div><div> ttl: 1</div><div> }</div><div> transport: udpu</div><div> heartbeat_failures_allowed: 3</div><div>}</div><div>logging {</div><div> fileline: off</div><div> to_logfile: no</div><div> to_syslog: yes</div><div> debug: on</div><div> timestamp: off</div><div> logger_subsys {</div><div> subsys: QUORUM</div><div> debug: off</div><div> }</div><div>}</div><div>nodelist {</div><div> node {</div><div> ring0_addr: <public-ip-address-of-the-first-machine></div><div> }</div><div> node {</div><div> ring0_addr: <public-ip-address-of-the-second-machine></div><div> }</div><div>}</div><div>quorum {</div><div> provider: corosync_votequorum</div><div> 
two_node: 1</div><div>}</div></div><div><br></div><div>=========================</div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Dec 30, 2014 at 12:34 PM, Dmitry Koterov <span dir="ltr"><<a href="mailto:dmitry.koterov@gmail.com" target="_blank">dmitry.koterov@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><span><span class="">On Mon, Dec 29, 2014 at 1:50 PM, Dejan Muhamedagic <<a href="mailto:dejanmm@fastmail.fm" target="_blank">dejanmm@fastmail.fm</a>> wrote:<br></span><span class="">>> On Mon, Dec 29, 2014 at 06:11:49AM +0300, Dmitry Koterov wrote:<br>
>> Hello.<br>
>><br>
>> I have a geographically distributed cluster, all machines have public IP<br>
>> addresses. No virtual IP subnet exists, so no multicast is available.<br>
>><br>
>> I thought that the UDPu transport can work in such an environment, can't it?<br>
>><br>
>> To test everything in advance, I've set up corosync+pacemaker on Ubuntu<br>
>> 14.04 with the following corosync.conf:<br>
>><br>
>> totem {<br>
>> transport: udpu<br>
>> interface {<br>
>> ringnumber: 0<br>
>> bindnetaddr: ip-address-of-the-current-machine<br>
>> mcastport: 5405<br>
>> }<br></span></span></blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">>> ... </blockquote><span class=""><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">>> } </blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><span style="color:rgb(80,0,80);font-size:13px">>> nodelist {<br></span><span style="color:rgb(80,0,80);font-size:13px">></span><span style="color:rgb(80,0,80);font-size:13px">></span><span style="color:rgb(80,0,80);font-size:13px"> node {<br></span><span style="color:rgb(80,0,80);font-size:13px">></span><span style="color:rgb(80,0,80);font-size:13px">></span><span style="color:rgb(80,0,80);font-size:13px"> ring0_addr: node1<br></span><span style="color:rgb(80,0,80);font-size:13px">></span><span style="color:rgb(80,0,80);font-size:13px">></span><span style="color:rgb(80,0,80);font-size:13px"> }<br></span><span style="color:rgb(80,0,80);font-size:13px">></span><span style="color:rgb(80,0,80);font-size:13px">></span><span style="color:rgb(80,0,80);font-size:13px"> node {<br></span><span style="color:rgb(80,0,80);font-size:13px">></span><span style="color:rgb(80,0,80);font-size:13px">></span><span style="color:rgb(80,0,80);font-size:13px"> ring0_addr: node2<br></span><span style="color:rgb(80,0,80);font-size:13px">></span><span style="color:rgb(80,0,80);font-size:13px">></span><span style="color:rgb(80,0,80);font-size:13px"> }<br></span><span style="color:rgb(80,0,80);font-size:13px">></span><span style="color:rgb(80,0,80);font-size:13px">></span><span style="font-size:13px;color:rgb(80,0,80)"> }</span></blockquote></span><div style="font-size:13px"><blockquote style="margin:0px 0px 0px 
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex" class="gmail_quote"><span style="color:rgb(80,0,80)">></span><span style="color:rgb(80,0,80)">></span><span style="color:rgb(80,0,80)"> </span>root@node1:/etc/corosync# crm status | grep node<span class=""><br><span style="color:rgb(80,0,80)">></span><span style="color:rgb(80,0,80)">></span><span style="color:rgb(80,0,80)"> </span>OFFLINE: [ node1 node2 ]<br><span style="color:rgb(80,0,80)">></span><span style="color:rgb(80,0,80)">></span><span style="color:rgb(80,0,80)"> </span><span style="font-size:13px">and "crm node online" (like all other attempts to make crm do anything) times out with "communication error".</span> </span></blockquote></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><br></blockquote></div><div class="gmail_quote"><span class=""><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">Dmitry, which version do you have?</blockquote><div> </div></span><div>root@node1:~# corosync -v</div><div>Corosync Cluster Engine, version '2.3.3'</div><div>Copyright (c) 2006-2009 Red Hat, Inc. </div><div><br></div>- so nodelist is definitely enough, and totem->interface->member is deprecated.</div><div class="gmail_quote"><br><div>So, am I at least right that the configuration with UDPu SHOULD work with geo-distributed nodes with only public IP addresses and no private/virtual subnetwork? 
If yes, how could I debug it?</div><div><br></div><div>Here's some more info (x.x.x.x is a public IP associated with node1):</div><div><br></div><div><div>root@node1:~# netstat -nap|grep coro</div><div>udp 0 0 x.x.x.x:41083 0.0.0.0:* 7037/corosync</div><div>udp 0 0 x.x.x.x:49299 0.0.0.0:* 7037/corosync</div><div>udp 0 0 x.x.x.x:5405 0.0.0.0:* 7037/corosync</div><div>unix 2 [ ACC ] STREAM LISTENING 52458 7037/corosync @quorum</div><div>unix 2 [ ACC ] STREAM LISTENING 52455 7037/corosync @cmap</div><div>unix 2 [ ACC ] STREAM LISTENING 52456 7037/corosync @cfg</div><div>unix 2 [ ACC ] STREAM LISTENING 52457 7037/corosync @cpg</div><div>unix 3 [ ] STREAM CONNECTED 52512 7037/corosync @cpg</div><div>unix 3 [ ] STREAM CONNECTED 52625 7037/corosync @cpg</div><div>unix 3 [ ] STREAM CONNECTED 52504 7037/corosync @cfg</div><div>unix 3 [ ] STREAM CONNECTED 52520 7037/corosync @quorum</div><div>unix 2 [ ] DGRAM 52420 7037/corosync</div><div>unix 3 [ ] STREAM CONNECTED 52643 7037/corosync @quorum</div><div>unix 3 [ ] STREAM CONNECTED 52568 7037/corosync @cpg</div><div>unix 3 [ ] STREAM CONNECTED 52588 7037/corosync @cpg</div><div>unix 3 [ ] STREAM CONNECTED 52554 7037/corosync @cpg</div></div><div><br></div><div class="gmail_quote">root@node1:~# crm status</div><div class="gmail_quote">Last updated: Tue Dec 30 04:33:40 2014</div><div class="gmail_quote">Last change: Sun Dec 28 21:40:41 2014 via crmd on node2</div><div class="gmail_quote">Stack: corosync</div><div class="gmail_quote">Current DC: NONE</div><div class="gmail_quote">2 Nodes configured</div><div class="gmail_quote">0 Resources configured</div><div class="gmail_quote">OFFLINE: [ node1 node2 ]<br></div></div><div class="gmail_quote"><br><div class="gmail_quote">root@node1:~# crm node online</div><div class="gmail_quote">Error setting standby=off (section=nodes, set=nodes-1084751873): Communication error on send</div><div class="gmail_quote">Error performing operation: Communication error on send</div></div></div></div>
</blockquote></div><br></div>
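<div>P.S. Since corosync logged nothing about either mistake, a small pre-flight check of corosync.conf may help. The sketch below is only an illustration of the two pitfalls described at the top of this message: check_corosync_conf is a hypothetical helper name, not part of corosync or pacemaker, it uses plain regular expressions rather than a real config parser, and it encodes only what I observed in my own experiments (corosync 2.3.3 on Ubuntu 14.04), not documented behaviour.</div>

```python
import ipaddress
import re


def check_corosync_conf(text):
    """Return warnings for the two pitfalls observed in this thread.

    Hypothetical helper: a regex-based sanity check, not a full parser.
    """
    warnings = []

    # Pitfall 1: ring0_addr values that are not literal IP addresses.
    for addr in re.findall(r"^\s*ring0_addr:\s*(\S+)", text, flags=re.MULTILINE):
        try:
            ipaddress.ip_address(addr)
        except ValueError:
            warnings.append(f"ring0_addr '{addr}' is not a literal IP address")

    # Pitfall 2: a missing quorum {} section, or one without a provider.
    match = re.search(r"^\s*quorum\s*\{([^{}]*)\}", text, flags=re.MULTILINE)
    if match is None:
        warnings.append("no quorum {} section found")
    elif "provider:" not in match.group(1):
        warnings.append("quorum {} section has no provider")

    return warnings
```

<div>To use it, read the file (for example /etc/corosync/corosync.conf) into a string and pass that string to check_corosync_conf; an empty list means neither of the two pitfalls was detected. The unsubstituted placeholders from the config above would be reported as non-IP addresses.</div>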