<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">Hi Ken,<div class=""><br class=""></div><div class="">I took a little time away from the problem and am getting back to it now. I found that the corosync logs are not only in journalctl but also in /var/log/syslog. I think the syslog entries are more interesting, though I haven’t done a thorough comparison. I’m pasting what syslog says in the hope that there’s more useful data here. The timestamps match perfectly here, too.</div><div class=""><br class=""></div><div class=""><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] waiting_trans_ack changed to 1 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] Token Timeout (3000 ms) retransmit timeout (294 ms) </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] token hold (225 ms) retransmits before loss (10 retrans) </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] join (50 ms) send_join (0 ms) consensus (3600 ms) merge (200 ms) </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] downcheck (1000 ms) fail to recv const (2500 msgs) </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] seqno unchanged const (30 rotations) Maximum network MTU 1401 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] window size per rotation (50 messages) maximum messages per rotation (17 messages) </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] missed count const (5 messages) </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] send threads (0 threads) </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] RRP token expired timeout (294 ms) </div><div class="">Dec 18 23:44:21 region-ctrl-2 
corosync[2946]: [TOTEM ] RRP token problem counter (2000 ms) </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] RRP threshold (10 problem count) </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] RRP multicast threshold (100 problem count) </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] RRP automatic recovery check timeout (1000 ms) </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] RRP mode set to none. </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] heartbeat_failures_allowed (0) </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] max_network_delay (50 ms) </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] HeartBeat is Disabled. To enable set heartbeat_failures_allowed > 0 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] Initializing transport (UDP/IP Multicast). </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] Initializing transmit/receive security (NSS) crypto: none hash: none </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] Receive multicast socket recv buffer size (320000 bytes). </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] Transmit multicast socket send buffer size (320000 bytes). </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] Local receive multicast loop socket recv buffer size (320000 bytes). </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] Local transmit multicast loop socket send buffer size (320000 bytes). </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] The network interface [192.168.99.225] is now up. </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] Created or loaded sequence id 74.192.168.99.225 for this ring. 
</div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [QB ] server name: cmap </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [QB ] server name: cfg </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [QB ] server name: cpg </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [QB ] server name: votequorum </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [QB ] server name: quorum </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] entering GATHER state from 15(interface change). </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] Creating commit token because I am the rep. </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] Saving state aru 0 high seq received 0 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] entering COMMIT state. </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] got commit token </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] entering RECOVERY state. </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] position [0] member 192.168.99.225: </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] previous ring seq 74 rep 192.168.99.225 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] aru 0 high delivered 0 received flag 1 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] Did not need to originate any messages in recovery. 
</div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] got commit token </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] Sending initial ORF token </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 0, aru 0 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] install seq 0 aru 0 high seq received 0 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 1, aru 0 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] install seq 0 aru 0 high seq received 0 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 2, aru 0</div><div class=""><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] install seq 0 aru 0 high seq received 0 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 3, aru 0 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] install seq 0 aru 0 high seq received 0 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] retrans flag count 4 token aru 0 install seq 0 aru 0 0 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] Resetting old ring state </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] recovery to regular 1-0 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] waiting_trans_ack changed to 1 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] entering OPERATIONAL state. </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] A new membership (192.168.99.225:120) was formed. 
Members joined: 1084777441 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] waiting_trans_ack changed to 0 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [QB ] IPC credentials authenticated (2946-2958-18) </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [QB ] connecting to client [2958] </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [QB ] shm size:1048589; real_size:1052672; rb->word_size:263168 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: message repeated 2 times: [ [QB ] shm size:1048589; real_size:1052672; rb->word_size:263168] </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [QB ] IPC credentials authenticated (2946-2958-19) </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [QB ] connecting to client [2958] </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [QB ] shm size:1048589; real_size:1052672; rb->word_size:263168 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: message repeated 2 times: [ [QB ] shm size:1048589; real_size:1052672; rb->word_size:263168] </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [QB ] IPC credentials authenticated (2946-2958-20) </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [QB ] connecting to client [2958] </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [QB ] shm size:1048589; real_size:1052672; rb->word_size:263168 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: message repeated 2 times: [ [QB ] shm size:1048589; real_size:1052672; rb->word_size:263168] </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [QB ] HUP conn (2946-2958-20) </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [QB ] qb_ipcs_disconnect(2946-2958-20) state:2 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [QB ] Free'ing ringbuffer: /dev/shm/qb-cfg-response-2946-2958-20-header 
</div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [QB ] Free'ing ringbuffer: /dev/shm/qb-cfg-event-2946-2958-20-header </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [QB ] Free'ing ringbuffer: /dev/shm/qb-cfg-request-2946-2958-20-header </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] entering GATHER state from 11(merge during join). </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] got commit token </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] Saving state aru 6 high seq received 6 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] entering COMMIT state. </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] got commit token </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] entering RECOVERY state. </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] TRANS [0] member 192.168.99.225: </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] position [0] member 192.168.99.223: </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] previous ring seq 78 rep 192.168.99.223 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] aru e high delivered e received flag 1 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] position [1] member 192.168.99.224: </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] previous ring seq 78 rep 192.168.99.223 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] aru e high delivered e received flag 1 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] position [2] member 192.168.99.225: </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] previous ring seq 78 rep 192.168.99.225 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] aru 6 high 
delivered 6 received flag 1 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] Did not need to originate any messages in recovery. </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 0, aru ffffffff </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] install seq 0 aru 0 high seq received 0 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 1, aru 0 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] install seq 0 aru 0 high seq received 0 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 2, aru 0 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] install seq 0 aru 0 high seq received 0 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 3, aru 0 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] install seq 0 aru 0 high seq received 0 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] retrans flag count 4 token aru 0 install seq 0 aru 0 0 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] Resetting old ring state </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] recovery to regular 1-0 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] waiting_trans_ack changed to 1 </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] entering OPERATIONAL state. </div><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] A new membership (192.168.99.223:124) was formed. 
Members joined: 1 3 </div></div><div class=""><div class="">Dec 18 23:44:21 region-ctrl-2 corosync[2946]: [TOTEM ] waiting_trans_ack changed to 0 </div><div class="">Dec 18 23:44:40 region-ctrl-2 corosync[2946]: [QB ] IPC credentials authenticated (2946-2976-20) </div><div class="">Dec 18 23:44:40 region-ctrl-2 corosync[2946]: [QB ] connecting to client [2976] </div><div class="">Dec 18 23:44:40 region-ctrl-2 corosync[2946]: [QB ] shm size:1048589; real_size:1052672; rb->word_size:263168 </div><div class="">Dec 18 23:44:40 region-ctrl-2 corosync[2946]: message repeated 2 times: [ [QB ] shm size:1048589; real_size:1052672; rb->word_size:263168] </div><div class="">Dec 18 23:44:40 region-ctrl-2 corosync[2946]: [QB ] IPC credentials authenticated (2946-2976-21) </div><div class="">Dec 18 23:44:40 region-ctrl-2 corosync[2946]: [QB ] connecting to client [2976] </div><div class="">Dec 18 23:44:40 region-ctrl-2 corosync[2946]: [QB ] shm size:1048589; real_size:1052672; rb->word_size:263168 </div><div class="">Dec 18 23:44:40 region-ctrl-2 corosync[2946]: message repeated 2 times: [ [QB ] shm size:1048589; real_size:1052672; rb->word_size:263168] </div><div class="">Dec 18 23:44:40 region-ctrl-2 corosync[2946]: [QB ] HUP conn (2946-2976-21) </div></div><div class=""><br class=""></div><div class="">Then those last few lines repeat over and over again…</div><div class="">I’m very curious if you spot a bug. 
The way this is manifesting now is:</div><div class=""><br class=""></div><div class=""><div class=""># crm configure show</div><div class="">node 1: region-ctrl-1</div><div class="">node 1084777441: region-ctrl-2</div><div class="">node 3: postgres-sb</div><div class="">property cib-bootstrap-options: \</div><div class=""> have-watchdog=false \</div><div class=""> dc-version=1.1.18-2b07d5c5a9 \</div><div class=""> cluster-infrastructure=corosync \</div><div class=""> cluster-name=debian \</div><div class=""> stonith-enabled=false</div></div><div class=""><br class=""></div><div class=""><div class=""># crm cluster status</div><div class="">Services:</div><div class="">corosync active/running/disabled</div><div class="">pacemaker deactivating/stop-sigterm/disabled</div><div class=""><br class=""></div><div class="">Printing ring status.</div><div class="">Local node ID 1084777441</div><div class="">RING ID 0</div><div class=""> id = 192.168.99.225</div><div class=""> status = ring 0 active with no faults</div></div><div class=""><br class=""></div><div class=""><div class=""># pcs cluster status</div><div class="">Cluster Status:</div><div class=""> Stack: corosync</div><div class=""> Current DC: region-ctrl-1 (version 1.1.18-2b07d5c5a9) - partition with quorum</div><div class=""> Last updated: Thu Dec 19 00:22:50 2019</div><div class=""> Last change: Wed Dec 18 23:44:40 2019 by hacluster via crmd on region-ctrl-2</div><div class=""> 3 nodes configured</div><div class=""> 0 resources configured</div><div class=""><br class=""></div><div class="">PCSD Status:</div><div class=""> postgres-sb: Online</div><div class=""> region-ctrl-1: Online</div></div><div class=""><br class=""></div><div class="">The corosync cluster doesn’t even have a nodeid: 2 in the nodelist so this thing is getting autodetected somehow:</div><div class=""><br class=""></div><div class=""><div class=""># cat /etc/corosync/corosync.conf </div><div class="">totem {</div><div class=""> version: 
2</div><div class=""> cluster_name: maas-cluster</div><div class=""> token: 3000</div><div class=""> token_retransmits_before_loss_const: 10</div><div class=""> clear_node_high_bit: yes</div><div class=""> crypto_cipher: none</div><div class=""> crypto_hash: none</div><div class=""><br class=""></div><div class=""> interface {</div><div class=""> ringnumber: 0</div><div class=""> bindnetaddr: 192.168.99.0</div><div class=""> mcastport: 5405</div><div class=""> ttl: 1</div><div class=""> }</div><div class="">}</div><div class=""><br class=""></div><div class="">logging {</div><div class=""> fileline: off</div><div class=""> to_stderr: no</div><div class=""> to_logfile: yes</div><div class=""> to_syslog: yes</div><div class=""> syslog_facility: daemon</div><div class=""> debug: on</div><div class=""> timestamp: on</div><div class=""><br class=""></div><div class=""> logger_subsys {</div><div class=""> subsys: QUORUM</div><div class=""> debug: on</div><div class=""> }</div><div class="">}</div><div class=""><br class=""></div><div class="">quorum {</div><div class=""> provider: corosync_votequorum</div><div class=""> expected_votes: 3</div><div class=""> two_node: 1</div><div class="">}</div><div class=""><br class=""></div><div class="">nodelist {</div><div class=""> node {</div><div class=""> ring0_addr: postgres-sb</div><div class=""> nodeid: 3</div><div class=""> }</div><div class=""><br class=""></div><div class=""> node {</div><div class=""> ring0_addr: region-ctrl-1</div><div class=""> nodeid: 1</div><div class=""> }</div><div class="">}</div></div><div class=""><br class=""></div><div class=""><br class=""></div><div class="">Moreover, I’ve tried deleting node 2 (doesn’t exist so this fails). I’ve tried deleting/clearing 1084777441. Delete fails. Clear works. When the node goes away and I try to recreate nodeid: 2 the errant node comes back instead as node 1084777441. 
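<div class=""><br class=""></div><div class="">For reference, 1084777441 is 0x40A863E1, which is 192.168.99.225 (0xC0A863E1) with the high bit cleared. That would be consistent with corosync generating the ID from the ring0 IPv4 address (with clear_node_high_bit: yes) because this host has no entry in the nodelist. A minimal sketch of the entry that appears to be missing, assuming region-ctrl-2 is meant to be nodeid 2 and that the same nodelist would be used on all three hosts:</div><div class=""><br class=""></div><div class=""><div class="">nodelist {</div><div class=""> # existing postgres-sb (nodeid: 3) and region-ctrl-1 (nodeid: 1) entries stay as they are</div><div class=""><br class=""></div><div class=""> # hypothetical entry, not present in the config above</div><div class=""> node {</div><div class=""> ring0_addr: region-ctrl-2</div><div class=""> nodeid: 2</div><div class=""> }</div><div class="">}</div></div><div class=""><br class=""></div><div class="">This is untested here; corosync would presumably need a restart on this host to pick up the new ID, and the stale 1084777441 entry may still need to be cleared afterwards.</div>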
</div><div class=""><br class=""></div><div class="">Finally, please review my other relevant settings:</div><div class=""><br class=""></div><div class=""><div class=""># cat /etc/hosts</div><div class="">127.0.0.1 localhost</div><div class="">#127.0.1.1 region-ctrl-2</div><div class=""><br class=""></div><div class="">192.168.99.223 region-ctrl-1</div><div class="">192.168.99.224 postgres-sb</div><div class="">192.168.99.225 region-ctrl-2</div><div class=""><br class=""></div><div class="">192.168.7.223 region-ctrl-1</div><div class="">192.168.7.224 postgres-sb</div><div class="">192.168.7.225 region-ctrl-2</div><div class=""><br class=""></div><div class=""># The following lines are desirable for IPv6 capable hosts</div><div class="">::1 ip6-localhost ip6-loopback</div><div class="">fe00::0 ip6-localnet</div><div class="">ff00::0 ip6-mcastprefix</div><div class="">ff02::1 ip6-allnodes</div><div class="">ff02::2 ip6-allrouters</div></div><div class=""><br class=""></div><div class=""><br class=""></div><div class=""><div class=""># hostname</div><div class="">region-ctrl-2</div></div><div class=""><br class=""></div><div class=""><br class=""></div><div class=""><div class=""># uname -n</div><div class="">region-ctrl-2</div></div><div class=""><br class=""></div><div class="">Is there some other setting I could be missing here that could be causing this problem?</div><div class=""><br class=""></div><div class="">- Jim</div><div class=""><br class=""></div><div><br class=""><blockquote type="cite" class=""><div class="">On Dec 18, 2019, at 13:24, Ken Gaillot <<a href="mailto:kgaillot@redhat.com" class="">kgaillot@redhat.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><span style="caret-color: rgb(0, 0, 0); font-family: Menlo-Regular; font-size: 11px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; 
-webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;" class="">On Wed, 2019-12-18 at 12:21 -0800, JC wrote:</span><br style="caret-color: rgb(0, 0, 0); font-family: Menlo-Regular; font-size: 11px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""><blockquote type="cite" style="font-family: Menlo-Regular; font-size: 11px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; text-decoration: none;" class="">Adding logs (minus time stamps)<br class=""><br class=""> info: crm_log_init: Changed active directory to<br class="">/var/lib/pacemaker/cores<br class=""> info: get_cluster_type: Detected an active 'corosync' cluster<br class=""> info: qb_ipcs_us_publish: server name: pacemakerd<br class=""> info: pcmk__ipc_is_authentic_process_active: Could not<br class="">connect to lrmd IPC: Connection refused<br class=""> info: pcmk__ipc_is_authentic_process_active: Could not<br class="">connect to cib_ro IPC: Connection refused<br class=""> info: pcmk__ipc_is_authentic_process_active: Could not<br class="">connect to crmd IPC: Connection refused<br class=""> info: pcmk__ipc_is_authentic_process_active: Could not<br class="">connect to attrd IPC: Connection refused<br class=""> info: pcmk__ipc_is_authentic_process_active: Could not<br class="">connect to pengine IPC: Connection refused<br class=""> info: pcmk__ipc_is_authentic_process_active: Could not<br class="">connect to stonith-ng IPC: Connection refused<br class=""> info: corosync_node_name: Unable to get node name for nodeid<br 
class="">1084777441<br class=""> notice: get_node_name: Could not obtain a node name for<br class="">corosync nodeid 1084777441<br class=""></blockquote><br style="caret-color: rgb(0, 0, 0); font-family: Menlo-Regular; font-size: 11px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""><span style="caret-color: rgb(0, 0, 0); font-family: Menlo-Regular; font-size: 11px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;" class="">This ID appears to be coming from corosync. You have only to_syslog</span><br style="caret-color: rgb(0, 0, 0); font-family: Menlo-Regular; font-size: 11px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""><span style="caret-color: rgb(0, 0, 0); font-family: Menlo-Regular; font-size: 11px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;" class="">turned on in corosync.conf, so look in the system log around this same</span><br style="caret-color: rgb(0, 0, 0); font-family: Menlo-Regular; font-size: 11px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: 
normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""><span style="caret-color: rgb(0, 0, 0); font-family: Menlo-Regular; font-size: 11px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;" class="">time to see what corosync is thinking. It does seem odd; I wonder if --</span><br style="caret-color: rgb(0, 0, 0); font-family: Menlo-Regular; font-size: 11px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""><span style="caret-color: rgb(0, 0, 0); font-family: Menlo-Regular; font-size: 11px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;" class="">purge is missing something.</span><br style="caret-color: rgb(0, 0, 0); font-family: Menlo-Regular; font-size: 11px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""><br style="caret-color: rgb(0, 0, 0); font-family: Menlo-Regular; font-size: 11px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""><span 
style="caret-color: rgb(0, 0, 0); font-family: Menlo-Regular; font-size: 11px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;" class="">BTW you don't need bindnetaddr to be different for each host; it's the</span><br style="caret-color: rgb(0, 0, 0); font-family: Menlo-Regular; font-size: 11px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""><span style="caret-color: rgb(0, 0, 0); font-family: Menlo-Regular; font-size: 11px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;" class="">network address (e.g. 
the .0 for a /24), not the host address.</span><br style="caret-color: rgb(0, 0, 0); font-family: Menlo-Regular; font-size: 11px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""><br style="caret-color: rgb(0, 0, 0); font-family: Menlo-Regular; font-size: 11px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""><blockquote type="cite" style="font-family: Menlo-Regular; font-size: 11px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""> info: crm_get_peer: Created entry ea4ec23e-e676-4798-9b8b-<br class="">00af39d3bb3d/0x5555f74984d0 for node (null)/1084777441 (1 total)<br class=""> info: crm_get_peer: Node 1084777441 has uuid 1084777441<br class=""> info: crm_update_peer_proc: cluster_connect_cpg: Node<br class="">(null)[1084777441] - corosync-cpg is now online<br class=""> notice: cluster_connect_quorum: Quorum acquired<br class=""> info: crm_get_peer: Created entry 882c0feb-d546-44b7-955f-<br class="">4c8a844a0db1/0x5555f7499fd0 for node postgres-sb/3 (2 total)<br class=""> info: crm_get_peer: Node 3 is now known as postgres-sb<br class=""> info: crm_get_peer: Node 3 has uuid 3<br class=""> info: crm_get_peer: Created entry 4e6a6b1e-d687-4527-bffc-<br class="">5d701ff60a66/0x5555f749a6f0 for node region-ctrl-2/2 (3 total)<br class=""> info: crm_get_peer: Node 2 is now known as region-ctrl-2<br 
class=""> info: crm_get_peer: Node 2 has uuid 2<br class=""> info: crm_get_peer: Created entry 5532a3cc-2577-4764-b9ee-<br class="">770d437ccec0/0x5555f749a0a0 for node region-ctrl-1/1 (4 total)<br class=""> info: crm_get_peer: Node 1 is now known as region-ctrl-1<br class=""> info: crm_get_peer: Node 1 has uuid 1<br class=""> info: corosync_node_name: Unable to get node name for nodeid<br class="">1084777441<br class=""> notice: get_node_name: Defaulting to uname -n for the local<br class="">corosync node name<br class="">warning: crm_find_peer: Node 1084777441 and 2 share the same<br class="">name: 'region-ctrl-2'<br class=""> info: crm_get_peer: Node 1084777441 is now known as region-ctrl-2<br class=""> info: pcmk_quorum_notification: Quorum retained |<br class="">membership=32 members=3<br class=""> notice: crm_update_peer_state_iter: Node region-ctrl-1 state is<br class="">now member | nodeid=1 previous=unknown<br class="">source=pcmk_quorum_notification<br class=""> notice: crm_update_peer_state_iter: Node postgres-sb state is now<br class="">member | nodeid=3 previous=unknown source=pcmk_quorum_notification<br class=""> notice: crm_update_peer_state_iter: Node region-ctrl-2 state is<br class="">now member | nodeid=1084777441 previous=unknown<br class="">source=pcmk_quorum_notification<br class=""> info: crm_reap_unseen_nodes: State of node region-ctrl-<br class="">2[2] is still unknown<br class=""> info: pcmk_cpg_membership: Node 1084777441 joined group<br class="">pacemakerd (counter=0.0, pid=32765, unchecked for rivals)<br class=""> info: pcmk_cpg_membership: Node 1 still member of group<br class="">pacemakerd (peer=region-ctrl-1:900, counter=0.0, at least once)<br class=""> info: crm_update_peer_proc: pcmk_cpg_membership: Node region-<br class="">ctrl-1[1] - corosync-cpg is now online<br class=""> info: pcmk_cpg_membership: Node 3 still member of group<br class="">pacemakerd (peer=postgres-sb:976, counter=0.1, at least once)<br class=""> info: 
crm_update_peer_proc: pcmk_cpg_membership: Node postgres-<br class="">sb[3] - corosync-cpg is now online<br class=""> info: pcmk_cpg_membership: Node 1084777441 still member of group<br class="">pacemakerd (peer=region-ctrl-2:3016, counter=0.2, at least once)<br class=""> pengine: info: crm_log_init: Changed active directory to<br class="">/var/lib/pacemaker/cores<br class=""> lrmd: info: crm_log_init: Changed active directory to<br class="">/var/lib/pacemaker/cores<br class=""> lrmd: info: qb_ipcs_us_publish: server name: lrmd<br class=""> pengine: info: qb_ipcs_us_publish: server name: pengine<br class=""> cib: info: crm_log_init: Changed active directory to<br class="">/var/lib/pacemaker/cores<br class=""> attrd: info: crm_log_init: Changed active directory to<br class="">/var/lib/pacemaker/cores<br class=""> attrd: info: get_cluster_type: Verifying cluster type:<br class="">'corosync'<br class=""> attrd: info: get_cluster_type: Assuming an active 'corosync'<br class="">cluster<br class=""> info: crm_log_init: Changed active directory to<br class="">/var/lib/pacemaker/cores<br class=""> attrd: notice: crm_cluster_connect: Connecting to cluster<br class="">infrastructure: corosync<br class=""> cib: info: get_cluster_type: Verifying cluster type:<br class="">'corosync'<br class=""> cib: info: get_cluster_type: Assuming an active 'corosync'<br class="">cluster<br class=""> info: get_cluster_type: Verifying cluster type: 'corosync'<br class=""> info: get_cluster_type: Assuming an active 'corosync' cluster<br class=""> notice: crm_cluster_connect: Connecting to cluster infrastructure:<br class="">corosync<br class=""> attrd: info: corosync_node_name: Unable to get node<br class="">name for nodeid 1084777441<br class=""> cib: info: validate_with_relaxng: Creating RNG parser<br class="">context<br class=""> crmd: info: crm_log_init: Changed active directory to<br class="">/var/lib/pacemaker/cores<br class=""> crmd: info: get_cluster_type: Verifying cluster type:<br 
class="">'corosync'<br class=""> crmd: info: get_cluster_type: Assuming an active 'corosync'<br class="">cluster<br class=""> crmd: info: do_log: Input I_STARTUP received in state<br class="">S_STARTING from crmd_init<br class=""> attrd: notice: get_node_name: Could not obtain a node name<br class="">for corosync nodeid 1084777441<br class=""> attrd: info: crm_get_peer: Created entry af5c62c9-21c5-<br class="">4428-9504-ea72a92de7eb/0x560870420e90 for node (null)/1084777441 (1<br class="">total)<br class=""> attrd: info: crm_get_peer: Node 1084777441 has uuid<br class="">1084777441<br class=""> attrd: info: crm_update_peer_proc: cluster_connect_cpg:<br class="">Node (null)[1084777441] - corosync-cpg is now online<br class=""> attrd: notice: crm_update_peer_state_iter: Node (null)<br class="">state is now member | nodeid=1084777441 previous=unknown<br class="">source=crm_update_peer_proc<br class=""> attrd: info: init_cs_connection_once: Connection to<br class="">'corosync': established<br class=""> info: corosync_node_name: Unable to get node name for nodeid<br class="">1084777441<br class=""> notice: get_node_name: Could not obtain a node name for<br class="">corosync nodeid 1084777441<br class=""> info: crm_get_peer: Created entry 5bcb51ae-0015-4652-b036-<br class="">b92cf4f1d990/0x55f583634700 for node (null)/1084777441 (1 total)<br class=""> info: crm_get_peer: Node 1084777441 has uuid 1084777441<br class=""> info: crm_update_peer_proc: cluster_connect_cpg: Node<br class="">(null)[1084777441] - corosync-cpg is now online<br class=""> notice: crm_update_peer_state_iter: Node (null) state is now<br class="">member | nodeid=1084777441 previous=unknown<br class="">source=crm_update_peer_proc<br class=""> attrd: info: corosync_node_name: Unable to get node<br class="">name for nodeid 1084777441<br class=""> attrd: notice: get_node_name: Defaulting to uname -n for<br class="">the local corosync node name<br class=""> attrd: info: crm_get_peer: Node 1084777441 is now 
known<br class="">as region-ctrl-2<br class=""> info: corosync_node_name: Unable to get node name for nodeid<br class="">1084777441<br class=""> notice: get_node_name: Defaulting to uname -n for the local<br class="">corosync node name<br class=""> info: init_cs_connection_once: Connection to 'corosync':<br class="">established<br class=""> info: corosync_node_name: Unable to get node name for nodeid<br class="">1084777441<br class=""> notice: get_node_name: Defaulting to uname -n for the local<br class="">corosync node name<br class=""> info: crm_get_peer: Node 1084777441 is now known as region-ctrl-2<br class=""> cib: notice: crm_cluster_connect: Connecting to cluster<br class="">infrastructure: corosync<br class=""> cib: info: corosync_node_name: Unable to get node<br class="">name for nodeid 1084777441<br class=""> cib: notice: get_node_name: Could not obtain a node name<br class="">for corosync nodeid 1084777441<br class=""> cib: info: crm_get_peer: Created entry a6ced2c1-9d51-<br class="">445d-9411-2fb19deab861/0x55848365a150 for node (null)/1084777441 (1<br class="">total)<br class=""> cib: info: crm_get_peer: Node 1084777441 has uuid<br class="">1084777441<br class=""> cib: info: crm_update_peer_proc: cluster_connect_cpg:<br class="">Node (null)[1084777441] - corosync-cpg is now online<br class=""> cib: notice: crm_update_peer_state_iter: Node (null)<br class="">state is now member | nodeid=1084777441 previous=unknown<br class="">source=crm_update_peer_proc<br class=""> cib: info: init_cs_connection_once: Connection to<br class="">'corosync': established<br class=""> cib: info: corosync_node_name: Unable to get node<br class="">name for nodeid 1084777441<br class=""> cib: notice: get_node_name: Defaulting to uname -n for<br class="">the local corosync node name<br class=""> cib: info: crm_get_peer: Node 1084777441 is now known<br class="">as region-ctrl-2<br class=""> cib: info: qb_ipcs_us_publish: server name: cib_ro<br class=""> cib: info: 
qb_ipcs_us_publish: server name: cib_rw<br class=""> cib: info: qb_ipcs_us_publish: server name: cib_shm<br class=""> cib: info: pcmk_cpg_membership: Node 1084777441<br class="">joined group cib (counter=0.0, pid=0, unchecked for rivals)<br class="">_______________________________________________<br class="">Manage your subscription:<br class=""><a href="https://lists.clusterlabs.org/mailman/listinfo/users" class="">https://lists.clusterlabs.org/mailman/listinfo/users</a><br class=""><br class="">ClusterLabs home:<span class="Apple-converted-space"> </span><a href="https://www.clusterlabs.org/" class="">https://www.clusterlabs.org/</a><br class=""><br class=""></blockquote><span style="caret-color: rgb(0, 0, 0); font-family: Menlo-Regular; font-size: 11px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;" class="">--<span class="Apple-converted-space"> </span></span><br style="caret-color: rgb(0, 0, 0); font-family: Menlo-Regular; font-size: 11px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""><span style="caret-color: rgb(0, 0, 0); font-family: Menlo-Regular; font-size: 11px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;" class="">Ken Gaillot <</span><a href="mailto:kgaillot@redhat.com" style="font-family: Menlo-Regular; font-size: 11px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px;" class="">kgaillot@redhat.com</a><span style="caret-color: rgb(0, 0, 0); font-family: Menlo-Regular; font-size: 11px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;" class="">></span></div></blockquote></div><br class=""></div></body></html>