[ClusterLabs] wireshark cannot recognize corosync packets
Jan Friesse
jfriesse at redhat.com
Fri Mar 17 15:30:01 CET 2017
> I have checked that all the config files are the same, except for
> bindnetaddr, so I'm sending only the logs.
I'm not sure the config files match the log files. The config file
contains nodes 200.201.162.(52|53|54), but the log files contain the IPs
200.201.162.(52|53|55).
Can you confirm that a node with IP 200.201.162.54 exists, and that it
shouldn't be 200.201.162.55 (or that 200.201.162.55 shouldn't have the IP
200.201.162.54)?
Honza
>
> On 2017-03-16 at 15:54, "Jan Friesse" <jfriesse at redhat.com> wrote:
>
>> corosync.conf and debug logs are in attachment.
>
> Thanks for them. They look really interesting. As can be seen from
>
> Mar 14 11:37:28 [57827] node-132.acloud.vt corosync debug [TOTEM ]
> timer_function_orf_token_timeout The token was lost in the
> OPERATIONAL state.
>
> corosync correctly detected the token loss. Also,
>
> Mar 14 11:44:41 [57827] node-132.acloud.vt corosync debug [TOTEM ]
> memb_state_gather_enter entering GATHER state from 11(merge during join).
>
> shows that it correctly detected the merge. After that, things get strange:
> Mar 14 11:44:54 [57827] node-132.acloud.vt corosync debug [TOTEM ]
> memb_state_gather_enter entering GATHER state from 0(consensus timeout).
> Mar 14 11:45:06 [57827] node-132.acloud.vt corosync debug [TOTEM ]
> memb_state_gather_enter entering GATHER state from 0(consensus timeout).
> ...
> Mar 14 12:55:47 [154709] node-132.acloud.vt corosync debug [TOTEM ]
> memb_state_gather_enter entering GATHER state from 0(consensus timeout)
>
> So even after the two other nodes merged, something still prevents
> corosync from reaching consensus.
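
To see how often this happens in a full log, the GATHER entries can be tallied by reason. This is a rough sketch that assumes each log entry sits on a single line in the real file (the wrapping in this mail is from the mail client) and that the message text has the shape of the lines quoted in this thread. The function name is hypothetical.

```python
import re
from collections import Counter

# Captures the numeric code and the reason, e.g. "0(consensus timeout)".
GATHER = re.compile(r"entering GATHER state from (\d+)\(([^)]*)\)")


def gather_reasons(lines):
    """Count how many times GATHER state was entered, grouped by reason."""
    counts = Counter()
    for line in lines:
        match = GATHER.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts


# The log lines quoted in this thread, joined back into single lines.
SAMPLE = [
    "Mar 14 11:37:28 [57827] node-132.acloud.vt corosync debug [TOTEM ] "
    "timer_function_orf_token_timeout The token was lost in the OPERATIONAL state.",
    "Mar 14 11:44:41 [57827] node-132.acloud.vt corosync debug [TOTEM ] "
    "memb_state_gather_enter entering GATHER state from 11(merge during join).",
    "Mar 14 11:44:54 [57827] node-132.acloud.vt corosync debug [TOTEM ] "
    "memb_state_gather_enter entering GATHER state from 0(consensus timeout).",
    "Mar 14 11:45:06 [57827] node-132.acloud.vt corosync debug [TOTEM ] "
    "memb_state_gather_enter entering GATHER state from 0(consensus timeout).",
    "Mar 14 12:55:47 [154709] node-132.acloud.vt corosync debug [TOTEM ] "
    "memb_state_gather_enter entering GATHER state from 0(consensus timeout)",
]

for reason, count in gather_reasons(SAMPLE).most_common():
    print(count, reason)
```

On the five quoted lines this prints 3 for "consensus timeout" and 1 for "merge during join". The real log has more entries where the quote shows "...", so the actual counts would be higher.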
>
> Would it be possible to attach the logs and configs from the other
> nodes as well?
>
> For now, my guess is that the reason is one of:
> - ifdown on one of the other nodes, which broke the whole membership
> - a different node list in the config on different nodes
> - a "forgotten" node whose node list contains one of the 200.201.162.x nodes
>
> Regards,
> Honza
>>
>> And two messages from kernel:
>>
>> 2017-03-14 11:37:20.097233 - info e1000: eth0 NIC Link is Down
>>
>> 2017-03-14 11:44:41.032121 - info e1000: eth0 NIC Link is Up 1000 Mbps
>> Full Duplex, Flow Control: RX
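
Placing these kernel timestamps next to the corosync log timestamps quoted in this thread gives a rough timeline. The sketch below assumes both sources use the same clock and timezone and that the corosync lines are from 2017 (they carry no year). Microseconds are dropped, and the variable names are only illustrative.

```python
from datetime import datetime


def parse(ts):
    """Parse a 'YYYY-MM-DD HH:MM:SS' timestamp."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")


link_down = parse("2017-03-14 11:37:20")     # kernel: NIC Link is Down
token_lost = parse("2017-03-14 11:37:28")    # corosync: token lost
link_up = parse("2017-03-14 11:44:41")       # kernel: NIC Link is Up
last_timeout = parse("2017-03-14 12:55:47")  # last quoted consensus timeout

print("link down -> token lost:  ", token_lost - link_down)   # 0:00:08
print("link down -> link up:     ", link_up - link_down)      # 0:07:21
print("link up -> last timeout:  ", last_timeout - link_up)   # 1:11:06
```

So the token loss was logged about 8 seconds after the link went down, and consensus timeouts were still being logged more than an hour after the link came back.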
>>
>>
>> Thanks.
>>
>>
>> On 2017/3/15 16:29, Jan Friesse wrote:
>>>> Yesterday I found that corosync took almost one hour to form a
>>>> cluster (after a failed node came back online).
>>>
>>> This certainly shouldn't happen (at least with the default timeout
>>> settings).
>>>
>>>>
>>>> So I captured some corosync packets, and opened the pcap file in
>>>> wireshark.
>>>>
>>>> But Wireshark only displayed raw UDP, no totem.
>>>>
>>>> Wireshark version is 2.2.5. I'm sure it supports corosync totem.
>>>>
>>>> corosync is 2.4.0.
>>>
>>> Wireshark has a corosync dissector, but only for version 1.x. Version
>>> 2.x is not supported yet.
>>>
>>>>
>>>> And if corosync takes too long to form a cluster, how can I diagnose it?
>>>>
>>>> I read the logs, but could not figure it out.
>>>
>>> Logs, especially when debug is enabled, usually have enough info. Can
>>> you paste your config + logs?
>>>
>>> Regards,
>>> Honza
>>>
>>>>
>>>> Thanks.
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Users mailing list: Users at clusterlabs.org
>>>> http://lists.clusterlabs.org/mailman/listinfo/users
>>>>
>>>> Project Home: http://www.clusterlabs.org
>>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>> Bugs: http://bugs.clusterlabs.org