[ClusterLabs] Strange lost quorum with qdevice

Jan Friesse jfriesse at redhat.com
Tue Aug 13 08:55:51 EDT 2019


Олег Самойлов napsal(a):
> 
> 
>> 12 авг. 2019 г., в 8:46, Jan Friesse <jfriesse at redhat.com> написал(а):
>>
>> Let me try to bring some light in there:
>>
>> - dpd_interval is qnetd variable how often qnetd walks thru the list of all clients (qdevices) and checks timestamp of last sent message. If diff between current timestamp and last sent message timestamp is larger than 2 * timeout sent by client then client is considered as death.
>>
>> - interval - affects how often qdevice sends heartbeat to corosync (this is half of the interval) about its liveness and also how often it sends heartbeat to qnetd (0.8 * interval). On corosync side this is used as a timeout after which qdevice daemon is considered death and its votes are no longer valid.
>>
>> - sync_timeout - Not used by qdevice/qnetd. Used by corosync during sync phase. If corosync doesn't get reply by qdevice till this timeout it considers qdevice daemon death and continues sync process.
>>
>> It was probably not evident from my reply, but what I meant was to change just dpd_interval. Could you please recheck with dpd_interval=1, timeout=20, sync_timeout=60?
> 
> Did you by mistype call 'timeout' parameter of qdevice as 'interval'?

Yep

> 
> I can't understand how is planned your configuration to work. dpd_interval=1, timeout=20, sync_timeout=60. Qnetd will check every 1 second and timeout will be detected on 2 second away from the last message from Qdevice. But Qdevice will send message only every 0.8*20=16 seconds. So, by your description, quorum must be lost every time.

Nope. With dpd_interval=1 Qnetd would check every second that client 
sent (any) message no longer than 32 (0.8*20*2) seconds ago.


> 
> The reality also is strange. Qnetd detect lost client not after 2 seconds, by your description, but 

I've never wrote anything about 2 seconds.

after 33 seconds and this is only slightly less than 40s when 
dpd_interval=20. With 33 seconds timeout real reaction time is 54s which 
enough for timeout 60s, in this example. But looked like 6 (+-, may be 
slightly random) seconds gap is enough, there is no problems with lost 
quorum.
> 
> Aug 13 02:21:38 witness corosync-qnetd: Aug 13 02:21:38 debug   Client ::ffff:192.168.89.12:36150 (cluster krogan1, node_id 2) sent membership node list.
> Aug 13 02:21:38 witness corosync-qnetd: Aug 13 02:21:38 debug     msg seq num = 11
> Aug 13 02:21:38 witness corosync-qnetd: Aug 13 02:21:38 debug     ring id = (2.30)
> Aug 13 02:21:39 witness corosync-qnetd: Aug 13 02:21:38 debug     heuristics = Undefined
> Aug 13 02:21:39 witness corosync-qnetd: Aug 13 02:21:38 debug     node list:
> Aug 13 02:21:39 witness corosync-qnetd: Aug 13 02:21:38 debug       node_id = 2, data_center_id = 0, node_state = not set
> Aug 13 02:21:39 witness corosync-qnetd: Aug 13 02:21:38 debug   ffsplit: Membership for cluster krogan1 is not yet stable
> Aug 13 02:21:39 witness corosync-qnetd: Aug 13 02:21:38 debug   Algorithm result vote is Wait for reply
> Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 warning Client ::ffff:192.168.89.11:48924 doesn't sent any message during 33000ms. Disconnecting
> Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug   Client ::ffff:192.168.89.11:48924 (init_received 1, cluster krogan1, node_id 1) disconnect
> Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug   ffsplit: Membership for cluster krogan1 is now stable
> Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug   ffsplit: Quorate partition selected
> Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug     node list:
> Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug       node_id = 2, data_center_id = 0, node_state = not set
> Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug   ffsplit: No client gets NACK
> Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug   Sending vote info to client ::ffff:192.168.89.12:36150 (cluster krogan1, node_id 2)
> Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug     msg seq num = 6
> Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug     vote = ACK
> Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug   Client ::ffff:192.168.89.12:36150 (cluster krogan1, node_id 2) replied back to vote info message
> Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug     msg seq num = 6
> Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug   ffsplit: All ACK votes sent for cluster krogan1
> Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug   Client ::ffff:192.168.89.12:36150 (cluster krogan1, node_id 2) sent quorum node list.
> Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug     msg seq num = 12
> Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug     quorate = 1
> Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug     node list:
> Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug       node_id = 1, data_center_id = 0, node_state = dead
> Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug       node_id = 2, data_center_id = 0, node_state = member
> Aug 13 02:22:32 witness corosync-qnetd: Aug 13 02:22:32 debug   Algorithm result vote is No change
> Aug 13 02:22:38 witness ntpd[556]: 0.0.0.0 0612 02 freq_set kernel 11.647 PPM
> Aug 13 02:22:38 witness ntpd[556]: 0.0.0.0 0615 05 clock_sync
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug   Client ::ffff:192.168.89.12:36150 (cluster krogan1, node_id 2) sent membership node list.
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug     msg seq num = 13
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug     ring id = (1.38)
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug     heuristics = Undefined
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug     node list:
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug       node_id = 1, data_center_id = 0, node_state = not set
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug       node_id = 2, data_center_id = 0, node_state = not set
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug   ffsplit: Membership for cluster krogan1 is now stable
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug   ffsplit: Quorate partition selected
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug     node list:
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug       node_id = 1, data_center_id = 0, node_state = not set
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug       node_id = 2, data_center_id = 0, node_state = not set
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug   ffsplit: No client gets NACK
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug   Sending vote info to client ::ffff:192.168.89.12:36150 (cluster krogan1, node_id 2)
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug     msg seq num = 7
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug     vote = ACK
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug   Algorithm result vote is No change
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug   Client ::ffff:192.168.89.12:36150 (cluster krogan1, node_id 2) replied back to vote info message
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug     msg seq num = 7
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug   ffsplit: All ACK votes sent for cluster krogan1
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug   Client ::ffff:192.168.89.12:36150 (cluster krogan1, node_id 2) sent quorum node list.
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug     msg seq num = 14
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug     quorate = 1
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug     node list:
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug       node_id = 1, data_center_id = 0, node_state = member
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug       node_id = 2, data_center_id = 0, node_state = member
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug   Algorithm result vote is No change
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug   New client connected
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug     cluster name = krogan1
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug     tls started = 1
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug     tls peer certificate verified = 1
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug     node_id = 1
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug     pointer = 0x5602a3e6f5f0
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug     addr_str = ::ffff:192.168.89.11:36144
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug     ring id = (1.38)
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug     cluster dump:
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug       client = ::ffff:192.168.89.12:36150, node_id = 2
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug       client = ::ffff:192.168.89.11:36144, node_id = 1
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug   Client ::ffff:192.168.89.11:36144 (cluster krogan1, node_id 1) sent initial node list.
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug     msg seq num = 4
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug     node list:
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug       node_id = 1, data_center_id = 0, node_state = not set
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug       node_id = 2, data_center_id = 0, node_state = not set
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug   Algorithm result vote is Ask later
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug   Client ::ffff:192.168.89.11:36144 (cluster krogan1, node_id 1) sent membership node list.
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug     msg seq num = 5
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug     ring id = (1.38)
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug     heuristics = Undefined
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug     node list:
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug       node_id = 1, data_center_id = 0, node_state = not set
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug       node_id = 2, data_center_id = 0, node_state = not set
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug   ffsplit: Membership for cluster krogan1 is now stable
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug   ffsplit: Quorate partition selected
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug     node list:
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug       node_id = 1, data_center_id = 0, node_state = not set
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug       node_id = 2, data_center_id = 0, node_state = not set
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug   ffsplit: No client gets NACK
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug   Sending vote info to client ::ffff:192.168.89.12:36150 (cluster krogan1, node_id 2)
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug     msg seq num = 8
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug     vote = ACK
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug   Sending vote info to client ::ffff:192.168.89.11:36144 (cluster krogan1, node_id 1)
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug     msg seq num = 1
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug     vote = ACK
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug   Algorithm result vote is No change
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug   Client ::ffff:192.168.89.11:36144 (cluster krogan1, node_id 1) sent quorum node list.
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug     msg seq num = 6
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug     quorate = 1
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug     node list:
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug       node_id = 1, data_center_id = 0, node_state = member
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug       node_id = 2, data_center_id = 0, node_state = member
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug   Algorithm result vote is No change
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug   Client ::ffff:192.168.89.12:36150 (cluster krogan1, node_id 2) replied back to vote info message
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug     msg seq num = 8
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug   Client ::ffff:192.168.89.11:36144 (cluster krogan1, node_id 1) replied back to vote info message
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug     msg seq num = 1
> Aug 13 02:23:18 witness corosync-qnetd: Aug 13 02:23:18 debug   ffsplit: All ACK votes sent for cluster krogan1
> 
> I have checked with dpd_interval=1, timeout=10 (default), sync_timeout=30 (default).
> 
> The lost client timeout now 17 seconds, reaction time is 21 second. Also work without problem.

Good it works for you now.

> 
> Well, I think, if don't take into accound strange dependency between dpd_interval and reaction 

Good. dpd_interval (as a variable of qnetd) must be reworked (and maybe 
even removed) anyway. It should be really dynamic based on timeout of 
qdevice clients.


> time, IMHO, the increasing sync_timeout will be a straight approach to fix the problem. The 

There is going to be slightly different solution (set this timeouts 
based on corosync token timeout) which I'm working on, but it's kind of 
huge amount of work and not super high prio (workaround exists), so no 
ETA yet.

Regards,
   Honza

sync_timeout must be:
> 10 (dpd_interval, how often qnetd checks)
> +
> 20 (2*dpd_interval, this is timeout for a client)
> +
> 10 (the lag time, I suppose it equal to dpd_interval)
> 
> I checked dpd_interval=10 (default), timeout=10 (default), sync_timeout=40. I got 20 seconds timeout and I saw reaction time between 17-34 seconds (time between "Algorithm result vote is Wait for reply" and "ffsplit: All ACK votes sent for cluster").
> 



More information about the Users mailing list