[Pacemaker] DRBD monitor time out in high I/O situations

Sebastian Kaps sebastian.kaps at imail.de
Sat Jul 16 10:55:33 EDT 2011


 On 12.07.2011, at 12:05, Lars Marowsky-Bree wrote:

>> [unexplained, sporadic monitor timeouts]
> drbd's monitor operation is not that heavy-weight; I can't 
> immediately
> see why the IO load on the file system it hosts should affect it so
> badly.

 Contrary to my first assumption, the problem does not seem to be
 caused primarily by high I/O.
 We've witnessed some STONITH shoot-outs in the last few days while
 the active node was mostly idle, and we've had situations with high
 I/O that did not show any unexpected behavior.

 I noticed that after rebooting a machine, the status of the second
 Corosync ring is always displayed as "FAULTY" by corosync-cfgtool,
 whereas the first ring is always reported as working. Since the first
 ring is a direct connection between both nodes and the second one runs
 on a bonded interface utilizing two redundant cables and different
 switches, I suspected that this might be caused by the bonding driver
 being configured later in the boot process. I could always issue a
 "corosync-cfgtool -r" manually after booting, and both rings' state
 switched to "no faults" and stayed that way until the next reboot.
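
 For reference, the corosync-cfgtool invocations involved (run on the
 node in question) are:

   corosync-cfgtool -s   # print the status of the configured rings
   corosync-cfgtool -r   # re-enable redundant rings marked faulty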

 Further investigation showed that we had been using identical port
 numbers for both rings (different IP addresses, though) and that this
 might not be the best idea (I've learned that the multicast port
 numbers are supposed to differ by at least 2), and I have corrected
 this now.
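
 For reference, a totem section along these lines should satisfy that
 rule (all addresses and ports below are placeholder examples, not our
 actual configuration; corosync also uses mcastport - 1 internally,
 hence the gap of at least 2):

   totem {
     version: 2
     rrp_mode: passive
     interface {
       # ring 0: direct link between the nodes
       ringnumber: 0
       bindnetaddr: 192.168.1.0
       mcastaddr: 226.94.1.1
       mcastport: 5405
     }
     interface {
       # ring 1: bonded interface; port differs by 2 from ring 0
       ringnumber: 1
       bindnetaddr: 10.0.0.0
       mcastaddr: 226.94.1.2
       mcastport: 5407
     }
   }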

 Could this have caused our problem?
 Is there a way to change the port number for the second ring in a
 running cluster, or does it require a complete restart of corosync on
 all (2) nodes?
 If the second ring is marked faulty (which is the state I currently
 left it in), will that prevent corosync from using that ring, or will
 it eventually try to use that ring again?
 It's probably safer to run everything over a single, working, direct
 link for a while than over a faulty redundant ring-pair.

 Other changes we've tried so far that didn't solve the issue:
 - increasing the number of threads used for message en-/decryption
   from 2 to 16
 - disabling time stamps for cluster messages
 - increasing various monitor timeouts/intervals

 Thanks again for helping!

 BTW: does anyone know if there's a pre-configured Linux (whatever
 distro) virtual machine image for Pacemaker that could be used to
 quickly set up a virtual cluster test environment with two or three
 nodes?
