[Pacemaker] IPMI stonith resource gets stuck

Jérôme Charaoui jcharaoui at cmaisonneuve.qc.ca
Wed Jan 28 18:53:17 UTC 2015


I'm testing a 2-node Corosync (1.4.6) and Pacemaker (1.1.10+git20130802) 
cluster on Debian 8.0 and having some problems with the stonith resources.

I've set up two external/ipmi resources on each node and wanted to test
how they would react to physically unplugging the IPMI device's network
cable.

On the DC, no problem: the resource monitor fails, the stop op succeeds
and, due to location constraints, the resource enters the stopped state
and stays there, as expected. After replugging the network cable and
cleaning up the resource, it is restored to its normal state.

On the slave node, the scenario is different: after the monitor op
fails, the stop op also fails for an unknown reason. The cluster then
keeps retrying the stop operation unsuccessfully until I put the node
into standby mode and bring it back out. Replugging the network cable on
the IPMI device has no effect.

At least, that's what I figure is happening from these logs:

DC: http://pastebin.com/raw.php?i=QpwG6nea
Slave: http://pastebin.com/raw.php?i=3nesX8yJ
Config: http://pastebin.com/raw.php?i=3FrJuwWz

Any help tracking down the issue would be much appreciated.


Jérôme Charaoui
Technicien informatique
Collège de Maisonneuve
