[Pacemaker] Filesystem RA would cause "stop timeout" if it mounts shared storage

Junko IKEDA tsukishima.ha at gmail.com
Fri Apr 20 01:23:42 EDT 2012


Hi,

I ran into the following problem:

1) Mount a shared storage device using the Filesystem RA.
2) Disconnect the Fibre Channel link, so the RA can no longer access the shared storage.
3) The RA detects the monitor failure and calls Filesystem_stop().
4) Filesystem_stop() times out.

The current Filesystem RA checks whether $STATUSFILE exists during its
stop operation:
https://github.com/ClusterLabs/resource-agents/commit/8b58d11ee68b4b75fc913c00348ee574ea60b24e#L0L649

It seems that this "if" statement blocks because the RA cannot access
the storage, so the stop operation runs into its timeout.

 if [ -f "$STATUSFILE" ]; then

Would the original "if" statement cause any problems?

 if [ -n "$OCF_RESKEY_statusfile_prefix" ]; then
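
To make the difference between the two checks concrete, here is a minimal
sketch, not the actual RA code. The helper name statusfile_present is
hypothetical. It assumes that $STATUSFILE lives on the mounted filesystem,
so a file test on it can block when the storage is unreachable, while a
plain string test on $OCF_RESKEY_statusfile_prefix never touches the disk.

```shell
#!/bin/sh
# Hypothetical helper for illustration only; not part of the Filesystem RA.
# Succeeds only when a status file prefix is configured AND the file exists.
# The string test comes first, so with an empty prefix the shell
# short-circuits and never runs the file test on $STATUSFILE.
statusfile_present() {
    [ -n "$OCF_RESKEY_statusfile_prefix" ] && [ -f "$STATUSFILE" ]
}
```

The stop path could then guard whatever it does with the status file using
"if statusfile_present; then ... fi". Note that this only avoids the file
test when no prefix is configured. If a prefix is set, the -f test still
runs and can still block on unreachable storage.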


Thanks,
Junko IKEDA

NTT DATA INTELLILINK CORPORATION



