[Pacemaker] crm resource status and HAWK display differ after manually mounting filesystem resource

Tim Serong tserong at suse.com
Sun Aug 28 23:24:12 EDT 2011


On 28/08/11 21:43, Sebastian Kaps wrote:
> Hi,
>
> on our two-node cluster (SLES11-SP1+HAE; corosync 1.3.1, pacemaker 1.1.5) we have defined the following FS resource and its corresponding clone:
>
> primitive p_fs_wwwdata ocf:heartbeat:Filesystem \
>          params device="/dev/drbd1" \
>          directory="/mnt/wwwdata" fstype="ocfs2" \
>          options="rw,noatime,noacl,nouser_xattr,commit=30,data=writeback" \
>          op start interval="0" timeout="90s" \
>          op stop interval="0" timeout="300s"
>
> clone c_fs_wwwdata p_fs_wwwdata \
>          params master-max="2" clone-max="2" \
>          meta target-role="Started" is-managed="true"
>
> one of the nodes (node01) went down last night and I started it with the cluster put into maintenance-mode.
> After checking everything else, I mounted the ocfs2-resource manually, did some "crm resource reprobe/cleanup" to make the cluster aware of this and finally turned off the maintenance-mode.
>
> Looking at the output of crm_mon, everything looks good again:
>
>   Clone Set: c_fs_wwwdata [p_fs_wwwdata]
>       Started: [ node01 node02 ]
>
> alternatively looking at "crm_mon -n":
>
> Node node02: online
>          p_fs_wwwdata:1  (ocf::heartbeat:Filesystem) Started
>
> Node node01: online
>          p_fs_wwwdata:0  (ocf::heartbeat:Filesystem) Started
>
> but the HAWK web interface (version 0.3.6 coming with SLES11SP1-HAE) displays this:
>
> Clone Set: c_fs_wwwdata
>    - p_fs_wwwdata:0: Started: node01, node02
>    - p_fs_wwwdata:1: Stopped
>
> Does anybody know why there is a difference?
> Did I make a mistake when manually mounting the FS while it was unmanaged?
> Or is this only a cosmetic issue with HAWK?
>
> When these resources are started by pacemaker, HAWK shows exactly what's expected: two started resources, one per node.
>
> Thanks in advance!
>

It's almost certainly a cosmetic issue in Hawk.  I have fixed one or two 
bugs along these lines since version 0.3.6.  If you'd like to try a 
newer (not-officially-supported-by-SUSE-but-best-effort-support-by-me) 
build, you can try hawk-0.4.1 from:

http://software.opensuse.org/search?q=Hawk&baseproject=SUSE%3ASLE-11%3ASP1&lang=en

Alternatively, if you can reproduce the issue, send me the output of 
"cibadmin -Q" (off-list is fine) and I can verify and fix it.
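
For reference, a minimal sketch of capturing that output to a file that 
can be attached to a mail. The output location and file name are 
assumptions chosen for illustration; nothing in Pacemaker or Hawk 
requires them. It needs to be run on one of the cluster nodes.

```shell
# Hypothetical output path; any writable location works.
out="${TMPDIR:-/tmp}/cib-$(uname -n)-$(date +%Y%m%d-%H%M%S).xml"

if command -v cibadmin >/dev/null 2>&1; then
    # -Q (--query) dumps the live CIB (configuration and status) as XML.
    cibadmin -Q > "$out" && echo "CIB saved to $out"
else
    echo "cibadmin not found; run this on a cluster node" >&2
fi
```

The dump contains both the configuration and the status section, so it 
is best taken while Hawk is still showing the discrepancy. It is worth 
reading through the file before sending it, since resource parameters 
can include passwords.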

Regards,

Tim
-- 
Tim Serong
Senior Clustering Engineer
SUSE
tserong at suse.com



