[Pacemaker] Configuring LVM and Filesystem resources on top of DRBD

Dejan Muhamedagic dejanmm at fastmail.fm
Tue Feb 9 05:57:36 EST 2010


Hi,

On Mon, Feb 08, 2010 at 03:45:25PM -0600, D. J. Draper wrote:
> 
> On Mon, Feb 8 13:36:47 EST 2010, Dejan Muhamedagic wrote:
> > The logs don't contain the period when CRM probes for running
> > resources. But I can imagine what is actually going on. This is a
> > deficiency in handling probes in the LVM and, perhaps, the
> > Filesystem resource agents. Can you please post the logs from the
> > time when the cluster is starting. Actually, best to open a
> > bugzilla and attach a hb_report report.
> > 
> > Thanks,
> > 
> > Dejan
> Thanks for the reply Dejan. I attached a zip file with several
> log files covering two reboots on each server. To generate

According to Node01Reboot1500ha-log.log, CRM first starts LVM
then drbd:

Feb 08 15:03:36 node01.houseofdraper.org lrmd: [1771]: info: rsc:lvm_data0:6: start
Feb 08 15:03:36 node01.houseofdraper.org crmd: [1774]: info: do_lrm_rsc_op: Performing key=7:1:0:1fcb0ada-cc5d-463b-ab2d-e046fee580ed op=drbd_data0:1_start_0 )
Feb 08 15:03:36 node01.houseofdraper.org lrmd: [1771]: info: rsc:drbd_data0:1:7: start
Feb 08 15:03:36 node01.houseofdraper.org crmd: [1774]: info: do_lrm_rsc_op: Performing key=35:1:0:1fcb0ada-cc5d-463b-ab2d-e046fee580ed op=drbd_data1:1_start_0 )
Feb 08 15:03:36 node01.houseofdraper.org lrmd: [1771]: info: rsc:drbd_data1:1:8: start

That's obviously a configuration problem. It's similar in all the
other logs: it's as if there are no constraints.

There are also numerous drbd errors:

Node01Reboot1400messages.log:Feb  8 14:00:56 node01 drbd[6124]: ERROR: data0: Called drbdadm -c /etc/drbd.conf secondary data0
Node01Reboot1400messages.log:Feb  8 14:00:56 node01 drbd[6124]: ERROR: data0: Exit code 11
Node01Reboot1400messages.log:Feb  8 14:00:56 node01 drbd[6124]: ERROR: data0: Command output: 
Node01Reboot1400messages.log:Feb  8 14:00:56 node01 drbd[6124]: ERROR: data0: Called drbdadm -c /etc/drbd.conf secondary data0

etc.

Looking again at your configuration, there are some strange
resource relations:

> order ord_data00 inf: ms_drbd_data0:promote ms_drbd_data1:promote

How do these two depend on each other?

> order ord_data01 inf: ms_drbd_data0:promote lvm_data0:start
> order ord_data02 inf: lvm_data0:start fs_data0:start
> order ord_data03 inf: ms_drbd_data1:promote lvm_data1:start
> order ord_data04 inf: lvm_data1:start fs_data1:start
> order ord_data05 inf: fs_data0:start fs_data1:start

And these two?

> order ord_data06 inf: fs_data1:start ip_data:start
> order ord_data07 inf: ip_data:start svc_nfs:start
> order ord_data08 inf: ip_data:start svc_samba:start

Perhaps you could use groups to reduce the configuration size a
bit. It's quite hard to follow all the constraints.
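
As a rough sketch only, assuming each DRBD device carries exactly
one volume group and one filesystem, the LVM and Filesystem
primitives could be grouped as below. The grp_*, col_* and ord_*
IDs are made up for illustration and are not taken from the posted
configuration; the primitive and master/slave names are the ones
from the constraints quoted above.

```
# Hypothetical IDs (grp_*, col_*, ord_*); adjust to the real setup.
# Members of a group are started in the listed order on one node,
# so the separate lvm -> fs order constraints are no longer needed.
group grp_data0 lvm_data0 fs_data0
group grp_data1 lvm_data1 fs_data1

# Run each group only on the node where its DRBD device is Master.
colocation col_data0 inf: grp_data0 ms_drbd_data0:Master
colocation col_data1 inf: grp_data1 ms_drbd_data1:Master

# Start each group only after its DRBD device has been promoted.
order ord_data0 inf: ms_drbd_data0:promote grp_data0:start
order ord_data1 inf: ms_drbd_data1:promote grp_data1:start
```

The IP address and the services could go into a third group in the
same way. Note that a group also orders its members among
themselves, which the existing ord_data07/ord_data08 constraints do
not do for svc_nfs and svc_samba.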

Please use hb_report; it is the only way to correlate events
with the logs and the configuration. You'll also find it a tad
easier than collecting everything by hand.

The bugzilla is at http://developerbugs.linux-foundation.org/

Thanks,

Dejan

> these, I started with all the resources running on Node01. I
> issued the first reboot at 14:00, after which all the resources
> except fs_data0 started successfully on Node02. I issued a
> second reboot at 15:00, after which only the drbd resources
> successfully restarted on Node01:
> 
> -bash-4.0# crm status
> ============
> Last updated: Mon Feb  8 15:42:25 2010
> Stack: Heartbeat
> Current DC: node02.houseofdraper.org (a91b7362-448e-4437-a543-19e0067a5d2e) - partition with quorum
> Version: 1.0.7-d3fa20fc76c7947d6de66db7e52526dc6bd7d782
> 2 Nodes configured, unknown expected votes
> 4 Resources configured.
> ============
> 
> Online: [ node01.houseofdraper.org node02.houseofdraper.org ]
> 
>  Master/Slave Set: ms_drbd_data0
>      Masters: [ node01.houseofdraper.org ]
>      Slaves: [ node02.houseofdraper.org ]
>  Master/Slave Set: ms_drbd_data1
>      Masters: [ node01.houseofdraper.org ]
>      Slaves: [ node02.houseofdraper.org ]
> 
> Failed actions:
>     lvm_data0_start_0 (node=node02.houseofdraper.org, call=14, rc=1, status=complete): unknown error
>     fs_data0_start_0 (node=node02.houseofdraper.org, call=6, rc=5, status=complete): not installed
>     lvm_data0_start_0 (node=node01.houseofdraper.org, call=6, rc=1, status=complete): unknown error
>     fs_data0_start_0 (node=node01.houseofdraper.org, call=14, rc=5, status=complete): not installed
> -bash-4.0# 
> 
> As for the bugzilla report, if you would kindly point me to a
> FAQ or HOWTO covering the proper submission of a bugzilla
> report for this group, I would be happy to initiate one.


> Thanks in advance,
> 
> DJ


> _______________________________________________
> Pacemaker mailing list
> Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker




