[Pacemaker] Immediate fs errors on iscsi connection problem

Vladislav Bogdanov bubble at hoster-ok.com
Sun Apr 3 02:53:31 EDT 2011


Hi,

You need some tuning on both sides.
First, (at least some versions of) ietd needs to be blocked (-j DROP)
with iptables on restarts. That means you should block all incoming and
outgoing packets (the latter is more important) before ietd stops, and
unblock them after it starts. I use a home-brew stateful RA for this, which
blocks (DROP) all traffic to/from the VIP in slave mode and passes it on to
a later decision (no -j) in master mode.
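
As a rough illustration of the blocking step (not the RA itself), here is a
minimal sketch. The VIP address and the IPTABLES variable are placeholders
chosen for the example, and the sketch deletes the DROP rules on unblock
instead of switching to a rule without -j.

```shell
#!/bin/sh
# Hypothetical sketch: drop all traffic to/from the target's VIP while ietd
# is stopped or restarting, and remove the rules once it is running again.
# VIP and IPTABLES are assumptions for illustration only.
VIP="${VIP:-192.0.2.10}"
IPTABLES="${IPTABLES:-iptables}"

block_vip() {
    # Outgoing packets first: this is the more important direction.
    "$IPTABLES" -I OUTPUT -s "$VIP" -j DROP
    "$IPTABLES" -I INPUT -d "$VIP" -j DROP
}

unblock_vip() {
    # Delete exactly the rules that block_vip inserted.
    "$IPTABLES" -D OUTPUT -s "$VIP" -j DROP
    "$IPTABLES" -D INPUT -d "$VIP" -j DROP
}
```

Call block_vip before stopping ietd and unblock_vip once it has started
again, so that the initiator only ever sees silence from the VIP during the
switch.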

Next, on the initiator side, tune replacement_timeout (set it large) and
disable the iSCSI ping: noop_out_timeout=0 and noop_out_interval=0.
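
With open-iscsi these settings can be changed on an existing node record
with iscsiadm, roughly as sketched below. The target name, the portal and
the value 86400 are placeholders for the example (pick whatever "large"
means for your failover time). As far as I know the changes only take
effect at the next login of the session.

```shell
#!/bin/sh
# Hypothetical sketch: TARGET, PORTAL and the 86400 value are placeholders,
# not values from this thread.
TARGET="${TARGET:-iqn.2011-04.com.example:storage}"
PORTAL="${PORTAL:-192.0.2.10:3260}"
ISCSIADM="${ISCSIADM:-iscsiadm}"

set_node_param() {
    # Update one setting in the stored node record for TARGET at PORTAL.
    "$ISCSIADM" -m node -T "$TARGET" -p "$PORTAL" -o update -n "$1" -v "$2"
}

tune_initiator() {
    # Large replacement timeout (in seconds) for the session.
    set_node_param node.session.timeo.replacement_timeout 86400 &&
    # Disable the iSCSI ping (NOP-Out) on the connection.
    set_node_param 'node.conn[0].timeo.noop_out_interval' 0 &&
    set_node_param 'node.conn[0].timeo.noop_out_timeout' 0
}
```

The same three names can also be set in iscsid.conf, where they act as
defaults for node records created by later discoveries.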

This should help.
Best,
Vladislav

02.04.2011 23:57, ruslan usifov wrote:
> Hello
> 
> I have 2 nodes that manage iSCSI targets, which are exported for svn and
> web server usage.
> I have the following problem: when one of the nodes goes down (when I do
> "shutdown -P now") and the resources migrate from one node to the other
> (ipaddr, iscsi target and so on), on the iSCSI initiator side I often
> see the following in the log:
> 
> Apr  1 19:05:43 web-server kernel: [5437679.000891] sd 0:0:0:1: [sda]
> Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> Apr  1 19:05:43 web-server kernel: [5437679.000897] sd 0:0:0:1: [sda]
> Sense Key : Illegal Request [current]
> Apr  1 19:05:43 web-server kernel: [5437679.000902] Info fld=0x0
> Apr  1 19:05:43 web-server kernel: [5437679.000904] sd 0:0:0:1: [sda]
> Add. Sense: Logical unit not supported
> Apr  1 19:05:43 web-server kernel: [5437679.000910] sd 0:0:0:1: [sda]
> CDB: Write(10): 2a 00 26 08 7d c8 00 00 08 00
> Apr  1 19:05:43 web-server kernel: [5437679.000920] end_request: I/O
> error, dev sda, sector 638090696
> Apr  1 19:05:43 web-server kernel: [5437679.000937] __ratelimit: 28
> callbacks suppressed
> Apr  1 19:05:43 web-server kernel: [5437679.000940] Buffer I/O error on
> device sda, logical block 79761337
> Apr  1 19:05:43 web-server kernel: [5437679.000948] lost page write due
> to I/O error on sda
> Apr  1 19:05:43 web-server kernel: [5437679.001145] sd 0:0:0:1: [sda]
> Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> Apr  1 19:05:43 web-server kernel: [5437679.001149] sd 0:0:0:1: [sda]
> Sense Key : Illegal Request [current]
> Apr  1 19:05:43 web-server kernel: [5437679.001153] Info fld=0x0
> Apr  1 19:05:43 web-server kernel: [5437679.001154] sd 0:0:0:1: [sda]
> Add. Sense: Logical unit not supported
> Apr  1 19:05:43 web-server kernel: [5437679.001159] sd 0:0:0:1: [sda]
> CDB: Write(10): 2a 00 20 fd 5f d8 00 00 08 00
> Apr  1 19:05:43 web-server kernel: [5437679.001168] end_request: I/O
> error, dev sda, sector 553476056
> Apr  1 19:05:43 web-server kernel: [5437679.001178] Buffer I/O error on
> device sda, logical block 69184507
> Apr  1 19:05:43 web-server kernel: [5437679.001186] lost page write due
> to I/O error on sda
> Apr  1 19:05:43 web-server kernel: [5437679.001203] JBD: Detected IO
> errors while flushing file data on sda
> Apr  1 19:05:43 web-server kernel: [5437679.002116] sd 0:0:0:1: [sda]
> Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> Apr  1 19:05:43 web-server kernel: [5437679.002120] sd 0:0:0:1: [sda]
> Sense Key : Illegal Request [current]
> Apr  1 19:05:43 web-server kernel: [5437679.002124] Info fld=0x0
> Apr  1 19:05:43 web-server kernel: [5437679.002125] sd 0:0:0:1: [sda]
> Add. Sense: Logical unit not supported
> Apr  1 19:05:43 web-server kernel: [5437679.002130] sd 0:0:0:1: [sda]
> CDB: Write(10): 2a 00 13 f8 7b 28 00 00 18 00
> Apr  1 19:05:43 web-server kernel: [5437679.002138] end_request: I/O
> error, dev sda, sector 335051560
> Apr  1 19:05:43 web-server kernel: [5437679.002160] Aborting journal on
> device sda.
> Apr  1 19:05:43 web-server kernel: [5437679.002864] sd 0:0:0:1: [sda]
> Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> Apr  1 19:05:43 web-server kernel: [5437679.002868] sd 0:0:0:1: [sda]
> Sense Key : Illegal Request [current]
> Apr  1 19:05:43 web-server kernel: [5437679.002872] Info fld=0x0
> Apr  1 19:05:43 web-server kernel: [5437679.002874] sd 0:0:0:1: [sda]
> Add. Sense: Logical unit not supported
> Apr  1 19:05:43 web-server kernel: [5437679.002878] sd 0:0:0:1: [sda]
> CDB: Write(10): 2a 00 13 f8 10 10 00 00 08 00
> Apr  1 19:05:43 web-server kernel: [5437679.002887] end_request: I/O
> error, dev sda, sector 335024144
> Apr  1 19:05:43 web-server kernel: [5437679.002898] Buffer I/O error on
> device sda, logical block 41878018
> Apr  1 19:05:43 web-server kernel: [5437679.002906] lost page write due
> to I/O error on sda
> Apr  1 19:05:45 web-server kernel: [5437681.360912] ext3_abort called.
> Apr  1 19:05:45 web-server kernel: [5437681.360928] EXT3-fs error
> (device sda): ext3_journal_start_sb: Detected aborted journal
> Apr  1 19:05:45 web-server kernel: [5437681.360939] Remounting
> filesystem read-only
> Apr  1 19:06:06 web-server kernel: [5437702.070344]  connection1:0:
> detected conn error (1020)
> 
> 
> After this the file system is remounted in read-only mode. How can I
> prevent this behaviour? I found the following discussion:
> http://www.mail-archive.com/open-iscsi@googlegroups.com/msg00554.html,
> but how do I apply the solution from it to my case, assuming this is my case?
> 
> 
> PS: I use ietd as the iSCSI target




