On Thu, 2007-02-01 at 15:02 -0500, Mark Lord wrote:
We can ... the big concern with your approach, which you haven't
addressed, is the time factor. For most SCSI devices, returning a fatal
MEDIUM ERROR means the device has run out of remapping table space, and
also that there's probably a bunch of sectors on the track that are now
unreadable. Thus, there
are almost always multiple sector failures. In Linux, the average
request size on a filesystem is around 64-128KB; that's 128-256 sectors.
If we fail at the initial sector, we have to go through another 128-256
attempts, with the internal device retries, before we fail the entire
request. Some devices can take a second or so for each read before they
finally give up and decide they really can't read the sector, so you're
looking at 2-5 minutes before the machine finally fails this one
request ... and much worse for devices that retry more times.
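
To put rough numbers on that, here is a back-of-the-envelope sketch. It
assumes 512-byte sectors, that every sector in the request fails, and
about one second per failed read; the function names are made up for
illustration and are not kernel code.

```python
SECTOR_BYTES = 512  # assumption: classic 512-byte sectors


def sectors_in_request(request_kib):
    """Number of sectors covered by a request of the given size in KiB."""
    return request_kib * 1024 // SECTOR_BYTES


def worst_case_seconds(request_kib, seconds_per_failed_read=1.0):
    """Time to fail the whole request if every sector is retried one by one.

    Assumes each sector fails and costs seconds_per_failed_read, which
    already includes the device's internal retries.
    """
    return sectors_in_request(request_kib) * seconds_per_failed_read


for kib in (64, 128):
    secs = worst_case_seconds(kib)
    print(f"{kib} KiB: {sectors_in_request(kib)} sectors, ~{secs / 60:.1f} min")
# 64 KiB: 128 sectors, ~2.1 min
# 128 KiB: 256 sectors, ~4.3 min
```

Raising seconds_per_failed_read models the devices that retry more times;
the total scales linearly with it.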
Er ... define "skip over the bio". A bio is simply a block
representation for a bunch of sg elements coming in to the elevator.
Mostly what we see in SCSI is a single bio per request, so skipping the
bio is really the current behaviour (to fail the rest of the request).
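
As a toy illustration of that point (this is not the kernel's struct bio
or struct request; a request is modelled here as a plain list of
hypothetical (first_sector, nr_sectors) pairs), count how many sectors
would still be attempted if we skipped only the bio containing the bad
sector:

```python
def sectors_left_after_skipping_bio(bios, bad_sector):
    """Sectors still to be attempted after skipping the failing bio.

    bios is a list of (first_sector, nr_sectors) tuples in request order;
    this is a hypothetical model, not the kernel data structure.
    """
    for i, (first, count) in enumerate(bios):
        if first <= bad_sector < first + count:
            # Everything in later bios would still be tried.
            return sum(n for _, n in bios[i + 1:])
    raise ValueError("bad_sector is not covered by this request")


# Single bio per request (the common SCSI case): nothing is left,
# which is the same as failing the rest of the request.
print(sectors_left_after_skipping_bio([(0, 256)], 0))  # 0

# Several bios merged into one request: later bios would still be read.
print(sectors_left_after_skipping_bio([(0, 64), (64, 64), (128, 128)], 10))  # 192
```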
James