On Thu, Jun 03, 2010 at 11:46:34AM -0400, Chris Mason wrote:
That's true but a naive kernel crc cannot do better than the
user/kernel boundary (and has very big problems even doing that well
with mmap, get_user_pages, concurrent dirtying). So we are already
resigned there to a best effort approach.
Since we fundamentally can't have end-to-end protection then, it's
much harder to argue for significant complexity just to close the
hole a little.
So if we do the block layer retries in response to concurrent writes, it
opens the window there a little bit, but remember only a small
proportion of writes will require retries, and for that proportion, the
window is only opening a small amount.
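
For concreteness, the retry idea above can be sketched in userspace C. This is only an illustration under stated assumptions, not the block layer's actual code: submit_with_retry(), dirty_hook_t and dirty_once() are made-up names, memcpy() stands in for the device transfer, and a plain bitwise CRC-32 stands in for whatever checksum the hardware would verify.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320); slow but dependency-free. */
static uint32_t crc32_buf(const unsigned char *p, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= p[i];
        for (int b = 0; b < 8; b++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Hypothetical hook standing in for a concurrent writer; may be NULL. */
typedef void (*dirty_hook_t)(unsigned char *page, size_t len, int attempt);

/* Example hook (assumption): redirties the page during the first attempt only. */
static void dirty_once(unsigned char *page, size_t len, int attempt)
{
    if (attempt == 1 && len > 0)
        page[0] ^= 0xFF;
}

/*
 * Checksum the page, copy it out (stand-in for the device transfer), then
 * check the copy against the checksum taken earlier.  A mismatch means the
 * page was modified in between, so the whole attempt is repeated.
 * Returns the number of attempts used, or -1 if max_tries was exhausted.
 */
static int submit_with_retry(unsigned char *page, size_t len,
                             unsigned char *dev, uint32_t *dev_crc,
                             dirty_hook_t hook, int max_tries)
{
    assert(page && dev && dev_crc && max_tries > 0);
    for (int attempt = 1; attempt <= max_tries; attempt++) {
        uint32_t before = crc32_buf(page, len);
        if (hook)
            hook(page, len, attempt);   /* simulate a racing redirty */
        memcpy(dev, page, len);
        if (crc32_buf(dev, len) == before) {
            *dev_crc = before;
            return attempt;
        }
    }
    return -1;
}
```

With no concurrent writer the first attempt succeeds; with dirty_once() as the hook the first attempt fails verification and the second goes through, which is the "small proportion of writes" case. Note that the retry re-checksums data that has already sat in memory, so it widens the unprotected window slightly rather than closing it.
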
As far as I know, we're not checksumming at the usercopy point but at
the writeback point, so we have a vastly bigger window there already.
I'm not totally against freezing redirtying events during writeback,
but as long as there is a huge window of the page sitting dirty in
pagecache, it's not worth much, if any, complexity.
Also I don't think we can deal with memory errors and scribbles just by
crcing dirty data. The calculations generating the data could get
corrupted. Data can be corrupted on its way back from the device to
userspace. Dirty memory writeback is usually only a small part of the
entire data transformation process.
After protecting writeback from IO bus and wire errors, I think this
would be the most productive thing to work on. Obviously this feature is
being pushed by databases and such that really want to pass checksums
all the way from userspace. Block retrying is _not_ needed or wanted
here of course.
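
A rough sketch of what "passing checksums all the way from userspace" could look like, again as an illustration only: struct checked_write and verify_checked_write() are hypothetical names, not an existing kernel or libc interface, and Fletcher-16 is used purely because it is short. The point is that the lower layer verifies once and fails the write on a mismatch; it does not retry, because the application, not the kernel, owns the checksum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fletcher-16: a deliberately simple stand-in for the real checksum. */
static uint16_t fletcher16(const unsigned char *p, size_t len)
{
    uint16_t a = 0, b = 0;
    for (size_t i = 0; i < len; i++) {
        a = (uint16_t)((a + p[i]) % 255);
        b = (uint16_t)((b + a) % 255);
    }
    return (uint16_t)((b << 8) | a);
}

/* Hypothetical request: the application hands over data and checksum together. */
struct checked_write {
    const unsigned char *data;
    size_t len;
    uint16_t app_sum;   /* computed by the application, not by the kernel */
};

/*
 * Lower layer: verify the data against the application's checksum once.
 * Returns 0 if they match, -1 otherwise.  No retry: a mismatch means the
 * data was corrupted after the application checksummed it.
 */
static int verify_checked_write(const struct checked_write *w)
{
    assert(w && w->data);
    return fletcher16(w->data, w->len) == w->app_sum ? 0 : -1;
}
```

Because the checksum is computed before the data ever enters the pagecache, any later scribble shows up as a verification failure instead of being silently checksummed along with the good data.
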
After that is done, it might be worth coming back to see if
regular pagecache can be protected any better. I just think that's
the highest hanging fruit at the moment.
--