On Mon, Jan 03, 2011 at 11:58:21AM -0500, Christoph Hellwig wrote:
Just to recap, basically we have 2 main problems in vfs/filesystems:
- i_state dirtiness is checked outside the correct synchronization
protocol, so it may be seen as clean before a concurrent writer
has finished.
- .write_inode is only guaranteed to be called once, in either sync
or async mode, for a dirty inode at a sync point. Many filesystems
were incorrectly assuming they would be called once *in synchronous
mode*.
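To make the first problem concrete, here is a minimal userspace sketch of
one reading of that race. It is not kernel code: struct demo_inode, the
DEMO_* flags and all demo_* functions are hypothetical names invented for
illustration, and a pthread mutex stands in for whatever lock actually
protects i_state. A writer clears the dirty bit before its writeout has
finished, so an unlocked test of the dirty bit alone reports "clean" too
early.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flags for this sketch; not the kernel's i_state values. */
#define DEMO_DIRTY   0x1u  /* in-memory inode has unwritten changes */
#define DEMO_WRITING 0x2u  /* a writer cleared DEMO_DIRTY but is not done */

struct demo_inode {
    pthread_mutex_t lock;  /* stand-in for the lock protecting the state */
    unsigned int state;
};

/* Writer, step 1: claim a dirty inode and mark the writeout in flight. */
bool demo_writeout_begin(struct demo_inode *inode)
{
    bool claimed = false;

    assert(inode != NULL);
    pthread_mutex_lock(&inode->lock);
    if ((inode->state & DEMO_DIRTY) && !(inode->state & DEMO_WRITING)) {
        inode->state &= ~DEMO_DIRTY;
        inode->state |= DEMO_WRITING;
        claimed = true;
    }
    pthread_mutex_unlock(&inode->lock);
    return claimed;
}

/* Writer, step 2: the writeout has finished. */
void demo_writeout_end(struct demo_inode *inode)
{
    pthread_mutex_lock(&inode->lock);
    inode->state &= ~DEMO_WRITING;
    pthread_mutex_unlock(&inode->lock);
}

/* Racy check: no lock, and it ignores an in-flight writer. */
bool demo_looks_clean_racy(const struct demo_inode *inode)
{
    return (inode->state & DEMO_DIRTY) == 0;
}

/* Check under the lock: stable only if not dirty and no writer in flight. */
bool demo_is_stable(struct demo_inode *inode)
{
    bool stable;

    pthread_mutex_lock(&inode->lock);
    stable = (inode->state & (DEMO_DIRTY | DEMO_WRITING)) == 0;
    pthread_mutex_unlock(&inode->lock);
    return stable;
}
```

Between demo_writeout_begin() and demo_writeout_end(), the racy check
already says "clean" while demo_is_stable() still says no. A sync that
trusted the racy check would return before the data is safe.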
The optimal approach for .write_inode seems to be to clean the struct
inode so that it may eventually be reclaimed, and then have your .fsync
and .sync_fs implementations enforce the actual data integrity.
Note that "cleaning the struct inode" often means copying the metadata
somewhere else to be scheduled for asynchronous writeout. If you allow
the inode to be evicted at that point, without also making .evict_inode
a data integrity point, then .sync_fs (and any subsequent .fsync, if the
inode is reopened) must still enforce integrity for these potentially
evicted inodes.
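As a rough illustration of that split, here is a toy userspace model. It
is not a real filesystem and does not use the kernel API: struct toy_fs,
struct toy_inode and the toy_* functions are hypothetical names, and the
"disk" is just an array. The write_inode analogue only copies metadata
into a per-filesystem pending area and clears the dirty flag. The fsync
and sync_fs analogues are what actually push pending metadata out, and
sync_fs walks the pending area rather than the in-memory inodes, so it
still covers inodes that have since been evicted.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_INODES 8

struct toy_meta {
    unsigned long size;
};

/* In-memory inode; may be thrown away once it is no longer dirty. */
struct toy_inode {
    unsigned int ino;          /* must be < TOY_MAX_INODES */
    bool dirty;
    struct toy_meta meta;
};

/* Per-filesystem state that outlives any individual in-memory inode. */
struct toy_fs {
    struct toy_meta pending[TOY_MAX_INODES];  /* copied, not yet on disk */
    bool has_pending[TOY_MAX_INODES];
    struct toy_meta disk[TOY_MAX_INODES];     /* what is durably stored */
};

/* .write_inode analogue: clean the inode by copying its metadata aside. */
void toy_write_inode(struct toy_fs *fs, struct toy_inode *inode)
{
    assert(inode->ino < TOY_MAX_INODES);
    if (!inode->dirty)
        return;
    fs->pending[inode->ino] = inode->meta;
    fs->has_pending[inode->ino] = true;
    inode->dirty = false;      /* inode is now reclaimable */
}

/* Push one pending metadata copy to the toy disk, if there is one. */
static void toy_flush_one(struct toy_fs *fs, unsigned int ino)
{
    if (fs->has_pending[ino]) {
        fs->disk[ino] = fs->pending[ino];
        fs->has_pending[ino] = false;
    }
}

/* .fsync analogue: integrity point for a single, currently open inode. */
void toy_fsync(struct toy_fs *fs, struct toy_inode *inode)
{
    toy_write_inode(fs, inode);
    toy_flush_one(fs, inode->ino);
}

/* .sync_fs analogue: integrity point for everything, evicted or not. */
void toy_sync_fs(struct toy_fs *fs)
{
    for (unsigned int ino = 0; ino < TOY_MAX_INODES; ino++)
        toy_flush_one(fs, ino);
}
```

In this model, calling toy_write_inode() and then dropping the toy_inode
loses nothing: the metadata sits in fs->pending until toy_sync_fs() or a
later toy_fsync() on a reopened inode flushes it.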
Everyone happy with this? Please review your filesystems and look at
my patches :)
Thanks,
Nick
--