Re: [patch 8/8] fs: add i_op->sync_inode

Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]
From: Nick Piggin
Date: Monday, January 3, 2011 - 11:27 pm

On Wed, Dec 29, 2010 at 10:12:46AM -0500, Christoph Hellwig wrote:

OK



Basically the blocking or non blocking behaviour of ->write_inode is
irrelevant and will go away. Specific behaviour cannot be relied upon from
generic code, only filesystems. But even filesystems should not
differentiate between blocking and non-blocking (where they do, they
have been buggy). That is the *other* big bug in the code (the first one
being syncing code not waiting for writeback).

So after my series, we guarantee ->write_inode is called after the inode
has been dirtied. We make no guarantees about what mode it is called in
(blocking or non blocking). So the filesystem should either have _all_
write_inode calls do sync writeout, or have them all do part of the op
(eg. clean struct inode by dirtying pagecache) and then syncing in fsync
and sync_fs. In the latter scheme, it doesn't make sense to do anything
more in the case of a "sync" call.

Basically you can spot the bugs by looking for where the .write_inode
functions differentiate the writeback mode.

So, that's all very confusing with the current set of APIs that are
called sync_inode, sync_inode_metadata, etc. So I'm trying to reduce and
remove these gradually.



Yes ok.

 

It does have a reason, which is the nfsd syncing callback -- using
sync_inode_metadata there is actually wrong and should really be
replaced with a warning that the fs cannot support such syncing.

See the problem I explained above -- it really needs to do a full
fsync call. But it doesn't have a file descriptor, and most filesystems
don't need one, so I think a sync_inode operation is nice.

Also giving filesystems the flexibility to do the data writeout can
be more optimal by doing all writeout at once or ordering to suit their
writeback patterns. Having range data could give some performance
advantages writing back mappings or commit operations over network. I
don't see it as a big complexity. It gives a chance to do range fsyncs
and sync_file_range and such properly too.


--
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]

Messages in current thread:
[patch 0/8] Inode data integrity patches, Nick Piggin, (Fri Dec 17, 6:46 pm)
[patch 1/8] fs: mark_inode_dirty barrier fix, Nick Piggin, (Fri Dec 17, 6:46 pm)
[patch 2/8] fs: simple fsync race fix, Nick Piggin, (Fri Dec 17, 6:46 pm)
[patch 3/8] fs: introduce inode writeback helpers, Nick Piggin, (Fri Dec 17, 6:46 pm)
[patch 5/8] fs: ext2 inode sync fix, Nick Piggin, (Fri Dec 17, 6:46 pm)
[patch 6/8] fs: fsync optimisations, Nick Piggin, (Fri Dec 17, 6:46 pm)
[patch 8/8] fs: add i_op->sync_inode, Nick Piggin, (Fri Dec 17, 6:46 pm)
Re: [patch 7/8] fs: fix or note I_DIRTY handling bugs in f ..., Christoph Hellwig, (Wed Dec 29, 8:01 am)
Re: [patch 8/8] fs: add i_op->sync_inode, Christoph Hellwig, (Wed Dec 29, 8:12 am)
Re: [patch 8/8] fs: add i_op->sync_inode, Nick Piggin, (Mon Jan 3, 11:27 pm)
Re: [patch 8/8] fs: add i_op->sync_inode, Christoph Hellwig, (Mon Jan 3, 11:57 pm)
Re: [patch 8/8] fs: add i_op->sync_inode, Nick Piggin, (Tue Jan 4, 1:03 am)
Re: [patch 8/8] fs: add i_op->sync_inode, Nick Piggin, (Tue Jan 4, 2:49 am)