- Fix NULL pointer deference on error path - Extent header we found may be not latest node of the inode. In order to find latest extent we have to traverse a path from very beginning. https://bugzilla.kernel.org/show_bug.cgi?id=15792 Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> --- fs/ext4/extents.c | 50 +++++++++++++++++++++++++++++++++++++++----------- 1 files changed, 39 insertions(+), 11 deletions(-) diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c index 7c7b1d5..13758c3 100644 --- a/fs/ext4/extents.c +++ b/fs/ext4/extents.c @@ -3327,7 +3327,7 @@ int ext4_ext_get_blocks(handle_t *handle, struct inode *inode, { struct ext4_ext_path *path = NULL; struct ext4_extent_header *eh; - struct ext4_extent newex, *ex, *last_ex; + struct ext4_extent newex, *ex; ext4_fsblk_t newblock; int err = 0, depth, ret, cache_type; unsigned int allocated = 0; @@ -3510,17 +3510,38 @@ int ext4_ext_get_blocks(handle_t *handle, struct inode *inode, } if (unlikely(ext4_test_inode_flag(inode, EXT4_INODE_EOFBLOCKS))) { - if (unlikely(!eh->eh_entries)) { - EXT4_ERROR_INODE(inode, - "eh->eh_entries == 0 ee_block %d", - ex->ee_block); - err = -EIO; - goto out2; - } - last_ex = EXT_LAST_EXTENT(eh); + struct ext4_extent *last_ex = EXT_LAST_EXTENT(eh); + /* + * Optimization: Extent header we found may not be the latest + * extent of the inode, but it is already in-core memory. + * Let's test against this EH to avoid unecessery IO. + */ + if (unlikely(!eh->eh_entries)) + goto bad_eh; if (iblock + ar.len > le32_to_cpu(last_ex->ee_block) - + ext4_ext_get_actual_len(last_ex)) - ext4_clear_inode_flag(inode, EXT4_INODE_EOFBLOCKS); + + ext4_ext_get_actual_len(last_ex)) { + struct ext4_ext_path *path2; + struct ext4_extent_header *eh2; + /* Real search of the latests extent is necessery */ + path2 = ext4_ext_find_extent(inode, EXT_MAX_BLOCK, NULL); + eh2 = path2[depth].p_hdr; + if (IS_ERR(path2)) ...
I split this patch into two patches, since they are addressing two very distinct and different bugs. As it turns out I chose a completely different way of tackling the second issue, which has the advantage being much simpler, and not requiring a second call to ext4_ext_find_extent(), which can end up requiring extra disk reads. Also note that I supplied a test case to demonstrate the problem. This was helpful in assuring that the problem was fixed, and also in proving that there really *was* a problem; as it turns out triggering it is quite difficult and I would be very surprised if it has really happenned in real life. To construct the test case I first of all used a 1k block file system, and then generated an extremely fragmented free space: mkdir a; cd a seq -f %05.0f 1 65536 | xargs touch seq -f %05.0f 1 65536 | xargs -L 1 fallocate -l 1k seq -f %05.0f 1 2 65536 | xargs rm cd .. I then created the fragmented file with the EOFBLOCKS set: fallocate -n -l 32m foo This should generate a file with a two-deep extent tree. (It is otherwise *very* hard to create a deep extent tree.) I then found the last block in an leaf block in the middle of the tree, and deleted the last extent in that leaf block, using the tst_extents program found in lib/ext2fs in the e2fsprogs sources (it is built using "make check"). In the case described in the commit, this happened to be for the logical block 17925. Could such a file be generated in real life? Yes, but you'd have to be quite unlucky, as it would require extending a sparse, fragmented file using an i_size-preserving fallocate call, where there was a hole at precisely the right (wrong) place in the extent tree. - Ted --
If the EOFBLOCK_FL flag is set when it should not be and the inode is
zero length, then eh_entries is zero, and ex is NULL, so dereferencing
ex to print ex->ee_block causes a kernel OOPS in
ext4_ext_map_blocks().
On top of that, the error message which is printed isn't very helpful.
So we fix this by printing something more explanatory which doesn't
involve trying to print ex->ee_block.
Addresses-Google-Bug: #2655740
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
---
fs/ext4/extents.c | 9 +++++----
1 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 595ab6d..aeec5c7 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3370,8 +3370,9 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
*/
if (unlikely(path[depth].p_ext == NULL && depth != 0)) {
EXT4_ERROR_INODE(inode, "bad extent address "
- "iblock: %d, depth: %d pblock %lld",
- map->m_lblk, depth, path[depth].p_block);
+ "lblock: %lu, depth: %d pblock %lld",
+ (unsigned long) map->m_lblk, depth,
+ path[depth].p_block);
err = -EIO;
goto out2;
}
@@ -3501,8 +3502,8 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
if (unlikely(ext4_test_inode_flag(inode, EXT4_INODE_EOFBLOCKS))) {
if (unlikely(!eh->eh_entries)) {
EXT4_ERROR_INODE(inode,
- "eh->eh_entries == 0 ee_block %d",
- ex->ee_block);
+ "eh->eh_entries == 0 and "
+ "EOFBLOCKS_FL set");
err = -EIO;
goto out2;
}
--
1.6.6.1.1.g974db.dirty
--
Dimitry Monakhov discovered an edge case where it was possible for the EXT4_EOFBLOCKS_FL flag could get cleared unnecessarily. This is true; I have a test case that can be exercised via downloading and decompressing the file: wget ftp://ftp.kernel.org/pub/linux/kernel/people/tytso/ext4-testcases/eofblocks-fl-test-ca... bunzip2 eofblocks-fl-test-case.img dd if=/dev/zero of=eofblocks-fl-test-case.img bs=1k seek=17925 bs=1k count=1 conv=notrunc However, triggering it in real life is highly unlikely since it requires an extremely fragmented sparse file with a hole in exactly the right place in the extent tree. (It actually took quite a bit of work to generate this test case.) Still, it's nice to get even extreme corner cases to be correct, so this patch makes sure that we don't clear the EXT4_EOFBLOCKS_FL incorrectly even in this corner case. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> --- fs/ext4/extents.c | 18 +++++++++++++++--- 1 files changed, 15 insertions(+), 3 deletions(-) diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c index aeec5c7..c7c304f 100644 --- a/fs/ext4/extents.c +++ b/fs/ext4/extents.c @@ -3319,7 +3319,7 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode, struct ext4_extent_header *eh; struct ext4_extent newex, *ex, *last_ex; ext4_fsblk_t newblock; - int err = 0, depth, ret, cache_type; + int i, err = 0, depth, ret, cache_type; unsigned int allocated = 0; struct ext4_allocation_request ar; ext4_io_end_t *io = EXT4_I(inode)->cur_aio_dio; @@ -3508,8 +3508,20 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode, goto out2; } last_ex = EXT_LAST_EXTENT(eh); - if (map->m_lblk + ar.len > le32_to_cpu(last_ex->ee_block) - + ext4_ext_get_actual_len(last_ex)) + /* + * If the current leaf block was reached by looking at + * the last index block all the way down the tree, and + * we are extending the inode beyond the last extent + * in the current leaf block, then clear the + ...
This condition was triggered during fsstress test. So I consider it as rare but possible in real life. Nor than less it is better to fix it now, than fix it in response from a midnight call from some --
Fair enough, I just didn't have an 8-core system handy when I was dealing with it, so hand-crafted an fs image. BTW, whatever happened to the xfstests patch that you created in https://bugzilla.kernel.org/show_bug.cgi?id=15792 Did you submit it to the xfstests folks for inclusion? - Ted --
Yes but changes was requested, so i've prepared new version. I'll post it against Jan's xfsqa quota branch. --
Is there a reason you don't just base it against the xfstests mainline and send the patch directly to xfs@oss.sgi.com? Your patch isn't related to quota or has any dependency on other changes that might be in Jan's branch, right? Or am I missing something? - Ted --
They are related because test number always increasing, and i want to use the test to run with and w/o quota enabled. --
