Commit Graph

1804 Commits (6509eebb6365de75915a197a4672442d4b5a230e)

Author SHA1 Message Date
Darrick J. Wong 6509eebb63 libext2fs: set interior tree block goal more intelligently
When we're splitting an extent node, try to allocate the new interior
tree block just prior to the first extent in the block we're trying to
split.  The previous logic only set a goal block if we had to split
both the current node and its parent, which is somewhat infrequent.
When that would happen, the goal would start at zero, leading to poor
locality.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-12-13 20:14:14 -05:00
Darrick J. Wong 7b486ec08c libext2fs: find inode goal when allocating blocks
Try to be a little smarter about where we go to allocate blocks for a
inode.  For a given inode and logical offset, set the goal as if the
file were physically continuous.  If it's bmapped, just start looking
at wherever lblk 0 is.  If that's not possible (the file has no
lblk>pblk mappings, inline data, etc.) then start looking in the
inode's block group.

[ Fixed memory leak --tytso ]

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-12-13 20:07:13 -05:00
Theodore Ts'o bc57b123d6 libext2fs: use block_buf in ext2fs_alloc_block2() if it is provided
If the caller supplies a buffer to ext2fs_alloc_block2(), use it
instead of calling ext2fs_zero_blocks2().

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-12-12 22:12:45 -05:00
Darrick J. Wong 0a92af260d libext2fs: use a dynamically sized block zeroing buffer
Dynamically grow the block zeroing buffer to a maximum of 4MB, and
allow callers to provide their own zeroed buffer in
ext2fs_zero_blocks2().

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-12-12 19:28:35 -05:00
Dmitry Monakhov e50e985d6a ext2fs: fix integer overflow in rb_get_bmap_range
bmap_rb_extent is defined as __u64:blk __u64:count.  So count can
exceed INT_MAX on populated filesystems.

TESTCASE: xfstest ext4/004

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-12-11 17:57:35 -05:00
Darrick J. Wong dc7b8dad99 libext2fs: file IO routines should handle uninit blocks
The file IO routines do not handle uninit blocks at all.  The read
method should check for the uninit flag and return a buffer of zeroes,
and the write routine should convert unwritten extents.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-12-02 22:57:14 -05:00
Darrick J. Wong 3548bb64b5 libext2fs: refactor extent head creation
Don't open-code the creation of the extent tree header, since
ext2fs_extent_open2() knows how to take care of this.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-12-02 22:55:04 -05:00
Darrick J. Wong 54f6faf7f2 libext2fs: don't report garbage inodes with really large inodes
If the inode size is large enough that there are fewer than two inodes
per block, don't report an inode checksum failure as a garbage inode
during the scan because the "more than half are broken" criteria that
we use to decide if a block of inodes is garbage doesn't really apply.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-12-02 22:17:10 -05:00
Theodore Ts'o bbf29ce6e9 Merge branch 'maint' into next 2014-12-02 22:15:25 -05:00
Darrick J. Wong c9d6c22ded libext2fs: don't allow alloc_stats on bad inode/block numbers
Don't allow callers to feed bad block/inode numbers to
ext2fs_*_alloc_stats2, because evil callers (<cough>resize2fs<cough>)
can corrupt library state this way, leading to a crash.

(There will be a subsequent patch to resize2fs to fix its bad
behavior.)

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-11-17 17:59:42 -05:00
Darrick J. Wong c0ff3a21b6 libext2fs: set BLOCK_UNINIT for non-last blockgroups if all blocks are free
Set BLOCK_UNINIT in any group whose blocks are all unused, so long as
it isn't the last group.  This helps us speed up future e2fsck runs
and mounts because we don't need to read or checksum block bitmaps for
these groups.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-11-17 17:46:13 -05:00
Darrick J. Wong 407916f5af libext2fs: fix endian handling error; reduce fragmentation some
If we're going to read the "nr - 1" entry in an indirect block for use
as a "goal" input to the block allocator, we need to byteswap the
entry.  While we're at it, if we're allocating blocks for the zeroth
entry in the indirect block, we might as well use the indirect block
as the starting point to try to reduce fragmentation.

(d_fallocate_blkmap will test this...)

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-11-07 21:27:53 -05:00
Darrick J. Wong 180f376b04 misc: fix compiler warnings and minor build errors
Fix some gcc-4.8 warnings and other problems that broke the build.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-11-07 21:23:41 -05:00
Darrick J. Wong 12406b37b2 libext2fs: fix endian checking bits
Commit 3e683eef93 ("define bitwise types and annotate conversion
routines") broke the build on various platforms.  Turns out that
crossing our fingers wasn't such a good idea, so just define it
separately.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-11-05 11:08:32 -05:00
Darrick J. Wong bab25cb7a7 libext2fs: zero the EA block buffer before filling it
When writing an extended attribute (EA) block, it's quite possible
that the EA formatting code will not write the entire buffer.
Therefore, we must zero the buffer beforehand to avoid writing random
heap contents to disk.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reported-by: Sami Liedes <sami.liedes@iki.fi>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-11-04 11:47:30 -05:00
Theodore Ts'o dfa667dab6 Merge branch 'maint' into next
Conflicts:
	lib/ext2fs/dir_iterate.c
2014-11-04 11:46:55 -05:00
Darrick J. Wong 8d5324c43f libext2fs: don't memcpy identical pointers when writing a cache block
Sami Liedes found a scenario where we could memcpy incorrectly:

If a block read fails during an e2fsck run, the UNIX IO manager will
call the io->read_error routine with a pointer to the internal block
cache.  The e2fsck read error handler immediately tries to write the
buffer back out to disk(!), at which point the block write code will
try to copy the buffer contents back into the block cache.  Normally
this is fine, but not when the write buffer is the cache itself!

So, plumb in a trivial check for this condition.  A more thorough
solution would pass a duplicated buffer to the IO error handlers, but
I don't know if that happens frequently enough to be worth the extra
point of failure.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reported-by: Sami Liedes <sami.liedes@iki.fi>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-11-04 11:43:08 -05:00
Darrick J. Wong dab7435917 libext2fs: directory iteration mustn't walk off the buffer end
When we're iterating a directory, the loop control code reads the
length of the next directory record, failing to account for the fact
that there must be at least 8 bytes (the minimum size of a directory
entry) left in the buffer to read the next directory record.  Fix the
loop conditional so that we don't read off the end of the buffer.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reported-by: Sami Liedes <sami.liedes@iki.fi>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-11-04 11:39:51 -05:00
Eric Sandeen a7db275f31 quotaio: annotate & fix up for sparse endian checker
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-11-04 11:26:24 -05:00
Eric Sandeen 8f358e58fe libext2: minor sparse endian checker fixup
The sparse checker treats 0 assignments as special, but
doesn't catch a = b = 0; separate them to make it quieter.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-11-04 11:26:23 -05:00
Eric Sandeen 1224c0d3ea endian-annotate most on-disk structures
This annotates most on-disk structures for endianness;
however it does not annotate some, like the superblock, inodes,
mmp, etc, as these are swapped in-place at this point.  This is
a little inconsistent, but should help catch some endian mistakes,
at least.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-11-04 11:24:56 -05:00
Eric Sandeen 387e03160c libext2fs: fix endian handling of ext3_extent_header in inline_data
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
2014-11-04 11:24:50 -05:00
Eric Sandeen 3e683eef93 define bitwise types and annotate conversion routines
This lays the groundwork for sparse-checking e2fsprogs for
endianness; defines bitwise types, and fixes up the ext2fs_*
swapping routines to do the proper casts.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-11-04 11:24:44 -05:00
Theodore Ts'o 8b779489ea Merge branch 'maint' into next
Conflicts:
	configure
2014-11-04 11:20:09 -05:00
Eric Sandeen 160f131dee libext2fs: fix endian handling of ext3_extent_header
This turned up when trying to resize a filesystem containing
a file with many extents on PPC64.

Fix all locations where ext3_extent_header members aren't
handled in an endian-safe manner.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
2014-11-04 11:12:45 -05:00
Theodore Ts'o 441eb337a8 util: allow subst to build on systems that do not have utimes()
Make subst more portable so it can deal with such oler systems that do
not have utimes().  Note that it is important that subst build
correctly without an autoconf-generated config.h (since that is what
happens on a cross-compile), as well as using whatever features are
available as determined by autoconf when doing a native build.  We
currently assume the presence of utime(), but not utimes() or
futimes().

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-10-19 22:13:09 -04:00
Darrick J. Wong 08c8e319e3 libext2fs/e2fsck: refactor everyone who writes zero blocks to disk
Convert all call sites that write zero blocks to disk to use
ext2fs_zero_blocks2() since it can use Linux's zero out feature to do
the writes more quickly.  Reclaim the zero buffer at freefs time and
make the write-zeroes fallback use a larger buffer.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-10-13 04:31:17 -04:00
Theodore Ts'o 074931ab76 libext2fs: use ~0UL instead of -1UL to avoid static checker warnings
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-21 15:36:57 -04:00
Darrick J. Wong b291c11f08 misc: use libmagic when libblkid can't identify something
If we're using check_plausibility() to try to identify something that
obviously isn't an ext* filesystem and libblkid doesn't know what it
is, try libmagic instead.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-20 23:42:19 -04:00
Darrick J. Wong c8b20b40eb misc: add plausibility checks to debugfs/tune2fs/dumpe2fs/e2fsck
If any of these utilities detect a bad superblock magic, call
check_plausibility to see if blkid can identify the passed-in argument
as something else (xfs, partition, etc.) in the hopes of catching a
user error.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-19 23:44:31 -04:00
Darrick J. Wong b598c517b3 misc: move check_plausibility into a separate file
Move check_plausibility() into a separate file so that various
programs can use it without having to declare useless global variables
that the util.c functions seem to require.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-19 13:10:21 -04:00
Andreas Dilger ca209dc625 ext2fs: add readahead method to improve scanning
Add a readahead method for prefetching ranges of disk blocks.  This is
useful for inode table scanning, and other large contiguous ranges of
blocks, and may also prove useful for random block prefetch, since it
will allow reordering of the IO without waiting synchronously for the
reads to complete.

It is currently using the posix_fadvise(POSIX_FADV_WILLNEED)
interface, as this proved most efficient during our testing.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-19 12:16:08 -04:00
Theodore Ts'o cc0d983303 Fix build failures due to missing $(SYSLIBS)
Two link lines were missing $(SYSLIBS), which is needed for dietlibc.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-19 01:05:14 -04:00
Theodore Ts'o 1bbea9c909 Merge branch 'maint' into next 2014-09-18 21:28:59 -04:00
Darrick J. Wong d9112409a2 misc: zero s_jnl_blocks when adding journal online or removing external journal
Erase s_jnl_blocks when removing an external journal, or adding an
internal journal online.  We can't add the backup for the internal
journal because we have no good way to get the indirect block or ETB
addresses, so the best we can do is hope that the user runs e2fsck,
which will correct that.  We are motivated to erase during external
journal removal to state emphatically that there's no journal.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reported-by: thomas_reardon@hotmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-18 21:24:26 -04:00
Theodore Ts'o a2c664ae90 lib/ext2fs: fix Makefile to avoid a build splat when building without VPATH
When building in the source tree, the order of the includes caused the
compiling of debugfs/journal.c while in the lib/ext2fs directory to
find the version in lib/ext2fs instead of the desired version in
e2fsck/jfs_user.h.

We need to eventually get rid of this whole mess and have only one
jfs_user.h and build the journal-related functions once in an internal
library which is used only by e2fsprogs progams.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reported-by: "Darrick J. Wong" <darrick.wong@oracle.com>
2014-09-11 19:15:22 -04:00
Darrick J. Wong 551ab6d8e0 libext2fs: check ea value offset when loading
When reading extended attributes, check e_value_offs to make sure that
it starts in the value area and not the name area.  The attached test
case image will crash the kernel if it is mounted and you append more
than 4096 bytes of data to /a, due to insufficient validation.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-11 18:10:23 -04:00
Darrick J. Wong 463eb92131 debugfs: add the ability to write transactions to the journal
Extend debugfs with the ability to create transactions and replay the
journal.  This will eventually be used to test kernel recovery and
metadata_csum recovery.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-11 16:52:37 -04:00
Darrick J. Wong 759c46cf45 debugfs: create journal handling routines
Create a journal.c with routines adapted from e2fsck/journal.c to
handle opening and closing the journal, and setting up the
descriptors, and all that.  Unlike e2fsck's versions which try to
identify and fix problems, the routines here have no way to repair
anything.

[ Modified by tytso to fold debugfs/jfs_user.h into e2fsck/jfs_user.h,
  so we don't have to copy recovery.c and revoke.c into debugfs. --tytso ]

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-11 16:44:10 -04:00
Darrick J. Wong e690eae513 misc: zero s_jnl_blocks when removing internal journal
When we're removing the internal journal (broken journal, turning it
off, or adding an external journal), zero s_jnl_blocks so that they
can't be picked up by accident later.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Cc: TR Reardon <thomas_reardon@hotmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-11 12:40:55 -04:00
Darrick J. Wong fc06f25a10 libext2fs: write_journal_inode should check iterate return value
When creating a journal inode, check the return value from
block_iterate3() because otherwise we fail to capture errors such as
being unable to allocate an extent tree block, which leads to e2fsck
creating broken journals.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-11 12:40:54 -04:00
Darrick J. Wong f92c600c09 libext2fs: report bad magic over bad sb checksum
We don't want ext2fs_open2() to report bad sb checksum on something
that's not even an ext* superblock.  This apparently happens pretty
easily if we try to open an XFS filesystem.  Thus, make it so that a
bad magic number code always trumps the sb checksum error code.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-11 12:40:54 -04:00
Darrick J. Wong 38d5adf339 e2fsck/debugfs: fix descriptor block size handling errors with journal_csum
It turns out that there are some serious problems with the on-disk
format of journal checksum v2.  The foremost is that the function to
calculate descriptor tag size returns sizes that are too big.  This
causes alignment issues on some architectures and is compounded by the
fact that some parts of jbd2 use the structure size (incorrectly) to
determine the presence of a 64bit journal instead of checking the
feature flags.  These errors regrettably lead to the journal
corruption reported by Mr. Reardon.

Therefore, introduce journal checksum v3, which enlarges the
descriptor block tag format to allow for full 32-bit checksums of
journal blocks, fix the journal tag function to return the correct
sizes, and fix the jbd2 recovery code to use feature flags to
determine 64bitness.

Add a few function helpers so we don't have to open-code quite so
many pieces.

Switching to a 16-byte block size was found to increase journal size
overhead by a maximum of 0.1%, to convert a 32-bit journal with no
checksumming to a 32-bit journal with checksum v3 enabled.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reported-by: TR Reardon <thomas_reardon@hotmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-11 12:40:54 -04:00
Theodore Ts'o 330cebc0e9 Merge branch 'maint' into next
Conflicts:
	debugfs/debugfs.c
	e2fsck/Makefile.in
	lib/ext2fs/Makefile.in
	tests/test_config
2014-09-11 12:40:43 -04:00
Michael Forney 943e21cda8 compile_et: Allow user to override ET_DIR
Signed-off-by: Michael Forney <forney@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-08 19:02:12 -04:00
Michael Forney 53904ae543 Apply LDFLAGS when building tests
Signed-off-by: Michael Forney <forney@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-08 19:00:24 -04:00
Michael Forney 60abcd7394 tests: Add to LD_LIBRARY_PATH instead of overriding
Signed-off-by: Michael Forney <forney@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-08 18:55:42 -04:00
Darrick J. Wong 97f168b67e e2fsck: resync jbd2 revoke code from Linux 3.16
Synchronize e2fsck's copy of revoke.c with the kernel's copy in
fs/jbd2.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-08-26 23:43:20 -04:00
Darrick J. Wong 13af4b93fb e2fsck: resync jbd2 recovery code from Linux 3.16
Synchronize e2fsck's copy of recovery.c with the kernel's copy in
fs/jbd2.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-08-26 23:32:14 -04:00
Darrick J. Wong 2432a41a58 libext2fs: fix problems with LE<->BE conversions on BE platforms
Fix more problems that I found when testing on ppc64:

- Inode swap cut and paste error leads to immutable inodes being
  detected as inlinedata inodes, leading to e2fsck incorrectly barfing
  on i_block[] contents.

- Superblock csum/verify must be aware of the fs->super byte order
  when checking for metadata_csum feature flag.  (Hint: in _openfs(),
  fs->super is in LE order for the first csum verification)

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-08-24 22:01:36 -04:00