Commit Graph

314 Commits (a5abfe0382729fba2c5fea6aaae486cb8bc98f00)

Author SHA1 Message Date
Darrick J. Wong a5abfe0382 e2fsck: read-ahead metadata during passes 1, 2, and 4
e2fsck pass1 is modified to use the block group data prefetch function
to try to fetch the inode tables into the pagecache before it is
needed.  We iterate through the blockgroups until we have enough inode
tables that need reading such that we can issue readahead; then we sit
and wait until the last inode table block read of the last group to
start fetching the next bunch.

pass2 is modified to use the dirblock prefetching function to prefetch
the list of directory blocks that are assembled in pass1.  We use the
"iterate a subset of a dblist" and avoid copying the dblist.  Directory
blocks are fetched incrementally as we walk through the directory
block list.  In previous iterations of this patch we would free the
directory blocks after processing, but the performance hit to e2fsck
itself wasn't worth it.  Furthermore, it is anticipated that most
users will then mount the FS and start using the directories, so they
may as well remain in the page cache.

pass4 is modified to prefetch the block and inode bitmaps in
anticipation of pass 5, because pass4 is entirely CPU bound.

In general, these mechanisms can decrease fsck time by 10-40%, if the
host system has sufficient memory and the storage system can provide a
lot of IOPs.  Pretty much any storage system capable of handling
multiple IOs in-flight at any time will see a fairly large performance
boost.  (Single-issue USB mass storage disks seem to suffer badly.)

By default, the readahead buffer size will be set to the size of a block
group's inode table (which is 2MiB for a regular ext4 FS).  The -E
readahead_kb= option can be given to specify the amount of memory to
use for readahead or zero to disable it entirely; or an option can be
given in e2fsck.conf.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-04-21 10:40:21 -04:00
Darrick J. Wong 76761ca221 e2fsck: turn inline data symlink into a fast symlink when possible
When there's a problem accessing the EA part of an inline data symlink
and we want to truncate the symlink back to 60 characters (hoping the
user can re-establish the link later on, apparently) be sure to turn
off the inline data flag to convert the symlink back to a regular fast
symlink.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-04-20 21:48:02 -04:00
Theodore Ts'o 4a05268cf8 Remove compression support
The compression patches were an out-of-kernel patch set that was (a)
only available for ext2, (b) something that was never could be
stablized due to file system corruption, and (c) the most recent
patches were for 3.1, last updated in 2011.

The history of the compression patches has been a bit checkered.
There is a long history here at http://e2compr.sourceforge.net which
lists the perspective of the people working on it from the e2compr
side.

From the ext2/3/4 mainline developers' perspective, initial
compression support was added to e2fsprogs in 2000 (in the Linux 2.2
era), but due to stability concerns the kernel patches were never
merged into the mainline kernel.  While there were some sporadic
efforts to try to get the ext2 compression patches working in the 2.4
and 2.6 era, by that time mainline work had moved on to ext4, and the
e2compr approach could only work with 32-bit block numbers and
indirect mapped files.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-04-12 08:42:40 -04:00
Darrick J. Wong dbb328576d e2fsck: actually fix inline_data flags problems when user says to do so
fix_problem() returning 1 means to fix the fs error, so do that.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-03-29 00:04:46 -04:00
Darrick J. Wong fae2467fb6 libext2fs: ext2fs_new_block2() should call alloc_block hook
If ext2fs_new_block2() is called without a specific block map, we
should call the alloc_block hook before checking fs->block_map.  This
helps us to avoid a bug in e2fsck where we need to allocate a block
but instead of consulting block_found_map, we use the FS bitmaps,
which (prior to pass 5) could be wrong.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-03-28 23:58:20 -04:00
Theodore Ts'o 62ad24802c e2fsck: handle encrypted directories which are indexed using htree
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-03-08 19:09:52 -04:00
Theodore Ts'o dbff534ec6 e2fsck: suppress bad name checks for encrypted directories
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-02-23 17:44:23 -05:00
Theodore Ts'o ad5d05d645 Merge branch 'maint' into next 2015-02-16 10:17:21 -05:00
Darrick J. Wong e274cc39b9 e2fsck: improve the inline directory detector
Strengthen the checks that guess if the inode we're looking at is an
inline directory.  The current check sweeps up any inline inode if
its length is a multiple of four; now we'll at least try to see if
there's the beginning of a valid directory entry.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-01-28 11:37:44 -05:00
Darrick J. Wong 0ac4b3973f e2fsck: inspect inline dir data as two directory blocks
The design of inline directories (apparently) calls for the i_block[]
region and the EA regions to be treated as if they were two separate
blocks of dirents.  Effectively this means that it is impossible for a
directory entry to straddle both areas.  e2fsck doesn't enforce this,
so teach it to do so.  e2fslib already knows to do this....

Cc: Zheng Liu <gnehzuil.liu@gmail.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-01-28 09:00:13 -05:00
Darrick J. Wong 5e61441a40 e2fsck: handle multiple *ind block collisions with critical metadata
An earlier patch tried to detect indirect blocks that conflicted with
critical FS metadata for the purpose of preventing corrections being
made to those indirect blocks.  Unfortunately, that patch cannot
handle more than one conflicting *ind block per file; therefore, use
the ref_block parameter to test the metadata block map to decide if
we need to avoid fixing the *ind block when we're iterating the
block's entries.  (We have to iterate the block to capture any blocks
that the block points to, as they could be in use.)

As a side note, in 1B we'll reallocate all those conflicting *ind
blocks and restart fsck, so the contents will be checked eventually.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-01-27 13:12:59 -05:00
Darrick J. Wong 3ee2946581 e2fsck: clear i_block[] when there are too many bad mappings on a special inode
If we decide to clear a special inode because of bad mappings, we need
to zero the i_block array.  The clearing routine depends on setting
i_links_count to zero to keep us from re-checking the block maps,
but that field isn't checked for special inodes.  Therefore, if we
haven't erased the mappings, check_blocks will restart fsck and fsck
will try to check the blocks again, leading to an infinite loop.

(This seems easy to trigger if the bootloader inode extent map is
corrupted.)

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-01-27 13:12:31 -05:00
Justus Winter 36769c606c e2fsck: fix corruption of Hurd filesystems
Previously, e2fsck accessed the field osd2.linux2.l_i_file_acl_high
field without checking that the filesystem is indeed created for
Linux.  This lead to e2fsck constantly complaining about certain
nodes:

i_file_acl_hi for inode XXX (/dev/console) is 32, should be zero.

By "correcting" this problem, e2fsck would clobber the field
osd2.hurd2.h_i_mode_high.

Properly guard access to the OS dependent fields.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-01-23 10:15:57 -05:00
Darrick J. Wong 250304879a e2fsck: force-reread of inode from disk when re-checking a checksum error
When we're rechecking an inode checksum failure, we need to force the
inode to be re-read from disk so that the verification routine runs,
so drop the stashed inode.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-12-11 17:48:03 -05:00
Darrick J. Wong 08c8e319e3 libext2fs/e2fsck: refactor everyone who writes zero blocks to disk
Convert all call sites that write zero blocks to disk to use
ext2fs_zero_blocks2() since it can use Linux's zero out feature to do
the writes more quickly.  Reclaim the zero buffer at freefs time and
make the write-zeroes fallback use a larger buffer.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-10-13 04:31:17 -04:00
Darrick J. Wong 7ef1b8b424 e2fsck: fix sliding the directory block down on bigalloc
If we find a hole in a directory on a bigalloc filesystem, we need to
obey the cluster alignment rules when collapsing the gap to avoid
later complaints.

Specifically, the calculation of the new logical cluster number was
incorrect, and we need to ensure that the logical cluster alignment
respects the physical cluster alignment, since we've concluded that
the extent's logical block number is wrong.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-18 21:29:21 -04:00
Darrick J. Wong 7936236065 e2fsck: offer to clear overlapping extents
If in the course of iterating extents we find that an otherwise
valid-seeming second extent maps the same logical blocks as a
previously examined first extent, offer to clear the duplicate
mapping.

The test for this is already in f_extents.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-18 21:29:19 -04:00
Darrick J. Wong c6c681632e e2fsck: ignore badblocks if it says badblocks inode is bad
If the badblocks list says that the badblocks inode is bad, it's quite
likely that badblocks is broken.  Worse yet, if the root inode is in
the same block as the badblocks inode (likely since they're adjacent),
the filesystem becomes unfixable because pass3 notices the bad root
inode and exits.

So... if we encounter this case, just kill the badblocks inode.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-11 18:08:26 -04:00
Darrick J. Wong c4c9bc590c misc: fix gcc warnings
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-08-24 12:22:11 -04:00
Darrick J. Wong 4339c6d419 e2fsck: be more careful in assuming inline_data inodes are directories
If a file is marked inline_data but its i_size isn't a multiple of
four, it probably isn't an inline directory, because directory entries
have sizes that are multiples of four.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-08-10 18:49:37 -04:00
Darrick J. Wong ef9c58d572 e2fsck: do a better job of fixing i_size of inline directories
If we encounter a directory whose i_size != the inline data size, just
set i_size to the size of the inline data.  The pb.last_block
calculation is wrong since pb.last_block == -1, which results in
i_size being set to zero, which corrupts the directory.

Clear the inline_data inode flag if we actually /are/ setting i_size
to zero.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-08-10 18:44:07 -04:00
Darrick J. Wong fa441f91f1 e2fsck: fix conflicting extents|inlinedata inode flags
If we come across an inode with the inline data and extents inode flag
set, try to figure out the correct flag settings from the contents of
i_block and i_size.  If i_blocks looks like an extent tree head, we'll
make it an extent inode; if it's small enough for inline data, set it
to that.  This leaves the weird gray area where there's no extent
tree but it's too big for the inode -- if /could/ be a block map,
change it to that; otherwise, just clear the inode.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-08-10 18:43:24 -04:00
Darrick J. Wong 8dae07fb55 e2fsck: clear extents and inline_data flags from fifo/socket/device inodes
Since fifo, socket, and device inodes cannot have inline data or
extents, strip off these flags if we find them.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-08-10 18:41:07 -04:00
Darrick J. Wong 68073429d3 e2fsck: check inline directory data "block" first
Since the inline data flag will cause the extent/block map iteration
code to abort fsck early, move the test for the inode flag and the
actual block check call further forward in check_blocks.  This
eliminates an e2fsck abort on an inline data symlink when the file ACL
block is set.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-08-10 18:39:47 -04:00
Darrick J. Wong 04af897878 e2fsck: handle inline data symlinks
Perform some basic checks on inline-data symlinks.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-08-10 18:38:34 -04:00
Darrick J. Wong 49a3749ade e2fsck: clear inline_data inode flag if EA missing
If i_size indicates that an inode requires a system.data extended
attribute to hold overflow from i_blocks but the EA cannot be found,
offer to truncate the file.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-08-10 18:37:28 -04:00
Darrick J. Wong 78c666b832 e2fsck: check ea-in-inode regions for overlap
Ensure that the various blobs in the in-inode EA region do not overlap.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-08-10 18:34:43 -04:00
Darrick J. Wong 88334ce084 libext2fs/e2fsck: don't run off the end of the EA block
When we're (a) reading EAs into a buffer; (b) byte-swapping EA
entries; or (c) checking EA data, be careful not to run off the end of
the memory buffer, because this causes invalid memory accesses and
e2fsck crashes.  This can happen if we encounter a specially crafted
FS image.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-08-10 18:22:07 -04:00
Darrick J. Wong 6e3c3b7552 e2fsck: always ask to fix an inode that fails checksum verification
If an inode fails checksum verification during pass 1 and the user
doesn't fix or clear the inode as part of the regular inode checks,
ensure that e2fsck remembers to ask the user if he simply wants to
correct the checksum.

We weren't capturing all the ways out of an interation of the inode
scanning loop, which means that not all errors were caught.  Also,
we might as well clear the 'failed csum' flag if we write the inode
directly from the inode scanning loop.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-08-02 22:52:29 -04:00
Darrick J. Wong d4864e0204 e2fsck: disable checksum verification in a few select places
Selectively disable checksum verification in a couple more places:

In check_blocks, disable checksum verification when iterating a block
map because the block map iterator function (re)reads the inode, which
could be unchanged since the scan found that the checksum fails.  We
don't want to abort here; we want to keep evaluating the inode, and we
already know if the inode checksum doesn't match.

Further down in check_blocks when we're trying to see if i_size
matches the amount of data stored in the inode, don't allow checksum
errors when we go looking for the size of inline data.  If the
required attribute is at all find-able in the EA block, we'll fix any
other problems with the EA block later.  In the meantime, we don't
want to be truncating files unnecessarily.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-08-02 22:51:33 -04:00
Darrick J. Wong 68d70624e3 e2fsck: offer to clear inode table blocks that are insane
Add a new behavior flag to the inode scan functions; when specified,
this flag will do some simple sanity checking of entire inode table
blocks.  If all the checksums are ok, we can skip checksum
verification on individual inodes later on.  If more than half of the
inodes look "insane" (bad extent tree root or checksum failure) then
ext2fs_get_next_inode_full() can return a special status code
indicating that what's in the buffer is probably garbage.

When e2fsck' inode scan encounters the 'inode is garbage' return code
it'll offer to zap the inode straightaway instead of trying to recover
anything.  This replaces the previous behavior of asking to zap
anything with a checksum error (strict_csum).

Signed-off-by: Darrick J. Wong <darrick.wong@orale.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-08-02 22:46:16 -04:00
Darrick J. Wong 49fed79e7c e2fsck: try to salvage extent blocks with bad checksums
Remove the code that would zap an extent block immediately if the
checksum failed (i.e. strict_csums).  Instead, we'll only do that if
the extent block header shows obvious structural problems; if the
header checks out, then we'll iterate the block and see if we can
recover some extents.

Requires a minor modification to ext2fs_extent_get such that the
extent block will be returned in the buffer even if the return code
indicates a checksum error.  This brings its behavior in line with
the rest of libext2fs.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-08-02 22:32:11 -04:00
Darrick J. Wong 5b9cbd76df libext2fs: check EA block headers when reading in the block
When reading an EA block in from disk, do a quick sanity check of the
block header, and return an error if we think we have garbage.  Teach
e2fsck to ignore the new error code in favor of doing its own
checking, and remove the strict_csums bits while we're at it.

(Also document some assumptions in the new ext_attr code.)

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-08-02 22:32:11 -04:00
Darrick J. Wong 409f3884b5 e2fsck: never free critical metadata blocks in the block found map
Don't allow critical metadata blocks to be marked free in the block
found map.  This can theoretically happen on an FS where a first
inode's ETB/indirect map block is in the inode table, the first inode
is itself unclonable (and thus gets deleted) and there are enough
crosslinked files before and after the first inode to use up all the
free blocks during pass 1b.

(I do actually have a test FS image but it's 256M and it proved very
difficult to craft a bite-sized test case that actually hit this bug.)

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-08-02 22:18:29 -04:00
Darrick J. Wong ae23dd19d8 e2fsck: don't offer to fix the checksum of fixed extents
If an extent fails checksum and the sanity checks, and the user elects
to fix the extents, don't bother asking (the second time) if the user
would like to fix the checksum.  Refactor some redundant code to make
what's going on a little cleaner.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-07-27 19:51:37 -04:00
Darrick J. Wong 492f901e2d e2fsck: clear badblocks inode when checksum fails
If the badblocks inode fails checksum verification, just clear the
inode and move on.  If we don't do this, we can end up importing a lot
of garbage into the badblocks list, which will then cause fsck to try
to regenerate anything that was sitting atop the supposedly damaged
blocks.  Given that most hardware will remap bad sectors transparently
from ext4, the number of people this could affect adversely is pretty
low.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-07-27 19:42:11 -04:00
Theodore Ts'o 87b9f5e3fe Merge branch 'maint' into next
Conflicts:
	e2fsck/pass1b.c
2014-07-26 16:53:37 -04:00
Darrick J. Wong 9a1d614df2 e2fsck: fix rule-violating lblk->pblk mappings on bigalloc filesystems
As far as I can tell, logical block mappings on a bigalloc filesystem are
supposed to follow a few constraints:

 * The logical cluster offset must match the physical cluster offset.
 * A logical cluster may not map to multiple physical clusters.

Since the multiply-claimed block recovery code can be used to fix these
problems, teach e2fsck to find these transgressions and fix them.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-07-26 16:27:41 -04:00
Darrick J. Wong 5eeb88585f e2fsck: fix merge error in "clear uninit flag on directory extents"
In the original patch (against -next), the hunk to fix uninit dirs was
just prior to the hunk labelled "Corrupt but passes checks?".  The
hunks are ordered this way so that if e2fsck obtains permission to fix
a failed-csum extent (which in turn fixes the checksum), it will not
subsequently ask to (re)fix the checksum.

Due to a merge error the hunk moved to the wrong place, so put it
back.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-07-26 16:03:10 -04:00
Theodore Ts'o 22302aa320 Merge branch 'maint' into next
Conflicts:
	debugfs/debugfs.c
	e2fsck/pass1.c
2014-07-26 15:57:42 -04:00
Darrick J. Wong b729b7dfab e2fsck: reserve blocks for root/lost+found directory repair
If we think we're going to need to repair either the root directory or
the lost+found directory, reserve a block at the end of pass 1 to
reduce the likelihood of an e2fsck abort while reconstructing
root/lost+found during pass 3.

If / and/or /lost+found are corrupt and duplicate processing in pass
1b allocates all the free blocks in the FS, fsck aborts with an
unusable FS since pass 3 can't recreate / or /lost+found.  If either
of those directories are missing, an admin can't easily mount the FS
and access the directory tree to move files off the injured FS and
free up space; this in turn prevents subsequent runs of e2fsck from
being able to continue repairs of the FS.

(One could migrate files manually with debugfs without the help of
path names, but it seems easier if users can simply mount the FS and
use regular FS management tools.)

[ Fixed up an obvious C trap: const char * and const char [] are not
  the same thing when you are taking the size of the parameter.
  People, run your regression tests!  Like spinach, it's good for you.  :-)
  -- tytso ]

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-07-26 15:45:42 -04:00
Darrick J. Wong 97c607b1a2 libext2fs: provide a function to set inode size
Provide an API to set i_size in an inode and take care of all required
feature flag modifications.  Refactor the code to use this new
function.

[ Moved the function to lib/ext2fs/blk_num.c, which is the rest of
  these sorts of functions live, and renamed it to be
  ext2fs_inode_size_set() instead of ext2fs_inode_set_size() to be
  consistent with the other functions in in blk_num.c -- tytso ]

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-07-26 14:34:56 -04:00
Theodore Ts'o e05a05630a Merge branch 'maint' into next
Conflicts:
	e2fsck/pass1.c
	e2fsck/problem.h
2014-07-25 08:58:10 -04:00
Darrick J. Wong 57b7fabc2e e2fsck: clear uninit flag on directory extents
Directories can't have uninitialized extents, so offer to clear the
uninit flag when we find this situation.  The actual directory blocks
will be checked in pass 2 and 3 regardless of the uninit flag.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-07-25 08:50:23 -04:00
Darrick J. Wong c28c2741ba e2fsck: pass2 should not process directory blocks that are impossibly large
Currently, directories cannot be fallocated, which means that the only
way they get bigger is for the kernel to append blocks one by one.
Therefore, if we encounter a logical block offset that is too big, we
needn't bother adding it to the dblist for pass2 processing, because
it's unlikely to contain a valid directory block.  The code that
handles extent based directories also does not add toobig blocks to
the dblist.

Note that we can easily cause e2fsck to fail with ENOMEM if we start
feeding it really large logical block offsets, as the dblist
implementation will try to realloc() an array big enough to hold it.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-07-25 08:41:11 -04:00
Darrick J. Wong 0733835bf7 e2fsck: always submit logical block 0 of a directory for pass 2
Always iterate logical block 0 in a directory, even if no physical
block has been allocated.  Pass 2 will notice the lack of mapping and
offer to allocate a new directory block; this enables us to link the
directory into lost+found.

Previously, if there were no logical blocks mapped, we would fail to
pick up even block 0 of the directory for processing in pass 2.  This
meant that e2fsck never allocated a block 0 and therefore wouldn't fix
the missing . and .. entries for the directory; subsequent e2fsck runs
would complain about (yet never fix) the problem.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-07-25 08:39:45 -04:00
Theodore Ts'o 60203cb171 Merge branch 'maint' into next
Conflicts:
	e2fsck/pass1.c
2014-07-25 08:38:39 -04:00
Darrick J. Wong 9f005a90f8 e2fsck: collapse holes in extent-based directories
If we notice a hole in the block map of an extent-based directory,
offer to collapse the hole by decreasing the logical block # of the
extent.  This saves us from pass 3's inefficient strategy, which fills
the holes by mapping in a lot of empty directory blocks.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-07-25 08:30:11 -04:00
Darrick J. Wong b4f724c8a9 e2fsck: fix off-by-one bounds check on group number
Since fs->group_desc_count is the number of block groups, the number
of the last group is always one less than this count.  Fix the bounds
check to reflect that.

This flaw shouldn't have any user-visible side effects, since the
block bitmap test based on last_grp later on can handle overbig block
numbers.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-07-24 22:19:27 -04:00
Darrick J. Wong b4a4088338 e2fsck: force all block allocations to use block_found_map
During the later passes of efsck, we sometimes need to allocate and
map blocks into a file.  This can happen either by fsck directly
calling new_block() or indirectly by the library calling new_block
because it needs to allocate a block for lower level metadata (bmap2()
with BMAP_SET; block_iterate3() with BLOCK_CHANGED).

We need to force new_block to allocate blocks from the found block
map, because the FS block map could be inaccurate for various reasons:
the map is wrong, there are missing blocks, the checksum failed, etc.

Therefore, any time fsck does something that could to allocate blocks,
we need to intercept allocation requests so that they're sourced from
the found block map.  Remove the previous code that swapped bitmap
pointers as this is now unneeded.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-07-24 22:16:59 -04:00