Don't say the physical block number is invalid if an extent block
fails only the checksum. It passes checks, so it's not invalid.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Fix some gcc-4.8 warnings and other problems that broke the build.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
e2fsck uses an array to store directory usage information during pass
3; the usage context also contains a pointer to the last directory
looked up. When expanding the dir_info array, this cache pointer
needs to be cleared if the array resize changed the pointer location,
or else we'll later walk off the end of this dead pointer.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reported-by: Sami Liedes <sami.liedes@iki.fi>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Sami Liedes reports that e2fsck fails to report the correct directory
inode number during a pass2 check for unexpected HTREE blocks.
Provide the inode number in the problem report.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reported-by: Sami Liedes <sami.liedes@iki.fi>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This lays the groundwork for sparse-checking e2fsprogs for
endianness; defines bitwise types, and fixes up the ext2fs_*
swapping routines to do the proper casts.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Convert all call sites that write zero blocks to disk to use
ext2fs_zero_blocks2() since it can use Linux's zero out feature to do
the writes more quickly. Reclaim the zero buffer at freefs time and
make the write-zeroes fallback use a larger buffer.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If we're using check_plausibility() to try to identify something that
obviously isn't an ext* filesystem and libblkid doesn't know what it
is, try libmagic instead.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If any of these utilities detect a bad superblock magic, call
check_plausibility to see if blkid can identify the passed-in argument
as something else (xfs, partition, etc.) in the hopes of catching a
user error.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
There is no reason to request a aligned buffer in
check_{inode,block}_bitmap, and this will cause failures for dietlibc,
which doesn't have support for posix_memalign() or any other way to
request an aligned memory allocation. Fortunately, this is only
needed in very few places where direct I/O is required.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The asm_types.h file needs to include stdio.h and stdlib.h in order to
get integer types included. So add those includes into jfs_user.h to
avoid a build faliure under dietlibc.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Free the buffer head if the journal descriptor block fails checksum
verification. This has been patched before (see "e2fsck: free bh on
csum verify error in do_one_pass") but apparently the patch was never
committed to jbd2 in the kernel, so when we resync'd the recovery code
with 3.16, the bug came back. Sigh.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
If we find a hole in a directory on a bigalloc filesystem, we need to
obey the cluster alignment rules when collapsing the gap to avoid
later complaints.
Specifically, the calculation of the new logical cluster number was
incorrect, and we need to ensure that the logical cluster alignment
respects the physical cluster alignment, since we've concluded that
the extent's logical block number is wrong.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If in the course of iterating extents we find that an otherwise
valid-seeming second extent maps the same logical blocks as a
previously examined first extent, offer to clear the duplicate
mapping.
The test for this is already in f_extents.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If there isn't space in the root directory to add the lost+found
entry, try expanding the root directory before failing the fsck.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If the badblocks list says that the badblocks inode is bad, it's quite
likely that badblocks is broken. Worse yet, if the root inode is in
the same block as the badblocks inode (likely since they're adjacent),
the filesystem becomes unfixable because pass3 notices the bad root
inode and exits.
So... if we encounter this case, just kill the badblocks inode.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The journal superblock's s_sequence field seems to track the tid of
the tail (oldest) transaction in the log. Therefore, when we release
the journal, set the s_sequence to the tail_sequence, because setting
it to the transaction_sequence means that we're setting the tid to
that of the head of the log. Granted, for replay these two are
usually the same (and s_start == 0 anyway) so thus far we've gotten
lucky and nobody noticed.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Create a journal.c with routines adapted from e2fsck/journal.c to
handle opening and closing the journal, and setting up the
descriptors, and all that. Unlike e2fsck's versions which try to
identify and fix problems, the routines here have no way to repair
anything.
[ Modified by tytso to fold debugfs/jfs_user.h into e2fsck/jfs_user.h,
so we don't have to copy recovery.c and revoke.c into debugfs. --tytso ]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
When we're removing the internal journal (broken journal, turning it
off, or adding an external journal), zero s_jnl_blocks so that they
can't be picked up by accident later.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Cc: TR Reardon <thomas_reardon@hotmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Verify the (ext4) superblock checksum of an external journal device
and prompt to correct the checksum if nothing else is wrong with the
superblock.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Cc: TR Reardon <thomas_reardon@hotmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
It turns out that there are some serious problems with the on-disk
format of journal checksum v2. The foremost is that the function to
calculate descriptor tag size returns sizes that are too big. This
causes alignment issues on some architectures and is compounded by the
fact that some parts of jbd2 use the structure size (incorrectly) to
determine the presence of a 64bit journal instead of checking the
feature flags. These errors regrettably lead to the journal
corruption reported by Mr. Reardon.
Therefore, introduce journal checksum v3, which enlarges the
descriptor block tag format to allow for full 32-bit checksums of
journal blocks, fix the journal tag function to return the correct
sizes, and fix the jbd2 recovery code to use feature flags to
determine 64bitness.
Add a few function helpers so we don't have to open-code quite so
many pieces.
Switching to a 16-byte block size was found to increase journal size
overhead by a maximum of 0.1%, to convert a 32-bit journal with no
checksumming to a 32-bit journal with checksum v3 enabled.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reported-by: TR Reardon <thomas_reardon@hotmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If the reallocation of dir_info fails, we will eventually cause e2fsck
to fail with an internal error. So if the realloc fails, print a
message and bail out with a fatal error early when at the time of the
reallocation failure.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
When recovering the journal, don't fall into an infinite loop if we
encounter a corrupt journal block. Instead, just skip the block and
proceed with the full filesystem fsck.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Synchronize e2fsck's copy of revoke.c with the kernel's copy in
fs/jbd2.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Synchronize e2fsck's copy of recovery.c with the kernel's copy in
fs/jbd2.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
On big-endian systems, if the dirent swap routine finds a rec_len that
it doesn't like, it continues processing the block as if rec_len == 8.
This means that the name field gets byte swapped, which means that
salvage will not detect the correct name length (unless the name has a
length that's an exact multiple of four bytes), and it'll discard the
entry (unnecessarily) and the rest of the dirent block. Therefore,
swap the rest of the block back to disk order, run salvage, and
re-swap anything after the salvaged dirent.
The test case for this is f_inlinedata_repair if you run it on a BE
system.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
ext2fs_flush2() unconditionally writes the block group descriptors to
disk even if the underlying FS isn't marked dirty. This causes the
following error message on a fsck -n run:
e2fsck 1.43-WIP (09-Jul-2014)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Error writing block 2 (Attempt to write block to filesystem resulted in short write). Ignore error? no
Error writing block 2 (Attempt to write block to filesystem resulted in short write). Ignore error? no
Error writing file system info: Attempt to write block to filesystem resulted in short write
Since ext2fs_close2() only calls flush if the dirty flag is set,
modify e2fsck to exhibit the same behavior so that we don't spit out
write errors for a read only check.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
In an inline directory, the '..' entry is compacted down to just the
inode number; there is no full '..' entry. Therefore, it makes no
sense to assign 'prev' to the fake dotdot entry we put on the stack,
as this could confuse a salvage_directory call on a corrupted next
entry into modifying stack contents (the fake dotdot entry).
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If a file is marked inline_data but its i_size isn't a multiple of
four, it probably isn't an inline directory, because directory entries
have sizes that are multiples of four.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Directory entries must have a size that's a multiple of 4; therefore
the inline directory structure must also have a size that is a muliple
of 4. Since e2fsck doesn't check this, we should check that now.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Now that the directory salvaging operation is fed the block size,
teach pass 2 that it should use the size of the inline data if the
directory is inline_data. Without this, it'll "fix" inline
directories by setting the rec_len to something approaching the FS
blocksize, which is clearly wrong.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If we encounter a directory whose i_size != the inline data size, just
set i_size to the size of the inline data. The pb.last_block
calculation is wrong since pb.last_block == -1, which results in
i_size being set to zero, which corrupts the directory.
Clear the inline_data inode flag if we actually /are/ setting i_size
to zero.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If we come across an inode with the inline data and extents inode flag
set, try to figure out the correct flag settings from the contents of
i_block and i_size. If i_blocks looks like an extent tree head, we'll
make it an extent inode; if it's small enough for inline data, set it
to that. This leaves the weird gray area where there's no extent
tree but it's too big for the inode -- if /could/ be a block map,
change it to that; otherwise, just clear the inode.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Since fifo, socket, and device inodes cannot have inline data or
extents, strip off these flags if we find them.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Inodes with inline_data set do not have iterable blocks, so don't try
to iterate the blocks, because that will just fail, causing e2fsck to
abort.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Since the inline data flag will cause the extent/block map iteration
code to abort fsck early, move the test for the inode flag and the
actual block check call further forward in check_blocks. This
eliminates an e2fsck abort on an inline data symlink when the file ACL
block is set.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Perform some basic checks on inline-data symlinks.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If i_size indicates that an inode requires a system.data extended
attribute to hold overflow from i_blocks but the EA cannot be found,
offer to truncate the file.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>