Since fifo, socket, and device inodes cannot have inline data or
extents, strip off these flags if we find them.
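As a rough illustration of the rule above (not the actual e2fsck code), the check can be sketched in Python. The function name is hypothetical; the two flag values are the ones defined in ext2_fs.h.

```python
import stat

# Flag values as defined in lib/ext2fs/ext2_fs.h.
EXT4_EXTENTS_FL = 0x00080000
EXT4_INLINE_DATA_FL = 0x10000000


def strip_special_inode_flags(i_mode, i_flags):
    """Clear the extents/inline-data bits on fifo, socket, and device inodes.

    Hypothetical helper: returns the (possibly corrected) flags word.
    """
    is_special = (stat.S_ISFIFO(i_mode) or stat.S_ISSOCK(i_mode)
                  or stat.S_ISCHR(i_mode) or stat.S_ISBLK(i_mode))
    if is_special:
        return i_flags & ~(EXT4_EXTENTS_FL | EXT4_INLINE_DATA_FL)
    return i_flags
```

Regular files and directories pass through unchanged; only the two flags that make no sense on special inodes are dropped.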
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If i_size indicates that an inode requires a system.data extended
attribute to hold overflow from i_block but the EA cannot be found,
offer to truncate the file.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Ensure that the various blobs in the in-inode EA region do not overlap.
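A minimal sketch of such an overlap check, assuming each blob in the EA region is described by a (name, offset, length) tuple. This representation and the function name are illustrative only and do not reflect the on-disk structures.

```python
def find_overlaps(regions):
    """Return (earlier, later) name pairs for regions that overlap.

    regions: list of (name, offset, length) tuples; a hypothetical
    stand-in for the EA header, entries, and values.
    """
    overlaps = []
    max_end = 0        # furthest byte claimed so far
    max_name = None    # region that claimed it
    for name, off, length in sorted(regions, key=lambda r: r[1]):
        if max_name is not None and off < max_end:
            overlaps.append((max_name, name))
        if off + length > max_end:
            max_end, max_name = off + length, name
    return overlaps
```

Sorting by offset means each region only has to be compared against the furthest-reaching region seen so far.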
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
In pass 3, convert the "delete files and re-run e2fsck" message to a
proper error code for more consistent error reporting and to make
translation easier.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Add a new behavior flag to the inode scan functions; when specified,
this flag will do some simple sanity checking of entire inode table
blocks. If all the checksums are ok, we can skip checksum
verification on individual inodes later on. If more than half of the
inodes look "insane" (bad extent tree root or checksum failure) then
ext2fs_get_next_inode_full() can return a special status code
indicating that what's in the buffer is probably garbage.
When e2fsck's inode scan encounters the 'inode is garbage' return code
it'll offer to zap the inode straightaway instead of trying to recover
anything. This replaces the previous behavior of asking to zap
anything with a checksum error (strict_csum).
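The "more than half" heuristic can be sketched as follows. This is an illustration only: the function name and the three result labels are made up here, and the real sanity test (extent tree root, checksum) is reduced to one boolean per inode.

```python
def classify_inode_table_block(insane_flags):
    """Classify one inode table block.

    insane_flags: one boolean per inode in the block, True if the inode
    failed the simple sanity check. Returns a hypothetical label:
      "clean"              - nothing looked wrong, later checksum checks can be skipped
      "garbage"            - more than half the inodes look insane
      "check_individually" - some failures, examine each inode as usual
    """
    if not any(insane_flags):
        return "clean"
    if 2 * sum(insane_flags) > len(insane_flags):
        return "garbage"
    return "check_individually"
```

Note that exactly half is not enough to declare the block garbage; the threshold is strictly more than half.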
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Remove the code that would prompt the user to zap directory entry
blocks with bad checksums (i.e. strict_csums). Instead, we'll run the
directory entries through the usual repair routines in an attempt to
save whatever we can. At the same time, refactor the code that
schedules the repair of missing dirblock checksum entries.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Remove the code that would zap an extent block immediately if the
checksum failed (i.e. strict_csums). Instead, we'll only do that if
the extent block header shows obvious structural problems; if the
header checks out, then we'll iterate the block and see if we can
recover some extents.
Requires a minor modification to ext2fs_extent_get such that the
extent block will be returned in the buffer even if the return code
indicates a checksum error. This brings its behavior in line with
the rest of libext2fs.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
When reading an EA block in from disk, do a quick sanity check of the
block header, and return an error if we think we have garbage. Teach
e2fsck to ignore the new error code in favor of doing its own
checking, and remove the strict_csums bits while we're at it.
(Also document some assumptions in the new ext_attr code.)
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If we're totally unable to allocate a lost+found directory, ask the
user if he would like to dump orphaned files in the root directory.
Hopefully this enables the user to delete enough files so that a
subsequent run of e2fsck will make more progress. Better to cram lost
files in the rootdir than the current behavior, which is to fail at
linking them in, thereby leaving them as lost files.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
As far as I can tell, logical block mappings on a bigalloc filesystem are
supposed to follow a few constraints:
* The logical cluster offset must match the physical cluster offset.
* A logical cluster may not map to multiple physical clusters.
Since the multiply-claimed block recovery code can be used to fix these
problems, teach e2fsck to find these transgressions and fix them.
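The two constraints can be expressed as a small checker. This is a sketch under the assumption that mappings are given as (logical block, physical block) pairs and that cluster_ratio is the number of blocks per cluster; the function name and the reason strings are illustrative.

```python
def check_bigalloc_mappings(mappings, cluster_ratio):
    """Return a list of (lblk, pblk, reason) for mappings that break the rules.

    mappings: iterable of (lblk, pblk) pairs for one file.
    cluster_ratio: blocks per cluster (assumed input).
    """
    problems = []
    seen = {}  # logical cluster -> first physical cluster it mapped to
    for lblk, pblk in mappings:
        # Rule 1: the offset within the cluster must be the same on both sides.
        if lblk % cluster_ratio != pblk % cluster_ratio:
            problems.append((lblk, pblk, "offset mismatch"))
        # Rule 2: one logical cluster maps to exactly one physical cluster.
        lcluster, pcluster = lblk // cluster_ratio, pblk // cluster_ratio
        if seen.setdefault(lcluster, pcluster) != pcluster:
            problems.append((lblk, pblk, "multiple physical clusters"))
    return problems
```

With four blocks per cluster, for example, logical blocks 0 and 2 both belong to logical cluster 0, so they must land in the same physical cluster.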
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Directories can't have uninitialized extents, so offer to clear the
uninit flag when we find this situation. The actual directory blocks
will be checked in pass 2 and 3 regardless of the uninit flag.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If we notice a hole in the block map of an extent-based directory,
offer to collapse the hole by decreasing the logical block # of the
extent. This saves us from pass 3's inefficient strategy, which fills
the holes by mapping in a lot of empty directory blocks.
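Collapsing a hole amounts to sliding each extent down so that it starts where the previous one ended. A minimal sketch, assuming extents are (logical start, physical start, length) tuples and that the first extent's logical start is left as found; the function name is hypothetical.

```python
def collapse_dir_holes(extents):
    """Return the extents with logical holes between them closed up.

    extents: list of (lblk, pblk, length) tuples for one directory.
    Only the logical block number changes; the physical blocks stay put.
    """
    fixed = []
    next_lblk = None  # logical block right after the previous extent
    for lblk, pblk, length in sorted(extents):
        if next_lblk is not None and lblk > next_lblk:
            lblk = next_lblk  # close the gap
        fixed.append((lblk, pblk, length))
        next_lblk = lblk + length
    return fixed
```

No new blocks are allocated, which is the point of preferring this over filling the holes with empty directory blocks.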
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If we encounter an inode with IND/DIND/TIND blocks or internal extent
tree blocks that point into critical FS metadata such as the
superblock, the group descriptors, the bitmaps, or the inode table,
it's quite possible that the validation code for those blocks is not
going to like what it finds, and it'll ask to try to fix the block.
Unfortunately, this happens before duplicate block processing (pass
1b), which means that we can end up doing stupid things like writing
extent blocks into the inode table, which multiplies e2fsck's
destructive effect and can render a filesystem unfixable.
To solve this, create a bitmap of all the critical FS metadata. If
before pass1b runs (basically check_blocks) we find a metadata block
that points into these critical regions, continue processing that
block, but avoid making any modifications, because we could be
misinterpreting inodes as block maps. Pass 1b will find the
multiply-owned blocks and fix that situation, which means that we can
then restart e2fsck from the beginning and actually fix whatever
problems we find.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
When pass1 finds blocks that are mapped to multiple files, it will
print every duplicated block. If there are long sequences of
duplicate blocks (e.g. the e_pblk field is wrong in an extent), this
can cause a gigantic flood of output when a range could convey the
same information. Therefore, teach pass1b to print ranges when
possible.
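The range compression itself is straightforward. The sketch below is illustrative: the function name and the "start-end" output format are assumptions, not the message format e2fsck actually uses.

```python
def format_block_ranges(blocks):
    """Collapse block numbers into 'start-end' ranges for compact reporting."""
    parts = []
    start = prev = None
    for blk in sorted(set(blocks)):
        if prev is not None and blk == prev + 1:
            prev = blk          # still inside the current run
            continue
        if start is not None:   # close the previous run
            parts.append(str(start) if start == prev else f"{start}-{prev}")
        start = prev = blk
    if start is not None:       # close the final run
        parts.append(str(start) if start == prev else f"{start}-{prev}")
    return " ".join(parts)
```

A run of thousands of consecutive duplicate blocks then costs one token of output instead of one per block.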
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
In e2fsck_expand_directory() we don't handle a dir with inline data
because when this function is called the directory inode shouldn't
contain inline data.
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Since it's impossible to address all blocks of a 64bit filesystem
without extents, have e2fsck turn on the feature if it finds (64bit &&
!extents).
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
e2fsck does not detect extents which are outside their location in the
extent tree. This can result in a bad extent at the end of an extent-block
not being detected.
From a part of a dump_extents output:
  1/ 2  37/ 68  143960 - 146679  123826181              2720
  2/ 2   1/  2  143960 - 146679  123785816 - 123788535  2720
  2/ 2   2/  2  146680 - 147583  123788536 - 123789439   904 Uninit <-bad extent
  1/ 2  38/ 68  146680 - 149391  123826182              2712
  2/ 2   1/  2  146680 - 147583      18486 -     19389   904
  2/ 2   2/  2  147584 - 149391  123789440 - 123791247  1808
e2fsck does not detect this bad extent which both overlaps another, valid
extent, and is invalid by being beyond the end of the extent above it in
the tree.
This patch modifies e2fsck to detect this invalid extent and remove it.
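The condition being tested can be sketched with the numbers from the dump above. This assumes the parent index entry covers logical blocks from its own start up to the start of the next index entry; the function name and tuple layout are illustrative.

```python
def extents_outside_parent(parent_start, next_parent_start, extents):
    """Return the extents not fully inside [parent_start, next_parent_start).

    extents: list of (lblk, length) tuples from one leaf block.
    """
    return [(lblk, length) for lblk, length in extents
            if lblk < parent_start or lblk + length > next_parent_start]


# Leaf under index entry 37/68: the parent covers 143960..146679 because
# the next index entry (38/68) starts at 146680.
bad = extents_outside_parent(143960, 146680,
                             [(143960, 2720), (146680, 904)])
```

Here `bad` contains only the 904-block extent starting at 146680, which is the one marked as the bad extent in the dump.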
Signed-off-by: David Jeffery <djeffery@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
An index node's logical start (ei_block) should
match the logical start of the first node (index
or leaf) below it. If we find a node whose start
does not match its parent, fix all of its parents
accordingly.
If e2fsck finds such a problem, it reports:
Pass 1: Checking inodes, blocks, and sizes
Interior extent node level 0 of inode 274258:
Logical start 3666 does not match logical start 4093 at next level. Fix<y>?
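A toy model of the repair, assuming each node is a dict with a "start" key and an optional "children" list. This is a simplification for illustration and not the libext2fs extent handle API.

```python
def fix_index_starts(node):
    """Make every index node's start equal its first child's start.

    Works bottom-up so a correction at a lower level propagates to all
    of its parents. Returns the number of nodes corrected.
    """
    children = node.get("children")
    if not children:
        return 0  # leaf: nothing below it to compare against
    fixes = sum(fix_index_starts(child) for child in children)
    if node["start"] != children[0]["start"]:
        node["start"] = children[0]["start"]
        fixes += 1
    return fixes


# The mismatch from the message above: 3666 at level 0, 4093 below it.
tree = {"start": 3666, "children": [{"start": 4093}, {"start": 5000}]}
corrected = fix_index_starts(tree)
```

After the call, `tree["start"]` has been updated to match the first node below it.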
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Check and handle MMP checksum problems by resetting the block.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Use the helper function to determine if group descriptors have a
checksum. Ensure that metadata_csum and uninit_bg flags are not set
simultaneously, as part of pass 0.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Verify the checksums of separate extended attribute blocks and offer
to clear the block if there is a mismatch.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Checks that directory leaf blocks have the necessary fake dir_entry at
the end of the block to hold a checksum and that the checksum is
valid. It will resize the block and/or rebuild the directory if
necessary.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Check htree internal node checksums. If broken, ask user to clear
the htree index and recreate it later.
[ Move the check for not rehashing the lost+found directory to pass1
so that we don't end up truncating lost+found when the metadata
checksum feature is enabled. -- TYT ]
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
When we encounter an extent tree block that passes the header check
but fails the checksum, offer to clear just that extent block instead
of failing the whole tree, which results in the entire inode being
wiped out.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Check block bitmap checksum and write a new checksum if the
verification fails. This is ok because e2fsck has already computed
the correct block bitmap.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Rewrite the inode bitmap when the checksum doesn't match. This is
ok since e2fsck will have already computed the correct inode bitmap.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Detect mismatches between an inode and its checksum, and prompt the user to
fix the situation.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Currently fsck recomputes quotas and overwrites quota files
whenever it's run. This causes unnecessary modification of the
filesystem even when quotas were never inconsistent. We also
lose the limits information because of this. With this patch,
e2fsck compares the computed quotas to the on-disk quotas
(while updating the in-memory limits) and writes out the
quota inode only if it is inconsistent.
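The compare-then-write idea can be sketched as follows, assuming both tables are dicts keyed by user or group id. The field names ("space", "inodes", "limits") and the function name are invented for the example.

```python
def reconcile_quota(computed, on_disk):
    """Copy on-disk limits into the computed table; return True if usage differs.

    computed: id -> {"space": int, "inodes": int}, as counted during the scan.
    on_disk:  id -> {"space": int, "inodes": int, "limits": ...}.
    A True result means the quota file would need to be rewritten.
    """
    empty = {"space": 0, "inodes": 0, "limits": None}
    inconsistent = False
    for qid in set(computed) | set(on_disk):
        calc = computed.setdefault(qid, {"space": 0, "inodes": 0})
        disk = on_disk.get(qid, empty)
        calc["limits"] = disk.get("limits")  # preserve the limits information
        if (calc["space"], calc["inodes"]) != (disk["space"], disk["inodes"]):
            inconsistent = True
    return inconsistent
```

Because the limits are carried over before any comparison, they survive even when the usage numbers do have to be rewritten.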
Signed-off-by: Aditya Kali <adityakali@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
We've decided to remove EOFBLOCKS_FL from the ext4 file system entirely,
because it is not actually very useful and it is causing more problems
than it solves. We're going to remove it from e2fsprogs first and then
after the new e2fsprogs version is common enough we can remove the
kernel part as well.
This commit changes e2fsck to not check for EOFBLOCKS_FL. Instead we
simply search for initialized extents past the i_size as this should not
happen. Uninitialized extents can be past the i_size as we can do
fallocate with KEEP_SIZE flag.
Also remove the EXT4_EOFBLOCKS_FL from lib/ext2fs/ext2_fs.h since it is
no longer needed.
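A simplified sketch of the replacement check, assuming extents are (logical start, length, uninit) tuples. The real e2fsck size logic has more cases; the function name is hypothetical.

```python
def initialized_extents_past_eof(i_size, block_size, extents):
    """Return initialized extents that reach beyond the last block of i_size.

    extents: list of (lblk, length, uninit) tuples.
    Uninitialized extents are allowed past i_size (fallocate with KEEP_SIZE).
    """
    eof_blocks = (i_size + block_size - 1) // block_size  # round up
    return [ext for ext in extents
            if not ext[2] and ext[0] + ext[1] > eof_blocks]
```

For an 8192-byte file with 4096-byte blocks, an uninitialized extent at logical blocks 2 to 5 is fine, while an initialized extent at block 6 would be reported.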
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Print the actual errors returned by ext2fs_open2() and
ext2fs_check_desc() before we fall back to the backup block group
descriptors so that it's easier to see if there is some obscure
failure that is causing e2fsck to think that it should use the backup
block group descriptors.
Addresses-Google-Bug: #6208183
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Add the ability to log messages about a file system to a specified
directory, using a file name template that can be specified in
/etc/e2fsck.conf. This allows us to suppress the output of overly
verbose e2fsck outputs while still allowing the full logging output to
go to an appropriate file.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
If an extent has e_len set to zero, the kernel will oops with a
BUG_ON. Unfortunately, e2fsck wasn't catching this case. The kernel
needs to be fixed to notice this case and call ext4_error() instead of
failing an assertion check, but e2fsck should catch this case and
repair it (by deleting the errant extent).
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Removing this check will allow us to eventually eliminate code from
the kernel which forcibly initialized the block bitmap when the inode
bitmap is first used. This would eliminate a required journal credit
and extra disk write.
Addresses-Google-Bug: #5944440
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
In some cases the bad block inode gets corrupted. If it looks insane,
offer to clear it before trying to interpret it does more harm than
good.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Multi-mount protection is a feature that allows mke2fs, e2fsck, and
others to detect if the filesystem is mounted on a remote node (on
SAN disks) and avoid corrupting the filesystem. For e2fsprogs this
means that it checks the MMP block to see if the filesystem is in use,
and marks the filesystem busy while e2fsck is running on the system.
This is useful on SAN disks that are shared between high-availability
servers, or accessible by multiple nodes that aren't in HA pairs. MMP
isn't intended to serve as a primary HA exclusion mechanism, but as a
failsafe to protect against user, software, or hardware errors.
There is no requirement that e2fsck updates the MMP block at regular
intervals, but e2fsck does this occasionally to provide useful
information to the sysadmin in case of a detected conflict.
For the kernel (since Linux 3.0) MMP adds a "heartbeat" mechanism to
periodically write to disk (every few seconds by default) to notify
other nodes that the filesystem is still in use and unsafe to modify.
Originally-by: Kalpak Shah <kalpak@clusterfs.com>
Signed-off-by: Johann Lombardi <johann@whamcloud.com>
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This patch adds support for doing quota accounting during full
e2fsck scan if the 'quota' feature was set on the superblock.
If user-visible quota inodes are in use, they will be hidden
and converted to the reserved quota inodes.
Signed-off-by: Aditya Kali <adityakali@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Commit 2a77a784a3 (first released in e2fsprogs 1.33) compared
superblock summary free blocks and inode counts with the allocation
bitmap counts before starting the file system check proper, and if
they differed, set the superblock and marked it as dirty. If no other
file system changes were required, this would cause a "*** FILE SYSTEM
WAS MODIFIED ***" message without any explanation of what e2fsck had
changed.
We fix this by only setting the superblock summary free block/inodes
counts if we are skipping a full check, and in non-preen mode, e2fsck
will now print an explicit message stating how the superblock had been
updated.
In a full check, any updates to the superblock free blocks/inodes
fields will be noted in pass5.
This change requires changing a few test results (essentially
reversing the changes made in commit 2a77a784a3).
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
There were a number of problems that were prompting the user whether
or not to ABORT, but then would abort regardless of whether the user
answered yes or no. Change those to be PROMPT_NONE, PR_FATAL.
Also, fix PR_1_RESIZE_INODE_CREATE so that it recovers appropriately
after failing to create the resize inode. This problem now uses
PROMPT_CONTINUE instead of PROMPT_ABORT, and if the user says, "no",
the code will abort.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Some kernels will crash if EOFBLOCKS_FL is set when it is not
needed, and if it is left set when it isn't needed, that is a sign
of a kernel bug.
Addresses-Google-Bug: #2604224
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>