Add tests to ensure that we know how to recover journals with the
csum_v2 feature set.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Test that we can write and replay transactions with the old journal
checksum algorithm.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Simple tests for the 32bit journal transaction creation code when
journal and metadata_csum are enabled. We test the following:
(a) writing and replaying transactions with multiple
descriptor blocks
(b) same, but with multiple revoke blocks.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Simple tests for the 64bit journal transaction creation code when
journal and metadata_csum are enabled. We test writing (bad) block
bitmaps out through the journal and replaying them via fsck, with a
few twists:
(a) All bitmaps are committed (fs errors reported)
(b) All the bitmap blocks are revoked (no errors)
(c) The transaction is never committed (no errors)
(d) Same as (a), but debugfs gets to do the replay.
We also test:
(a) writing and replaying transactions with multiple
descriptor blocks
(b) same, but with multiple revoke blocks.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Simple tests for the journal transaction creation code. We test
writing (bad) block bitmaps out through the journal and replaying them
via fsck, with a few twists:
(a) All bitmaps are committed (fs errors reported)
(b) All the bitmap blocks are revoked (no errors)
(c) The transaction is never committed (no errors)
(d) Same as (a), but debugfs gets to do the replay.
We also test:
(a) writing and replaying transactions with multiple
descriptor blocks
(b) same, but with multiple revoke blocks.
(c) adding the 64bit flag to a journal
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Display the feature flags of an external journal.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reported-by: TR Reardon <thomas_reardon@hotmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Spit out a more specific error if someone tries to modify an
external journal device.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reported-by: TR Reardon <thomas_reardon@hotmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Verify the (ext4) superblock checksum of an external journal device
and prompt to correct the checksum if nothing else is wrong with the
superblock.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Cc: TR Reardon <thomas_reardon@hotmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Enable mke2fs to create an external journal device with a superblock
checksum.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Cc: TR Reardon <thomas_reardon@hotmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
When creating a journal inode, check the return value from
block_iterate3() because otherwise we fail to capture errors such as
being unable to allocate an extent tree block, which leads to e2fsck
creating broken journals.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Also add the convenience macro $CLEAN_OUTPUT in test_config which can
be used to run the "sed -e $cmd_dir/filter.sed" command to clean up
e2fsprogs command output before comparing with the expected golden
output.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Add to ext2fs_symlink the ability to create inline data symlinks.
[ Modified by tytso to add more logging to the test script ]
Suggested-by: Pu Hou <houpu.hp@alibaba-inc.com>
Cc: Pu Hou <houpu.hp@alibaba-inc.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The following tests were using md5sum: i_e2image, u_mke2fs, and
u_tune2fs. Convert them to use crcsum for better portability (not all
environments have md5sum; some might have sha1sum instead :-)
For our purposes crcsum is quite sufficient.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
ext2fs_flush2() unconditionally writes the block group descriptors to
disk even if the underlying FS isn't marked dirty. This causes the
following error message on a fsck -n run:
e2fsck 1.43-WIP (09-Jul-2014)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Error writing block 2 (Attempt to write block to filesystem resulted in short write). Ignore error? no
Error writing block 2 (Attempt to write block to filesystem resulted in short write). Ignore error? no
Error writing file system info: Attempt to write block to filesystem resulted in short write
Since ext2fs_close2() only calls flush if the dirty flag is set,
modify e2fsck to exhibit the same behavior so that we don't spit out
write errors for a read only check.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Add a regression test to ensure that previous patches' fixes to e2fsck
do not revert.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If we come across an inode with the inline data and extents inode flag
set, try to figure out the correct flag settings from the contents of
i_block and i_size. If i_blocks looks like an extent tree head, we'll
make it an extent inode; if it's small enough for inline data, set it
to that. This leaves the weird gray area where there's no extent
tree but it's too big for the inode -- if /could/ be a block map,
change it to that; otherwise, just clear the inode.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Since fifo, socket, and device inodes cannot have inline data or
extents, strip off these flags if we find them.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Ensure that the various blobs in the in-inode EA region do not overlap.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Fix up the rest of the inline data code not to complain if there's no
EA, since it's possible that there's no EA because we're in the
process of creating an inline data file. Also, don't return an error
code when removing a nonexistent EA, because there's no reason to.
Furthermore, if we write less than 60 bytes of inline data, remove the
EA to avoid wasting space.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
In pass 3, convert the "delete files and re-run e2fsck" message to a
proper error code for more consistent error reporting and to make
translation easier.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This test checks to make sure resize2fs can properly handle a file
system which started life as a normal ext4 file system and then was
grown to a size where meta_bg was enabled, and then shrunk back below
the point where the meta_bg format is still needed.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The test verifies that e2fsck can properly fix a file system where the
value of s_first_meta_bg in the superblock is larger than the number
of block group descriptors in the file system. E2fsck will fix this
by clearing the meta_bg feature.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Place the allocation bitmaps and inode table blocks so they are
adjacent, even in the last flexbg.
Previously, after running "mke2fs -t ext4 DEV 286720", the layout of
the last few block groups would look like this:
Group 32: (Blocks 262145-270336) [INODE_UNINIT, ITABLE_ZEROED]
Block bitmap at 262145 (+0), Inode bitmap at 262161 (+16)
Inode table at 262177-262432 (+32)
Group 33: (Blocks 270337-278528) [INODE_UNINIT, BLOCK_UNINIT, ITABLE_ZEROED]
Block bitmap at 262146 (bg #32 + 1), Inode bitmap at 262162 (bg #32 + 17)
Inode table at 262433-262688 (bg #32 + 288)
Group 34: (Blocks 278529-286719) [INODE_UNINIT, ITABLE_ZEROED]
Block bitmap at 262147 (bg #32 + 2), Inode bitmap at 262163 (bg #32 + 18)
Inode table at 262689-262944 (bg #32 + 544)
Now, they look like this:
Group 32: (Blocks 262145-270336) [INODE_UNINIT, ITABLE_ZEROED]
Block bitmap at 262145 (+0), Inode bitmap at 262148 (+3)
Inode table at 262151-262406 (+6)
Group 33: (Blocks 270337-278528) [INODE_UNINIT, BLOCK_UNINIT, ITABLE_ZEROED]
Block bitmap at 262146 (bg #32 + 1), Inode bitmap at 262149 (bg #32 + 4)
Inode table at 262407-262662 (bg #32 + 262)
Group 34: (Blocks 278529-286719) [INODE_UNINIT, ITABLE_ZEROED]
Block bitmap at 262147 (bg #32 + 2), Inode bitmap at 262150 (bg #32 + 5)
Inode table at 262663-262918 (bg #32 + 518)
This reduces the free space fragmentation in a freshly created file
system. It also allows the following mke2fs command to succeed:
mke2fs -t ext4 -b 4096 -O ^resize_inode -G $((2**20)) DEV 2130483
(Note that while this allows people to run mke2fs with insanely large
flexbg sizes, this is not a recommended practice, as the kernel may
refuse to resize such a file system while mounted, since it currently
tries to allocate an in-memory data structure based on the size of the
flexbg, and so a file system with a very large flexbg size will cause
the memory allocation to fail. This will hopefully be fixed in a
future kernel release, but if the goal is to force all of the metadata
blocks to be at the beginning of the file system, it's better to use
the packed_meta_blocks configuration parameter in mke2fs.conf.)
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Add regression tests to e2fsck to examine how it deals with inode
table blocks which (a) have been zero'd; (b) have been one'd; (c) have
corrupt inodes with obvious problems; and (d) have inodes with
non-obvious problems.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Add tests to examine how e2fsck deals with (a) the block bitmap being
corrupt; (b) the inode bitmap being corrupt; (c) the bitmap checksums
being incorrect (but the bitmaps are fine); and (d) the group
descriptor checksum itself is incorrect.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Add regression tests to examine how e2fsck deals with random
superblock corruption such as obviously wrong fields and the checksum
itself being incorrect.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Add regression tests to examine how e2fsck deals with MMP blocks with
(a) a bad magic number; and (b) an incorrect checksum.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Add some regression tests to examine how e2fsck handles directory
entry blocks and htree blocks with (a) malformed directory entries;
(b) incorrect checksums; or (c) obviously garbage entries.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Add some regression tests to examine how e2fsck deals with (a) extent
blocks with only a bad checksum; (b) extent blocks with a bad magic
number; and (c) extent entries with corruption.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Add regression tests for e2fsck dealing with (a) EA block with a bad
checksum; (b) EA block with a bad magic number; and (c) EA block with
damage that isn't otherwise noticeable.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If an inode fails checksum verification during pass 1 and the user
doesn't fix or clear the inode as part of the regular inode checks,
ensure that e2fsck remembers to ask the user if he simply wants to
correct the checksum.
We weren't capturing all the ways out of an interation of the inode
scanning loop, which means that not all errors were caught. Also,
we might as well clear the 'failed csum' flag if we write the inode
directly from the inode scanning loop.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If an inode fails checksum verification, don't stuff a copy of it in
the inode cache, because this can cause the library to fail to return
the "corrupt inode" error code.
In general, this happens if ext2fs_read_inode_full() is called twice
on an inode with an incorrect checksum. If fs->flags has
EXT2_FLAG_IGNORE_CSUM_ERRORS set during the first call and *unset*
during the second call, the cache hit during the second call fails to
return EXT2_ET_INODE_CSUM_INVALID as you'd expect. This happens
during fsck because the first read_inode call happens as part of
check_blocks and the second call happens during inode checksum
revalidation. A file system with a slightly corrupt non-extent inode
will trigger this.
While we're at it, make the inode read function consistent with the
rest of libext2fs -- copy the metadata object into the caller's buffer
even if it fails checksum verification. This will help e2fsck avoid a
double re-read later on down the line.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If we're totally unable to allocate a lost+found directory, ask the
user if he would like to dump orphaned files in the root directory.
Hopefully this enables the user to delete enough files so that a
subsequent run of e2fsck will make more progress. Better to cram lost
files in the rootdir than the current behavior, which is to fail at
linking them in, thereby leaving them as lost files.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The f_badcluster output format depends on how libreadline formats
and outputs the commands read from stdin. Instead of trying to
handle these differences, use an input command file, which does
not depend on external components to be consistent.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This should have been part of commit 9a1d614df2 ("e2fsck: fix
rule-violating lblk->pblk mappings on bigalloc filesystems") but it
accidentally got dropped when the patch was applied.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Fix the routine that adds dirent checksum structures to the directory
block to handle oddball situations a bit more robustly.
First, when we're walking the entry array, we might encounter an
entry that ends exactly one byte before where the checksum entry needs
to start, i.e. there's space for the tail entry, but it needs to be
reinitialized. When that happens, we should proceed until d points to
that space so that the tail entry can be initialized.
Second, it's possible that we've been fed a directory block where the
entries end just short of the end of the block. In this case, we need
to adjust the size of the last entry to point exactly to where the
dirent tail starts. The current code requires that entries end
exactly on the block boundary, but this is not always the case with
damaged filesystems.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If we trash the root directory block, e2fsck will find inode 11 (the
old lost+found) and try to attach it to l+f. The lost+found checker
also fails to find l+f and tries to add one to the root dir. The root
dir is not found but is recreated with incorrect checksums, so linking
in the l+f dir fails and the l+f '..' entry isn't set. Since both
dirs now fail checksum verification, they're both referred to rehash
to have that fixed, but because l+f doesn't have a '..' entry, rehash
crashes because l+f has < 2 entries.
On a checksumming filesystem, the routines in e2fsck that recreate
/lost+found and / must write the new directory block *after* the inode
has been written to disk because the checksum depends on i_generation.
Add a regression test while we're at it.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If we think we're going to need to repair either the root directory or
the lost+found directory, reserve a block at the end of pass 1 to
reduce the likelihood of an e2fsck abort while reconstructing
root/lost+found during pass 3.
If / and/or /lost+found are corrupt and duplicate processing in pass
1b allocates all the free blocks in the FS, fsck aborts with an
unusable FS since pass 3 can't recreate / or /lost+found. If either
of those directories are missing, an admin can't easily mount the FS
and access the directory tree to move files off the injured FS and
free up space; this in turn prevents subsequent runs of e2fsck from
being able to continue repairs of the FS.
(One could migrate files manually with debugfs without the help of
path names, but it seems easier if users can simply mount the FS and
use regular FS management tools.)
[ Fixed up an obvious C trap: const char * and const char [] are not
the same thing when you are taking the size of the parameter.
People, run your regression tests! Like spinach, it's good for you. :-)
-- tytso ]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Provide an API to set i_size in an inode and take care of all required
feature flag modifications. Refactor the code to use this new
function.
[ Moved the function to lib/ext2fs/blk_num.c, which is the rest of
these sorts of functions live, and renamed it to be
ext2fs_inode_size_set() instead of ext2fs_inode_set_size() to be
consistent with the other functions in in blk_num.c -- tytso ]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>