In practice, it is **extremely** rare for users to try to use more
than the first backup superblock located at the beginning of block
group #1 (i.e., at block number 32768 for file systems with a 4k
block size). This new compat feature restricts the backup superblock
to block group #1 and the last block group in the file system.
Aside from reducing the overhead of the file system by a small number
of blocks, by eliminating the rest of the backup superblocks, it
allows us to have a much more flexible metadata layout. For example,
we can force all of the allocation bitmaps and inode table blocks to
the beginning of the disk, which allows most of the disk to be
exclusively used for contiguous data blocks.
This simplifies taking advantage of certain HDD-specific features,
such as Shingled Magnetic Recording (aka Shingled Drives), and the
TCG's OPAL Storage Specification where having a simple mapping between
LBA block ranges and the data blocks used by the file system can make
life much simpler.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
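The block number quoted in the message above can be checked with a
little arithmetic. The following is only a sketch: the helper name is
hypothetical, and it assumes the usual default of 8 * block_size
blocks per group (one bitmap block covering the whole group).

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a libext2fs function): first block of a
 * block group, assuming blocks_per_group == 8 * block_size.
 */
static uint64_t toy_group_first_block(uint32_t block_size,
                                      uint32_t first_data_block,
                                      uint32_t group)
{
    uint64_t blocks_per_group = (uint64_t)block_size * 8;

    return first_data_block + (uint64_t)group * blocks_per_group;
}
```

With a 4k block size and s_first_data_block of 0, group #1 starts at
block 32768; with a 1k block size and s_first_data_block of 1, it
starts at block 8193.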
Mostly by adding static and removing excess extern qualifiers. Also
convert a few remaining non-ANSI function declarations to ANSI.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Compiling with LLVM generates a large number of warnings due
to the use of _() for wrapping strings for i18n:
    warning: format string is not a string literal
    (potentially insecure) [-Wformat-security]
    ./nls-enable.h:4:14: note: expanded from macro '_'
    #define _(a) (gettext (a))
                 ^~~~~~~~~~~~
These warnings are fixed by using "%s" as the format string,
and then _() is used as the string argument.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
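The pattern described in the message above can be sketched as follows.
This is an illustration only: _() is replaced by an identity macro so
that the example builds without libintl, and the function name and
message text are made up.

```c
#include <stdio.h>

/* Stand-in for the i18n wrapper; in the real tree _() calls gettext(). */
#define _(a) (a)

/*
 * Flagged form:  snprintf(buf, len, _("100% done\n"));
 * Fixed form:    the translated text is passed as an argument to "%s",
 *                so a stray '%' in it is never parsed as a conversion.
 */
static int toy_format_message(char *buf, size_t len)
{
    return snprintf(buf, len, "%s", _("100% done\n"));
}
```

Since the translated string is now data rather than a format, the
compiler no longer has anything to warn about.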
If the file system is being shrunk, and a block group's inode table
falls beyond the end of the file system, we need to try to relocate
the inode table blocks.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
If the file system's inode table blocks in the last block group are
located in the middle or the end of the block group, it's possible for
resize2fs -M to use a size which will require relocating the inode
table blocks in the last block group. This can lead to all sorts of
problems, so solve it by simply guaranteeing that we will never do
that.
Reported-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
free_gdp_blocks needs to be taught to use 64-bit fields and the appropriate
getters, otherwise it'll truncate high block numbers (when, say, resizing a
>16T fs) and mark the low numbered group descriptor blocks as free. Yikes.
Reported-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
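To illustrate the truncation the free_gdp_blocks message above is
worried about, here is a minimal sketch. The structure and getter are
hypothetical, cut-down stand-ins for a 64-bit group descriptor and its
accessor, not the libext2fs definitions.

```c
#include <stdint.h>

/* Hypothetical cut-down descriptor: one location split into halves. */
struct toy_group_desc {
    uint32_t block_bitmap_lo;
    uint32_t block_bitmap_hi;
};

/*
 * Getter in the spirit of the "appropriate getters" mentioned above:
 * the high half is only honoured when the 64-bit feature is in use.
 */
static uint64_t toy_block_bitmap_loc(const struct toy_group_desc *gd,
                                     int is_64bit)
{
    uint64_t blk = gd->block_bitmap_lo;

    if (is_64bit)
        blk |= (uint64_t)gd->block_bitmap_hi << 32;
    return blk;
}
```

Code that reads only the low field would see block 5 instead of block
0x100000005 for a descriptor of {5, 1}, which is how a low-numbered
block can end up being marked free by mistake.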
It is possible to have a flex_bg filesystem with block groups
which have inode & block bitmaps at some point well past the
start of the group.
If an offline shrink puts the new size somewhere between
the start of the block group and the (old) location of
the bitmaps, they can be left beyond the end of the filesystem,
i.e. result in fs corruption.
Check each remaining block group for whether its bitmaps
are beyond the end of the new filesystem, and reallocate
them in a new location if needed.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
When doing an off-line resize2fs of an initially very small file
system, it's possible to run out of reserved gdt blocks (which are
reserved via the resize inode). Once we run out, we need to move the
allocation bitmaps and inode table out of the way to grow the gdt
blocks. Unfortunately, when moving these metadata blocks, it was
possible that a block that had just been newly allocated for a
new block group could also get allocated for a metadata block for an
existing block group that was being moved.
To prevent this, after we grow the gdt blocks and allocate the
metadata blocks for the new block groups, make sure all of these
blocks are marked as reserved.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reported-by: John Jolly <john.jolly@gmail.com>
Fixes resize2fs so it correctly calculates the number of free clusters
in each block group for file systems with the bigalloc feature
enabled.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
There are a number of places where we multiply a dgrp_t with
s_blocks_per_group expecting that we will get a blk64_t. This
requires a cast, or using the convenience function
ext2fs_group_first_block2().
This audit was suggested by Eric Sandeen.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
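The overflow the audit above looks for can be reproduced in a few
lines. The typedefs below are stand-ins for dgrp_t and blk64_t, and
the sketch leaves out s_first_data_block, which the real convenience
function is expected to take into account.

```c
#include <stdint.h>

typedef uint32_t toy_dgrp_t;   /* stand-in for dgrp_t */
typedef uint64_t toy_blk64_t;  /* stand-in for blk64_t */

/* Buggy: the product is computed in 32 bits and wraps before widening. */
static toy_blk64_t toy_first_block_wrong(toy_dgrp_t group,
                                         uint32_t blocks_per_group)
{
    return group * blocks_per_group;
}

/* Fixed: widen one operand first so the multiply is done in 64 bits. */
static toy_blk64_t toy_first_block_right(toy_dgrp_t group,
                                         uint32_t blocks_per_group)
{
    return (toy_blk64_t)group * blocks_per_group;
}
```

For group 131072 with 32768 blocks per group (the 16T boundary with 4k
blocks), the uncast version yields 0 while the cast version yields
4294967296.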
Use ext2fs_[un]mark_block_range2() functions to reduce the CPU
overhead of resizing large file systems by 45%, primarily by
reducing the time spent in fix_uninit_block_bitmaps().
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
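Why a range operation is cheaper than marking blocks one at a time can
be seen from a toy bitmap. This is not the libext2fs implementation,
just a sketch of the general technique with a hypothetical function
name: whole bytes are filled with memset() and only the ragged edges
are handled bit by bit.

```c
#include <stdint.h>
#include <string.h>

/* Set bits [start, start + count) in a little-endian bit array. */
static void toy_mark_range(uint8_t *bitmap, uint64_t start, uint64_t count)
{
    uint64_t end = start + count;   /* one past the last bit to set */
    uint64_t nbytes;

    /* Leading bits up to the next byte boundary. */
    while (start < end && (start & 7)) {
        bitmap[start >> 3] |= (uint8_t)(1u << (start & 7));
        start++;
    }

    /* Whole bytes in the middle are set in one call. */
    nbytes = (end - start) >> 3;
    memset(bitmap + (start >> 3), 0xff, (size_t)nbytes);
    start += nbytes << 3;

    /* Trailing bits after the last whole byte. */
    while (start < end) {
        bitmap[start >> 3] |= (uint8_t)(1u << (start & 7));
        start++;
    }
}
```

Marking 20 bits starting at bit 3 touches three bytes but performs
only twelve single-bit updates; the saving grows with the length of
the range.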
Add a new debug flag which prints how much time is consumed by the
various parts of resize2fs's processing.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This caused the free blocks count in the superblock to be incorrect
after resizing a 64-bit file system if the number of free blocks
overflowed a 32-bit value.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Fix a 32-bit overflow bug caused by a missing blk64_t cast which can
cause the block bitmap to get corrupted when doing an off-line resize
of a 64-bit file system.
This problem can be reproduced as follows:
    rm -f foo.img; touch foo.img
    truncate -s 8T foo.img
    mke2fs -F -t ext4 -O 64bit foo.img
    e2fsck -f foo.img
    truncate -s 21T foo.img
    resize2fs foo.img
    e2fsck -fy foo.img
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Now that we are reserving all of the bg-specific metadata before we
try to allocate the metadata for the new block groups, we don't have
to temporarily disable the flex_bg feature flag while we allocate the
new metadata blocks --- this allows the newly created block groups to
have a much more optimized layout, instead of fragmenting the inode
table and block/inode bitmaps in separate block groups.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
With flex_bg file systems, bg-specific metadata (i.e., bitmaps and the
inode table blocks) can be located in another block group. Hence,
when we grow the number of block group descriptors, we need to check
if we need to relocate metadata blocks not just for the block group
where the bgd blocks are located, but in all block groups.
This change fixes the following test case:
    rm -f foo.img; touch foo.img
    truncate -s 32G foo.img
    mke2fs -F -t ext4 -E resize=12582912 foo.img
    e2fsck -f foo.img
    truncate -s 256G foo.img
    ./resize2fs foo.img
    e2fsck -fy foo.img
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
For flex_bg file systems, if we need to relocate an allocation bitmap
or inode table, we need to make sure that all metadata blocks have
been reserved, lest we end up overwriting a metadata block belonging
to a different block group.
This change fixes the following test case:
    rm -f foo.img; touch foo.img
    truncate -s 32G foo.img
    mke2fs -F -t ext4 -E resize=12582912 foo.img
    e2fsck -f foo.img
    truncate -s 64G foo.img
    ./resize2fs foo.img
    e2fsck -fy foo.img
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This is the first commit to add support for off-line resizing using
flex_bg without relying on the resize_inode to reserve gdt blocks.
This functionality has been broken up into separate commits, each
hopefully obviously correct, to make them easier to review.
In this first step, we break up the for loop at the end of
blocks_to_move() so that we first mark all of the metadata blocks
which don't need to be moved in the reserve_blocks bitmap, and then
try to allocate the metadata blocks which are new or which need to be
moved.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
One of these fixes was triggering failures when running:
    ./test_scripts --valgrind r_move_itable r_inline_xattr r_resize_inode
It should be a false positive, but fixing this makes it easier to
see real problems.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
If the uninit_bg feature is enabled and the kernel supports
lazy_itable_init, skip zeroing the inode table so that the resize
operation can go much more quickly. Also set the itable_unused fields
so that the first e2fsck after the resize will run faster.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The file system overhead calculation in calculate_minimum_resize_size
was incorrect for meta_bg file systems. This caused the minimum size to
underflow for very large file systems, which threw resize2fs into a
loop which generally lasted longer than the user's patience.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
calculate_minimum_resize_size() forgot to include s_first_data_block
in the minimum filesystem size. Thus, when the size of the filesystem was
such that the last group had the minimal size (50 blocks + metadata
overhead), the code in adjust_fs_info() decided the group was unneeded,
removed it, and in some cases the resizing then failed with ENOSPC.
Fix the issue by properly accounting for s_first_data_block in
calculate_minimum_resize_size().
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The ext2fs_file_acl_block() and ext2fs_set_file_acl_block() functions
need to check i_file_acl_high only if the 64-bit flag is set. This is needed
because otherwise we will run into problems on Hurd systems which
actually use that field for h_i_mode_high.
This involves an ABI change since we need to pass ext2_filsys to these
functions. Fortunately these functions were first included in the
1.42-WIP series, so it's OK for us to change them now. (This is why
we have 1.42-WIP releases. :-)
Addresses-Sourceforge-Bug: #3379227
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This commit fixes a failure when running the commands:
    dd if=/dev/zero of=fs bs=1k count=100k; mke2fs fs; resize2fs -Mp fs
We should not try truncating the file system if there is only a single
block group in the file system.
Addresses-Sourceforge-Bug: #3404051
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The DEFS line in MCONFIG had gotten so long that it exceeded 4k, and
this was starting to cause some tools heartburn. It also made "make
V=1" almost useless, since any attempt to follow the individual commands
run by make was lost in the noise of all of the defines.
So fix this by putting the configure-generated defines in lib/config.h
and the directory pathnames to lib/dirpaths.h.
In addition, clean up some vestigial defines in configure.in and in the
Makefiles to further shorten the cc command lines.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Code to count the number of blocks in the last partial
group is cut and pasted around the e2fsprogs codebase
a few times.
Making this a helper function should improve matters.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
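A helper of the kind described above might look roughly like this.
The name and the flat parameter list are hypothetical (the real helper
would take the filesystem handle); only the arithmetic is the point.

```c
#include <stdint.h>

/*
 * Number of blocks in a given group: every group is full-sized except
 * possibly the last one, which holds whatever is left over.
 */
static uint32_t toy_group_blocks_count(uint64_t blocks_count,
                                       uint32_t first_data_block,
                                       uint32_t blocks_per_group,
                                       uint32_t group,
                                       uint32_t group_count)
{
    uint32_t n;

    if (group != group_count - 1)
        return blocks_per_group;

    n = (uint32_t)((blocks_count - first_data_block) % blocks_per_group);
    /* A remainder of zero means the last group is exactly full. */
    return n ? n : blocks_per_group;
}
```

For a 100000-block file system with 32768 blocks per group, the last
of the four groups holds 1696 blocks.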
The write_journal_inode() code is only setting the low 32-bit i_size
for the journal size, even though it is possible to specify a journal
up to 10M blocks in size. Trying to create a journal larger than 2GB
will succeed, but an immediate e2fsck would fail. Store i_size_high
for the journal inode when creating it, and load it upon access.
Use s_jnl_blocks[15] to store the journal i_size_high backup. This
field is currently unused, as EXT2_N_BLOCKS is 15, so it is using
s_jnl_blocks[0..14], and i_size is in s_jnl_blocks[16].
Rename the "size" argument to "num_blocks" for the journal creation functions
to clarify this parameter is in units of filesystem blocks and not bytes.
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
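The backup layout described above can be sketched as follows. The
function names are hypothetical and the array is a plain stand-in for
s_jnl_blocks[]; the slot numbers are the ones given in the message
(15 for i_size_high, 16 for i_size).

```c
#include <stdint.h>

#define TOY_N_BLOCKS 15   /* mirrors EXT2_N_BLOCKS as quoted above */

/* Split a 64-bit journal size in bytes across the two backup slots. */
static void toy_backup_journal_size(uint32_t jnl_blocks[17],
                                    uint64_t size_bytes)
{
    jnl_blocks[TOY_N_BLOCKS]     = (uint32_t)(size_bytes >> 32); /* high */
    jnl_blocks[TOY_N_BLOCKS + 1] = (uint32_t)size_bytes;         /* low  */
}

/* Reassemble the size from the two slots when loading the backup. */
static uint64_t toy_restore_journal_size(const uint32_t jnl_blocks[17])
{
    return ((uint64_t)jnl_blocks[TOY_N_BLOCKS] << 32) |
           jnl_blocks[TOY_N_BLOCKS + 1];
}
```

A 5 GiB journal stores 1 in slot 15 and 0x40000000 in slot 16; with
only the low slot, the same journal would appear to be 1 GiB.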
I ran into odd behavior where mkfs.ext4 of a 16T filesystem would
create a resize inode with 0 reserved blocks, and mark the resize_inode
feature.
A subsequent slight downward resize of the filesystem would remove
the resize inode, making any further offline resizing impossible.
This is especially odd in light of the fact that a large downward
resize (say, to 8T) will actually add blocks to the resize inode -
so a small resize removes it, a large resize expands it ...
commit 8ade268cf2 had added this:
    If the filesystem is grown to the point where the resize_inode is no
    longer needed, clean it up properly so e2fsck doesn't have to.
but it seems e2fsck does not care about this situation, either.
So, simply leave the resize_inode intact in this case, and everything
seems to be happy.
Note, this is for the 1.41.xx branch.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Commit 74128f8 added tests for uninit groups, but it could access past
the end of the group_desc[] array after processing the last group:
    ==19668== Invalid read of size 2
    ==19668== at 0x40518C: resize_fs (resize2fs.c:1824)
    ==19668== by 0x405A46: main (main.c:451)
    ==19668== Address 0x5a0d002 is not stack'd, malloc'd or (recently) free'd
    ==19668==
    ==19668== Invalid read of size 2
    ==19668== at 0x405391: resize_fs (resize2fs.c:1864)
    ==19668== by 0x405A46: main (main.c:451)
    ==19668== Address 0x5a0d002 is not stack'd, malloc'd or (recently) free'd
    ==19668==
It was found by Eric Sandeen running the regression suite through
valgrind.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Resizing a filesystem with an external journal fails when it tries
to read inode 0:
    # touch testfs
    # truncate testfs 1342177280
    # touch testjournal
    # truncate testjournal 134217728
    # mke2fs -O journal_dev testjournal
    # losetup /dev/loop0 testjournal
    # mkfs.ext4 -J device=/dev/loop0 testfs 127680
    # resize2fs testfs
    resize2fs 1.41.9 (22-Aug-2009)
    Resizing the filesystem on testfs to 327680 (4k) blocks.
    resize2fs: Illegal inode number while trying to resize testfs
    Please run 'e2fsck -fy testfs' to fix the filesystem
    after the aborted resize operation.
I think the right, simple thing to do is just bail out early
for an external journal here, as there are no backup blocks
to update.
Reported-by: mjevans1983@gmail.com
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
After cleaning up ext2fs_bg_flag_set() and ext2fs_bg_flag_clear(),
we're left with ext2fs_bg_flag_test(). Convert it to
ext2fs_bg_flags_test().
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The ext2fs_bg_flag* functions were confusing.
Currently we have this:
    void ext2fs_bg_flags_set(ext2_filsys fs, dgrp_t group, __u16 bg_flags);
    void ext2fs_bg_flags_clear(ext2_filsys fs, dgrp_t group,__u16 bg_flags);
(_set (unused) sets exactly bg_flags; _clear clears all and ignores bg_flags)
and these, which can twiddle individual bits in bg_flags:
    void ext2fs_bg_flag_set(ext2_filsys fs, dgrp_t group, __u16 bg_flag);
    void ext2fs_bg_flag_clear(ext2_filsys fs, dgrp_t group, __u16 bg_flag);
A better interface, after the patch below, is just:
    ext2fs_bg_flags_zap(fs, group) /* zeros bg_flags */
    ext2fs_bg_flags_set(fs, group, flags) /* adds flags to bg_flags */
    ext2fs_bg_flags_clear(fs, group, flags) /* clears flags in bg_flags */
and remove the original ext2fs_bg_flags_set / ext2fs_bg_flags_clear.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
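The semantics of the cleaned-up interface described above can be
modelled on a toy structure. Everything below is a stand-in with
hypothetical names (the real functions take an ext2_filsys and a
dgrp_t); it is only meant to show the zap/set/clear/test behaviour.

```c
#include <stdint.h>

/* Toy file system: just one bg_flags word per group. */
struct toy_fs {
    uint16_t bg_flags[4];
};

/* Zero all flags for the group. */
static void toy_bg_flags_zap(struct toy_fs *fs, unsigned group)
{
    fs->bg_flags[group] = 0;
}

/* Add the given flags, leaving the others untouched. */
static void toy_bg_flags_set(struct toy_fs *fs, unsigned group,
                             uint16_t flags)
{
    fs->bg_flags[group] |= flags;
}

/* Clear only the given flags. */
static void toy_bg_flags_clear(struct toy_fs *fs, unsigned group,
                               uint16_t flags)
{
    fs->bg_flags[group] &= (uint16_t)~flags;
}

/* Non-zero if any of the given flags is set. */
static int toy_bg_flags_test(const struct toy_fs *fs, unsigned group,
                             uint16_t flags)
{
    return fs->bg_flags[group] & flags;
}
```

Setting 0x1 | 0x4 and then clearing 0x1 leaves 0x4, whereas the old
_clear variant would have wiped the whole word regardless of its
argument.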
When flex_bg is on, calculate_minimum_resize_size() should count the
extra metadata blocks needed for a newly added flex_bg.
Addresses-RedHat-Bugzilla: #519131
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If the resize operation fails in the middle of the operation, mark the
filesystem as needing to be checked, and tell the user that they
should run e2fsck -fy on the device.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>