Use libfuse's command line parsing, which is much more powerful and
flexible than what we had before, and which gives the user more
fine-grained control over FUSE's run-time options.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If the journal needs to be recovered to avoid clobbering whatever
changes tune2fs makes, do so.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Give admins a short amount of time to confirm that they want to
proceed with a dangerous operation. Refuse to perform the op
unless the filesystem is freshly checked.
Cc: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Currently, filefrag's "expected physical block" column expects extent
records to be physically adjacent regardless of the amount of logical
block space between the two records. This means that if we punch a
hole in a file, we get reports like this:
 ext:     logical_offset:        physical_offset: length:   expected: flags:
   4:     4096..    8343:      57376..     61623:   4248:
   5:     8345..   10313:      61625..     63593:   1969:      61624:
Notice how it expects 8345 to map to 61624, and scores this against
the fragmentation of the file. Flagging this as "unexpected" is
incorrect because the gap in the logical mapping is exactly the same
size as the gap in the physical extents.
Furthermore, this particular mapping leaves the door open to the
optimal mapping -- if a write to block 8344 causes it to be mapped to
61624, the entire range 4096-10313 can be mapped with a single extent.
Until that happens, there's no way to combine extents 4 and 5 because
of the gap in the logical mapping at block 8344.
Therefore, tweak the extent report to account for holes.
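The difference between the two rules can be sketched as follows. This is a minimal illustration, not the actual filefrag code: the struct layout and both function names are hypothetical, and all values are in filesystem blocks.

```c
#include <stdint.h>

/* One extent record, in filesystem blocks (hypothetical layout). */
struct extent_rec {
	uint64_t logical;	/* first logical block */
	uint64_t physical;	/* first physical block */
	uint64_t length;	/* number of blocks */
};

/* Old rule: the next extent is expected right after the previous one. */
uint64_t expected_adjacent(const struct extent_rec *prev)
{
	return prev->physical + prev->length;
}

/* Hole-aware rule: advance the physical block by the logical distance. */
uint64_t expected_with_holes(const struct extent_rec *prev,
			     const struct extent_rec *cur)
{
	return prev->physical + (cur->logical - prev->logical);
}
```

For extents 4 and 5 in the report above, the old rule yields 61624, while the hole-aware rule yields 61625, which is where extent 5 actually starts.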
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Allow users to turn on metadata_csum_seed at format time so that UUIDs
can be live-changed at any time.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Chattr and lsattr can be used to set or get project ID:
chattr -p <project id> file
lsattr -p file
Signed-off-by: Li Xi <lixi@ddn.com>
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If s_mkfs_time is not set in the superblock, print the s_mtime field
instead to identify the different superblocks. This can happen if the
superblock is corrupted, since s_mkfs_time is not reset by e2fsck.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The EXT2_GETVERSION ioctl is defined to take a "long" parameter, but
fgetversion() calls ioctl() with an "int" parameter instead. This is
handled in the kernel correctly, but the generation is sign-extended
in fgetversion() before return on 64-bit systems and lsattr prints
it as a huge positive number for inode generation above 0x80000000:
1635574212 -------------e-- /mnt/ost0/O/0/d0/12928
18446744073045131735 -------------e-- /mnt/ost0/O/0/d0/166240
782808861 -------------e-- /mnt/ost0/O/0/d0/31744
18446744072181134840 -------------e-- /mnt/ost0/O/0/d0/135008
Correctly assign the returned generation number as an unsigned value,
and print it with a 10-character field width. The version is printed
left-aligned for consistency with the old code and to ensure it is
always printed in the first column for use with tools like "cut":
1635574212 -------------e-- /mnt/ost0/O/0/d0/12928
3630547415 -------------e-- /mnt/ost0/O/0/d0/166240
782808861 -------------e-- /mnt/ost0/O/0/d0/31744
2766550520 -------------e-- /mnt/ost0/O/0/d0/135008
Do not return a random value from the stack as the version on error.
Clean up some style issues and consolidate some duplicate code.
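The sign-extension effect can be reproduced in isolation. The sketch below is not the fgetversion() code; the helper names are hypothetical, and it assumes the usual two's-complement behaviour when a value above 0x7fffffff is converted to a signed 32-bit integer.

```c
#include <stdint.h>
#include <stdio.h>

/* Buggy pattern: generation held in a signed 32-bit int, then widened. */
uint64_t widen_signed(uint32_t raw)
{
	int32_t gen = (int32_t)raw;	/* values >= 0x80000000 become negative */

	return (uint64_t)(int64_t)gen;	/* sign extension sets the top 32 bits */
}

/* Fixed pattern: keep the generation unsigned all the way through. */
uint64_t widen_unsigned(uint32_t raw)
{
	return (uint64_t)raw;
}

/* Left-aligned, 10-character field, as described for the new output. */
int format_generation(char *buf, size_t len, uint32_t gen)
{
	return snprintf(buf, len, "%-10lu", (unsigned long)gen);
}
```

With the generation 3630547415 from the listing above, widen_signed() produces 18446744073045131735 and widen_unsigned() produces 3630547415.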
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Fix compile warnings for missing declarations on the maint branch.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This patch adds EXT4_PROJINHERIT_FL to enable the inherit feature for
project IDs. If a directory has its inherit flag set, all its
newly created children will inherit its project ID. Otherwise,
new inodes will get the default project ID (i.e. zero). Also, no
hard link or rename is permitted if the directory and child have
different project IDs.
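The rules above can be summarized in a short sketch. The helper names and DEFAULT_PROJID are hypothetical, the flag value is assumed to match the ext4 headers, and the link/rename check follows the wording of this message rather than any particular implementation.

```c
#include <stdint.h>

#define EXT4_PROJINHERIT_FL	0x20000000	/* assumed to match the ext4 headers */
#define DEFAULT_PROJID		0		/* hypothetical name for the default ID */

/* Project ID assigned to an inode newly created in a directory. */
uint32_t projid_for_new_child(uint32_t dir_flags, uint32_t dir_projid)
{
	if (dir_flags & EXT4_PROJINHERIT_FL)
		return dir_projid;
	return DEFAULT_PROJID;
}

/* Hard link or rename is refused when the project IDs differ. */
int link_or_rename_allowed(uint32_t dir_projid, uint32_t child_projid)
{
	return dir_projid == child_projid;
}
```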
Signed-off-by: Li Xi <lixi@ddn.com>
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This patch adds project quota support. A new quota type, PRJQUOTA(2),
is added. EXT4_PRJ_QUOTA_INO(11) is reserved for the project quota inode.
The superblock reserves a field, s_prj_quota_inum, for saving the
project quota inode, and each inode adds an internal field, i_projid,
for saving its project ID.
Signed-off-by: Li Xi <lixi@ddn.com>
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This patch adds the project feature flag EXT4_FEATURE_RO_COMPAT_PROJECT.
The project feature is a read-only compat feature. Thus, an ext4 file
system with the project feature enabled can only be mounted read-only
by an ext4 kernel module without project feature support.
Signed-off-by: Li Xi <lixi@ddn.com>
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Project quota related fields are reserved in the Linux kernel.
In preparation for them, this patch cleans up the quota code
of e2fsprogs to make it easier to add new quota type(s).
Signed-off-by: Li Xi <lixi@ddn.com>
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This fixes a number of incompatibilities which caused the maint branch to
fail to build on FreeBSD. Also fix the Makefile in the tests
directory so that "make -jN check" works correctly on FreeBSD.
Previously the Makefile in the tests directory used a construct which
was specific to GNU Make, and which silently expanded to an empty
list, which caused "make check" to be a no-op when running using BSD's
pmake. This Makefile has been changed to use the != macro assignment
syntax which is common to GNU make and BSD pmake. It's technically
not completely portable (it will not be recognized by Solaris's ccs
make, for example), but most other operating systems ship GNU make
(Solaris, AIX), or BSD pmake (*BSD, Mac OS) as either the primary or
alternative make utility, so this should be an acceptable compromise,
since it lets all of the tests run *much* faster using something like
"make -j8 check" or "make -j16 check".
There are still some caveats if using BSD pmake; in particular, if the
configure script is run on a system which has GNU make (installed as
gmake on FreeBSD for example), the configure script will find it, and
enable some GNU make features in the Makefile, and the generated
makefiles *must* be built using gmake. However, if an isolated build
jail / chroot is used which only has pmake, the Makefiles should now
work with pmake.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Move the mke2fs "-d" option to be alphabetical like other options.
Rename "root_dir" to "src_root_dir" to avoid confusion with the
actual root inode in the new filesystem.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
When creating a file in op_create, set the file's uid and gid to the
user's uid and gid. Do the same in op_mknod.
Reported-by: Lennart Lövstrand <lennart@lovstrand.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Running tune2fs on a filesystem with an unrecovered journal can
cause the tune2fs settings changes in the superblock to be reverted
when the journal is replayed if it contains an uncommitted copy of
the superblock. Print a warning if this is detected so that the
user isn't surprised if it happens.
Signed-off-by: Jim Garlick <garlick@llnl.gov>
Updated message printed to include steps to replay journal.
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Create separate predicate functions to test/set/clear feature flags,
thereby replacing the wordy old macros. Furthermore, clean out the
places where we open-coded feature tests.
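The general shape of such predicate functions might look like the sketch below. It uses a toy superblock and toy names throughout, so nothing here should be read as the actual libext2fs declarations.

```c
#include <stdint.h>

/* Toy superblock with a single feature word (hypothetical). */
struct toy_super_block {
	uint32_t s_feature_ro_compat;
};

#define TOY_FEATURE_RO_COMPAT_METADATA_CSUM	0x0400

/* Generate has/set/clear helpers for one read-only compat feature. */
#define TOY_RO_COMPAT_FUNCS(name, flag) \
static inline int toy_has_feature_##name(const struct toy_super_block *sb) \
{ \
	return (sb->s_feature_ro_compat & (flag)) != 0; \
} \
static inline void toy_set_feature_##name(struct toy_super_block *sb) \
{ \
	sb->s_feature_ro_compat |= (flag); \
} \
static inline void toy_clear_feature_##name(struct toy_super_block *sb) \
{ \
	sb->s_feature_ro_compat &= ~(uint32_t)(flag); \
}

TOY_RO_COMPAT_FUNCS(metadata_csum, TOY_FEATURE_RO_COMPAT_METADATA_CSUM)
```

A caller then writes toy_has_feature_metadata_csum(sb) instead of spelling out the mask test at every call site.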
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
There are times when it is necessary to update the UUID on a mounted
root file system (for example). So when we add this safety check
to e2fsprogs 1.43, we will likely break some scripts. Allow the -f
option to force an override of this safety check.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The posix_fadvise() to hint to the system that the file can be removed
from memory will probably not work well without the sync_file_range(2)
call, but e4defrag should still fundamentally work, and this will
allow e4defrag to compile if the C library doesn't happen to have this
system call exposed.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Also change ext2fs_symlink() so that the target parameter is a const
char *, thus promising that we will never change the incoming string.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
We are using __func__ without any backup definition in the rest of
e2fsprogs, and this is causing warnings in the Android build, so just
remove it.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The quota code required that we included dict.o in libsupport.a, so we
might as well just move dict.c and dict.h to lib/support, and then
have e2fsck use the version of dict.c in libsupport.a. This
simplifies the build system and eliminates having two identical copies
of dict.o floating around in the build tree.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The check_plausibility() function is now used all over the place, so
we should move the plausible.c file to lib/support and remove the
special case handling for that file that had been in the build system.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The profile functions started as something specific to e2fsck. They
are now used by mke2fs and e2fsck, so it's better to move them into
libsupport.a.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
We will be using libsupport.a for e2fsprogs's internal support
functions. It will contain the quota support functions, but we will
also be moving code such as profile.c and plausible.c to libsupport.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
For the 1.43 release, quota support will be the default. It's much
simpler if we don't try to make quota support optional. This was done
originally because the quota feature wasn't fully tested. It is now,
so we can remove this as an option.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Prompt for user verification before rewriting the filesystem
superblocks using the "-S" (super-only) option. This should
not normally be used at all, so adding the extra verification
will probably save a few user filesystems in the future. Since
this is something that should only be done in rare cases under
user supervision, wait for user input rather than proceeding
automatically after a timeout.
Update the mke2fs man page to more fully explain the many
dangers of this option.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Some temporary char buffers allocated on the stack are not properly
aligned when typecast to a structure containing __u32 or __u64 types,
and this can cause alignment warnings on ARM and other alignment
sensitive architectures, and potential slowdowns to do fixups.
Fix the buffer alignment to avoid such issues.
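One common way to get an aligned temporary buffer is to wrap the bytes in a union with the structure they will be viewed as. The sketch below illustrates that idea with hypothetical type names; it is not necessarily the approach taken in this patch.

```c
#include <stdint.h>

/* Hypothetical on-disk style record containing 32- and 64-bit fields. */
struct toy_record {
	uint32_t magic;
	uint64_t blocknr;
};

/* The union gives the byte buffer the same alignment as the struct. */
union aligned_buf {
	char bytes[1024];
	struct toy_record rec;
};

uint64_t read_blocknr(void)
{
	union aligned_buf buf = { .bytes = { 0 } };
	struct toy_record *r = &buf.rec;	/* no cast from a bare char array */

	r->blocknr = 42;
	return r->blocknr;
}
```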
Addresses: https://bugzilla.redhat.com/show_bug.cgi?id=680090
Reported-by: Gordan Bobic <gordan.bobic@gmail.com>
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Diet libc doesn't support syscall correctly, but it does have
add_key() and keyctl() in libc (although glibc does not). So change
e4crypt to use add_key() and keyctl() directly if they are available.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This resulted in the build failing when building e2fsprogs from
scratch.
Reported-by: "Darrick J. Wong" <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Modern Linux major/minor numbering on block devices no longer conforms to
the divisible by 64 rule for minor numbering. On my development system,
the correct number is 16. Consequently, this applies only to every 4th
drive on a modern system, which is inconsistent. That caused the
following bug to be filed against Flocker:
https://clusterhq.atlassian.net/browse/FLOC-2041
We could unconditionally pass -F to override this check whenever it
triggers, but that would also override the libblkid check that
determines whether there are existing partitions, logical volumes or
filesystems on the disk, which seems unwise.
I propose that this check be removed because passing a whole disk to
mke2fs is a valid use case and given how long this has been broken,
users are already accustomed to the behavior where -F is not necessary
to format a whole disk as ext4.
Signed-off-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
This is the initial implementation of a FUSE server based on
e2fsprogs. The point of this program is to enable ext4 to run on any
OS that FUSE supports (and doesn't already have a native driver), such
as MacOS X, BSDs, and Windows. The code requires FUSE API v28, which
is available in Linux fuse and osxfuse releases that are available as
of August 2013.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Use the new fallocate API for creating the journal and the mk_hugefile
feature.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Having multiple versions of jfs_user.h was confusing the Android
build. Clean up things by removing the lib/ext2fs/jfs_user.h and
misc/jfs_user.h and simplifying how we emulate the kernel
infrastructure needed by journal replay code and removing the
kernel-specific lines from kernel-jbd.h.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
e2fsck/dirinfo.c and misc/e4crypt.c use functions from libext2fs, so
we need to include its header file or clang will complain.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
We need to use lgetxattr(2) instead of getxattr(2) or attempts to
create file systems with extended attributes will fail:
set_inode_xattr: No data available while reading attribute "trusted.link" of "link"
__populate_fs: No data available while setting xattrs for "link"
mke2fs: No data available while populating file system
Reported-by: Jack_Fewx@Dell.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This started with the fm_ext being uninitialized, but upon closer
analysis I discovered that forcing extent emulation in FIBMAP mode
was reporting an extent for every block in the file. Fix both
problems.
The Coverity bug was 1297512.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Fix Coverity bugs 1297094-1297101 by fixing all the mutations in the
*_setup_tdb() functions, fixing buffer overflows, and checking
return values.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Add some simple tests for mke2fs -d (create image from dir) and make
the manpage options appear in alphabetic order.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Save errno (in retval) before doing anything else, because the
"anything else" (usually com_err()) can call library functions, which
will reset errno.
Fix the error messages to use the message catalog, and don't _ever_
print an error without providing context.
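The errno-saving pattern looks roughly like this. It is a minimal sketch using plain fprintf() rather than com_err() and the message catalog, and open_readonly() is a hypothetical helper.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/* Open a file read-only; return 0 or the saved errno value on failure. */
int open_readonly(const char *path, int *fd_out)
{
	int fd = open(path, O_RDONLY);

	if (fd < 0) {
		int retval = errno;	/* save before any other library call */

		/* fprintf() may change errno, so report from the saved copy. */
		fprintf(stderr, "while opening \"%s\": %s\n",
			path, strerror(retval));
		return retval;
	}
	*fd_out = fd;
	return 0;
}
```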
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Rewrite the file copy-in algorithm to detect smaller holes in the
files we're copying in. Use SEEK_DATA/SEEK_HOLE/FIEMAP when available
to skip known empty parts. This fixes the particular bug where zeroed
blocks on a system with 64k pages are needlessly copied into a
4k-block filesystem. It also saves time by skipping parts we know to
be zeroed.
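Walking the data segments of a file with SEEK_DATA/SEEK_HOLE can be sketched as below. This is only an illustration of the technique, not the copy-in code: count_data_bytes() is a hypothetical helper that adds up segment lengths where a real copy loop would read and write each segment, and it treats the whole file as data when the interface is unavailable.

```c
#define _GNU_SOURCE	/* for SEEK_DATA/SEEK_HOLE on glibc */
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Return the number of bytes in data segments, skipping known holes. */
off_t count_data_bytes(int fd)
{
	off_t end = lseek(fd, 0, SEEK_END);
	off_t total = 0;
	off_t pos = 0;

	if (end < 0)
		return -1;
#if defined(SEEK_DATA) && defined(SEEK_HOLE)
	while (pos < end) {
		off_t data = lseek(fd, pos, SEEK_DATA);
		off_t hole;

		if (data < 0) {
			if (errno == ENXIO)
				break;	/* only a hole remains before EOF */
			return end;	/* unsupported: treat it all as data */
		}
		hole = lseek(fd, data, SEEK_HOLE);
		if (hole < 0)
			return end;
		/* A real copy loop would copy the range [data, hole) here. */
		total += hole - data;
		pos = hole;
	}
	return total;
#else
	(void)pos;
	(void)total;
	return end;	/* no hole detection available at build time */
#endif
}
```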
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
When we're creating hard links via ext2fs_link, the (misnamed?) flags
argument specifies the filetype for the directory entry. This is
*derived* from i_mode, so provide a translator. Otherwise, fsck will
complain about unset file types.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Provide the user with an option to create an undo file so that they
can roll back a failed tuning operation. Previously, one would be
created if force_undo was set in the configuration file and a bunch of
(undocumented) conditions were met.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Provide the user with an option to create an undo file so that they
can roll back a failed tuning operation. Previously, one would be
created for inode resize if a bunch of (undocumented) conditions were
met.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The existing undo file format (which is based on tdb) has many
problems. First, its comparison of superblock fields is ineffective,
since the last mount time is only written by the kernel, not the tools
(which means that undo files can be applied out of order, thus
corrupting the filesystem); block numbers are written in CPU byte
order, which will cause silent failures if an undo file is moved from
one type of system to another; using the tdb database costs us an
enormous amount of CPU overhead to maintain the key data structure,
and finally, the tdb database is unable to deal with databases larger
than 2GB. (Upstream tdb 1.2.12 can handle 4GB, but upgrading a 2TB FS
to 64bit,metadata_csum easily produces 2.9GB of undo files, so we
might as well move off of tdb now.)
The last problem is fatal if you want to use tune2fs to turn on
metadata checksumming, since that rewrites every block on the
filesystem, which can easily produce a many-gigabyte undo file, which
of course is unreadable and therefore the operation cannot be undone.
Therefore, rip all of that out in favor of writing to a flat file.
Old blocks are appended to a file and the index is written to the end
when we're done. This implementation is much faster than wasting a
considerable amount of time trying to maintain a hash index, which
drops the runtime overhead of tune2fs -O metadata_csum from ~45min
to ~20 seconds on a 2TB filesystem.
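The byte-order problem mentioned above goes away once block numbers are stored in a fixed order regardless of the CPU. The sketch below assumes little-endian as that fixed order, which this message does not state, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Store a 64-bit block number in little-endian order on any CPU. */
void put_le64(unsigned char out[8], uint64_t v)
{
	int i;

	for (i = 0; i < 8; i++)
		out[i] = (unsigned char)(v >> (8 * i));
}

/* Load a 64-bit block number stored by put_le64(). */
uint64_t get_le64(const unsigned char in[8])
{
	uint64_t v = 0;
	int i;

	for (i = 0; i < 8; i++)
		v |= (uint64_t)in[i] << (8 * i);
	return v;
}
```

Because the encoding is defined byte by byte, a file written on one type of system decodes to the same block numbers on another.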
I have a few reasons that factored in my decision not to repurpose the
jbd2 file format for undo files. First, undo files are limited to
2^32 blocks (16TB) which some day might not serve us well. Second,
the journal block size is tied to the file system block size, but
mke2fs wants to be able to back up big chunks of old device contents.
This would require large changes to the e2fsck journal replay code,
which itself is derived from the kernel jbd2 driver, which I'd rather
not destabilize. Third, I want to require undo files to store the FS
superblock at the end of undo file creation so that e2undo can be
reasonably sure that an undo file is supposed to apply against the
given block device, and doing so would require changes to the jbd2
format. Fourth, it didn't seem like a good idea that external
journals should resemble undo files so closely.
v2: Provide a state bit that is only set when the undo channel is
closed correctly so we can warn the user about potentially incomplete
undo files. Straighten out the superblock handling so that undo files
won't be confused for real ext* FS images. Record multi-block runs in
each block key to reduce overhead even further. Support reopening an
undo file so that we can combine multiple FS operations into one
(overall smaller) transaction file, which will be easier to manage.
Flush the undo index data if the program should terminate
unexpectedly. Update the ext4 superblock bits if errors or -f is
found to encourage fsck to do a full run the next time it's invoked.
Enable undoing the undo.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Fix memory leaks and improve the error messages to make it easier
to figure out why e2undo went wrong.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The directory hash is now calculated using the on-disk encrypted
filename, and we no longer use the digest encoding or the SHA-256
encoding, so remove them from the ext2fs library until there is some
reason we need them.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Once we've "fixed" the filesystem, try mounting and modifying it to see
if we can break the kernel.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Previously we were using a weird hybrid CBC/CTS. Switch things so we
are using straight CTS; this corresponds to changes made in the latest
ext4 encryption patches.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Add missing new lib/ext2fs source files that were added for encryption
support. Also move configuration #define's from individual Android.mk
to the android_config.h file, since we've moved away from specifying
configuration #define's on the command-line upstream.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Previously, e4crypt required the user to manually specify the salt
used for their passphrase. This was user unfriendly to say the least.
The e4crypt program can now request the salt using an ioctl, which
will automatically generate the salt if necessary, and keep it in the
ext4 superblock.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This patch adds new e4crypt tool for encryption management in the ext4
filesystem.
Signed-off-by: Ildar Muslukhov <muslukhovi@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The Android.mk files were taken from the Android AOSP sources, and
updated for the 1.43 next branch. The intention is that this will
allow the repository which is currently located in external/e2fsprogs
to be replaced with one which is based off of the upstream e2fsprogs.
Right now external/e2fsprogs was not created using "git clone", which
means that git merges don't work. After the external/e2fsprogs Android
repository is replaced with one based off the upstream repository,
Android will be able to synchronize with the upstream repository by
pulling and merging from upstream, and then running the script
"./util/gen-android-files" to update any generated files. (This is
necessary because in the Android build system, the Android.mk files
are rather stylized and don't make it easy to run arbitrary shell
scripts during the build phase.)
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
If e2fsck encounters a read error on a block past the end of the
filesystem, don't bother trying to "rewrite" the block. We might
still want to re-try the read to capture FS data marooned past the end
of the filesystem, but in that case e2fsck ought to move the block
back inside the filesystem.
This enables e2fuzz to detect writes past the end of the FS due to
software bugs.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If the user tries to enable or disable the 64bit feature via tune2fs,
tell them how to use resize2fs to effect the conversion.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Earlier, I tried to make tune2fs abort if the user tried to enable or
disable metadata_csum on a mounted FS, but forgot the exit() call.
Supply it now.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
When we're turning on metadata checksumming /and/ resizing the inode
at the same time, disable checksum verification during the
resize_inode() call because the subroutines it calls will try to
verify the checksums (which have not yet been set), causing the
operation to fail unnecessarily.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
resize2fs does its magic by loading a filesystem, duplicating the
in-memory image of that fs, moving relevant blocks out of the way of
whatever new metadata get created, and finally writing everything back
out to disk. Enabling 64bit mode enlarges the group descriptors,
which makes resize2fs a reasonable vehicle for taking care of the rest
of the bookkeeping requirements, so add to resize2fs the ability to
convert a filesystem to 64bit mode and back.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Cc: TR Reardon <thomas_reardon@hotmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Currently the maximum number of bad blocks is not limited in any way.
However, our code can really handle at most INT_MAX/2 bad blocks (for
larger numbers, binary search indexes start overflowing). So report that
the number of bad blocks is too big instead of simply segfaulting.
It wouldn't be too hard to raise the limit, but I don't think there's any
real use for disks with over 1 billion bad blocks...
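The reason for the INT_MAX/2 bound can be seen in a simple midpoint computation. The sketch below uses hypothetical names and is not the badblocks list code: as long as both indexes stay at or below INT_MAX/2, their sum still fits in an int.

```c
#include <limits.h>

#define MAX_BAD_BLOCKS	(INT_MAX / 2)	/* hypothetical name for the limit */

/* Reject counts that would let the search indexes overflow. */
int badblocks_count_ok(unsigned long long count)
{
	return count <= MAX_BAD_BLOCKS;
}

/* Plain midpoint; safe only while low and high are <= INT_MAX / 2. */
int search_midpoint(int low, int high)
{
	return (low + high) / 2;
}
```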
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
At mke2fs time, if we discard the device and discard zeroes data,
don't bother zeroing the inode table blocks a second time.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If we're disabling metadata_csum and the user doesn't provide explicit
instructions to enable or disable uninit_bg, assume that they want
uninit_bg to be turned on by default. Otherwise, we lose all block
group flags and unused inode count, which is a big hit to performance.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Warn the user if we're trying to enable metadata_csum on a FS that
doesn't support extents (since block maps cannot contain checksums).
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Don't display unused inodes twice, and make it clear that we're
printing a descriptor checksum.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Cc: TR Reardon <thomas_reardon@hotmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The current mk_hugefile code in mke2fs doesn't support creating
non-extent files, so disable the functionality when we're mkfs'ing
without extent support.
The fallocate patches further on will eliminate the need for this.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Don't open-code the creation of the extent tree header, since
ext2fs_extent_open2() knows how to take care of this.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If we apply the patch 'e2fsprogs/tune2fs: rewrite metadata checksums
when resizing inode size', we will trigger a segfault because of
inode cache issues.
First, notice that in expand_inode_table(), we have changed
the superblock's s_inode_size to the new inode size (for example, 256).
Then we recompute the metadata checksums; see the code flow below:
|-->rewrite_metadata_checksums
|----->rewrite_inodes
|-------->ext2fs_write_inode_full
In ext2fs_write_inode_full(), if an inode cache is hit, the below code will be executed:
	/* Check to see if the inode cache needs to be updated */
	if (fs->icache) {
		for (i=0; i < fs->icache->cache_size; i++) {
			if (fs->icache->cache[i].ino == ino) {
				memcpy(fs->icache->cache[i].inode, inode,
				       (bufsize > length) ? length : bufsize);
				break;
			}
		}
	}
Before rewrite_inodes() is executed, the inode in the inode cache
was allocated with the old inode size (for example, 128), but here the
memcpy will overflow that buffer, because
'(bufsize > length) ? length : bufsize' returns 256 (the new inode
size). This is wrong and needs to be fixed. I think we should call
ext2fs_free_inode_cache() in expand_inode_table() to drop the inode
cache, because the inode size has changed; if necessary, the inode
cache will be re-created.
Steps to reproduce this bug (apply 'tune2fs: rewrite metadata checksums
when resizing inode size' first):
dd if=/dev/zero of=file.img bs=1M count=128
device_name=$(/sbin/losetup -f)
/sbin/losetup -f file.img
mkfs.ext4 -I 128 -O ^flex_bg $device_name
tune2fs -I 256 $device_name
Signed-off-by: Xiaoguang Wang <wangxg.fnst@cn.fujitsu.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
When we use tune2fs -I new_ino_size to change the inode size and
everything goes well, the corresponding
ext4_group_desc.bg_free_blocks_count will be decreased, so we need to
recompute the group descriptor checksums. The inode size has also
changed, so we also need to recompute the inode checksums on
metadata_csum filesystems. We therefore call
rewrite_metadata_checksums(), which fixes both checksum issues.
Meanwhile, this patch will trigger an existing memory write overflow,
which will cause a segfault; please see the next patch.
Signed-off-by: Xiaoguang Wang <wangxg.fnst@cn.fujitsu.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>