This speeds up reading bitmaps from disk for very large (and full)
disks by significant amounts (i.e., up to two CPU minutes for a 4T
file system).
Addresses-Google-Bug: #7534813
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Restructure the ext2fs_get_device_size() and blkid_get_dev_size()
code to localize the variables used for different device probing
methods. This at least reduces the #ifdef mess to only one part
of the code for each method, and avoids "unused variable" compiler
warnings added when variables are declared without being #ifdef'd.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Change the license of the mmp.c file to LGPL to match the license
of other files in the libext2fs library.
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Profiling shows that rb_test_bit() is now calling ext2fs_rb_next() a
lot, and this function is now the hot spot when running e2freefrag.
If we cache the results of ext2fs_rb_next(), we can eliminate those
extra calls, which further speeds up both e2freefrag and e2fsck by
reducing the amount of CPU time spent in userspace.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The code was previously allocating a single 4 or 8 byte pointer for
the rcursor and wcursor fields in the ext2fs_rb_private structure;
this added two extra memory allocations (which could fail), and extra
indirections, for no good reason. Removing the extra indirection also
makes the code more readable, so it's all upside and no downside.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Optimize testing for a bit in an rbtree-based bitmap for the case
where the calling application is scanning through the bitmap
sequentially. Previously, we did this for a set of bits which were
inside an allocated extent, but we did not optimize the case where
there was a large number of bits after an allocated extents which were
not in use.
1111111111111110000000000000000000
^ optimized ^not optimized
In my tests of a roughly half-filled file system, the run time of
e2freefrag was halved, and the cpu time spent in userspace was during
e2fsck's pass 5 was reduced by a factor of 30%.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Throttle updates for the "Allocating Groups" progress updates to once
a second as well. We now do this throttling in libext2fs, so we don't
have to do this for each of mke2fs's progress updates, and because the
updates from ext2fs_allocate_tables() come from within libext2fs
anyway.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Since clang uses C99 semantics by default, the main changes required
to allow clang to build e2fsprogs was to add support the C99 inline
semantics, while still allowing us to be built when the legacy (but
still default for gcc) GNU C89 inline semantics are in force.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
mke2fs -m option can set reserved blocks ratio up to 50%. But if the
last block group is not big enough to support the necessary data
structures, it gets dropped, we have to recalculate the number of
reserved blocks so that the reserved blocks matches the requested
percentage.
It also avoids a problem where if the user specifies a reserved blocks
of 50%, and after the last partial block group was dropped, if the
number of reserved blocks is greater than 50%, e2fsck will complain.
Steps to reproduce:
1. Create a FS which has the overhead for the last BG
and specify 50 % for reserved blocks ratio
# mke2fs -m 50 -t ext4 DEV 1025M
mke2fs 1.42.5 (29-Jul-2012)
warning: 256 blocks unused.
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
656640 inodes, 2621440 blocks
1310848 blocks (50.00%) reserved for the super user
~~~~~~~ <-- Reserved blocks exceed 50% of FS blocks count!
2. e2fsck outputs filesystem corruption
# e2fsck DEV
e2fsck 1.42.5 (29-Jul-2012)
Corruption found in superblock. (r_blocks_count = 1310848).
The superblock could not be read or does not describe a correct ext2
filesystem. If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
e2fsck -b 32768 <device>
Signed-off-by: Akira Fujita <a-fujita@rs.jp.ne.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Commit a7c17431b9 attempted to fix a problem where the system
libraries might get used instead of local libraries for things like
-lcom_err. It tried to accomplish this by moving $(ELF_OTHER_LIBS) to
before $(LDFLAGS).
Unfortunately, this was the wrong fix; $(ELF_OTHER_LIBS) *MUST* be
after the object files, or the linker might not pull in the necessary
library and not include it into the DT_NEEDED section of the shared
library. The proper fix is to add a -L$(LIB) before $(LDFLAGS), and
then remove the -L option from all of the ELF_OTHER_LIBS definitions
in the library Makefiles.
Addresses-Sourceforge-Bug: #3554345
Cc: Olivier Blin <olivier.blin@softathome.com>
Reported-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The following commands:
dd if=/dev/zero of=/tmp/foo count=1 ibs=$(( 256 * 1024 * 1024 ))
mke2fs -N 256 -t ext4 /tmp/foo
... will cause mke2fs to write until it fills the device. The cause
for this is that the explicit request for 256 inodes causes the number
of inodes per block group to be 8. The ext2fs_initialize() function
assumed that all of the reserved inodes would be in the first block
group, which is not true in this case. This caused the number of
uninitialized inodes in the first block group to be negative, which
then resulted in mke2fs trying to zero out a very large number of
blocks. Oops.
Addresses-Sourceforge-Bug: #3528892
Reported-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
ELF_OTHER_LIBS usually contains local search dirs (-L ../..), but it
was added in link command after system search dirs from LDFLAGS.
Libraries and executables were linked with the system libraries if
present, and possibly using static archives instead of shared
libraries.
It could also make final executable link to fail when shared libraries
are enabled: if libext2fs.so is linked with a static libcom_err.a from
system, build system would attempt to link without -lpthread.
This fixes the issue by moving ELF_OTHER_LIBS before LDFLAGS in the
link command.
Addresses-Sourceforge-Bug: #3542572
Reported-by: Olivier Blin <blino@users.sourceforge.net>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
We failed to clear EXT2_FLAG_SUPER_ONLY after deleting the
quota inode and so, the updated block bitmap was not written
back. This caused fsck to complain after running
'tune2fs -O ^quota <dev>'. Clear this flag so that updated
block bitmap gets written. Also, avoid truncating the quota
inode if it is not hidden.
Signed-off-by: Aditya Kali <adityakali@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Currently 'tune2fs -O quota <dev>' will try to use existing
quota files and write their inode numbers in the superblock.
Next e2fsck run then converts these into hidden quota inodes
(ino #3 & #4). But this approach has problems:
1) Before e2fsck run, the inodes are visible to the user and
might get corrupted or removed or replaced by the user.
2) Since these are user visible, we have to include
their block usage in the quota accounting. But once
these inodes are hidden, e2fsck will have to decrement
their usage from the quota accounting (which e2fsck
currently doesn't do and instead reports error).
(the following used to give e2fsck error previously:
# assume <dev> has aquota.user & aquota.group files
$ tune2fs -O quota <dev> # stores ino# of quota files in
# ext4 superblock
$ e2fsck -f <dev> # hides quota files, but now quota
# usage is incorrect.
<< quota errors >>
Instead of making e2fsck complicated, this patch creates the
hidden quota inodes at 'tune2fs -O quota' time iteself. The
usage is computed freshly and limits are copied from the
aquota.user and aquota.group files as earlier.
Signed-off-by: Aditya Kali <adityakali@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The Build Log Hardening Check is a debian tool which scans the output
of a package build making sure that the security hardening flags are
used when compiling and linking all of binaries in a package.
For the most part we were passing CFLAGS, CPPFLAGS, and LDFLAGS down
to the compiler and link commands, but there there were one or two
exceptions. In addition, there where a few places in "make install"
where the V=1 option was not being honored, which triggered blhc
warnings since it couldn't analyze those commands.
The e2fsck.static was the only binary that was not getting built and
packaged with the hardening flags, but I've fixed all of the blhc
warnings so in the future it will be obvious if we regress.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The attempted inclusion of sys/quota.h is causing failures in when
building on the hurd and freebsd platforms for Debian. It's not
necessary any more, so just remove the #include.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
When e2fsck uses the block iterator to release the blocks in an
extent-mapped inode, when the last block in an extent is removed, the
current extent has been removed and the extent cursor is now pointing
at the next inode. But the block iterator code doesn't know that. So
when it tries to go the next extent, it will end up skipping an
extent, and so the inode will be incompletely truncated.
The fix is to go to the next extent before calling the callback
function for the current extent. This way, regardless of whether the
current extent gets removed, the extent cursor is still pointing at
the right place.
Reported-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
When libext2fs allocates/deletes an extent leaf, the i_blocks
value is incremented/decremented by fs->blocksize / 512. This
is incorrect in case of bigalloc. The correct way here is to
use cluster_size / 512.
The problem is seen if we try to create a large inode using
libext2fs (say using ext2fs_block_iterate3()) on a bigalloc
filesystem. fsck catches this and complains.
Signed-off-by: Aditya Kali <adityakali@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Quite some definitions in quota library are not necessary. Remove them.
Also fold quota.h file into quotaio.h since it didn't contain that many
definitions.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The creation of inline wrappers ext2fs_open_file() and ext2fs_stat()
in commit c859cb1de0 in ext2fs.h caused
difficulties with the use of headers, since the headers for open64()
and stat64() may already be included (and skip the declaration of the
64-bit variants) before ext2fs.h is ever read. There is no real way
to solve the missing prototypes and resulting compiler warnings inside
ext2fs.h.
Since ext2fs_open_file() and ext2fs_stat() are not performance
critical operations, they do not need to be inline functions at all,
and the needed function headers can be handled properly in one file.
Similarly, posix_memalloc() was having difficulties with headers, and
was being defined in ext2fs.h, but it is now only being used by a
single file, so move the required header there.
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The quotactl() system call was being used without the use of a
function prototype. On closer examination, it turns out the one user
of that system call was the quota_is_on() function, which is not used
by e2fsprogs at all. Since libquota is an e2fsprogs-internal library,
and not one that we plan to export any time soon, the simplest thing
to do is to simply remove quota_is_on(), which in turn allows us to
remove all of the infrastructure around using the Linux-specific
quotactl() system call.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This code uses time() but doesn't include time.h leading to:
quotaio.c:89:2: warning: implicit declaration of function 'time'
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
For a completely full filesystem with more than 2^32 blocks, the
rbtree bitmap backend can assemble an extent of used blocks which is
longer than 2^32. If it does, it will overflow ->count, and corrupt
the rbtree for the bitmaps.
Discovered by completely filling a 32T filesystem using fallocate, and
then observing debugfs, dumpe2fs, and e2fsck all behaving badly.
(Note that filling with only 31 x 1T files did not show the problem,
because freespace was fragmented enough that there was no sufficiently
long range of used blocks.)
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Change the include path in the Cflags field so that #include
<lib/foo.h> and <foo.h> will work. We had originally used a C flags
which allowed <foo.h> to work, but many applications (especially those
not using pkg-config) had been using the <lob/foo.h> formulation which
didn't require an explicit -I{$includedir} option to the C compiler.
If those applications then converted over to pkg-config, and the
e2fsprogs libraries were installed with a prefix other than /usr, so
that the header files were in some directory such as
/usr/local/include, a program that used #include <lib/foo.h> would
fail to compile.
So change the pkg-config files to include both -I{$includedir} and
-I{$includir}/lib.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The code was assuming that "unsigned long" was 64-bit, which of course
it isn't on 32-bit systems. This caused blocks to get written to the
wrong place.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Add a configure option, --enable-relative-symlinks, which will use
relative symlinks for the ELF shared library files.
Addresses-Sourceforge-Bug: #3520767
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Add a few dependencies where needed, so that "make -j17 check" now
works.
Signed-off-by: Matthias Andree <matthias.andree@gmx.de>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
MacOS 10.5 doesn't have posix_memalign() nor memalign(), but it does
have valloc(). The Android SDK would like to be built on MacOS 10.5,
so I've added support for a good-enough emulation of memalign()'s
functionality using valloc(), with an explicit test to make sure
valloc() is returning a pointer which is sufficiently aligned given
the requested alignment. This won't work if you try to operate on a
file system with a 16k blocksize using an e2fsprogs built on MacOS
10.5 system, but it is good enough for the common case of 4k
blocksize file systems, and we will let the memory allocation fail in
the alignment is not good enough.
I've also added a unit test for ext2fs_get_memalign() so we can be
sure it's working as expected. I've tested the code paths with
HAVE_POSIX_MEMALIGN defined, HAVE_POSIX_MEMALIGN undefined, and
HAVE_POSIX_MEMALIGN and HAVE_MEMALIGN undefined on an x86 Linux
system, and so I know the valloc() code path works OK. The simplistic
(and less safe) patch at:
https://trac.macports.org/attachment/ticket/33692/patch-lib-ext2fs-inline.c.diff
Shows that using valloc() apparently works OK for MacOS 10.5 (but if
it doesn't the unit test will catch a problem).
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Create a new function, io_channel_alloc_buf() which allocates I/O
buffers with appropriate alignment if we are using direct I/O. The
original code was sometimes using a larger alignment factor than
necessary, and would always request an aligned memory buffer even when
it was not necessary since the block device was not opened with
O_DIRECT.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Read in a full block for each allocation bitmap, to avoid using a
kernel bounce buffer when using direct I/O.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Create a new function, ext2fs_get_dio_alignment(), which returns the
alignment requirements for direct I/O. This way we can factor out the
code from MMP and the Unix I/O manager. The two modules weren't
consistently calculating the alignment factors, and in particular MMP
would sometimes use a larger alignment factor than was strictly
necessary.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The align field which indicated the required data alignment of data
buffers was stored in a field specific to the unix_io manager. Move
it to the top-level io_channel structure so it can be better
generalized.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Currently fsck recomputes quotas and overwrites quota files
whenever its run. This causes unnecessary modification of
filesystem even when quotas were never inconsistent. We also
lose the limits information because of this. With this patch,
e2fsck compares the computed quotas to the on-disk quotas
(while updating the in-memory limits) and writes out the
quota inode only if it is inconsistent.
Signed-off-by: Aditya Kali <adityakali@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
compile_et puts #include <et/com_err.h> in
generated header files so pkg-config --cflags
com_err should provide the path to the directory
containing et/.
Improve the test coverage of tst_bitmaps by:
(a) adding the ability to test the legacy (32-bit) bitmap code
(b) adding tests for ext2fs_find_first_zero_inode_bitmap2() and
ext2fs_find_first_zero_block_bitmap2()
The recent regressions caused by the addition (and use) of
ext2fs_find_first_zero_inode_bitmap2() would have been caught if we
had added these tests first. (Another object lesson in why unit tests
are critically important!)
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Fortunately nothing was using this inline function, so we'll just fix
the types in its function signature, which were nonsensical (this was
caused by a cut-and-paste error).
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The lack of 32-bit support was causing febootstrap to crash since it
wasn't passing EXT2_FLAG_64BITS when opening the file system, so we
were still using the legacy bitmaps.
Also add support for bigalloc bitmap into the ffz functions.
Addresses-Red-Hat-Bugzilla: #808421
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Change autoconf to test for setmntent() and use that to decide whether
to use getmntent() and setmntent(), since some systems don't have
setmntent() but they do have the mntent.h header file.
Also, remove the includes of mntent.h from e2fsck and mke2fs and other
places where it is not needed.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The ext2fs_add_journal_inode() function calls
ext2fs_check_mount_point(), which can fail if /etc/mtab is missing.
This causes mke2fs to fail in the middle of the file system format
process; mke2fs calls ext2fs_check_mount_point() already (and has
appropriate fallbacks that calls fails), so add a flag so that mke2fs
can request ext2fs_add_journal_inode() to skip trying to call
e2fsck_check_mount_point().
Addresses-Sourceforge-Bug: #3509398
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
With this change the CPU time needed to shrink a 100G filesystem drops
to 0.8% of the original (17 CPU seconds instead of 2057).
Signed-off-by: Sami Liedes <sami.liedes@iki.fi>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Update the block group descriptor checksum and mark the superblock and
allocation bitmaps as dirty in check_inode_uninit() and
check_block_uninit().
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>