Make sure the s_mmp_update_interval super block field is set
from the file system parameters block which is passed into the
ext2fs_initialize() function.
Addresses-Lustre-Bug: LU-1888
Signed-off-by: Gregoire Pichon <gregoire.pichon@bull.net>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This optimizies the CPU utilization of the rb_get_bmap_range()
function when most of the bitmap is allocated.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
This simplifies the rb_get_bmap_range() function and speeds it up for
the case where most of the bitmap is zero.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
This function efficiently counts the number of bits in a block of
memory.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
This speeds up reading bitmaps from disk for very large (and full)
disks by significant amounts (i.e., up to two CPU minutes for a 4T
file system).
Addresses-Google-Bug: #7534813
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Restructure the ext2fs_get_device_size() and blkid_get_dev_size()
code to localize the variables used for different device probing
methods. This at least reduces the #ifdef mess to only one part
of the code for each method, and avoids "unused variable" compiler
warnings added when variables are declared without being #ifdef'd.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Change the license of the mmp.c file to LGPL to match the license
of other files in the libext2fs library.
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Profiling shows that rb_test_bit() is now calling ext2fs_rb_next() a
lot, and this function is now the hot spot when running e2freefrag.
If we cache the results of ext2fs_rb_next(), we can eliminate those
extra calls, which further speeds up both e2freefrag and e2fsck by
reducing the amount of CPU time spent in userspace.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The code was previously allocating a single 4 or 8 byte pointer for
the rcursor and wcursor fields in the ext2fs_rb_private structure;
this added two extra memory allocations (which could fail), and extra
indirections, for no good reason. Removing the extra indirection also
makes the code more readable, so it's all upside and no downside.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Optimize testing for a bit in an rbtree-based bitmap for the case
where the calling application is scanning through the bitmap
sequentially. Previously, we did this for a set of bits which were
inside an allocated extent, but we did not optimize the case where
there was a large number of bits after an allocated extents which were
not in use.
1111111111111110000000000000000000
^ optimized ^not optimized
In my tests of a roughly half-filled file system, the run time of
e2freefrag was halved, and the cpu time spent in userspace was during
e2fsck's pass 5 was reduced by a factor of 30%.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Throttle updates for the "Allocating Groups" progress updates to once
a second as well. We now do this throttling in libext2fs, so we don't
have to do this for each of mke2fs's progress updates, and because the
updates from ext2fs_allocate_tables() come from within libext2fs
anyway.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Since clang uses C99 semantics by default, the main changes required
to allow clang to build e2fsprogs was to add support the C99 inline
semantics, while still allowing us to be built when the legacy (but
still default for gcc) GNU C89 inline semantics are in force.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
mke2fs -m option can set reserved blocks ratio up to 50%. But if the
last block group is not big enough to support the necessary data
structures, it gets dropped, we have to recalculate the number of
reserved blocks so that the reserved blocks matches the requested
percentage.
It also avoids a problem where if the user specifies a reserved blocks
of 50%, and after the last partial block group was dropped, if the
number of reserved blocks is greater than 50%, e2fsck will complain.
Steps to reproduce:
1. Create a FS which has the overhead for the last BG
and specify 50 % for reserved blocks ratio
# mke2fs -m 50 -t ext4 DEV 1025M
mke2fs 1.42.5 (29-Jul-2012)
warning: 256 blocks unused.
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
656640 inodes, 2621440 blocks
1310848 blocks (50.00%) reserved for the super user
~~~~~~~ <-- Reserved blocks exceed 50% of FS blocks count!
2. e2fsck outputs filesystem corruption
# e2fsck DEV
e2fsck 1.42.5 (29-Jul-2012)
Corruption found in superblock. (r_blocks_count = 1310848).
The superblock could not be read or does not describe a correct ext2
filesystem. If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
e2fsck -b 32768 <device>
Signed-off-by: Akira Fujita <a-fujita@rs.jp.ne.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Commit a7c17431b9 attempted to fix a problem where the system
libraries might get used instead of local libraries for things like
-lcom_err. It tried to accomplish this by moving $(ELF_OTHER_LIBS) to
before $(LDFLAGS).
Unfortunately, this was the wrong fix; $(ELF_OTHER_LIBS) *MUST* be
after the object files, or the linker might not pull in the necessary
library and not include it into the DT_NEEDED section of the shared
library. The proper fix is to add a -L$(LIB) before $(LDFLAGS), and
then remove the -L option from all of the ELF_OTHER_LIBS definitions
in the library Makefiles.
Addresses-Sourceforge-Bug: #3554345
Cc: Olivier Blin <olivier.blin@softathome.com>
Reported-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The following commands:
dd if=/dev/zero of=/tmp/foo count=1 ibs=$(( 256 * 1024 * 1024 ))
mke2fs -N 256 -t ext4 /tmp/foo
... will cause mke2fs to write until it fills the device. The cause
for this is that the explicit request for 256 inodes causes the number
of inodes per block group to be 8. The ext2fs_initialize() function
assumed that all of the reserved inodes would be in the first block
group, which is not true in this case. This caused the number of
uninitialized inodes in the first block group to be negative, which
then resulted in mke2fs trying to zero out a very large number of
blocks. Oops.
Addresses-Sourceforge-Bug: #3528892
Reported-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
ELF_OTHER_LIBS usually contains local search dirs (-L ../..), but it
was added in link command after system search dirs from LDFLAGS.
Libraries and executables were linked with the system libraries if
present, and possibly using static archives instead of shared
libraries.
It could also make final executable link to fail when shared libraries
are enabled: if libext2fs.so is linked with a static libcom_err.a from
system, build system would attempt to link without -lpthread.
This fixes the issue by moving ELF_OTHER_LIBS before LDFLAGS in the
link command.
Addresses-Sourceforge-Bug: #3542572
Reported-by: Olivier Blin <blino@users.sourceforge.net>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
We failed to clear EXT2_FLAG_SUPER_ONLY after deleting the
quota inode and so, the updated block bitmap was not written
back. This caused fsck to complain after running
'tune2fs -O ^quota <dev>'. Clear this flag so that updated
block bitmap gets written. Also, avoid truncating the quota
inode if it is not hidden.
Signed-off-by: Aditya Kali <adityakali@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Currently 'tune2fs -O quota <dev>' will try to use existing
quota files and write their inode numbers in the superblock.
Next e2fsck run then converts these into hidden quota inodes
(ino #3 & #4). But this approach has problems:
1) Before e2fsck run, the inodes are visible to the user and
might get corrupted or removed or replaced by the user.
2) Since these are user visible, we have to include
their block usage in the quota accounting. But once
these inodes are hidden, e2fsck will have to decrement
their usage from the quota accounting (which e2fsck
currently doesn't do and instead reports error).
(the following used to give e2fsck error previously:
# assume <dev> has aquota.user & aquota.group files
$ tune2fs -O quota <dev> # stores ino# of quota files in
# ext4 superblock
$ e2fsck -f <dev> # hides quota files, but now quota
# usage is incorrect.
<< quota errors >>
Instead of making e2fsck complicated, this patch creates the
hidden quota inodes at 'tune2fs -O quota' time iteself. The
usage is computed freshly and limits are copied from the
aquota.user and aquota.group files as earlier.
Signed-off-by: Aditya Kali <adityakali@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The Build Log Hardening Check is a debian tool which scans the output
of a package build making sure that the security hardening flags are
used when compiling and linking all of binaries in a package.
For the most part we were passing CFLAGS, CPPFLAGS, and LDFLAGS down
to the compiler and link commands, but there there were one or two
exceptions. In addition, there where a few places in "make install"
where the V=1 option was not being honored, which triggered blhc
warnings since it couldn't analyze those commands.
The e2fsck.static was the only binary that was not getting built and
packaged with the hardening flags, but I've fixed all of the blhc
warnings so in the future it will be obvious if we regress.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The attempted inclusion of sys/quota.h is causing failures in when
building on the hurd and freebsd platforms for Debian. It's not
necessary any more, so just remove the #include.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
When e2fsck uses the block iterator to release the blocks in an
extent-mapped inode, when the last block in an extent is removed, the
current extent has been removed and the extent cursor is now pointing
at the next inode. But the block iterator code doesn't know that. So
when it tries to go the next extent, it will end up skipping an
extent, and so the inode will be incompletely truncated.
The fix is to go to the next extent before calling the callback
function for the current extent. This way, regardless of whether the
current extent gets removed, the extent cursor is still pointing at
the right place.
Reported-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
When libext2fs allocates/deletes an extent leaf, the i_blocks
value is incremented/decremented by fs->blocksize / 512. This
is incorrect in case of bigalloc. The correct way here is to
use cluster_size / 512.
The problem is seen if we try to create a large inode using
libext2fs (say using ext2fs_block_iterate3()) on a bigalloc
filesystem. fsck catches this and complains.
Signed-off-by: Aditya Kali <adityakali@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Quite some definitions in quota library are not necessary. Remove them.
Also fold quota.h file into quotaio.h since it didn't contain that many
definitions.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The creation of inline wrappers ext2fs_open_file() and ext2fs_stat()
in commit c859cb1de0 in ext2fs.h caused
difficulties with the use of headers, since the headers for open64()
and stat64() may already be included (and skip the declaration of the
64-bit variants) before ext2fs.h is ever read. There is no real way
to solve the missing prototypes and resulting compiler warnings inside
ext2fs.h.
Since ext2fs_open_file() and ext2fs_stat() are not performance
critical operations, they do not need to be inline functions at all,
and the needed function headers can be handled properly in one file.
Similarly, posix_memalloc() was having difficulties with headers, and
was being defined in ext2fs.h, but it is now only being used by a
single file, so move the required header there.
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The quotactl() system call was being used without the use of a
function prototype. On closer examination, it turns out the one user
of that system call was the quota_is_on() function, which is not used
by e2fsprogs at all. Since libquota is an e2fsprogs-internal library,
and not one that we plan to export any time soon, the simplest thing
to do is to simply remove quota_is_on(), which in turn allows us to
remove all of the infrastructure around using the Linux-specific
quotactl() system call.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This code uses time() but doesn't include time.h leading to:
quotaio.c:89:2: warning: implicit declaration of function 'time'
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
For a completely full filesystem with more than 2^32 blocks, the
rbtree bitmap backend can assemble an extent of used blocks which is
longer than 2^32. If it does, it will overflow ->count, and corrupt
the rbtree for the bitmaps.
Discovered by completely filling a 32T filesystem using fallocate, and
then observing debugfs, dumpe2fs, and e2fsck all behaving badly.
(Note that filling with only 31 x 1T files did not show the problem,
because freespace was fragmented enough that there was no sufficiently
long range of used blocks.)
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Change the include path in the Cflags field so that #include
<lib/foo.h> and <foo.h> will work. We had originally used a C flags
which allowed <foo.h> to work, but many applications (especially those
not using pkg-config) had been using the <lob/foo.h> formulation which
didn't require an explicit -I{$includedir} option to the C compiler.
If those applications then converted over to pkg-config, and the
e2fsprogs libraries were installed with a prefix other than /usr, so
that the header files were in some directory such as
/usr/local/include, a program that used #include <lib/foo.h> would
fail to compile.
So change the pkg-config files to include both -I{$includedir} and
-I{$includir}/lib.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The code was assuming that "unsigned long" was 64-bit, which of course
it isn't on 32-bit systems. This caused blocks to get written to the
wrong place.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Add a configure option, --enable-relative-symlinks, which will use
relative symlinks for the ELF shared library files.
Addresses-Sourceforge-Bug: #3520767
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Add a few dependencies where needed, so that "make -j17 check" now
works.
Signed-off-by: Matthias Andree <matthias.andree@gmx.de>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
MacOS 10.5 doesn't have posix_memalign() nor memalign(), but it does
have valloc(). The Android SDK would like to be built on MacOS 10.5,
so I've added support for a good-enough emulation of memalign()'s
functionality using valloc(), with an explicit test to make sure
valloc() is returning a pointer which is sufficiently aligned given
the requested alignment. This won't work if you try to operate on a
file system with a 16k blocksize using an e2fsprogs built on MacOS
10.5 system, but it is good enough for the common case of 4k
blocksize file systems, and we will let the memory allocation fail in
the alignment is not good enough.
I've also added a unit test for ext2fs_get_memalign() so we can be
sure it's working as expected. I've tested the code paths with
HAVE_POSIX_MEMALIGN defined, HAVE_POSIX_MEMALIGN undefined, and
HAVE_POSIX_MEMALIGN and HAVE_MEMALIGN undefined on an x86 Linux
system, and so I know the valloc() code path works OK. The simplistic
(and less safe) patch at:
https://trac.macports.org/attachment/ticket/33692/patch-lib-ext2fs-inline.c.diff
Shows that using valloc() apparently works OK for MacOS 10.5 (but if
it doesn't the unit test will catch a problem).
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Create a new function, io_channel_alloc_buf() which allocates I/O
buffers with appropriate alignment if we are using direct I/O. The
original code was sometimes using a larger alignment factor than
necessary, and would always request an aligned memory buffer even when
it was not necessary since the block device was not opened with
O_DIRECT.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Read in a full block for each allocation bitmap, to avoid using a
kernel bounce buffer when using direct I/O.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Create a new function, ext2fs_get_dio_alignment(), which returns the
alignment requirements for direct I/O. This way we can factor out the
code from MMP and the Unix I/O manager. The two modules weren't
consistently calculating the alignment factors, and in particular MMP
would sometimes use a larger alignment factor than was strictly
necessary.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The align field which indicated the required data alignment of data
buffers was stored in a field specific to the unix_io manager. Move
it to the top-level io_channel structure so it can be better
generalized.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Currently fsck recomputes quotas and overwrites quota files
whenever its run. This causes unnecessary modification of
filesystem even when quotas were never inconsistent. We also
lose the limits information because of this. With this patch,
e2fsck compares the computed quotas to the on-disk quotas
(while updating the in-memory limits) and writes out the
quota inode only if it is inconsistent.
Signed-off-by: Aditya Kali <adityakali@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
compile_et puts #include <et/com_err.h> in
generated header files so pkg-config --cflags
com_err should provide the path to the directory
containing et/.
Improve the test coverage of tst_bitmaps by:
(a) adding the ability to test the legacy (32-bit) bitmap code
(b) adding tests for ext2fs_find_first_zero_inode_bitmap2() and
ext2fs_find_first_zero_block_bitmap2()
The recent regressions caused by the addition (and use) of
ext2fs_find_first_zero_inode_bitmap2() would have been caught if we
had added these tests first. (Another object lesson in why unit tests
are critically important!)
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Fortunately nothing was using this inline function, so we'll just fix
the types in its function signature, which were nonsensical (this was
caused by a cut-and-paste error).
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The lack of 32-bit support was causing febootstrap to crash since it
wasn't passing EXT2_FLAG_64BITS when opening the file system, so we
were still using the legacy bitmaps.
Also add support for bigalloc bitmap into the ffz functions.
Addresses-Red-Hat-Bugzilla: #808421
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Change autoconf to test for setmntent() and use that to decide whether
to use getmntent() and setmntent(), since some systems don't have
setmntent() but they do have the mntent.h header file.
Also, remove the includes of mntent.h from e2fsck and mke2fs and other
places where it is not needed.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The ext2fs_add_journal_inode() function calls
ext2fs_check_mount_point(), which can fail if /etc/mtab is missing.
This causes mke2fs to fail in the middle of the file system format
process; mke2fs calls ext2fs_check_mount_point() already (and has
appropriate fallbacks that calls fails), so add a flag so that mke2fs
can request ext2fs_add_journal_inode() to skip trying to call
e2fsck_check_mount_point().
Addresses-Sourceforge-Bug: #3509398
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
With this change the CPU time needed to shrink a 100G filesystem drops
to 0.8% of the original (17 CPU seconds instead of 2057).
Signed-off-by: Sami Liedes <sami.liedes@iki.fi>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Update the block group descriptor checksum and mark the superblock and
allocation bitmaps as dirty in check_inode_uninit() and
check_block_uninit().
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This function searches a bitmap for the first zero bit within a range.
It checks if there is a bitmap backend specific implementation
available (if the relevant field in bitmap_ops is non-NULL). If not,
it uses a generic and slow method by repeatedly calling test_bmap() in
a loop. Also change ext2fs_new_inode() to use this new function.
This change in itself does not result in a large speedup, rather it
refactors the code in preparation for the introduction of a faster
find_first_zero() for bitarray based bitmaps.
Signed-off-by: Sami Liedes <sami.liedes@iki.fi>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Filesystem shrinking in particular is a heavy user of this loop in
ext2fs_new_inode(). This change makes resize2fs use 24% less CPU time
for shrinking a 100G filesystem.
Signed-off-by: Sami Liedes <sami.liedes@iki.fi>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
We've decided to remove EOFBLOCKS_FL from the ext4 file system entirely,
because it is not actually very useful and it is causing more problems
than it solves. We're going to remove it from e2fsprogs first and then
after the new e2fsprogs version is common enough we can remove the
kernel part as well.
This commit changes e2fsck to not check for EOFBLOCKS_FL. Instead we
simply search for initialized extents past the i_size as this should not
happen. Uninitialized extents can be past the i_size as we can do
fallocate with KEEP_SIZE flag.
Also remove the EXT4_EOFBLOCKS_FL from lib/ext2fs/ext2_fs.h since it is
no longer needed.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This can be useful when using mke2fs on loaded servers, since
otherwise mke2fs can dirty a huge amount of memory very quickly,
leading to other applications not being happy at all.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
In addition to validating the ordering of fields within the inode
and superblock structures, also validate the field sizes. Otherwise
it is possible to incorrectly change the size of one of these fields
without getting any kind of error from these tests. Failures would
only show up later in the test image checks if the field that is
changed is before another in-use field.
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If we have newer kernel headers which define FALLOC_FL_PUNCH_HOLE, but we
are on an older glibc which lacks fallocate, we end up trying to use the
func anyways. Check the ifdef that autoconf already set up for us.
Reported-by: Ortwin Glueck <odi@odi.ch>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
By using m4_flatten, should be easier to maintain these lists.
Regen configure and config.h.in after doing this.
(Modified by tytso to use m4_flatten for the list of header files
checked by AC_CHECK_HEADERS)
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Clean up some compile warnings related to fstat64(), which is
verbosely deprecated on OSX.
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Don't use the system <sys/quota.h> header in mkquota.c, since there
is a local e2fsprogs version of quota.h that is already included and
has the desired quota constants, and avoids symbol conflicts with the
system <sys/quota.h> on other platforms (in particular OSX).
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
We check HAVE_UNISTD_H but haven't included config.h yet, so we end up
hitting warnings about missing prototypes for close/read/etc... funcs.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Building on my glibc-2.15 system hits a warning:
gen_bitmap64.c: In function 'ext2fs_alloc_generic_bmap':
gen_bitmap64.c:127:2: warning: implicit declaration of function
'gettimeofday' [-Wimplicit-function-declaration]
Include sys/time.h if it's available for the prototype.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If the file system is read/only opened with a backup superblock, and
the file system has uninit_bg enabled, the super block must not be
marked as dirty; otherwise, ext2fs_close() will call ext2fs_flush(),
which will fail, since the file descriptor for the block device was
opened read/only, and then the file descriptor won't actually be
closed.
This is normally not a problem since most of the time the program will
exit shortly after calling ext2fs_close(), and many programs don't
bother checking the error return from ext2fs_close(), especially if
the file system was opened read/only.
A big exception to this is e2fsck, since it opens and close the file
systems during its startup, and to make matters worse, registers an
error handler which will noisly complain about the failed writes
caused by ext2fs_flush().
Fix this by not marking the superblock as dirty if the file system was
opened read/only. The changes to the block group descriptors to clear
the uninit bits will still happen, so that e2fsck -n will properly
scan the whole file system. However, those changes will get dropped
when the file system handle is closed.
Addresses-SourceForge-Bug: #3444351
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The replica is a feature which stores multiple copies of the key
metadata blocks so a single block failure in failure-prone media
(read: certain types of flash storage) doesn't take out the entire
file system.
Discussion on the upstream list proved not to be very positive on this
feature; the arguments were that it added complexity that wasn't
warrented, since common practice in industry is to insist on reliable
media, and if media is unreliable, you're kind of toast anyway (unless
the file system is being used as the back-end store of a cluster file
system where checksuming and data replication is happening above the
local disk file system level). So, this feature is being developed
out of tree.
We reserve the code points so that other people won't accidentally
step on them. Since it's not upstream, it's a soft reservation, but
it's not like we have any shortage of RO_COMPAT features. We are a
bit more tight on reserved inodes, but EXT2_BOOT_LOADER_INO and
EXT2_UNDEL_DIR_INO are not currently used anywhere, and
EXT2_EXCLUDE_INO is a reservation for another out-of-tree feature.
There are no features currently being discussed which require a
reserved inode, but if a need were to arise, we can claw back code
point reservations that were never used or not in tree, as those will
always be considered lower priority than in-tree features.
Cc: Aditya Kali <adityakali@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Add a function to return the inode number of an open file.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
When processing files that contain extents, the block iterator
functions were not properly handling the BLOCK_ABORT bit. This could
cause problems such as ext2fs_link() adding a directory entry multiple
times.
Thanks to Darrick Wong <djwong@us.ibm.com> for reporting this.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Currently, ext2fs_file_set_size2 punches out data blocks between the
end of the file and infinity when truncate_block <= old_truncate
(i.e. when you've made the file longer). This is not a useful
behavior, particularly since it *fails* to punch out the data blocks
when the file is shortened (i.e. truncate_block < old_truncate). This
seems to be the result of the test being backwards, so fix the code to
punch only when the file is getting shorter.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
In the case where debugfs (or rdebugfs) is installed setgid disk, or
some such, we need to disable the use of environment variables for the
obvious reasons.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
If we have to read the backup group descriptor checksums, the UNINIT
flags are cleared to ensure that all of the inodes in the filesystem
are scanned. However, the code that reset the UNINIT flags did not
reset the group checksum, and this produced many spurious error
messages in e2fsck.
Group descriptor 0 checksum is invalid. FIXED.
Group descriptor 1 checksum is invalid. FIXED.
:
:
Recompute checksums after modifying group descriptors to avoid these
error messages. Remove expected error messages in f_illitable_flexbg.
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The function ext2fs_get_pathname() used to return EXT2_ET_NO_DIRECTORY
if one of the directories in an inode's pathname is not a directory.
This is not very useful in an emergency, when the file system is
corrupted. This commit will cause ext2fs_get_pathname() to return a
partial pathname, which should help system administrators trying to
use debugfs to investigate a corrupted file system.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Newer versions of glibc no longer export the getpagesize() prototype when
using recent versions of POSIX (_XOPEN_SOURCE). So building tdb.c gives
use implicit function declaration warnings. Fix the issue by using the
portable sysconf() function which returns the same answer.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This feature is especially useful for better understanding how e2fsprogs
tools (mainly e2fsck) treats bitmaps and what bitmap backend can be most
suitable for particular bitmap. Backend itself (if implemented) can
provide statistics of its own as well.
[ Changed to provide basic statistics when enabled with the
E2FSPROGS_BITMAPS_STATS environment variable -- tytso]
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Now that we have multiple backend implementations of the bitmap code,
this commit teaches e2fsck to use either the most appropriate backend
for each use case.
Since we don't know for sure if we will get it all right, the default
choices can be overridden via e2fsck.conf. The various definitions
are shown here, with the current defaults (which may change as we add
more bitmap implementations and as learn what works better).
; EXT2FS_BAMP64_BITARRAY is 1
; EXT2FS_BMAP64_RBTREE is 2
; EXT2FS_BMAP64_AUTODIR is 3
[bitmaps]
inode_used_map = 2 ; pass1
inode_dir_map = 3 ; pass1
inode_reg_map = 2 ; pass1
block_found_map = 2 ; pass1
inode_bad_map = 2 ; pass1
inode_imagic_map = 2 ; pass1
block_dup_map = 2 ; pass1
block_ea_map = 2 ; pass1
inode_link_info = 2 ; pass1
inode_dup_map = 2 ; pass1b
inode_done_map = 3 ; pass3
inode_loop_detect = 3 ; pass3
fs_bitmaps = 2
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This change causes the max resident memory of mke2fs, as reported by
/usr/bin/time, to drop from 9296k to 5328k when formatting a 25
gig volume.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This backend type will automatically switch between the bitarray and
the rbtree backend based on the number of directories in the file
system.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
For a long time we had a bitarray backend for storing filesystem
metadata bitmaps, however today this approach might hit its limits with
todays huge data storage devices, because of its memory utilization.
Bitarrays stores bitmaps as ..well, as bitmaps. But this is in most
cases highly unefficient because we need to allocate memory even for the
big parts of bitmaps we will never use, resulting in high memory
utilization especially for huge filesystem, when bitmaps might occupy
gigabytes of space.
This commit adds another backend to store bitmaps. It is based on
rbtrees and it stores just used extents of bitmaps. It means that it can
be more memory efficient in most cases.
I have done some limited benchmarking and it shows that rbtree backend
consumes approx 65% less memory that bitarray on 312GB filesystem aged
with Impression (default config). This number may grow significantly
with the filesystem size, but also it may be a lot lower (even negative)
if the inodes are very fragmented (need more benchmarking).
This commit itself does not enable the use of rbtree backend.
[ Simplified the code by avoiding unneeded memory allocation and
deallocation of del_ext. In addition, fixed a bug discovered by the
tst_bitmaps tests: rb_unamrk_bmap() must return true if the bit was
previously set in bitmap, and zero otherwise -- tytso ]
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This commit adds rbtree library into e2fsprogs so it can be used for
various internal data structures. The rbtree implementation is ripped of
kernel rbtree implementation with small changes needed for it to work
outside kernel.
[ I prefixed the exported symbols and interface with ext2fs_ to keep
avoid pulluting the namespace exported by the libext2fs shared
library. -- tytso ]
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This allows a program to control the bitmap backend implementation
that will get used without needing to change the current library API.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This is only an issue for programs compiled against e2fsprogs 1.41
that manipulate bitmaps directly. Fortunately there are very few
programs which do that, especially those that try to clear a bitmap.
Addresses-Sourceforge-Bugs: #3451486
Reported-by: robi6@users.sourceforge.net
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Optimize how the tdb library so that running with [scratch_files] in
/etc/e2fsck.conf is more efficient. Use a better hash function,
supplied by Rogier Wolff, and supply an estimate of the size of the
hash table to tdb_open instead of using the default (which is way too
small in most cases). Also, disable the tdb locking and fsync calls,
since it's not necessary for our use in this case (which is
essentially as cheap swap space; the tdb files do not contain
persistent data.)
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Turns out the Hurd defines MS_SYNC but doesn't define msync(). Go
figure. So check for both.
Reported by Svante Signell.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
PATH_MAX is not portable (for example, it doesn't exist on the Hurd).
So replace it with a new define, which defines the maximum length of
the base quota name. As it turns out, this is substantially smaller
than PATH_MAX.
Also move the definitions relating to quotaio.c from mkquota.h to
quotaio.h, as a cleanup.
Addresses-Debian-Bug: #649689
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Define EXT2FS_MAX_NESTED_LINKS as 8, and check the link count to make
sure we don't exceed it in ext2fs_find_block_device() and
follow_link(). This fixes a potential infinite loop in
ext2fs_find_block_device() if there are symbolic loop links in the
device directory.
Signed-off-by: Niu Yawei <niu@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The freefrag command provides the functionality of e2freefrag on the
currently open file system in debugfs.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
In quota_compute_usage(), the space usage should be in bytes but
not quota block.
Signed-off-by: Niu Yawei <niu@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>