With this change the CPU time needed to shrink a 100G filesystem drops
to 0.8% of the original (17 CPU seconds instead of 2057).
Signed-off-by: Sami Liedes <sami.liedes@iki.fi>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Update the block group descriptor checksum and mark the superblock and
allocation bitmaps as dirty in check_inode_uninit() and
check_block_uninit().
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This function searches a bitmap for the first zero bit within a range.
It checks if there is a bitmap backend specific implementation
available (if the relevant field in bitmap_ops is non-NULL). If not,
it uses a generic and slow method by repeatedly calling test_bmap() in
a loop. Also change ext2fs_new_inode() to use this new function.
This change in itself does not result in a large speedup, rather it
refactors the code in preparation for the introduction of a faster
find_first_zero() for bitarray based bitmaps.
Signed-off-by: Sami Liedes <sami.liedes@iki.fi>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Filesystem shrinking in particular is a heavy user of this loop in
ext2fs_new_inode(). This change makes resize2fs use 24% less CPU time
for shrinking a 100G filesystem.
Signed-off-by: Sami Liedes <sami.liedes@iki.fi>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
We've decided to remove EOFBLOCKS_FL from the ext4 file system entirely,
because it is not actually very useful and it is causing more problems
than it solves. We're going to remove it from e2fsprogs first and then
after the new e2fsprogs version is common enough we can remove the
kernel part as well.
This commit changes e2fsck to not check for EOFBLOCKS_FL. Instead we
simply search for initialized extents past the i_size as this should not
happen. Uninitialized extents can be past the i_size as we can do
fallocate with KEEP_SIZE flag.
Also remove the EXT4_EOFBLOCKS_FL from lib/ext2fs/ext2_fs.h since it is
no longer needed.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This can be useful when using mke2fs on loaded servers, since
otherwise mke2fs can dirty a huge amount of memory very quickly,
leading to other applications not being happy at all.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
In addition to validating the ordering of fields within the inode
and superblock structures, also validate the field sizes. Otherwise
it is possible to incorrectly change the size of one of these fields
without getting any kind of error from these tests. Failures would
only show up later in the test image checks if the field that is
changed is before another in-use field.
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If we have newer kernel headers which define FALLOC_FL_PUNCH_HOLE, but we
are on an older glibc which lacks fallocate, we end up trying to use the
func anyways. Check the ifdef that autoconf already set up for us.
Reported-by: Ortwin Glueck <odi@odi.ch>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
By using m4_flatten, should be easier to maintain these lists.
Regen configure and config.h.in after doing this.
(Modified by tytso to use m4_flatten for the list of header files
checked by AC_CHECK_HEADERS)
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Clean up some compile warnings related to fstat64(), which is
verbosely deprecated on OSX.
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Don't use the system <sys/quota.h> header in mkquota.c, since there
is a local e2fsprogs version of quota.h that is already included and
has the desired quota constants, and avoids symbol conflicts with the
system <sys/quota.h> on other platforms (in particular OSX).
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
We check HAVE_UNISTD_H but haven't included config.h yet, so we end up
hitting warnings about missing prototypes for close/read/etc... funcs.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Building on my glibc-2.15 system hits a warning:
gen_bitmap64.c: In function 'ext2fs_alloc_generic_bmap':
gen_bitmap64.c:127:2: warning: implicit declaration of function
'gettimeofday' [-Wimplicit-function-declaration]
Include sys/time.h if it's available for the prototype.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If the file system is read/only opened with a backup superblock, and
the file system has uninit_bg enabled, the super block must not be
marked as dirty; otherwise, ext2fs_close() will call ext2fs_flush(),
which will fail, since the file descriptor for the block device was
opened read/only, and then the file descriptor won't actually be
closed.
This is normally not a problem since most of the time the program will
exit shortly after calling ext2fs_close(), and many programs don't
bother checking the error return from ext2fs_close(), especially if
the file system was opened read/only.
A big exception to this is e2fsck, since it opens and close the file
systems during its startup, and to make matters worse, registers an
error handler which will noisly complain about the failed writes
caused by ext2fs_flush().
Fix this by not marking the superblock as dirty if the file system was
opened read/only. The changes to the block group descriptors to clear
the uninit bits will still happen, so that e2fsck -n will properly
scan the whole file system. However, those changes will get dropped
when the file system handle is closed.
Addresses-SourceForge-Bug: #3444351
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The replica is a feature which stores multiple copies of the key
metadata blocks so a single block failure in failure-prone media
(read: certain types of flash storage) doesn't take out the entire
file system.
Discussion on the upstream list proved not to be very positive on this
feature; the arguments were that it added complexity that wasn't
warrented, since common practice in industry is to insist on reliable
media, and if media is unreliable, you're kind of toast anyway (unless
the file system is being used as the back-end store of a cluster file
system where checksuming and data replication is happening above the
local disk file system level). So, this feature is being developed
out of tree.
We reserve the code points so that other people won't accidentally
step on them. Since it's not upstream, it's a soft reservation, but
it's not like we have any shortage of RO_COMPAT features. We are a
bit more tight on reserved inodes, but EXT2_BOOT_LOADER_INO and
EXT2_UNDEL_DIR_INO are not currently used anywhere, and
EXT2_EXCLUDE_INO is a reservation for another out-of-tree feature.
There are no features currently being discussed which require a
reserved inode, but if a need were to arise, we can claw back code
point reservations that were never used or not in tree, as those will
always be considered lower priority than in-tree features.
Cc: Aditya Kali <adityakali@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Add a function to return the inode number of an open file.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
When processing files that contain extents, the block iterator
functions were not properly handling the BLOCK_ABORT bit. This could
cause problems such as ext2fs_link() adding a directory entry multiple
times.
Thanks to Darrick Wong <djwong@us.ibm.com> for reporting this.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Currently, ext2fs_file_set_size2 punches out data blocks between the
end of the file and infinity when truncate_block <= old_truncate
(i.e. when you've made the file longer). This is not a useful
behavior, particularly since it *fails* to punch out the data blocks
when the file is shortened (i.e. truncate_block < old_truncate). This
seems to be the result of the test being backwards, so fix the code to
punch only when the file is getting shorter.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
In the case where debugfs (or rdebugfs) is installed setgid disk, or
some such, we need to disable the use of environment variables for the
obvious reasons.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
If we have to read the backup group descriptor checksums, the UNINIT
flags are cleared to ensure that all of the inodes in the filesystem
are scanned. However, the code that reset the UNINIT flags did not
reset the group checksum, and this produced many spurious error
messages in e2fsck.
Group descriptor 0 checksum is invalid. FIXED.
Group descriptor 1 checksum is invalid. FIXED.
:
:
Recompute checksums after modifying group descriptors to avoid these
error messages. Remove expected error messages in f_illitable_flexbg.
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The function ext2fs_get_pathname() used to return EXT2_ET_NO_DIRECTORY
if one of the directories in an inode's pathname is not a directory.
This is not very useful in an emergency, when the file system is
corrupted. This commit will cause ext2fs_get_pathname() to return a
partial pathname, which should help system administrators trying to
use debugfs to investigate a corrupted file system.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Newer versions of glibc no longer export the getpagesize() prototype when
using recent versions of POSIX (_XOPEN_SOURCE). So building tdb.c gives
use implicit function declaration warnings. Fix the issue by using the
portable sysconf() function which returns the same answer.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This feature is especially useful for better understanding how e2fsprogs
tools (mainly e2fsck) treats bitmaps and what bitmap backend can be most
suitable for particular bitmap. Backend itself (if implemented) can
provide statistics of its own as well.
[ Changed to provide basic statistics when enabled with the
E2FSPROGS_BITMAPS_STATS environment variable -- tytso]
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Now that we have multiple backend implementations of the bitmap code,
this commit teaches e2fsck to use either the most appropriate backend
for each use case.
Since we don't know for sure if we will get it all right, the default
choices can be overridden via e2fsck.conf. The various definitions
are shown here, with the current defaults (which may change as we add
more bitmap implementations and as learn what works better).
; EXT2FS_BAMP64_BITARRAY is 1
; EXT2FS_BMAP64_RBTREE is 2
; EXT2FS_BMAP64_AUTODIR is 3
[bitmaps]
inode_used_map = 2 ; pass1
inode_dir_map = 3 ; pass1
inode_reg_map = 2 ; pass1
block_found_map = 2 ; pass1
inode_bad_map = 2 ; pass1
inode_imagic_map = 2 ; pass1
block_dup_map = 2 ; pass1
block_ea_map = 2 ; pass1
inode_link_info = 2 ; pass1
inode_dup_map = 2 ; pass1b
inode_done_map = 3 ; pass3
inode_loop_detect = 3 ; pass3
fs_bitmaps = 2
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This change causes the max resident memory of mke2fs, as reported by
/usr/bin/time, to drop from 9296k to 5328k when formatting a 25
gig volume.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This backend type will automatically switch between the bitarray and
the rbtree backend based on the number of directories in the file
system.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
For a long time we had a bitarray backend for storing filesystem
metadata bitmaps, however today this approach might hit its limits with
todays huge data storage devices, because of its memory utilization.
Bitarrays stores bitmaps as ..well, as bitmaps. But this is in most
cases highly unefficient because we need to allocate memory even for the
big parts of bitmaps we will never use, resulting in high memory
utilization especially for huge filesystem, when bitmaps might occupy
gigabytes of space.
This commit adds another backend to store bitmaps. It is based on
rbtrees and it stores just used extents of bitmaps. It means that it can
be more memory efficient in most cases.
I have done some limited benchmarking and it shows that rbtree backend
consumes approx 65% less memory that bitarray on 312GB filesystem aged
with Impression (default config). This number may grow significantly
with the filesystem size, but also it may be a lot lower (even negative)
if the inodes are very fragmented (need more benchmarking).
This commit itself does not enable the use of rbtree backend.
[ Simplified the code by avoiding unneeded memory allocation and
deallocation of del_ext. In addition, fixed a bug discovered by the
tst_bitmaps tests: rb_unamrk_bmap() must return true if the bit was
previously set in bitmap, and zero otherwise -- tytso ]
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This commit adds rbtree library into e2fsprogs so it can be used for
various internal data structures. The rbtree implementation is ripped of
kernel rbtree implementation with small changes needed for it to work
outside kernel.
[ I prefixed the exported symbols and interface with ext2fs_ to keep
avoid pulluting the namespace exported by the libext2fs shared
library. -- tytso ]
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This allows a program to control the bitmap backend implementation
that will get used without needing to change the current library API.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This is only an issue for programs compiled against e2fsprogs 1.41
that manipulate bitmaps directly. Fortunately there are very few
programs which do that, especially those that try to clear a bitmap.
Addresses-Sourceforge-Bugs: #3451486
Reported-by: robi6@users.sourceforge.net
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Optimize how the tdb library so that running with [scratch_files] in
/etc/e2fsck.conf is more efficient. Use a better hash function,
supplied by Rogier Wolff, and supply an estimate of the size of the
hash table to tdb_open instead of using the default (which is way too
small in most cases). Also, disable the tdb locking and fsync calls,
since it's not necessary for our use in this case (which is
essentially as cheap swap space; the tdb files do not contain
persistent data.)
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Turns out the Hurd defines MS_SYNC but doesn't define msync(). Go
figure. So check for both.
Reported by Svante Signell.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
PATH_MAX is not portable (for example, it doesn't exist on the Hurd).
So replace it with a new define, which defines the maximum length of
the base quota name. As it turns out, this is substantially smaller
than PATH_MAX.
Also move the definitions relating to quotaio.c from mkquota.h to
quotaio.h, as a cleanup.
Addresses-Debian-Bug: #649689
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Define EXT2FS_MAX_NESTED_LINKS as 8, and check the link count to make
sure we don't exceed it in ext2fs_find_block_device() and
follow_link(). This fixes a potential infinite loop in
ext2fs_find_block_device() if there are symbolic loop links in the
device directory.
Signed-off-by: Niu Yawei <niu@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The freefrag command provides the functionality of e2freefrag on the
currently open file system in debugfs.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
In quota_compute_usage(), the space usage should be in bytes but
not quota block.
Signed-off-by: Niu Yawei <niu@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>