Use the new ext2fs_punch() call to truncate the quota file. This also
eliminates the need to fix it to work with bigalloc.
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
In the ^extent case, passing ~0ULL as the 'end' parameter to
ext2fs_punch() causes the (end - start + 1) calculation to overflow to
zero. Since the old-style mapped block files cannot have more than
2^32 blocks, just clamp it to ~0U.
This fixes a regression in t_quota_2off with the patch "libext2fs: use
ext2fs_punch() to truncate quota file" applied.
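The wrap-around described above can be reproduced in plain C. This is only a sketch: range_len() and clamp_end_32() are hypothetical helpers, not libext2fs functions, and the wrap to zero is shown for start == 0.

```c
#include <stdint.h>

/* Length of the inclusive range [start, end]. Unsigned arithmetic wraps
 * modulo 2^64, so start == 0 with end == ~0ULL yields 0. */
static uint64_t range_len(uint64_t start, uint64_t end)
{
	return end - start + 1;
}

/* Clamp 'end' to the largest 32-bit block number (~0U). */
static uint64_t clamp_end_32(uint64_t end)
{
	return end > UINT32_MAX ? (uint64_t)UINT32_MAX : end;
}
```

With the clamp applied, the length for start == 0 becomes 2^32 instead of 0.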
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Tweak the wording to be a little less ambiguous, since 'block' can be
a noun or a verb.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
When we set the file size, find the block containing EOF, and zero
everything in that block past EOF so that we can't return stale data
if we ever use fallocate or truncate to lengthen the file.
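A minimal sketch of the zeroing step, assuming the block containing EOF has already been read into memory. zero_past_eof() is a hypothetical helper and not the function used in the library.

```c
#include <stdint.h>
#include <string.h>

/* Zero the part of the EOF block that lies beyond 'size' bytes.
 * 'blk' holds the contents of the block that contains EOF. */
static void zero_past_eof(unsigned char *blk, size_t blocksize, uint64_t size)
{
	size_t off = (size_t)(size % blocksize);

	if (off == 0)
		return;	/* EOF is block-aligned: no partial block to clean */
	memset(blk + off, 0, blocksize - off);
}
```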
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
If we're asked to punch a file with no data blocks mapped to it and a
non-zero length, we don't need to do any work in ext2fs_punch_extent()
and can return success. Unfortunately, the extent_get() function
returns "no current node" because it (correctly) failed to find any
extents, which is bubbled up to callers. Since no extents being found
is not an error in this corner case, fix up ext2fs_punch_extent() to
return 0 to callers.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
When deleting an entire extent, we cannot always slip to the previous
leaf extent because there might not /be/ a previous extent.
Attempting to correct for that error by asking for the 'current' leaf
extent also doesn't work, because the failed attempt to change to the
previous extent leaves us with no current extent.
Fix this problem by recording the lblk of the next extent before
deleting the current extent and _goto()ing to the next extent after
the deletion.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
If we're using ext2fs_file_write() to write to a hole in a file,
ensure that we can actually allocate the block before updating i_size.
In other words, don't update i_size and don't return success if we hit
an error while allocating space.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Fix up a few places where we ignore return values.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Fix memory allocation calculations and check for NULL pointer returns.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
ext2fs_free_mem() takes a pointer to a pointer, similar to
ext2fs_get_mem(). Improve the documentation, and fix debugfs.
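The pointer-to-pointer convention can be sketched with two stand-in functions. get_mem() and free_mem() are hypothetical names that only mimic the calling convention; the memcpy-based bodies are an assumption, not a copy of the library code.

```c
#include <stdlib.h>
#include <string.h>

/* Allocate 'size' bytes and store the result through 'ptr', which must
 * be the ADDRESS of the caller's pointer variable. */
static int get_mem(size_t size, void *ptr)
{
	void *p = malloc(size);

	if (!p)
		return -1;
	memcpy(ptr, &p, sizeof(p));
	return 0;
}

/* Free the allocation and reset the caller's pointer to NULL.
 * 'ptr' is again the address of the pointer, not the pointer itself. */
static void free_mem(void *ptr)
{
	void *p;

	memcpy(&p, ptr, sizeof(p));
	free(p);
	p = NULL;
	memcpy(ptr, &p, sizeof(p));
}
```

A caller would write free_mem(&buf), never free_mem(buf); passing the buffer itself is the kind of mistake this commit fixes.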
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
When reading or writing file blocks, use the IO manager routines that
can handle 64bit block numbers.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
If we have to create a big symlink (i.e. one that doesn't fit into
i_block[]), we are not 64bit block safe and the namei code does not
handle extents at all. Fix both.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Forbid clients from trying to map logical block numbers that are
larger than the lblk->pblk data structures are capable of handling.
While we're at it, don't let clients set the file size to a number
that's beyond what can be mapped.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
For each site where we test for a large file (> 2GB) and set the
LARGE_FILE feature, use a helper function to make the size test
consistent with the test that's in e2fsck. This fixes the fsck
complaints when we try to create a 2GB journal (not so hard with 64k
block size) and fixes the incorrect test in fileio.c.
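A sketch of such a size test follows. needs_large_file() is a hypothetical name, and the threshold of exactly 2^31 bytes is an assumption derived from the 2GB journal case mentioned above.

```c
#include <stdint.h>

/* A file needs the LARGE_FILE feature once its size no longer fits in a
 * signed 32-bit i_size, i.e. at 2^31 bytes and above (assumed threshold). */
static int needs_large_file(uint64_t size)
{
	return size >= 0x80000000ULL;
}
```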
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
On a FS with a rather large block size (> 4K), the old block map
structure can construct a fat enough "tree" (or whatever we call that
lopsided thing) that (at least in theory) one could create mappings
for logical blocks higher than 32 bits. In practice this doesn't
happen, but the 'max' and 'iter' variables that the punch helpers use
will overflow because the BLOCK_SIZE_BITS shifts are too large to fit
a 32-bit variable. The current variable declarations also cause punch
to fail on TIND-mapped blocks even if the file is < 16T. So enlarge
the fields to fit.
Yes, this is an obscure corner case, but it seems a little silly if we
can't punch a file's block 300,000,000 on a 64k-block filesystem.
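To see why 32-bit variables are too small here, the following sketch computes how many logical blocks a single indirect block covers at each level. level_span() is a hypothetical helper and assumes 4-byte block pointers.

```c
#include <stdint.h>

/* Number of logical blocks addressed by one indirect block at 'level'
 * (1 = IND, 2 = DIND, 3 = TIND), assuming 4-byte block pointers. */
static uint64_t level_span(uint32_t blocksize, int level)
{
	uint64_t addrs = blocksize / 4;
	uint64_t span = 1;

	while (level-- > 0)
		span *= addrs;
	return span;
}
```

With 64k blocks a TIND block spans 2^42 logical blocks, which truncates to 0 when stored in a 32-bit variable.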
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Fix the checking of s_mmp_block in e2fsck_pass1() and
ext2fs_mmp_read() to handle the high 32 bits of s_blocks_count.
Remove redundant check of s_mmp_block in do_dump_mmp() right before
ext2fs_mmp_read() is called.
Also fix s_blocks_count_hi in check_backup_super_block(), since it
cannot use the ext2fs_blocks_count() helper easily.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
A recent patch to fix blk_t to blk64_t assignment mismatches in
e2fsprogs (commit 4dbfd79d14) created
a printf conversion spec / argument type mismatch in tst_iscan.c.
Fix this to avoid truncation of the printed value and to silence
a compiler warning seen when "make check" is run.
Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
These memory leaks were discovered by using "valgrind
--leak-check=full" while running "e2image -I bar.img foo.e2i"
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
e2image manually opens a new IO channel, and then sets the file system
to use this new IO channel via ext2fs_rewrite_to_io(). We need to
make sure the IO channel is set to the file system's block size to
avoid some nasty buffer overruns.
[ Modified by tytso to use io_channel_set_blksize() ]
Signed-off-by: Kit Westneat <kwestneat@ddn.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Don't accept block numbers larger than 2^32 for the badblocks list,
and don't run badblocks on them either.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
When we've successfully linked an inode into a directory, we can stop
iterating the directory.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Update all superblock copies when disabling the quota feature.
Added basic tests for the quota feature.
Signed-off-by: Niu Yawei <niu@whamcloud.com>
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
An inode with inline data has no data blocks, so we can not iterate
over such an inode. Return an error code which indicates this fact;
callers can use this to determine whether or not the inode has inline
data, and then call some routine to iterate over the directory entries
in the inline data or read the inline data, as appropriate.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Add the EXT4_FEATURE_INCOMPAT_INLINE_DATA flag to
EXT2_LIB_SOFTSUPP_INCOMPAT, because the inline_data feature still
needs a long period of testing.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The ext2fs_link function has the unfortunate habit of converting
hashed directories into unhashed directories. It doesn't notice that
it's slicing and dicing directory entries from a former dx_{root,node}
block, and therefore doesn't write a protective dirent into the end of
the block to store the checksum. Teach it to do this.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Apparently libext2fs didn't have an error code defined for block
bitmap checksum errors, so add one.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Run sparse against source files when building e2fsprogs with 'make C=1'. If
instead C=2, it configures basic ext2 types for bitwise checking with sparse,
which can help find the (many many) spots where conversion errors are
(possibly) happening.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Currently, only the new 64-bit bitmap implementation supports the
block<->cluster conversions that bigalloc requires. Therefore, if we
have a bigalloc filesystem, require EXT2_FLAGS_64BITS be passed in to
ext2fs_open(). This does not mean that bigalloc file systems have to
be 64-bits; just that the userspace utilities have to be able to use
the new 64-bit capable library functions.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
implied_cluster_alloc() is written such that if the user passes in
a logical block that is the zeroth block in a logical cluster (lblk %
cluster_ratio == 0), then it will assume that there is no physical
cluster mapped to any other part of the logical cluster.
This is not true if we happen to be allocating logical blocks in
reverse order. Therefore, search the whole cluster, except for the
lblk that we passed in.
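The search can be sketched as follows. This is an illustration under stated assumptions, not the library code: implied_pblk() is a hypothetical name, the lblk->pblk mapping is modeled as a plain array (0 means unmapped), and cluster_ratio is assumed to be a power of two.

```c
#include <stdint.h>

/* map[i] is the physical block backing logical block i, or 0 if unmapped.
 * Returns the physical block implied for 'lblk' by any other mapped block
 * in the same logical cluster, or 0 if the cluster has no mapping yet. */
static uint64_t implied_pblk(const uint64_t *map, uint64_t lblk,
			     unsigned int cluster_ratio)
{
	uint64_t base = lblk & ~((uint64_t)cluster_ratio - 1);
	unsigned int i;

	for (i = 0; i < cluster_ratio; i++) {
		if (base + i == lblk)
			continue;	/* skip the block being allocated */
		if (map[base + i])
			return map[base + i] - i + (lblk - base);
	}
	return 0;
}
```

If logical block 7 was mapped first and block 4 (the zeroth block of the same cluster) is allocated afterwards, the loop still finds the existing physical cluster.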
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
When told to truncate a file, ext2fs_file_set_size2() should start with
the first block past the end of the file. The current calculation
jumps one more block ahead, with the result that it fails to hack off
the last block. Adding blocksize-1 and dividing is sufficient to find
the last block.
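The rounding described above is ordinary ceiling division. A sketch, with first_block_past_eof() as a hypothetical helper that ignores overflow for sizes close to 2^64:

```c
#include <stdint.h>

/* Index of the first block that lies entirely past EOF: the number of
 * blocks needed to hold 'size' bytes, rounded up. */
static uint64_t first_block_past_eof(uint64_t size, uint32_t blocksize)
{
	return (size + blocksize - 1) / blocksize;
}
```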
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
ext2fs_file_write() needs to update i_size on a successful write;
otherwise, ext2fs_file_read() in the same open/close cycle will not
be able to read the just-written data.
This fixes a bug where the quotacheck triggered by 'tune2fs -O quota'
failed to write back accounting information for multiple
users/groups.
Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The inode and block relocation functions aren't currently compiled in
(so we don't need to worry about breaking ABI compatibility). They
were originally intended for use by resize2fs, but we never ended up
using them, so (wisely) they weren't ever included in libext2fs as an
exported interface (they're not even compiled by the Makefile).
Fix them up, in case we ever use them, in the places where raw
data types (int, long, etc.) stood in for blk_t and blk64_t. Also fix
some sites where we should probably be using blk64_t.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Fix all the places where we should be using a blk64_t instead of a
blk_t. These fixes are more severe because 64bit values could be
truncated silently.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
When we're iterating the main loop in ind_punch(), "offset" tracks how
far we've progressed into the block map, "start" tells us where to
start punching, and "count" tells us how many blocks we are to punch
after "start". Therefore, we would like to break out of the loop once
the "offset" that we're looking at has progressed past the end of the
punch range. Unfortunately, if start != 0, the if-break clause in the
loop causes us to break out of the loop early.
Therefore, change the breakout test to terminate the loop at the
correct time.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The range of blocks to punch is treated as an inclusive range on both
ends, i.e. if start=1 and end=2, both blocks 1 and 2 are punched out.
Thus, start == end means that the caller wishes to punch a single
block. Remove the check that prevents us from punching a single
block.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
During a punch operation, if we decide to delete an extent out of the
extent tree, the subsequent extents are moved on top of the current
extent (that is to say, they're memmove'd down one slot). Therefore
it is not correct to advance to the next leaf because that means we
miss half the extents in the range! Rereading the current pointer
should be fine.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
If someone tries to write a file that is larger than 2GB, we need to
set the large_file feature flag to affirm that i_size_hi can hold
meaningful contents.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The ext2fs_link helper function link_proc does not check the value of
ls->done, which means that if the function finds multiple empty spaces
that will fit the new directory entry, it will create a directory
entry in each of the spaces. Instead of doing that, check the done
value and don't do anything more if we've already added the directory
entry.
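The guard can be sketched as a per-slot callback. struct link_state and link_cb() are hypothetical stand-ins for the real link_proc machinery; the sketch only shows the effect of checking the done flag.

```c
/* State shared across invocations of the per-slot callback. */
struct link_state {
	int done;	/* set once the new entry has been written */
	int inserted;	/* how many entries were created (should stay 1) */
};

/* Called once for every free slot large enough for the new entry.
 * Returns 1 if this call created the entry, 0 otherwise. */
static int link_cb(struct link_state *ls)
{
	if (ls->done)
		return 0;	/* entry already added; leave this slot alone */
	ls->inserted++;
	ls->done = 1;
	return 1;
}
```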
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
When we define an error in lib/ext2fs/ext2_err.et.in, we will always use
EXT2_ET_* prefix for a new error. But EXT2_NO_MTAB_FILE doesn't obey
this rule. So fix it.
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
If we have an extent tree like this (from debugfs's "ex" command):
Level Entries       Logical        Physical   Length Flags
...
 2/ 2  60/ 63 13096 - 13117 650024 - 650045     22
 2/ 2  61/ 63 13134 - 13142 650062 - 650070      9
 2/ 2  62/ 63 13193 - 13194 650121 - 650122      2
 2/ 2  63/ 63 13227 - 13227 650155 - 650155      1        A)
 1/ 2   4/ 14 13228 - 17108 655367            3881        B)
 2/ 2   1/117 13228 - 13251 650156 - 650179     24        C)
 2/ 2   2/117 13275 - 13287 650203 - 650215     13
 2/ 2   3/117 13348 - 13353 650276 - 650281      6
...
and we resize the fs in such a way that all of those blocks must
be moved down, we do them one at a time. Eventually we move 1-block
extent A) to a lower block, and then follow it with the other
blocks in the next logical offsets from extent C) in the next
interior node B).
The userspace extent code tries to merge, so when it finds that
logical 13228 can be merged with logical 13227 into a single extent,
it does. And so on, all through extent C), up to block 13250 (why
not 13251? [1]), and eventually move the node block as well.
So we end up with this when all the blocks are moved post-resize:
Level Entries       Logical        Physical   Length Flags
...
 2/ 2 120/122 13193 - 13193  33220 -  33220      1
 2/ 2 121/122 13194 - 13194  33221 -  33221      1
 2/ 2 122/122 13227 - 13250  33222 -  33245     24        D)
 1/ 2   5/ 19 13228 - 17108  34676            3881        E) ***
 2/ 2   1/222 13251 - 13251  33246 -  33246      1        F)
 2/ 2   2/222 13275 - 13286  33247 -  33258     12
...
All those adjacent blocks got moved into extent D), which is nice -
but the next interior node E) was never updated to reflect its new
starting point - it says the leaf extents beneath it start at 13228,
when in fact they start at 13251.
So as we move blocks one by one out of original extent C) above, we
need to keep updating C)'s parent node B) for a proper starting point.
fix_parents() does this.
Once the tree is corrupted like this, more corruption can
ensue post-resize, because we traverse the tree by interior nodes,
relying on their start block to know where we are in the tree.
If it gets off, we'll end up inserting blocks into the wrong part
of the tree, etc.
I have a testcase using fsx to create a complex extent tree which
is then moved during resize; it hit this corruption quite easily,
and with this fix, it succeeds.
Note the first hunk in the commit is for going the other way,
moving the last block of an extent to the extent after it; this
needs the same sort of fix-up, although I haven't seen it in
practice.
[1] We leave the last block because a single-block extent is its
own case, and there is no merging code in that case. \o/
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
It turns out that resize2fs uses ext2fs_dup_handle to duplicate fs handles. If
MMP is enabled, this causes both handles to share MMP buffers, which is bad
news when it comes time to free both handles. Change the code to (we hope) fix
this. This prevents resize2fs from failing with a double-free error when
handed a MMP filesystem.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The environment variable EXT2FS_NO_MTAB_OK will suppress the error
code EXT2_NO_MTAB_FILE when the /etc/mtab file can not be found. This
allows the e2fsprogs regression test suite to be run in chroots which
might not have an /etc/mtab file.
By default we still want to complain if the /etc/mtab file is
missing, since we really don't want to encourage distributions and
purveyors of embedded systems to run without an /etc/mtab file.
But if it's missing it only results in a missing sanity check that
might cause file system corruption if the file system is mounted when
programs such as e2fsck, tune2fs, or resize2fs are running, so there
are no potential security problems that might result if this
environment variable is set inappropriately.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Avoid compatibility problems by using the byte swapping functions
defined by e2fsprogs, instead of the ones defined in the system header
files. We use them everywhere else, so we should use them in
kernel-jbd.h too.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Add a test to see if the backtrace() function requires linking in a
library in /usr/lib.
Addresses-Debian-Bug: #708307
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
If secure_getenv() is available, use it in preference to __secure_getenv().
Starting with (e)glibc version 2.17, secure_getenv() exists, while
__secure_getenv() only works with shared library links (where it is a
weak symbol), but not for static links with /lib/libc.a
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This file was never getting compiled, and there is no user of
ext2fs_list_backups() in the e2fsprogs sources. So remove it as a
clean up.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Accessing name_len (and file_type) in ext4_dir_entry structure is
somewhat problematic because on big endian architecture we need to know
whether we are really dealing with ext4_dir_entry (which has u16
name_len which needs byte swapping) or ext4_dir_entry_2 (which has u8
name_len which must not be byte swapped).
Currently the code is somewhat surprising: name_len is always
treated as u16 and byte swapped (the EXT2_DIRBLOCK_V2_STRUCT flag is
never used), and masking of name_len is then used to access the real
name_len or file_type. Doing things this way in applications using
libext2fs is unexpected, to say the least (it is more natural to cast
struct ext4_dir_entry * to struct ext4_dir_entry_2 *, but that gives
wrong results on big endian architectures). So provide helper functions
that give endian-safe access to these fields. Also convert users in
e2fsprogs to use these functions.
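The masking scheme can be sketched on a bare 16-bit value. The helper names below are hypothetical and operate on a plain integer rather than on the library's directory entry structure; 'raw' is assumed to be the combined name_len field after conversion to host byte order.

```c
#include <stdint.h>

/* Low byte of the combined field: the real name length. */
static unsigned int dirent_name_len(uint16_t raw)
{
	return raw & 0xff;
}

/* High byte of the combined field: the file type. */
static unsigned int dirent_file_type(uint16_t raw)
{
	return raw >> 8;
}

/* Build the combined field from a name length and a file type. */
static uint16_t dirent_pack(unsigned int len, unsigned int type)
{
	return (uint16_t)((len & 0xff) | ((type & 0xff) << 8));
}
```

Callers that go through accessors like these never need to know which of the two structure layouts they were handed.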
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The ext2fs_read_inode_full() function should not use fs->read_inode()
if the caller has requested more than the base 128 byte inode
structure and the inode size is greater than 128 bytes. Otherwise the
caller won't get all of the bytes that they were asking for, since
there's no way for the fs->read_inode override function to know the
size of the buffer passed to ext2fs_read_inode_full().
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This prevents a SIGSEGV when the -s option is used.
Signed-off-by: Tomas Racek <tracek@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
New function ext2fs_symlink() doesn't have a prototype in ext2fs.h and
thus debugfs compilation gives warning:
debugfs.c:2219:2: warning: implicit declaration of function 'ext2fs_symlink'
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
parse_num_blocks2() wrongly did:
num << 1;
when log_block_size < 0. That is obviously wrong as such statement has
no effect (and the compiler properly warns about it). Callers expect
returned value to be in bytes when log_block_size < 0 so fix the
statement accordingly.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
ext2fs_llseek() was using lseek instead of lseek64. The
only time it would use lseek64 is if passed an offset that
overflowed 32 bits. This works for SEEK_SET, but not
SEEK_CUR, which can apply a small offset to move the file
pointer past the 32 bit limit.
The code has been changed to instead try lseek64 first, and
fall back to lseek if that fails. It also was doing a
runtime check of the size of off_t. This has been moved to
compile time.
This fixes a problem which would cause e2image when built for
x86-32 to bomb out when used with large file systems.
Signed-off-by: Phillip Susi <psusi@ubuntu.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The ext2fs_{mark,unmark,test}_block_bitmap2() functions understand
about clusters, and will take block numbers and convert them to
clusters before checking the bitmap. The
ext2fs_*_block_bitmap_range2() functions did not do this, which made
them inconsistent. Fortunately, nothing has depended on this
incorrect behavior, and in fact most of the usage of these functions
has only recently been added, and only for optimizations that were
only enabled for non-bigalloc file systems.
So this is a change in previously exported functions, but (a) it
doesn't change the behavior at all for non-bigalloc file systems, and
(b) the change is more likely to fix bugs for bigalloc file systems.
For example, this change fixes a problem with resize2fs and bigalloc
file systems.
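The block-to-cluster conversion for a range can be sketched like this. struct cluster_range and blocks_to_clusters() are hypothetical; the sketch assumes the range must cover every cluster that any block in [start, start + num) touches, and that num is non-zero.

```c
#include <assert.h>
#include <stdint.h>

struct cluster_range {
	uint64_t first;	/* first cluster touched by the block range */
	uint64_t count;	/* number of clusters touched */
};

/* Convert the block range [start, start + num) into the range of
 * clusters it touches, for a cluster size of 2^cluster_bits blocks. */
static struct cluster_range blocks_to_clusters(uint64_t start, uint64_t num,
					       unsigned int cluster_bits)
{
	struct cluster_range r;
	uint64_t last;

	assert(num > 0);
	last = (start + num - 1) >> cluster_bits;
	r.first = start >> cluster_bits;
	r.count = last - r.first + 1;
	return r;
}
```

With cluster_bits == 0 (a non-bigalloc file system) the function returns the input range unchanged, which matches point (a) above.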
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Creating symlinks is a complex affair when accounting for slowlinks.
Create a new function, ext2fs_symlink(), modeled after ext2fs_mkdir().
Like ext2fs_mkdir(), ext2fs_symlink() takes on the task of allocating a
new inode and block (for slowlinks), setting up sane default values in
the inode, copying the target path to either the inode (for fastlinks)
or to the first block (for slowlinks), and accounting for the inode and
block stats. Disallow link targets longer than blocksize as the Linux
kernel prevents this.
It does not attempt to expand the parent directory, instead returning
EXT2_ET_DIR_NO_SPACE and leaving it to the caller to expand just as
ext2fs_mkdir() does. Ideally, I think both of these functions should
make a single attempt to expand the directory.
[ Fixed a few bugs discovered when creating a test case for ext2fs_symlink() ]
Signed-off-by: Darren Hart <dvhart@infradead.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Andreas Dilger <adilger@dilger.ca>
To maintain the error codes numbering, we need to pull in the changes
from the 1.43.x development branch for libext2fs's error table.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
If the user attempts to create a 512MB cluster, we need to adjust the
defaults to avoid a 32-bit overflow of s_blocks_per_group. Also check
to make sure that the caller of ext2fs_initialize() has not given a
value of s_clusters_per_group that would result in an overflow of
s_blocks_per_group.
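A sketch of the overflow test, done in 64-bit arithmetic so the check itself cannot wrap. blocks_per_group_overflows() is a hypothetical helper, and comparing against UINT32_MAX is an assumption; the real limit enforced by the library may be stricter.

```c
#include <stdint.h>

/* Returns 1 if clusters_per_group clusters of 2^cluster_ratio_bits
 * blocks each would not fit in a 32-bit s_blocks_per_group. */
static int blocks_per_group_overflows(uint32_t clusters_per_group,
				      unsigned int cluster_ratio_bits)
{
	uint64_t bpg = (uint64_t)clusters_per_group << cluster_ratio_bits;

	return bpg > UINT32_MAX;
}
```

Assuming 4 KiB blocks, a 512MB cluster is 2^17 blocks, so 32768 clusters per group gives exactly 2^32 blocks per group, one more than a 32-bit field can hold.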
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Previously the behavior of parse_num_blocks2 was undefined if
log_block_size was less than zero. It will now return a number in
units of bytes.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
The MMP code was added in the wrong place, so ret_fs could get set
(and EXT2_FLAG_NOFREE_ON_ERROR was cleared as well, which could
confuse e2fsck, which depends on this flag being cleared only if
ext2fs_open2() succeeded).
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
There are a number of places where we multiply a dgrp_t with
s_blocks_per_group expecting that we will get a blk64_t. This
requires a cast, or using the convenience function
ext2fs_group_first_block2().
This audit was suggested by Eric Sandeen.
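The effect of the missing cast can be shown with two hypothetical helpers that differ only in where the widening happens. The sketch assumes unsigned int is 32 bits wide; the names are not library functions.

```c
#include <stdint.h>

/* Correct: widen before multiplying, so the product is 64-bit. */
static uint64_t group_first_block(uint32_t first_data_block, uint32_t group,
				  uint32_t blocks_per_group)
{
	return first_data_block + (uint64_t)group * blocks_per_group;
}

/* Buggy: the multiply and add are done in 32 bits and wrap before the
 * result is widened for the return value. */
static uint64_t group_first_block_truncated(uint32_t first_data_block,
					    uint32_t group,
					    uint32_t blocks_per_group)
{
	return first_data_block + group * blocks_per_group;
}
```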
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Use ext2fs_[un]mark_block_bitmap_range2() functions to reduce the CPU
overhead of resizing large file systems by 45%, primarily by
reducing the time spent in fix_uninit_block_bitmaps().
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Fix up the com_err.texinfo file so it will produce a valid printed
output, by cleaning up some errors in the texinfo file, and updating
texinfo.tex to be consistent with the version in the doc subdirectory.
Also add rules so we can generate pdf and ps files from
com_err.texinfo and libext2fs.texinfo.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The macro for log_err() was written so that it needed to always
have an argument, but GCC was unhappy to have an argument when
none was specified in the format string. Use the CPP "##" to
eat the preceding comma if no argument is specified.
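A sketch of the technique, using a hypothetical log_msg() macro rather than the real log_err(). Note that "##" before __VA_ARGS__ is a GNU extension supported by GCC and Clang; the "err: " prefix and the static buffer are assumptions made so the result can be inspected.

```c
#include <stdio.h>

static char log_buf[128];

/* "##" drops the comma before __VA_ARGS__ when no variadic arguments
 * are given, so both log_msg("text") and log_msg("n=%d", n) compile.
 * Evaluates to snprintf()'s return value (the formatted length). */
#define log_msg(fmt, ...) \
	snprintf(log_buf, sizeof(log_buf), "err: " fmt, ##__VA_ARGS__)
```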
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Quiet a number of simple compiler warnings:
- pointers not initialized by ext2fs_get_mem()
- return without value in non-void function
- dereferencing type-punned pointers
- unused variables
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
In a number of places, the output format from "make check" is
incorrectly interpreted as compiler warning output (triggered by
the presence of colons and parentheses in the output). Convert
these lines to similar output that does not trigger false build
warnings.
In the case of the tst_uuid.c program, the "ctime()" output was
difficult to change, but in fact it is better to actually compare
the time-based UUID against wallclock time instead of just printing
the formatted time as a string, so this test is improved.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Force the use of the static libraries when linking the test program so
that "make check" works when the shared libraries have not been
installed, and so that we test against the version of the libraries in
the source tree.
Reported-by: g.esp@free.fr
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This commit adds the functionality which had previously only been in
the tst_extents command to debugfs. The debugfs command extent_open
will open extent tree of a particular inode, and enables a series of
commands which will allow the user to interact with the extent tree
directly. Once the extent tree is closed via extent_close, these
additional commands will be disabled again.
This commit exports two new functions from lib/ext2fs/extent.c which
had previously been statically defined: ext2fs_extent_node_split() and
ext2fs_extent_goto2().
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Previously, ext2fs_extent_fix_parents() would only avoid modifying the
cursor location associated with the extent handle if the cursor was
pointed at a leaf node in the extent tree. This is because it saved
the starting logical block number of the current extent, but not the
"level" of the extent (where level 0 is the leaf node, level 1 is the
interior node which points at blocks containing leaf nodes, etc.)
Fix ext2fs_extent_fix_parents() so it is guaranteed to not change the
current extent in the handle even if the current extent is not at the
bottom of the tree.
Also add a fix_extent command to the tst_extents program to make it
easier to test this function.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
An index node's logical start (ei_block) should
match the logical start of the first node (index
or leaf) below it. If we find a node whose start
does not match its parent, fix all of its parents
accordingly.
If e2fsck finds such a problem, we'll see:
Pass 1: Checking inodes, blocks, and sizes
Interior extent node level 0 of inode 274258:
Logical start 3666 does not match logical start 4093 at next level. Fix<y>?
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The current m68k code was buggy for multiple reasons. First, the bfset
et al. instructions interpret the bit number as a signed number, not an
unsigned number. Secondly, there were missing memory clobbers. Since
there is no real benefit in using explicit asm's at this point (gcc is
smart enough to optimize the generic C code to use the set/clear/test
bit m68k instruction) fix this bug by removing the m68k specific asm
versions of these functions.
Tested on m68k-linux with e2fsprogs-1.42.6 and gcc-4.6.3 as before.
All tests pass and the debug output looks sane.
I compared the e2fsck binaries from the previous build with this
one. They had identical .text sizes, and almost the same number
of bit field instructions (obviously compiler-generated), so this
change should have no serious performance implications.
Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Fix a potential memory leak reported by Li Xi. In addition, there
were possible error cases where the file descriptor would not be
properly closed, so fix those as well while we're at it.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reported-by: Li Xi <pkuelelixi@gmail.com>
Make sure the s_mmp_update_interval super block field is set
from the file system parameters block which is passed into the
ext2fs_initialize() function.
Addresses-Lustre-Bug: LU-1888
Signed-off-by: Gregoire Pichon <gregoire.pichon@bull.net>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The changes to support metadata checksum allocated a single large
array for all of the inodes in the inode cache. This is slightly more
efficient, but given that the inode cache is small (only 4 inodes) it
doesn't really have that much benefit. The problem with doing things
this way is that the memory overruns, such as the one fixed in commit
43c4910371, do not get detected by valgrind.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
An inode cache slot will be overrun if a caller to ext2fs_read_inode_full()
or ext2fs_write_inode_full() attempts to read or write a full sized 156
byte inode when the target filesystem contains 128 byte inodes. Limit the
copied inode to the smaller of the target filesystem's or the caller's
requested inode size.
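The clamp can be sketched as a bounded copy. copy_inode() is a hypothetical helper working on raw byte buffers, not the library's inode cache code.

```c
#include <string.h>

/* Copy at most min(bufsize, fs_inode_size) bytes of inode data and
 * return the number of bytes actually copied. */
static size_t copy_inode(void *dst, size_t bufsize,
			 const void *src, size_t fs_inode_size)
{
	size_t n = bufsize < fs_inode_size ? bufsize : fs_inode_size;

	memcpy(dst, src, n);
	return n;
}
```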
Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This optimizes the CPU utilization of the rb_get_bmap_range()
function when most of the bitmap is allocated.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
This simplifies the rb_get_bmap_range() function and speeds it up for
the case where most of the bitmap is zero.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
This function efficiently counts the number of bits in a block of
memory.
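As an illustration only, a byte-at-a-time version of such a counter
might look like the sketch below. The names popcount8() and
count_bits() are hypothetical; a production version would more likely
process whole words and use a faster per-word population count.

```c
#include <stddef.h>
#include <stdint.h>

/* Count the set bits in one byte by clearing the lowest set bit
 * on each pass through the loop. */
static unsigned popcount8(uint8_t v)
{
    unsigned n = 0;

    while (v) {
        v &= (uint8_t)(v - 1);
        n++;
    }
    return n;
}

/* Hypothetical helper: number of set bits in nbytes of memory. */
static unsigned long count_bits(const void *mem, size_t nbytes)
{
    const uint8_t *p = mem;
    unsigned long total = 0;

    while (nbytes--)
        total += popcount8(*p++);
    return total;
}
```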
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
This speeds up reading bitmaps from disk for very large (and full)
disks by significant amounts (i.e., up to two CPU minutes for a 4T
file system).
Addresses-Google-Bug: #7534813
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Restructure the ext2fs_get_device_size() and blkid_get_dev_size()
code to localize the variables used for different device probing
methods. This at least reduces the #ifdef mess to only one part
of the code for each method, and avoids "unused variable" compiler
warnings added when variables are declared without being #ifdef'd.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Change the license of the mmp.c file to LGPL to match the license
of other files in the libext2fs library.
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
In addition, make the directory iterator more robust in the case
where the file system has the metadata checksum feature enabled, but
the directory checksum is not present in a directory block.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Profiling shows that rb_test_bit() is now calling ext2fs_rb_next() a
lot, and this function is now the hot spot when running e2freefrag.
If we cache the results of ext2fs_rb_next(), we can eliminate those
extra calls, which further speeds up both e2freefrag and e2fsck by
reducing the amount of CPU time spent in userspace.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The code was previously allocating a single 4 or 8 byte pointer for
the rcursor and wcursor fields in the ext2fs_rb_private structure;
this added two extra memory allocations (which could fail), and extra
indirections, for no good reason. Removing the extra indirection also
makes the code more readable, so it's all upside and no downside.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Optimize testing for a bit in an rbtree-based bitmap for the case
where the calling application is scanning through the bitmap
sequentially. Previously, we did this for a set of bits which were
inside an allocated extent, but we did not optimize the case where
there was a large number of bits after an allocated extent which were
not in use.
1111111111111110000000000000000000
^ optimized    ^not optimized
In my tests of a roughly half-filled file system, the run time of
e2freefrag was halved, and the CPU time spent in userspace during
e2fsck's pass 5 was reduced by 30%.
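The idea can be sketched with a sorted array of extents standing in
for the red-black tree. Everything here (struct ext_bitmap, test_bit(),
the searches counter) is hypothetical and only illustrates the caching
technique: a sequential scan answers both "inside the cached extent"
and "in the gap after it" without a new lookup.

```c
#include <stddef.h>
#include <stdint.h>

struct extent { uint64_t start, count; };   /* one run of set bits */

struct ext_bitmap {
    const struct extent *ext;  /* sorted, non-overlapping extents */
    size_t n;                  /* number of extents */
    size_t cur;                /* index of the cached extent */
    int have_cur;              /* non-zero once cur is valid */
    unsigned long searches;    /* slow-path lookups, for illustration */
};

static int test_bit(struct ext_bitmap *bm, uint64_t bit)
{
    size_t lo = 0, hi = bm->n;

    if (bm->have_cur && bit >= bm->ext[bm->cur].start) {
        const struct extent *e = &bm->ext[bm->cur];

        if (bit - e->start < e->count)
            return 1;          /* inside the cached extent */
        if (bm->cur + 1 == bm->n || bit < bm->ext[bm->cur + 1].start)
            return 0;          /* in the unused gap after it */
    }

    /* Slow path: binary search for the first extent starting after bit. */
    bm->searches++;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (bm->ext[mid].start <= bit)
            lo = mid + 1;
        else
            hi = mid;
    }
    if (lo == 0)
        return 0;              /* bit lies before the first extent */
    bm->cur = lo - 1;
    bm->have_cur = 1;
    return bit - bm->ext[bm->cur].start < bm->ext[bm->cur].count;
}
```

Scanning bits 0 through 199 over the extents {10,5} and {100,3} takes
the slow path only while no extent is cached and once per extent
afterwards; every other bit is answered from the cache.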
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Throttle updates for the "Allocating Groups" progress updates to once
a second as well. We now do this throttling in libext2fs, so we don't
have to do this for each of mke2fs's progress updates, and because the
updates from ext2fs_allocate_tables() come from within libext2fs
anyway.
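A rough sketch of once-a-second throttling follows. The struct and
function names are hypothetical, and always letting the final update
through is an assumption made for this example rather than something
stated above.

```c
#include <time.h>

struct progress {
    time_t last_update;   /* second in which an update was last shown */
    int shown_once;       /* zero until the first update is shown */
};

/*
 * Hypothetical helper: returns 1 if an update should be displayed at
 * time `now`, 0 if it should be suppressed.  The first update, the
 * final update, and the first update in each new second are shown.
 */
static int progress_should_update(struct progress *p, time_t now,
                                  int is_final)
{
    if (is_final || !p->shown_once || now != p->last_update) {
        p->last_update = now;
        p->shown_once = 1;
        return 1;
    }
    return 0;
}
```

A caller would pass time(NULL) as `now`; taking the time as a parameter
keeps the helper easy to test.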
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The function ext2fs_init_csum_seed() has nothing to do with the
ext2fs_get_mem()/ext2fs_get_memzero()/ext2fs_get_array()/ext2fs_get_arrayzero()
functions. (This define is there so that on platforms where we need
to use the standard C functions, they can be replaced --- this is
primarily needed when trying to compile libext2fs for strange,
not-quite-standards-compliant platforms, such as Windows.)
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Since clang uses C99 semantics by default, the main change required
to allow clang to build e2fsprogs was to add support for the C99 inline
semantics, while still allowing us to be built when the legacy (but
still default for gcc) GNU C89 inline semantics are in force.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The mke2fs -m option can set the reserved blocks ratio up to 50%. But
if the last block group is not big enough to support the necessary
data structures, it gets dropped, and we have to recalculate the
number of reserved blocks so that it matches the requested percentage.
This also avoids a problem where the user specifies a reserved blocks
ratio of 50% and, after the last partial block group is dropped, the
number of reserved blocks exceeds 50%, which makes e2fsck complain.
Steps to reproduce:
1. Create a FS which has the overhead for the last BG
and specify 50 % for reserved blocks ratio
# mke2fs -m 50 -t ext4 DEV 1025M
mke2fs 1.42.5 (29-Jul-2012)
warning: 256 blocks unused.
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
656640 inodes, 2621440 blocks
1310848 blocks (50.00%) reserved for the super user
~~~~~~~ <-- Reserved blocks exceed 50% of FS blocks count!
2. e2fsck outputs filesystem corruption
# e2fsck DEV
e2fsck 1.42.5 (29-Jul-2012)
Corruption found in superblock. (r_blocks_count = 1310848).
The superblock could not be read or does not describe a correct ext2
filesystem. If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
e2fsck -b 32768 <device>
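The arithmetic behind the transcript above can be checked with a small
sketch. The helper reserved_blocks() is hypothetical. The value 1310848
is exactly 50% of 2621440 + 256 = 2621696 blocks, which is consistent
with the reserved count having been computed before the 256 unused
blocks were dropped; recomputing from the final block count gives
1310720.

```c
#include <stdint.h>

/* Hypothetical helper: reserved block count for a given percentage. */
static uint64_t reserved_blocks(uint64_t blocks_count,
                                double reserved_ratio)
{
    return (uint64_t)(reserved_ratio * (double)blocks_count / 100.0);
}
```

reserved_blocks(2621696, 50.0) yields 1310848, while
reserved_blocks(2621440, 50.0) yields 1310720, which no longer exceeds
half of the filesystem.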
Signed-off-by: Akira Fujita <a-fujita@rs.jp.ne.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
[ Also teach libe2p's print_flags() function to display this flag so
that lsattr will allow us to see whether a file has inline data or not.
--tytso ]
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This is what the patches from Zheng Liu use, so let's make this change
now to keep things easier. INCOMPAT_INLINE_DATA also looks better
IMHO. :-)
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Handle EXT4_FEATURE_RO_COMPAT_QUOTA the same way we handle INCOMPAT
features, so we don't have to have two definitions for
EXT2_LIB_FEATURE_RO_COMPAT_SUPP depending on whether CONFIG_QUOTA is
enabled.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Commit a7c17431b9 attempted to fix a problem where the system
libraries might get used instead of local libraries for things like
-lcom_err. It tried to accomplish this by moving $(ELF_OTHER_LIBS) to
before $(LDFLAGS).
Unfortunately, this was the wrong fix; $(ELF_OTHER_LIBS) *MUST* be
after the object files, or the linker might not pull in the necessary
library and not include it into the DT_NEEDED section of the shared
library. The proper fix is to add a -L$(LIB) before $(LDFLAGS), and
then remove the -L option from all of the ELF_OTHER_LIBS definitions
in the library Makefiles.
Addresses-Sourceforge-Bug: #3554345
Cc: Olivier Blin <olivier.blin@softathome.com>
Reported-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
When the kernel writes an inode where all of the other inodes in
the inode table (itable) block are unused, it skips reading the itable
block from disk, and instead uses an all zeros block. This can cause
e2fsck to complain when it iterates over the inodes using
ext2fs_get_next_inode() since the inode apparently has an invalid
checksum. Normally the inode won't be returned at all if it is at the
end of the block group's part of the inode table, thanks to the
bg_itable_unused field. But it's possible for this situation to
happen earlier in the inode table block.
Fix this by changing ext2fs_inode_csum_verify() to allow the inode to
be all zeros; if the checksum fails, and the inode is all zeros,
treat it as a valid checksum.
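The fallback can be sketched as follows. Both helpers are hypothetical,
and the stored and computed checksum values are passed in rather than
calculated, since the real checksum code is not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if every byte in the buffer is zero. */
static int is_all_zero(const void *buf, size_t len)
{
    const unsigned char *p = buf;

    while (len--)
        if (*p++)
            return 0;
    return 1;
}

/*
 * Hypothetical verify wrapper: accept the inode if the checksums
 * match, or if they do not match but the inode is entirely zero.
 */
static int inode_csum_ok(const void *inode, size_t len,
                         uint32_t stored, uint32_t computed)
{
    if (stored == computed)
        return 1;
    return is_all_zero(inode, len);
}
```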
Reported-by: Tao Ma <boyu.tm@taobao.com>
Reported-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The crc32c implementation in the kernel has been refactored a bit to
reduce the amount of code that needs to be maintained, and to speed up
tune2fs/e2fsck on PowerPC by 5-10%. Port the crc32c changes over, and
provide a crc32_be so that we can remove the duplicate functionality
from e2fsck. Also drop crc32c_be and crc32_le since neither got used.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Add metadata checksumming to the list of supported features.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Check the data block checksums when recovering the journal.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Modify the dump code to print information about jbd2 v2 checksum data.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Define flags and change journal structure definitions to support v2 journal
checksumming.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Change the block group descriptor checksum algorithm to use the same
algorithm as the rest of metadata_csum. This mostly involves providing a helper
function to tell if group descriptors should have checksums set or
verified, and modifying the gdt checksum code to use the correct
algorithm.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Record the type of checksum algorithm we're using for metadata in the
superblock, in case we ever want/need to change the algorithm.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Calculate and verify the superblock checksums. Each copy of the
superblock records the number of the group it's in and the FS UUID, so
we can simply checksum the whole block.
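A toy illustration of checksumming a whole block up to its checksum
field is shown below. struct toy_super is a made-up layout, not the
ext4 superblock, and the bitwise crc32c_update() is a slow reference
routine written for this sketch. Byte order conversion of the stored
value is ignored here.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC32C (Castagnoli, reflected polynomial 0x82F63B78). */
static uint32_t crc32c_update(uint32_t crc, const void *buf, size_t len)
{
    const unsigned char *p = buf;
    int k;

    while (len--) {
        crc ^= *p++;
        for (k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc;
}

/* Made-up layout: the checksum is the last field of the block. */
struct toy_super {
    uint32_t group_nr;      /* group this copy lives in */
    uint8_t  uuid[16];      /* filesystem UUID */
    uint8_t  other[1000];   /* remaining fields */
    uint32_t checksum;      /* covers everything before it */
};

static uint32_t toy_super_csum(const struct toy_super *sb)
{
    return crc32c_update(0xFFFFFFFFu, sb,
                         offsetof(struct toy_super, checksum));
}

static int toy_super_verify(const struct toy_super *sb)
{
    return sb->checksum == toy_super_csum(sb);
}
```

Because the group number and UUID sit inside the checksummed region,
a copy of the block placed in the wrong group or filesystem fails
verification without any extra seed.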
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Calculate and verify the checksum for separate (i.e. not in the inode)
extended attribute blocks; the checksum lives in the header.
[ Merged in change from Tao so that we always use the fs checksum seed
for the xattr blocks. ]
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Introduce small structures for recording directory tree checksums, and
some API changes to support writing out directory blocks with
checksums.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>