Fix more problems that I found when testing on ppc64:
- Inode swap cut and paste error leads to immutable inodes being
detected as inlinedata inodes, leading to e2fsck incorrectly barfing
on i_block[] contents.
- Superblock csum/verify must be aware of the fs->super byte order
when checking for metadata_csum feature flag. (Hint: in _openfs(),
fs->super is in LE order for the first csum verification)
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
On BE platforms, we need to swap the inode bytes after doing the
checksum verification but before looking at i_blocks.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Add to ext2fs_symlink the ability to create inline data symlinks.
[ Modified by tytso to add more logging to the test script ]
Suggested-by: Pu Hou <houpu.hp@alibaba-inc.com>
Cc: Pu Hou <houpu.hp@alibaba-inc.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The xattr_get method returns to us a pointer to a buffer containing
the EA value. If for some reason we decide to fail out of iterating
the EA part of an inline-data directory, we must free the buffer that
xattr_get passed to us (via inline_data_ea_get).
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Fix up the rest of the inline data code not to complain if there's no
EA, since it's possible that there's no EA because we're in the
process of creating an inline data file. Also, don't return an error
code when removing a nonexistent EA, because there's no reason to.
Furthermore, if we write less than 60 bytes of inline data, remove the
EA to avoid wasting space.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If we're doing a strict overwrite (same data size) of data in an
inline data file, we should be able to skip the size check. If the
in-core EA representation is fine but the on-disk EA is slightly
corrupt (this happens when fixing minor errors in an inline dir), the
ext2fs_xattr_inode_max_size() call, which reads the disk EA, can lead
us to think that there's no space when in reality there is no issue
with doing a strict overwrite.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The inline data code fails to perform endianness conversions correctly
or at all in a number of places, so fix this so that big-endian
machines function properly.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
When we're (a) reading EAs into a buffer; (b) byte-swapping EA
entries; or (c) checking EA data, be careful not to run off the end of
the memory buffer, because this causes invalid memory accesses and
e2fsck crashes. This can happen if we encounter a specially crafted
FS image.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Perform a little more sanity checking of EA value offsets so that we
don't crash while trying to load things from the filesystem.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If s_first_meta_bg is greater than the of number block group
descriptor blocks, then reading or writing the block group descriptors
will end up overruning the memory buffer allocated for the
descriptors. Fix this by limiting first_meta_bg to no more than
fs->desc_blocks. This doesn't correct the bad s_first_meta_bg value,
but it avoids causing the e2fsprogs userspace programs from
potentially crashing.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Commit baa3544609 ("libext2fs: have UNIX IO manager use
pread/pwrite) causes a breakage on 32-bit systems where off_t is
32-bits for file systems larger than 4GB. Fix this by using
pread64/pwrite64 if possible, and if pread64/pwrite64 is not present,
using pread/pwrite only if the size of off_t is at least as big as
ext2_loff_t.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Place the allocation bitmaps and inode table blocks so they are
adjacent, even in the last flexbg.
Previously, after running "mke2fs -t ext4 DEV 286720", the layout of
the last few block groups would look like this:
Group 32: (Blocks 262145-270336) [INODE_UNINIT, ITABLE_ZEROED]
Block bitmap at 262145 (+0), Inode bitmap at 262161 (+16)
Inode table at 262177-262432 (+32)
Group 33: (Blocks 270337-278528) [INODE_UNINIT, BLOCK_UNINIT, ITABLE_ZEROED]
Block bitmap at 262146 (bg #32 + 1), Inode bitmap at 262162 (bg #32 + 17)
Inode table at 262433-262688 (bg #32 + 288)
Group 34: (Blocks 278529-286719) [INODE_UNINIT, ITABLE_ZEROED]
Block bitmap at 262147 (bg #32 + 2), Inode bitmap at 262163 (bg #32 + 18)
Inode table at 262689-262944 (bg #32 + 544)
Now, they look like this:
Group 32: (Blocks 262145-270336) [INODE_UNINIT, ITABLE_ZEROED]
Block bitmap at 262145 (+0), Inode bitmap at 262148 (+3)
Inode table at 262151-262406 (+6)
Group 33: (Blocks 270337-278528) [INODE_UNINIT, BLOCK_UNINIT, ITABLE_ZEROED]
Block bitmap at 262146 (bg #32 + 1), Inode bitmap at 262149 (bg #32 + 4)
Inode table at 262407-262662 (bg #32 + 262)
Group 34: (Blocks 278529-286719) [INODE_UNINIT, ITABLE_ZEROED]
Block bitmap at 262147 (bg #32 + 2), Inode bitmap at 262150 (bg #32 + 5)
Inode table at 262663-262918 (bg #32 + 518)
This reduces the free space fragmentation in a freshly created file
system. It also allows the following mke2fs command to succeed:
mke2fs -t ext4 -b 4096 -O ^resize_inode -G $((2**20)) DEV 2130483
(Note that while this allows people to run mke2fs with insanely large
flexbg sizes, this is not a recommended practice, as the kernel may
refuse to resize such a file system while mounted, since it currently
tries to allocate an in-memory data structure based on the size of the
flexbg, and so a file system with a very large flexbg size will cause
the memory allocation to fail. This will hopefully be fixed in a
future kernel release, but if the goal is to force all of the metadata
blocks to be at the beginning of the file system, it's better to use
the packed_meta_blocks configuration parameter in mke2fs.conf.)
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This reverts commit d988201ef9.
The problem with this commit is that causes common small file system
configurations to fail. For example:
mke2fs -O flex_bg -b 4096 -I 1024 -F /tmp/tt 79106
mke2fs 1.42.11 (09-Jul-2014)
/tmp/tt: Invalid argument passed to ext2 library while setting
up superblock
This check in ext2fs_initialize() was added to prevent the metadata
from being allocated beyond the end of the filesystem, but it is
also causing a wide range of failures for small filesystems.
We'll address this in a different way, by using a smarter algorithm
for deciding the layout of metadata blocks for the last flex block
group.
Reported-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If an inode fails checksum verification, don't stuff a copy of it in
the inode cache, because this can cause the library to fail to return
the "corrupt inode" error code.
In general, this happens if ext2fs_read_inode_full() is called twice
on an inode with an incorrect checksum. If fs->flags has
EXT2_FLAG_IGNORE_CSUM_ERRORS set during the first call and *unset*
during the second call, the cache hit during the second call fails to
return EXT2_ET_INODE_CSUM_INVALID as you'd expect. This happens
during fsck because the first read_inode call happens as part of
check_blocks and the second call happens during inode checksum
revalidation. A file system with a slightly corrupt non-extent inode
will trigger this.
While we're at it, make the inode read function consistent with the
rest of libext2fs -- copy the metadata object into the caller's buffer
even if it fails checksum verification. This will help e2fsck avoid a
double re-read later on down the line.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
When we need to modify the "ignore checksum error" behavior flag to
get us past a library call, it's possible that the library call can
result in other flag bits being changed. Therefore, it is not correct
to restore unconditionally the previous flags value, since this will
have unintended side effects on the other fs->flags; nor is it correct
to assume that we can unconditionally set (or clear) the "ignore csum
error" flag bit. Therefore, we must merge the previous value of the
"ignore csum error" flag with the value of flags after the call.
Note that we want to leave checksum verification on as much as
possible because doing so exposes e2fsck bugs where two metadata
blocks are "sharing" the same disk block, and attempting to fix one
before relocating the other causes major filesystem damage. The
damage is much more obvious when a previously checked piece of
metadata suddenly fails in a subsequent pass.
The modifications to the pass 2, 3, and 3A code are justified as
follows: When e2fsck encounters a block of directory entries and
cannot find the placeholder entry at the end that contains the
checksum, it will try to insert the placeholder. If that fails, it
will schedule the directory for a pass 3A reconstruction. Until that
happens, we don't want directory block writing (pass 2), block
iteration (pass 3), or block reading (pass 3A) to fail due to checksum
errors, because failing to find the placeholder is itself a checksum
verification error, which causes e2fsck to abort without fixing
anything.
The e2fsck call to ext2fs_read_bitmaps must never fail due to a
checksum error because e2fsck subsequently (a) verifies the bitmaps
itself; or (b) decides that they don't match what has been observed,
and rewrites them.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Add a new behavior flag to the inode scan functions; when specified,
this flag will do some simple sanity checking of entire inode table
blocks. If all the checksums are ok, we can skip checksum
verification on individual inodes later on. If more than half of the
inodes look "insane" (bad extent tree root or checksum failure) then
ext2fs_get_next_inode_full() can return a special status code
indicating that what's in the buffer is probably garbage.
When e2fsck' inode scan encounters the 'inode is garbage' return code
it'll offer to zap the inode straightaway instead of trying to recover
anything. This replaces the previous behavior of asking to zap
anything with a checksum error (strict_csum).
Signed-off-by: Darrick J. Wong <darrick.wong@orale.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Remove the code that would zap an extent block immediately if the
checksum failed (i.e. strict_csums). Instead, we'll only do that if
the extent block header shows obvious structural problems; if the
header checks out, then we'll iterate the block and see if we can
recover some extents.
Requires a minor modification to ext2fs_extent_get such that the
extent block will be returned in the buffer even if the return code
indicates a checksum error. This brings its behavior in line with
the rest of libext2fs.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
When reading an EA block in from disk, do a quick sanity check of the
block header, and return an error if we think we have garbage. Teach
e2fsck to ignore the new error code in favor of doing its own
checking, and remove the strict_csums bits while we're at it.
(Also document some assumptions in the new ext_attr code.)
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
When we're appending an extent to the end of a file and the index
block is full, don't split the index block into two half-full index
blocks because this leaves us with under utilized index blocks, at
least in the fallocate case. Instead, copy the last extent from the
full block into the new block. This isn't perfect utilization, but
there's a lot of work involved in teaching extent.c to be able to goto
a nonexistent node in a newly allocated (and empty) extent block.
This patch does not fix the general problem of keeping the extent tree
balanced.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If pread/pwrite are present, have the UNIX IO manager use them for
aligned IOs (instead of the current seek -> read/write), thereby
saving us a (minor) amount of system call overhead.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Use EXT2_MIN_BLOCK_SIZE, JFS_MIN_JOURNAL_BLOCKS, SUPERBLOCK_SIZE, and
SUPERBLOCK_OFFSET instead of hardcoded 1024 when it is okay, and also
add a helper ext2fs_journal_sb_start() that will return start of
journal sb with special case for fs with 1k block size.
Signed-off-by: Azat Khuzhin <a3at.mail@gmail.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
When creating a file system using a source directory, also copy any extended
attributes that have been set.
[ Add configure tests for Linux-specific xattr syscalls and add fallback
when compiling on non-Linux systems. --tytso ]
Signed-off-by: Ross Burton <ross.burton@intel.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Provide an API to set i_size in an inode and take care of all required
feature flag modifications. Refactor the code to use this new
function.
[ Moved the function to lib/ext2fs/blk_num.c, which is the rest of
these sorts of functions live, and renamed it to be
ext2fs_inode_size_set() instead of ext2fs_inode_set_size() to be
consistent with the other functions in in blk_num.c -- tytso ]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
We rely on a nasty hack to adjust the free block count where we pass
signed value into ext2fs_free_blocks_count_add(), which takes an
64-bit unsigned value, and relies on overflow and C's signed->unsigned
semantics to do the subtraction. This works, so long as a 64-bit
signed value is used.
Unfortunately, ext2fs_block_alloc_stats2() and
ext2fs_block_alloc_stats_range(), this is not true, so on a 64-bit
file system, the free blocks accounting can get screwed up.
A simple way to demonstrate the problem is:
mke2fs -F -t ext4 -O 64bit /tmp/foo.img 1M
e2fsck -fy /tmp/foo.img
... which will result in the following e2fsck complaint:
Pass 5: Checking group summary information
Free blocks count wrong (4294968278, counted=982).
Fix? yes
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
There are a number of places where we need convert groups to blocks or
clusters by multiply the groups by blocks/clusters per group.
Unfortunately, both quantities are 32-bit, but the result needs to be
64-bit, and very often the cast to 64-bit gets lost.
Fix this by adding new macros, EXT2_GROUPS_TO_BLOCKS() and
EXT2_GROUPS_TO_CLUSTERS().
This should fix a bug where resizing a 64bit file system can result in
calculate_minimum_resize_size() looping forever.
Addresses-Launchpad-Bug: #1321958
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Using C99 initializers makes the code a bit more readable, and it
avoids some gcc -Wall warnings regarding missing initializers.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The bits between end and real_end are set as a safety measure for the
kernel when it uses the bit scan instructions. We need to take this
into account when shrinking or growing the block allocation bitmap,
before we can safely use rbtree bitmaps in resize2fs.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Fix a few warnings about unused and uninitialized variables.
Also fix util/subst.c to include <sys/time.h> to avoid using
undeclared functions gettimeofday() and futimes().
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
In the loop in ext2fs_get_free_blocks2, we ask the bitmap if there's a
range of free blocks starting at "b" and ending at "b + num - 1".
That quantity is the number of the last block in the range. Since
ext2fs_blocks_count() returns the number of blocks and not the number
of the last block in the filesystem, the check is incorrect.
Put in a shortcut to exit the loop if finish > start, because in that
case it's obvious that we don't need to reset to the beginning of the
FS to continue the search for blocks. This is needed to terminate the
loop because the broken test meant that b could get large enough to
equal finish, which would end the while loop.
The attached testcase shows that with the off by one error, it is
possible to throw e2fsck into an infinite loop while it tries to
find space for the inode table even though there's no space for one.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
It's only necessary to build tst_libext2fs when running "make check".
Also make sure the links of the tst_* programs are done with
$(ALL_LDFLAGS).
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Most systems have a backwards compatibility symlink in
/usr/include/syscall.h to /usr/include/sys/syscall.h, but
sys/syscall.h is the documented location of the header file. Fix two
locations where we were using <syscall.h> instead of <sys/syscall.h>.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
In mke2fs command, if flex_bg count is too large to filesystem blocks
count, unmountable ext4 which has the out of filesystem block offset
is created (Case1). Moreover this large flex_bg count causes an
unintentional metadata layout (bmap-imap-itable-bmap-imap-itable .. in
block group) (Case2).
To fix these issues and keep healthy flex_bg layout, disallow creating
ext4 with obviously large flex_bg count to filesystem blocks count.
Steps to reproduce:
(Case1)
1.
# mke2fs -t ext4 -b 4096 -O ^resize_inode -G $((2**20)) DEV 2130483
2.
# mount -t ext4 DEV MP
mount: wrong fs type, bad option, bad superblock on /dev/sdb4,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so
3.
# dumpe2fs DEV
...
Block count: 2130483
...
Flex block group size: 1048576
...
Group 65: (Blocks 2129920-2130482) [INODE_UNINIT]
Checksum 0x4cb3, unused inodes 8080
Block bitmap at 67 (bg #0 + 67), Inode bitmap at 1048643 (bg #32 + 67)
Inode table at 2129979-2130483 (+59)
^^^^^^^ 2130483 is out of FS!
65535 free blocks, 8080 free inodes, 0 directories, 8080 unused inodes
Free blocks:
Free inodes: 525201-533280
(Case2)
1.
# mke2fs -t ext4 -G 2147483648 DEV 3145728
2.
# debugfs -R stats DEV
...
Block count: 786432
...
Flex block group size: 2147483648
...
Group 0: block bitmap at 193, inode bitmap at 194, inode table at 195
...
Group 1: block bitmap at 707, inode bitmap at 708, inode table at 709
...
Group 2: block bitmap at 1221, inode bitmap at 1222, inode table at 1223
...
Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
If a large flex_bg factor is specified and the block allocator was
laying out block or inode bitmaps or inode tables, and collides with
previously allocated metadata (for example the backup superblock or
group descriptors) it would reset the allocator back to the beginning
of the flex_bg instead of continuing past the obstruction.
For example, with "-G 131072" the inode table will hit the backup
descriptors in groups 1, 3, 5, 7, 9 and start interleaving with the
block and inode bitmaps. That results in poorly allocated bitmaps
and inode tables that are interleaved and not contiguous as was
intended for flex_bg:
Group 0: (Blocks 0-32767)
Primary superblock at 0, Group descriptors at 1-2048
Block bitmap 2049 (+2049), Inode bitmap at 133121 (bg #4+2049)
Inode table 264193-264200 (bg #8+2049)
:
:
Group 3838: (Blocks 125763584-125796351) [INODE_UNINIT, BLOCK_UNINIT]
Block bitmap 5887 (bg #0+5887), Inode bitmap 136959 (bg #4+5887)
Inode table 294897-294904 (bg #8 + 32753)
Group 3839: (Blocks 125796352-125829119) [INODE_UNINIT, BLOCK_UNINIT]
Block bitmap 5888 (bg #0+5888), Inode bitmap 136960 (bg #4+5888)
Inode table 5889-5896 (bg #0 + 5889)
Group 3840: (Blocks 125829120-125861887) [INODE_UNINIT, BLOCK_UNINIT]
Block bitmap 5897 (bg #0+5897), Inode bitmap 136961 (bg #4+5889)
Inode table 5898-5905 (bg #0 + 5898)
:
:
Instead, skip the intervening blocks if there aren't too many of them.
That mostly keeps the flex_bg allocations from colliding, though still
not perfect because there is still some overlap with the backups.
This patch addresses the majority of the problem, allowing about 124k
groups to be layed out perfectly, instead of less than 4k groups with
the previous code.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Currently there are many uses of ext2fs_close() which might be wrong.
First of all ext2fs_close() does not set the ext2_filsys pointer to NULL
so the caller is responsible for clearing it, however there are some
cases there we do not do it.
Second of all very small number of users of ext2fs_close() actually
check the return value. If there is a problem in ext2fs_close() it will
not even free the ext2_filsys structure, but majority of users expect it
to do so.
To fix both problems this commit introduces a new helper
ext2fs_close_free() which will not only check for the return value and
free the ext2_filsys structure if the call to ext2fs_close2() failed,
but it will also set the ext2_filsys pointer to NULL.
Replace every use of ext2fs_close() in e2fsprogs tools with
ext2fs_close_free() - there is no real reason to keep using
ext2fs_close().
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Try to avoid name clashes with definitions of __u8, __u16, __u32,
and __u64 in userspace, in case other headers also define these
types. Define HAVE___{S,U}{8,16,32,64} preprocessor macros to
show that these types are already defined.
This would avoid the need to check for _BLKID_TYPES_H in ext2_types.h
and _EXT2_TYPES_H in blkid_types.h, but since older versions of these
headers did not use HAVE___U8 et.al. keep these checks around for now.
Report an error if there are no 64-bit types available. The code
will not compile if these are not available.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The LIST_HEAD macro is not directly used in getsize.c, so
<sys/queue.h> is not needed at all, and could cause confusion at
some later point if the Linux-style list macros are ever used.
Build was verified on MacOS which defined HAVE_SYS_DISK_H true.
I manually inspected the sources for recent *BSD headers to check
if this was needed there or not. MacOS and FreeBSD <sys/disk.h>
do not use lists at all. NetBSD and OpenBSD <sys/disk.h> and all
of the <sys/mount.h> headers include <sys/queue.h> internally.
I used http://fxr.watson.org/fxr/source/sys/mount.h?v={OSTYPE}
as a reference, checking both old and new *BSD versions.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Per http://www.gnu.org/software/checker/ the gcc "-checker" option
is long deprecated. Nuke it from e2fsprogs.
Most people would never hit this, but people who love to turn knobs,
such as the reporter of kernel.org bz#74171, might run into it and be
sad.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Quiet a couple of build warnings in tst_libext2fs.c
Add missing unistd.h header for misc/util.c.
Ignore generated files for lib/ext2fs/tst_libext2fs and intl/ files.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This commit fixes two warning messages when compiling with LLVM.
Reported-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Fix compile warnings found on the master branch when using LLVM.
- Add missing format string when using the libintl _() macro
- include <limits.h> header to get PATH_MAX definition
- fix format vs. variable mismatches
- add header block for create_inode.c file
- remove use of bzero(), use ext2fs_get_memzero() instead
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Fix various small resource leaks and error code handling issues that
Coverity pointed out.
Fixes-Coverity-Bugs: 1215250, 1193379, 119194[2-4], 1049160
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This makes memory management easier because when the quota context is
released, all of the quota file handles get released automatically.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Aditya Kali <adityakali@google.com>
There are interfaces that are used by mke2fs.c and tune2fs.c which are
in quotaio.h, and some future changes will be much simpler if we can
combine the two header files together. Also the guard #ifdef for
mkquota.h was incorrect, which caused problems when both header files
needed to be included.
Also remove quota.pc and installation rules for libquota, since this
library is never going to be something that we can export externally
anyway. Eventually we'll want to clean up the interfaces and move the
external publishable interfaces to the libext2fs library, and then
rename what's left from libquota.a to libsupport.a for internal use
only.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Aditya Kali <adityakali@google.com>
The quota_handle wasn't getting closed in quota_compare_and_update().
Fix this, and also make sure that quota_file_close() doesn't
unnecessarily modify the quota inode if it's not necessary. Otherwise
e2fsck will claim that the file system is modified when it didn't need
to be.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Aditya Kali <adityakali@google.com>
Previously if there was a missing quota entry --- i.e., if there were
files owned by group "eng", but there was no quota record for group
"eng", e2fsck would not notice the missing entry. This means that the
usage informtion would not be properly repaired. This is unfortunate.
Fix this by marking each quota record in quota_dict that has a
corresponding record on disk, and then check to see if there are any
records in quota_dict that have not been marked as having been seen.
In that case, we know we need to update the relevant quota inode.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Aditya Kali <adityakali@google.com>
In scan_dquota_callback() we were copying dqb_off from the on-disk
dquot structure to the dqot structure that we have constructed in
memory. This is a bad idea, because if we detect that the quota inode
is corrupted and needs to be replaced, when we later write out the
inode, the fact that we have a non-zero dqb_off value means that quota
routines will not bother updating the quota tree index, since
presumably the quota tree index already has an entry for this dqot.
The problem is that e2fsck, the only user of these functions, has
zapped the quota inode so it can be written from scratch. So this
means the index for the quota records are not written out, so the
kernel can not find any of the records in the quota inodes. Oops.
The fix is simple; there is no reason to copy the dqb_off field into
the quota_dict copy of struct dquot.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Aditya Kali <adityakali@google.com>
Fix various small resource leaks and error code handling issues that
Coverity pointed out.
Fixes-Coverity-Bugs: 11919{39-45}, 1174118, 1049160, 1049144
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Fix some minor bugs relating to passing CFLAGS to cppcheck, and
package the cppcheck output into nicer looking reports.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
dict_uint_cmp() returns an usigned int value in int type, which
could mess the dict key comparison when the difference of two
keys is greater than INT_MAX.
Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Adding the pkgconfigdir variable allows specifying an installation
location for pkg-config files independent of libdir.
Signed-off-by: David Michael <fedora.dm0@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This supercedes the "whole disk" check, since it does a better job and
there are times when it is quite legitimate to want to use the whole
disk.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
If the previous block group's inode table ends at the very end of file
system, wrap around to the beginning of the flex_bg.
This fixes a bug was tickled by:
mke2fs.conf:
frontload = {
features = extent,huge_file,flex_bg,uninit_bg,dir_nlink,extra_isize,^resize_inode,sparse_super2
hash_alg = half_md4
num_backup_sb = 0
packed_meta_blocks = 1
inode_ratio = 4194304
flex_bg_size = 262144
}
mke2fs -T frontload /tmp/foo.img 2T
resize2fs -M /tmp/foo.img
resize2fs -M /tmp/foo.img
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Filefrag doesn't catch and print the shared extent flag. Add this for
users of filefrag on file systems with shared extents (such as btrfs).
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Fix various unused variable and use-uninitialized warnings.
Add generated files into .gitignore.
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Fix the following build errors on bigendian hosts.
- ctx is a pointer, use '->' not '.'
- add missing argument to ext2fs_dirent_swab_in2
make[2]: Entering directory `/root/e2fsprogs/lib/ext2fs'
CC inline_data.c
inline_data.c: In function ‘ext2fs_inline_data_dir_iterate’:
inline_data.c:221:5: error: request for member ‘errcode’ in something not a structure or union
ctx.errcode = ext2fs_dirent_swab_in2(fs, ctx->buf, ctx->buflen, 0);
^
inline_data.c:222:9: error: request for member ‘errcode’ in something not a structure or union
if (ctx.errcode) {
^
inline_data.c: In function ‘ext2fs_inline_data_dir_expand’:
inline_data.c:364:2: error: too few arguments to function ‘ext2fs_dirent_swab_in2’
retval = ext2fs_dirent_swab_in2(fs, buf, size);
^
In file included from inline_data.c:19:0:
ext2fs.h:1569:18: note: declared here
extern errcode_t ext2fs_dirent_swab_in2(ext2_filsys fs, char *buf, size_t size,
^
make[2]: *** [inline_data.o] Error 1
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
In ext2fs_extent_set_bmap() and ext2fs_punch_extent(), fix the parents
when altering either end of an extent so that the parent nodes reflect
the added mapping.
There's a slight complication to using fix_parents: if there are two
mappings to an lblk in the tree, the value of handle->path->curr can
point to either extent afterwards), which is documented in a comment.
Some additional color commentary from Darrick:
In the _set_bmap() case, I noticed that the "remapping last block in
extent" case would produce symptoms if we are trying to remap a
block from "extent" to "next_extent", and the two extents are
pointed to by different index nodes. _extent_replace(...,
next_extent) updates e_lblk in the leaf extent, but because there's
no _extent_fix_parents() call, the index nodes never get updated.
In the _punch_extent() case, we conclude that we need to split an
extent into two pieces since we're punching out the middle. If the
extent is the last extent in the block, the second extent will be
inserted into a new leaf node block. Without _fix_parents(), the
index node doesn't seem to get updated.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
In ext2fs_extent_free(), h(andle)->max_depth is used as a loop
conditional variable to free all the h->path[].buf pointers. However,
ext2fs_extent_delete() sets max_depth = 0 if we've removed everything
from the extent tree, which causes a subsequent _free() to leak some
buf pointers. max_depth can be re-incremented when splitting extent
nodes, but there's no guarantee that it'll reach the old value before
the free.
Therefore, remember the size of h->paths[] separately, and use that
when freeing the extent handle.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Fix a few minor bugs that cppcheck complained about.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
In ext2fs_block_alloc_stats_range(), the quantity "-inuse * n" is
calculated as a signed 32-bit quantity. Unfortunately, gcc (4.6.3 on
Ubuntu 12.04) doesn't sign-extend this quantity to fill the blk64_t
parameter that ext2fs_free_blocks_count_add() wants, so the end result
is that the superblock gets a ridiculously huge free block count.
Changing the declaration of 'n' to blk64_t seems to fix this.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Fix a number of things that cppcheck complains about. Most of these
are minor resource leaks and forgotten declarations.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
In ext2fs_inline_data_dir_iterate(), we must be very careful to undo
any modifications we make to the dir_context pointer passed in by the
caller, because it's entirely possible that the caller will still want
to do something with the ctx or something inside.
Specifically, ext2fs_dblist_dir_iterate() wants to be able to free
ctx->buf, and it reuses the ctx for multiple dblist entries. That
means that assigning ctx->buf will cause weird crashes at the end of
dir_iterate().
Since we're being careful with ctx, we might as well handle adding the
INLINE_DATA flag to ctx->flags for ext2fs_process_dir_block, since the
dblist caller forgets to unset the flag before reusing the ctx.
This fixes some crashes and valgrind complaints in resize2fs, and is
necessary for the next patch, which fixes resize2fs not to corrupt
inline_data filesystems.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
When expanding an inline data inode, it's possible that the reduction
in the size of the EA structures causes the freeing of the EA block,
which changes the inode. If this happens, the local version of the
inode that ext2fs_inline_data_expand was modifying will be out of sync
with what's on the disk. This local copy gets written out to disk
after a block allocation, at which point it's possible that the inode
EA block and logical block zero point to the same physical block,
which is bad news.
Therefore, write the local copy to disk before removing the inline
data EA, and reread it afterwards.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
i_blocks covers the number of blocks allocated to an inode for data,
extents, and ACL blocks. Since it's possible for a file to have a
separate ACL block and inline data, we must be careful when expanding
an inline data file to adjust, not set, the value of i_blocks.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
ext2fs_inline_data_set() tries to ensure that there is sufficient free
space in the inode to store the inline data. Unfortunately, it gets
the check wrong -- ext2fs_xattr_inode_max_size() returns the amount of
unused bytes in the EA area, and _data_set() doesn't factor in the
size of the existing inline data. Therefore, a strict rewrite of an
N-byte inlinedata with another N-byte inlinedata fails.
Fix the code to do the size check correctly.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Fix a typo that we didn't notice because all the world's an x86. :-)
Reported-by: jon ernst <jonernst07@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Add magic number checking to the extended attribute editing handle;
move inline data to the head of the attribute list when writing so
that inline data ends up in the inode area; and always zero the
attribute space before writing to ensure that we can delete the last
xattr.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
In this unit test, we will test the interface of inline data and make
sure it is fine.
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Currently we have already exported inode cache flush and free functions
for users. This commit exports inode cache creation function. Later
we will use this function to initialize inode cache and do some unit
tests for inline data.
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Currently ext2fs_file_read/write are used to copy data from/to a file.
But they manipulate data by blocksize. For supporting inline data, we
handle it in two new fucntions called ext2fs_file_read/write_inline_data.
In read path the implementation is straightforward. But in write path
things get more complicated because if the size of data is greater than
the maximum size of inline data we will expand this file. So now we
will check this in ext2fs_inline_data_set. If this inode doesn't have
enough space, it will return EXT2_ET_INLINE_DATA_NO_SPACE error. Then
the caller will check this error and tries to expand the file.
The following commands in debugfs can handle inline_data feature after
applying this patch:
- dump
- cat
- rdump
- write
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Now punch command only can remove all inline data because now
punch command is based on block unit and the size of inline data is
never beyond a block size.
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This commit tries to make mkdir command in debugfs support inline data.
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This commit defines a ext2fs_inline_data_expand() to expand an inode with
inline data. In this commit this function only can expand a directory.
But later it will expand a file.
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
If there is an inode with inline data, we just print the size of inline
data in stat command.
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
After applied this commit (a7f4c635), we have banned to traverse blocks
for an inode which has inline data because no block belongs to it. But
before calling this function, we need to check inline data flag. This
commit add a sanity check ext2fs_inode_has_valid_blocks2() to fix them
except that ext2fs_expand_dir because it will be fixed by another patch.
Meanwhile in this commit it fixes a bug that when we kill a file we
could leak an inode.
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Inline_data is handled in dir iterator because a lot of commands use
this function to traverse directory entries in debugfs. We need to
handle inline_data individually because inline_data is saved in two
places. One is in i_block, and another is in ibody extended attribute.
After applied this commit, the following commands in debugfs can
support the inline_data feature:
- cd
- chroot
- link*
- ls
- ncheck
- pwd
- unlink
* TODO: Inline_data doesn't expand to ibody extended attribute because
link command doesn't handle DIR_NO_SPACE error until now. But if we
have already expanded inline data to ibody ea area, link command can
occupy this space.
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Later we will use ext2fs_dirent_swab_in/out to handle big-endian problem
for inline data. Now interfaces assume that it handles a block, but it
is not true after adding inline data. So this commit defines a new
interface for inline data.
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>