Fix a typo that we didn't notice because all the world's an x86. :-)
Reported-by: jon ernst <jonernst07@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Add magic number checking to the extended attribute editing handle;
move inline data to the head of the attribute list when writing so
that inline data ends up in the inode area; and always zero the
attribute space before writing to ensure that we can delete the last
xattr.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
In this unit test, we will test the interface of inline data and make
sure it is fine.
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Currently we have already exported inode cache flush and free functions
for users. This commit exports inode cache creation function. Later
we will use this function to initialize inode cache and do some unit
tests for inline data.
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Currently ext2fs_file_read/write are used to copy data from/to a file.
But they manipulate data by blocksize. For supporting inline data, we
handle it in two new fucntions called ext2fs_file_read/write_inline_data.
In read path the implementation is straightforward. But in write path
things get more complicated because if the size of data is greater than
the maximum size of inline data we will expand this file. So now we
will check this in ext2fs_inline_data_set. If this inode doesn't have
enough space, it will return EXT2_ET_INLINE_DATA_NO_SPACE error. Then
the caller will check this error and tries to expand the file.
The following commands in debugfs can handle inline_data feature after
applying this patch:
- dump
- cat
- rdump
- write
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Now punch command only can remove all inline data because now
punch command is based on block unit and the size of inline data is
never beyond a block size.
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This commit tries to make mkdir command in debugfs support inline data.
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This commit defines a ext2fs_inline_data_expand() to expand an inode with
inline data. In this commit this function only can expand a directory.
But later it will expand a file.
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
If there is an inode with inline data, we just print the size of inline
data in stat command.
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
After applied this commit (a7f4c635), we have banned to traverse blocks
for an inode which has inline data because no block belongs to it. But
before calling this function, we need to check inline data flag. This
commit add a sanity check ext2fs_inode_has_valid_blocks2() to fix them
except that ext2fs_expand_dir because it will be fixed by another patch.
Meanwhile in this commit it fixes a bug that when we kill a file we
could leak an inode.
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Inline_data is handled in dir iterator because a lot of commands use
this function to traverse directory entries in debugfs. We need to
handle inline_data individually because inline_data is saved in two
places. One is in i_block, and another is in ibody extended attribute.
After applied this commit, the following commands in debugfs can
support the inline_data feature:
- cd
- chroot
- link*
- ls
- ncheck
- pwd
- unlink
* TODO: Inline_data doesn't expand to ibody extended attribute because
link command doesn't handle DIR_NO_SPACE error until now. But if we
have already expanded inline data to ibody ea area, link command can
occupy this space.
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Later we will use ext2fs_dirent_swab_in/out to handle big-endian problem
for inline data. Now interfaces assume that it handles a block, but it
is not true after adding inline data. So this commit defines a new
interface for inline data.
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Before loading extended attributes, free any key/value pairs that
might already be associated with the file.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Add another API to query the number of extended attributes.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
A few tweaks to the extended attribute editing APIs:
* Use size_t, not unsigned int, in the new extended attribute editing
API.
* Don't expose the _expand() call since there should be no external
users.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Add functions to allow clients to get, set, and remove extended
attributes from any file. It also supports modifying EAs living in
i_file_acl.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
To check the coverage of e2fsprogs's regression test, do the
following:
configure --enable-gcov
make -j8 ; make -j8 check ; make coverage.txt
The coverage information will be the coverage.txt and *.gcov files in
the build directories.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
ext2fs_free() does not set the ext2_filsys pointer to null so the
caller is responsible to setting it himself if it is needed.
This patch fixes some places where caller did not set ext2_filsys
pointer to NULL after ext2fs_free() which might result in use after
free. Fix it.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
After commit 62f17f3603, variable
"handle" has no use. So delete it.
Signed-off-by: Jon Enrst <jonernst07@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Declare struct_io_manager at the end of unix_io.c, undo_io.c, and
test_io.c files so that there isn't a need to forward declare every
member of this structure. That avoids a lot of redundant code
at the start of every one of these files.
Move the test_flush() function above test_abort() to avoid the need
for a forward declaration.
Fix a few instances of space before tab in these files.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The "mkswap" program is not available on MacOS, so just use the
existing swap0.img.bz2 and swap1.img.bz2 files directly.
Because MacOS HFS+ doesn't support sparse files (welcome to the 80's)
the m_bigjournal test takes forever to zero out the whole 42GB test
filesystem. Skip this test for Darwin kernels for now.
Unfortunately, neither "df -T" nor "stat -f -c %T" is available on
MacOS to directly determine the filesystem type, and I'm too lazy
to parse the output of "mount" and match it to the path of the test
directory in shell, so it just checks the kernel type and assumes
the filesystem type is HFS and skips the test.
Since this test runs on Linux the majority of the time, the loss of
test coverage is minimal. If MacOS should ever get a real filesystem,
this can be revisited.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Fix a number of non-literal string format warnings from LLVM due
to the use of _() that were not fixed in commit 45ff69ffeb.
Fix mismatched int vs. __u64 format warnings in blkmap64_rb.c.
There were also some comparisons of __u64 start or count <= 0.
Change them to be comparisons == 0, or start + count overflow.
Fix operator precedence warning for (value & (value - 1) != 0)
introduced in 11d1116a7c. It seems "&" is lower precedence
than "!=", so the above didn't fail for power-of-two values,
but only odd values. Fortunately, either s_desc_size nor
s_inode_size is valid if odd.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
In C++, "private" is a reserved keyword, so don't use it in the header
file as a function parameter name.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
If a client asks us to remap a block in the middle of an extent, we
potentially have to allocate a fair number of blocks to handle extent
tree splits. A failure in either of the ext2fs_extent_insert calls
leaves us with an extent tree that no longer maps the logical block in
question and everything that came after it! Therefore, try to roll
back the extent tree changes before returning an error code.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If we're doing a BMAP_ALLOC allocation and the extent tree update
fails, there's no point in hanging on to the newly allocated block.
So, free it to make fsck happy.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
When modifying/removing an extent during punch, don't forget to update
the extent's parents.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
When we're iterating extents during a punch operation, the loop exits
if the punch region is entirely to the right of the extent we're
looking at. This can happen if the punch region starts in the middle
of a hole and covers mapped extents. When this happens, we want to
skip to the next extent, because it might be punchable.
Also, if we've totally passed the punch range, stop.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Also use angle brackets for the #include of dirpaths.h to avoid the
need to manually massage the Makefile.in for the util directory. This
is needed because we have to create a fake dirpaths.h file in the util
directory. The fake dirpaths.h file is rquired to break the circular
dependency caused by util/subst creating dirpaths.h, while
util/subst.c is including config.h, which includes dirpaths.h.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Initialize the on-disk structure before we fill it in, to avoid the
following valgrind warning:
Conditional jump or move depends on uninitialised value(s)
at 0x4323A8: qtree_entry_unused (quotaio_tree.c:40)
by 0x431218: v2r1_mem2diskdqblk (quotaio_v2.c:85)
by 0x432409: qtree_write_dquot (quotaio_tree.c:336)
by 0x431136: v2_commit_dquot (quotaio_v2.c:264)
by 0x42FB63: quota_write_inode (mkquota.c:126)
by 0x408BE6: create_quota_inodes (mke2fs.c:2466)
by 0x409A2D: main (mke2fs.c:2850)
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
In practice, it is **extremely** rare for users to try to use more
than the first backup superblock located at the beginning of block
group #1. (i.e., at block number 32768 for file systems with a 4k
block size). This new compat feature restricts the backup superblock
to block group #1 and the last block group in the file system.
Aside from reducing the overhead of the file system by a small number
of blocks, by eliminating the rest of the backup superblocks, it
allows us to have a much more flexible metadata layout. For example,
we can force all of the allocation bitmaps and inode table blocks to
the beginning of the disk, which allows most of the disk to be
exclusively used for contiguous data blocks.
This simplifies taking advantage of certain HDD specific features,
such as Shingled Magnetic Recording (aka Shingled Drives), and the
TCG's OPAL Storage Specification where having a simple mapping between
LBA block ranges and the data blocks used by the file system can make
life much simpler.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
If there are hundreds of thousands of blocks which are in use before
the first free block, it is much, MUCH faster to use
ext2fs_find_first_zero_block_bitmap2() instead of searching the
allocation bitmap bit by bit.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
By using ext2fs_mark_block_bitmap_range2 and/or
ext2fs_block_alloc_stats_range(), we can significantly speed up the
time needed by mke2fs to allocate the inode table.
For example, the CPU time needed to run the command "mke2fs -t ext4
/tmp/foo.img 32T" (where tmpfs was mounted on /tmp) was decreased from
21.7 CPU seconds down to under 1.7 seconds.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This function is more efficient than using ext2fs_block_alloc_stats2()
for each block in a range. The efficiencies come from being able to
set a block range in the block bitmap at once, and from being update
the block group descriptors once per block group. Especially now that
we are checksuming the block group descriptors, and we are using red
black trees for the allocation bitmaps, these changes can make a huge
difference in the CPU time used by mke2fs when creating very large
file systems.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Commit 8e44eb64bb (libext2fs: mark group data blocks when loading
block bitmap) simplified check_block_uninit since we are now
initializing the bitmap when it is loaded from disk. It left some
variables which were being set but never used, however. In addition,
since we only need check_block_uninit() to clear the block bitmap's
uninit flag, rename it to clear_block_uninit(), and only call it once
we have found a free block in ext2fs_new_blocks2().
This cleans up the code some and optimizes things if we need to search
multiple block groups trying to find a free block.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
When building tst_bitmaps, enable #define DEBUG_RB, so we are
always testing the sanity of the in-memory representation of the
bitmap when using red-black trees as part of a "make check" run.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Move the error checking into the the generic bitmap code, and add
support for bitmaps with cluster_bits set.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
When inserting the first extent into an empty inode, the
ext2fs_extent_insert() leaves path->left set to 1 instead of 0. Since
path->curr is pointing at the last (only) extent in the file,
path->left should be 0.
This is mostly harmless, and gets corrected fairly quickly if the
calling applicaton jumps to a different part of the extent tree ---
for example, by calling ext2fs_extent_goto(), or calling
ext2fs_extent_get with the flags argument set to EXT2_EXTENT_ROOT.
Which is why we hadn't noticed this problem until now.
However, if you insert four extents using ext2fs_extent_insert, the
fourth insert will end up copying too many bytes in the i_block[]
array, since path->left is one larger than it should be. This results
in the inode fields i_generation, i_file_acl, and i_size_high getting
zeroed out.
This problem can be replicated as follows:
% cp /dev/null /tmp/foo.img
% mke2fs -F -t ext4 /tmp/foo.img 100
% debugfs -w /tmp/foo.img
debugfs: write /dev/null foo
debugfs: set_inode_field foo i_size_hi 1
debugfs: stat foo
<----- note that the inode's size is 4294967296
debugfs: extent_open foo
debugfs (extent ino 12): insert --after 0 1 100
debugfs (extent ino 12): insert --after 1 1 101
debugfs (extent ino 12): insert --after 2 1 102
debugfs (extent ino 12): insert --after 3 1 103
debugfs (extent ino 12): extent_close
debugfs: stat foo
<----- note that the inode's size is now 0
debugfs: quit
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>