Add functions to allow clients to get, set, and remove extended
attributes from any file. It also supports modifying EAs living in
i_file_acl.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
In practice, it is **extremely** rare for users to try to use more
than the first backup superblock located at the beginning of block
group #1. (i.e., at block number 32768 for file systems with a 4k
block size). This new compat feature restricts the backup superblock
to block group #1 and the last block group in the file system.
Aside from reducing the overhead of the file system by a small number
of blocks, by eliminating the rest of the backup superblocks, it
allows us to have a much more flexible metadata layout. For example,
we can force all of the allocation bitmaps and inode table blocks to
the beginning of the disk, which allows most of the disk to be
exclusively used for contiguous data blocks.
This simplifies taking advantage of certain HDD specific features,
such as Shingled Magnetic Recording (aka Shingled Drives), and the
TCG's OPAL Storage Specification where having a simple mapping between
LBA block ranges and the data blocks used by the file system can make
life much simpler.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This function is more efficient than using ext2fs_block_alloc_stats2()
for each block in a range. The efficiencies come from being able to
set a block range in the block bitmap at once, and from being update
the block group descriptors once per block group. Especially now that
we are checksuming the block group descriptors, and we are using red
black trees for the allocation bitmaps, these changes can make a huge
difference in the CPU time used by mke2fs when creating very large
file systems.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Add functions which try to find the first set block or inode in a
bitmap. This is useful when trying to allocate a range of blocks
efficiently.
Like the find_first_zero family of functions, provide a generic O(N)
search function which will be used if there is no optimized version
provided by the red-black tree or bitarray functions.
Also, expand the test cases for ext2fs_find_first_zero_*() functions.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
On a filesystem with 1K blocks and meta_bg enabled, opening a
filesystem with automatic superblock detection tries to compensate for
the fact that the superblock lives in block 1. However, the method by
which this is done is later misinterpreted to mean "read the backup
group descriptors", which is not what we want in this case.
Therefore, in ext2fs_open3() separate the 'group zero' adjustment into
its own variable so that we don't get fed backup group descriptors
when we try to load meta_bg group descriptors.
Furthermore, enhance ext2fs_descriptor_block_loc2() to perform its own
group zero correction. The other caller of this function neglects to
do any group-zero correction of their own, so this fixes them too.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
When bigalloc is enabled, using ext2fs_block_alloc_stats2() to free
any block in a cluster has the effect of freeing the entire cluster.
This is problematic if a caller instructs us to punch, say, blocks
12-15 of a 16-block cluster, because blocks 0-11 now point to a "free"
cluster.
The naive way to solve this problem is to see if any of the other
blocks in this logical cluster map to a physical cluster. If so, then
we know that the cluster is still in use and it mustn't be freed.
Otherwise, we are punching the last mapped block in this cluster, so
we can free the cluster.
The implementation given only does the rigorous checks for the partial
clusters at the beginning and end of the punching range.
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
ext2fs_free_mem() takes a pointer to a pointer, similar to
ext2fs_get_mem(). Improve the documentation, and fix debugfs.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
For each site where we test for a large file (> 2GB) and set the
LARGE_FILE feature, use a helper function to make the size test
consistent with the test that's in e2fsck. This fixes the fsck
complaints when we try to create a 2GB journal (not so hard with 64k
block size) and fixes the incorrect test in fileio.c.
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
EXT4_FEATURE_INCOMPAT_INLINE_DATA flag is added into
EXT2_LIB_SOFTSUPP_INCOMPAT due to we still need to take a long time to
test inline_data feature.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Run sparse against source files when building e2fsprogs with 'make C=1'. If
instead C=2, it configures basic ext2 types for bitwise checking with sparse,
which can help find the (many many) spots where conversion errors are
(possibly) happening.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The range of blocks to punch is treated as an inclusive range on both
ends, i.e. if start=1 and end=2, both blocks 1 and 2 are punched out.
Thus, start == end means that the caller wishes to punch a single
block. Remove the check that prevents us from punching a single
block.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Accessing name_len (and file_type) in ext4_dir_entry structure is
somewhat problematic because on big endian architecture we need to now
whether we are really dealing with ext4_dir_entry (which has u16
name_len which needs byte swapping) or ext4_dir_entry_2 (which has u8
name_len which must not be byte swapped).
Currently the code is somewhat surprising and name_len is always
treated as u16 and byte swapped (flag EXT2_DIRBLOCK_V2_STRUCT isn't
ever used) and then masking of name_len is used to access real
name_len or file_type. Doing things this way in applications using
libext2fs is unexpected to say the least (more natural is to type
struct ext4_dir_entry * to struct ext4_dir_entry_2 * but that gives
wrong results on big endian architectures. So provide helper functions
that give endian-safe access to these fields. Also convert users in
e2fsprogs to use these functions.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
New function ext2fs_symlink() doesn't have a prototype in ext2fs.h and
thus debugfs compilation gives warning:
debugfs.c:2219:2: warning: implicit declaration of function 'ext2fs_symlink'
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This commit adds the functionality which had previously only been in
the tst_extents command to debugfs. The debugfs command extent_open
will open extent tree of a particular inode, and enables a series of
commands which will allow the user to interact with the extent tree
directly. Once the extent tree is closed via extent_open(), these
additional commands will be disabled again.
This commit exports two new functions from lib/ext2fs/extent.c which
had previously been statically defined: ext2fs_extent_node_split() and
ext2fs_extent_goto2().
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
An index node's logical start (ei_block) should
match the logical start of the first node (index
or leaf) below it. If we find a node whose start
does not match its parent, fix all of its parents
accordingly.
If it finds such a problem, we'll see:
Pass 1: Checking inodes, blocks, and sizes
Interior extent node level 0 of inode 274258:
Logical start 3666 does not match logical start 4093 at next level. Fix<y>?
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The changes to support metadata checksum allocated a single large
array for all of the inodes in the inode cache. This is slightly more
efficient, but given that the inode cache is small (only 4 inodes) it
doesn't really have that much benefit. The problem with doing things
this way is that the memory overruns, such as the one fixed in commit
43c4910371, do not get detected by valgrind.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
In addition, make the directory interator more robust in the case
where the file system has the metadata checksum feature enabled, but
the directory checksum is not present in a directory block.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The function ext2fs_init_csum_seed() has nothing to do with the
ext2fs_get_mem()/ext2fs_get_memzero()/ext2fs_get_array()/ext2fs_get_arrayzero()
functions. (This define is there so that on platforms where we need
to use the standard C functions, they can be replaced --- this is
primarily needed when trying to compile libext2fs for strange,
non-quite-standards-compliant platforms, such as Windows.)
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Since clang uses C99 semantics by default, the main changes required
to allow clang to build e2fsprogs was to add support the C99 inline
semantics, while still allowing us to be built when the legacy (but
still default for gcc) GNU C89 inline semantics are in force.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Handle EXT4_FEATURE_RO_COMPAT_QUOTA the same way we handle INCOMPAT
features, so we don't have to have two definitions for
EXT2_LIB_FEATURE_RO_COMPAT_SUPP depending on whether or not
CONFIG_QUOTA is enabled or not.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The crc32c implementation in the kernel has been refactored a bit to
reduce the amount of code that needs to be maintained, and to speed up
tune2fs/e2fsck on PowerPC by 5-10%. Port the crc32c changes over, and
provide a crc32_be so that we can remove the duplicate functionality
from e2fsck. Also drop crc32c_be and crc32_le since neither got used.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Add metadata checksumming to the list of supported features.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Change the block group algorithm to use the same algorithm as the rest
of the metadata_csum. This mostly involves providing a helper
function to tell if group descriptors should have checksums set or
verified, and modifying the gdt checksum code to use the correct
algorithm.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Record the type of checksum algorithm we're using for metadata in the
superblock, in case we ever want/need to change the algorithm.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Calculate and verify the superblock checksums. Each copy of the
superblock records the number of the group it's in and the FS UUID, so
we can simply checksum the whole block.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Calculate and verify the checksum for separate (i.e. not in the inode)
extended attribute blocks; the checksum lives in the header.
[ Merged in change from Tao so that we always use the fs checksum seed
for the xattr blocks. ]
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Introduce small structures for recording directory tree checksums, and
some API changes to support writing out directory blocks with
checksums.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Verify and calculate checksums of htree internal node blocks.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Verify and calculate extent tree block checksums when processing
filesystems.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Provide a field in the block group descriptor to store inode bitmap
checksum, and some helper functions to calculate and verify it.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This patch adds the ability for the libext2fs functions to read and
write the inode checksum.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Precompute the FS UUID checksum seed that is used for all metadata
checksumming operations and store it in ext2_filsys.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Define flags and extend ext4 structure definitions to support metadata
checksumming. Ted Ts'o covered many of these fields in an earlier
patch, but there are more required changes to the disk layout.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>