Allow the filesystem to store the metadata checksum seed in the
superblock and add an incompat feature to say that we're using it.
This enables tune2fs to change the UUID on a mounted metadata_csum
FS without having to (racy!) rewrite all disk metadata.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This patch adds project quota support. An new quota type PRJQUOTA(2)
is added. EXT4_PRJ_QUOTA_INO(11) is reserved for project quota inode.
The super block reservers an field s_prj_quota_inum for saving
project quota inode. And each inode adds an internal field i_projid
for saving its project ID.
Signed-off-by: Li Xi <lixi@ddn.com>
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Project quota related fields are reserved in Linux kernel.
As a preparation for it, this patch cleans up quota codes
of e2fsprogs so as to make it easier to add new quota type(s).
Signed-off-by: Li Xi <lixi@ddn.com>
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This code is partially derived from patches from David Turner to allow
debugfs to properly support extended timestamps.
Cc: David Turner <novalis@novalis.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Fix copy-n-paste errors:
* remove duplicate "lastcheck" and "min_extra_isize"
* fix pointer for "first_error_line" and "last_error_line"
* remove superblock field "inodes_count" from inode fields
* add null-termination for mmp_fields
Add assertions for catching such errors in the future.
Mark true aliases with flag "FLAG_ALIAS" and suppress assert for them.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The s_lpf_ino field is intended to store the location of the lost and
found directory if the root directory becomes encrypted (which is not
yet supported). The s_encryption_level field is designed to allow
support for future changes in the on-disk ext4 encryption format while
this feature under development, without having to burn a large number
of bits in the incompat feature flag.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Previously, e4crypt required the user to manually specify the salt
used for their passphrase. This was user unfriendly to say the least.
The e4crypt program can now request the salt using an ioctl, which
will automatically generate the salt if necessary, and keep it in the
ext4 superblock.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
After we determine that we can't parse the array value as an integer,
we need to restore the square brackets to the field name, so that we
can find a match with block[IND], block[DIND], and block[TIND] in the
inode field table.
Reported-by: Jun He <jhe@cs.wisc.edu>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Allow set_inode_field's bmap command in debugfs to allocate blocks,
which enables us to allocate blocks for indirect blocks and internal
extent tree blocks. True, we could do this manually, but seems like
unnecessary bookkeeping activity for humans.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
In practice, it is **extremely** rare for users to try to use more
than the first backup superblock located at the beginning of block
group #1. (i.e., at block number 32768 for file systems with a 4k
block size). This new compat feature restricts the backup superblock
to block group #1 and the last block group in the file system.
Aside from reducing the overhead of the file system by a small number
of blocks, by eliminating the rest of the backup superblocks, it
allows us to have a much more flexible metadata layout. For example,
we can force all of the allocation bitmaps and inode table blocks to
the beginning of the disk, which allows most of the disk to be
exclusively used for contiguous data blocks.
This simplifies taking advantage of certain HDD specific features,
such as Shingled Magnetic Recording (aka Shingled Drives), and the
TCG's OPAL Storage Specification where having a simple mapping between
LBA block ranges and the data blocks used by the file system can make
life much simpler.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The s_desc_size in the superblock specifies the group descriptor
size in bytes, but in various places the EXT4_FEATURE_INCOMPAT_64BIT
flag implies that the descriptor size is EXT2_MIN_DESC_SIZE_64BIT
(64 bytes) instead of checking the actual size. In other places,
the s_desc_size field is used without checking for INCOMPAT_64BIT.
In the case of ext2fs_group_desc() the s_desc_size was being ignored,
and assumed to be sizeof(struct ext4_group_desc), which would result
in garbage for any but the first group descriptor. Similarly, in
ext2fs_group_desc_csum() and print_csum() they assumed that the
maximum group descriptor size was sizeof(struct ext4_group_desc).
Fix these functions to use the actual superblock s_desc_size if
INCOMPAT_64BIT.
Conversely, in ext2fs_swap_group_desc2() s_desc_size was used
without checking for INCOMPAT_64BIT being set.
The e2fsprogs behaviour is different than that of the kernel,
which always checks INCOMPAT_64BIT, and only uses s_desc_size to
determine the offset of group descriptors and what range of bytes
to checksum.
Allow specifying the s_desc_size field at mke2fs time with the
"-E desc_size=NNN" option. Allow a power-of-two s_desc_size
value up to s_blocksize if INCOMPAT_64BIT is specified. This
is not expected to be used by regular users at this time, so it
is not currently documented in the mke2fs usage or man page.
Add m_desc_size_128, f_desc_size_128, and f_desc_bad test cases to
verify mke2fs and e2fsck handling of larger group descriptor sizes.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
ext2fs_free_mem() takes a pointer to a pointer, similar to
ext2fs_get_mem(). Improve the documentation, and fix debugfs.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Fix all the places where we should be using a blk64_t instead of a
blk_t. These fixes are more severe because 64bit values could be
truncated silently.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
In the tables which are used to parse the fields for the set_fields
command, there should never be a entry which has a size set to 8
bytes, and two pointers defined. Not only would it result in
undefined behavior in the compiled code, it doesn't make any sense and
is definitely a bug.
Reported-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Record the type of checksum algorithm we're using for metadata in the
superblock, in case we ever want/need to change the algorithm.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Add --{en,dis}able-mmp options for configure, default to enabled.
Also make tools fail gracefully in the event of encoutering a filesystem
with MMP enabled when the tools were compiled with --disable-mmp
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Several small fixes:
* Gracefully fail mmp commands if fs is not open
* Show magic number in dump_mmp command
* Fix header in output for set_mmp_value -l
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Multi-mount protection is feature that allows mke2fs, e2fsck, and
others to detect if the filesystem is mounted on a remote node (on
SAN disks) and avoid corrupting the filesystem. For e2fsprogs this
means that it checks the MMP block to see if the filesystem is in use,
and marks the filesystem busy while e2fsck is running on the system.
This is useful on SAN disks that are shared between high-availability
servers, or accessible by multiple nodes that aren't in HA pairs. MMP
isn't intended to serve as a primary HA exclusion mechanism, but as a
failsafe to protect against user, software, or hardware errors.
There is no requirement that e2fsck updates the MMP block at regular
intervals, but e2fsck does this occasionally to provide useful
information to the sysadmin in case of a detected conflict.
For the kernel (since Linux 3.0) MMP adds a "heartbeat" mechanism to
periodically write to disk (every few seconds by default) to notify
other nodes that the filesystem is still in use and unsafe to modify.
Originally-by: Kalpak Shah <kalpak@clusterfs.com>
Signed-off-by: Johann Lombardi <johann@whamcloud.com>
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The DEFS line in MCONFIG had gotten so long that it exceeded 4k, and
this was starting to cause some tools heartburn. It also made "make
V=1" almost useless, since trying to following the individual commands
run by make was lost in the noise of all of the defines.
So fix this by putting the configure-generated defines in lib/config.h
and the directory pathnames to lib/dirpaths.h.
In addition, clean up some vestigal defines in configure.in and in the
Makefiles to further shorten the cc command lines.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The set_fields commands (set_super_value, set_inode_field,
set_block_group) now handle fields which store in split fields on
ext4's on-disk format. For example, the superblock fields
s_blocks_count and s_blocks_count_hi.
The user can either set the low or high part of the field via
"blocks_count_lo" or "blocks_count_hi", or both parts can be set via
"blocks_count".
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reserve EXT4_FEATURE_RO_COMPAT_METADATA_CSUM and
EXT2_FEATURE_COMPAT_EXCLUDE_BITMAP. Also reserve fields in the
superblock and the inode for the checksums. In the block group
descriptor, reserve the exclude bitmap field for the snapshot feature,
and checksums for the inode and block allocation bitmaps.
With this commit, the metadata checksum and exclude bitmap features
should have reserved all of the fields they need in ext4's on-disk
format.
This commit also fixes an a missing byte swap for s_overhead_blocks.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Darrick J. Wong <djwong@us.ibm.com>
Cc: Amir Goldstein <amir73il@gmail.com>
It turns out that it's very hard to calculate overheads in the face of
clustered allocation (bigalloc). This is because multiple metadata
blocks from different block groups can end up in the same allocation
cluster. Calculating the exact overhead requires O(all block bitmaps)
in memory, or O(number of block groups**2) in time. So we will
calculate this at mkfs time and stash it in the superblock.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This adds the superblock fields needed so that dumpe2fs works and the
code points and renames the superblock fields from describing
fragments to clusters.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This patch adds support for detecting the new 'quota' feature in ext4.
The patch reserves code points for usr and group quota inodes and also
for the feature flag EXT4_FEATURE_RO_COMPAT_QUOTA.
Signed-off-by: Aditya Kali <adityakali@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
We also support for byte-swapping the Next3 fields, although the
current Next3 implementation doesn't support big-endian systems.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>