Run sparse against source files when building e2fsprogs with 'make C=1'. If
instead C=2, it configures basic ext2 types for bitwise checking with sparse,
which can help find the (many many) spots where conversion errors are
(possibly) happening.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The range of blocks to punch is treated as an inclusive range on both
ends, i.e. if start=1 and end=2, both blocks 1 and 2 are punched out.
Thus, start == end means that the caller wishes to punch a single
block. Remove the check that prevents us from punching a single
block.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
New function ext2fs_symlink() doesn't have a prototype in ext2fs.h and
thus debugfs compilation gives warning:
debugfs.c:2219:2: warning: implicit declaration of function 'ext2fs_symlink'
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This commit adds the functionality which had previously only been in
the tst_extents command to debugfs. The debugfs command extent_open
will open extent tree of a particular inode, and enables a series of
commands which will allow the user to interact with the extent tree
directly. Once the extent tree is closed via extent_open(), these
additional commands will be disabled again.
This commit exports two new functions from lib/ext2fs/extent.c which
had previously been statically defined: ext2fs_extent_node_split() and
ext2fs_extent_goto2().
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
An index node's logical start (ei_block) should
match the logical start of the first node (index
or leaf) below it. If we find a node whose start
does not match its parent, fix all of its parents
accordingly.
If it finds such a problem, we'll see:
Pass 1: Checking inodes, blocks, and sizes
Interior extent node level 0 of inode 274258:
Logical start 3666 does not match logical start 4093 at next level. Fix<y>?
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Since clang uses C99 semantics by default, the main changes required
to allow clang to build e2fsprogs was to add support the C99 inline
semantics, while still allowing us to be built when the legacy (but
still default for gcc) GNU C89 inline semantics are in force.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The creation of inline wrappers ext2fs_open_file() and ext2fs_stat()
in commit c859cb1de0 in ext2fs.h caused
difficulties with the use of headers, since the headers for open64()
and stat64() may already be included (and skip the declaration of the
64-bit variants) before ext2fs.h is ever read. There is no real way
to solve the missing prototypes and resulting compiler warnings inside
ext2fs.h.
Since ext2fs_open_file() and ext2fs_stat() are not performance
critical operations, they do not need to be inline functions at all,
and the needed function headers can be handled properly in one file.
Similarly, posix_memalloc() was having difficulties with headers, and
was being defined in ext2fs.h, but it is now only being used by a
single file, so move the required header there.
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Create a new function, ext2fs_get_dio_alignment(), which returns the
alignment requirements for direct I/O. This way we can factor out the
code from MMP and the Unix I/O manager. The two modules weren't
consistently calculating the alignment factors, and in particular MMP
would sometimes use a larger alignment factor than was strictly
necessary.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The lack of 32-bit support was causing febootstrap to crash since it
wasn't passing EXT2_FLAG_64BITS when opening the file system, so we
were still using the legacy bitmaps.
Also add support for bigalloc bitmap into the ffz functions.
Addresses-Red-Hat-Bugzilla: #808421
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The ext2fs_add_journal_inode() function calls
ext2fs_check_mount_point(), which can fail if /etc/mtab is missing.
This causes mke2fs to fail in the middle of the file system format
process; mke2fs calls ext2fs_check_mount_point() already (and has
appropriate fallbacks that calls fails), so add a flag so that mke2fs
can request ext2fs_add_journal_inode() to skip trying to call
e2fsck_check_mount_point().
Addresses-Sourceforge-Bug: #3509398
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The replica is a feature which stores multiple copies of the key
metadata blocks so a single block failure in failure-prone media
(read: certain types of flash storage) doesn't take out the entire
file system.
Discussion on the upstream list proved not to be very positive on this
feature; the arguments were that it added complexity that wasn't
warrented, since common practice in industry is to insist on reliable
media, and if media is unreliable, you're kind of toast anyway (unless
the file system is being used as the back-end store of a cluster file
system where checksuming and data replication is happening above the
local disk file system level). So, this feature is being developed
out of tree.
We reserve the code points so that other people won't accidentally
step on them. Since it's not upstream, it's a soft reservation, but
it's not like we have any shortage of RO_COMPAT features. We are a
bit more tight on reserved inodes, but EXT2_BOOT_LOADER_INO and
EXT2_UNDEL_DIR_INO are not currently used anywhere, and
EXT2_EXCLUDE_INO is a reservation for another out-of-tree feature.
There are no features currently being discussed which require a
reserved inode, but if a need were to arise, we can claw back code
point reservations that were never used or not in tree, as those will
always be considered lower priority than in-tree features.
Cc: Aditya Kali <adityakali@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Add a function to return the inode number of an open file.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This feature is especially useful for better understanding how e2fsprogs
tools (mainly e2fsck) treats bitmaps and what bitmap backend can be most
suitable for particular bitmap. Backend itself (if implemented) can
provide statistics of its own as well.
[ Changed to provide basic statistics when enabled with the
E2FSPROGS_BITMAPS_STATS environment variable -- tytso]
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Now that we have multiple backend implementations of the bitmap code,
this commit teaches e2fsck to use either the most appropriate backend
for each use case.
Since we don't know for sure if we will get it all right, the default
choices can be overridden via e2fsck.conf. The various definitions
are shown here, with the current defaults (which may change as we add
more bitmap implementations and as learn what works better).
; EXT2FS_BAMP64_BITARRAY is 1
; EXT2FS_BMAP64_RBTREE is 2
; EXT2FS_BMAP64_AUTODIR is 3
[bitmaps]
inode_used_map = 2 ; pass1
inode_dir_map = 3 ; pass1
inode_reg_map = 2 ; pass1
block_found_map = 2 ; pass1
inode_bad_map = 2 ; pass1
inode_imagic_map = 2 ; pass1
block_dup_map = 2 ; pass1
block_ea_map = 2 ; pass1
inode_link_info = 2 ; pass1
inode_dup_map = 2 ; pass1b
inode_done_map = 3 ; pass3
inode_loop_detect = 3 ; pass3
fs_bitmaps = 2
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This backend type will automatically switch between the bitarray and
the rbtree backend based on the number of directories in the file
system.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
For a long time we had a bitarray backend for storing filesystem
metadata bitmaps, however today this approach might hit its limits with
todays huge data storage devices, because of its memory utilization.
Bitarrays stores bitmaps as ..well, as bitmaps. But this is in most
cases highly unefficient because we need to allocate memory even for the
big parts of bitmaps we will never use, resulting in high memory
utilization especially for huge filesystem, when bitmaps might occupy
gigabytes of space.
This commit adds another backend to store bitmaps. It is based on
rbtrees and it stores just used extents of bitmaps. It means that it can
be more memory efficient in most cases.
I have done some limited benchmarking and it shows that rbtree backend
consumes approx 65% less memory that bitarray on 312GB filesystem aged
with Impression (default config). This number may grow significantly
with the filesystem size, but also it may be a lot lower (even negative)
if the inodes are very fragmented (need more benchmarking).
This commit itself does not enable the use of rbtree backend.
[ Simplified the code by avoiding unneeded memory allocation and
deallocation of del_ext. In addition, fixed a bug discovered by the
tst_bitmaps tests: rb_unamrk_bmap() must return true if the bit was
previously set in bitmap, and zero otherwise -- tytso ]
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This allows a program to control the bitmap backend implementation
that will get used without needing to change the current library API.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Quota support can be enabled using --enable-quota. There are still
some buglets that we need to fix up before it can be considered 100%
supported, so let's disable it for the 1.42 release.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Commit 6b56f3d92d introduced the use of HAVE_STAT64 without arranging
that it be defined in configure.in. Previously ext4.h used
HAVE_OPEN64, but apparently there are (broken) platforms that have
open64() but not stat64(). Go figure.
We do need to consistently use a single test for ext2fs_stat(),
ext2fs_fstat(), and struct ext2fs_struct_stat, or we could end up
passing a struct stat64 to a fstat() system call, or some such. I've
elected to use HAVE_FSTAT64 because: (a) it's already defined in the
configure script, and (b) if we ever come across a really broken
platform that defines fstat64() but not stat64(), we can always
emulate stat64() using open64() followed by a fstat64().
This commit fixed a bug whose symptoms were that mke2fs would not work
if given a file > 2GB on 32-bit platforms.
Addresses-Debian-Bug: #647245
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The ext2fs_file_acl_block() and ext2fs_set_file_acl_block() needs to
only check i_file_acl_high if the 64-bit flag is set. This is needed
because otherwise we will run into problems on Hurd systems which
actually use that field for h_i_mode_high.
This involves an ABI change since we need to pass ext2_filsys to these
functions. Fortunately these functions were first included in the
1.42-WIP series, so it's OK for us to change them now. (This is why
we have 1.42-WIP releases. :-)
Addresses-Sourceforge-Bug: #3379227
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Remove unused variables, places where 'return' was used with no value
in a non-void function, missing function declarations, etc. Don't
assume that all systems have quotactl(), and use <sys/quota.h> if it
exists to define the quotactl interfaces.
One of the unused variables also got rid of a non-portable use of
PATH_MAX.
Cc: Aditya Kali <adityakali@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
ext2fs.h now calls open() so it should include the headers needed
for this system call as well.
Addresses-Red-Hat-Bugzilla: #742147
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Fix several minor errors in structure definitions, the byteswap code,
and Makefiles that result from merging the crc32c and initial parts of
the metadata checksumming patchset.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Multi-mount protection is feature that allows mke2fs, e2fsck, and
others to detect if the filesystem is mounted on a remote node (on
SAN disks) and avoid corrupting the filesystem. For e2fsprogs this
means that it checks the MMP block to see if the filesystem is in use,
and marks the filesystem busy while e2fsck is running on the system.
This is useful on SAN disks that are shared between high-availability
servers, or accessible by multiple nodes that aren't in HA pairs. MMP
isn't intended to serve as a primary HA exclusion mechanism, but as a
failsafe to protect against user, software, or hardware errors.
There is no requirement that e2fsck updates the MMP block at regular
intervals, but e2fsck does this occasionally to provide useful
information to the sysadmin in case of a detected conflict.
For the kernel (since Linux 3.0) MMP adds a "heartbeat" mechanism to
periodically write to disk (every few seconds by default) to notify
other nodes that the filesystem is still in use and unsafe to modify.
Originally-by: Kalpak Shah <kalpak@clusterfs.com>
Signed-off-by: Johann Lombardi <johann@whamcloud.com>
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Several compiler errors are quieted:
- zero-length gnu_printf format string
- unused variable
- uninitalized variable (though it isn't actually used for anything)
- fixed a bug in ext2fs_stat() if stat64() does not exist
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This adds new APIs: ext2fs_flush2 and ext2fs_close2 which take an
extra 'int flags' parameter.
This allows us to pass in an EXT2_FLAG_FLUSH_NO_SYNC flag which avoids
fsync'ing the filesystem when closing it. For the case we have in
mind where we are just constructing a throwaway ext2 filesystem in a
file in order to boot a VM, this saves over 5 seconds during the boot
process and avoids many unnecessary disk writes.
Existing code using ext2fs_flush and ext2fs_close remains unaffected
by this change.
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Code to count the number of blocks in the last partial
group is cut and pasted around the e2fsprogs codebase
a few times.
Making this a helper function should improve matters.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
In many places we are using #ifdef HAVE_OPEN64 to determine if we can
use open64() but that's ugly. This commit creates two new helpers
ext2fs_open_file() for open() and ext2fs_stat() for stat(). Also we need
new typedef ext2fs_struct_stat for struct stat.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Add a slicing-by-8 CRC32c implementation for metadata checksumming.
Adapted from Bob Pearson's kernel patch.
Also added a self-test mechanism so we can verify that the crc32c
implementation is working correctly.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The dump program relies on fs->frag_size and the
EXT2_FRAGS_PER_BLOCK() macro. Kind of silly for it to do so, but it's
part of the kludgy way the dump program (which was originally written
for the BSD FFS was ported over to support ext2/3.) Given how it
makes assumptions about the ext2/3/4 file system being similar to the
BSD FFS, it's a bit of a miracle it works for ext4 --- or at least
appears to work...
Addresses-Debian-Bug: #636418
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This patch adds support for doing quota accounting during full
e2fsck scan if the 'quota' feature was set on the superblock.
If user-visible quota inodes are in use, they will be hidden
and converted to the reserved quota inodes.
Signed-off-by: Aditya Kali <adityakali@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Add the ability to skip zeroing journal blocks on disk. This can
significantly speed up mke2fs with large journals. At worst the
uninitialized journal is only a very short-term risk (if at all),
because the journal will be overwritten on any new filesystem as
soon as any significant amount of data is written to disk, and
the new journal TID would need to match the offset/TID of an old
commit block still left on disk.
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The write_journal_inode() code is only setting the low 32-bit i_size
for the journal size, even though it is possible to specify a journal
up to 10M blocks in size. Trying to create a journal larger than 2GB
will succeed, but an immediate e2fsck would fail. Store i_size_high
for the journal inode when creating it, and load it upon access.
Use s_jnl_blocks[15] to store the journal i_size_high backup. This
field is currently unused, as EXT2_N_BLOCKS is 15, so it is using
s_jnl_blocks[0..14], and i_size is in s_jnl_blocks[16].
Rename the "size" argument "num_blocks" for the journal creation functions
to clarify this parameter is in units of filesystem blocks and not bytes.
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Older distros do not define posix_memalign() by default in the
headers. If ext2fs.h is included early in the headers, it is
possible to "#define _XOPEN_SOURCE 600" so that the stdlib.h
header will define it, but if ext2fs.h is included after stdlib.h
there is no posix_memalign() declaration.
Add a posix_memalign() declaration if stdlib.h didn't do it. This
is a bit of a hack for GNU headers, but it works on Linux and OS/X
without problems.
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Fix several types of compiler warnings (unused variables/labels),
uninitialized variables, etc that are hit with gcc -Wall.
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This patch makes the following changes:
* ext2fs_allocate_block_bitmap() now allocates a bitmap with cluster
granularity for bigalloc file systems. For mke2fs and e2fsck, a
newly added function, ext2fs_allocate_subcluster_bitmap() allocates
a bitmap with block granularity (even for bigalloc file systems).
The newly added function ext2fs_get_bitmap_granularity() will return
the number of bits (log2) of the granularity used by the bitmap.
* The ext2fs_{mark,unmark,test}_block_bitmap2() functions will shift
their passed-in argument by log2(cluster_ganularity) bits right.
This means that the arguments for the single-argument bitmap
functions will be interpreted with block granluarity, since this
minimizes code changes in the rest of the code base.
* The ext2fs_{get,set}_block_bitmap_range() functions will interpret
their arguments in cluster granularity. This is a bit inconsistent,
but the caller of those functions will need to be taught about the
subtleties of clusters for bigalloc file systems.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The log2 of the ratio of cluster size to block size is far more useful
than just storing the cluster size. So make this change, and then
define basic utility macros: EXT2FS_CLUSTER_RATIO(),
EXT2FS_CLUSTER_MASK(), EXT2FS_B2C(), EXT2FS_C2B(), and
EXT2FS_NUM_B2C().
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Add functions ext2fs_get_memzero() which will malloc() the memory
using ext2fs_get_mem(), but it will zero the allocated memory afterwards
with memset().
Add function ext2fs_get_arrayzero() which will use calloc() for
allocating and zero-out the array.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>