ext2fs_llseek() was using lseek instead of lseek64. The
only time it would use lseek64 is if passed an offset that
overflowed 32 bits. This works for SEEK_SET, but not
SEEK_CUR, which can apply a small offset to move the file
pointer past the 32 bit limit.
The code has been changed to instead try lseek64 first, and
fall back to lseek if that fails. It also was doing a
runtime check of the size of off_t. This has been moved to
compile time.
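A minimal sketch of the "try the 64-bit call first, then fall back" ordering described above. It is not the library's actual code: sketch_llseek() and SKETCH_HAVE_LSEEK64 are hypothetical names, the glibc test stands in for a real configure check, and treating ENOSYS as the fallback trigger is an assumption.

```c
#define _LARGEFILE64_SOURCE /* asks glibc to declare lseek64() and off64_t */
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for a configure-time "have lseek64" check. */
#if defined(__GLIBC__)
#define SKETCH_HAVE_LSEEK64 1
#endif

/* Width checks happen at compile time rather than at run time. */
static_assert(sizeof(long long) >= 8, "need a 64-bit offset type");

/* Hypothetical wrapper: returns the new file position, or -1 with errno set. */
long long sketch_llseek(int fd, long long offset, int whence)
{
#ifdef SKETCH_HAVE_LSEEK64
    /* Use the 64-bit call for every whence value, so that a small
     * SEEK_CUR offset can still carry the position past 2^31. */
    off64_t pos = lseek64(fd, (off64_t) offset, whence);

    if (pos != (off64_t) -1 || errno != ENOSYS)
        return (long long) pos;
#endif
    /* Fallback: plain lseek(). The sizeof comparison is a constant
     * expression, so the compiler resolves it at build time. */
    if (sizeof(off_t) < sizeof(long long) &&
        offset != (long long) (off_t) offset) {
        errno = EINVAL;
        return -1;
    }
    return (long long) lseek(fd, (off_t) offset, whence);
}
```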
This fixes a problem which would cause e2image when built for
x86-32 to bomb out when used with large file systems.
Signed-off-by: Phillip Susi <psusi@ubuntu.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The ext2fs_{mark,unmark,test}_block_bitmap2() functions understand
clusters, and will take block numbers and convert them to
clusters before checking the bitmap. The
ext2fs_*_block_bitmap_range2() functions did not do this, which made
them inconsistent. Fortunately, nothing has depended on this
incorrect behavior, and in fact most of the uses of these functions
have only recently been added, and only for optimizations that were
only enabled for non-bigalloc file systems.
So this is a change in previously exported functions, but (a) it
doesn't change the behavior at all for non-bigalloc file systems, and
(b) the change is more likely to fix bugs for bigalloc file systems.
For example, this change fixes a problem with resize2fs and bigalloc
file systems.
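As an illustration of the block-to-cluster conversion a range function needs, here is a small sketch. The struct and function names are hypothetical, and cluster_bits is assumed to be log2 of the number of blocks per cluster (0 on non-bigalloc file systems, which is why behavior there is unchanged).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: a run of clusters. */
struct sketch_range {
    uint64_t first; /* first cluster touched by the block range */
    uint64_t count; /* number of clusters touched */
};

/* Convert the block range [block, block + num) into the cluster range
 * that covers it. */
struct sketch_range sketch_block_range_to_clusters(uint64_t block,
                                                   uint64_t num,
                                                   unsigned int cluster_bits)
{
    struct sketch_range r;
    uint64_t end;

    assert(num > 0);
    assert(cluster_bits < 32);

    end = block + num;               /* one past the last block */
    r.first = block >> cluster_bits; /* round the start down */
    /* Round the end up to the next cluster boundary. */
    end = (end + (((uint64_t) 1 << cluster_bits) - 1)) >> cluster_bits;
    r.count = end - r.first;
    return r;
}
```

With 16 blocks per cluster (cluster_bits = 4), blocks 17 through 36 touch clusters 1 and 2, so a range call covering them should mark two clusters rather than twenty.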
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Creating symlinks is a complex affair when accounting for slowlinks.
Create a new function, ext2fs_symlink(), modeled after ext2fs_mkdir().
Like ext2fs_mkdir(), ext2fs_symlink() takes on the task of allocating a
new inode and block (for slowlinks), setting up sane default values in
the inode, copying the target path to either the inode (for fastlinks)
or to the first block (for slowlinks), and accounting for the inode and
block stats. Disallow link targets longer than blocksize as the Linux
kernel prevents this.
It does not attempt to expand the parent directory, instead returning
EXT2_ET_DIR_NO_SPACE and leaving it to the caller to expand just as
ext2fs_mkdir() does. Ideally, I think both of these functions should
make a single attempt to expand the directory.
[ Fixed a few bugs discovered when creating a test case for ext2fs_symlink() ]
Signed-off-by: Darren Hart <dvhart@infradead.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Andreas Dilger <adilger@dilger.ca>
To maintain the error code numbering, we need to pull in the changes
from the 1.43.x development branch for libext2fs's error table.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
If the user attempts to create a 512MB cluster, we need to adjust the
defaults to avoid a 32-bit overflow of s_blocks_per_group. Also check
to make sure that the caller of ext2fs_initialize() has not given a
value of s_clusters_per_group that would result in an overflow of
s_blocks_per_group.
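A sketch of the kind of overflow check described above. The function name is hypothetical, and the example numbers assume a 4k block size, where a 512MB cluster is 2^17 blocks and a full one-block bitmap describes 2^15 clusters per group.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if clusters_per_group << cluster_ratio_bits still fits in a
 * 32-bit s_blocks_per_group field, 0 if it would overflow. */
int sketch_blocks_per_group_fits(uint32_t clusters_per_group,
                                 unsigned int cluster_ratio_bits)
{
    uint64_t blocks;

    assert(cluster_ratio_bits < 32);
    /* Widen before shifting so the overflow is visible, not wrapped. */
    blocks = (uint64_t) clusters_per_group << cluster_ratio_bits;
    return blocks <= UINT32_MAX;
}
```

Under those assumptions, 2^15 clusters of 2^17 blocks each is exactly 2^32 blocks per group, one more than a 32-bit field can hold, so the default has to be reduced.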
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
The MMP code was added in the wrong place, so ret_fs could get set
(and EXT2_FLAG_NOFREE_ON_ERROR cleared) before the MMP checks had
run. This could confuse e2fsck, which depends on this flag being
cleared only if ext2fs_open2() succeeded.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
There are a number of places where we multiply a dgrp_t with
s_blocks_per_group expecting that we will get a blk64_t. This
requires a cast, or using the convenience function
ext2fs_group_first_block2().
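The difference between the two forms can be sketched as follows. The typedefs and function names are stand-ins for illustration only, not the library's definitions.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t sketch_dgrp_t;  /* stand-in for dgrp_t */
typedef uint64_t sketch_blk64_t; /* stand-in for blk64_t */

/* Buggy form: the multiply is done in 32 bits and wraps before the
 * result is widened. */
sketch_blk64_t sketch_first_block_buggy(uint32_t first_data_block,
                                        uint32_t blocks_per_group,
                                        sketch_dgrp_t group)
{
    uint32_t wrapped = group * blocks_per_group;

    return first_data_block + wrapped;
}

/* Fixed form: cast one operand so the multiply is done in 64 bits. */
sketch_blk64_t sketch_first_block(uint32_t first_data_block,
                                  uint32_t blocks_per_group,
                                  sketch_dgrp_t group)
{
    assert(blocks_per_group > 0);
    return first_data_block + (sketch_blk64_t) group * blocks_per_group;
}
```

For group 200000 with 32768 blocks per group, the fixed form yields block 6553600000, while the 32-bit multiply wraps to 2258632704.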
This audit was suggested by Eric Sandeen.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Use the ext2fs_[un]mark_block_bitmap_range2() functions to reduce the CPU
overhead of resizing large file systems by 45%, primarily by
reducing the time spent in fix_uninit_block_bitmaps().
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Quiet a number of simple compiler warnings:
- pointers not initialized by ext2fs_get_mem()
- return without value in non-void function
- dereferencing type-punned pointers
- unused variables
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
In a number of places, the output format from "make check" is
incorrectly interpreted as compiler warning output (triggered by
the presence of colons and parentheses in the output). Convert
these lines to similar output that does not trigger false build
warnings.
In the case of the tst_uuid.c program, the "ctime()" output was
difficult to change, but in fact it is better to actually compare
the time-based UUID against wallclock time instead of just printing
the formatted time as a string, so this test is improved.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Force the use of the static libraries when linking the test program so
that "make check" works when the shared libraries have not been
installed, and so that we test against the version of the libraries in
the source tree.
Reported-by: g.esp@free.fr
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This commit adds the functionality which had previously only been in
the tst_extents command to debugfs. The debugfs command extent_open
will open the extent tree of a particular inode, and enable a series of
commands which will allow the user to interact with the extent tree
directly. Once the extent tree is closed via extent_close, these
additional commands will be disabled again.
This commit exports two new functions from lib/ext2fs/extent.c which
had previously been statically defined: ext2fs_extent_node_split() and
ext2fs_extent_goto2().
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Previously, ext2fs_extent_fix_parents() would only avoid modifying the
cursor location associated with the extent handle if the cursor was
pointed at a leaf node in the extent tree. This is because it saved
the starting logical block number of the current extent, but not the
"level" of the extent (where level 0 is the leaf node, level 1 is the
interior node which points at blocks containing leaf nodes, etc.)
Fix ext2fs_extent_fix_parents() so it is guaranteed to not change the
current extent in the handle even if the current extent is not at the
bottom of the tree.
Also add a fix_extent command to the tst_extents program to make it
easier to test this function.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
An index node's logical start (ei_block) should
match the logical start of the first node (index
or leaf) below it. If we find a node whose start
does not match its parent, fix all of its parents
accordingly.
If e2fsck finds such a problem, we'll see:
    Pass 1: Checking inodes, blocks, and sizes
    Interior extent node level 0 of inode 274258:
    Logical start 3666 does not match logical start 4093 at next level. Fix<y>?
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The current m68k code was buggy for multiple reasons. First, the bfset
et al. instructions interpret the bit number as a signed number, not an
unsigned number. Second, there were missing memory clobbers. Since
there is no real benefit in using explicit asm's at this point (gcc is
smart enough to optimize the generic C code to use the set/clear/test
bit m68k instructions), fix this bug by removing the m68k specific asm
versions of these functions.
Tested on m68k-linux with e2fsprogs-1.42.6 and gcc-4.6.3 as before.
All tests pass and the debug output looks sane.
I compared the e2fsck binaries from the previous build with this
one. They had identical .text sizes, and almost the same number
of bit field instructions (obviously compiler-generated), so this
change should have no serious performance implications.
Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Fix a potential memory leak reported by Li Xi. In addition, there
were possible error cases where the file descriptor would not be
properly closed, so fix those as well while we're at it.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reported-by: Li Xi <pkuelelixi@gmail.com>
Make sure the s_mmp_update_interval super block field is set
from the file system parameters block which is passed into the
ext2fs_initialize() function.
Addresses-Lustre-Bug: LU-1888
Signed-off-by: Gregoire Pichon <gregoire.pichon@bull.net>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This optimizes the CPU utilization of the rb_get_bmap_range()
function when most of the bitmap is allocated.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
This simplifies the rb_get_bmap_range() function and speeds it up for
the case where most of the bitmap is zero.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
This function efficiently counts the number of bits in a block of
memory.
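The message does not say how the counting is done, so the following is only one possible approach: a nibble lookup table, chosen here because it is portable. The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Count the set bits in len bytes starting at buf. */
unsigned long sketch_count_bits(const void *buf, size_t len)
{
    /* Number of set bits in each 4-bit value 0..15. */
    static const unsigned char nibble_bits[16] = {
        0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4
    };
    const unsigned char *p = buf;
    unsigned long total = 0;
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++)
        total += nibble_bits[p[i] & 0x0f] + nibble_bits[p[i] >> 4];
    return total;
}
```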
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
This speeds up reading bitmaps from disk for very large (and full)
disks by significant amounts (i.e., up to two CPU minutes for a 4T
file system).
Addresses-Google-Bug: #7534813
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Restructure the ext2fs_get_device_size() and blkid_get_dev_size()
code to localize the variables used for different device probing
methods. This at least reduces the #ifdef mess to only one part
of the code for each method, and avoids "unused variable" compiler
warnings added when variables are declared without being #ifdef'd.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Change the license of the mmp.c file to LGPL to match the license
of other files in the libext2fs library.
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Profiling shows that rb_test_bit() is now calling ext2fs_rb_next() a
lot, and this function is now the hot spot when running e2freefrag.
If we cache the results of ext2fs_rb_next(), we can eliminate those
extra calls, which further speeds up both e2freefrag and e2fsck by
reducing the amount of CPU time spent in userspace.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The code was previously allocating a single 4 or 8 byte pointer for
the rcursor and wcursor fields in the ext2fs_rb_private structure;
this added two extra memory allocations (which could fail), and extra
indirections, for no good reason. Removing the extra indirection also
makes the code more readable, so it's all upside and no downside.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Optimize testing for a bit in an rbtree-based bitmap for the case
where the calling application is scanning through the bitmap
sequentially. Previously, we did this for a set of bits which were
inside an allocated extent, but we did not optimize the case where
there was a large number of bits after an allocated extent which were
not in use.
    1111111111111110000000000000000000
    ^ optimized    ^ not optimized
In my tests of a roughly half-filled file system, the run time of
e2freefrag was halved, and the CPU time spent in userspace during
e2fsck's pass 5 was reduced by 30%.
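The two fast paths can be sketched as follows. To keep the example short it uses a sorted array of extents with a cached index in place of the red-black tree, so it shows the idea rather than the library's data structure; all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_extent {
    uint64_t start;
    uint64_t count;
};

struct sketch_bitmap {
    const struct sketch_extent *ext; /* sorted, non-overlapping */
    size_t n;
    size_t cur;                      /* cached: extent used by last lookup */
    unsigned long slow_lookups;      /* how often the search was needed */
};

/* Binary search: index of the last extent with start <= bit, else 0. */
static size_t sketch_search(const struct sketch_bitmap *b, uint64_t bit)
{
    size_t lo = 0, hi = b->n;

    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;

        if (b->ext[mid].start <= bit)
            lo = mid;
        else
            hi = mid;
    }
    return lo;
}

int sketch_test_bit(struct sketch_bitmap *b, uint64_t bit)
{
    const struct sketch_extent *e;

    assert(b != NULL);
    if (b->n == 0)
        return 0;
    e = &b->ext[b->cur];
    if (bit >= e->start) {
        /* Fast path 1: the bit is inside the cached extent. */
        if (bit < e->start + e->count)
            return 1;
        /* Fast path 2: the bit is in the unused gap after the cached
         * extent and before the next one (or after the last one). */
        if (b->cur + 1 == b->n || bit < b->ext[b->cur + 1].start)
            return 0;
    }
    /* Slow path: search, then remember where we ended up. */
    b->slow_lookups++;
    b->cur = sketch_search(b, bit);
    e = &b->ext[b->cur];
    return bit >= e->start && bit < e->start + e->count;
}
```

A sequential scan only takes the slow path when it crosses into a new extent; every bit inside an extent or inside the gap that follows it is answered from the cached position.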
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Throttle the "Allocating Groups" progress updates to once a second
as well. We now do this throttling in libext2fs, so we don't
have to do this for each of mke2fs's progress updates, and because the
updates from ext2fs_allocate_tables() come from within libext2fs
anyway.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Since clang uses C99 semantics by default, the main change required
to allow clang to build e2fsprogs was to add support for the C99 inline
semantics, while still allowing us to be built when the legacy (but
still default for gcc) GNU C89 inline semantics are in force.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The mke2fs -m option can set the reserved blocks ratio up to 50%. But if
the last block group is not big enough to hold the necessary data
structures and gets dropped, we have to recalculate the number of
reserved blocks so that it matches the requested percentage.
This also avoids a problem where, if the user specifies a reserved
blocks ratio of 50% and the last partial block group is dropped, the
number of reserved blocks ends up greater than 50% and e2fsck
complains.
Steps to reproduce:
1. Create a FS which has the overhead for the last BG
   and specify 50% for reserved blocks ratio
    # mke2fs -m 50 -t ext4 DEV 1025M
    mke2fs 1.42.5 (29-Jul-2012)
    warning: 256 blocks unused.
    Filesystem label=
    OS type: Linux
    Block size=4096 (log=2)
    Fragment size=4096 (log=2)
    Stride=0 blocks, Stripe width=0 blocks
    656640 inodes, 2621440 blocks
    1310848 blocks (50.00%) reserved for the super user
    ~~~~~~~ <-- Reserved blocks exceed 50% of FS blocks count!
2. e2fsck outputs filesystem corruption
    # e2fsck DEV
    e2fsck 1.42.5 (29-Jul-2012)
    Corruption found in superblock. (r_blocks_count = 1310848).
    The superblock could not be read or does not describe a correct ext2
    filesystem. If the device is valid and it really contains an ext2
    filesystem (and not swap or ufs or something else), then the superblock
    is corrupt, and you might try running e2fsck with an alternate superblock:
        e2fsck -b 32768 <device>
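The arithmetic behind the log above can be sketched as follows. The function name is hypothetical, an integer percentage is a simplification, and the pre-trim block count of 2621696 is an inference (2621440 final blocks plus the 256 reported as unused), not a figure stated in the output.

```c
#include <assert.h>
#include <stdint.h>

/* Reserved block count for a given percentage of the file system size. */
uint64_t sketch_reserved_blocks(unsigned int percent, uint64_t blocks_count)
{
    assert(percent <= 50);
    return blocks_count * percent / 100;
}
```

Computed against the inferred pre-trim size, 50% is 1310848 blocks, which matches the value in the log and is more than half of the final 2621440 blocks. Recomputing after the last group is dropped gives 1310720, which is exactly half.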
Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Commit a7c17431b9 attempted to fix a problem where the system
libraries might get used instead of local libraries for things like
-lcom_err. It tried to accomplish this by moving $(ELF_OTHER_LIBS) to
before $(LDFLAGS).
Unfortunately, this was the wrong fix; $(ELF_OTHER_LIBS) *MUST* be
after the object files, or the linker might not pull in the necessary
library and not include it into the DT_NEEDED section of the shared
library. The proper fix is to add a -L$(LIB) before $(LDFLAGS), and
then remove the -L option from all of the ELF_OTHER_LIBS definitions
in the library Makefiles.
Addresses-Sourceforge-Bug: #3554345
Cc: Olivier Blin <olivier.blin@softathome.com>
Reported-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The following commands:
dd if=/dev/zero of=/tmp/foo count=1 ibs=$(( 256 * 1024 * 1024 ))
mke2fs -N 256 -t ext4 /tmp/foo
... will cause mke2fs to write until it fills the device. The cause
for this is that the explicit request for 256 inodes causes the number
of inodes per block group to be 8. The ext2fs_initialize() function
assumed that all of the reserved inodes would be in the first block
group, which is not true in this case. This caused the number of
uninitialized inodes in the first block group to be negative, which
then resulted in mke2fs trying to zero out a very large number of
blocks. Oops.
Addresses-Sourceforge-Bug: #3528892
Reported-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The Build Log Hardening Check (blhc) is a Debian tool which scans the
output of a package build, making sure that the security hardening
flags are used when compiling and linking all of the binaries in a
package.
For the most part we were passing CFLAGS, CPPFLAGS, and LDFLAGS down
to the compiler and link commands, but there were one or two
exceptions. In addition, there were a few places in "make install"
where the V=1 option was not being honored, which triggered blhc
warnings since it couldn't analyze those commands.
The e2fsck.static was the only binary that was not getting built and
packaged with the hardening flags, but I've fixed all of the blhc
warnings so in the future it will be obvious if we regress.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
When e2fsck uses the block iterator to release the blocks in an
extent-mapped inode, and the last block in an extent is removed, the
current extent has been removed and the extent cursor is now pointing
at the next extent. But the block iterator code doesn't know that. So
when it tries to go to the next extent, it will end up skipping an
extent, and so the inode will be incompletely truncated.
The fix is to go to the next extent before calling the callback
function for the current extent. This way, regardless of whether the
current extent gets removed, the extent cursor is still pointing at
the right place.
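The same "advance first, then call back" pattern can be shown on a plain linked list. This is only an analogy for the extent cursor: the list, the callback type, and all names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Circular doubly linked list with a dummy head node. */
struct sketch_node {
    int val;
    struct sketch_node *prev, *next;
};

typedef void (*sketch_cb)(struct sketch_node *n, void *priv);

void sketch_list_init(struct sketch_node *head)
{
    head->prev = head->next = head;
}

int sketch_list_add_tail(struct sketch_node *head, int val)
{
    struct sketch_node *n = malloc(sizeof(*n));

    if (n == NULL)
        return -1;
    n->val = val;
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
    return 0;
}

void sketch_list_remove(struct sketch_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    free(n);
}

/* Visit every node; the callback is allowed to remove the node it is
 * given. Returns the number of nodes visited. */
int sketch_iterate(struct sketch_node *head, sketch_cb cb, void *priv)
{
    struct sketch_node *cur, *next;
    int visited = 0;

    assert(head != NULL && cb != NULL);
    for (cur = head->next; cur != head; cur = next) {
        next = cur->next; /* step to the next element first ...  */
        cb(cur, priv);    /* ... so cur may safely disappear here */
        visited++;
    }
    return visited;
}

/* Example callback: removes every node it visits. */
void sketch_remove_all(struct sketch_node *n, void *priv)
{
    (void) priv;
    sketch_list_remove(n);
}
```

Because the successor is captured before the callback runs, removing the current element neither skips an element nor touches freed memory.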
Reported-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
When libext2fs allocates/deletes an extent leaf, the i_blocks
value is incremented/decremented by fs->blocksize / 512. This
is incorrect in case of bigalloc. The correct way here is to
use cluster_size / 512.
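As a worked example of the two formulas, the sketch below computes the i_blocks delta in 512-byte units. The function name is hypothetical, and cluster_ratio_bits is assumed to be log2 of the blocks per cluster (0 without bigalloc).

```c
#include <assert.h>
#include <stdint.h>

/* Number of 512-byte sectors to add to (or subtract from) i_blocks
 * when one allocation unit is allocated (or freed). */
uint64_t sketch_sectors_per_alloc(uint32_t blocksize,
                                  unsigned int cluster_ratio_bits)
{
    uint64_t cluster_size;

    assert(blocksize >= 512);
    assert(cluster_ratio_bits < 32);
    cluster_size = (uint64_t) blocksize << cluster_ratio_bits;
    return cluster_size / 512;
}
```

With 4k blocks and no bigalloc this is 8, the same as fs->blocksize / 512. With 16 blocks per cluster it is 128, so using the block size would under-count i_blocks by a factor of 16.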
The problem is seen if we try to create a large inode using
libext2fs (say using ext2fs_block_iterate3()) on a bigalloc
filesystem. fsck catches this and complains.
Signed-off-by: Aditya Kali <adityakali@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The creation of inline wrappers ext2fs_open_file() and ext2fs_stat()
in commit c859cb1de0 in ext2fs.h caused
difficulties with the use of headers, since the headers for open64()
and stat64() may already be included (and skip the declaration of the
64-bit variants) before ext2fs.h is ever read. There is no real way
to solve the missing prototypes and resulting compiler warnings inside
ext2fs.h.
Since ext2fs_open_file() and ext2fs_stat() are not performance
critical operations, they do not need to be inline functions at all,
and the needed function headers can be handled properly in one file.
Similarly, posix_memalign() was having difficulties with headers, and
was being defined in ext2fs.h, but it is now only being used by a
single file, so move the required header there.
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
For a completely full filesystem with more than 2^32 blocks, the
rbtree bitmap backend can assemble an extent of used blocks which is
longer than 2^32. If it does, it will overflow ->count, and corrupt
the rbtree for the bitmaps.
Discovered by completely filling a 32T filesystem using fallocate, and
then observing debugfs, dumpe2fs, and e2fsck all behaving badly.
(Note that filling with only 31 x 1T files did not show the problem,
because freespace was fragmented enough that there was no sufficiently
long range of used blocks.)
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Change the include path in the Cflags field so that both #include
<lib/foo.h> and #include <foo.h> will work. We had originally used a
Cflags value which allowed <foo.h> to work, but many applications
(especially those not using pkg-config) had been using the <lib/foo.h>
formulation, which didn't require an explicit -I${includedir} option
to the C compiler. If those applications then converted over to
pkg-config, and the e2fsprogs libraries were installed with a prefix
other than /usr, so that the header files were in some directory such
as /usr/local/include, a program that used #include <lib/foo.h> would
fail to compile.
So change the pkg-config files to include both -I${includedir} and
-I${includedir}/lib.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The code was assuming that "unsigned long" was 64-bit, which of course
it isn't on 32-bit systems. This caused blocks to get written to the
wrong place.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
MacOS 10.5 has neither posix_memalign() nor memalign(), but it does
have valloc(). The Android SDK would like to be built on MacOS 10.5,
so I've added support for a good-enough emulation of memalign()'s
functionality using valloc(), with an explicit test to make sure
valloc() is returning a pointer which is sufficiently aligned given
the requested alignment. This won't work if you try to operate on a
file system with a 16k blocksize using an e2fsprogs built on a MacOS
10.5 system, but it is good enough for the common case of 4k
blocksize file systems, and we will let the memory allocation fail if
the alignment is not good enough.
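A rough sketch of that emulation, assuming the requested alignment is a power of two. The function name and the choice of ENOMEM as the error value are illustrative, not the library's interface.

```c
#define _DEFAULT_SOURCE /* asks glibc to declare valloc() */
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocate size bytes aligned to align using valloc(); returns 0 on
 * success or an errno-style code on failure. */
int sketch_memalign_valloc(size_t size, size_t align, void **out)
{
    void *p;

    assert(out != NULL);
    assert(align != 0 && (align & (align - 1)) == 0); /* power of two */

    p = valloc(size); /* page-aligned on systems that provide it */
    if (p == NULL)
        return ENOMEM;
    /* Explicit check: fail rather than hand back a buffer whose
     * alignment is weaker than what the caller asked for. */
    if (((uintptr_t) p & (align - 1)) != 0) {
        free(p);
        return ENOMEM;
    }
    *out = p;
    return 0;
}
```

Since valloc() aligns to the page size, a request for alignment larger than a page is the case where the explicit check may reject the allocation.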
I've also added a unit test for ext2fs_get_memalign() so we can be
sure it's working as expected. I've tested the code paths with
HAVE_POSIX_MEMALIGN defined, HAVE_POSIX_MEMALIGN undefined, and
HAVE_POSIX_MEMALIGN and HAVE_MEMALIGN undefined on an x86 Linux
system, and so I know the valloc() code path works OK. The simplistic
(and less safe) patch at:
https://trac.macports.org/attachment/ticket/33692/patch-lib-ext2fs-inline.c.diff
shows that using valloc() apparently works OK for MacOS 10.5 (but if
it doesn't, the unit test will catch the problem).
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Create a new function, io_channel_alloc_buf() which allocates I/O
buffers with appropriate alignment if we are using direct I/O. The
original code was sometimes using a larger alignment factor than
necessary, and would always request an aligned memory buffer even when
it was not necessary since the block device was not opened with
O_DIRECT.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Read in a full block for each allocation bitmap, to avoid using a
kernel bounce buffer when using direct I/O.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Create a new function, ext2fs_get_dio_alignment(), which returns the
alignment requirements for direct I/O. This way we can factor out the
code from MMP and the Unix I/O manager. The two modules weren't
consistently calculating the alignment factors, and in particular MMP
would sometimes use a larger alignment factor than was strictly
necessary.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The align field which indicated the required data alignment of data
buffers was stored in a field specific to the unix_io manager. Move
it to the top-level io_channel structure so it can be better
generalized.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>