Sami Liedes found a scenario where we could memcpy incorrectly:
If a block read fails during an e2fsck run, the UNIX IO manager will
call the io->read_error routine with a pointer to the internal block
cache. The e2fsck read error handler immediately tries to write the
buffer back out to disk(!), at which point the block write code will
try to copy the buffer contents back into the block cache. Normally
this is fine, but not when the write buffer is the cache itself!
So, plumb in a trivial check for this condition. A more thorough
solution would pass a duplicated buffer to the IO error handlers, but
I don't know if that happens frequently enough to be worth the extra
point of failure.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reported-by: Sami Liedes <sami.liedes@iki.fi>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Add a readahead method for prefetching ranges of disk blocks. This is
useful for inode table scanning, and other large contiguous ranges of
blocks, and may also prove useful for random block prefetch, since it
will allow reordering of the IO without waiting synchronously for the
reads to complete.
It is currently using the posix_fadvise(POSIX_FADV_WILLNEED)
interface, as this proved most efficient during our testing.
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Commit baa3544609 ("libext2fs: have UNIX IO manager use
pread/pwrite) causes a breakage on 32-bit systems where off_t is
32-bits for file systems larger than 4GB. Fix this by using
pread64/pwrite64 if possible, and if pread64/pwrite64 is not present,
using pread/pwrite only if the size of off_t is at least as big as
ext2_loff_t.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If pread/pwrite are present, have the UNIX IO manager use them for
aligned IOs (instead of the current seek -> read/write), thereby
saving us a (minor) amount of system call overhead.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Using C99 initializers makes the code a bit more readable, and it
avoids some gcc -Wall warnings regarding missing initializers.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Declare struct_io_manager at the end of unix_io.c, undo_io.c, and
test_io.c files so that there isn't a need to forward declare every
member of this structure. That avoids a lot of redundant code
at the start of every one of these files.
Move the test_flush() function above test_abort() to avoid the need
for a forward declaration.
Fix a few instances of space before tab in these files.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Some different types such as u_int16_t and __uint32_t have snuck into
e2fsprogs. These types are not guaranteed by any standard, and they
are not provided by dietlibc. Convert them to __u16, __u32,
etc. since these are guaranteed to be provided by e2fsprogs' build.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Fix a potential memory leak reported by Li Xi. In addition, there
were possible error cases where the file descriptor would not be
properly closed, so fix those as well while we're at it.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reported-by: Li Xi <pkuelelixi@gmail.com>
The creation of inline wrappers ext2fs_open_file() and ext2fs_stat()
in commit c859cb1de0 in ext2fs.h caused
difficulties with the use of headers, since the headers for open64()
and stat64() may already be included (and skip the declaration of the
64-bit variants) before ext2fs.h is ever read. There is no real way
to solve the missing prototypes and resulting compiler warnings inside
ext2fs.h.
Since ext2fs_open_file() and ext2fs_stat() are not performance
critical operations, they do not need to be inline functions at all,
and the needed function headers can be handled properly in one file.
Similarly, posix_memalloc() was having difficulties with headers, and
was being defined in ext2fs.h, but it is now only being used by a
single file, so move the required header there.
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The code was assuming that "unsigned long" was 64-bit, which of course
it isn't on 32-bit systems. This caused blocks to get written to the
wrong place.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Create a new function, io_channel_alloc_buf() which allocates I/O
buffers with appropriate alignment if we are using direct I/O. The
original code was sometimes using a larger alignment factor than
necessary, and would always request an aligned memory buffer even when
it was not necessary since the block device was not opened with
O_DIRECT.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Create a new function, ext2fs_get_dio_alignment(), which returns the
alignment requirements for direct I/O. This way we can factor out the
code from MMP and the Unix I/O manager. The two modules weren't
consistently calculating the alignment factors, and in particular MMP
would sometimes use a larger alignment factor than was strictly
necessary.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The align field which indicated the required data alignment of data
buffers was stored in a field specific to the unix_io manager. Move
it to the top-level io_channel structure so it can be better
generalized.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
If we have newer kernel headers which define FALLOC_FL_PUNCH_HOLE, but we
are on an older glibc which lacks fallocate, we end up trying to use the
func anyways. Check the ifdef that autoconf already set up for us.
Reported-by: Ortwin Glueck <odi@odi.ch>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The DEFS line in MCONFIG had gotten so long that it exceeded 4k, and
this was starting to cause some tools heartburn. It also made "make
V=1" almost useless, since trying to following the individual commands
run by make was lost in the noise of all of the defines.
So fix this by putting the configure-generated defines in lib/config.h
and the directory pathnames to lib/dirpaths.h.
In addition, clean up some vestigal defines in configure.in and in the
Makefiles to further shorten the cc command lines.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
In inode_open(), if the allocation of &io fails, we go to cleanup
and dereference io to test io->name, which is a bug.
Similarly in undo_open() if allocation of &data fails, we
go to cleanup and dereference data to test data->real.
In the test_open() case we explicitly set retval to the only
possible error return from ext2fs_get_mem(), so remove that
for tidiness.
The other changes just make make earlier returns go through
the error goto for consistency.
In many cases we returned directly from the first error, but
"goto cleanup" etc for every subsequent error. In some
cases this leads to "impossible" tests such as:
if (ptr)
ext2fs_free_mem(&ptr)
on paths where ptr cannot be null because we would have
returned directly earlier, and Coverity flags this.
This isn't really indicative of an error in most cases, but
I think it can be clearer to always exit through the error goto
if it's used later in the function.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If e2fsprogs tools (mke2fs, e2fsck) is run on regular file instead of
on block device, we can use punch hole instead of regular discard
command which would not work on regular file anyway. This gives us
several advantages. First of all when e2fsck is run with '-E discard'
parameter it will punch out all ununsed space from the image, hence
trimming down the file system image. And secondly, when creating an
file system on regular file (with '-E discard' which is default), we
can use punch hole to clear the file content, hence we can skip inode
table initialization, because reads from sparse area returns zeros. This
will result in faster file system creation (without the need to specify
lazy_itable_init) and smaller images.
This commit also fixes some tests that would fail due to mke2fs showing
discard progress, hence the output would differ.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
In many places we are using #ifdef HAVE_OPEN64 to determine if we can
use open64() but that's ugly. This commit creates two new helpers
ext2fs_open_file() for open() and ext2fs_stat() for stat(). Also we need
new typedef ext2fs_struct_stat for struct stat.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
O_DIRECT is not defined on OSX. Since direct IO is only a new
optimization and not needed for correct functionality, disable
it if O_DIRECT is unavailable.
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Fix several types of compiler warnings (unused variables/labels),
uninitialized variables, etc that are hit with gcc -Wall.
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
When the device have discard support and simultaneously discard zeroes
data (and it is properly advertised), then we can take advantage of such
behavior in several e2fsprogs tools.
Add new flag CHANNEL_FLAGS_DISCARD_ZEROES for struct_io_channel so
each io_manager can take advantage of this. The flag is properly set
according to BLKDISCARDZEROES ioctl in unix_open.
Also remove old mke2fs_discard_zeroes_data() function and substitute it
with helper which test this flag.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
In order to provide generic "discard" function for all e2fsprogs tools
add a discard function prototype into struct_io_manager. Specific
function for specific io managers can be crated that way.
This commit also creates unix_discard function which uses BLKDISCARD
ioctl to discard data blocks on the block device and bind it into
unit_io_manager structure to be available for all e2fsprogs tools.
Note that BLKDISCARD is still Linux specific ioctl, however other
unix systems may provide similar functionality. So far the
unix_discard() remains linux specific hence is embedded in #ifdef
__linux__ macro.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This adds the basic support for Direct I/O to unix_io.c, and adds a
new flag EXT_FLAG_DIRECT_IO which can be passed to ext2fs_open() or
ext2fs_open2() to request Direct I/O support.
Note that device mapper devices in Linux don't support Direct I/O, and
in some circumstances using Direct I/O can actually make performance
*worse*!
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The top-level COPYING file states that the e2p and ext2fs libraries
are available under the LGPLv2. The files were incorrectly labelled.
Alex Thomas/Luster has been consulted wrt to the ext3_extents.h file;
the rest of the files were primarily authored by Theodore Ts'o.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
When we open a device on linux, test whether it is writable
right away, rather than trying to proceed and clean up when
writes start failing.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This patch instruments the libext2fs unix I/O manager and adds bytes
read/written and data rate to e2fsck -tt pass/overall timing output.
Signed-off-by: Jim Garlick <garlick@llnl.gov>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
There were still some %d's lurking when we print blocks & inodes; also
many of the counters in the e2fsck_struct were signed, and probably
need to be unsigned to avoid overflows.
Signed-off-by: Eric Sandeen <esandeen@redhat.com>
Add a new io_channel open flag, IO_FLAG_EXCLUSIVE,which requests that
the device be opened in exclusive (O_EXCL) mode. Add support to the unix_io
implementation for this flag.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
example, /tmp/test.img?offset=1024. Multiple options can separated using
the & character, although at the moment the only option implemented is
the offset option in the unix_io layer.
unix_write_blk): Optimize routines so that we don't end up
searching the cache twice when a block isn't in the
cache. If reads are larger than READ_DIRECT_SIZE, don't
let them go through the cache.
blocks to be erroneously marked as dirty, so they would
get written back to the disk before they are evicted from
the cache. Harmless, but it slows down e2fsck
significantly.
the kernel version is 2.4.10 -- 2.4.17, since otherwise an
old version of glibc (built against 2.2 headers) will
interact badly with the workaround to actually cause more
problems. I hate it when the glibc folks think they're
being smarter than the kernel....