e2fsprogs

Commit Graph

Author	SHA1	Message	Date
Darrick J. Wong	a5abfe0382	e2fsck: read-ahead metadata during passes 1, 2, and 4 e2fsck pass1 is modified to use the block group data prefetch function to try to fetch the inode tables into the pagecache before it is needed. We iterate through the blockgroups until we have enough inode tables that need reading such that we can issue readahead; then we sit and wait until the last inode table block read of the last group to start fetching the next bunch. pass2 is modified to use the dirblock prefetching function to prefetch the list of directory blocks that are assembled in pass1. We use the "iterate a subset of a dblist" and avoid copying the dblist. Directory blocks are fetched incrementally as we walk through the directory block list. In previous iterations of this patch we would free the directory blocks after processing, but the performance hit to e2fsck itself wasn't worth it. Furthermore, it is anticipated that most users will then mount the FS and start using the directories, so they may as well remain in the page cache. pass4 is modified to prefetch the block and inode bitmaps in anticipation of pass 5, because pass4 is entirely CPU bound. In general, these mechanisms can decrease fsck time by 10-40%, if the host system has sufficient memory and the storage system can provide a lot of IOPs. Pretty much any storage system capable of handling multiple IOs in-flight at any time will see a fairly large performance boost. (Single-issue USB mass storage disks seem to suffer badly.) By default, the readahead buffer size will be set to the size of a block group's inode table (which is 2MiB for a regular ext4 FS). The -E readahead_kb= option can be given to specify the amount of memory to use for readahead or zero to disable it entirely; or an option can be given in e2fsck.conf. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2015-04-21 10:40:21 -04:00
Theodore Ts'o	9a32411732	Merge branch 'maint' into next Conflicts: lib/ext2fs/inode.c	2014-12-25 23:43:10 -05:00
Theodore Ts'o	13f450addb	libext2fs: add sanity check for an invalid itable_used value in inode scan code If the number of unused inodes is greater than the number of inodes in a block group, this can cause an e2fsck -n run of the file system to crash. We should add more checks to e2fsck to detect this case directly, but this will at least protect programs (tune2fs, dump, etc.) which use the inode_scan abstraction from crashing on an invalid file system. Addresses-Debian-Bug: #773795 Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2014-12-25 23:29:19 -05:00
Darrick J. Wong	54f6faf7f2	libext2fs: don't report garbage inodes with really large inodes If the inode size is large enough that there are fewer than two inodes per block, don't report an inode checksum failure as a garbage inode during the scan because the "more than half are broken" criteria that we use to decide if a block of inodes is garbage doesn't really apply. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2014-12-02 22:17:10 -05:00
Darrick J. Wong	18b234b121	libext2fs: byteswap inode when performing the sanity scan On BE platforms, we need to swap the inode bytes after doing the checksum verification but before looking at i_blocks. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2014-08-24 22:00:56 -04:00
Darrick J. Wong	b9f95911e9	libext2fs: don't cache inodes that fail checksum verification If an inode fails checksum verification, don't stuff a copy of it in the inode cache, because this can cause the library to fail to return the "corrupt inode" error code. In general, this happens if ext2fs_read_inode_full() is called twice on an inode with an incorrect checksum. If fs->flags has EXT2_FLAG_IGNORE_CSUM_ERRORS set during the first call and unset during the second call, the cache hit during the second call fails to return EXT2_ET_INODE_CSUM_INVALID as you'd expect. This happens during fsck because the first read_inode call happens as part of check_blocks and the second call happens during inode checksum revalidation. A file system with a slightly corrupt non-extent inode will trigger this. While we're at it, make the inode read function consistent with the rest of libext2fs -- copy the metadata object into the caller's buffer even if it fails checksum verification. This will help e2fsck avoid a double re-read later on down the line. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2014-08-02 22:49:23 -04:00
Darrick J. Wong	2e9d839156	e2fsck: correctly preserve fs flags when modifying ignore-csum-error flag When we need to modify the "ignore checksum error" behavior flag to get us past a library call, it's possible that the library call can result in other flag bits being changed. Therefore, it is not correct to restore unconditionally the previous flags value, since this will have unintended side effects on the other fs->flags; nor is it correct to assume that we can unconditionally set (or clear) the "ignore csum error" flag bit. Therefore, we must merge the previous value of the "ignore csum error" flag with the value of flags after the call. Note that we want to leave checksum verification on as much as possible because doing so exposes e2fsck bugs where two metadata blocks are "sharing" the same disk block, and attempting to fix one before relocating the other causes major filesystem damage. The damage is much more obvious when a previously checked piece of metadata suddenly fails in a subsequent pass. The modifications to the pass 2, 3, and 3A code are justified as follows: When e2fsck encounters a block of directory entries and cannot find the placeholder entry at the end that contains the checksum, it will try to insert the placeholder. If that fails, it will schedule the directory for a pass 3A reconstruction. Until that happens, we don't want directory block writing (pass 2), block iteration (pass 3), or block reading (pass 3A) to fail due to checksum errors, because failing to find the placeholder is itself a checksum verification error, which causes e2fsck to abort without fixing anything. The e2fsck call to ext2fs_read_bitmaps must never fail due to a checksum error because e2fsck subsequently (a) verifies the bitmaps itself; or (b) decides that they don't match what has been observed, and rewrites them. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2014-08-02 22:48:21 -04:00
Darrick J. Wong	68d70624e3	e2fsck: offer to clear inode table blocks that are insane Add a new behavior flag to the inode scan functions; when specified, this flag will do some simple sanity checking of entire inode table blocks. If all the checksums are ok, we can skip checksum verification on individual inodes later on. If more than half of the inodes look "insane" (bad extent tree root or checksum failure) then ext2fs_get_next_inode_full() can return a special status code indicating that what's in the buffer is probably garbage. When e2fsck' inode scan encounters the 'inode is garbage' return code it'll offer to zap the inode straightaway instead of trying to recover anything. This replaces the previous behavior of asking to zap anything with a checksum error (strict_csum). Signed-off-by: Darrick J. Wong <darrick.wong@orale.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2014-08-02 22:46:16 -04:00
Zheng Liu	be31a8de5a	libext2fs: export inode cache creation function Currently we have already exported inode cache flush and free functions for users. This commit exports inode cache creation function. Later we will use this function to initialize inode cache and do some unit tests for inline data. Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2014-03-04 08:46:15 -05:00
Theodore Ts'o	e337e7fad8	Merge branch 'maint' into next Conflicts: e2fsck/problem.c e2fsck/rehash.c e2fsck/super.c	2013-10-12 22:26:28 -04:00
Darrick J. Wong	4dbfd79d14	e2fsprogs: fix blk_t <- blk64_t assignment mismatches Fix all the places where we should be using a blk64_t instead of a blk_t. These fixes are more severe because 64bit values could be truncated silently. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-10-07 09:51:48 -04:00
Theodore Ts'o	07bcd90f3d	Merge branch 'maint' into next	2013-04-22 00:07:08 -04:00
Theodore Ts'o	572ef60b89	libext2fs: only use override function when reading a 128 byte inode The ext2fs_read_inode_full() function should not use fs->read_inode() if the caller has requested more than the base 128 byte inode structure and the inode size is greater than 128 bytes. Otherwise the caller won't get all of the bytes that they were asking for, since there's no way for the fs->read_inode override function to know the size of the buffer passed to ext2fs_read_inode_full(). Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-04-21 23:53:26 -04:00
Andreas Dilger	1b8c4c1b45	build: quiet build warnings for "gcc -Wall" Signed-off-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-01-27 22:31:03 -05:00
Theodore Ts'o	603e5ebc8b	libext2fs: allocate separate memory regions for each inode in the cache The changes to support metadata checksum allocated a single large array for all of the inodes in the inode cache. This is slightly more efficient, but given that the inode cache is small (only 4 inodes) it doesn't really have that much benefit. The problem with doing things this way is that the memory overruns, such as the one fixed in commit `43c4910371`, do not get detected by valgrind. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2012-11-29 20:40:21 -05:00
Eric Whitney	43c4910371	libext2fs: fix inode cache overruns An inode cache slot will be overrun if a caller to ext2fs_read_inode_full() or ext2fs_write_inode_full() attempts to read or write a full sized 156 byte inode when the target filesystem contains 128 byte inodes. Limit the copied inode to the smaller of the target filesystem's or the caller's requested inode size. Signed-off-by: Eric Whitney <enwlinux@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2012-11-29 19:59:41 -05:00
Darrick J. Wong	5b58dc2304	libext2fs: block group checksum should use metadata_csum algorithm Change the block group algorithm to use the same algorithm as the rest of the metadata_csum. This mostly involves providing a helper function to tell if group descriptors should have checksums set or verified, and modifying the gdt checksum code to use the correct algorithm. Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2012-08-02 20:47:45 -04:00
Darrick J. Wong	37d82b6a95	libext2fs: add inode checksum support This patch adds the ability for the libext2fs functions to read and write the inode checksum. Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2012-07-30 18:46:04 -04:00
Darrick J. Wong	91db7e206d	libext2fs: read and write full size inodes Change libext2fs to read and write full-size inodes in preparation for the metadata checksumming patchset, which will require this. Due to ABI compatibility requirements, this change must be hidden from client programs. Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2012-07-30 17:42:15 -04:00
Theodore Ts'o	fd1c5a0622	libext2fs: factor out I/O buffer allocation Create a new function, io_channel_alloc_buf() which allocates I/O buffers with appropriate alignment if we are using direct I/O. The original code was sometimes using a larger alignment factor than necessary, and would always request an aligned memory buffer even when it was not necessary since the block device was not opened with O_DIRECT. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2012-05-07 14:41:49 -04:00
Theodore Ts'o	d1154eb460	Shorten compile commands run by the build system The DEFS line in MCONFIG had gotten so long that it exceeded 4k, and this was starting to cause some tools heartburn. It also made "make V=1" almost useless, since trying to follow the individual commands run by make was lost in the noise of all of the defines. So fix this by putting the configure-generated defines in lib/config.h and the directory pathnames to lib/dirpaths.h. In addition, clean up some vestigial defines in configure.in and in the Makefiles to further shorten the cc command lines. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-09-18 17:34:37 -04:00
Eric Sandeen	624e8ebe30	e2fsprogs: Fix some error cleanup path bugs In inode_open(), if the allocation of &io fails, we go to cleanup and dereference io to test io->name, which is a bug. Similarly in undo_open() if allocation of &data fails, we go to cleanup and dereference data to test data->real. In the test_open() case we explicitly set retval to the only possible error return from ext2fs_get_mem(), so remove that for tidiness. The other changes just make earlier returns go through the error goto for consistency. In many cases we returned directly from the first error, but "goto cleanup" etc for every subsequent error. In some cases this leads to "impossible" tests such as: if (ptr) ext2fs_free_mem(&ptr) on paths where ptr cannot be null because we would have returned directly earlier, and Coverity flags this. This isn't really indicative of an error in most cases, but I think it can be clearer to always exit through the error goto if it's used later in the function. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2011-09-16 18:43:05 -04:00
Theodore Ts'o	41e102a4d1	libext2fs: fix 64-bit support in ext2fs_{read,write}_inode_full() This fixes a problem where reading or writing inodes located after the 4GB boundary would fail. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-07-05 20:02:27 -04:00
Theodore Ts'o	9d92a201de	Merge branch 'maint' into next Conflicts: configure configure.in lib/ext2fs/ext2fs.h misc/mke2fs.c	2010-09-24 22:40:21 -04:00
Theodore Ts'o	00f0b14118	ext2fs: Optimize for Direct I/O Allocate various memory structures to be properly aligned to avoid needing to use a bounce buffer when doing direct I/O read/writes. This should also help on FreeBSD systems which require aligned buffers unconditionally. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2010-09-24 10:06:45 -04:00
Theodore Ts'o	97d26ce9e3	Merge branch 'maint' into next Conflicts: e2fsck/journal.c e2fsck/pass1.c e2fsck/pass2.c misc/mke2fs.c	2010-06-07 12:42:40 -04:00
Theodore Ts'o	543547a52a	libe2p, libext2fs: Update file copyright permission states to match COPYING The top-level COPYING file states that the e2p and ext2fs libraries are available under the LGPLv2. The files were incorrectly labelled. Alex Thomas/Luster has been consulted wrt to the ext3_extents.h file; the rest of the files were primarily authored by Theodore Ts'o. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2010-05-17 23:04:39 -04:00
Valerie Aurora Henson	d7cca6b06f	Convert to use block group accessor functions Convert direct accesses to use the following block group accessor functions: ext2fs_block_bitmap_loc(), ext2fs_inode_bitmap_loc(), ext2fs_inode_table_loc(), ext2fs_bg_itable_unused(), ext2fs_block_bitmap_loc_set(), ext2fs_inode_bitmap_loc_set(), ext2fs_inode_table_loc_set(), ext2fs_bg_free_inodes_count(), ext2fs_ext2fs_bg_used_dirs_count(), ext2fs_bg_free_inodes_count_set(), ext2fs_bg_free_blocks_count_set(), ext2fs_bg_used_dirs_count_set() Signed-off-by: Valerie Aurora Henson <vaurora@redhat.com> Signed-off-by: Nick Dokos <nicholas.dokos@hp.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2009-10-25 21:43:47 -04:00
Theodore Ts'o	cd65a24e75	libext2fs: Convert ext2fs_bg_flag_test() to ext2fs_bg_flags_test() After cleaning up ext2fs_bg_flag_set() and ext2fs_bg_flag_clear(), we're left with ext2fs_bg_flag_test(). Convert it to ext2fs_bg_flags_test(). Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2009-10-25 21:42:12 -04:00
Theodore Ts'o	732c8cd58f	Use accessor functions fields for bg_flags in the block group descriptors Signed-off-by: Valerie Aurora Henson <vaurora@redhat.com> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2009-09-07 21:15:12 -04:00
Valerie Aurora Henson	24a117abd0	Convert to use io_channel_read_blk64() and io_channel_write_blk64() Signed-off-by: Valerie Aurora Henson <vaurora@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2009-09-07 21:14:24 -04:00
Theodore Ts'o	a7843581f5	ext2fs_read_inode_full: Add safety check to avoid SEGV's on corrupted fs's Thanks to Thiemo Nagel for suggesting this. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2009-01-19 23:09:37 -05:00
Theodore Ts'o	03fa6f8ae2	Fix various signed/unsigned gcc warnings Some of these could affect filesystems between 2^31 and 2^32-1 blocks. Thanks to Valerie Aurora Henson for pointing out the problems in lib/ext2fs/alloc_tables.c, which led me to do a "make gcc-wall" scan over the source tree. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-11-16 10:06:59 -05:00
Theodore Ts'o	efc6f628e1	Remove trailing whitespace for the entire source tree Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-08-27 23:07:54 -04:00
Theodore Ts'o	3bcc6276a0	libext2fs: Initialize unset inode timestamps when writing a new inode As Li Zefan <lizf@cn.fujitsu.com> reported, the creation timestamp was not getting set on the lost+found inode. This patch makes sure all of the timestamps are appropriately set. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-08-14 14:44:15 -04:00
Manish Katiyar	8eb3b8a0a0	ext2fs_read_inode: Check the validity of the inode number earlier It looks like the right place to check for ino=0 in ext2fs_read_inode_full() is before creating the inode cache, otherwise since we set icache[i].ino = 0 in create_icache(), it will match the loop below and thus we return a wrong value. Signed-off-by: "Manish Katiyar" <mkatiyar@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-07-15 11:10:19 -04:00
Theodore Ts'o	d11736c6dd	ext2fs_open_inode_scan: Handle a non-zero bg_itable_used in block group 0 Previously, the portion of the inode table for block group 0 was always completely zero'ed out, so the ext2fs_open_inode_scan() didn't handle a non-zero bg_itable_used value for the first block group. Fix this. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-04-22 23:22:17 -04:00
Theodore Ts'o	16b851cdae	Remove LAZY_BG feature This simplifies the code, and using the uninit_bg with the inode table lazily initialized is just as good. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-04-20 23:33:34 -04:00
Andreas Dilger	6f19f44a4c	libext2fs: Micro-optimization in inode scan code Signed-off-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-03-31 14:28:37 -04:00
Jose R. Santos	d4f34d41be	Add uninit block group support to various libext2fs functions Signed-off-by: Jose R. Santos <jrs@us.ibm.com> Signed-off-by: Andreas Dilger <adilger@clusterfs.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-03-20 15:33:12 -04:00
Theodore Ts'o	e6a4571eec	Merge branch 'maint' into next Conflicts: lib/ext2fs/closefs.c	2007-12-09 17:03:01 -05:00
Theodore Ts'o	ee01079a17	libext2fs: Add checks to prevent integer overflows passed to malloc() This addresses a potential security vulnerability where an untrusted filesystem can be corrupted in such a way that a program using libext2fs will allocate a buffer which is far too small. This can lead to either a crash or potentially a heap-based buffer overflow crash. No known exploits exist, but main concern is where an untrusted user who possesses privileged access in a guest Xen environment could corrupt a filesystem which is then accessed by the pygrub program, running as root in the dom0 host environment, thus allowing the untrusted user to gain privileged access in the host OS. Thanks to the McAfee AVERT Research group for reporting this issue. Addresses CVE-2007-5497. Signed-off-by: Rafal Wojtczuk <rafal_wojtczuk@mcafee.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2007-12-05 21:01:35 -05:00
Theodore Ts'o	126a291c76	Clean up libext2fs by byte swapping iff WORDS_BIGENDIAN We don't need byte swapping to be a run-time option; it can just be a compile-time option instead. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2007-08-11 01:59:13 -04:00
Kalpak Shah	1ed49d2c2a	Fix byte swapping bug in get_next_inode_full() On big-endian systems, while swapping, ext2fs_swap_inode_full() swaps only 128+extra_isize bytes and the EAs if they are present. Now if inode N has EAs (and this is the inode in the "scratch inode"), then inode N+1 also seems to have them since the "scratch inode" was never zeroed. Signed-off-by: Kalpak Shah <kalpak@clusterfs.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2007-06-29 21:40:19 -04:00
Kalpak Shah	915a2669ef	Fix ext2fs_read_inode_full() so that the whole inode is byte-swapped Signed-off-by: Kalpak Shah <kalpak@clusterfs.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2007-06-22 22:32:43 -04:00
Jim Garlick	cc37e0d3ae	Fix memory leak in ext2fs_write_new_inode() The following patch addresses a memory leak in libext2fs that occurs when using ext2fs_write_new_inode() on a file system configured with large inodes. Signed-off-by: Jim Garlick <garlick@llnl.gov> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2007-04-06 08:50:15 -04:00
Brian Behlendorf	e649be9daa	[COVERITY] Fix (error case) memory leak in libext2fs (ext2fs_write_inode_full) Need to free w_inode on early exit if w_inode != &temp_inode. Coverity ID: 22: Resource Leak Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2007-03-21 17:38:47 -04:00
Theodore Ts'o	f5fa20078b	Add support for EXT2_FEATURE_COMPAT_LAZY_BG This feature is initially intended for testing purposes; it allows an ext2/ext3 developer to create very large filesystems using sparse files where most of the block groups are not initialized and so do not require much disk space. Eventually it could be used as a way of speeding up mke2fs and e2fsck for large filesystem, but that would be best done by adding an RO_COMPAT extension to the filesystem to allow the inode table to be lazily initialized on a per-block basis, instead of being entirely initialized or entirely unused on a per-blockgroup basis. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2006-05-08 20:17:26 -04:00
Theodore Ts'o	b1ae119729	Add missing return values in error return cases in the ext2fs library. (Otherwise we return garbage instead of the error code.)	2005-04-09 01:21:21 -04:00
Theodore Ts'o	e27b45639a	Fix mke2fs so that it writes the root directory using ext2fs_write_new_inode(), and fix ext2fs_write_new_inode() so that it initializes i_extra_isize properly.	2005-03-21 01:02:53 -05:00
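
The flag-preservation rule described in commit 2e9d839156 (restore only the "ignore csum error" bit, keep every other bit as the library call left it) can be sketched in a few lines of C. This is a minimal illustration and not the e2fsck code itself: IGNORE_CSUM_FLAG and merge_ignore_csum_flag() are hypothetical names, and the bit value is arbitrary.

```c
#include <stdint.h>

/* Hypothetical bit standing in for the "ignore checksum errors" flag. */
#define IGNORE_CSUM_FLAG 0x0004u

/*
 * Merge two flag words:
 *   - the "ignore csum" bit comes from flags_before (the caller's setting),
 *   - every other bit comes from flags_after (what the library call left).
 * Restoring flags_before wholesale would discard changes the call made
 * to unrelated bits, which is the side effect the commit message warns about.
 */
static uint32_t merge_ignore_csum_flag(uint32_t flags_before,
                                       uint32_t flags_after)
{
    return (flags_before & IGNORE_CSUM_FLAG) |
           (flags_after & ~IGNORE_CSUM_FLAG);
}
```

A caller would save the flags word, set the bit, make the library call, and then assign the result of merge_ignore_csum_flag(saved, current) back to the flags word.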


84 Commits (9603da15cb95c4635de6eac6f2ec7dd36a6972f3)
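
The fix described in commit 43c4910371 (limit the copied inode to the smaller of the filesystem's inode size and the size the caller requested) amounts to clamping a copy length. The sketch below shows that idea in isolation; copy_inode_clamped() is a hypothetical helper written for this illustration and is not a libext2fs function.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy an inode image from src to dst without overrunning either side.
 * bufsize is the size of the caller's buffer; fs_inode_size is the
 * on-disk inode size of the filesystem. Only the smaller of the two is
 * copied, so a large request against a 128-byte-inode filesystem cannot
 * read or write past the end of a 128-byte slot.
 * Returns the number of bytes actually copied.
 */
static size_t copy_inode_clamped(void *dst, size_t bufsize,
                                 const void *src, size_t fs_inode_size)
{
    size_t n = bufsize < fs_inode_size ? bufsize : fs_inode_size;

    memcpy(dst, src, n);
    return n;
}
```

Bytes in the caller's buffer beyond the returned length are left untouched, so a caller that needs them defined should zero the buffer first.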