Commit Graph

248 Commits (06247428be8037b3739280f82cb29efe8397695f)

Author SHA1 Message Date
Peter Lieven aa7bfbfff7 block: add flags to bdrv_*_write_zeroes
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-28 10:30:51 +01:00
Max Reitz ba2ab2f2ca qcow2: Flush image after creation
Opening the qcow2 image with BDRV_O_NO_FLUSH prevents any flushes during
the image creation. This means that the image has not yet been flushed
to disk when qemu-img create exits. This flush is delayed until the next
operation on the image involving opening it without BDRV_O_NO_FLUSH and
closing (or directly flushing) it. For large images and/or images with a
small cluster size and preallocated metadata, this flush may take a
significant amount of time and may occur unexpectedly.

Reopening the image without BDRV_O_NO_FLUSH right before the end of
qcow2_create2() results in hoisting the potentially costly flush into
the image creation, which is expected to take some time (whereas
successive image operations may be not).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-28 17:34:32 +01:00
Max Reitz 6e13610aa4 qcow2: Unset zero_beyond_eof in save_vmstate
Saving the VM state is done using bdrv_pwrite. This function may perform
a read-modify-write, which in this case results in data being read from
beyond the end of the virtual disk. Since we are actually trying to
access an area which is not a part of the virtual disk, zero_beyond_eof
has to be set to false before performing the partial write, otherwise
the VM state may become corrupted.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-24 11:50:51 +02:00
Max Reitz eedff66f21 qcow2: Restore total_sectors value in save_vmstate
Since df2a6f29a5, bdrv_co_do_writev increases the total_sectors value of
a growable block devices on writes after the current end. This leads to
the virtual disk apparently growing in qcow2_save_vmstate, which in turn
affects the disk size captured by the internal snapshot taken directly
afterwards through e.g. the HMP savevm command. Such a "grown" snapshot
cannot be loaded after reopening the qcow2 image, since its disk size
differs from the actual virtual disk size (writing a VM state does not
actually increase the virtual disk size).

Fix this by restoring total_sectors at the end of qcow2_save_vmstate.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-24 11:45:06 +02:00
Max Reitz 1fa5cc839a qcow2: Evaluate overlap check options
Evaluate the runtime overlap check options and set
BDRVQcowState.overlap_check appropriately.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Max Reitz 4092e99d93 qcow2: Array assigning options to OL check bits
Add an array which assigns the option string to its corresponding
overlap check bit.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Max Reitz 05de7e86ca qcow2: Add overlap-check options
Add runtime options to tune the overlap checks to be performed before
write accesses.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Max Reitz 3e3553905c qcow2: Make overlap check mask variable
Replace the QCOW2_OL_DEFAULT macro by a variable overlap_check in
BDRVQcowState.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Max Reitz 231bb26764 qcow2: Use negated overflow check mask
In qcow2_check_metadata_overlap and qcow2_pre_write_overlap_check,
change the parameter signifying the checks to perform from its current
positive form to a negative one, i.e., it will no longer explicitly
specify every check to perform but rather a mask of checks not to
perform.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Max Reitz f9bff97143 qcow2: Remove wrong metadata overlap check
In qcow2_write_compressed, if the compression fails, a normal cluster is
written to disk. This is done through bdrv_write on the qcow2 BDS
itself (using the guest offset), thus it is wrong to do a metadata
overlap check before.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:49:59 +02:00
Max Reitz 9e3f08923a qcow2: Add missing space in error message
The error message in qcow2_downgrade about an unsupported refcount
order is missing a space. This patch adds it.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:49:59 +02:00
Max Reitz 37764dfb71 qcow2: Add support for ImageInfoSpecific
Add a new ImageInfoSpecificQCow2 type as a subtype of ImageInfoSpecific.
This contains the compatibility level as a string and an optional
lazy_refcounts boolean (optional means mandatory for compat >= 1.1 and
not available for compat == 0.10).

Also, add qcow2_get_specific_info, which returns this information.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 14:03:57 +02:00
Jeff Cody c4217f645d block: qcow2 - used QEMU_PACKED for on-disk structures
QCowHeader and QCowExtension are structs that reside in the on-disk
image format, and are read and written directly via bdrv_pread()/write(),
and as such should be packed to avoid any unintentional struct padding.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-25 20:51:13 +02:00
Max Reitz 3ef6c40ad0 qcow2: Use Error parameter
Employ usage of the new Error ** parameter in qcow2_open, qcow2_create
and associated functions.

Signed-off-by: Max Reitz <mreitz@redhat.com>
2013-09-12 10:12:48 +02:00
Max Reitz cc84d90ff5 block: Error parameter for create functions
Add an Error ** parameter to bdrv_create and its associated functions to
allow more specific error messages.

Signed-off-by: Max Reitz <mreitz@redhat.com>
2013-09-12 10:12:48 +02:00
Max Reitz 34b5d2c68e block: Error parameter for open functions
Add an Error ** parameter to bdrv_open, bdrv_file_open and associated
functions to allow more specific error messages.

Signed-off-by: Max Reitz <mreitz@redhat.com>
2013-09-12 10:12:48 +02:00
Max Reitz d5124c00d8 bdrv: Use "Error" for creating images
Add an Error ** parameter to BlockDriver.bdrv_create to allow more
specific error messages.

Signed-off-by: Max Reitz <mreitz@redhat.com>
2013-09-12 10:12:48 +02:00
Max Reitz 015a1036a7 bdrv: Use "Error" for opening images
Add an Error ** parameter to BlockDriver.bdrv_open and
BlockDriver.bdrv_file_open to allow more specific error messages.

Signed-off-by: Max Reitz <mreitz@redhat.com>
2013-09-12 10:12:47 +02:00
Max Reitz 9296b3ed70 qcow2: Implement bdrv_amend_options
Implement bdrv_amend_options for compat, size, backing_file, backing_fmt
and lazy_refcounts.

Downgrading images from compat=1.1 to compat=0.10 is achieved through
handling all incompatible flags accordingly, clearing all compatible and
autoclear flags and expanding all zero clusters.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-12 10:12:46 +02:00
Max Reitz b6481f376b qcow2: Save refcount order in BDRVQcowState
Save the image refcount order in BDRVQcowState. This will be relevant
for future code supporting different refcount orders than four and also
for code that needs to verify a certain refcount order for an opened
image.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-12 10:12:46 +02:00
Kevin Wolf 1ebf561c11 qcow2: Discard VM state in active L1 after creating snapshot
During savevm, the VM state is written to the active L1 of the image and
then a snapshot is taken. After that, the VM state isn't needed any more
in the active L1 and should be discarded. This is implemented by this
patch.

The impact of not discarding the VM state is that a snapshot can never
become smaller than any previous snapshot (because it would be padded
with old VM state), and more importantly that future savevm operations
cause unnecessary COWs (with associated flushes), which makes subsequent
snapshots much slower.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2013-09-12 10:12:46 +02:00
Kevin Wolf 670df5e3b4 qcow2: Pass discard type to qcow2_discard_clusters()
The function will be used internally instead of only being called for
guest discard requests.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2013-09-12 10:12:46 +02:00
Paolo Bonzini 4bc74be997 block: return get_block_status data and flags for formats
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:09 +02:00
Paolo Bonzini b6b8a33354 block: introduce bdrv_get_block_status API
For now, bdrv_get_block_status is just another name for bdrv_is_allocated.
The next patches will add more flags.

This also touches all block drivers with a mostly mechanical rename.  The
sole exception is cow; because it calls cow_co_is_allocated from the read
code, we keep that function and make cow_co_get_block_status a wrapper.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:09 +02:00
Paolo Bonzini d663640c04 block: expect errors from bdrv_co_is_allocated
Some bdrv_is_allocated callers do not expect errors, but the fallback
in qcow2.c might make other callers trip on assertion failures or
infinite loops.

Fix the callers to always look for errors.

Cc: qemu-stable@nongnu.org
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:09 +02:00
Fam Zheng 4f6fd3491c block: make bdrv_delete() static
Manage BlockDriverState lifecycle with refcnt, so bdrv_delete() is no
longer public and should be called by bdrv_unref() if refcnt is
decreased to 0.

This is an identical change because effectively, there's no multiple
reference of BDS now: no caller of bdrv_ref() yet, only bdrv_new() sets
bs->refcnt to 1, so all bdrv_unref() now actually delete the BDS.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:08 +02:00
Max Reitz 24530f3e06 qcow2_check: Mark image consistent
If no corruptions remain after an image repair (and no errors have been
encountered), clear the corrupt flag in qcow2_check.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-02 10:15:15 +02:00
Max Reitz cf93980e77 qcow2: Employ metadata overlap checks
The pre-write overlap check function is now called before most of the
qcow2 writes (aborting it on collision or other error).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-08-30 15:48:43 +02:00
Max Reitz 69c9872653 qcow2: Add corrupt bit
This adds an incompatible bit indicating corruption to qcow2. Any image
with this bit set may not be written to unless for repairing (and
subsequently clearing the bit if the repair has been successful).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-08-30 15:48:43 +02:00
Kevin Wolf 9117b47717 qcow2: Change default for new images to compat=1.1
By the time that qemu 1.7 will be released, enough time will have passed
since qemu 1.1, which is the first version to understand version 3
images, that changing the default shouldn't hurt many people any more
and the benefits of using the new format outweigh the pain.

qemu-iotests already runs with compat=1.1 by default.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-08-30 15:28:51 +02:00
Asias He 0d51b4debe block: Introduce bs->zero_beyond_eof
In 4146b46c42e0989cb5842e04d88ab6ccb1713a48 (block: Produce zeros when
protocols reading beyond end of file), we break qemu-iotests ./check
-qcow2 022. This happens because qcow2 temporarily sets ->growable = 1
for vmstate accesses (which are stored beyond the end of regular image
data).

We introduce the bs->zero_beyond_eof to allow qcow2_load_vmstate() to
disable ->zero_beyond_eof temporarily in addition to enable ->growable.

[Since the broken patch "block: Produce zeros when protocols reading
beyond end of file" has not been merged yet, I have applied this fix
*first* and will then apply the next patch to keep the tree bisectable.
-- Stefan]

Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 14:10:21 +02:00
Kevin Wolf 8ad1898cf1 qcow2: Change default for new images to compat=1.1
By the time that qemu 1.7 will be released, enough time will have passed
since qemu 1.1, which is the first version to understand version 3
images, that changing the default shouldn't hurt many people any more
and the benefits of using the new format outweigh the pain.

qemu-iotests already runs with compat=1.1 by default.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-21 14:41:09 +02:00
Kevin Wolf 64aa99d3e0 qcow2: Use dashes instead of underscores in options
This is what QMP wants to use. The options haven't been enabled in any
release yet, so we're still free to change them.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-07-26 21:59:56 +02:00
Peter Lieven 3ac216270a block: change default of .has_zero_init to 0
.has_zero_init defaults to 1 for all formats and protocols.

this is a dangerous default since this means that all
new added drivers need to manually overwrite it to 0 if
they do not ensure that a device is zero initialized
after bdrv_create().

if a driver needs to explicitly set this value to
1 its easier to verify the correctness in the review process.

during review of the existing drivers it turned out
that ssh and gluster had a wrong default of 1.
both protocols support host_devices as backend
which are not by default zero initialized. this
wrong assumption will lead to possible corruption
if qemu-img convert is used to write to such a backend.

vpc and vmdk also defaulted to 1 altough they support
fixed respectively flat extends. this has to be addresses
in separate patches. both formats as well as the mentioned
ssh and gluster are turned to the default of 0 with this
patch for safety.

a similar problem with the wrong default existed for
iscsi most likely because the driver developer did
oversee the default value of 1.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-06-28 13:52:35 +02:00
Kevin Wolf 0b919fae31 qcow2: Batch discards
This optimises the discard operation for freed clusters by batching
discard requests (both snapshot deletion and bdrv_discard end up
updating the refcounts cluster by cluster).

Note that we don't discard asynchronously, but keep s->lock held. This
is to avoid that a freed cluster is reallocated and written to while the
discard is still in flight.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-06-24 10:25:17 +02:00
Kevin Wolf 67af674e47 qcow2: Options to enable discard for freed clusters
Deleted snapshots are discarded in the image file by default, discard
requests take their default from the -drive discard=... option and other
places that free clusters must always be enabled explicitly.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-06-24 10:25:17 +02:00
Kevin Wolf 6cfcb9b8b9 qcow2: Add refcount update reason to all callers
This adds a refcount update reason to all callers of update_refcounts(),
so that a follow-up patch can use this information to decide whether
clusters that reach a refcount of 0 should be discarded in the image
file.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-06-24 10:25:17 +02:00
Kevin Wolf 2cf7cfa1cd qcow2: Catch some L1 table index overflows
This catches the situation that is described in the bug report at
https://bugs.launchpad.net/qemu/+bug/865518 and goes like this:

    $ qemu-img create -f qcow2 huge.qcow2 $((1024*1024))T
    Formatting 'huge.qcow2', fmt=qcow2 size=1152921504606846976 encryption=off cluster_size=65536 lazy_refcounts=off
    $ qemu-io /tmp/huge.qcow2 -c "write $((1024*1024*1024*1024*1024*1024 - 1024)) 512"
    Segmentation fault

With this patch applied the segfault will be avoided, however the case
will still fail, though gracefully:

    $ qemu-img create -f qcow2 /tmp/huge.qcow2 $((1024*1024))T
    Formatting 'huge.qcow2', fmt=qcow2 size=1152921504606846976 encryption=off cluster_size=65536 lazy_refcounts=off
    qemu-img: The image size is too large for file format 'qcow2'

Note that even long before these overflow checks kick in, you get
insanely high memory usage (up to INT_MAX * sizeof(uint64_t) = 16 GB for
the L1 table), so with somewhat smaller image sizes you'll probably see
qemu aborting for a failed g_malloc().

If you need huge image sizes, you should increase the cluster size to
the maximum of 2 MB in order to get higher limits.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-05-14 16:44:33 +02:00
Stefan Hajnoczi f4d38bef7c qcow2: allow sub-cluster compressed write to last cluster
Compression in qcow2 requires image length to be a multiple of the
cluster size.  Lift this requirement by zero-padding the final cluster
when necessary.  The virtual disk size is still not cluster-aligned, so
the guest cannot access the zero sectors.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-04-22 10:27:58 +02:00
Kevin Wolf 8d3b1a2d0b block: Introduce bdrv_pwritev() for qcow2_save_vmstate
Directly pass the QEMUIOVector on instead of linearising it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-15 08:26:18 +02:00
Kevin Wolf cf8074b382 block: Introduce bdrv_writev_vmstate
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-15 08:26:18 +02:00
Aurelien Jarno 753d9b82c5 aes: move aes.h from include/block to include/qemu
Move aes.h from include/block to include/qemu to show it can be reused
by other subsystems.

Cc: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2013-04-13 13:51:57 +02:00
Kevin Wolf 88c6588c51 qcow2: Allow requests with multiple l2metas
Instead of expecting a single l2meta, have a list of them. This allows
to still have a single I/O request for the guest data, even though
multiple l2meta may be needed in order to describe both a COW overwrite
and a new cluster allocation (typical sequential write case).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:44 +01:00
Kevin Wolf 9ee6439e27 qcow2: Remove bogus unlock of s->lock
The unlock wakes up the next coroutine, but the currently running
coroutine will lock it again before it yields, so this doesn't make a
lot of sense.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:42 +01:00
Kevin Wolf 787e4a8500 block: Add options QDict to bdrv_file_open() prototypes
The new parameter is unused yet.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-03-22 17:51:31 +01:00
Kevin Wolf acdfb480ba qcow2: Fix segfault in qcow2_invalidate_cache
Need to pass an options QDict to qcow2_open() now. This fixes a segfault
on the migration target with qcow2.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-03-19 11:48:36 +01:00
Paolo Bonzini 381b487d54 qcow2: make is_allocated return true for zero clusters
Otherwise, live migration of the top layer will miss zero clusters and
let the backing file show through.  This also matches what is done in qed.

QCOW2_CLUSTER_ZERO clusters are invalid in v2 image files.  Check this
directly in qcow2_get_cluster_offset instead of replicating the test
everywhere.

Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-15 16:07:50 +01:00
Kevin Wolf 74c4510a3c qcow2: Allow lazy refcounts to be enabled on the command line
qcow2 images now accept a boolean lazy_refcounts options. Use it like
this:

  -drive file=test.qcow2,lazy_refcounts=on

If the option is specified on the command line, it overrides the default
specified by the qcow2 header flags that were set when creating the
image.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-15 16:07:49 +01:00
Kevin Wolf de9c0cec6c block: Add options QDict to bdrv_open() prototype
It doesn't do anything yet except storing the options QDict in the
BlockDriverState.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-15 16:07:49 +01:00
Kevin Wolf 1a86938f04 block: Add options QDict to .bdrv_open()
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-15 16:07:49 +01:00
Stefan Weil 15bac0d54f block: Use error code EMEDIUMTYPE for wrong format in some block drivers
This improves error reports for bochs, cow, qcow, qcow2, qed and vmdk
when a file with the wrong format is selected.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25 18:18:35 +01:00
Kevin Wolf 8d2497c355 qcow2: Fix segfault on zero-length write
One of the recent refactoring patches (commit f50f88b9) didn't take care
to initialise l2meta properly, so with zero-length writes, which don't
even enter the write loop, qemu just segfaulted.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-01-15 09:08:55 +01:00
Paolo Bonzini 1de7afc984 misc: move include files to include/qemu/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19 08:32:39 +01:00
Paolo Bonzini 737e150e89 block: move include files to include/block/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19 08:31:31 +01:00
Paolo Bonzini 7b1b5d1913 qapi: move include files to include/qobject/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19 08:31:31 +01:00
Kevin Wolf 4e95314e2b qcow2: Execute run_dependent_requests() without lock
There's no reason for run_dependent_requests() to hold s->lock, and a
later patch will require that in fact the lock is not held.

Also, before this patch, run_dependent_requests() not only does what its
name suggests, but also removes the l2meta from the list of in-flight
requests. When changing this, it becomes an one-liner, so just inline it
completely.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-13 15:37:59 +01:00
Kevin Wolf 280d373579 qcow2: Enable dirty flag in qcow2_alloc_cluster_link_l2
This is closer to where the dirty flag is really needed, and it avoids
having checks for special cases related to cluster allocation directly
in the writev loop.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-13 15:37:59 +01:00
Kevin Wolf f50f88b9fe qcow2: Allocate l2meta only for cluster allocations
Even for writes to already allocated clusters, an l2meta is allocated,
though it stays effectively unused. After this patch, only allocating
requests still have one. Each l2meta now describes an in-flight request
that writes to clusters that are not yet hooked up in the L2 table.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-13 15:37:59 +01:00
Kevin Wolf 060bee8943 qcow2: Drop l2meta.cluster_offset
There's no real reason to have an l2meta for normal requests that don't
allocate anything. Before we can get rid of it, we must return the host
cluster offset in a different way.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-13 15:37:59 +01:00
Kevin Wolf cf5c1a231e qcow2: Allocate l2meta dynamically
As soon as delayed COW is introduced, the l2meta struct is needed even
after completion of the request, so it can't live on the stack.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-13 15:37:59 +01:00
Kevin Wolf 67a7a0ebe5 qcow2: Move BLKDBG_EVENT out of the lock
We want to use these events to suspend requests for testing concurrent
AIO requests. Suspending requests while they are holding the CoMutex is
rather boring for this purpose.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-12 12:33:48 +01:00
Jim Meyering 00ea188125 qcow2: mark this file's sole strncpy use as justified
Acked-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-10-05 07:58:38 -05:00
Jeff Cody 21d82ac95f block: qcow2 image file reopen
These are the stubs for the file reopen drivers for the qcow2 format.

There is currently nothing that needs to be done by the qcow2 driver
in reopen.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-09-24 15:15:12 +02:00
Stefan Hajnoczi 058f8f16db block: add BLOCK_O_CHECK for qemu-img check
Image formats with a dirty bit, like qed and qcow2, repair dirty image
files upon open with BDRV_O_RDWR.  Performing automatic repair when
qemu-img check runs is not ideal because the bdrv_open() call repairs
the image before the actual bdrv_check() call from qemu-img.c.

Fix this "double repair" since it leads to confusing output from
qemu-img check.  Tell the block driver that this image is being opened
just for bdrv_check().  This skips automatic repair and qemu-img.c can
invoke it manually with bdrv_check().

Update the golden output for qemu-iotests 039 to reflect the new
qemu-img check output.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-08-10 10:25:12 +02:00
Stefan Hajnoczi acbe59829e qcow2: mark image clean after repair succeeds
The dirty bit is cleared after image repair succeeds in qcow2_open().
Move this into qcow2_check() so that all callers benefit from this
behavior when fix mode is enabled.

This is necessary so qemu-img check can call .bdrv_check() and mark the
image clean.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-08-10 10:25:12 +02:00
Stefan Hajnoczi bfe8043e92 qcow2: implement lazy refcounts
Lazy refcounts is a performance optimization for qcow2 that postpones
refcount metadata updates and instead marks the image dirty.  In the
case of crash or power failure the image will be left in a dirty state
and repaired next time it is opened.

Reducing metadata I/O is important for cache=writethrough and
cache=directsync because these modes guarantee that data is on disk
after each write (hence we cannot take advantage of caching updates in
RAM).  Refcount metadata is not needed for guest->file block address
translation and therefore does not need to be on-disk at the time of
write completion - this is the motivation behind the lazy refcount
optimization.

The lazy refcount optimization must be enabled at image creation time:

  qemu-img create -f qcow2 -o compat=1.1,lazy_refcounts=on a.qcow2 10G
  qemu-system-x86_64 -drive if=virtio,file=a.qcow2,cache=writethrough

Update qemu-iotests 031 and 036 since the extension header size changes
when we add feature bit table entries.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-08-06 22:39:14 +02:00
Stefan Hajnoczi c61d0004bc qcow2: introduce dirty bit
This patch adds an incompatible feature bit to mark images that have not
been closed cleanly.  When a dirty image file is opened a consistency
check and repair is performed.

Update qemu-iotests 031 and 036 since the extension header size changes
when we add feature bit table entries.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-08-06 22:39:14 +02:00
Anthony Liguori 23797df3d9 Merge remote-tracking branch 'mjt/mjt-iov2' into staging
* mjt/mjt-iov2:
  rewrite iov_send_recv() and move it to iov.c
  cleanup qemu_co_sendv(), qemu_co_recvv() and friends
  export iov_send_recv() and use it in iov_send() and iov_recv()
  rename qemu_sendv to iov_send, change proto and move declarations to iov.h
  change qemu_iovec_to_buf() to match other to,from_buf functions
  consolidate qemu_iovec_copy() and qemu_iovec_concat() and make them consistent
  allow qemu_iovec_from_buffer() to specify offset from which to start copying
  consolidate qemu_iovec_memset{,_skip}() into single function and use existing iov_memset()
  rewrite iov_* functions
  change iov_* function prototypes to be more appropriate
  virtio-serial-bus: use correct lengths in control_out() message

Conflicts:
	tests/Makefile

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-07-09 12:35:06 -05:00
Stefan Hajnoczi b35278f754 qcow2: fix #ifdef'd qcow2_check_refcounts() callers
The DEBUG_ALLOC qcow2.h macro enables additional consistency checks
throughout the code.  This makes it easier to spot corruptions that are
introduced during development.  Since consistency check is an expensive
operation the DEBUG_ALLOC macro is used to compile checks out in normal
builds and qcow2_check_refcounts() calls missed the addition of a new
function argument.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-07-09 15:53:01 +02:00
Stefan Hajnoczi af7b708db2 qcow2: fix autoclear image header update
The autoclear feature bits can be used for qcow2 file format features
that are safe to "drop" by old programs that do not understand the
feature.  Upon opening the image file unknown autoclear feature bits are
cleared and the image file header is rewritten, but this was happening
too early in the code when critical header fields were not yet loaded.

Process autoclear feature bits after all necessary header information
has been loaded.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-06-15 14:03:43 +02:00
Paolo Bonzini 6af4e9ead4 qcow2: always operate caches in writeback mode
Writethrough does not need special-casing anymore in the qcow2 caches.
The block layer adds flushes after every guest-initiated data write,
and these will also flush the qcow2 caches to the OS.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-06-15 14:03:43 +02:00
Kevin Wolf 166acf546f qcow2: Support for fixing refcount inconsistencies
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-06-15 14:03:42 +02:00
Kevin Wolf 4534ff5426 qemu-img check -r for repairing images
The QED block driver already provides the functionality to not only
detect inconsistencies in images, but also fix them. However, this
functionality cannot be manually invoked with qemu-img, but the
check happens only automatically during bdrv_open().

This adds a -r switch to qemu-img check that allows manual invocation
of an image repair.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-06-15 14:03:42 +02:00
Michael Tokarev d5e6b1619c change qemu_iovec_to_buf() to match other to,from_buf functions
It now allows specifying offset within qiov to start from and
amount of bytes to copy.  Actual implementation is just a call
to iov_to_buf().

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2012-06-11 23:12:11 +04:00
Michael Tokarev 1b093c480a consolidate qemu_iovec_copy() and qemu_iovec_concat() and make them consistent
qemu_iovec_concat() is currently a wrapper for
qemu_iovec_copy(), use the former (with extra
"0" arg) in a few places where it is used.

Change skip argument of qemu_iovec_copy() from
uint64_t to size_t, since size of qiov itself
is size_t, so there's no way to skip larger
sizes.  Rename it to soffset, to make it clear
that the offset is applied to src.

Also change the only usage of uint64_t in
hw/9pfs/virtio-9p.c, in v9fs_init_qiov_from_pdu() -
all callers of it actually uses size_t too,
not uint64_t.

One added restriction: as for all other iovec-related
functions, soffset must point inside src.

Order of argumens is already good:
 qemu_iovec_memset(QEMUIOVector *qiov, size_t offset,
                   int c, size_t bytes)
vs:
 qemu_iovec_concat(QEMUIOVector *dst,
                   QEMUIOVector *src,
                   size_t soffset, size_t sbytes)
(note soffset is after _src_ not dst, since it applies to src;
for memset it applies to qiov).

Note that in many places where this function is used,
the previous call is qemu_iovec_reset(), which means
many callers actually want copy (replacing dst content),
not concat.  So we may want to add a wrapper like
qemu_iovec_copy() with the same arguments but which
calls qemu_iovec_reset() before _concat().

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2012-06-11 23:12:11 +04:00
Michael Tokarev 03396148bc allow qemu_iovec_from_buffer() to specify offset from which to start copying
Similar to
 qemu_iovec_memset(QEMUIOVector *qiov, size_t offset,
                   int c, size_t bytes);
the new prototype is:
 qemu_iovec_from_buf(QEMUIOVector *qiov, size_t offset,
                     const void *buf, size_t bytes);

The processing starts at offset bytes within qiov.

This way, we may copy a bounce buffer directly to
a middle of qiov.

This is exactly the same function as iov_from_buf() from
iov.c, so use the existing implementation and rename it
to qemu_iovec_from_buf() to be shorter and to match the
utility function.

As with utility implementation, we now assert that the
offset is inside actual iovec.  Nothing changed for
current callers, because `offset' parameter is new.

While at it, stop using "bounce-qiov" in block/qcow2.c
and copy decrypted data directly from cluster_data
instead of recreating a temp qiov for doing that.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2012-06-11 23:12:11 +04:00
Michael Tokarev 3d9b49254f consolidate qemu_iovec_memset{,_skip}() into single function and use existing iov_memset()
This patch combines two functions into one, and replaces
the implementation with already existing iov_memset() from
iov.c.

The new prototype of qemu_iovec_memset():
  size_t qemu_iovec_memset(qiov, size_t offset, int fillc, size_t bytes)
It is different from former qemu_iovec_memset_skip(), and
I want to make other functions to be consistent with it
too: first how much to skip, second what, and 3rd how many
of it.  It also returns actual number of bytes filled in,
which may be less than the requested `bytes' if qiov is
smaller than offset+bytes, in the same way iov_memset()
does.

While at it, use utility function iov_memset() from
iov.h in posix-aio-compat.c, where qiov was used.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2012-06-11 23:07:44 +04:00
Jim Meyering b6c147622d qcow2: don't leak buffer for unexpected qcow_version in header
Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-25 18:12:54 +02:00
Kevin Wolf c44bfe4637 qcow2: Don't ignore failure to clear autoclear flags
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-14 17:02:19 +02:00
Paolo Bonzini 5f3777945d block: push bdrv_change_backing_file error checking up from drivers
This check applies to all drivers, but QED lacks it.

Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-10 10:32:11 +02:00
Zhi Yong Wu 15552c4ad3 qcow2: lock on prealloc
preallocate() will be locked. This is required because
qcow2_alloc_cluster_link_l2() assumes that it runs under a lock that it
can drop while COW is being performed.

Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-07 19:33:18 +02:00
Stefan Weil b9531b6eed block/qcow2: Add missing GCC_FMT_ATTR to function report_unsupported()
Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-02 18:39:39 +02:00
Kevin Wolf 621f058940 qcow2: Zero write support
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-20 15:57:30 +02:00
Kevin Wolf cfcc4c62ff qcow2: Support for feature table header extension
Instead of printing an ugly bitmask, qemu can now print a more helpful
string even for yet unknown features.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-20 15:57:29 +02:00
Kevin Wolf 6377af48b0 qcow2: Support reading zero clusters
This adds support for reading zero clusters in version 3 images.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-20 15:57:29 +02:00
Kevin Wolf 6744cbab8c qcow2: Version 3 images
This adds the basic infrastructure to qcow2 to handle version 3 images.
It includes code to create v3 images, allow header updates for v3 images
and checks feature bits.

It still misses support for zero clusters, so this is not a fully
compliant implementation of v3 yet.

The default for creating new images stays at v2 for now.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-20 15:57:29 +02:00
Kevin Wolf 68d000a390 qcow2: Ignore reserved bits in get_cluster_offset
With this change, reading from a qcow2 image ignores all reserved bits
that are set in an L1 or L2 table entry.

Now get_cluster_offset() assigns *cluster_offset only the offset without
any other flags. The cluster type is not longer encoded in the offset,
but a positive return value in case of success.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-20 15:57:27 +02:00
Paolo Bonzini 29cdb2513c block: push recursive flushing up from drivers
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-05 14:54:39 +02:00
Kevin Wolf 259b217310 qcow2: Add error messages in qcow2_truncate
qemu-img resize has some limitations with qcow2, but the user is only
told that "this image format does not support resize". Quite confusing,
so add some more detailed error_report() calls and change "this image
format" into "this image".

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2012-03-12 15:14:06 +01:00
Kevin Wolf 3cce16f44d qcow2: Add some tracing
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2012-03-12 15:14:06 +01:00
Kevin Wolf 64ca6aee4f qcow2: Reject too large header extensions
Image files that make qemu-img info read several gigabytes into the
unknown header extensions list are bad. Just fail opening the image
if an extension claims to be larger than the header extension area.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2012-02-29 12:48:47 +01:00
Kevin Wolf fd29b4bbef qcow2: Fix offset in qcow2_read_extensions
The spec says that the length of extensions is padded to 8 bytes, not
the offset. Currently this is the same because the header size is a
multiple of 8, so this is only about compatibility with future changes
to the header size.

While touching it, move the calculation to a common place instead of
duplicating it for each header extension type.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2012-02-29 12:48:47 +01:00
Kevin Wolf 423477e556 qcow2: Fix build with DEBUG_EXT enabled
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-02-29 12:48:47 +01:00
Kevin Wolf 75bab85ca0 qcow2: Keep unknown header extension when rewriting header
If we want header extensions to work as compatible extensions, we can't
destroy yet unknown header extensions when rewriting the header (e.g.
for changing the backing file). Save all unknown header extensions in a
list of blobs and include them in a new header.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-02-09 16:17:51 +01:00
Kevin Wolf e24e49e619 qcow2: Update whole header at once
In order to switch the backing file, qcow2 issues multiple write
requests that only changed a part of the image header. Any failure after
the first one would leave the header in an corrupted state. With this
patch, the whole header is written at once, so we can't fail in the
middle.

At the same time, this gives us a reusable functions that updates all
fields of the qcow2 header and not only the backing file.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-02-09 16:17:51 +01:00
Li Zhi Hui 28c1202ba6 block/qcow2.c: call qcow2_free_snapshots in the function of qcow2_close
Signed-off-by: Li Zhi Hui <zhihuili@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-15 12:40:08 +01:00
Anthony Liguori eb5d5beaeb Merge remote-tracking branch 'kwolf/for-anthony' into staging 2011-12-05 09:39:25 -06:00
Stefan Hajnoczi e8ee5e4c47 coroutine: add qemu_co_queue_restart_all()
It's common to wake up all waiting coroutines.  Introduce the
qemu_co_queue_restart_all() function to do this instead of looping over
qemu_co_queue_next() in every caller.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-05 14:51:38 +01:00
Stefan Hajnoczi f8a2e5e3ca block: convert qcow2, qcow2, and vmdk to .bdrv_co_is_allocated()
The qcow2, qcow, and vmdk block drivers are based on coroutines.  They have a
coroutine mutex which protects internal state.  We can convert the
.bdrv_is_allocated() function to .bdrv_co_is_allocated() by holding the mutex
around the cluster lookup operation.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-05 14:51:37 +01:00
Kevin Wolf 42deb29fed qcow2: Return real error code in qcow2_read_snapshots
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-12-05 14:51:35 +01:00