Commit Graph

1790 Commits (b5cf2c1b0897506a40e0c420391875acc484792b)

Author SHA1 Message Date
Kevin Wolf 4f4896db5f qed: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the qed block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-15 15:07:15 +02:00
Kevin Wolf de82815db1 qcow2: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the qcow2 block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:15 +02:00
Kevin Wolf 0df93305f2 qcow1: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the qcow1 block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:15 +02:00
Kevin Wolf f7b593d937 parallels: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the parallels block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-15 15:07:15 +02:00
Kevin Wolf 2347dd7b68 nfs: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the nfs block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-15 15:07:15 +02:00
Kevin Wolf 4d5a3f888c iscsi: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the iscsi block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-08-15 15:07:15 +02:00
Kevin Wolf b546a94474 dmg: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the dmg block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-15 15:07:15 +02:00
Kevin Wolf 8dc7a7725b curl: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the curl block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-15 15:07:15 +02:00
Kevin Wolf 4ae7a52e43 cloop: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the cloop block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-15 15:07:15 +02:00
Kevin Wolf 7bf665ee35 bochs: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the bochs block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-08-15 15:07:15 +02:00
Jeff Cody fef6070eff block: vpc - use block layer ops in vpc_create, instead of posix calls
Use the block layer to create, and write to, the image file in the VPC
.bdrv_create() operation.

This has a couple of benefits: Images can now be created over protocols,
and hacks such as NOCOW are not needed in the image format driver, and
the underlying file protocol appropriate for the host OS can be relied
upon.

Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-15 15:07:15 +02:00
Jeff Cody dddc7750d6 block: use the standard 'ret' instead of 'result'
Most QEMU code uses 'ret' for function return values. The VDI driver
uses a mix of 'result' and 'ret'.  This cleans that up, switching over
to the standard 'ret' usage.

Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-15 15:07:15 +02:00
Jeff Cody 70747862f1 block: vdi - use block layer ops in vdi_create, instead of posix calls
Use the block layer to create, and write to, the image file in the
VDI .bdrv_create() operation.

This has a couple of benefits: Images can now be created over protocols,
and hacks such as NOCOW are not needed in the image format driver, and
the underlying file protocol appropriate for the host OS can be relied
upon.

Also some minor cleanup for error handling.

Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-15 15:07:14 +02:00
Jeff Cody 4f75b52a07 block: VHDX endian fixes
This patch contains several changes for endian conversion fixes for
VHDX, particularly for big-endian machines (multibyte values in VHDX are
all on disk in LE format).

Tests were done with existing qemu-iotests on an IBM POWER7 (8406-71Y).
This includes sample images created by Hyper-V, both with dirty logs and
without.

In addition, VHDX image files created (and written to) on a BE machine
were tested on a LE machine, and vice-versa.

Reported-by: Markus Armburster <armbru@redhat.com>
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-15 15:07:14 +02:00
Jeff Cody 349592e0b9 block: vhdx - add error check
This add an error check for an invalid descriptor entry signature,
when flushing the log descriptor entries.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-15 15:07:14 +02:00
Chrysostomos Nanakos 76d3d83a37 block/archipelago: Add support for creating images
qemu-img archipelago:<volumename>[/mport=<mapperd_port>[:vport=<vlmcd_port>]
 [:segment=<segment_name>]] [size]

Signed-off-by: Chrysostomos Nanakos <cnanakos@grnet.gr>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:14 +02:00
Chrysostomos Nanakos 70537a8506 block/archipelago: Implement bdrv_parse_filename()
VM Image on Archipelago volume can also be specified like this:

file=archipelago:<volumename>[/mport=<mapperd_port>[:vport=<vlmcd_port>][:
segment=<segment_name>]]

Examples:

file=archipelago:my_vm_volume
file=archipelago:my_vm_volume/mport=123
file=archipelago:my_vm_volume/mport=123:vport=1234
file=archipelago:my_vm_volume/mport=123:vport=1234:segment=my_segment

Signed-off-by: Chrysostomos Nanakos <cnanakos@grnet.gr>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:14 +02:00
Chrysostomos Nanakos c9a12e751b block: Support Archipelago as a QEMU block backend
VM Image on Archipelago volume is specified like this:

file.driver=archipelago,file.volume=<volumename>[,file.mport=<mapperd_port>[,
file.vport=<vlmcd_port>][,file.segment=<segment_name>]]

'archipelago' is the protocol.

'mport' is the port number on which mapperd is listening. This is optional
and if not specified, QEMU will make Archipelago to use the default port.

'vport' is the port number on which vlmcd is listening. This is optional
and if not specified, QEMU will make Archipelago to use the default port.

'segment' is the name of the shared memory segment Archipelago stack is using.
This is optional and if not specified, QEMU will make Archipelago to use the
default value, 'archipelago'.

Examples:

file.driver=archipelago,file.volume=my_vm_volume
file.driver=archipelago,file.volume=my_vm_volume,file.mport=123
file.driver=archipelago,file.volume=my_vm_volume,file.mport=123,
file.vport=1234
file.driver=archipelago,file.volume=my_vm_volume,file.mport=123,
file.vport=1234,file.segment=my_segment

Signed-off-by: Chrysostomos Nanakos <cnanakos@grnet.gr>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-15 15:07:14 +02:00
Chunyan Liu 000c4dfff4 qemu-img info: show nocow info
Add nocow info in 'qemu-img info' output to show whether the file
currently has NOCOW flag set or not.

Signed-off-by: Chunyan Liu <cyliu@suse.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:14 +02:00
Fam Zheng c6ac36e145 vmdk: Optimize cluster allocation
This drops the unnecessary bdrv_truncate() from, and also improves,
cluster allocation code path.

Before, when we need a new cluster, get_cluster_offset truncates the
image to bdrv_getlength() + cluster_size, and returns the offset of
added area, i.e. the image length before truncating.

This is not efficient, so it's now rewritten as:

  - Save the extent file length when opening.

  - When allocating cluster, use the saved length as cluster offset.

  - Don't truncate image, because we'll anyway write data there: just
    write any data at the EOF position, in descending priority:

    * New user data (cluster allocation happens in a write request).

    * Filling data in the beginning and/or ending of the new cluster, if
      not covered by user data: either backing file content (COW), or
      zero for standalone images.

One major benifit of this change is, on host mounted NFS images, even
over a fast network, ftruncate is slow (see the example below). This
change significantly speeds up cluster allocation. Comparing by
converting a cirros image (296M) to VMDK on an NFS mount point, over
1Gbe LAN:

    $ time qemu-img convert cirros-0.3.1.img /mnt/a.raw -O vmdk

    Before:
        real    0m21.796s
        user    0m0.130s
        sys     0m0.483s

    After:
        real    0m2.017s
        user    0m0.047s
        sys     0m0.190s

We also get rid of unchecked bdrv_getlength() and bdrv_truncate(), and
get a little more documentation in function comments.

Tested that this passes qemu-iotests for all VMDK subformats.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:14 +02:00
Markus Armbruster 52bf1e722d block: Avoid bdrv_get_geometry() where errors should be detected
bdrv_get_geometry() hides errors.  Use bdrv_nb_sectors() or
bdrv_getlength() instead where that's obviously inappropriate.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:13 +02:00
Markus Armbruster 75d3d21f9e block: Drop superfluous aligning of bdrv_getlength()'s value
It returns a multiple of the sector size.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:13 +02:00
Markus Armbruster 57322b7811 block: Use bdrv_nb_sectors() where sectors, not bytes are wanted
Instead of bdrv_getlength().

Aside: a few of these callers don't handle errors.  I didn't
investigate whether they should.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:13 +02:00
Kevin Wolf df26a35025 raw-posix: Fail gracefully if no working alignment is found
If qemu couldn't find out what O_DIRECT alignment to use with a given
file, it would run into assert(bdrv_opt_mem_align(bs) != 0); in block.c
and confuse users. This adds a more descriptive error message for such
cases.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-18 13:18:43 +01:00
Kevin Wolf 3baca89139 block: Add Error argument to bdrv_refresh_limits()
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-18 13:18:43 +01:00
Kevin Wolf 12ac6d3db7 qcow2: Fix error path for unknown incompatible features
qcow2's report_unsupported_feature() had two bugs: A 32 bit truncation
would prevent feature table entries for bits 32-63 from being used, and
it could assign errp multiple times if there was more than one unknown
feature, resulting in an error_set() assertion failure.

Fix the truncation, make sure to set the error exactly once and add a
qemu-iotests case for it.

This fixes https://bugs.launchpad.net/qemu/+bug/1342704/

Reported-by: Maria Kustova <maria.k@catit.be>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-18 13:12:15 +01:00
Gonglei a1abf40d6b linux-aio: Fix laio resource leak
when hotplug virtio-scsi disks using laio, the aio_nr will
increase in laio_init() by io_setup(), we can see the number by
  # cat /proc/sys/fs/aio-nr
  128
if the aio_nr attach the maxnum, which found from
  # cat /proc/sys/fs/aio-max-nr
  65536
the hotplug process will fail because of aio context leak.

Fix it by io_destroy in laio_cleanup().

Reported-by: daifulai <daifulai@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-15 15:34:13 +02:00
Kevin Wolf 8eb029c26e block: Assert qiov length matches request length
At least raw-posix relies on this because it can allocate bounce buffers
based on the request length, but access it using all of the qiov entries
later.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-07-14 12:03:20 +02:00
Kevin Wolf f06ee3d4aa qed: Make qiov match request size until backing file EOF
If a QED image has a shorter backing file and a read request to
unallocated clusters goes across EOF of the backing file, the backing
file sees a shortened request and the rest is filled with zeros.
However, the original too long qiov was used with the shortened request.

This patch makes the qiov size match the request size, avoiding a
potential buffer overflow in raw-posix.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-07-14 12:03:20 +02:00
Kevin Wolf 44deba5a52 qcow2: Make qiov match request size until backing file EOF
If a qcow2 image has a shorter backing file and a read request to
unallocated clusters goes across EOF of the backing file, the backing
file sees a shortened request and the rest is filled with zeros.
However, the original too long qiov was used with the shortened request.

This patch makes the qiov size match the request size, avoiding a
potential buffer overflow in raw-posix.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-07-14 12:03:20 +02:00
Kevin Wolf d40593dd90 block/backup: Fix hang for unaligned image size
When doing a block backup of an image with an unaligned size (with
respect to the BACKUP_CLUSTER_SIZE), qemu would check the allocation
status of sectors after the end of the image. bdrv_is_allocated()
returns a result that is valid for 0 sectors in this case, so the backup
job ran into an endless loop.

Stop looping when seeing a result valid for 0 sectors, we're at EOF then.

The test case looks somewhat unrelated at first sight because I
originally tried to reproduce a different suspected bug that turned out
to not exist. Still a good test case and it accidentally found this one.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-07-09 15:50:11 +02:00
Ming Lei 1b3abdcccf linux-aio: implement io plug, unplug and flush io queue
This patch implements .bdrv_io_plug, .bdrv_io_unplug and
.bdrv_flush_io_queue callbacks for linux-aio Block Drivers,
so that submitting I/O as a batch can be supported on linux-aio.

[Unprocessed requests are completed with -EIO instead of a bogus ret
value.
--Stefan]

Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-07 11:05:17 +02:00
Markus Armbruster aa729704f4 raw-posix: Fix raw_getlength() to always return -errno on error
We got a merry mix of -1 and -errno here.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-07 09:41:29 +02:00
Kevin Wolf 5a0f6fd5c8 mirror: Fix qiov size for short requests
When mirroring an image of a size that is not a multiple of the
mirror job granularity, the last request would have the right nb_sectors
argument, but a qiov that is rounded up to the next multiple of the
granularity. Don't do this.

This fixes a segfault that is caused by raw-posix being confused by this
and allocating a buffer with request length, but operating on it with
qiov length.

[s/Driver/Drive/ in qemu-iotests 041 as suggested by Eric
--Stefan]

Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-07 09:15:29 +02:00
Jeff Cody 13d8cc515d block: add backing-file option to block-stream
On some image chains, QEMU may not always be able to resolve the
filenames properly, when updating the backing file of an image
after a block job.

For instance, certain relative pathnames may fail, or drives may
have been specified originally by file descriptor (e.g. /dev/fd/???),
or a relative protocol pathname may have been used.

In these instances, QEMU may lack the information to be able to make
the correct choice, but the user or management layer most likely does
have that knowledge.

With this extension to the block-stream api, the user is able to change
the backing file of the active layer as part of the block-stream
operation.

This allows the change to be 'safe', in the sense that if the attempt
to write the active image metadata fails, then the block-stream
operation returns failure, without disrupting the guest.

If a backing file string is not specified in the command, the backing
file string to use is determined in the same manner as it was
previously.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 10:47:01 +02:00
Jeff Cody 54e2690090 block: extend block-commit to accept a string for the backing file
On some image chains, QEMU may not always be able to resolve the
filenames properly, when updating the backing file of an image
after a block commit.

For instance, certain relative pathnames may fail, or drives may
have been specified originally by file descriptor (e.g. /dev/fd/???),
or a relative protocol pathname may have been used.

In these instances, QEMU may lack the information to be able to make
the correct choice, but the user or management layer most likely does
have that knowledge.

With this extension to the block-commit api, the user is able to change
the backing file of the overlay image as part of the block-commit
operation.

This allows the change to be 'safe', in the sense that if the attempt
to write the overlay image metadata fails, then the block-commit
operation returns failure, without disrupting the guest.

If the commit top is the active layer, then specifying the backing
file string will be treated as an error (there is no overlay image
to modify in that case).

If a backing file string is not specified in the command, the backing
file string to use is determined in the same manner as it was
previously.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 10:47:01 +02:00
Peter Maydell 6764579f89 block/cow: Avoid use of uninitialized cow_bs in error path
Commit 25814e8987 introduced an error-exit code path which does
a "goto exit" before the cow_bs variable is initialized, meaning
we would call bdrv_unref() on an uninitialized variable and
likely segfault. Fix this by moving the NULL-initialization
to the top of the function and making the exit code path handle
the case where it is NULL.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 10:15:34 +02:00
Chunyan Liu 4ab1559085 qemu-img create: add 'nocow' option
Add 'nocow' option so that users could have a chance to set NOCOW flag to
newly created files. It's useful on btrfs file system to enhance performance.

Btrfs has low performance when hosting VM images, even more when the guest
in those VM are also using btrfs as file system. One way to mitigate this bad
performance is to turn off COW attributes on VM files. Generally, there are
two ways to turn off NOCOW on btrfs: a) by mounting fs with nodatacow, then
all newly created files will be NOCOW. b) per file. Add the NOCOW file
attribute. It could only be done to empty or new files.

This patch tries the second way, according to the option, it could add NOCOW
per file.

For most block drivers, since the create file step is in raw-posix.c, so we
can do setting NOCOW flag ioctl in raw-posix.c only.

But there are some exceptions, like block/vpc.c and block/vdi.c, they are
creating file by calling qemu_open directly. For them, do the same setting
NOCOW flag ioctl work in them separately.

[Fixed up 082.out due to the new 'nocow' creation option
--Stefan]

Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01 10:15:12 +02:00
Benoît Canet 09158f00e0 block: Add replaces argument to drive-mirror
drive-mirror will bdrv_swap the new BDS named node-name with the one
pointed by replaces when the mirroring is finished.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 20:00:00 +02:00
Stefan Hajnoczi 13344f3a17 block: acquire AioContext in qmp_query_blockstats()
Make query-blockstats safe for dataplane by acquiring the
BlockDriverState's AioContext.  This ensures that the dataplane IOThread
and the main loop's monitor code do not race.

Note the assumption that acquiring the drive's BDS AioContext also
protects ->file and ->backing_hd.  This assumption is made by other
aio_context_acquire() callers too.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 18:20:29 +02:00
Stefan Hajnoczi ac46821f2c block: make bdrv_query_stats() static
This function is only called from block/qapi.c.  There is no need to
keep it public.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 18:19:57 +02:00
Benoît Canet cf29a570a7 quorum: Add the rewrite-corrupted parameter to quorum
On read operations when this parameter is set and some replicas are corrupted
while quorum can be reached quorum will proceed to rewrite the correct version
of the data to fix the corrupted replicas.

This will shine with SSD where the FTL will remap the same block at another
place on rewrite.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-27 14:18:17 +02:00
Kevin Wolf 8ee79e707a block: Catch backing files assigned to non-COW drivers
Since we parse backing.* options to add a backing file from the command
line when the driver didn't assign one, it has been possible to have a
backing file for e.g. raw images (it just was never accessed).

This is obvious nonsense and should be rejected.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-06-26 13:51:01 +02:00
Peter Lieven f42ca3cad1 block/nfs: add knob to set readahead
upcoming libnfs will feature internal readahead support.
Add a knob to pass the optional readahead value as a URL
parameter.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-26 13:51:01 +02:00
Peter Lieven 7c24384b3b block/nfs: fix url parameter checking
this patch fixes the incorrect usage of strncmp and
adds simple error checking by means of parse_uint_full
instead of atoi for the supplied URL parameters.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-26 13:51:01 +02:00
Fam Zheng 9e48b02540 mirror: Go through ready -> complete process for 0 len image
When mirroring or active committing a zero length image, BLOCK_JOB_READY
is not reported now, instead the job completes because we short circuit
the mirror job loop.

This is inconsistent with non-zero length images, and only confuses
management software.

Let's do the same thing when seeing a 0-length image: report ready
immediately; wait for block-job-cancel or block-job-complete; clear the
cancel flag as existing non-zero image synced case (cancelled after
ready); then jump to the exit.

Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-06-26 13:50:57 +02:00
Stefan Weil 5d831be272 Fix new typos (found by codespell)
* accomodate -> accommodate
* aquiring -> acquiring
* beacuse -> because
* loosing -> losing
* prefering -> preferring
* threshhold -> threshold

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-24 20:01:24 +04:00
Wenchao Xia fe069d9d59 qapi event: convert QUORUM events
Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:12:28 -04:00
Wenchao Xia bcada37b19 qapi event: convert other BLOCK_JOB events
Since BLOCK_JOB_COMPLETED, BLOCK_JOB_CANCELLED, BLOCK_JOB_READY are
related, convert them in one patch. The block_job_event_* functions
are used to keep encapsulation of BlockJob structure.

Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:12:28 -04:00
Wenchao Xia c120f0fa14 qapi event: convert BLOCK_IMAGE_CORRUPTED
Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:12:27 -04:00
Wenchao Xia a589569f2f qapi: adjust existing defines
In order to let event defines use existing types later, instead of
redefine new ones, some old type defines for spice and vnc are changed,
and BlockErrorAction is moved from block.h to qapi schema. Note that
BlockErrorAction is not merged with BlockdevOnError.

At this point, VncInfo is not made a child of VncBasicInfo, because
VncBasicInfo has mandatory fields where VncInfo makes them optional.

Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:01:25 -04:00
Liu Yuan 5d5da114b3 sheepdog: fix NULL dereference in sd_create
Following command

qemu-img create -f qcow2 sheepdog:test 20g

will cause core dump because aio_context is NULL in sd_create. We should
initialize it by qemu_get_aio_context() to avoid NULL dereference.

Cc: qemu-devel@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-23 16:36:13 +08:00
Peter Lieven 8c215a9fbd block/iscsi: drop obsolete pointers from iscsi_co_writev
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 18:42:42 +02:00
Peter Lieven 9d2e256e62 block/iscsi: fix init value for iTask->retries
during rebasing the changed init value for the
retry counter was missed. This resulted in no retries
being performed at all.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 18:42:24 +02:00
Peter Lieven e49ab19fca block/iscsi: bump libiscsi requirement to 1.9.0
This patch lifts the minimum supported libiscsi version from 1.4.0 to
1.9.0 since the BUSY patch required that change.

On one this allows us to remove all #ifdefs from the code which
makes the code easier to maintain and read. On the other hand
I would not recommend libiscsi prior to 1.8.0 for production use
because the following important libiscsi fixes for deadlocks and
protocol errors are missing prior to 1.8.0:

dbe9a1e SOCKET queue cmd PDUs directly in waitpdu queue
30df192 DATA-OUT set pdu->cmdsn appropriately
548bd22 ISCSI fix broken send logic in iscsi_scsi_async_command
14bee10 RECONNECT do not increase CmdSN for immediate PDUs
1f4a66a PDU queue out PDUs in order of itt.
562dd46 PDU avoid incrementing itt to 0xffffffff
cd09c0f PDU use serial32 arithmetic for cmdsn, maxcmdsn and expcmdsn.
89e918e SOCKET validate data_size in in_pdu header
91267f5 Limit immediate and unsolicited data to FirstBurstLength

Note that libiscsi 1.9.0 was released on Feb 24th, 2013, about
one month after 1.8.0.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 18:03:33 +02:00
Peter Lieven 9281fe9eea block/iscsi: use 16 byte CDBs only when necessary
this patch changes the driver to uses 16 Byte CDBs for
READ/WRITE only if the target requires 64bit lba addressing.

On one hand this saves 6 bytes in each PDU on the other
hand it seems that 10 Byte CDBs seems to be much better
supported and tested as a recent issue I had with a
major storage supplier lined out.

For WRITESAME the logic is a bit more tricky as WRITESAME10
with UNMAP was added really late. Thus a fallback to WRITESAME16
is possible if it supports UNMAP and WRITESAME10 not.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 08:47:10 +02:00
Peter Lieven fcd470d857 block/iscsi: fix potential segfault on early callback
it might happen in the future that a function directly invokes its callback.
In this case we end up in a segfault because the iTask is gone when the BH
is scheduled.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 08:47:10 +02:00
Peter Lieven efc6de0d0e block/iscsi: handle BUSY condition
this patch adds handling of BUSY status reponse from an iSCSI target.
Currently, we fail with -EIO in case of SCSI_STATUS_BUSY while the
obvious reaction would be to retry the operation after some time.
The retry time is randomly choosen from a range with exponential
growth increasing with each retry.

This patch includes most of the changes by a an upcoming patch
from Stefan Hajnoczi:

 iscsi: implement .bdrv_detach/attach_aio_context()

because I also need the reference to the aio_context for
the retry timer to work. I included the changes to maintain
better mergeability.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 08:47:10 +02:00
Chunyan Liu c282e1fdf7 cleanup QEMUOptionParameter
Now that all backend drivers are using QemuOpts, remove all
QEMUOptionParameter related codes.

Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:21 +08:00
Chunyan Liu fec9921f0a vpc.c: replace QEMUOptionParameter with QemuOpts
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:21 +08:00
Chunyan Liu 5820f1da51 vmdk.c: replace QEMUOptionParameter with QemuOpts
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:21 +08:00
Chunyan Liu 5366092c7a vhdx.c: replace QEMUOptionParameter with QemuOpts
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:21 +08:00
Chunyan Liu 004b7f2522 vdi.c: replace QEMUOptionParameter with QemuOpts
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:21 +08:00
Chunyan Liu 766181fe57 ssh.c: replace QEMUOptionParameter with QemuOpts
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:21 +08:00
Chunyan Liu b222237b7b sheepdog.c: replace QEMUOptionParameter with QemuOpts
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:21 +08:00
Chunyan Liu bd0cf596fd rbd.c: replace QEMUOptionParameter with QemuOpts
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:21 +08:00
Chunyan Liu cd3a4cf631 raw_bsd.c: replace QEMUOptionParameter with QemuOpts
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:21 +08:00
Chunyan Liu ddef769993 raw-win32.c: replace QEMUOptionParameter with QemuOpts
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:21 +08:00
Chunyan Liu 6f482f742d raw-posix.c: replace QEMUOptionParameter with QemuOpts
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:21 +08:00
Chunyan Liu 7ab74849a5 qed.c: replace QEMUOptionParameter with QemuOpts
One extra change is to define QED_DEFAULT_CLUSTER_SIZE = 65536 instead
of 64 * 1024; because:
according to existing create_options, "cluster size" has default value =
QED_DEFAULT_CLUSTER_SIZE, after switching to create_opts, this has to be
stringized and set to .def_value_str. That is,
  .def_value_str = stringify(QED_DEFAULT_CLUSTER_SIZE),
so the QED_DEFAULT_CLUSTER_SIZE could not be a expression.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:21 +08:00
Chunyan Liu 1bd0e2d1c4 qcow2.c: replace QEMUOptionParameter with QemuOpts
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:21 +08:00
Chunyan Liu 16d12159e2 qcow.c: replace QEMUOptionParameter with QemuOpts
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:20 +08:00
Chunyan Liu 98c10b810a nfs.c: replace QEMUOptionParameter with QemuOpts
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:20 +08:00
Chunyan Liu a59479e3f3 iscsi.c: replace QEMUOptionParameter with QemuOpts
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:20 +08:00
Chunyan Liu 90c772de56 gluster.c: replace QEMUOptionParameter with QemuOpts
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:20 +08:00
Chunyan Liu 25814e8987 cow.c: replace QEMUOptionParameter with QemuOpts
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:20 +08:00
Chunyan Liu facdbb0272 vvfat.c: handle cross_driver's create_options and create_opts
vvfat shares create options of qcow driver. To avoid vvfat breaking when
qcow driver changes from QEMUOptionParameter to QemuOpts, let it able
to handle both cases.

Signed-off-by: Chunyan Liu <cyliu@suse.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:20 +08:00
Chunyan Liu 83d0521a1e change block layer to support both QemuOpts and QEMUOptionParamter
Change block layer to support both QemuOpts and QEMUOptionParameter.
After this patch, it will change backend drivers one by one. At the end,
QEMUOptionParameter will be removed and only QemuOpts is kept.

Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:20 +08:00
Peter Lieven a2c0fe2fd2 block/nfs: fix potential segfault on early callback
it will happen in the future that the callback of a libnfs call
directly invokes the callback. In this case we end up in a segfault
because the NFSRPC is gone when we the BH is scheduled.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:19 +08:00
Markus Armbruster f7047c2daf block: Drop superfluous conditionals around g_free()
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:19 +08:00
Paolo Bonzini 6998b6c11b vdi: remove double conversion
This should be a problem when running on big-endian machines.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-06-10 19:39:34 +04:00
Hitoshi Mitake 5d039baba4 sheepdog: reload only header in a case of live snapshot
sheepdog driver doesn't need to read data_vdi_id[] when a live snapshot is
created.

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Liu Yuan <namei.unix@gmail.com>
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-06 14:54:43 +02:00
Hitoshi Mitake b544c1aba8 sheepdog: fix vdi object update after live snapshot
sheepdog driver should decide a write request is COW or not based on inode
object which is active when the write request is issued.

Example of wrong inode update path in the previous driver:
1. drier issues an ordinal write request to an existing object
2. user creates a snapshot of the VDI before the write request is completed
3. the respones for the request is RDONLY, because the VDI is already a snapshot
4. the driver reload an inode object of the new active VDI, then issues a write
   request again
5. the second write request can be completed
6. driver decide the request is COW or not with the below conditional branch:
   	  if (s->inode.data_vdi_id[idx] != s->inode.vdi_id) {
7. the ID of the written object and VID of the new active VDI is different, so
   the driver updates data_vdi_id[idx] and writes inode object
8. the existing object cannot be seen by the new active VDI, it results object
   leaking

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Liu Yuan <namei.unix@gmail.com>
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-06 14:53:55 +02:00
Kevin Wolf 405a27640b rbd: Fix leaks in rbd_start_aio() error path
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-06 11:05:04 +02:00
Stefan Hajnoczi 76ef2cf549 raw-posix: drop raw_get_aio_fd() since it is no longer used
virtio-blk data-plane now uses the QEMU block layer for I/O.  We do not
need raw_get_aio_fd() anymore.  It was a layering violation anyway, so
let's get rid of it.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:12 +02:00
Stefan Hajnoczi c75f3bdf46 vmdk: implement .bdrv_detach/attach_aio_context()
Implement .bdrv_detach/attach_aio_context() interfaces to propagate
detach/attach to BDRVVmdkState->extents[].file.  The block layer takes
care of ->file and ->backing_hd but doesn't know about our extents
BlockDriverStates, which is also part of the graph.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:12 +02:00
Stefan Hajnoczi 2af0b20056 ssh: use BlockDriverState's AioContext
Drop the assumption that we're using the main AioContext.  Use
bdrv_get_aio_context() to register fd handlers in the right AioContext
for this BlockDriverState.

The .bdrv_detach_aio_context() and .bdrv_attach_aio_context() interfaces
are not needed since no fd handlers, timers, or BHs stay registered when
requests have been drained.

For now this doesn't make much difference but will allow ssh to work in
IOThread instances in the future.

Acked-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:12 +02:00
Stefan Hajnoczi 84390bed59 sheepdog: implement .bdrv_detach/attach_aio_context()
Drop the assumption that we're using the main AioContext.  Convert
qemu_aio_set_fd_handler() to aio_set_fd_handler() and qemu_aio_wait() to
aio_poll().

The .bdrv_detach/attach_aio_context() interfaces also need to be
implemented to move the socket fd handler from the old to the new
AioContext.

Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Acked-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:12 +02:00
Stefan Hajnoczi ea80019149 rbd: use BlockDriverState's AioContext
Drop the assumption that we're using the main AioContext.  Convert
qemu_bh_new() to aio_bh_new() and qemu_aio_wait() to aio_poll().

The .bdrv_detach_aio_context() and .bdrv_attach_aio_context() interfaces
are not needed since no fd handlers, timers, or BHs stay registered when
requests have been drained.

Cc: Josh Durgin <josh.durgin@inktank.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2014-06-04 09:56:12 +02:00
Stefan Hajnoczi 85ebd381fd block/raw-win32: implement .bdrv_detach/attach_aio_context()
Drop the assumption that we're using the main AioContext for raw-win32.
Convert the aio-win32 code to support detach/attach and replace
qemu_aio_wait() with aio_poll().

The .bdrv_detach/attach_aio_context() interfaces move the aio-win32
event notifier from the old to the new AioContext.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:12 +02:00
Stefan Hajnoczi 99cc598924 block/raw-win32: create one QEMUWin32AIOState per BDRVRawState
Each QEMUWin32AIOState event notifier is associated with an AioContext.
Since BlockDriverState instances can use different AioContexts we cannot
continue to use a global QEMUWin32AIOState.

Let each BDRVRawState have its own QEMUWin32AIOState and free it when
BDRVRawState is closed.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:12 +02:00
Stefan Hajnoczi abd269b7cf block/linux-aio: fix memory and fd leak
Hot unplugging -drive aio=native,file=test.img,format=raw images leaves
the Linux AIO event notifier and struct qemu_laio_state allocated.
Luckily nothing will use the event notifier after the BlockDriverState
has been closed so the handler function is never called.

It's still worth fixing this resource leak.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:11 +02:00
Stefan Hajnoczi c2f3426c9b block/raw-posix: implement .bdrv_detach/attach_aio_context()
Drop the assumption that we're using the main AioContext for Linux AIO.
Convert the Linux AIO event notifier to use aio_set_event_notifier().

The .bdrv_detach/attach_aio_context() interfaces also need to be
implemented to move the event notifier handler from the old to the new
AioContext.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:11 +02:00
Stefan Hajnoczi e3625d3d6a quorum: implement .bdrv_detach/attach_aio_context()
Implement .bdrv_detach/attach_aio_context() interfaces to propagate
detach/attach to BDRVQuorumState->bs[] children.  The block layer takes
care of ->file and ->backing_hd but doesn't know about our ->bs[]
BlockDriverStates, which is also part of the graph.

Reviewed-by: Benoît Canet <benoit.canet@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:11 +02:00
Stefan Hajnoczi a8c868c386 qed: use BlockDriverState's AioContext
Drop the assumption that we're using the main AioContext.  Convert
qemu_bh_new() to aio_bh_new() and qemu_aio_wait() to aio_poll() so we're
using the BlockDriverState's AioContext.

Implement .bdrv_detach/attach_aio_context() interfaces to move the
QED_F_NEED_CHECK timer from the old AioContext to the new one.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:11 +02:00
Stefan Hajnoczi 471799d135 nfs: implement .bdrv_detach/attach_aio_context()
Drop the assumption that we're using the main AioContext.  The following
functions need to be converted:
 * qemu_bh_new() -> aio_bh_new()
 * qemu_aio_set_fd_handler() -> aio_set_fd_handler()
 * qemu_aio_wait() -> aio_poll()

The .bdrv_detach/attach_aio_context() interfaces also need to be
implemented to move the fd handler from the old to the new AioContext.

Cc: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
2014-06-04 09:56:11 +02:00
Stefan Hajnoczi 69447cd8f3 nbd: implement .bdrv_detach/attach_aio_context()
Drop the assumption that we're using the main AioContext.  Convert
qemu_aio_set_fd_handler() calls to aio_set_fd_handler().

The .bdrv_detach/attach_aio_context() interfaces also need to be
implemented to move the socket fd handler from the old to the new
AioContext.

Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:11 +02:00
Stefan Hajnoczi 80cf625766 iscsi: implement .bdrv_detach/attach_aio_context()
Drop the assumption that we're using the main AioContext for Linux
AIO.  Convert qemu_aio_set_fd_handler() to aio_set_fd_handler() and
timer_new_ms() to aio_timer_new().

The .bdrv_detach/attach_aio_context() interfaces also need to be
implemented to move the fd and timer from the old to the new AioContext.

Cc: Peter Lieven <pl@kamp.de>
Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
2014-06-04 09:56:11 +02:00
Stefan Hajnoczi 6ee50af24b gluster: use BlockDriverState's AioContext
Drop the assumption that we're using the main AioContext.  Use
aio_bh_new() instead of qemu_bh_new().

The .bdrv_detach_aio_context() and .bdrv_attach_aio_context() interfaces
are not needed since no fd handlers, timers, or BHs stay registered when
requests have been drained.

Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:11 +02:00
Stefan Hajnoczi 63f0f45f2e curl: implement .bdrv_detach/attach_aio_context()
The curl block driver uses fd handlers, timers, and BHs.  The fd
handlers and timers are managed on behalf of libcurl, which controls
them using callback functions that the block driver implements.

The simplest way to implement .bdrv_detach/attach_aio_context() is to
clean up libcurl in the old event loop and initialize it again in the
new event loop.  We do not need to keep track of anything since there
are no pending requests when the AioContext is changed.

Also make sure to use aio_set_fd_handler() instead of
qemu_aio_set_fd_handler() and aio_bh_new() instead of qemu_bh_new() so
the current AioContext is passed in.

Cc: Alexander Graf <agraf@suse.de>
Cc: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2014-06-04 09:56:11 +02:00
Stefan Hajnoczi 9d94020bc6 blkverify: implement .bdrv_detach/attach_aio_context()
Drop the assumption that we're using the main AioContext.  Convert
qemu_bh_new() to aio_bh_new() and qemu_aio_wait() to aio_poll() so we
use the BlockDriverState's AioContext.

Implement .bdrv_detach/attach_aio_context() interfaces to propagate
detach/attach to BDRVBlkverifyState->test_file.  The block layer takes
care of ->file and ->backing_hd but doesn't know about our ->test_file
BlockDriverState, which is also part of the graph.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:11 +02:00
Stefan Hajnoczi 7e1efdf0a3 blkdebug: use BlockDriverState's AioContext
Drop the assumption that we're using the main AioContext.  Convert
qemu_bh_new() to aio_bh_new() so we use the BlockDriverState's
AioContext.

The .bdrv_detach_aio_context() and .bdrv_attach_aio_context() interfaces
are not needed since no fd handlers, timers, or BHs stay registered when
requests have been drained.

Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:11 +02:00
Fam Zheng c13959c745 vmdk: Fix local_err in vmdk_create
In vmdk_create and vmdk_create_extent, initialize local_err before using
it, and don't leak it on error.

Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-05-30 14:26:54 +02:00
Peter Maydell 675036e4da block/raw-posix.c: Avoid nonstandard LONG_LONG_MAX
In the MacOSX specific code in raw-posix.c we use the define
LONG_LONG_MAX. This is actually a non-standard pre-C99 define;
switch to using the standard LLONG_MAX instead.

This apparently fixes a compilation failure with certain
compiler/OS versions (though it is unclear which).

Reported-by: Peter Bartoli <peter@bartoli.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-05-30 14:26:54 +02:00
Markus Armbruster 2df5fee2db block/sheepdog: Plug memory leak in sd_snapshot_create()
Has always been leaky.  Spotted by Coverity.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-05-30 14:26:54 +02:00
Markus Armbruster b122c3b6d0 block/vvfat: Plug memory leak in read_directory()
Has always been leaky.  Spotted by Coverity.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-05-30 14:26:54 +02:00
Markus Armbruster 6262bbd363 block/vvfat: Plug memory leak in check_directory_consistency()
On error path.  Introduced in commit a046433a.  Spotted by Coverity.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-05-30 14:26:54 +02:00
Markus Armbruster f25391c2a6 block/qapi: Plug memory leak in dump_qobject() case QTYPE_QERROR
Introduced in commit a8d8ecb.  Spotted by Coverity.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-05-30 14:26:54 +02:00
Markus Armbruster a1904e48c4 qcow2: Plug memory leak on qcow2_invalidate_cache() error paths
Introduced in commit 5a8a30d.  Spotted by Coverity.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-05-30 14:26:54 +02:00
Markus Armbruster 75e347d66a block/vvfat: Plug memory leak in enable_write_target()
I figure the leak originated in bdrv_create2(), and was duplicated
into callers when commit 91a073a dropped that function.  Looks like
the other places have since been fixed.

Spotted by Coverity.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-05-30 14:26:54 +02:00
Markus Armbruster fbab9ccbdb block/sheepdog: Don't use qerror_report()
qerror_report() is a transitional interface to help with converting
existing HMP commands to QMP.  It should not be used elsewhere.
Replace by error_report().

Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-05-28 14:28:47 +02:00
Markus Armbruster efde4b6252 block/sheepdog: Fix silent sd_open(), sd_create() failures
Open and create methods must set an error when they fail.

Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-05-28 14:28:47 +02:00
Markus Armbruster e67c399363 block/sheepdog: Propagate errors to open and create methods
Completes the conversion to Error started in commit 015a103^..d5124c0,
except for a few bugs fixed in the next commit.

Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-05-28 14:28:47 +02:00
Markus Armbruster dc83cd427b block/sheepdog: Propagate errors through find_vdi_name()
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-05-28 14:28:47 +02:00
Markus Armbruster 7d2d3e74e5 block/sheepdog: Propagate errors through do_sd_create()
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-05-28 14:28:47 +02:00
Markus Armbruster 318df29e10 block/sheepdog: Propagate errors through sd_prealloc()
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-05-28 14:28:47 +02:00
Markus Armbruster 356b4ca2bb block/sheepdog: Propagate errors through get_sheep_fd()
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-05-28 14:28:47 +02:00
Markus Armbruster dfb12bf86e block/sheepdog: Propagate errors through connect_to_sdog()
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-05-28 14:28:47 +02:00
Markus Armbruster d11c8917b2 block/vvfat: Propagate errors through init_directories()
Completes the conversion of the open method to Error started in commit
015a103.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-05-28 14:28:47 +02:00
Markus Armbruster 68c70af16d block/vvfat: Propagate errors through enable_write_target()
Continues the conversion of the open method to Error started in commit
015a103.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-05-28 14:28:47 +02:00
Markus Armbruster 5496fb1aeb block/ssh: Propagate errors to open and create methods
Completes the conversion to Error started in commit 015a103^..d5124c0.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-05-28 14:28:47 +02:00
Markus Armbruster 5f0c39e598 block/ssh: Propagate errors through connect_to_ssh()
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-05-28 14:28:47 +02:00
Markus Armbruster 4618e658e6 block/ssh: Propagate errors through authenticate()
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-05-28 14:28:47 +02:00
Markus Armbruster 01c2b265fc block/ssh: Propagate errors through check_host_key()
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-05-28 14:28:47 +02:00
Markus Armbruster 04bc7c0e38 block/ssh: Drop superfluous libssh2_session_last_errno() calls
libssh2_session_last_error() already returns the error code.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-05-28 14:28:47 +02:00
Markus Armbruster d61563b235 block/rbd: Propagate errors to open and create methods
Completes the conversion to Error started in commit 015a103^..d5124c0.

Cc: Josh Durgin <josh.durgin@inktank.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-05-28 14:28:47 +02:00
Fam Zheng 826b6ca0b0 block: Add backing_blocker in BlockDriverState
This makes use of op_blocker and blocks all the operations except for
commit target, on each BlockDriverState->backing_hd.

The asserts for op_blocker in bdrv_swap are removed because with this
change, the target of block commit has at least the backing blocker of
its child, so the assertion is not true. Callers should do their check.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-05-28 14:28:46 +02:00
Fam Zheng 920beae103 block: Use bdrv_set_backing_hd everywhere
We need to handle the coming backing_blocker properly, so don't open
code the assignment, instead, call bdrv_set_backing_hd to change
backing_hd.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-05-28 14:28:46 +02:00
Kevin Wolf bd60436936 qcow2: Fix memory leak in COW error path
This triggers if bs->drv becomes NULL in a concurrent request. This is
currently only the case when corruption prevention kicks in (i.e. at
most once per image, and after that it produces I/O errors).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-05-28 14:28:46 +02:00
Peter Maydell 65903a8b08 Merge remote-tracking branch 'remotes/bonzini/scsi-next' into staging
* remotes/bonzini/scsi-next:
  megasas: remove buildtime strings
  block: iscsi build fix if LIBISCSI_FEATURE_IOVECTOR is not defined
  virtio-scsi: Plug memory leak on virtio_scsi_push_event() error path
  scsi: Document intentional fall through in scsi_req_length()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-05-22 15:27:46 +01:00
Peter Maydell ca8c0fab95 Block patches
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJTehNaAAoJEH8JsnLIjy/Ws+QP/0C7GTkY+CAgSSoQmIKr96lf
 k0iRIypxMw68S6NEXtiqFn1yLVv41+puamrznmOSBqxZwwWLpu+J28M8i3M3lr2Q
 /bs7l22+HfakfJ6AJkYXaLPqAlRdCnM3HOLWpDuVFLeaLlHV14w4oWCIAs/lA6Tg
 n4S4nKpWUzm97NnNvQjf953kSFZ/xfH72PcICE5vyaQBnrDMMUnRyffdnVBEGMjd
 hhx/demJNXt+01XxC4VrnvpibGvMthEbQoRwemFi2snD6YXhk9XcT+jiD2VMrgCr
 fC316vdAFAiVNvI+JjCRE/1gaMRI+m0tNpymzGWnbnEc8P86KUaitASRc150NDSO
 UgpDg7oneMXC66OdZXG0XqojiAQ8sqHrvMpV+YiirJUbwIcD5ITDKt9omuIjOWjj
 ENXHOk2U87xoFfqBRRbsuO+U2QtfPDFA4jRjh5ppUy/0xuW/YL3SBCSdUHR8jalM
 H8mYcC9zKsL7D71Nh6spU4btNK2xjZT+vPoiurHNyiBSVniHagsKPGtzQCqhJEa+
 y9xCBCyqZvHBvQ2w1pE4DBOIvt3L0kKd7pRxRch9letCA6Fo/ktb7rvkDkcPVh50
 I0kphrnWqGLgC+8oMvh/gjwtzvWkTCfc8jhvzAcBGaInQr+spSaduCAnrGTpfBh1
 vfvc1o3NUVhvqipMgdzq
 =LnQj
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block patches

# gpg: Signature made Mon 19 May 2014 15:21:14 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"

* remotes/kevin/tags/for-upstream: (22 commits)
  block: optimize zero writes with bdrv_write_zeroes
  blockdev: add a function to parse enum ids from strings
  util: add qemu_iovec_is_zero
  qcow1: Stricter backing file length check
  qcow1: Validate image size (CVE-2014-0223)
  qcow1: Validate L2 table size (CVE-2014-0222)
  qcow1: Check maximum cluster size
  qcow1: Make padding in the header explicit
  curl: Add usage documentation
  curl: Add sslverify option
  curl: Remove broken parsing of options from url
  curl: Fix build when curl_multi_socket_action isn't available
  qemu-iotests: Fix blkdebug in VM drive in 030
  qemu-iotests: Fix core dump suppression in test 039
  iotests: Add test for the JSON protocol
  block: Allow JSON filenames
  check-qdict: Add test for qdict_join()
  qdict: Add qdict_join()
  block: add test for vhdx image created by Disk2VHD
  block: vhdx - account for identical header sections
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-05-20 11:57:52 +01:00
Jeff Cody f2564d88fe block: iscsi build fix if LIBISCSI_FEATURE_IOVECTOR is not defined
Commit b03c380 introduced the function
iscsi_allocationmap_is_allocated(), however it is only used within a
code block that is conditionally compiled.  This produces a warning
(error with -werror) of "defined but not used" for the the function, if
LIBISCSI_FEATURE_IOVECTOR is not defined.

This wraps iscsi_allocationmap_is_allocated() in the same conditional.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-20 10:09:45 +02:00
Peter Lieven 465bee1da8 block: optimize zero writes with bdrv_write_zeroes
this patch tries to optimize zero write requests
by automatically using bdrv_write_zeroes if it is
supported by the format.

This significantly speeds up file system initialization and
should speed zero write test used to test backend storage
performance.

I ran the following 2 tests on my internal SSD with a
50G QCOW2 container and on an attached iSCSI storage.

a) mkfs.ext4 -E lazy_itable_init=0,lazy_journal_init=0 /dev/vdX

QCOW2         [off]     [on]     [unmap]
-----
runtime:       14secs    1.1secs  1.1secs
filesize:      937M      18M      18M

iSCSI         [off]     [on]     [unmap]
----
runtime:       9.3s      0.9s     0.9s

b) dd if=/dev/zero of=/dev/vdX bs=1M oflag=direct

QCOW2         [off]     [on]     [unmap]
-----
runtime:       246secs   18secs   18secs
filesize:      51G       192K     192K
throughput:    203M/s    2.3G/s   2.3G/s

iSCSI*        [off]     [on]     [unmap]
----
runtime:       8mins     45secs   33secs
throughput:    106M/s    1.2G/s   1.6G/s
allocated:     100%      100%     0%

* The storage was connected via an 1Gbit interface.
  It seems to internally handle writing zeroes
  via WRITESAME16 very fast.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-05-19 13:42:27 +02:00
Peter Maydell 6a23082b4e Merge remote-tracking branch 'remotes/bonzini/scsi-next' into staging
* remotes/bonzini/scsi-next:
  [PATCH] block/iscsi: bump year in copyright notice
  block/iscsi: allow cluster_size of 4K and greater
  block/iscsi: clarify the meaning of ISCSI_CHECKALLOC_THRES
  block/iscsi: speed up read for unallocated sectors
  block/iscsi: allow fall back to WRITE SAME without UNMAP
  MAINTAINERS: mark megasas as maintained
  megasas: Add MSI support
  megasas: Enable MSI-X support
  megasas: Implement LD_LIST_QUERY
  scsi: Improve error messages more
  scsi-disk: Improve error messager if can't get version number

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-05-19 12:30:06 +01:00
Kevin Wolf d66e5cee00 qcow1: Stricter backing file length check
Like qcow2 since commit 6d33e8e7, error out on invalid lengths instead
of silently truncating them to 1023.

Also don't rely on bdrv_pread() catching integer overflows that make len
negative, but use unsigned variables in the first place.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-05-19 11:36:49 +02:00
Kevin Wolf 46485de0cb qcow1: Validate image size (CVE-2014-0223)
A huge image size could cause s->l1_size to overflow. Make sure that
images never require a L1 table larger than what fits in s->l1_size.

This cannot only cause unbounded allocations, but also the allocation of
a too small L1 table, resulting in out-of-bounds array accesses (both
reads and writes).

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-05-19 11:36:49 +02:00
Kevin Wolf 42eb58179b qcow1: Validate L2 table size (CVE-2014-0222)
Too large L2 table sizes cause unbounded allocations. Images actually
created by qemu-img only have 512 byte or 4k L2 tables.

To keep things consistent with cluster sizes, allow ranges between 512
bytes and 64k (in fact, down to 1 entry = 8 bytes is technically
working, but L2 table sizes smaller than a cluster don't make a lot of
sense).

This also means that the number of bytes on the virtual disk that are
described by the same L2 table is limited to at most 8k * 64k or 2^29,
preventively avoiding any integer overflows.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-05-19 11:36:49 +02:00
Kevin Wolf 7159a45b2b qcow1: Check maximum cluster size
Huge values for header.cluster_bits cause unbounded allocations (e.g.
for s->cluster_cache) and crash qemu this way. Less huge values may
survive those allocations, but can cause integer overflows later on.

The only cluster sizes that qemu can create are 4k (for standalone
images) and 512 (for images with backing files), so we can limit it
to 64k.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-05-19 11:36:49 +02:00
Kevin Wolf ea54feff58 qcow1: Make padding in the header explicit
We were relying on all compilers inserting the same padding in the
header struct that is used for the on-disk format. Let's not do that.
Mark the struct as packed and insert an explicit padding field for
compatibility.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-05-19 11:36:49 +02:00
Matthew Booth 97a3ea5719 curl: Add sslverify option
This allows qemu to use images over https with a self-signed certificate. It
defaults to verifying the certificate.

Signed-off-by: Matthew Booth <mbooth@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-05-19 11:36:49 +02:00
Matthew Booth e3542c67af curl: Remove broken parsing of options from url
The block layer now supports a generic json syntax for passing option parameters
explicitly, making parsing of options from the url redundant.

Signed-off-by: Matthew Booth <mbooth@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-05-19 11:36:49 +02:00
Matthew Booth 9aedd5a5d6 curl: Fix build when curl_multi_socket_action isn't available
Signed-off-by: Matthew Booth <mbooth@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-05-19 11:36:49 +02:00
Jeff Cody 6906046169 block: vhdx - account for identical header sections
The VHDX spec v1.00 declares that "a header is current if it is the only
valid header or if it is valid and its SequenceNumber field is greater
than the other header’s SequenceNumber field. The parser must only use
data from the current header. If there is no current header, then the
VHDX file is corrupt."

However, the Disk2VHD tool from Microsoft creates a VHDX image file that
has 2 identical headers, including matching checksums and matching
sequence numbers.  Likely, as a shortcut the tool is just writing the
header twice, for the active and inactive headers, during the image
creation.  Technically, this should be considered a corrupt VHDX file
(at least per the 1.00 spec, and that is how we currently treat it).

But in order to accomodate images created with Disk2VHD, we can safely
create an exception for this case.  If we find identical sequence
numbers, then we check the VHDXHeader-sized chunks of each 64KB header
sections (we won't rely just on the crc32c to indicate the headers are
the same).  If they are identical, then we go ahead and use the first
one.

Reported-by: Nerijus Baliūnas <nerijus@users.sourceforge.net>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-05-19 11:36:48 +02:00
Max Reitz 4f11aa8a40 block/raw-posix: Try both FIEMAP and SEEK_HOLE
The current version of raw-posix always uses ioctl(FS_IOC_FIEMAP) if
FIEMAP is available; lseek with SEEK_HOLE/SEEK_DATA are not even
compiled in in this case. However, there may be implementations which
support the latter but not the former (e.g., NFSv4.2) as well as vice
versa.

To cover both cases, try FIEMAP first (as this will return -ENOTSUP if
not supported instead of returning a failsafe value (everything
allocated as a single extent)) and if that does not work, fall back to
SEEK_HOLE/SEEK_DATA.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-05-09 20:57:32 +02:00
Peter Krempa 4557117d9e gluster: Correctly propagate errors when volume isn't accessible
The docs for glfs_init suggest that the function sets errno on every
failure. In fact it doesn't. As other functions such as
qemu_gluster_open() in the gluster block code report their errors based
on this fact we need to make sure that errno is set on each failure.

This fixes a crash of qemu-img/qemu when a gluster brick isn't
accessible from given host while the server serving the volume
description is.

Thread 1 (Thread 0x7ffff7fba740 (LWP 203880)):
 #0  0x00007ffff77673f8 in glfs_lseek () from /usr/lib64/libgfapi.so.0
 #1  0x0000555555574a68 in qemu_gluster_getlength ()
 #2  0x0000555555565742 in refresh_total_sectors ()
 #3  0x000055555556914f in bdrv_open_common ()
 #4  0x000055555556e8e8 in bdrv_open ()
 #5  0x000055555556f02f in bdrv_open_image ()
 #6  0x000055555556e5f6 in bdrv_open ()
 #7  0x00005555555c5775 in bdrv_new_open ()
 #8  0x00005555555c5b91 in img_info ()
 #9  0x00007ffff62c9c05 in __libc_start_main () from /lib64/libc.so.6
 #10 0x00005555555648ad in _start ()

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-05-09 20:57:32 +02:00
Fam Zheng 74fe188cd1 vmdk: Implement .bdrv_get_info()
This will return cluster_size and needs_compressed_writes to caller, if all the
extents have the same value (or there's only one extent). Otherwise return
-ENOTSUP.

cluster_size is only reported for sparse formats.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-05-09 13:32:16 +02:00
Fam Zheng ba0ad89e2c vmdk: Implement .bdrv_write_compressed
Add a wrapper function to support "compressed" path in qemu-img convert.
Only support streamOptimized subformat case for now (num_extents == 1
and extent compression is true).

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-05-09 13:32:16 +02:00
Peter Lieven ec209aca83 block/iscsi: bump year in copyright notice
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-05-09 13:32:16 +02:00
Max Reitz 5f4d5e1aa6 block/nfs: Check for NULL server part
After the URL has been parsed make sure the server part is valid in
order to avoid a segmentation fault when calling nfs_mount().

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-05-09 13:32:16 +02:00
Max Reitz 65f33bc002 qcow2: Fix alloc_clusters_noref() overflow detection
If the very first allocation has a length of 0, the free_cluster_index
is still 0 after the for loop, which means that subtracting one from it
will underflow and signal an invalid range of clusters by returning
-EFBIG. However, there is no such range, as its length is 0.

Fix this by preventing underflows on free_cluster_index during the
check.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-05-09 13:32:16 +02:00
Peter Lieven 6a86dec619 [PATCH] block/iscsi: bump year in copyright notice
Signed-off-by: Peter Lieven <pl@kamp.de>
2014-05-05 10:04:30 +02:00
Matthew Booth b7079df410 curl: Fix hang reading from slow connections
When receiving a new aio read request, we first look for an existing
transaction whose range will cover the read request by the time it
completes. However, we weren't checking that the existing transaction
was still active. If it had timed out, we were adding the request to a
transaction which would never complete and had already been cancelled,
resulting in a hang.

Signed-off-by: Matthew Booth <mbooth@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-30 16:34:21 +02:00
Matthew Booth 1f2cead324 curl: Ensure all informationals are checked for completion
According to the documentation, the correct way to ensure all
informationals have been returned by curl_multi_info_read is to loop
until it returns NULL.

Signed-off-by: Matthew Booth <mbooth@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-30 16:34:18 +02:00
Matthew Booth 838ef60249 curl: Eliminate unnecessary use of curl_multi_socket_all
curl_multi_socket_all is a deprecated catch-all which checks for
activities on all open curl sockets. We have enough information from
the event loop to check only the sockets with activity. This change
removes use of curl_multi_socket_all in favour of
curl_multi_socket_action called with the relevant handle.

At the same time, it also ensures that the driver only checks for
completion of read operations after reading from a socket, rather than
both reading and writing.

Signed-off-by: Matthew Booth <mbooth@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-30 16:34:16 +02:00
Matthew Booth b69cdef876 curl: Remove unnecessary explicit calls to internal event handler
Remove calls to curl_multi_do where the relevant handles are already
registered to the event loop.

Ensure that we kick off socket handling with CURL_SOCKET_TIMEOUT after
adding a new handle.

Signed-off-by: Matthew Booth <mbooth@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-30 16:34:14 +02:00
Matthew Booth e466183718 curl: Remove erroneous sleep waiting for curl completion
The driver will not start more than a fixed number of curl sessions.
If it needs more, it must wait for the completion of an existing one.
The driver was sleeping, which will prevent the main loop from
running, and therefore the event it's waiting on. It was also directly
calling its internal handler rather than waiting on existing
registered handlers to be called from the main loop.

This change causes it simply to wait for a period of time whilst
allowing the main loop to execute.

Signed-off-by: Matthew Booth <mbooth@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-30 16:34:11 +02:00
Matthew Booth 38bbc0a580 curl: Fix return from curl_read_cb with invalid state
A curl write callback is supposed to return the number of bytes it
handled.  curl_read_cb would have erroneously reported it had handled
all bytes in the event that the internal curl state was invalid.

Signed-off-by: Matthew Booth <mbooth@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-30 16:34:08 +02:00
Matthew Booth 9e550b3260 curl: Remove unnecessary use of goto
This isn't any of the usually acceptable uses of goto.

Signed-off-by: Matthew Booth <mbooth@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-30 16:34:06 +02:00
Matthew Booth f6246509be curl: Fix long line
Signed-off-by: Matthew Booth <mbooth@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-30 16:33:39 +02:00
Max Reitz 0549ea8b6d block/vdi: Error out immediately in vdi_create()
Currently, if an error occurs during the part of vdi_create() which
actually writes the image, the function stores -errno, but continues
anyway.

Instead of trying to write data which (if it can be written at all) does
not make any sense without the operations before succeeding (e.g.,
writing the image header), just error out immediately.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-30 14:46:17 +02:00
Max Reitz e1b42f456f block/bochs: Fix error handling for seek_to_sector()
Currently, seek_to_sector() returns -1 both for errors and unallocated
sectors, resulting in silent errors. As 0 is an invalid offset of data
clusters (bitmap_offset is greater than 0 because s->data_offset is
greater than 0), just return 0 for unallocated sectors and -errno in
case of error. This should then be propagated by bochs_read(), the sole
user of seek_to_sector().

That function also has a case of "return -1 in case of error", which is
fixed by this patch as well.

bochs_read() is called by bochs_co_read() which passes the return value
through, therefore it is indeed correct for bochs_read() to return
-errno.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-30 14:46:17 +02:00
Max Reitz b93f995081 qcow2: Check min_size in qcow2_grow_l1_table()
First, new_l1_size is an int64_t, whereas min_size is a uint64_t.
Therefore, during the loop which adjusts new_l1_size until it equals or
exceeds min_size, new_l1_size might overflow and become negative. The
comparison in the loop condition however will take it as an unsigned
value (because min_size is unsigned) and therefore recognize it as
exceeding min_size. Therefore, the loop is left with a negative
new_l1_size, which is not correct. This could be fixed by making
new_l1_size uint64_t.

On the other hand, however, by doing this, the while loop may take
forever. If min_size is e.g. UINT64_MAX, it will take new_l1_size
probably multiple overflows to reach the exact same value (if it reaches
it at all). Then, right after the loop, new_l1_size will be recognized
as being too big anyway.

Both problems require a ridiculously high min_size value, which is very
unlikely to occur; but both problems are also simply avoided by checking
whether min_size is sane before calculating new_l1_size (which should
still be checked separately, though).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-30 14:46:17 +02:00
Max Reitz a49139af77 qcow2: Catch bdrv_getlength() error
The call to bdrv_getlength() from qcow2_check_refcounts() may result in
an error. Check this and abort if necessary.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-30 14:46:17 +02:00
Max Reitz 521b2b5df0 block: Use correct width in format strings
Instead of blindly relying on a normal integer having a width of 32 bits
(which is a pretty good assumption, but we should not rely on it if
there is no need), use the correct format string macros.

This does not touch DEBUG output.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-30 14:46:17 +02:00
Max Reitz 91f827dcff qcow2: Avoid overflow in alloc_clusters_noref()
alloc_clusters_noref() stores the cluster index in a uint64_t. However,
offsets are often represented as int64_t (as for example the return
value of alloc_clusters_noref() itself demonstrates). Therefore, we
should make sure all offsets in the allocated range of clusters are
representable using int64_t without overflows.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-30 14:46:13 +02:00
Max Reitz 35d0d40a03 block: Use error_abort in bdrv_image_info_specific_dump()
Currently, bdrv_image_info_specific_dump() uses an error variable for
visit_type_ImageInfoSpecific, but ignores the result. As this function
is used here with an output visitor to transform the ImageInfoSpecific
object to a generic QDict, an error should actually be impossible. It is
however better to assert that this is indeed the case. This is done by
this patch using error_abort instead of an unused local Error variable.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-30 12:43:30 +02:00
Kevin Wolf 8bfea15dda block: Unlink temporary files in raw-posix/win32
Instead of having unlink() calls in the generic block layer, where we
aren't even guarateed to have a file name, move them to those block
drivers that are actually used and that always have a filename. Gets us
rid of some #ifdefs as well.

The patch also converts bs->is_temporary to a new BDRV_O_TEMPORARY open
flag so that it is inherited in the protocol layer and the raw-posix and
raw-win32 drivers can unlink the file.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-04-30 11:05:00 +02:00
Max Reitz c883db0df9 qcow2: Fix discard
discard_single_l2() should not implement its own version of
qcow2_get_cluster_type(), but rather rely on this already existing
function. By doing so, it will work for compressed clusters as well
(which it did not so far).

Also, rename "old_offset" to "old_l2_entry", as both are quite different
(and the value is indeed of the latter kind).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-29 16:39:51 +02:00
Fam Zheng c3cc95bd15 mirror: Check for bdrv_get_info result
bdrv_get_info could fail. Add check before using the returned value.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-29 13:43:08 +02:00
Fam Zheng 373df5b135 mirror: Fix resource leak when bdrv_getlength fails
The direct return will skip releasing of all the resouces at
immediate_exit, don't miss that.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-29 13:43:00 +02:00
Peter Lieven 3d2acaa308 block/iscsi: allow cluster_size of 4K and greater
depending on the target the opt_unmap_gran might be as low
as 4K. As we know use this also as a knob to activate the allocationmap
feature lower the barrier. The limit 4K (and not 512) is choosen
to avoid a potentially too big allocationmap.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-04-29 11:15:01 +02:00
Peter Lieven 5917af812e block/iscsi: clarify the meaning of ISCSI_CHECKALLOC_THRES
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-04-29 11:14:41 +02:00
Peter Lieven b03c38057b block/iscsi: speed up read for unallocated sectors
this patch implements a cache that tracks if a page on the
iscsi target is allocated or not. The cache is implemented in
a way that it allows for false positives
(e.g. pretending a page is allocated, but it isn't), but
no false negatives.

The cached allocation info is then used to speed up the
read process for unallocated sectors by issueing a GET_LBA_STATUS
request for all sectors that are not yet known to be allocated.
If the read request is confirmed to fall into an unallocated
range we directly return zeroes and do not transfer the
data over the wire.

Tests have shown that a relatively small amount of GET_LBA_STATUS
requests happens a vServer boot time to fill the allocation cache
(all those blocks are not queried again).

Not to transfer all the data of unallocated sectors saves a lot
of time, bandwidth and storage I/O load during block jobs or storage
migration and it saves a lot of bandwidth as well for any big sequential
read of the whole disk (e.g. block copy or speed tests) if a significant
number of blocks is unallocated.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-04-29 11:14:25 +02:00
Fam Zheng f0e9736012 mirror: Use DIV_ROUND_UP
Although bdrv_getlength() was just called above this, and checked for
error, it is better to just use the value we already get, and use
DIV_ROUND_UP.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-28 17:36:30 +02:00
Peter Lieven dbe5c58f2a block/iscsi: allow fall back to WRITE SAME without UNMAP
if the iscsi driver receives a write zeroes request with
the BDRV_REQ_MAY_UNMAP flag set it fails with -ENOTSUP
if the iscsi target does not support WRITE SAME with
UNMAP. However, the BDRV_REQ_MAY_UNMAP is only a hint
and writing zeroes with WRITE SAME will still be
better than falling back to writing zeroes with WRITE16.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-04-28 15:04:17 +02:00
Peter Maydell 13de54eedd Merge remote-tracking branch 'remotes/qmp-unstable/queue/qmp' into staging
* remotes/qmp-unstable/queue/qmp:
  monitor: fix qmp_getfd() fd leak in error case
  HMP: support specifying dump format for dump-guest-memory
  HMP: fix doc of dump-guest-memory
  qmp: object-add: Validate class before creating object
  monitor: Add device_add and device_del completion.
  monitor: Add command_completion callback to mon_cmd_t.
  monitor: Fix drive_del id argument type completion.
  error: Remove some unused headers
  qerror.h: Replace QERR_NOT_SUPPORTED with QERR_UNSUPPORTED
  qerror.h: Remove QERR defines that are only used once
  qerror.h: Remove unused error classes
  error: Print error_report() to stderr if using qmp
  monitor: Remove unused monitor_print_filename
  error: Privatize error_print_loc
  vnc: Remove default_mon usage
  slirp: Remove default_mon usage

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-04-28 12:56:34 +01:00
Markus Armbruster 172fc4dd33 iscsi: Don't use error_is_set() to suppress additional errors
Using error_is_set(errp) that way can sweep programming errors under
the carpet when we get called incorrectly with an error set.

Commit 24d3bd6 added a broken error path to iscsi_do_inquiry(): it
first calls error_setg(), then jumps to the preexisting error label,
where error_setg() gets called again, triggering an assertion failure.

Commit cbee81f fixed this by guarding the second error_setg() with an
error_is_set().

Replace this fix by a simpler and safer one: jump right behind the
second error_setg().

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-25 18:05:06 +02:00
Markus Armbruster 92de901290 nbd: Use return values instead of error_is_set(errp)
Using error_is_set(errp) to check whether a function call failed is
fragile: it breaks when errp is null.  Check perfectly suitable return
values instead when possible.  errp can't be null there now, but this
is more robust and more obviously correct

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-25 18:05:06 +02:00
Markus Armbruster 0fb6395c0c Use error_is_set() only when necessary (again)
error_is_set(&var) is the same as var != NULL, but it takes
whole-program analysis to figure that out.  Unnecessarily hard for
optimizers, static checkers, and human readers.  Commit 84d18f0 dumbed
it down to obvious, but a few more have crept in since, and
documentation was overlooked.  Dumb these down, too.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-25 18:05:06 +02:00
Cole Robinson f231b88db1 qerror.h: Remove QERR defines that are only used once
Just hardcode them in the callers

Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-04-25 09:19:59 -04:00
Stefan Hajnoczi 370e681629 block/cloop: use PRIu32 format specifier for uint32_t
PRIu32 is the format string specifier for uint32_t, let's use it.
Variables ->block_size, ->n_blocks, and i are all uint32_t.

Suggested-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-23 11:34:10 +02:00
Fam Zheng 9b17031ac4 vmdk: Fix "%x" to PRIx32 in format strings for cid
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-22 14:14:30 +02:00
Kevin Wolf 98522f63f4 block: Add errp to bdrv_new()
This patch adds an errp parameter to bdrv_new() and updates all its
callers. The next patches will make use of this in order to check for
duplicate IDs. Most of the callers know that their ID is fine, so they
can simply assert that there is no error.

Behaviour doesn't change with this patch yet as bdrv_new() doesn't
actually assign errors to errp.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-04-22 12:00:20 +02:00
Aakriti Gupta 5ff679b47d convert fprintf() calls to error_setg() in block/qed.c:bdrv_qed_create()
This patch converts fprintf() calls to error_setg() in block/qed.c:bdrv_qed_create()
(error_setg() is part of error reporting API in include/qapi/error.h)

Signed-off-by: Aakriti Gupta <aakritty@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-22 11:57:02 +02:00
Maria Kustova acd7fdc6d8 curl: Replaced old error handling with error reporting API.
Signed-off-by: Maria Kustova <maria.k@catit.be>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-22 11:57:02 +02:00
Fam Zheng b8afb520e4 block: Handle error of bdrv_getlength in bdrv_create_dirty_bitmap
bdrv_getlength could fail, check the return value before using it.
Return NULL and set errno if it fails. Callers are updated to handle
the error case.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-22 11:57:02 +02:00
Fam Zheng 4ab9dab5b9 vmdk: Fix %d and %lld to PRI* in format strings
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-22 11:57:02 +02:00
Fam Zheng cd82b6fb4d iscsi: Remember to set ret for iscsi_open in error case
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-11 13:59:49 +02:00
Kevin Wolf 715c3f60ef bochs: Fix catalog size check
The old check was off by a factor of 512 and didn't consider cases where
we don't get an exact division. This could lead to an out-of-bounds
array access in seek_to_sector().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2014-04-11 13:59:49 +02:00
Kevin Wolf 28ec11bc88 bochs: Fix memory leak in bochs_open() error path
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2014-04-11 13:59:49 +02:00
Kevin Wolf 8885eadedd qcow2: Put cache reference in error case
When qcow2_get_cluster_offset() sees a zero cluster in a version 2
image, it (rightfully) returns an error. But in doing so it shouldn't
leak an L2 table cache reference.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-04-04 17:10:08 +02:00
Kevin Wolf 4c2e5f8f46 qcow2: Flush metadata during read-only reopen
If lazy refcounts are enabled for a backing file, committing to this
backing file may leave it in a dirty state even if the commit succeeds.
The reason is that the bdrv_flush() call in bdrv_commit() doesn't flush
refcount updates with lazy refcounts enabled, and qcow2_reopen_prepare()
doesn't take care to flush metadata.

In order to fix this, this patch also fixes qcow2_mark_clean(), which
contains another ineffective bdrv_flush() call beause lazy refcounts are
disabled only afterwards. All existing callers of qcow2_mark_clean()
either don't modify refcounts or already flush manually, so that this
fixes only a latent, but not yet actually triggerable bug.

Another instance of the same problem is live snapshots. Again, a real
corruption is prevented by an explicit flush for non-read-only images in
external_snapshot_prepare(), but images using lazy refcounts stay dirty.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-04 14:12:26 +02:00
Fam Zheng cbee81f6de iscsi: Don't set error if already set in iscsi_do_inquiry
This eliminates the possible assertion failure in error_setg().

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-04 14:11:34 +02:00
Peter Maydell 784a5592c9 Merge remote-tracking branch 'remotes/bonzini/scsi-next' into staging
* remotes/bonzini/scsi-next:
  iscsi: always query max WRITE SAME length
  iscsi: ignore flushes on scsi-generic devices
  iscsi: recognize "invalid field" ASCQ from WRITE SAME command
  scsi-bus: remove bogus assertion

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-04-03 12:24:35 +01:00
Paolo Bonzini c97ca29db0 iscsi: always query max WRITE SAME length
Max WRITE SAME length is also used when the UNMAP bit is zero, so it
should be queried even if LBPWS=0.  Same for the optimal transfer
length.

However, the write_zeroes_alignment only matters for UNMAP=1 so we
still restrict it to LBPWS=1.

Reviewed-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-04-03 13:10:53 +02:00
Paolo Bonzini b2f9c08a4f iscsi: ignore flushes on scsi-generic devices
Non-block SCSI devices do not support flushing, but we may still send
them requests via bdrv_flush_all.  Just ignore them.

Reviewed-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-04-03 13:10:45 +02:00
Paolo Bonzini 27898a5daa iscsi: recognize "invalid field" ASCQ from WRITE SAME command
Some targets may return "invalid field" as the ASCQ from WRITE SAME
if they support the command only without the UNMAP field.  Recognize
that, and return ENOTSUP just like for "invalid operation code".

Reviewed-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-04-03 13:10:32 +02:00
Stefan Hajnoczi c792707f54 qcow2: link all L2 meta updates in preallocate()
preallocate() only links the first QCowL2Meta's data clusters into the
L2 table and ignores any chained QCowL2Metas in the linked list.

Chains of QCowL2Meta structs are built up when contiguous clusters span
L2 tables.  Each QCowL2Meta describes one L2 table update.  This is a
rare case in preallocate() but can happen.

This patch fixes preallocate() by iterating over the whole list of
QCowL2Metas.  Compare with the qcow2_co_writev() function's
implementation, which is similar but also also handles request
dependencies.  preallocate() only performs one allocation at a time so
there can be no dependencies.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:35 +02:00
Kevin Wolf 9302e863aa parallels: Sanity check for s->tracks (CVE-2014-0142)
This avoids a possible division by zero.

Convert s->tracks to unsigned as well because it feels better than
surviving just because the results of calculations with s->tracks are
converted to unsigned anyway.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:35 +02:00
Kevin Wolf afbcc40bee parallels: Fix catalog size integer overflow (CVE-2014-0143)
The first test case would cause a huge memory allocation, leading to a
qemu abort; the second one to a too small malloc() for the catalog
(smaller than s->catalog_size), which causes a read-only out-of-bounds
array access and on big endian hosts an endianess conversion for an
undefined memory area.

The sample image used here is not an original Parallels image. It was
created using an hexeditor on the basis of the struct that qemu uses.
Good enough for trying to crash the driver, but not for ensuring
compatibility.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:35 +02:00
Kevin Wolf 5dae6e30c5 qcow2: Limit snapshot table size
Even with a limit of 64k snapshots, each snapshot could have a filename
and an ID with up to 64k, which would still lead to pretty large
allocations, which could potentially lead to qemu aborting. Limit the
total size of the snapshot table to an average of 1k per entry when
the limit of 64k snapshots is fully used. This should be plenty for any
reasonable user.

This also fixes potential integer overflows of s->snapshot_size.

Suggested-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:35 +02:00
Kevin Wolf 6a83f8b5be qcow2: Check maximum L1 size in qcow2_snapshot_load_tmp() (CVE-2014-0143)
This avoids an unbounded allocation.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:35 +02:00
Kevin Wolf c05e4667be qcow2: Fix L1 allocation size in qcow2_snapshot_load_tmp() (CVE-2014-0145)
For the L1 table to loaded for an internal snapshot, the code allocated
only enough memory to hold the currently active L1 table. If the
snapshot's L1 table is actually larger than the current one, this leads
to a buffer overflow.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:35 +02:00
Kevin Wolf 11b128f406 qcow2: Fix NULL dereference in qcow2_open() error path (CVE-2014-0146)
The qcow2 code assumes that s->snapshots is non-NULL if s->nb_snapshots
!= 0. By having the initialisation of both fields separated in
qcow2_open(), any error occuring in between would cause the error path
to dereference NULL in qcow2_free_snapshots() if the image had any
snapshots.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:35 +02:00
Kevin Wolf 6b7d4c5558 qcow2: Fix copy_sectors() with VM state
bs->total_sectors is not the highest possible sector number that could
be involved in a copy on write operation: VM state is after the end of
the virtual disk. This resulted in wrong values for the number of
sectors to be copied (n).

The code that checks for the end of the image isn't required any more
because the code hasn't been calling the block layer's bdrv_read() for a
long time; instead, it directly calls qcow2_readv(), which doesn't error
out on VM state sector numbers.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:35 +02:00
Stefan Hajnoczi f0dce23475 dmg: prevent chunk buffer overflow (CVE-2014-0145)
Both compressed and uncompressed I/O is buffered.  dmg_open() calculates
the maximum buffer size needed from the metadata in the image file.

There is currently a buffer overflow since ->lengths[] is accounted
against the maximum compressed buffer size but actually uses the
uncompressed buffer:

  switch (s->types[chunk]) {
  case 1: /* copy */
      ret = bdrv_pread(bs->file, s->offsets[chunk],
                       s->uncompressed_chunk, s->lengths[chunk]);

We must account against the maximum uncompressed buffer size for type=1
chunks.

This patch fixes the maximum buffer size calculation to take into
account the chunk type.  It is critical that we update the correct
maximum since there are two buffers ->compressed_chunk and
->uncompressed_chunk.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:35 +02:00
Stefan Hajnoczi 686d7148ec dmg: use uint64_t consistently for sectors and lengths
The DMG metadata is stored as uint64_t, so use the same type for
sector_num.  int was a particularly poor choice since it is only 32-bit
and would truncate large values.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:35 +02:00
Stefan Hajnoczi c165f77580 dmg: sanitize chunk length and sectorcount (CVE-2014-0145)
Chunk length and sectorcount are used for decompression buffers as well
as the bdrv_pread() count argument.  Ensure that they have reasonable
values so neither memory allocation nor conversion from uint64_t to int
will cause problems.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:35 +02:00
Stefan Hajnoczi eb71803b04 dmg: use appropriate types when reading chunks
Use the right types instead of signed int:

  size_t new_size;

  This is a byte count for g_realloc() that is calculated from uint32_t
  and size_t values.

  uint32_t chunk_count;

  Use the same type as s->n_chunks, which is used together with
  chunk_count.

This patch is a cleanup and does not fix bugs.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:35 +02:00
Stefan Hajnoczi b404bf8542 dmg: drop broken bdrv_pread() loop
It is not necessary to check errno for EINTR and the block layer does
not produce short reads.  Therefore we can drop the loop that attempts
to read a compressed chunk.

The loop is buggy because it incorrectly adds the transferred bytes
twice:

  do {
      ret = bdrv_pread(...);
      i += ret;
  } while (ret >= 0 && ret + i < s->lengths[chunk]);

Luckily we can drop the loop completely and perform a single
bdrv_pread().

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:35 +02:00
Stefan Hajnoczi 73ed27ec28 dmg: prevent out-of-bounds array access on terminator
When a terminator is reached the base for offsets and sectors is stored.
The following records that are processed will use this base value.

If the first record we encounter is a terminator, then calculating the
base values would result in out-of-bounds array accesses.  Don't do
that.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:35 +02:00
Stefan Hajnoczi 2c1885adcf dmg: coding style and indentation cleanup
Clean up the mix of tabs and spaces, as well as the coding style
violations in block/dmg.c.  There are no semantic changes since this
patch simply reformats the code.

This patch is necessary before we can make meaningful changes to this
file, due to the inconsistent formatting and confusing indentation.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:35 +02:00
Kevin Wolf cab60de930 qcow2: Fix new L1 table size check (CVE-2014-0143)
The size in bytes is assigned to an int later, so check that instead of
the number of entries.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:35 +02:00
Kevin Wolf 0abe740f1d qcow2: Protect against some integer overflows in bdrv_check
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:35 +02:00
Kevin Wolf bb572aefbd qcow2: Fix types in qcow2_alloc_clusters and alloc_clusters_noref
In order to avoid integer overflows.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:34 +02:00
Kevin Wolf 2b5d5953ee qcow2: Check new refcount table size on growth
If the size becomes larger than what qcow2_open() would accept, fail the
growing operation.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:34 +02:00
Kevin Wolf db8a31d11d qcow2: Avoid integer overflow in get_refcount (CVE-2014-0143)
This ensures that the checks catch all invalid cluster indexes
instead of returning the refcount of a wrong cluster.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:34 +02:00
Kevin Wolf b106ad9185 qcow2: Don't rely on free_cluster_index in alloc_refcount_block() (CVE-2014-0147)
free_cluster_index is only correct if update_refcount() was called from
an allocation function, and even there it's brittle because it's used to
protect unfinished allocations which still have a refcount of 0 - if it
moves in the wrong place, the unfinished allocation can be corrupted.

So not using it any more seems to be a good idea. Instead, use the
first requested cluster to do the calculations. Return -EAGAIN if
unfinished allocations could become invalid and let the caller restart
its search for some free clusters.

The context of creating a snapsnot is one situation where
update_refcount() is called outside of a cluster allocation. For this
case, the change fixes a buffer overflow if a cluster is referenced in
an L2 table that cannot be represented by an existing refcount block.
(new_table[refcount_table_index] was out of bounds)

[Bump the qemu-iotests 026 refblock_alloc.write leak count from 10 to
11.
--Stefan]

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:21:03 +02:00
Kevin Wolf 6d33e8e7dc qcow2: Fix backing file name length check
len could become negative and would pass the check then. Nothing bad
happened because bdrv_pread() happens to return an error for negative
length values, but make variables for sizes unsigned anyway.

This patch also changes the behaviour to error out on invalid lengths
instead of silently truncating it to 1023.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 14:19:09 +02:00
Kevin Wolf 2d51c32c4b qcow2: Validate active L1 table offset and size (CVE-2014-0144)
This avoids an unbounded allocation.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 14:19:09 +02:00
Kevin Wolf ce48f2f441 qcow2: Validate snapshot table offset/size (CVE-2014-0144)
This avoid unbounded memory allocation and fixes a potential buffer
overflow on 32 bit hosts.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 14:19:09 +02:00
Kevin Wolf 8c7de28305 qcow2: Validate refcount table offset
The end of the refcount table must not exceed INT64_MAX so that integer
overflows are avoided.

Also check for misaligned refcount table. Such images are invalid and
probably the result of data corruption. Error out to avoid further
corruption.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 14:19:09 +02:00
Kevin Wolf 5dab2faddc qcow2: Check refcount table size (CVE-2014-0144)
Limit the in-memory reference count table size to 8 MB, it's enough in
practice. This fixes an unbounded allocation as well as a buffer
overflow in qcow2_refcount_init().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 14:19:09 +02:00
Kevin Wolf a1b3955c94 qcow2: Check backing_file_offset (CVE-2014-0144)
Header, header extension and the backing file name must all be stored in
the first cluster. Setting the backing file to a much higher value
allowed header extensions to become much bigger than we want them to be
(unbounded allocation).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 14:19:09 +02:00
Kevin Wolf 24342f2cae qcow2: Check header_length (CVE-2014-0144)
This fixes an unbounded allocation for s->unknown_header_fields.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 14:19:09 +02:00
Fam Zheng 6d4b9e55fc curl: check data size before memcpy to local buffer. (CVE-2014-0144)
curl_read_cb is callback function for libcurl when data arrives. The
data size passed in here is not guaranteed to be within the range of
request we submitted, so we may overflow the guest IO buffer. Check the
real size we have before memcpy to buffer to avoid overflow.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 14:19:09 +02:00
Jeff Cody 1d7678dec4 vhdx: Bounds checking for block_size and logical_sector_size (CVE-2014-0148)
Other variables (e.g. sectors_per_block) are calculated using these
variables, and if not range-checked illegal values could be obtained
causing infinite loops and other potential issues when calculating
BAT entries.

The 1.00 VHDX spec requires BlockSize to be min 1MB, max 256MB.
LogicalSectorSize is required to be either 512 or 4096 bytes.

Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 14:19:09 +02:00
Jeff Cody 63fa06dc97 vdi: add bounds checks for blocks_in_image and disk_size header fields (CVE-2014-0144)
The maximum blocks_in_image is 0xffffffff / 4, which also limits the
maximum disk_size for a VDI image to 1024TB.  Note that this is the maximum
size that QEMU will currently support with this driver, not necessarily the
maximum size allowed by the image format.

This also fixes an incorrect error message, a bug introduced by commit
5b7aa9b56d (Reported by Stefan Weil)

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 14:06:31 +02:00
Kevin Wolf 5e71dfad76 vpc: Validate block size (CVE-2014-0142)
This fixes some cases of division by zero crashes.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 13:59:47 +02:00
Jeff Cody 97f1c45c6f vpc/vhd: add bounds check for max_table_entries and block_size (CVE-2014-0144)
This adds checks to make sure that max_table_entries and block_size
are in sane ranges.  Memory is allocated based on max_table_entries,
and block_size is used to calculate indices into that allocated
memory, so if these values are incorrect that can lead to potential
unbounded memory allocation, or invalid memory accesses.

Also, the allocation of the pagetable is changed from g_malloc0()
to qemu_blockalign().

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 13:59:47 +02:00
Kevin Wolf a9ba36a45d bochs: Fix bitmap offset calculation
32 bit truncation could let us access the wrong offset in the image.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 13:59:47 +02:00
Kevin Wolf 8e53abbc20 bochs: Check extent_size header field (CVE-2014-0142)
This fixes two possible division by zero crashes: In bochs_open() and in
seek_to_sector().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 13:59:47 +02:00
Kevin Wolf e3737b820b bochs: Check catalog_size header field (CVE-2014-0143)
It should neither become negative nor allow unbounded memory
allocations. This fixes aborts in g_malloc() and an s->catalog_bitmap
buffer overflow on big endian hosts.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 13:59:47 +02:00
Kevin Wolf 246f65838d bochs: Use unsigned variables for offsets and sizes (CVE-2014-0147)
Gets us rid of integer overflows resulting in negative sizes which
aren't correctly checked.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 13:59:47 +02:00
Kevin Wolf 3dd8a6763b bochs: Unify header structs and make them QEMU_PACKED
This is an on-disk structure, so offsets must be accurate.

Before this patch, sizeof(bochs) != sizeof(header_v1), which makes the
memcpy() between both invalid. We're lucky enough that the destination
buffer happened to be the larger one, and the memcpy size to be taken
from the smaller one, so we didn't get a buffer overflow in practice.

This patch unifies the both structures, eliminating the need to do a
memcpy in the first place. The common fields are extracted to the top
level of the struct and the actually differing part gets a union of the
two versions.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 13:59:47 +02:00
Stefan Hajnoczi 42d43d35d9 block/cloop: fix offsets[] size off-by-one
cloop stores the number of compressed blocks in the n_blocks header
field.  The file actually contains n_blocks + 1 offsets, where the extra
offset is the end-of-file offset.

The following line in cloop_read_block() results in an out-of-bounds
offsets[] access:

    uint32_t bytes = s->offsets[block_num + 1] - s->offsets[block_num];

This patch allocates and loads the extra offset so that
cloop_read_block() works correctly when the last block is accessed.

Notice that we must free s->offsets[] unconditionally now since there is
always an end-of-file offset.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 13:59:47 +02:00
Stefan Hajnoczi f56b9bc3ae block/cloop: refuse images with bogus offsets (CVE-2014-0144)
The offsets[] array allows efficient seeking and tells us the maximum
compressed data size.  If the offsets are bogus the maximum compressed
data size will be unrealistic.

This could cause g_malloc() to abort and bogus offsets mean the image is
broken anyway.  Therefore we should refuse such images.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 13:59:47 +02:00
Stefan Hajnoczi 7b103b36d6 block/cloop: refuse images with huge offsets arrays (CVE-2014-0144)
Limit offsets_size to 512 MB so that:

1. g_malloc() does not abort due to an unreasonable size argument.

2. offsets_size does not overflow the bdrv_pread() int size argument.

This limit imposes a maximum image size of 16 TB at 256 KB block size.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 13:59:47 +02:00
Stefan Hajnoczi 509a41bab5 block/cloop: prevent offsets_size integer overflow (CVE-2014-0143)
The following integer overflow in offsets_size can lead to out-of-bounds
memory stores when n_blocks has a huge value:

    uint32_t n_blocks, offsets_size;
    [...]
    ret = bdrv_pread(bs->file, 128 + 4, &s->n_blocks, 4);
    [...]
    s->n_blocks = be32_to_cpu(s->n_blocks);

    /* read offsets */
    offsets_size = s->n_blocks * sizeof(uint64_t);
    s->offsets = g_malloc(offsets_size);

    [...]

    for(i=0;i<s->n_blocks;i++) {
        s->offsets[i] = be64_to_cpu(s->offsets[i]);

offsets_size can be smaller than n_blocks due to integer overflow.
Therefore s->offsets[] is too small when the for loop byteswaps offsets.

This patch refuses to open files if offsets_size would overflow.

Note that changing the type of offsets_size is not a fix since 32-bit
hosts still only have 32-bit size_t.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 13:59:47 +02:00
Stefan Hajnoczi d65f97a82c block/cloop: validate block_size header field (CVE-2014-0144)
Avoid unbounded s->uncompressed_block memory allocation by checking that
the block_size header field has a reasonable value.  Also enforce the
assumption that the value is a non-zero multiple of 512.

These constraints conform to cloop 2.639's code so we accept existing
image files.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 13:59:47 +02:00
Prasad Joshi c5a33ee9ee qcow2: fix two memory leaks in qcow2_open error code path
Signed-off-by: Prasad Joshi <prasadjoshi.linux@gmail.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 13:49:53 +02:00
Markus Armbruster 4c7096607d vvfat: Fix :floppy: option to suppress partition table
Regressed in commit 7ad9be6, v1.5.0.

Reported-by: Kiyokazu SUTO <suto@ks-and-ks.ne.jp>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 13:49:53 +02:00
Stefan Hajnoczi 7b770c720b mirror: fix early wake from sleep due to aio
The mirror blockjob coroutine rate-limits itself by sleeping.  The
coroutine also performs I/O asynchronously so it's important that the
aio callback doesn't wake the coroutine early as that breaks
rate-limiting.

Reported-by: Joaquim Barrera <jbarrera@ac.upc.edu>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-03-25 14:09:50 +01:00
Paolo Bonzini cc8c9d6c6f mirror: fix throttling delay calculation
The throttling delay calculation was using an inaccurate sector count to
calculate the time to sleep.  This broke rate-limiting for the block
mirror job.

Move the delay calculation into mirror_iteration() where we know how
many sectors were transferred.  This lets us calculate an accurate delay
time.

Reported-by: Joaquim Barrera <jbarrera@ac.upc.edu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-03-25 14:09:50 +01:00
Deepak Kathayat dc6fb73d21 Fixed various typos
Signed-off-by: Deepak Kathayat <deepak.mk17@gmail.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-03-25 14:09:50 +01:00
Peter Lieven 20fccb187c block/nfs: report errors from libnfs
if an NFS operation fails we should report what libnfs knows
about the failure. It is likely more than just an error code.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-03-19 09:39:41 +01:00
Max Reitz a134d90f50 qcow2: Fix fail path in realloc_refcount_block()
If qcow2_alloc_clusters() fails, new_offset and ret will both be
negative after the fail label, thus passing the first if condition and
subsequently resulting in a call of qcow2_free_clusters() with an
invalid (negative) offset parameter. Fix this by introducing a new label
"fail_free_cluster" which is only invoked if new_offset is indeed
pointing to a newly allocated cluster that should be cleaned up by
freeing it.

While we're at it, clean up the whole fail path. qcow2_cache_put()
should (and actually can) never fail, hence the return value can safely
be ignored (aside from asserting that it indeed did not fail).

Furthermore, there is no reason to give QCOW2_DISCARD_ALWAYS to
qcow2_free_clusters(), a mere QCOW2_DISCARD_OTHER will suffice.

Ultimately, rename the "fail" label to "done", as it is invoked both on
failure and success.

Suggested-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-03-19 09:39:41 +01:00
Max Reitz 8a15b813e6 qcow2: Correct comment for realloc_refcount_block()
Contrary to the comment describing this function's behavior, it does not
return 0 on success, but rather the offset of the newly allocated
cluster. This patch adjusts the comment accordingly to reflect the
actual behavior.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-03-19 09:39:41 +01:00
Kevin Wolf 5a8a30db47 block: Add error handling to bdrv_invalidate_cache()
If it returns an error, the migrated VM will not be started, but qemu
exits with an error message.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-03-19 09:39:41 +01:00
Stefan Hajnoczi 4a41a2d68a nbd: close socket if connection breaks
nbd_receive_reply() is called by the event loop whenever data is
available or the socket has been closed by the remote side.

This patch closes the socket when an error occurs to prevent the
nbd_receive_reply() handler from being called indefinitely after the
connection has failed.

Note that we were already correctly returning EIO for pending requests
but leaving the nbd_receive_reply() handler registered resulted in high
CPU consumption and a flood of error messages.

Reuse nbd_teardown_connection() to close the socket.

Reported-by: Zhifeng Cai <bluewindow@h3c.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-03-14 16:28:28 +01:00