Commit Graph

2712 Commits (2f114315dcf239bc513f18ae0b04b5df81cae059)

Author SHA1 Message Date
Fam Zheng 27685a8dd0 dmg: Move libbz2 code to dmg-bz2.so
dmg.o was moved to block-obj-m in 5505e8b76 to become a separate module,
so that its reference to libbz2, since 6b383c08c, doesn't add an extra
library to the main executable.

Until recently, commit 06e60f70a (blockdev: Add dynamic module loading
for block drivers) moved it back to block-obj-y to simplify the design
of dynamic loading of block modules. But we don't want to lose the
feature of less library dependency on the main executable.

The solution here is to move only the bz2 related code to a separate
DSO file, and load it when dmg_open is called.

dmg_probe doesn't depend on bz2 support to work, and is the only code in
this file which can run before dmg_open.

While we are at it, fix the unhelpful cast of last argument passed to
dmg_uncompress_bz2.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1473043845-13197-4-git-send-email-famz@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-10-07 14:14:06 +02:00
Kevin Wolf 2d76e724cf block: Add qdev ID to DEVICE_TRAY_MOVED
The event currently only contains the BlockBackend name. However, with
anonymous BlockBackends, this is always the empty string. Add the qdev
ID (or if none was given, the QOM path) so that the user can still see
which device caused the event.

Event generation has to be moved from bdrv_eject() to the BlockBackend
because the BDS doesn't know the attached device, but that's easy
because blk_eject() is the only user of it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-10-07 13:34:22 +02:00
Kevin Wolf bbc8ea98bc block-backend: Remember if attached device is non-qdev
Almost all block devices are qdevified by now. This allows us to go back
from the BlockBackend to the DeviceState. xen_disk is the last device
that is missing. We'll remember in the BlockBackend if a xen_disk is
attached and can then disable any features that require going from a BB
to the DeviceState.

While at it, clearly mark the function used by xen_disk as legacy even
in its name, not just in TODO comments.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-10-07 13:34:22 +02:00
Kevin Wolf 2bf7e10f78 block: Add node name to BLOCK_IO_ERROR event
The event currently only contains the BlockBackend name. However, with
anonymous BlockBackends, this is always the empty string. Add the node
name so that the user can still see which block device caused the event.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-10-07 13:34:22 +02:00
Paolo Bonzini fffb6e1223 block: use aio_bh_schedule_oneshot
This simplifies bottom half handlers by removing calls to qemu_bh_delete and
thus removing the need to stash the bottom half pointer in the opaque
datum.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-10-07 13:34:07 +02:00
Paolo Bonzini 818bbc86c9 block: use bdrv_add_before_write_notifier
Register the notifier using the specific API for block devices.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-10-07 13:34:07 +02:00
Kevin Wolf b85114f8cf block: Use 'detect-zeroes' option for 'blockdev-change-medium'
Instead of modifying the new BDS after it has been opened, use the newly
supported 'detect-zeroes' option in bdrv_open_common() so that all
requirements are checked (detect-zeroes=unmap requires discard=unmap).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-09-29 14:13:39 +02:00
Kevin Wolf 0a4279d97c block/qapi: Move 'aio' option to file driver
The option whether or not to use a native AIO interface really isn't a
generic option for all drivers, but only applies to the native file
protocols. This patch moves the option in blockdev-add to the
appropriate places (raw-posix and raw-win32).

We still have to keep the flag BDRV_O_NATIVE_AIO for compatibility
because so far the AIO option was usually specified on the wrong layer
(the top-level format driver, which didn't even look at it) and then
inherited by the protocol driver (where it was actually used). We can't
forbid this use except in new interfaces.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-09-29 14:13:39 +02:00
John Snow 49137bf684 block-backend: remove blk_flush_all
We can teach Xen to drain and flush each device as it needs to, instead
of trying to flush ALL devices. This removes the last user of
blk_flush_all.

The function is therefore removed under the premise that any new uses
of blk_flush_all would be the wrong paradigm: either flush the single
device that requires flushing, or use an appropriate flush_all mechanism
from outside of the BlkBackend layer.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-09-29 14:13:38 +02:00
John Snow 4085f5c7a2 block: reintroduce bdrv_flush_all
Commit fe1a9cbc moved the flush_all routine from the bdrv layer to the
block-backend layer. In doing so, however, the semantics of the routine
changed slightly such that flush_all now used blk_flush instead of
bdrv_flush.

blk_flush can fail if the attached device model reports that it is not
"available," (i.e. the tray is open.) This changed the semantics of
flush_all such that it can now fail for e.g. open CDROM drives.

Reintroduce bdrv_flush_all to regain the old semantics without having to
alter the behavior of blk_flush or blk_flush_all, which are already
'doing the right thing.'

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-09-29 14:13:13 +02:00
Peter Maydell c640f2849e * thread-safe tb_flush (Fred, Alex, Sergey, me, Richard, Emilio,... :-)
* license clarification for compiler.h (Felipe)
 * glib cflags improvement (Marc-André)
 * checkpatch silencing (Paolo)
 * SMRAM migration fix (Paolo)
 * Replay improvements (Pavel)
 * IOMMU notifier improvements (Peter)
 * IOAPIC now defaults to version 0x20 (Peter)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQExBAABCAAbBQJX6kKUFBxwYm9uemluaUByZWRoYXQuY29tAAoJEL/70l94x66D
 M1UIAKCQ7XfWDoClYd1TyGZ+Qj3K3TrjwLDIl/Z258euyeZ9p7PpqYQ64OCRsREJ
 fsGQOqkFYDe7gi4epJiJOuu4oAW7Xu8G6lB2RfBd7KWVMhsl3Che9AEom7amzyzh
 yoN+g9gwKfAmYwpKyjYWnlWOSjUvif6o0DaTCQCMTaAoEM3b4HKdgHfr6A2dA/E/
 47rtIVp/jNExmrZkaOjnCDS1DJ8XYT3aVeoTkuzRFQ3DBzrAiPABn6B4ExP8IBcJ
 YLFX/W8xG7F3qyXbKQOV/uYM25A55WS5B0G94ZfSlDtUGa/avzS7df9DFD/IWQT+
 RpfiyDdeJueByiTw9R0ZYxFjhd8=
 =g7xm
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* thread-safe tb_flush (Fred, Alex, Sergey, me, Richard, Emilio,... :-)
* license clarification for compiler.h (Felipe)
* glib cflags improvement (Marc-André)
* checkpatch silencing (Paolo)
* SMRAM migration fix (Paolo)
* Replay improvements (Pavel)
* IOMMU notifier improvements (Peter)
* IOAPIC now defaults to version 0x20 (Peter)

# gpg: Signature made Tue 27 Sep 2016 10:57:40 BST
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (28 commits)
  replay: allow replay stopping and restarting
  replay: vmstate for replay module
  replay: move internal data to the structure
  cpus-common: lock-free fast path for cpu_exec_start/end
  tcg: Make tb_flush() thread safe
  cpus-common: Introduce async_safe_run_on_cpu()
  cpus-common: simplify locking for start_exclusive/end_exclusive
  cpus-common: remove redundant call to exclusive_idle()
  cpus-common: always defer async_run_on_cpu work items
  docs: include formal model for TCG exclusive sections
  cpus-common: move exclusive work infrastructure from linux-user
  cpus-common: fix uninitialized variable use in run_on_cpu
  cpus-common: move CPU work item management to common code
  cpus-common: move CPU list management to common code
  linux-user: Add qemu_cpu_is_self() and qemu_cpu_kick()
  linux-user: Use QemuMutex and QemuCond
  cpus: Rename flush_queued_work()
  cpus: Move common code out of {async_, }run_on_cpu()
  cpus: pass CPUState to run_on_cpu helpers
  build-sys: put glib_cflags in QEMU_CFLAGS
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-09-28 23:02:56 +01:00
Stefan Hajnoczi fe121b9d3c linux-aio: fix re-entrant completion processing
Commit 0ed93d84ed ("linux-aio: process
completions from ioq_submit()") added an optimization that processes
completions each time ioq_submit() returns with requests in flight.
This commit introduces a "Co-routine re-entered recursively" error which
can be triggered with -drive format=qcow2,aio=native.

Fam Zheng <famz@redhat.com>, Kevin Wolf <kwolf@redhat.com>, and I
debugged the following backtrace:

  (gdb) bt
  #0  0x00007ffff0a046f5 in raise () at /lib64/libc.so.6
  #1  0x00007ffff0a062fa in abort () at /lib64/libc.so.6
  #2  0x0000555555ac0013 in qemu_coroutine_enter (co=0x5555583464d0) at util/qemu-coroutine.c:113
  #3  0x0000555555a4b663 in qemu_laio_process_completions (s=s@entry=0x555557e2f7f0) at block/linux-aio.c:218
  #4  0x0000555555a4b874 in ioq_submit (s=s@entry=0x555557e2f7f0) at block/linux-aio.c:331
  #5  0x0000555555a4ba12 in laio_do_submit (fd=fd@entry=13, laiocb=laiocb@entry=0x555559d38ae0, offset=offset@entry=2932727808, type=type@entry=1) at block/linux-aio.c:383
  #6  0x0000555555a4bbd3 in laio_co_submit (bs=<optimized out>, s=0x555557e2f7f0, fd=13, offset=2932727808, qiov=0x555559d38e20, type=1) at block/linux-aio.c:402
  #7  0x0000555555a4fd23 in bdrv_driver_preadv (bs=bs@entry=0x55555663bcb0, offset=offset@entry=2932727808, bytes=bytes@entry=8192, qiov=qiov@entry=0x555559d38e20, flags=0) at block/io.c:804
  #8  0x0000555555a52b34 in bdrv_aligned_preadv (bs=bs@entry=0x55555663bcb0, req=req@entry=0x555559d38d20, offset=offset@entry=2932727808, bytes=bytes@entry=8192, align=align@entry=512, qiov=qiov@entry=0x555559d38e20, flags=0) at block/io.c:1041
  #9  0x0000555555a52db8 in bdrv_co_preadv (child=<optimized out>, offset=2932727808, bytes=8192, qiov=qiov@entry=0x555559d38e20, flags=flags@entry=0) at block/io.c:1133
  #10 0x0000555555a29629 in qcow2_co_preadv (bs=0x555556635890, offset=6178725888, bytes=8192, qiov=0x555557527840, flags=<optimized out>) at block/qcow2.c:1509
  #11 0x0000555555a4fd23 in bdrv_driver_preadv (bs=bs@entry=0x555556635890, offset=offset@entry=6178725888, bytes=bytes@entry=8192, qiov=qiov@entry=0x555557527840, flags=0) at block/io.c:804
  #12 0x0000555555a52b34 in bdrv_aligned_preadv (bs=bs@entry=0x555556635890, req=req@entry=0x555559d39000, offset=offset@entry=6178725888, bytes=bytes@entry=8192, align=align@entry=1, qiov=qiov@entry=0x555557527840, flags=0) at block/io.c:1041
  #13 0x0000555555a52db8 in bdrv_co_preadv (child=<optimized out>, offset=offset@entry=6178725888, bytes=bytes@entry=8192, qiov=qiov@entry=0x555557527840, flags=flags@entry=0) at block/io.c:1133
  #14 0x0000555555a4515a in blk_co_preadv (blk=0x5555566356d0, offset=6178725888, bytes=8192, qiov=0x555557527840, flags=0) at block/block-backend.c:783
  #15 0x0000555555a45266 in blk_aio_read_entry (opaque=0x5555577025e0) at block/block-backend.c:991
  #16 0x0000555555ac0cfa in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:78

It turned out that re-entrant ioq_submit() and completion processing
between three requests caused this error.  The following check is not
sufficient to prevent recursively entering coroutines:

  if (laiocb->co != qemu_coroutine_self()) {
      qemu_coroutine_enter(laiocb->co);
  }

As the following coroutine backtrace shows, not just the current
coroutine (self) can be entered.  There might also be other coroutines
that are currently entered and transferred control due to the qcow2 lock
(CoMutex):

  (gdb) qemu coroutine 0x5555583464d0
  #0  0x0000555555ac0c90 in qemu_coroutine_switch (from_=from_@entry=0x5555583464d0, to_=to_@entry=0x5555572f9890, action=action@entry=COROUTINE_ENTER) at util/coroutine-ucontext.c:175
  #1  0x0000555555abfe54 in qemu_coroutine_enter (co=0x5555572f9890) at util/qemu-coroutine.c:117
  #2  0x0000555555ac031c in qemu_co_queue_run_restart (co=co@entry=0x5555583462c0) at util/qemu-coroutine-lock.c:60
  #3  0x0000555555abfe5e in qemu_coroutine_enter (co=0x5555583462c0) at util/qemu-coroutine.c:119
  #4  0x0000555555a4b663 in qemu_laio_process_completions (s=s@entry=0x555557e2f7f0) at block/linux-aio.c:218
  #5  0x0000555555a4b874 in ioq_submit (s=s@entry=0x555557e2f7f0) at block/linux-aio.c:331
  #6  0x0000555555a4ba12 in laio_do_submit (fd=fd@entry=13, laiocb=laiocb@entry=0x55555a338b40, offset=offset@entry=2911477760, type=type@entry=1) at block/linux-aio.c:383
  #7  0x0000555555a4bbd3 in laio_co_submit (bs=<optimized out>, s=0x555557e2f7f0, fd=13, offset=2911477760, qiov=0x55555a338e80, type=1) at block/linux-aio.c:402
  #8  0x0000555555a4fd23 in bdrv_driver_preadv (bs=bs@entry=0x55555663bcb0, offset=offset@entry=2911477760, bytes=bytes@entry=8192, qiov=qiov@entry=0x55555a338e80, flags=0) at block/io.c:804
  #9  0x0000555555a52b34 in bdrv_aligned_preadv (bs=bs@entry=0x55555663bcb0, req=req@entry=0x55555a338d80, offset=offset@entry=2911477760, bytes=bytes@entry=8192, align=align@entry=512, qiov=qiov@entry=0x55555a338e80, flags=0) at block/io.c:1041
  #10 0x0000555555a52db8 in bdrv_co_preadv (child=<optimized out>, offset=2911477760, bytes=8192, qiov=qiov@entry=0x55555a338e80, flags=flags@entry=0) at block/io.c:1133
  #11 0x0000555555a29629 in qcow2_co_preadv (bs=0x555556635890, offset=6157475840, bytes=8192, qiov=0x5555575df720, flags=<optimized out>) at block/qcow2.c:1509
  #12 0x0000555555a4fd23 in bdrv_driver_preadv (bs=bs@entry=0x555556635890, offset=offset@entry=6157475840, bytes=bytes@entry=8192, qiov=qiov@entry=0x5555575df720, flags=0) at block/io.c:804
  #13 0x0000555555a52b34 in bdrv_aligned_preadv (bs=bs@entry=0x555556635890, req=req@entry=0x55555a339060, offset=offset@entry=6157475840, bytes=bytes@entry=8192, align=align@entry=1, qiov=qiov@entry=0x5555575df720, flags=0) at block/io.c:1041
  #14 0x0000555555a52db8 in bdrv_co_preadv (child=<optimized out>, offset=offset@entry=6157475840, bytes=bytes@entry=8192, qiov=qiov@entry=0x5555575df720, flags=flags@entry=0) at block/io.c:1133
  #15 0x0000555555a4515a in blk_co_preadv (blk=0x5555566356d0, offset=6157475840, bytes=8192, qiov=0x5555575df720, flags=0) at block/block-backend.c:783
  #16 0x0000555555a45266 in blk_aio_read_entry (opaque=0x555557231aa0) at block/block-backend.c:991
  #17 0x0000555555ac0cfa in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:78

Use the new qemu_coroutine_entered() function instead of comparing
against qemu_coroutine_self().  This is correct because:

1. If a coroutine is not entered then it must have yielded to wait for
   I/O completion.  It is therefore safe to enter.

2. If a coroutine is entered then it must be in
   ioq_submit()/qemu_laio_process_completions() because otherwise it
   would be yielded while waiting for I/O completion.  Therefore it will
   check laio->ret and return from ioq_submit() instead of yielding,
   i.e. it's guaranteed not to hang.

Reported-by: Fam Zheng <famz@redhat.com>
Tested-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1474989516-18255-4-git-send-email-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-09-28 17:11:23 +01:00
Pavel Dovgalyuk 6d0ceb80ff replay: allow replay stopping and restarting
This patch fixes bug with stopping and restarting replay
through monitor.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20160926080815.6992.71818.stgit@PASHA-ISP>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-27 11:57:30 +02:00
Peter Maydell 3b71ec8516 Block layer patches
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJX5RkyAAoJEH8JsnLIjy/WMnYP/i03Ii6nt8VY2PMX10NK3z+H
 5QskSnGzBgQg0OVFPQkvV1nW8bXPN07n/L2Q0H9FzQXTUzCbAkQC4MkzBZTMgaUp
 63XVRG151+AjqctvIncWQktUgMg6ywKBCug8G6gwMQ1BVsLerUJmE2tM0JGZF6WL
 q+F6uTtVMLNKR6miaTsJtFJA+ts9gGEecGNITRByEWbtD7ESF8TOXhCfpV/Ni45s
 5pMVOQT0o9jwgjekrUbJ9PrlxGSLfe/bi3ypqy6boe8EXQS6DImAiDKWkTQjGj2P
 cGlZ0oxWHq0g/15sgZOzqDtOXqE02+RK9C1UcASgObfdUos6gFpi6G7cAffzP9aq
 wzhi8u8Beth2WZP8tLLAJtoQjJhcjutWSw60HPhEcHZNkkaZF3KMXGDk8j1xp5EN
 8VVkWVZr+1ZOP+HsHCNrpvag+IP49DJ5j50W+oVqSKSzG37AC6SxNo3PEsg1urAi
 m5APFJbCgxMfDSnb8yw6ClBIe1IH624VMoELVzY1C9AqOnTxfJ08o4MbE8t1DEoa
 dQV/R3Mrqt3W+YflWNPXzVhRxD/mZW+RXPlBARYvFzHW12fh1PcwFWrB+ivdqu+4
 x0VvmZebJcEP+CxmRQOat80Zhos1Hs+ZyxUHyA5a0+Nt3YhK7k3sr4CQXxr5MIHA
 1c/D8znmWf9VksmW33U5
 =JG9f
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches

# gpg: Signature made Fri 23 Sep 2016 12:59:46 BST
# gpg:                using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream: (33 commits)
  block: Remove BB interface from blockdev-add/del
  qemu-iotests/141: Avoid blockdev-add with id
  block: Avoid printing NULL string in error messages
  qemu-iotests/139: Avoid blockdev-add with id
  qemu-iotests/124: Avoid blockdev-add with id
  qemu-iotests/118: Avoid blockdev-add with id
  qemu-iotests/117: Avoid blockdev-add with id
  qemu-iotests/087: Avoid blockdev-add with id
  qemu-iotests/081: Avoid blockdev-add with id
  qemu-iotests/071: Avoid blockdev-add with id
  qemu-iotests/067: Avoid blockdev-add with id
  qemu-iotests/041: Avoid blockdev-add with id
  qemu-iotests/118: Test media change with qdev name
  block: Accept device model name for block_set_io_throttle
  block: Accept device model name for blockdev-change-medium
  block: Accept device model name for eject
  block: Accept device model name for x-blockdev-remove-medium
  block: Accept device model name for x-blockdev-insert-medium
  block: Accept device model name for blockdev-open/close-tray
  qdev-monitor: Add blk_by_qdev_id()
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-09-23 16:15:33 +01:00
Peter Maydell 4c892756fd -----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
 
 iQEcBAABCAAGBQJX5LZ0AAoJEMo1YkxqkXHGmgAIAKeDTKx0sA76ewIvH9fKdmUu
 NNatJw59XnVX8lpfOU5yISkJ4BD6oBdN7tbWaOW8yzcAeYu1Ff5iUu4LBEUFb7eW
 g6zqUCV58XjaCTLmTiAfa19Exfnh6pXZlZMRP4Hr3vUVSCHFmC0EyTEllfHxU/jW
 aPHtAEge/p6EDAHygHJBTSQzsaXRdyJNyt/AKPreDtblNRT8VgapCDzZQPcCVGH1
 F9grWVu0B/VVDS0mfgSRhT0UeF/vtiikuRW92sC4woVVB+brJyG4VwGT8oeUN8RU
 30/tGo5p9fpqef3iP669uUrloLfmWcKcIJuPfQ4ZUlZh8kIV+lWK9kZuTVgocGw=
 =xLJw
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/famz/tags/various-pull-request' into staging

# gpg: Signature made Fri 23 Sep 2016 05:58:28 BST
# gpg:                using RSA key 0xCA35624C6A9171C6
# gpg: Good signature from "Fam Zheng <famz@redhat.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 5003 7CB7 9706 0F76 F021  AD56 CA35 624C 6A91 71C6

* remotes/famz/tags/various-pull-request: (23 commits)
  docker: exec $CMD
  docker: Terminate instances at SIGTERM and SIGHUP
  docker: Support showing environment information
  docker: Print used options before doing configure
  docker: Flatten default target list in test-quick
  docker: Update fedora image to latest
  docker: Generate /packages.txt in ubuntu image
  docker: Generate /packages.txt in fedora image
  docker: Generate /packages.txt in centos6 image
  tests: Ignore test-uuid
  Add UUID files to MAINTAINERS
  tests: Add uuid tests
  uuid: Tighten uuid parse
  vl: Switch qemu_uuid to QemuUUID
  configure: Remove detection code for UUID
  tests: No longer dependent on CONFIG_UUID
  crypto: Switch to QEMU UUID API
  vpc: Use QEMU UUID API
  vdi: Use QEMU UUID API
  vhdx: Use QEMU UUID API
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

# Conflicts:
#	tests/Makefile.include
2016-09-23 13:10:43 +01:00
Kevin Wolf 1c89e1fa2f block: Add blk_by_dev()
This finds a BlockBackend given the device model that is attached to it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-09-23 13:36:10 +02:00
Alberto Garcia 0fe282bb4b commit: Add 'base' to the reopen queue before 'overlay_bs'
Now that we're checking for duplicates in the reopen queue, there's no
need to force a specific order in which the queue is constructed so we
can revert 3db2bd5508.

Since both ways of constructing the queue are now valid, this patch
doesn't have any effect on the behavior of QEMU and is not strictly
necessary. However it can help us check that the fix for the reopen
queue is robust: if it stops working properly at some point, iotest
040 will break.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-09-23 13:36:10 +02:00
Alberto Garcia f87a0e29a9 block: Add "read-only" to the options QDict
This adds the "read-only" option to the QDict. One important effect of
this change is that when a child inherits options from its parent, the
existing "read-only" mode can be preserved if it was explicitly set
previously.

This addresses scenarios like this:

   [E] <- [D] <- [C] <- [B] <- [A]

In this case, if we reopen [D] with read-only=off, and later reopen
[B], then [D] will not inherit read-only=on from its parent during the
bdrv_reopen_queue_child() stage.

The BDRV_O_RDWR flag is not removed yet, but its keep in sync with the
value of the "read-only" option.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-09-23 13:36:10 +02:00
Daniel P. Berrange bb9f8dd0e1 qcow2: fix encryption during cow of sectors
Broken in previous commit:

  commit aaa4d20b49
  Author: Kevin Wolf <kwolf@redhat.com>
  Date:   Wed Jun 1 15:21:05 2016 +0200

      qcow2: Make copy_sectors() byte based

The copy_sectors() code was originally using the 'sector'
parameter for encryption, which was passed in by the caller
from the QCowL2Meta.offset field (aka the guest logical
offset).

After the change, the code is using 'cluster_offset' which
was passed in from QCow2L2Meta.alloc_offset field (aka the
host physical offset).

This would cause the data to be encrypted using an incorrect
initialization vector which will in turn cause later reads
to return garbage.

Although current qcow2 built-in encryption is blocked from
usage in the emulator, one could still hit this if writing
to the file via qemu-{img,io,nbd} commands.

Cc: qemu-stable@nongnu.org
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-09-23 13:36:09 +02:00
Peter Maydell 6de68ffd7c * More KVM LAPIC fixes
* fix divide-by-zero regression on libiscsi SG devices
 * fix qemu-char segfault
 * add scripts/show-fixed-bugs.sh
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQExBAABCAAbBQJX5CEJFBxwYm9uemluaUByZWRoYXQuY29tAAoJEL/70l94x66D
 PVcH/A6kHS5FHBrqLGl0BRUzZU0HSJJVOofAB55/60qknNajMSTAWM2i/mkNxs6P
 6MJD88Pb+aJFRlP6qD7CLGIlIYR38hH8VFTViuW9/Q+BuMMFgkqUauVCr3maRs82
 zH6o8wj/6PEi6yjBz3yVQCKI9+dHrXiv3BdIml88CP3jAMRjit7Hzcgad7NBmMjg
 Try4ipTxLq0qTkMPNn5IewtiARtJH6UInrE3pzbHKekORSvammJ8Lej0mqXYNACl
 fatQSXTiMc1DX4Mb3Bph6rm1FzmzRGKxydaidVZdswhj2+voMgt3Xvt0mfcaD8Gk
 aNbuHlj2tZ4zPIP08V21xYfd1xc=
 =EsZw
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* More KVM LAPIC fixes
* fix divide-by-zero regression on libiscsi SG devices
* fix qemu-char segfault
* add scripts/show-fixed-bugs.sh

# gpg: Signature made Thu 22 Sep 2016 19:20:57 BST
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream:
  kvm: fix events.flags (KVM_VCPUEVENT_VALID_SMM) overwritten by 0
  scripts: Add a script to check for bug URLs in the git log
  msmouse: Fix segfault caused by free the chr before chardev cleanup.
  iscsi: Fix divide-by-zero regression on raw SG devices
  kvm: apic: set APIC base as part of kvm_apic_put
  target-i386: introduce kvm_put_one_msr

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-09-23 12:12:55 +01:00
Fam Zheng 38440a21fa vpc: Use QEMU UUID API
Previously we conditionally generated footer->uuid, when libuuid was
available. Now that we have a built-in implementation, we can switch to
it.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-Id: <1474432046-325-6-git-send-email-famz@redhat.com>
2016-09-23 11:42:52 +08:00
Fam Zheng 7c6f55b697 vdi: Use QEMU UUID API
The UUID operations we need from libuuid are fully supported by QEMU UUID
implementation. Use it, and remove the unused code.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-Id: <1474432046-325-5-git-send-email-famz@redhat.com>
2016-09-23 11:42:52 +08:00
Fam Zheng cb6414dfec vhdx: Use QEMU UUID API
This removes our dependency to libuuid, so that the driver can always be
built.

Similar to how we handled data plane configure options, --enable-vhdx
and --disable-vhdx are also changed to a nop with a message saying it's
obsolete.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-Id: <1474432046-325-4-git-send-email-famz@redhat.com>
2016-09-23 11:42:52 +08:00
Fam Zheng cea25275a3 util: Add UUID API
A number of different places across the code base use CONFIG_UUID. Some
of them are soft dependency, some are not built if libuuid is not
available, some come with dummy fallback, some throws runtime error.

It is hard to maintain, and hard to reason for users.

Since UUID is a simple standard with only a small number of operations,
it is cleaner to have a central support in libqemuutil. This patch adds
qemu_uuid_* functions that all uuid users in the code base can
rely on. Except for qemu_uuid_generate which is new code, all other
functions are just copy from existing fallbacks from other files.

Note that qemu_uuid_parse is moved without updating the function
signature to use QemuUUID, to keep this patch simple.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-Id: <1474432046-325-2-git-send-email-famz@redhat.com>
2016-09-23 11:42:52 +08:00
Eric Blake 95eaa78537 iscsi: Fix divide-by-zero regression on raw SG devices
When qemu uses iscsi devices in sg mode, iscsilun->block_size
is left at 0.  Prior to commits cf081fca and similar, when
block limits were tracked in sectors, this did not matter:
various block limits were just left at 0.  But when we started
scaling by block size, this caused SIGFPE.

Then, in a later patch, commit a5b8dd2c added an assertion to
bdrv_open_common() that request_alignment is always non-zero;
which was not true for SG mode.  Rather than relax that assertion,
we can just provide a sane value (we don't know of any SG device
with a block size smaller than qemu's default sizing of 512 bytes).

One possible solution for SG mode is to just blindly skip ALL
of iscsi_refresh_limits(), since we already short circuit so
many other things in sg mode.  But this patch takes a slightly
more conservative approach, and merely guarantees that scaling
will succeed, while still using multiples of the original size
where possible.  Resulting limits may still be zero in SG mode
(that is, we mostly only fix block_size used as a denominator
or which affect assertions, not all uses).

Reported-by: Holger Schranz <holger@fam-schranz.de>
Signed-off-by: Eric Blake <eblake@redhat.com>
CC: qemu-stable@nongnu.org

Message-Id: <1473283640-15756-1-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-22 20:20:51 +02:00
Alberto Garcia 4d6f8cbba7 commit: get the overlay node before manipulating the backing chain
The 'block-commit' command has a 'top' parameter to specify the
topmost node from which the data is going to be copied.

   [E] <- [D] <- [C] <- [B] <- [A]

In this case if [C] is the top node then this is the result:

   [E] <- [B] <- [A]

[B] must be modified so its backing image string points to [E] instead
of [C]. commit_start() takes care of reopening [B] in read-write
mode, and commit_complete() puts it back in read-only mode once the
operation has finished.

In order to find [B] (the overlay node) we look for the node that has
[C] (the top node) as its backing image. However in commit_complete()
we're doing it after [C] has been removed from the chain, so [B] is
never found and remains in read-write mode.

This patch gets the overlay node before the backing chain is
manipulated.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 1471836963-28548-1-git-send-email-berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-09-20 22:12:57 +02:00
Colin Lord 4be4879ff8 blockdev: Modularize nfs block driver
Modularizes the nfs block driver so that it gets dynamically loaded.

Signed-off-by: Colin Lord <clord@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1471008424-16465-5-git-send-email-clord@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-09-20 22:12:57 +02:00
Marc Mari 88d88798b7 blockdev: Add dynamic module loading for block drivers
Extend the current module interface to allow for block drivers to be
loaded dynamically on request. The only block drivers that can be
converted into modules are the drivers that don't perform any init
operation except for registering themselves.

In addition, only the protocol drivers are being modularized, as they
are the only ones which see significant performance benefits. The format
drivers do not generally link to external libraries, so modularizing
them is of no benefit from a performance perspective.

All the necessary module information is located in a new structure found
in module_block.h

This spoils the purpose of 5505e8b76f (block/dmg: make it modular).

Before this patch, if module build is enabled, block-dmg.so is linked to
libbz2, whereas the main binary is not. In downstream, theoretically, it
means only the qemu-block-extra package depends on libbz2, while the
main QEMU package needn't to. With this patch, we (temporarily) change
the case so that the main QEMU depends on libbz2 again.

Signed-off-by: Marc Marí <markmb@redhat.com>
Signed-off-by: Colin Lord <clord@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1471008424-16465-4-git-send-email-clord@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
[mreitz: Do a signed comparison against the length of
 block_driver_modules[], so it will not cause a compile error when
 empty]
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-09-20 22:12:03 +02:00
Colin Lord f57b4b5fb1 blockdev: prepare iSCSI block driver for dynamic loading
This commit moves the initialization of the QemuOptsList qemu_iscsi_opts
struct out of block/iscsi.c in order to allow the iscsi module to be
dynamically loaded.

Signed-off-by: Colin Lord <clord@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1471008424-16465-2-git-send-email-clord@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-09-20 22:10:57 +02:00
Daniel P. Berrange 3bd18890ca crypto: make PBKDF iterations configurable for LUKS format
As protection against bruteforcing passphrases, the PBKDF
algorithm is tuned by counting the number of iterations
needed to produce 1 second of running time. If the machine
that the image will be used on is much faster than the
machine where the image is created, it can be desirable
to raise the number of iterations. This change adds a new
'iter-time' property that allows the user to choose the
iteration wallclock time.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-09-19 16:30:45 +01:00
Laurent Vivier 11d816a5bc sheepdog: remove useless casts
This patch is the result of coccinelle script
scripts/coccinelle/typecast.cocci

CC: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
CC: qemu-block@nongnu.org
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-09-15 15:32:22 +03:00
Tomáš Golembiovský a41c457881 curl: Operate on zero-length file
Another attempt to fix the bug 1596870.

When creating new disk backed by remote file accessed via HTTPS and the
backing file has zero length, qemu-img terminates with uniformative
error message:

    qemu-img: disk.qcow2: CURL: Error opening file:

While it may not make much sense to operate on empty file, other block
backends (e.g. raw backend for regular files) seem to allow it. This
patch fixes it for the curl backend and improves the reported error.

Signed-off-by: Tomáš Golembiovský <tgolembi@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-09-15 15:32:22 +03:00
Ladi Prosek d4b84d564e Remove unused function declarations
Unused function declarations were found using a simple gcc plugin and
manually verified by grepping the sources.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-09-15 15:32:22 +03:00
Peter Maydell 4dfbe3767a Pull request
v2:
  * Fixed qcow2 sanitizer warnings [Peter]
  * Renamed get_error test cases to get_error_all to avoid tripping "error:"
    grep scripts [Peter]
  * Added Fam's iothread stop patch
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJX1862AAoJEJykq7OBq3PIK/YH/jLBxEu/H4hu+NzR7V7ur8lv
 F5E633tVzfxC7iM5CGUV+6qKM1UsIXvXn98JDxuGPh8aZaPdhsaalrS5iAJby6LN
 nPxam3ps1RlPeFA3hS3OqMizlO0qhp2gQZ5T6KPEktNXgSKP3kyiumXDo26XQoj6
 dtW1yap9lcvk3z9XBdTTpAfPLmAurvcxUptZsxQgo6O6/7jCuLyxEDuFenL13tOU
 vUvdrWvBn44YlzDjJc75nw2EBnCFLphw6SCj6seMj8RXDn2JUdVM0DQIZ2HQtBcT
 YlVjYl8hNyBwRZGi1QjSzdIhPYZQen0gg3EVJUBdwg/zgkrS9w91JImSwJGMxyg=
 =AHxB
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging

Pull request

v2:
 * Fixed qcow2 sanitizer warnings [Peter]
 * Renamed get_error test cases to get_error_all to avoid tripping "error:"
   grep scripts [Peter]
 * Added Fam's iothread stop patch

# gpg: Signature made Tue 13 Sep 2016 11:02:30 BST
# gpg:                using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/block-pull-request:
  iothread: Stop threads before main() quits
  tests: fix qvirtqueue_kick
  MAINTAINERS: add maintainer for replication
  support replication driver in blockdev-add
  tests: add unit test case for replication
  replication: Implement new driver for block replication
  replication: Introduce new APIs to do replication operation
  configure: support replication
  mirror: auto complete active commit
  docs: block replication's description
  block: Link backup into block core
  Backup: export interfaces for extra serialization
  Backup: clear all bitmap when doing block checkpoint
  block: unblock backup operations in backing file
  virtio-blk: rename virtio_device_info to virtio_blk_info
  linux-aio: process completions from ioq_submit()
  linux-aio: split processing events function
  linux-aio: consume events in userspace instead of calling io_getevents
  qcow2: avoid memcpy(dst, NULL, len)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-09-13 14:31:18 +01:00
Wen Congyang 29ff789060 replication: Implement new driver for block replication
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Signed-off-by: Wang WeiWei <wangww.fnst@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Message-id: 1469602913-20979-10-git-send-email-xiecl.fnst@cn.fujitsu.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-09-13 11:00:56 +01:00
Wen Congyang b49f7ead8d mirror: auto complete active commit
Auto complete mirror job in background to prevent from
blocking synchronously

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Signed-off-by: Wang WeiWei <wangww.fnst@cn.fujitsu.com>
Message-id: 1469602913-20979-7-git-send-email-xiecl.fnst@cn.fujitsu.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-09-13 11:00:56 +01:00
Wen Congyang 258854ad1d block: Link backup into block core
Some programs that add a dependency on it will use
the block layer directly.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Signed-off-by: Wang WeiWei <wangww.fnst@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1469602913-20979-5-git-send-email-xiecl.fnst@cn.fujitsu.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-09-13 11:00:56 +01:00
Changlong Xie a8bbee0edf Backup: export interfaces for extra serialization
Normal backup(sync='none') workflow:
step 1. NBD peformance I/O write from client to server
   qcow2_co_writev
    bdrv_co_writev
     ...
       bdrv_aligned_pwritev
        notifier_with_return_list_notify -> backup_do_cow
         bdrv_driver_pwritev // write new contents

step 2. drive-backup sync=none
   backup_do_cow
   {
    wait_for_overlapping_requests
    cow_request_begin
    for(; start < end; start++) {
            bdrv_co_readv_no_serialising //read old contents from Secondary disk
            bdrv_co_writev // write old contents to hidden-disk
    }
    cow_request_end
   }

step 3. Then roll back to "step 1" to write new contents to Secondary disk.

And for replication, we must make sure that we only read the old contents from
Secondary disk in order to keep contents consistent.

1) Replication workflow of Secondary
                                                         virtio-blk
                                                              ^
------->  1 NBD                                               |
   ||     server                                       3 replication
   ||        ^                                                ^
   ||        |           backing                 backing      |
   ||  Secondary disk 6<-------- hidden-disk 5 <-------- active-disk 4
   ||        |                         ^
   ||        '-------------------------'
   ||           drive-backup sync=none 2

Hence, we need these interfaces to implement coarse-grained serialization between
COW of Secondary disk and the read operation of replication.

Example codes about how to use them:

*#include "block/block_backup.h"

static coroutine_fn int xxx_co_readv()
{
        CowRequest req;
        BlockJob *job = secondary_disk->bs->job;

        if (job) {
              backup_wait_for_overlapping_requests(job, start, end);
              backup_cow_request_begin(&req, job, start, end);
              ret = bdrv_co_readv();
              backup_cow_request_end(&req);
              goto out;
        }
        ret = bdrv_co_readv();
out:
        return ret;
}

Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Wang WeiWei <wangww.fnst@cn.fujitsu.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1469602913-20979-4-git-send-email-xiecl.fnst@cn.fujitsu.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-09-13 11:00:56 +01:00
Wen Congyang 49d3e828f8 Backup: clear all bitmap when doing block checkpoint
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Signed-off-by: Wang WeiWei <wangww.fnst@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1469602913-20979-3-git-send-email-xiecl.fnst@cn.fujitsu.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-09-13 11:00:56 +01:00
Roman Pen 0ed93d84ed linux-aio: process completions from ioq_submit()
In order to reduce completion latency it makes sense to harvest completed
requests ASAP.  Very fast backend device can complete requests just after
submission, so it is worth trying to check ring buffer in order to peek
completed requests directly after io_submit() has been called.

Indeed, this patch reduces the completions latencies and increases the
overall throughput, e.g. the following is the percentiles of number of
completed requests at once:

        1th 10th  20th  30th  40th  50th  60th  70th  80th  90th  99.99th
Before    2    4    42   112   128   128   128   128   128   128    128
 After    1    1     4    14    33    45    47    48    50    51    108

That means, that before the current patch is applied the ring buffer is
observed as full (128 requests were consumed at once) in 60% of calls.

After patch is applied the distribution of number of completed requests
is "smoother" and the queue (requests in-flight) is almost never full.

The fio read results are the following (write results are almost the
same and are not showed here):

  Before
  ------
job: (groupid=0, jobs=8): err= 0: pid=2227: Tue Jul 19 11:29:50 2016
  Description  : [Emulation of Storage Server Access Pattern]
  read : io=54681MB, bw=1822.7MB/s, iops=179779, runt= 30001msec
    slat (usec): min=172, max=16883, avg=338.35, stdev=109.66
    clat (usec): min=1, max=21977, avg=1051.45, stdev=299.29
     lat (usec): min=317, max=22521, avg=1389.83, stdev=300.73
    clat percentiles (usec):
     |  1.00th=[  346],  5.00th=[  596], 10.00th=[  708], 20.00th=[  852],
     | 30.00th=[  932], 40.00th=[  996], 50.00th=[ 1048], 60.00th=[ 1112],
     | 70.00th=[ 1176], 80.00th=[ 1256], 90.00th=[ 1384], 95.00th=[ 1496],
     | 99.00th=[ 1800], 99.50th=[ 1928], 99.90th=[ 2320], 99.95th=[ 2672],
     | 99.99th=[ 4704]
    bw (KB  /s): min=205229, max=553181, per=12.50%, avg=233278.26, stdev=18383.51

  After
  ------
job: (groupid=0, jobs=8): err= 0: pid=2220: Tue Jul 19 11:31:51 2016
  Description  : [Emulation of Storage Server Access Pattern]
  read : io=57637MB, bw=1921.2MB/s, iops=189529, runt= 30002msec
    slat (usec): min=169, max=20636, avg=329.61, stdev=124.18
    clat (usec): min=2, max=19592, avg=988.78, stdev=251.04
     lat (usec): min=381, max=21067, avg=1318.42, stdev=243.58
    clat percentiles (usec):
     |  1.00th=[  310],  5.00th=[  580], 10.00th=[  748], 20.00th=[  876],
     | 30.00th=[  908], 40.00th=[  948], 50.00th=[ 1012], 60.00th=[ 1064],
     | 70.00th=[ 1080], 80.00th=[ 1128], 90.00th=[ 1224], 95.00th=[ 1288],
     | 99.00th=[ 1496], 99.50th=[ 1608], 99.90th=[ 1960], 99.95th=[ 2256],
     | 99.99th=[ 5408]
    bw (KB  /s): min=212149, max=390160, per=12.49%, avg=245746.04, stdev=11606.75

Throughput increased from 1822MB/s to 1921MB/s, average completion latencies
decreased from 1051us to 988us.

Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com>
Message-id: 1468931263-32667-4-git-send-email-roman.penyaev@profitbricks.com
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-devel@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-09-13 11:00:56 +01:00
Roman Pen 3407de572b linux-aio: split processing events function
Prepare processing events function to be called from ioq_submit(),
thus split function on two parts: the first harvests completed IO
requests, the second submits pending requests.

Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com>
Message-id: 1468931263-32667-3-git-send-email-roman.penyaev@profitbricks.com
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-devel@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-09-13 11:00:55 +01:00
Roman Pen 9e909a5829 linux-aio: consume events in userspace instead of calling io_getevents
AIO context in userspace is represented as a simple ring buffer, which
can be consumed directly without entering the kernel, which obviously
can bring some performance gain.  QEMU does not use timeout value for
waiting for events completions, so we can consume all events from
userspace.

Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com>
Message-id: 1468931263-32667-2-git-send-email-roman.penyaev@profitbricks.com
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-devel@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-09-13 11:00:55 +01:00
Stefan Hajnoczi 0647d47cc1 qcow2: avoid memcpy(dst, NULL, len)
Section "7.1.4 Use of library functions" in the C99 standard says:

  If an argument to a function has an invalid value (such as [...]
  a null pointer [...]) [...] the behavior is undefined.

Additionally the "searching and sorting" functions are specified as
requiring valid pointer values as described in 7.1.4.

This patch fixes the following sanitizer errors:

  block/qcow2.c:1807:41: runtime error: null pointer passed as argument 2, which is declared to never be null
  block/qcow2-cluster.c:86:26: runtime error: null pointer passed as argument 2, which is declared to never be null

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1473758138-19260-1-git-send-email-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-09-13 11:00:55 +01:00
Prasanna Kumar Kalever e9db8ff38e block/gluster: add support to choose libgfapi logfile
currently all the libgfapi logs defaults to '/dev/stderr' as it was hardcoded
in a call to glfs logging api. When the debug level is chosen to DEBUG/TRACE,
gfapi logs will be huge and fill/overflow the console view.

This patch provides a commandline option to mention log file path which helps
in logging to the specified file and also help in persisting the gfapi logs.

Usage:
-----
 *URI Style:
  ---------
  -drive file=gluster://hostname/volname/image.qcow2,file.debug=9,\
                      file.logfile=/var/log/qemu/qemu-gfapi.log

 *JSON Style:
  ----------
  'json:{
           "driver":"qcow2",
           "file":{
              "driver":"gluster",
              "volume":"volname",
              "path":"image.qcow2",
              "debug":"9",
              "logfile":"/var/log/qemu/qemu-gfapi.log",
              "server":[
                 {
                    "type":"tcp",
                    "host":"1.2.3.4",
                    "port":24007
                 },
                 {
                    "type":"unix",
                    "socket":"/var/run/glusterd.socket"
                 }
              ]
           }
        }'

Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-09-13 01:34:47 -04:00
Pavel Butsykin 8b2bd09338 qcow2: fix iovec size at qcow2_co_pwritev_compressed
Use bytes as the size would be more exact than s->cluster_size.  Although
qemu_iovec_to_buf() will not allow to go beyond the qiov.

Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-09-05 19:06:48 +02:00
Pavel Butsykin 13b9414b57 drive-backup: added support for data compression
The idea is simple - backup is "written-once" data. It is written block
by block and it is large enough. It would be nice to save storage
space and compress it.

The patch adds a flag to the qmp/hmp drive-backup command which enables
block compression. Compression should be implemented in the format driver
to enable this feature.

There are some limitations of the format driver to allow compressed writes.
We can write data only once. Though for backup this is perfectly fine.
These limitations are maintained by the driver and the error will be
reported if we are doing something wrong.

Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Jeff Cody <jcody@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
CC: Eric Blake <eblake@redhat.com>
CC: John Snow <jsnow@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-09-05 19:06:48 +02:00
Pavel Butsykin 3ea1a09111 block/io: turn on dirty_bitmaps for the compressed writes
Previously was added the assert:

  commit 1755da16e3
  Author: Paolo Bonzini <pbonzini@redhat.com>
  Date:   Thu Oct 18 16:49:18 2012 +0200
  block: introduce new dirty bitmap functionality

Now the compressed write is always in coroutine and setting the bits is
done after the write, so that we can return the dirty_bitmaps for the
compressed writes.

Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Jeff Cody <jcody@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
CC: Eric Blake <eblake@redhat.com>
CC: John Snow <jsnow@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-09-05 19:06:48 +02:00
Pavel Butsykin 35fadca80e block: remove BlockDriver.bdrv_write_compressed
There are no block drivers left that implement the old
.bdrv_write_compressed interface, so it can be removed. Also now we have
no need to use the bdrv_pwrite_compressed function and we can remove it
entirely.

Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Jeff Cody <jcody@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
CC: Eric Blake <eblake@redhat.com>
CC: John Snow <jsnow@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-09-05 19:06:48 +02:00
Pavel Butsykin 655923df4b qcow: cleanup qcow_co_pwritev_compressed to avoid the recursion
Now that the function uses a vector instead of a buffer, there is no
need to use recursive code.

Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Jeff Cody <jcody@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
CC: Eric Blake <eblake@redhat.com>
CC: John Snow <jsnow@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-09-05 19:06:48 +02:00
Pavel Butsykin f2b95a1231 qcow: add qcow_co_pwritev_compressed
Added implementation of the qcow_co_pwritev_compressed function that
will allow us to safely use compressed writes for the qcow from running
VMs.

Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Jeff Cody <jcody@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
CC: Eric Blake <eblake@redhat.com>
CC: John Snow <jsnow@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-09-05 19:06:48 +02:00