summaryrefslogtreecommitdiffstats
path: root/include
Commit message (Collapse)AuthorAgeFilesLines
* virtio-input: fix eventq batchingLadi Prosek2017-03-271-1/+4
| | | | | | | | | | | | | | | | | | | | virtio_input_send buffers input events until it sees a SYNC. Then it either sends or drops the entire batch, depending on whether eventq has enough space available. The case to avoid here is partial sends where only part of the batch would get to the guest. Using virtqueue_get_avail_bytes to check the state of eventq was not correct. The queue may have a smaller number of larger buffers available so bytes may be enough but the batch would still not be possible to send, leading to the "Huh? No vq elem available" error. Instead of checking available bytes, this patch optimistically pops buffers from the queue and puts them back in case it runs out of space and the batch needs to be dropped. Signed-off-by: Ladi Prosek <lprosek@redhat.com> Message-id: 1490365490-4854-3-git-send-email-lprosek@redhat.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
* Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.9-20170323' ↵Peter Maydell2017-03-231-0/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | into staging ppc patch queue for 2017-03-23 Just a single bugfix in this batch. It's not strictly in ppc code, though it's for the pseries machine's benefit. Eduardo suggested it go through my tree however. # gpg: Signature made Thu 23 Mar 2017 10:09:17 GMT # gpg: using RSA key 0x6C38CACA20D9B392 # gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" # gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" # gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" # gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" # Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392 * remotes/dgibson/tags/ppc-for-2.9-20170323: numa,spapr: align default numa node memory size to 256MB Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * numa,spapr: align default numa node memory size to 256MBLaurent Vivier2017-03-221-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit 224245b ("spapr: Add LMB DR connectors"), NUMA node memory size must be aligned to 256MB (SPAPR_MEMORY_BLOCK_SIZE). But when "-numa" option is provided without "mem" parameter, the memory is equally divided between nodes, but 8MB aligned. This can be not valid for pseries. In that case we can have: $ ./ppc64-softmmu/qemu-system-ppc64 -m 4G -numa node -numa node -numa node qemu-system-ppc64: Node 0 memory size 0x55000000 is not aligned to 256 MiB With this patch, we have: (qemu) info numa 3 nodes node 0 cpus: 0 node 0 size: 1280 MB node 1 cpus: node 1 size: 1280 MB node 2 cpus: node 2 size: 1536 MB Signed-off-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* | Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into stagingPeter Maydell2017-03-231-0/+8
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | # gpg: Signature made Wed 22 Mar 2017 17:28:56 GMT # gpg: using RSA key 0xBDBE7B27C0DE3057 # gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>" # gpg: aka "Jeffrey Cody <jeff@codyprime.org>" # gpg: aka "Jeffrey Cody <codyprime@gmail.com>" # Primary key fingerprint: 9957 4B4D 3474 90E7 9D98 D624 BDBE 7B27 C0DE 3057 * remotes/cody/tags/block-pull-request: blockjob: add devops to blockjob backends block-backend: add drained_begin / drained_end ops blockjob: add block_job_start_shim blockjob: avoid recursive AioContext locking Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | block-backend: add drained_begin / drained_end opsJohn Snow2017-03-221-0/+8
| |/ | | | | | | | | | | | | | | | | | | | | | | Allow block backends to forward drain requests to their devices/users. The initial intended purpose for this patch is to allow BBs to forward requests along to BlockJobs, which will want to pause if their associated BB has entered a drained region. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Message-id: 20170316212351.13797-3-jsnow@redhat.com Signed-off-by: Jeff Cody <jcody@redhat.com>
* | hw/acpi/vmgenid: prevent more than one vmgenid deviceLaszlo Ersek2017-03-221-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | A system with multiple VMGENID devices is undefined in the VMGENID spec by omission. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Ben Warren <ben@skyportsystems.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com>
* | hw/acpi/vmgenid: prevent device realization on pre-2.5 machine typesLaszlo Ersek2017-03-222-0/+5
|/ | | | | | | | | | | | | | | | | | | | The WRITE_POINTER linker/loader command that underlies VMGENID depends on commit baf2d5bfbac0 ("fw-cfg: support writeable blobs", 2017-01-12), which in turn depends on fw_cfg DMA. DMA for fw_cfg is enabled in 2.5+ machine types only (see commit e6915b5f3a87, "fw_cfg: unbreak migration compatibility for 2.4 and earlier machines", 2016-02-18). Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Ben Warren <ben@skyportsystems.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Ben Warren <ben@skyportsystems.com <mailto:ben@skyportsystems.com>> Reviewed-by: Igor Mammedov <imammedo@redhat.com>
* qemu-ga: obey LISTEN_PID when using systemd socket activationPaolo Bonzini2017-03-191-0/+26
| | | | | | | | | | | | | | | | | | | | | qemu-ga's socket activation support was not obeying the LISTEN_PID environment variable, which avoids that a process uses a socket-activation file descriptor meant for its parent. Mess can for example ensue if a process forks a children before consuming the socket-activation file descriptor and therefore setting O_CLOEXEC on it. Luckily, qemu-nbd also got socket activation code, and its copy does support LISTEN_PID. Some extra fixups are needed to ensure that the code can be used for both, but that's what this patch does. The main change is to replace get_listen_fds's "consume" argument with the FIRST_SOCKET_ACTIVATION_FD macro from the qemu-nbd code. Cc: "Richard W.M. Jones" <rjones@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* block: Always call bdrv_child_check_perm firstFam Zheng2017-03-171-4/+0Star
| | | | | | | | | | | bdrv_child_set_perm alone is not very usable because the caller must call bdrv_child_check_perm first. This is already encapsulated conveniently in bdrv_child_try_set_perm, so remove the other prototypes from the header and fix the one wrong caller, block/mirror.c. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* Merge remote-tracking branch 'remotes/kraxel/tags/pull-cirrus-20170316-1' ↵Peter Maydell2017-03-162-7/+8
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | into staging cirrus: blitter fixes. # gpg: Signature made Thu 16 Mar 2017 09:05:22 GMT # gpg: using RSA key 0x4CB6D8EED3E87138 # gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>" # gpg: aka "Gerd Hoffmann <gerd@kraxel.org>" # gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>" # Primary key fingerprint: A032 8CFF B93A 17A7 9901 FE7D 4CB6 D8EE D3E8 7138 * remotes/kraxel/tags/pull-cirrus-20170316-1: cirrus: stop passing around src pointers in the blitter cirrus: stop passing around dst pointers in the blitter cirrus: fix cirrus_invalidate_region cirrus: add option to disable blitter cirrus: switch to 4 MB video memory by default cirrus/vnc: zap bitblit support from console code. fix :cirrus_vga fix OOB read case qemu Segmentation fault # Conflicts: # include/hw/compat.h Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * cirrus: switch to 4 MB video memory by defaultGerd Hoffmann2017-03-161-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Quoting cirrus source code: Follow real hardware, cirrus card emulated has 4 MB video memory. Also accept 8 MB/16 MB for backward compatibility. So just use 4MB by default. We decided to leave that at 8MB by default a while ago, for live migration compatibility reasons. But we have compat properties to handle that, so that isn't a compeling reason. This also removes some sanity check inconsistencies in the cirrus code. Some places check against the allocated video memory, some places check against the 4MB physical hardware has. Guest code can trigger asserts because of that. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Message-id: 1489494514-15606-1-git-send-email-kraxel@redhat.com
| * cirrus/vnc: zap bitblit support from console code.Gerd Hoffmann2017-03-161-7/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a special code path (dpy_gfx_copy) to allow graphic emulation notify user interface code about bitblit operations carryed out by guests. It is supported by cirrus and vnc server. The intended purpose is to optimize display scrolls and just send over the scroll op instead of a full display update. This is rarely used these days though because modern guests simply don't use the cirrus blitter any more. Any linux guest using the cirrus drm driver doesn't. Any windows guest newer than winxp doesn't ship with a cirrus driver any more and thus uses the cirrus as simple framebuffer. So this code tends to bitrot and bugs can go unnoticed for a long time. See for example commit "3e10c3e vnc: fix qemu crash because of SIGSEGV" which fixes a bug lingering in the code for almost a year, added by commit "c7628bf vnc: only alloc server surface with clients connected". Also the vnc server will throttle the frame rate in case it figures the network can't keep up (send buffers are full). This doesn't work with dpy_gfx_copy, for any copy operation sent to the vnc client we have to send all outstanding updates beforehand, otherwise the vnc client might run the client side blit on outdated data and thereby corrupt the display. So this dpy_gfx_copy "optimization" might even make things worse on slow network links. Lets kill it once for all. Oh, and one more reason: Turns out (after writing the patch) we have a security bug in that code path ... Fixes: CVE-2016-9603 Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Message-id: 1489494419-14340-1-git-send-email-kraxel@redhat.com
* | Merge remote-tracking branch 'remotes/juanquintela/tags/migration/20170316' ↵Peter Maydell2017-03-162-1/+5
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | into staging migration/next for 20170316 # gpg: Signature made Thu 16 Mar 2017 08:21:51 GMT # gpg: using RSA key 0xF487EF185872D723 # gpg: Good signature from "Juan Quintela <quintela@redhat.com>" # gpg: aka "Juan Quintela <quintela@trasno.org>" # Primary key fingerprint: 1899 FF8E DEBF 58CC EE03 4B82 F487 EF18 5872 D723 * remotes/juanquintela/tags/migration/20170316: postcopy: Check for shared memory RAMBlocks: qemu_ram_is_shared vmstate: fix failed iotests case 68 and 91 migration/block: Avoid invoking blk_drain too frequently migration: use "" as the default for tls-creds/hostname Change the method to calculate dirty-pages-rate Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | RAMBlocks: qemu_ram_is_sharedDr. David Alan Gilbert2017-03-161-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Provide a helper to say whether a RAMBlock was created as a shared mapping. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
| * | Change the method to calculate dirty-pages-rateChao Fan2017-03-161-1/+4
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In function cpu_physical_memory_sync_dirty_bitmap, file include/exec/ram_addr.h: if (src[idx][offset]) { unsigned long bits = atomic_xchg(&src[idx][offset], 0); unsigned long new_dirty; new_dirty = ~dest[k]; dest[k] |= bits; new_dirty &= bits; num_dirty += ctpopl(new_dirty); } After these codes executed, only the pages not dirtied in bitmap(dest), but dirtied in dirty_memory[DIRTY_MEMORY_MIGRATION] will be calculated. For example: When ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] = 0b00001111, and atomic_rcu_read(&migration_bitmap_rcu)->bmap = 0b00000011, the new_dirty will be 0b00001100, and this function will return 2 but not 4 which is expected. the dirty pages in dirty_memory[DIRTY_MEMORY_MIGRATION] are all new, so these should be calculated also. Signed-off-by: Chao Fan <fanc.fnst@cn.fujitsu.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
* | Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into stagingPeter Maydell2017-03-163-0/+23
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | virtio, pci: fixes More fixes missed in the previous pull request. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Thu 16 Mar 2017 02:29:49 GMT # gpg: using RSA key 0x281F0DB8D28D5469 # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * remotes/mst/tags/for_upstream: virtio-serial-bus: Delete timer from list before free it hw/virtio: fix Power Management Control Register for PCI Express virtio devices hw/virtio: fix Link Control Register for PCI Express virtio devices hw/virtio: fix error enabling flags in Device Control register hw/pcie: fix Extended Configuration Space for devices with no Extended Capabilities Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | hw/virtio: fix Power Management Control Register for PCI Express virtio devicesMarcel Apfelbaum2017-03-162-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | Make Power Management State flag writable to conform with the PCI Express spec. Signed-off-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * | hw/virtio: fix Link Control Register for PCI Express virtio devicesMarcel Apfelbaum2017-03-162-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | Make several Link Control Register flags writable to conform with the PCI Express spec. Signed-off-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * | hw/virtio: fix error enabling flags in Device Control registerMarcel Apfelbaum2017-03-161-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | When the virtio devices are PCI Express, make error-enabling flags writable to respect the PCIe spec. Signed-off-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * | hw/pcie: fix Extended Configuration Space for devices with no Extended ↵Marcel Apfelbaum2017-03-162-0/+6
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Capabilities Absence of any Extended Capabilities is required to be indicated by an Extended Capability header with a Capability ID of 0000h, a Capability Version of 0h, and a Next Capability Offset of 000h. Instead of inserting a 'NULL' capability is simpler to mark the start of the Extended Configuration Space as read-only to achieve the same behaviour. Signed-off-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* / ide: core: add cleanup functionLi Qiang2017-03-161-0/+1
|/ | | | | | | | | | As the pci ahci can be hotplug and unplug, in the ahci unrealize function it should free all the resource once allocated in the realized function. This patch add ide_exit to free the resource. Signed-off-by: Li Qiang <liqiang6-s@360.cn> Message-id: 1488449293-80280-3-git-send-email-liqiang6-s@360.cn Signed-off-by: John Snow <jsnow@redhat.com>
* pci: introduce a bus master containerJason Wang2017-03-151-0/+1
| | | | | | | | | | | | | | | | | 96a8821d2141 ("virtio: unbreak virtio-pci with IOMMU after caching ring translations") tries to make IOMMU works with virtio memory region cache, but it requires IOMMU to be created before any virtio devices. This is sub optimal, fixing this by introduce a bus master container to make sure address space can be initialized during device registering, and then we can safely set alias and make bus_master_enable_region as its subregion during bus master initialization. Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* icount: process QEMU_CLOCK_VIRTUAL timers in vCPU threadPaolo Bonzini2017-03-141-0/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | icount has become much slower after tcg_cpu_exec has stopped using the BQL. There is also a latent bug that is masked by the slowness. The slowness happens because every occurrence of a QEMU_CLOCK_VIRTUAL timer now has to wake up the I/O thread and wait for it. The rendez-vous is mediated by the BQL QemuMutex: - handle_icount_deadline wakes up the I/O thread with BQL taken - the I/O thread wakes up and waits on the BQL - the VCPU thread releases the BQL a little later - the I/O thread raises an interrupt, which calls qemu_cpu_kick - the VCPU thread notices the interrupt, takes the BQL to process it and waits on it All this back and forth is extremely expensive, causing a 6 to 8-fold slowdown when icount is turned on. One may think that the issue is that the VCPU thread is too dependent on the BQL, but then the latent bug comes in. I first tried removing the BQL completely from the x86 cpu_exec, only to see everything break. The only way to fix it (and make everything slow again) was to add a dummy BQL lock/unlock pair. This is because in -icount mode you really have to process the events before the CPU restarts executing the next instruction. Therefore, this series moves the processing of QEMU_CLOCK_VIRTUAL timers straight in the vCPU thread when running in icount mode. The required changes include: - make the timer notification callback wake up TCG's single vCPU thread when run from another thread. By using async_run_on_cpu, the callback can override all_cpu_threads_idle() when the CPU is halted. - move handle_icount_deadline after qemu_tcg_wait_io_event, so that the timer notification callback is invoked after the dummy work item wakes up the vCPU thread - make handle_icount_deadline run the timers instead of just waking the I/O thread. - stop processing the timers in the main loop Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* cpus: define QEMUTimerListNotifyCB for QEMU system emulationPaolo Bonzini2017-03-142-2/+3
| | | | | | | | There is no change for now, because the callback just invokes qemu_notify_event. Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* qemu-timer: do not include sysemu/cpus.h from util/qemu-timer.hPaolo Bonzini2017-03-142-1/+2
| | | | | | | | | This dependency is the wrong way, and we will need util/qemu-timer.h from sysemu/cpus.h in the next patch. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* mem-prealloc: reduce large guest start-up and migration time.Jitendra Kolhe2017-03-141-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Using "-mem-prealloc" option for a large guest leads to higher guest start-up and migration time. This is because with "-mem-prealloc" option qemu tries to map every guest page (create address translations), and make sure the pages are available during runtime. virsh/libvirt by default, seems to use "-mem-prealloc" option in case the guest is configured to use huge pages. The patch tries to map all guest pages simultaneously by spawning multiple threads. Currently limiting the change to QEMU library functions on POSIX compliant host only, as we are not sure if the problem exists on win32. Below are some stats with "-mem-prealloc" option for guest configured to use huge pages. ------------------------------------------------------------------------ Idle Guest | Start-up time | Migration time ------------------------------------------------------------------------ Guest stats with 2M HugePage usage - single threaded (existing code) ------------------------------------------------------------------------ 64 Core - 4TB | 54m11.796s | 75m43.843s 64 Core - 1TB | 8m56.576s | 14m29.049s 64 Core - 256GB | 2m11.245s | 3m26.598s ------------------------------------------------------------------------ Guest stats with 2M HugePage usage - map guest pages using 8 threads ------------------------------------------------------------------------ 64 Core - 4TB | 5m1.027s | 34m10.565s 64 Core - 1TB | 1m10.366s | 8m28.188s 64 Core - 256GB | 0m19.040s | 2m10.148s ----------------------------------------------------------------------- Guest stats with 2M HugePage usage - map guest pages using 16 threads ----------------------------------------------------------------------- 64 Core - 4TB | 1m58.970s | 31m43.400s 64 Core - 1TB | 0m39.885s | 7m55.289s 64 Core - 256GB | 0m11.960s | 2m0.135s ----------------------------------------------------------------------- Changed in v2: - modify number of memset threads spawned to min(smp_cpus, 16). - removed 64GB memory restriction for spawning memset threads. Changed in v3: - limit number of threads spawned based on min(sysconf(_SC_NPROCESSORS_ONLN), 16, smp_cpus) - implement memset thread specific siglongjmp in SIGBUS signal_handler. Changed in v4 - remove sigsetjmp/siglongjmp and SIGBUS unblock/block for main thread as main thread no longer touches any pages. - simplify code my returning memset_thread_failed status from touch_all_pages. Signed-off-by: Jitendra Kolhe <jitendra.kolhe@hpe.com> Message-Id: <1487907103-32350-1-git-send-email-jitendra.kolhe@hpe.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* memory_region: Fix name commentsDr. David Alan Gilbert2017-03-141-6/+12
| | | | | | | | | | | The 'name' parameter to memory_region_init_* had been marked as debug only, however vmstate_region_ram uses it as a parameter to qemu_ram_set_idstr to set RAMBlock names and these form part of the migration stream. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20170309152708.30635-1-dgilbert@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.9-20170314' ↵Peter Maydell2017-03-141-0/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | into staging ppc patch queue for 2017-03-14 This set has a handful og bugfixes to go into qemu-2.9. This includes an update to the dtc/libfdt submodule which will fix the build errors seen on some distributions. # gpg: Signature made Tue 14 Mar 2017 04:00:41 GMT # gpg: using RSA key 0x6C38CACA20D9B392 # gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" # gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" # gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" # gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" # Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392 * remotes/dgibson/tags/ppc-for-2.9-20170314: dtc: Update submodule to avoid build errors pseries: Don't expose PCIe extended config space on older machine types target/ppc: fix cpu_ov setting for 32-bit target/ppc: Fix wrong number of UAMR register Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * pseries: Don't expose PCIe extended config space on older machine typesDavid Gibson2017-03-141-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | bb9986452 "spapr_pci: Advertise access to PCIe extended config space" allowed guests to access the extended config space of PCI Express devices via the PAPR interfaces, even though the paravirtualized bus mostly acts like plain PCI. However, that patch enabled access unconditionally, including for existing machine types, which is an unwise change in behaviour. This patch limits the change to pseries-2.9 (and later) machine types. Suggested-by: Andrea Bolognani <abologna@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* | build: include sys/sysmacros.h for major() and minor()Christopher Covington2017-03-141-0/+4
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The definition of the major() and minor() macros are moving within glibc to <sys/sysmacros.h>. Include this header when it is available to avoid the following sorts of build-stopping messages: qga/commands-posix.c: In function ‘dev_major_minor’: qga/commands-posix.c:656:13: error: In the GNU C Library, "major" is defined by <sys/sysmacros.h>. For historical compatibility, it is currently defined by <sys/types.h> as well, but we plan to remove this soon. To use "major", include <sys/sysmacros.h> directly. If you did not intend to use a system-defined macro "major", you should undefine it after including <sys/types.h>. [-Werror] *devmajor = major(st.st_rdev); ^~~~~~~~~~~~~~~~~~~~~~~~~~ qga/commands-posix.c:657:13: error: In the GNU C Library, "minor" is defined by <sys/sysmacros.h>. For historical compatibility, it is currently defined by <sys/types.h> as well, but we plan to remove this soon. To use "minor", include <sys/sysmacros.h> directly. If you did not intend to use a system-defined macro "minor", you should undefine it after including <sys/types.h>. [-Werror] *devminor = minor(st.st_rdev); ^~~~~~~~~~~~~~~~~~~~~~~~~~ The additional include allows the build to complete on Fedora 26 (Rawhide) with glibc version 2.24.90. Signed-off-by: Christopher Covington <cov@codeaurora.org> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* i386: Change stepping of Haswell to non-blacklisted valueEduardo Habkost2017-03-101-0/+5
| | | | | | | | | | | | | | glibc blacklists TSX on Haswell CPUs with model==60 and stepping < 4. To make the Haswell CPU model more useful, make those guests actually use TSX by changing CPU stepping to 4. References: * glibc commit 2702856bf45c82cf8e69f2064f5aa15c0ceb6359 https://sourceware.org/git/?p=glibc.git;a=commit;h=2702856bf45c82cf8e69f2064f5aa15c0ceb6359 Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20170309181212.18864-4-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
* Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into stagingPeter Maydell2017-03-082-5/+5
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Block layer fixes for 2.9.0-rc0 # gpg: Signature made Tue 07 Mar 2017 14:59:18 GMT # gpg: using RSA key 0x7F09B272C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * remotes/kevin/tags/for-upstream: (27 commits) commit: Don't use error_abort in commit_start block: Don't use error_abort in blk_new_open sheepdog: Support blockdev-add qapi-schema: Rename SocketAddressFlat's variant tcp to inet qapi-schema: Rename GlusterServer to SocketAddressFlat gluster: Plug memory leaks in qemu_gluster_parse_json() gluster: Don't duplicate qapi-util.c's qapi_enum_parse() gluster: Drop assumptions on SocketTransport names sheepdog: Implement bdrv_parse_filename() sheepdog: Use SocketAddress and socket_connect() sheepdog: Report errors in pseudo-filename more usefully sheepdog: Don't truncate long VDI name in _open(), _create() sheepdog: Fix snapshot ID parsing in _open(), _create, _goto() sheepdog: Mark sd_snapshot_delete() lossage FIXME sheepdog: Fix error handling sd_create() sheepdog: Fix error handling in sd_snapshot_delete() sheepdog: Defuse time bomb in sd_open() error handling block: Fix error handling in bdrv_replace_in_backing_chain() block: Handle permission errors in change_parent_backing_link() block: Ignore multiple children in bdrv_check_update_perm() ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * block: Fix error handling in bdrv_replace_in_backing_chain()Kevin Wolf2017-03-072-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When adding an Error parameter, bdrv_replace_in_backing_chain() would become nothing more than a wrapper around change_parent_backing_link(). So make the latter public, renamed as bdrv_replace_node(), and remove bdrv_replace_in_backing_chain(). Most of the callers just remove a node from the graph that they just inserted, so they can use &error_abort, but completion of a mirror job with 'replaces' set can actually fail. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
| * block: Ignore multiple children in bdrv_check_update_perm()Kevin Wolf2017-03-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | change_parent_backing_link() will need to update multiple BdrvChild objects at once. Checking permissions reference by reference doesn't work because permissions need to be consistent only with all parents moved to the new child. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
* | qapi: New qobject_input_visitor_new_str() for convenienceMarkus Armbruster2017-03-071-0/+12
| | | | | | | | | | Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1488317230-26248-21-git-send-email-armbru@redhat.com>
* | qapi: New parse_qapi_name()Markus Armbruster2017-03-071-0/+2
| | | | | | | | | | | | Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1488317230-26248-19-git-send-email-armbru@redhat.com>
* | qobject: Propagate parse errors through qobject_from_json()Markus Armbruster2017-03-071-1/+1
| | | | | | | | | | | | | | | | | | The next few commits will put the errors to use where appropriate. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1488317230-26248-13-git-send-email-armbru@redhat.com>
* | qobject: Propagate parse errors through qobject_from_jsonv()Markus Armbruster2017-03-071-1/+2
| | | | | | | | | | | | | | | | The next few commits will put the errors to use where appropriate. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <1488317230-26248-9-git-send-email-armbru@redhat.com>
* | qapi: qobject input visitor variant for use with keyval_parse()Daniel P. Berrange2017-03-071-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the QObjectInputVisitor assumes that all scalar values are directly represented as the final types declared by the thing being visited. i.e. it assumes an 'int' is using QInt, and a 'bool' is using QBool, etc. This is good when QObjectInputVisitor is fed a QObject that came from a JSON document on the QMP monitor, as it will strictly validate correctness. To allow QObjectInputVisitor to be reused for visiting a QObject originating from keyval_parse(), an alternative mode is needed where all the scalars types are represented as QString and converted on the fly to the final desired type. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-Id: <1475246744-29302-8-git-send-email-berrange@redhat.com> Rebased, conflicts resolved, commit message updated to refer to keyval_parse(). autocast replaced by keyval in identifiers, noautocast replaced by fail in tests. Fix qobject_input_type_uint64_keyval() not to reject '-', for QemuOpts compatibility: replace parse_uint_full() by open-coded parse_option_number(). The next commit will add suitable tests. Leave out the fancy ERANGE error reporting for now, but add a TODO comment. Add it qobject_input_type_int64_keyval() and qobject_input_type_number_keyval(), too. Open code parse_option_bool() and parse_option_size() so we have to call qobject_input_get_name() only when actually needed. Again, leave out ERANGE error reporting for now. QAPI/QMP downstream extension prefixes __RFQDN_ don't work, because keyval_parse() splits them at '.'. This will be addressed later in the series. qobject_input_type_int64_keyval(), qobject_input_type_uint64_keyval(), qobject_input_type_number_keyval() tweaked for style. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <1488317230-26248-5-git-send-email-armbru@redhat.com>
* | keyval: New keyval_parse()Markus Armbruster2017-03-071-0/+3
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | keyval_parse() parses KEY=VALUE,... into a QDict. Works like qemu_opts_parse(), except: * Returns a QDict instead of a QemuOpts (d'oh). * Supports nesting, unlike QemuOpts: a KEY is split into key fragments at '.' (dotted key convention; the block layer does something similar on top of QemuOpts). The key fragments are QDict keys, and the last one's value is updated to VALUE. * Each key fragment may be up to 127 bytes long. qemu_opts_parse() limits the entire key to 127 bytes. * Overlong key fragments are rejected. qemu_opts_parse() silently truncates them. * Empty key fragments are rejected. qemu_opts_parse() happily accepts empty keys. * It does not store the returned value. qemu_opts_parse() stores it in the QemuOptsList. * It does not treat parameter "id" specially. qemu_opts_parse() ignores all but the first "id", and fails when its value isn't id_wellformed(), or duplicate (a QemuOpts with the same ID is already stored). It also screws up when a value contains ",id=". * Implied value is not supported. qemu_opts_parse() desugars "foo" to "foo=on", and "nofoo" to "foo=off". * An implied key's value can't be empty, and can't contain ','. I intend to grow this into a saner replacement for QemuOpts. It'll take time, though. Note: keyval_parse() provides no way to do lists, and its key syntax is incompatible with the __RFQDN_ prefix convention for downstream extensions, because it blindly splits at '.', even in __RFQDN_. Both issues will be addressed later in the series. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1488317230-26248-4-git-send-email-armbru@redhat.com>
* Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into ↵Peter Maydell2017-03-061-2/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | staging # gpg: Signature made Mon 06 Mar 2017 04:15:17 GMT # gpg: using RSA key 0xEF04965B398D6211 # gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>" # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211 * remotes/jasowang/tags/net-pull-request: net/filter-mirror: Follow CODING_STYLE COLO-compare: Fix icmp and udp compare different packet always dump bug COLO-compare: Optimize compare_common and compare_tcp COLO-compare: Rename compare function and remove duplicate codes filter-rewriter: skip net_checksum_calculate() while offset = 0 net/colo: fix memory double free error vmxnet3: VMStatify rx/tx q_descr and int_state vmxnet3: Convert ring values to uint32_t's net/colo-compare: Fix memory free error colo-compare: Fix removing fds been watched incorrectly in finalization char: remove the right fd been watched in qemu_chr_fe_set_handlers() colo-compare: kick compare thread to exit after some cleanup in finalization colo-compare: use g_timeout_source_new() to process the stale packets NetRxPkt: Remove code duplication in net_rx_pkt_pull_data() NetRxPkt: Account buffer with ETH header in IOV length NetRxPkt: Do not try to pull more data than present NetRxPkt: Fix memory corruption on VLAN header stripping eth: Extend vlan stripping functions net: Remove useless local var pkt Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * eth: Extend vlan stripping functionsDmitry Fleytman2017-03-061-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | Make VLAN stripping functions return number of bytes copied to given Ethernet header buffer. This information should be used to re-compose packet IOV after VLAN stripping. Cc: qemu-stable@nongnu.org Signed-off-by: Dmitry Fleytman <dmitry@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
* | qapi: Improve qobject visitor documentationMarkus Armbruster2017-03-052-5/+67
| | | | | | | | | | | | Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1488544368-30622-29-git-send-email-armbru@redhat.com>
* | qapi: Make input visitors detect unvisited list tailsMarkus Armbruster2017-03-052-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the design flaw demonstrated in the previous commit: new method check_list() lets input visitors report that unvisited input remains for a list, exactly like check_struct() lets them report that unvisited input remains for a struct or union. Implement the method for the qobject input visitor (straightforward), and the string input visitor (less so, due to the magic list syntax there). The opts visitor's list magic is even more impenetrable, and all I can do there today is a stub with a FIXME comment. No worse than before. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1488544368-30622-26-git-send-email-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
* | qapi: Drop unused non-strict qobject input visitorMarkus Armbruster2017-03-051-4/+1Star
| | | | | | | | | | | | | | | | | | | | The split between tests/test-qobject-input-visitor.c and tests/test-qobject-input-strict.c now makes less sense than ever. The next commit will take care of that. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1488544368-30622-20-git-send-email-armbru@redhat.com>
* | qapi: Drop string input visitor method optional()Markus Armbruster2017-03-051-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | visit_optional() is to be called only between visit_start_struct() and visit_end_struct(). Visitors that don't support struct visits, i.e. don't implement start_struct(), end_struct(), have no use for it. Clarify documentation. The string input visitor doesn't support struct visits. Its parse_optional() is therefore useless. Drop it. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1488544368-30622-16-git-send-email-armbru@redhat.com>
* | qapi: Improve qobject input visitor error reportingMarkus Armbruster2017-03-051-6/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Error messages refer to nodes of the QObject being visited by name. Trouble is the names are sometimes less than helpful: * The name of the root QObject is whatever @name argument got passed to the visitor, except NULL gets mapped to "null". We commonly pass NULL. Not good. Avoiding errors "at the root" mitigates. For instance, visit_start_struct() can only fail when the visited object is not a dictionary, and we commonly ensure it is beforehand. * The name of a QDict's member is the member key. Good enough only when this happens to be unique. * The name of a QList's member is "null". Not good. Improve error messages by referring to nodes by path instead, as follows: * The path of the root QObject is whatever @name argument got passed to the visitor, except NULL gets mapped to "<anonymous>". * The path of a root QDict's member is the member key. * The path of a root QList's member is "[%u]", where %u is the list index, starting at zero. * The path of a non-root QDict's member is the path of the QDict concatenated with "." and the member key. * The path of a non-root QList's member is the path of the QList concatenated with "[%u]", where %u is the list index. For example, the incorrect QMP command { "execute": "blockdev-add", "arguments": { "node-name": "foo", "driver": "raw", "file": {"driver": "file" } } } now fails with {"error": {"class": "GenericError", "desc": "Parameter 'file.filename' is missing"}} instead of {"error": {"class": "GenericError", "desc": "Parameter 'filename' is missing"}} and { "execute": "input-send-event", "arguments": { "device": "bar", "events": [ [] ] } } now fails with {"error": {"class": "GenericError", "desc": "Invalid parameter type for 'events[0]', expected: object"}} instead of {"error": {"class": "GenericError", "desc": "Invalid parameter type for 'null', expected: QDict"}} Aside: calling the thing "parameter" is suboptimal for QMP, because the root object is "arguments" there. The qobject output visitor doesn't have this problem because it should not fail. Same for dealloc and clone visitors. The string visitors don't have this problem because they visit just one value, whose name needs to be passed to the visitor as @name. The string output visitor shouldn't fail anyway. The options visitor uses QemuOpts names. Their name space is flat, so the use of QDict member keys as names is fine. NULL names used with roots and lists could conceivably result in bad error messages. Left for another day. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1488544368-30622-15-git-send-email-armbru@redhat.com>
* | qmp: Eliminate silly QERR_QMP_* macrosMarkus Armbruster2017-03-051-9/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The QERR_ macros are leftovers from the days of "rich" error objects. QERR_QMP_BAD_INPUT_OBJECT, QERR_QMP_BAD_INPUT_OBJECT_MEMBER, QERR_QMP_EXTRA_MEMBER are used in just one place now, except for one use that has crept into qobject-input-visitor.c. Drop these macros, to make the (bad) error messages more visible. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1488544368-30622-10-git-send-email-armbru@redhat.com>
* | qapi: Support multiple command registries per programMarkus Armbruster2017-03-051-8/+14
| | | | | | | | | | | | | | | | | | | | | | The command registry encapsulates a single command list. Give the functions using it a parameter instead. Define suitable command lists in monitor, guest agent and test-qmp-commands. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1488544368-30622-6-git-send-email-armbru@redhat.com> [Debugging turds buried] Reviewed-by: Eric Blake <eblake@redhat.com>
* | qmp: Dumb down how we run QMP command registrationMarkus Armbruster2017-03-052-2/+1Star
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The way we get QMP commands registered is high tech: * qapi-commands.py generates qmp_init_marshal() that does the actual work * it also generates the magic to register it as a MODULE_INIT_QAPI function, so it runs when someone calls module_call_init(MODULE_INIT_QAPI) * main() calls module_call_init() QEMU needs to register a few non-qapified commands. Same high tech works: monitor.c has its own qmp_init_marshal() along with the magic to make it run in module_call_init(MODULE_INIT_QAPI). QEMU also needs to unregister commands that are not wanted in this build's configuration (commit 5032a16). Simple enough: qmp_unregister_commands_hack(). The difficulty is to make it run after the generated qmp_init_marshal(). We can't simply run it in monitor.c's qmp_init_marshal(), because the order in which the registered functions run is indeterminate. So qmp_init_marshal() registers qmp_unregister_commands_hack() separately. Since registering *appends* to the list of registered functions, this will make it run after all the functions that have been registered already. I suspect it takes a long and expensive computer science education to not find this silly. Dumb it down as follows: * Drop MODULE_INIT_QAPI entirely * Give the generated qmp_init_marshal() external linkage. * Call it instead of module_call_init(MODULE_INIT_QAPI) * Except in QEMU proper, call new monitor_init_qmp_commands() that in turn calls the generated qmp_init_marshal(), registers the additional commands and unregisters the unwanted ones. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1488544368-30622-5-git-send-email-armbru@redhat.com>