summaryrefslogtreecommitdiffstats
path: root/memory.c
Commit message (Collapse)AuthorAgeFilesLines
* include/qemu/osdep.h: Don't include qapi/error.hMarkus Armbruster2016-03-221-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 57cb38b included qapi/error.h into qemu/osdep.h to get the Error typedef. Since then, we've moved to include qemu/osdep.h everywhere. Its file comment explains: "To avoid getting into possible circular include dependencies, this file should not include any other QEMU headers, with the exceptions of config-host.h, compiler.h, os-posix.h and os-win32.h, all of which are doing a similar job to this file and are under similar constraints." qapi/error.h doesn't do a similar job, and it doesn't adhere to similar constraints: it includes qapi-types.h. That's in excess of 100KiB of crap most .c files don't actually need. Add the typedef to qemu/typedefs.h, and include that instead of qapi/error.h. Include qapi/error.h in .c files that need it and don't get it now. Include qapi-types.h in qom/object.h for uint16List. Update scripts/clean-includes accordingly. Update it further to match reality: replace config.h by config-target.h, add sysemu/os-posix.h, sysemu/os-win32.h. Update the list of includes in the qemu/osdep.h comment quoted above similarly. This reduces the number of objects depending on qapi/error.h from "all of them" to less than a third. Unfortunately, the number depending on qapi-types.h shrinks only a little. More work is needed for that one. Signed-off-by: Markus Armbruster <armbru@redhat.com> [Fix compilation without the spice devel packages. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* trace: separate MMIO tracepoints from TB-access tracepointsHollis Blanchard2016-03-141-0/+30
| | | | | | | | | | | Memory accesses to code which has previously been translated into a TB show up in the MMIO path, so that they may invalidate the TB. It's extremely confusing to mix those in with device MMIOs, so split them into their own tracepoint. Signed-off-by: Hollis Blanchard <hollis_blanchard@mentor.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1456949575-1633-2-git-send-email-hollis_blanchard@mentor.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* trace: include CPU index in trace_memory_region_*()Hollis Blanchard2016-03-141-12/+20
| | | | | | | | | | | | | | Knowing which CPU performed an action is essential for understanding SMP guest behavior. However, cpu_physical_memory_rw() may be executed by a machine init function, before any VCPUs are running, when there is no CPU running ('current_cpu' is NULL). In this case, store -1 in the trace record as the CPU index. Trace analysis tools may need to be aware of this special case. Signed-off-by: Hollis Blanchard <hollis_blanchard@mentor.com> Message-id: 1456949575-1633-1-git-send-email-hollis_blanchard@mentor.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* exec: Pass RAMBlock pointer to qemu_ram_freeFam Zheng2016-03-071-2/+2
| | | | | | | | | | The only caller now knows exactly which RAMBlock to free, so it's not necessary to do the lookup. Reviewed-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <1456813104-25902-6-git-send-email-famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* memory: Drop MemoryRegion.ram_addrFam Zheng2016-03-071-44/+27Star
| | | | | | | | | | | | All references to mr->ram_addr are replaced by memory_region_get_ram_addr(mr) (except for a few assertions that are replaced with mr->ram_block). Reviewed-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <1456813104-25902-5-git-send-email-famz@redhat.com> Acked-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* memory: Implement memory_region_get_ram_addr with mr->ram_blockFam Zheng2016-03-071-0/+5
| | | | | | Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <1456813104-25902-4-git-send-email-famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* memory: Move assignment to ram_block to memory_region_init_*Fam Zheng2016-03-071-0/+5
| | | | | | | | | | | | | We don't force "const" qualifiers with pointers in QEMU, but it's still good to keep a clean function interface. Assigning to mr->ram_block is in this sense ugly - one initializer mutating its owning object's state. Move it to memory_region_init_*, where mr->ram_addr is assigned. Reviewed-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <1456813104-25902-3-git-send-email-famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* exec: Return RAMBlock pointer from allocating functionsFam Zheng2016-03-071-5/+20
| | | | | | | | | | | | Previously we return RAMBlock.offset; now return the pointer to the whole structure. ram_block_add returns void now, error is completely passed with errp. Reviewed-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <1456813104-25902-2-git-send-email-famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* trace: use addresses instead of offsets in memory tracepointsHollis Blanchard2016-03-011-12/+32
| | | | | | | | | | | | | When memory_region_ops tracepoints are enabled, calculate and record the absolute address being accessed. Otherwise, we only get offsets into the memory region instead of addresses. [Fixed "offset" -> "addr" in trace event format strings. --Stefan] Signed-off-by: Hollis Blanchard <hollis_blanchard@mentor.com> Message-id: 1454976185-30095-3-git-send-email-hollis_blanchard@mentor.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* trace: split subpage MMIOs into their own trace events.Hollis Blanchard2016-03-011-6/+30
| | | | | | | | | | | | Previously, a single MMIO could trigger the memory_region_ops tracepoint twice: once on its way into subpage ops, then later on its way into the model's ops. Also, the fields previously called "addr" are actually offsets into the memory region. Rename them to "offset" while we're editing the tracepoint definitions. Signed-off-by: Hollis Blanchard <hollis_blanchard@mentor.com> Message-id: 1454976185-30095-2-git-send-email-hollis_blanchard@mentor.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* memory: optimize qemu_get_ram_ptr and qemu_ram_ptr_lengthGonglei2016-02-251-1/+1
| | | | | | | | | | | | | | these two functions consume too much cpu overhead to find the RAMBlock by ram address. After this patch, we can pass the RAMBlock pointer to them so that they don't need to find the RAMBlock anymore most of the time. We can get better performance in address translation processing. Signed-off-by: Gonglei <arei.gonglei@huawei.com> Message-Id: <1455935721-8804-3-git-send-email-arei.gonglei@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* exec: store RAMBlock pointer into memory regionGonglei2016-02-251-0/+1
| | | | | | | | | | | | | | | | | Each RAM memory region has a unique corresponding RAMBlock. In the current realization, the memory region only stored the ram_addr which means the offset of RAM address space, We need to qurey the global ram.list to find the ram block by ram_addr if we want to get the ram block, which is very expensive. Now, we store the RAMBlock pointer into memory region structure. So, if we know the mr, we can easily get the RAMBlock. Signed-off-by: Gonglei <arei.gonglei@huawei.com> Message-Id: <1456130097-4208-2-git-send-email-arei.gonglei@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* qom: Swap 'name' next to visitor in ObjectPropertyAccessorEric Blake2016-02-081-8/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Similar to the previous patch, it's nice to have all functions in the tree that involve a visitor and a name for conversion to or from QAPI to consistently stick the 'name' parameter next to the Visitor parameter. Done by manually changing include/qom/object.h and qom/object.c, then running this Coccinelle script and touching up the fallout (Coccinelle insisted on adding some trailing whitespace). @ rule1 @ identifier fn; typedef Object, Visitor, Error; identifier obj, v, opaque, name, errp; @@ void fn - (Object *obj, Visitor *v, void *opaque, const char *name, + (Object *obj, Visitor *v, const char *name, void *opaque, Error **errp) { ... } @@ identifier rule1.fn; expression obj, v, opaque, name, errp; @@ fn(obj, v, - opaque, name, + name, opaque, errp) Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1454075341-13658-20-git-send-email-eblake@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
* qapi: Swap visit_* arguments for consistent 'name' placementEric Blake2016-02-081-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | JSON uses "name":value, but many of our visitor interfaces were called with visit_type_FOO(v, &value, name, errp). This can be a bit confusing to have to mentally swap the parameter order to match JSON order. It's particularly bad for visit_start_struct(), where the 'name' parameter is smack in the middle of the otherwise-related group of 'obj, kind, size' parameters! It's time to do a global swap of the parameter ordering, so that the 'name' parameter is always immediately after the Visitor argument. Additional reason in favor of the swap: the existing include/qjson.h prefers listing 'name' first in json_prop_*(), and I have plans to unify that file with the qapi visitors; listing 'name' first in qapi will minimize churn to the (admittedly few) qjson.h clients. Later patches will then fix docs, object.h, visitor-impl.h, and those clients to match. Done by first patching scripts/qapi*.py by hand to make generated files do what I want, then by running the following Coccinelle script to affect the rest of the code base: $ spatch --sp-file script `git grep -l '\bvisit_' -- '**/*.[ch]'` I then had to apply some touchups (Coccinelle insisted on TAB indentation in visitor.h, and botched the signature of visit_type_enum() by rewriting 'const char *const strings[]' to the syntactically invalid 'const char*const[] strings'). The movement of parameters is sufficient to provoke compiler errors if any callers were missed. // Part 1: Swap declaration order @@ type TV, TErr, TObj, T1, T2; identifier OBJ, ARG1, ARG2; @@ void visit_start_struct -(TV v, TObj OBJ, T1 ARG1, const char *name, T2 ARG2, TErr errp) +(TV v, const char *name, TObj OBJ, T1 ARG1, T2 ARG2, TErr errp) { ... } @@ type bool, TV, T1; identifier ARG1; @@ bool visit_optional -(TV v, T1 ARG1, const char *name) +(TV v, const char *name, T1 ARG1) { ... } @@ type TV, TErr, TObj, T1; identifier OBJ, ARG1; @@ void visit_get_next_type -(TV v, TObj OBJ, T1 ARG1, const char *name, TErr errp) +(TV v, const char *name, TObj OBJ, T1 ARG1, TErr errp) { ... } @@ type TV, TErr, TObj, T1, T2; identifier OBJ, ARG1, ARG2; @@ void visit_type_enum -(TV v, TObj OBJ, T1 ARG1, T2 ARG2, const char *name, TErr errp) +(TV v, const char *name, TObj OBJ, T1 ARG1, T2 ARG2, TErr errp) { ... } @@ type TV, TErr, TObj; identifier OBJ; identifier VISIT_TYPE =~ "^visit_type_"; @@ void VISIT_TYPE -(TV v, TObj OBJ, const char *name, TErr errp) +(TV v, const char *name, TObj OBJ, TErr errp) { ... } // Part 2: swap caller order @@ expression V, NAME, OBJ, ARG1, ARG2, ERR; identifier VISIT_TYPE =~ "^visit_type_"; @@ ( -visit_start_struct(V, OBJ, ARG1, NAME, ARG2, ERR) +visit_start_struct(V, NAME, OBJ, ARG1, ARG2, ERR) | -visit_optional(V, ARG1, NAME) +visit_optional(V, NAME, ARG1) | -visit_get_next_type(V, OBJ, ARG1, NAME, ERR) +visit_get_next_type(V, NAME, OBJ, ARG1, ERR) | -visit_type_enum(V, OBJ, ARG1, ARG2, NAME, ERR) +visit_type_enum(V, NAME, OBJ, ARG1, ARG2, ERR) | -VISIT_TYPE(V, OBJ, NAME, ERR) +VISIT_TYPE(V, NAME, OBJ, ERR) ) Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1454075341-13658-19-git-send-email-eblake@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
* all: Clean up includesPeter Maydell2016-02-041-1/+1
| | | | | | | | | | Clean up includes so that osdep.h is included first and headers which it implies are not included manually. This commit was created with scripts/clean-includes. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 1454089805-5470-16-git-send-email-peter.maydell@linaro.org
* memory: Add address_space_init_shareable()Peter Crosthwaite2016-01-211-0/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | This will either create a new AS or return a pointer to an already existing equivalent one, if we have already created an AS for the specified root memory region. The motivation is to reuse address spaces as much as possible. It's going to be quite common that bus masters out in device land have pointers to the same memory region for their mastering yet each will need to create its own address space. Let the memory API implement sharing for them. Aside from the perf optimisations, this should reduce the amount of redundant output on info mtree as well. Thee returned value will be malloced, but the malloc will be automatically freed when the AS runs out of refs. Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com> [PMM: dropped check for NULL root as unused; added doc-comment; squashed Peter C's reference-counting patch into this one; don't compare name string when deciding if we can share ASes; read as->malloced before the unref of as->root to avoid possible read-after-free if as->root was the owner of as] Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Acked-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
* memory: inline a few small accessorsPaolo Bonzini2015-12-171-20/+0Star
| | | | | | These are used in the address_space_* fast paths. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* memory: avoid unnecessary object_ref/unrefPaolo Bonzini2015-12-171-16/+12Star
| | | | | | | | | For the common case of DMA into non-hotplugged RAM, it is unnecessary but expensive to do object_ref/unref. Add back an owner field to MemoryRegion, so that these memory regions can skip the reference counting. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* exec: always call qemu_get_ram_ptr within rcu_read_lockPaolo Bonzini2015-12-171-4/+10
| | | | | | | Simplify the code and document the assumption. The only caller that is not within rcu_read_lock is memory_region_get_ram_ptr. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* memory: emulate ioeventfdPavel Fedin2015-12-171-0/+42
| | | | | | | | | | | | | | | | | | | | | | The ioeventfd mechanism is used by vhost, dataplane, and virtio-pci to turn guest MMIO/PIO writes into eventfd file descriptor events. This allows arbitrary threads to be notified when the guest writes to a specific MMIO/PIO address. qtest and TCG do not support ioeventfd because memory writes are not checked against registered ioeventfds in QEMU. This patch implements this in memory_region_dispatch_write() so qtest can use ioeventfd. Also this patch fixes vhost aborting on some misconfigured old kernels like 3.18.0 on ARM. It is possible to explicitly enable CONFIG_EVENTFD in expert settings, while MMIO binding support in KVM will still be missing. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Pavel Fedin <p.fedin@samsung.com> Message-Id: <006e01d12377$0b9c2d40$22d487c0$@samsung.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* memory: Eliminate memory_region_destructor_ram_from_ptr()Eduardo Habkost2015-12-171-6/+1Star
| | | | | | | | | The function is equivalent to memory_region_destructor_ram(), so it's not needed anymore. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <1446844805-14492-3-git-send-email-ehabkost@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* exec: Eliminate qemu_ram_free_from_ptr()Eduardo Habkost2015-12-171-1/+1
| | | | | | | | | | | | | | | | | Replace qemu_ram_free_from_ptr() with qemu_ram_free(). The only difference between qemu_ram_free_from_ptr() and qemu_ram_free() is that g_free_rcu() is used instead of call_rcu(reclaim_ramblock). We can safely replace it because: * RAM blocks allocated by qemu_ram_alloc_from_ptr() always have RAM_PREALLOC set; * reclaim_ramblock(block) will do nothing except g_free(block) if RAM_PREALLOC is set at block->flags. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <1446844805-14492-2-git-send-email-ehabkost@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* memory: don't try to adjust endianness for zero length eventfdJason Wang2015-11-121-2/+6
| | | | | | | | | | | | | | | There's no need to adjust endianness for zero length eventfd since the data wrote was actually ignored by kernel. So skip the adjust in this case to fix a possible crash when trying to use wildcard mmio eventfd in ppc. Cc: Greg Kurz <gkurz@linux.vnet.ibm.com> Cc: Peter Maydell <peter.maydell@linaro.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* memory: call begin, log_start and commit when registering a new listenerPaolo Bonzini2015-11-041-0/+9
| | | | | | | | | | | | | | | | This ensures that cpu_reload_memory_map() is called as soon as tcg_cpu_address_space_init() is called, and before cpu->memory_dispatch is used. qemu-system-s390x never changes the address spaces after tcg_cpu_address_space_init() is called, and thus tcg_commit() is never called. This causes a SIGSEGV. Because memory_map_init() will now call mem_commit(), we have to initialize io_mem_* before address_space_memory and friends. Reported-by: Philipp Kern <pkern@debian.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Fixes: 0a1c71cec63e95f9b8d0dc96d049d2daa00c5210 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* memory: allow destroying a non-empty MemoryRegionPaolo Bonzini2015-10-091-1/+16
| | | | | | | | | | | | | | This is legal; the MemoryRegion will simply unreference all the existing subregions and possibly bring them down with it as well. However, it requires a bit of care to avoid an infinite loop. Finalizing a memory region cannot trigger an address space update, but memory_region_del_subregion errs on the side of caution and might trigger a spurious update: avoid that by resetting mr->enabled first. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1443689999-12182-2-git-send-email-armbru@redhat.com>
* memory: Allow replay of IOMMU mapping notificationsDavid Gibson2015-10-051-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we have guest visible IOMMUs, we allow notifiers to be registered which will be informed of all changes to IOMMU mappings. This is used by vfio to keep the host IOMMU mappings in sync with guest IOMMU mappings. However, unlike with a memory region listener, an iommu notifier won't be told about any mappings which already exist in the (guest) IOMMU at the time it is registered. This can cause problems if hotplugging a VFIO device onto a guest bus which had existing guest IOMMU mappings, but didn't previously have an VFIO devices (and hence no host IOMMU mappings). This adds a memory_region_iommu_replay() function to handle this case. It replays any existing mappings in an IOMMU memory region to a specified notifier. Because the IOMMU memory region doesn't internally remember the granularity of the guest IOMMU it has a small hack where the caller must specify a granularity at which to replay mappings. If there are finer mappings in the guest IOMMU these will be reported in the iotlb structures passed to the notifier which it must handle (probably causing it to flag an error). This isn't new - the VFIO iommu notifier must already handle notifications about guest IOMMU mappings too short for it to represent in the host IOMMU. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
* memory: Fix bad error handling in memory_region_init_ram_ptr()Markus Armbruster2015-09-181-1/+1
| | | | | | | | | | | | | | | | | Commit ef701d7 screwed up handling of out-of-memory conditions. Before the commit, we report the error and exit(1), in one place. The commit lifts the error handling up the call chain some, to three places. Fine. Except it uses &error_abort in these places, changing the behavior from exit(1) to abort(), and thus undoing the work of commit 3922825 "exec: Don't abort when we can't allocate guest memory". The previous two commits fixed one of the three places, another one was fixed in commit 33e0eb5. This commit fixes the third one. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1441983105-26376-5-git-send-email-armbru@redhat.com> Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
* Merge memory_region_init_reservation() into memory_region_init_io()Pavel Fedin2015-08-131-9/+1Star
| | | | | | | | | | | Just specifying ops = NULL in some cases can be more convenient than having two functions. Signed-off-by: Pavel Fedin <p.fedin@samsung.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 78a379ab1b6b30ab497db7971ad336dad1dbee76.1438758065.git.p.fedin@samsung.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* memory: do not add a reference to the owner of aliased regionsPaolo Bonzini2015-07-271-7/+0Star
| | | | | | | | | | | | | | | | | | | | Very often the owner of the aliased region is the same as the owner of the alias region itself. When this happens, the reference count can never go back to 0 and the owner is leaked. This is for example breaking hot-unplug of virtio-pci devices (the device cannot be plugged back again with the same id). Another common use for alias is to transform the system I/O address space into an MMIO regions; in this case the aliased region never dies, so there is no problem. Otherwise the owner is always the same for aliasing and aliased region. I checked all calls to memory_region_init_alias introduced after commit dfde4e6 (memory: add ref/unref calls, 2013-05-06) and they do not need the reference in order to keep the owner of the aliased region alive. Reported-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* memory: count number of active VGA logging clientsPaolo Bonzini2015-07-241-0/+7
| | | | | | | | | | For a board that has multiple framebuffer devices, both of them might want to use DIRTY_MEMORY_VGA on the same memory region. The lack of reference counting in memory_region_set_log makes this very awkward to implement. Suggested-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* memory: fix refcount leak in memory_region_presentPaolo Bonzini2015-07-161-16/+28
| | | | | | | | | | | | | | | memory_region_present() leaks a reference to a MemoryRegion in the case "mr == container". While fixing it, avoid reference counting altogether for memory_region_present(), by using RCU only. The return value could in principle be already invalid immediately after memory_region_present returns, but presumably the caller knows that and it's using memory_region_present to probe for devices that are unpluggable, or something like that. The RCU critical section is needed anyway, because it protects as->current_map. Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* exec: pull qemu_flush_coalesced_mmio_buffer() into address_space_rw/ld*/st*Paolo Bonzini2015-07-011-12/+0Star
| | | | | | | | | As memory_region_read/write_accessor will now be run also without BQL held, we need to move coalesced MMIO flushing earlier in the dispatch process. Cc: Frederic Konrad <fred.konrad@greensocs.com> Message-Id: <1434646046-27150-5-git-send-email-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* memory: Add global-locking property to memory regionsJan Kiszka2015-07-011-0/+11
| | | | | | | | | | | | This introduces the memory region property "global_locking". It is true by default. By setting it to false, a device model can request BQL-free dispatching of region accesses to its r/w handlers. The actual BQL break-up will be provided in a separate patch. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Cc: Frederic Konrad <fred.konrad@greensocs.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1434646046-27150-4-git-send-email-pbonzini@redhat.com>
* memory: use mr->ram_addr in "is this RAM?" assertionsPaolo Bonzini2015-06-051-8/+10
| | | | | | | | | | | | | | | | | | mr->terminates alone doesn't guarantee that we are looking at a RAM region. mr->ram_addr also has to be checked, in order to distinguish RAM and I/O regions. So, do the following: 1) add a new define RAM_ADDR_INVALID, and test it in the assertions instead of mr->terminates 2) IOMMU regions were not setting mr->ram_addr to a bogus value, initialize it in the instance_init function so that the new assertions would fire for IOMMU regions as well. Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* memory: replace cpu_physical_memory_reset_dirty() with test-and-clearStefan Hajnoczi2015-06-051-7/+4Star
| | | | | | | | | | | | | | | The cpu_physical_memory_reset_dirty() function is sometimes used together with cpu_physical_memory_get_dirty(). This is not atomic since two separate accesses to the dirty memory bitmap are made. Turn cpu_physical_memory_reset_dirty() and cpu_physical_memory_clear_dirty_range_type() into the atomic cpu_physical_memory_test_and_clear_dirty(). Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <1417519399-3166-6-git-send-email-stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* exec: pass client mask to cpu_physical_memory_set_dirty_rangePaolo Bonzini2015-06-051-1/+2
| | | | | | | | This cuts in half the cost of bitmap operations (which will become more expensive when made atomic) during migration on non-VRAM regions. Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* memory: include DIRTY_MEMORY_MIGRATION in the dirty log maskPaolo Bonzini2015-06-051-2/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | The separate handling of DIRTY_MEMORY_MIGRATION, which does not call log_start/log_stop callbacks when it changes in a region's dirty logging mask, has caused several bugs. One recent example is commit 4cc856f (kvm-all: Sync dirty-bitmap from kvm before kvm destroy the corresponding dirty_bitmap, 2015-04-02). Another performance problem is that KVM keeps tracking dirty pages after a failed live migration, which causes bad performance due to disallowing huge page mapping. This patch removes the root cause of the problem by reporting DIRTY_MEMORY_MIGRATION changes via log_start and log_stop. Note that we now have to rebuild the FlatView when global dirty logging is enabled or disabled; this ensures that log_start and log_stop callbacks are invoked. This will also be used to make the setting of bitmaps conditional. In general, this patch lets users of the memory API ignore the global state of dirty logging if they handle dirty logging generically per region. Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* memory: track DIRTY_MEMORY_CODE in mr->dirty_log_maskPaolo Bonzini2015-06-051-0/+4
| | | | | | | | | DIRTY_MEMORY_CODE is only needed for TCG. By adding it directly to mr->dirty_log_mask, we avoid testing for TCG everywhere a region is checked for the enabled/disabled state of dirty logging. Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* memory: prepare for multiple bits in the dirty log maskPaolo Bonzini2015-06-051-6/+11
| | | | | | | | | | | | | | | | | | When the dirty log mask will also cover other bits than DIRTY_MEMORY_VGA, some listeners may be interested in the overall zero/non-zero value of the dirty log mask; others may be interested in the value of single bits. For this reason, always call log_start/log_stop if bits have respectively appeared or disappeared, and pass the old and new values of the dirty log mask so that listeners can distinguish the kinds of change. For example, KVM checks if dirty logging used to be completely disabled (in log_start) or is now completely disabled (in log_stop). On the other hand, Xen has to check manually if DIRTY_MEMORY_VGA changed, since that is the only bit it cares about. Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* memory: differentiate memory_region_is_logging and ↵Paolo Bonzini2015-06-051-1/+6
| | | | | | | | | | | | | | | | | | memory_region_get_dirty_log_mask For now memory regions only track DIRTY_MEMORY_VGA individually, but this will change soon. To support this, split memory_region_is_logging in two functions: one that returns a given bit from dirty_log_mask, and one that returns the entire mask. memory_region_is_logging gets an extra parameter so that the compiler flags misuse. While VGA-specific users (including the Xen listener!) will want to keep checking that bit, KVM and vhost check for "any bit except migration" (because migration is handled via the global start/stop listener callbacks). Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* memory: the only dirty memory flag for users is DIRTY_MEMORY_VGAPaolo Bonzini2015-06-051-0/+1
| | | | | | | | | | | DIRTY_MEMORY_MIGRATION is triggered by memory_global_dirty_log_start and memory_global_dirty_log_stop, so it cannot be used with memory_region_set_log. Specify this in the documentation and assert it. Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* mtree: also print disabled regionsGerd Hoffmann2015-04-301-5/+7
| | | | | Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* mtree: tag & indent a bit betterGerd Hoffmann2015-04-301-5/+6
| | | | | Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into stagingPeter Maydell2015-04-301-0/+7
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - miscellaneous cleanups for TCG (Emilio) and NBD (Bogdan) - next part in the thread-safe address_space_* saga: atomic access to the bounce buffer and the map_clients list, from Fam - optional support for linking with tcmalloc, also from Fam - reapplying Peter Crosthwaite's "Respect as_translate_internal length clamp" after fixing the SPARC fallout. - build system fix from Wei Liu - small acpi-build and ioport cleanup by myself # gpg: Signature made Wed Apr 29 09:34:00 2015 BST using RSA key ID 78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: (22 commits) nbd/trivial: fix type cast for ioctl translate-all: use bitmap helpers for PageDesc's bitmap target-i386: disable LINT0 after reset Makefile.target: prepend $libs_softmmu to $LIBS milkymist: do not modify libs-softmmu configure: Add support for tcmalloc exec: Respect as_translate_internal length clamp ioport: reserve the whole range of an I/O port in the AddressSpace ioport: loosen assertions on emulation of 16-bit ports ioport: remove wrong comment ide: there is only one data port gus: clean up MemoryRegionPortio sb16: remove useless mixer_write_indexw sun4m: fix slavio sysctrl and led register sizes acpi-build: remove dependency from ram_addr.h memory: add memory_region_ram_resize dma-helpers: Fix race condition of continue_after_map_failure and dma_aio_cancel exec: Notify cpu_register_map_client caller if the bounce buffer is available exec: Protect map_client_list with mutex linux-user, bsd-user: Remove two calls to cpu_exec_init_all ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * memory: add memory_region_ram_resizePaolo Bonzini2015-04-271-0/+7
| | | | | | | | | | | | This is a simple MemoryRegion wrapper for qemu_ram_resize. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | memory: Replace io_mem_read/write with memory_region_dispatch_read/writePeter Maydell2015-04-261-23/+10Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Rather than retaining io_mem_read/write as simple wrappers around the memory_region_dispatch_read/write functions, make the latter public and change all the callers to use them, since we need to touch all the callsites anyway to add MemTxAttrs and MemTxResult support. Delete io_mem_read and io_mem_write entirely. (All the callers currently pass MEMTXATTRS_UNSPECIFIED and convert the return value back to bool or ignore it.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
* | memory: Define API for MemoryRegionOps to take attrs and return statusPeter Maydell2015-04-261-67/+140
|/ | | | | | | | | | | | | | | | | | | | | | | Define an API so that devices can register MemoryRegionOps whose read and write callback functions are passed an arbitrary pointer to some transaction attributes and can return a success-or-failure status code. This will allow us to model devices which: * behave differently for ARM Secure/NonSecure memory accesses * behave differently for privileged/unprivileged accesses * may return a transaction failure (causing a guest exception) for erroneous accesses This patch defines the new API and plumbs the attributes parameter through to the memory.c public level functions io_mem_read() and io_mem_write(), where it is currently dummied out. The success/failure response indication is also propagated out to io_mem_read() and io_mem_write(), which retain the old-style boolean true-for-error return. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
* memory: Move owner-less MemoryRegions to /machine/unattachedAndreas Färber2015-03-171-1/+1
| | | | | | | | | | | | This cleans up the official /machine namespace. In particular /machine/system[0] and /machine/io[0], as well as entries with non-sanitized node names such as "/machine/qemu extended regs[0]". The actual MemoryRegion names remain unchanged. Acked-by: Paolo Bonzini <pbonzini@redhat.com> Tested-by: Alistair Francis <alistair.francis@xilinx.com> Signed-off-by: Andreas Färber <afaerber@suse.de>
* memory: keep the owner of the AddressSpace alive until do_address_space_destroyPaolo Bonzini2015-02-111-0/+5
| | | | | | | | | This fixes a use-after-free if do_address_space_destroy is executed too late. Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com> Tested-by: Michael Roth <mdroth@linux.vnet.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* memory: unregister AddressSpace MemoryListener within BQLPaolo Bonzini2015-02-101-0/+1
| | | | | | | | | | | | | address_space_destroy_dispatch is called from an RCU callback and hence outside the iothread mutex (BQL). However, after address_space_destroy no new accesses can hit the destroyed AddressSpace so it is not necessary to observe changes to the memory map. Move the memory_listener_unregister call earlier, to make it thread-safe again. Reported-by: Alex Williamson <alex.williamson@redhat.com> Fixes: 374f2981d1f10bc4307f250f24b2a7ddb9b14be0 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>