path: root/include/exec/ram_addr.h
* memory: alloc RAM from file at offset (Jagannathan Raman, 2021-02-09; 1 file, -2/+2)

    Allow a RAM MemoryRegion to be created from an offset in a file,
    instead of allocating at offset 0 by default. This is needed to
    synchronize RAM between QEMU and the remote process.

    Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
    Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
    Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
    Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
    Message-id: 609996697ad8617e3b01df38accc5c208c24d74e.1611938319.git.jag.raman@oracle.com
    Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* memory: add readonly support to memory_region_init_ram_from_file() (Stefan Hajnoczi, 2021-02-01; 1 file, -2/+3)

    There is currently no way to open(O_RDONLY) and mmap(PROT_READ) when
    creating a memory region from a file. This functionality is needed
    since the underlying host file may not allow writing.

    Add a bool readonly argument to memory_region_init_ram_from_file()
    and the APIs it calls.

    Extend memory_region_init_ram_from_file() rather than introducing a
    memory_region_init_rom_from_file() API so that callers can easily
    make a choice between read/write and read-only at runtime without
    calling different APIs.

    No new RAMBlock flag is introduced for read-only because it's unclear
    whether RAMBlocks need to know that they are read-only. Pass a bool
    readonly argument instead.

    Both of these design decisions can be changed in the future. It just
    seemed like the simplest approach to me.

    Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
    Reviewed-by: Igor Mammedov <imammedo@redhat.com>
    Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
    Acked-by: Michael S. Tsirkin <mst@redhat.com>
    Message-Id: <20210104171320.575838-2-stefanha@redhat.com>
    Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
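    A hedged usage sketch of the extended call (argument order per this
    commit's description; the owner, name, size, and path variables are
    illustrative, not from the patch):

        /* Map a host file read-only as guest RAM; readonly=true makes
         * QEMU use open(O_RDONLY) and mmap(PROT_READ) underneath. */
        Error *err = NULL;
        memory_region_init_ram_from_file(mr, owner, "firmware.rom",
                                         size, 0 /* align: default */,
                                         0 /* ram_flags */, path,
                                         true /* readonly */, &err);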
* qemu/atomic.h: rename atomic_ to qatomic_ (Stefan Hajnoczi, 2020-09-23; 1 file, -12/+14)

    clang's C11 atomic_fetch_*() functions only take a C11 atomic type
    pointer argument. QEMU uses direct types (int, etc) and this causes a
    compiler error when QEMU code calls these functions in a source file
    that also includes <stdatomic.h> via a system header file:

        $ CC=clang CXX=clang++ ./configure ... && make
        ../util/async.c:79:17: error: address argument to atomic operation must be a pointer to _Atomic type ('unsigned int *' invalid)

    Avoid using atomic_*() names in QEMU's atomic.h since that namespace
    is used by <stdatomic.h>. Prefix QEMU's APIs with 'q' so that
    atomic.h and <stdatomic.h> can co-exist. I checked /usr/include on my
    machine and searched GitHub for existing "qatomic_" users but there
    seem to be none.

    This patch was generated using:

        $ git grep -h -o '\<atomic\(64\)\?_[a-z0-9_]\+' \
              include/qemu/atomic.h | sort -u >/tmp/changed_identifiers
        $ for identifier in $(</tmp/changed_identifiers); do
              sed -i "s%\<$identifier\>%q$identifier%g" \
                  $(git grep -I -l "\<$identifier\>")
          done

    I manually fixed line-wrap issues and misaligned rST tables.

    Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
    Acked-by: Paolo Bonzini <pbonzini@redhat.com>
    Message-Id: <20200923105646.47864-1-stefanha@redhat.com>
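    The effect of the rename on a typical call site, as a minimal sketch
    (the counter variable is illustrative):

        unsigned int counter, old;
        /* before: clashes with <stdatomic.h> under clang */
        old = atomic_fetch_add(&counter, 1);
        /* after: QEMU's prefixed wrapper from qemu/atomic.h */
        old = qatomic_fetch_add(&counter, 1);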
* migration: Count new_dirty instead of real_dirty (Keqian Zhu, 2020-07-03; 1 file, -4/+1)

    real_dirty_pages becomes equal to the total RAM size after the dirty
    log sync in ram_init_bitmaps, because the RAMBlock bitmap is
    initialized all-set, so the old code path counts every page as "real
    dirty" at the beginning. This causes a wrong dirty rate and false
    positive throttling.

    Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
    Message-Id: <20200622032037.31112-1-zhukeqian1@huawei.com>
    Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
    Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
* accel: Move Xen accelerator code under accel/xen/ (Philippe Mathieu-Daudé, 2020-06-10; 1 file, -1/+1)

    This code is not related to hardware emulation. Move it under accel/
    with the other hypervisors.

    Reviewed-by: Paul Durrant <paul@xen.org>
    Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
    Message-Id: <20200508100222.7112-1-philmd@redhat.com>
    Reviewed-by: Juan Quintela <quintela@redhat.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* exec: Rename qemu_ram_writeback() as qemu_ram_msync() (Philippe Mathieu-Daudé, 2020-06-05; 1 file, -2/+2)

    Rename qemu_ram_writeback() as qemu_ram_msync() to better match what
    it does.

    Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
    Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
    Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
    Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
    Acked-by: Paolo Bonzini <pbonzini@redhat.com>
    Message-id: 20200508062456.23344-5-philmd@redhat.com
    Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* ram_addr: Split RAMBlock definition (Juan Quintela, 2020-01-29; 1 file, -39/+1)

    We need some of the fields without having to poison everything else.

    Signed-off-by: Juan Quintela <quintela@redhat.com>
    Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
* Memory: Enable writeback for given memory region (Beata Michalska, 2019-12-16; 1 file, -0/+8)

    Add an option to trigger memory writeback to sync a given memory
    region with the corresponding backing store, in case one is
    available. This extends the support for persistent memory, allowing
    on-demand syncing.

    Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
    Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
    Message-id: 20191121000843.24844-3-beata.michalska@linaro.org
    Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
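    A minimal sketch of the on-demand sync this enables, assuming the
    helper introduced here takes a RAMBlock plus a start/length range (it
    is renamed to qemu_ram_msync() by the 2020-06-05 commit above):

        /* Flush the whole block back to its backing store (e.g. a
         * DAX-mapped file) so the data survives a host crash. */
        qemu_ram_writeback(block, 0, block->used_length);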
* core: replace getpagesize() with qemu_real_host_page_size (Wei Yang, 2019-10-26; 1 file, -1/+1)

    There are three page sizes in QEMU:

        real host page size
        host page size
        target page size

    Each of them has a dedicated variable to represent it. For the last
    two we use the same form throughout the QEMU project, while for the
    first one we use two forms: qemu_real_host_page_size and
    getpagesize(). qemu_real_host_page_size is defined as a replacement
    for getpagesize(), so let it serve that role.

    [Note] Not fully tested for some arches or devices.

    Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
    Message-Id: <20191013021145.16011-3-richardw.yang@linux.intel.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
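    For reference, the three sizes named above map to these identifiers
    (qemu_real_host_page_size was still a global variable at this point,
    not yet a function), and a converted call site looks roughly like
    this sketch:

        /* real host page size: qemu_real_host_page_size -
         *     what the host kernel actually uses
         * host page size: qemu_host_page_size -
         *     QEMU's (possibly rounded-up) host page size
         * target page size: TARGET_PAGE_SIZE -
         *     the guest-visible page size */
        size_t aligned = QEMU_ALIGN_UP(len, qemu_real_host_page_size);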
* rcu: Use automatic rcu_read unlock in core memory/exec code (Dr. David Alan Gilbert, 2019-10-11; 1 file, -73/+65)

    Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
    Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
    Message-Id: <20191007143642.301445-6-dgilbert@redhat.com>
    Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
* migration/postcopy: unsentmap is not necessary for postcopy (Wei Yang, 2019-09-25; 1 file, -6/+0)

    Commit f3f491fcd6dd594ba695 ('Postcopy: Maintain unsentmap')
    introduced unsentmap to track not-yet-sent pages. This is not
    necessary since:

        * unsentmap is a sub-set of bmap before postcopy start
        * unsentmap is the summation of bmap and unsentmap after
          canonicalizing

    This patch just removes it.

    Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
    Message-Id: <20190819061843.28642-3-richardw.yang@linux.intel.com>
    Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
    Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
* include: Make headers more self-contained (Markus Armbruster, 2019-08-16; 1 file, -0/+1)

    Back in 2016, we discussed[1] rules for headers, and these were
    generally liked:

    1. Have a carefully curated header that's included everywhere first.
       We got that already thanks to Peter: osdep.h.

    2. Headers should normally include everything they need beyond
       osdep.h. If exceptions are needed for some reason, they must be
       documented in the header. If all that's needed from a header is
       typedefs, put those into qemu/typedefs.h instead of including the
       header.

    3. Cyclic inclusion is forbidden.

    This patch gets include/ closer to obeying 2. It's actually extracted
    from my "[RFC] Baby steps towards saner headers" series[2], which
    demonstrates a possible path towards checking 2 automatically. It
    passes the RFC test there.

    [1] Message-ID: <87h9g8j57d.fsf@blackfin.pond.sub.org>
        https://lists.nongnu.org/archive/html/qemu-devel/2016-03/msg03345.html
    [2] Message-Id: <20190711122827.18970-1-armbru@redhat.com>
        https://lists.nongnu.org/archive/html/qemu-devel/2019-07/msg02715.html

    Signed-off-by: Markus Armbruster <armbru@redhat.com>
    Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
    Message-Id: <20190812052359.30071-2-armbru@redhat.com>
    Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
* migration: Split log_clear() into smaller chunks (Peter Xu, 2019-07-15; 1 file, -2/+74)

    Currently we do log_clear() right after log_sync(), which mostly
    keeps the old behavior from when log_clear() was still part of
    log_sync().

    This patch tries to further optimize the migration log_clear() code
    path by splitting huge log_clear()s into smaller chunks.

    We do this by splitting the whole guest memory region into memory
    chunks, whose size is decided by MigrationState.clear_bitmap_shift
    (an example is given below). With that, we don't do the dirty bitmap
    clear operation on the remote node (e.g., KVM) when we fetch the
    dirty bitmap; instead we explicitly clear the dirty bitmap for a
    memory chunk the first time we send a page in that chunk.

    Here comes an example. Assume the guest has 64G of memory. Before
    this patch, the KVM ioctl KVM_CLEAR_DIRTY_LOG would be a single call
    covering 64G of memory. After the patch, assuming a clear bitmap
    shift of 18, the memory chunk size on x86_64 will be
    1UL<<18 * 4K = 1GB. Then instead of sending one big 64G ioctl, we'll
    send 64 small ioctls, each covering 1G of guest memory. Each of the
    64 small ioctls is sent only if any page in that chunk is about to be
    sent.

    Signed-off-by: Peter Xu <peterx@redhat.com>
    Reviewed-by: Juan Quintela <quintela@redhat.com>
    Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
    Message-Id: <20190603065056.25211-12-peterx@redhat.com>
    Signed-off-by: Juan Quintela <quintela@redhat.com>
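    The chunk arithmetic from the example above, spelled out as a sketch
    (variable names illustrative):

        /* clear_bitmap_shift = 18, 4 KiB target pages:
         *   pages per chunk = 1UL << 18      = 262144
         *   bytes per chunk = 262144 * 4096  = 1 GiB
         * so a 64 GiB guest issues 64 lazy KVM_CLEAR_DIRTY_LOG calls
         * of 1 GiB each instead of one 64 GiB call. */
        uint64_t chunk_pages = 1ULL << clear_bitmap_shift;
        uint64_t chunk_bytes = chunk_pages * TARGET_PAGE_SIZE;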
* memory: Introduce memory listener hook log_clear() (Peter Xu, 2019-07-15; 1 file, -0/+3)

    Introduce a new memory region listener hook log_clear() to allow the
    listeners to hook onto the points where the dirty bitmap is cleared
    by the bitmap users.

    Previously log_sync() contained two operations:

        - dirty bitmap collection, and,
        - dirty bitmap clear on the remote site.

    Let's take KVM as an example: log_sync() for KVM will first copy the
    kernel dirty bitmap to userspace, and at the same time clear the
    dirty bitmap there along with re-protecting all the guest pages
    again.

    We add this new log_clear() interface only to split the old
    log_sync() into two separate procedures:

        - use log_sync() for the collection only, and,
        - use log_clear() to clear the remote dirty bitmap.

    With the new interface, memory listener users can still decide how to
    implement the log synchronization procedure, e.g., they can still
    provide only a log_sync() method and put both procedures within
    log_sync() (that's how the old KVM code worked before
    KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 was introduced). However, with this
    new interface memory listener users get a chance to postpone the log
    clear operation explicitly if the module supports it. That can really
    benefit users like KVM, at least for host kernels that support
    KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2.

    There are three places that can clear dirty bits in any one of the
    dirty bitmaps in the ram_list.dirty_memory[3] array:

        cpu_physical_memory_snapshot_and_clear_dirty
        cpu_physical_memory_test_and_clear_dirty
        cpu_physical_memory_sync_dirty_bitmap

    Currently we hook directly into each of these functions to notify
    about the log_clear().

    Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
    Reviewed-by: Juan Quintela <quintela@redhat.com>
    Signed-off-by: Peter Xu <peterx@redhat.com>
    Message-Id: <20190603065056.25211-7-peterx@redhat.com>
    Signed-off-by: Juan Quintela <quintela@redhat.com>
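    A minimal sketch of the split interface, assuming the new hook takes
    the same arguments as the existing log_sync() callback (the listener
    and function names are illustrative):

        static void my_log_sync(MemoryListener *l, MemoryRegionSection *s)
        {
            /* collection only: copy the remote (e.g. kernel) dirty
             * bitmap for this section to userspace */
        }

        static void my_log_clear(MemoryListener *l, MemoryRegionSection *s)
        {
            /* deferred clear: reset + re-protect the pages covered by
             * this section, called once the dirty bits are consumed */
        }

        static MemoryListener my_listener = {
            .log_sync  = my_log_sync,
            .log_clear = my_log_clear,
        };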
* memory: Pass mr into snapshot_and_clear_dirty (Peter Xu, 2019-07-15; 1 file, -1/+1)

    Also change its 2nd parameter to be the offset relative to the memory
    region. This will be used in follow-up patches.

    Signed-off-by: Peter Xu <peterx@redhat.com>
    Reviewed-by: Juan Quintela <quintela@redhat.com>
    Message-Id: <20190603065056.25211-6-peterx@redhat.com>
    Signed-off-by: Juan Quintela <quintela@redhat.com>
* memory: Don't set migration bitmap when without migration (Peter Xu, 2019-07-15; 1 file, -1/+11)

    Similar to 9460dee4b2 ("memory: do not touch code dirty bitmap unless
    TCG is enabled", 2015-06-05), but for the migration bitmap - we can
    skip the MIGRATION bitmap update if migration is not enabled.

    Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
    Reviewed-by: Juan Quintela <quintela@redhat.com>
    Signed-off-by: Peter Xu <peterx@redhat.com>
    Message-Id: <20190603065056.25211-4-peterx@redhat.com>
    Signed-off-by: Juan Quintela <quintela@redhat.com>
* migration: No need to take rcu during sync_dirty_bitmap (Peter Xu, 2019-07-15; 1 file, -4/+1)

    cpu_physical_memory_sync_dirty_bitmap() takes a RAMBlock* as a
    parameter, which means it must already be called with the RCU read
    lock held. Taking the lock again inside is redundant, so remove it
    and instead document the RCU read lock requirement on the functions.

    Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
    Reviewed-by: Juan Quintela <quintela@redhat.com>
    Signed-off-by: Peter Xu <peterx@redhat.com>
    Message-Id: <20190603065056.25211-2-peterx@redhat.com>
    Signed-off-by: Juan Quintela <quintela@redhat.com>
* qemu-common: Move tcg_enabled() etc. to sysemu/tcg.h (Markus Armbruster, 2019-06-11; 1 file, -0/+1)

    Other accelerators have their own headers: sysemu/hax.h,
    sysemu/hvf.h, sysemu/kvm.h, sysemu/whpx.h. Only tcg_enabled() &
    friends sit in qemu-common.h. This necessitates inclusion of
    qemu-common.h into headers, which is against the rules spelled out in
    qemu-common.h's file comment.

    Move tcg_enabled() & friends into their own header sysemu/tcg.h, and
    adjust #include directives.

    Cc: Richard Henderson <rth@twiddle.net>
    Cc: Paolo Bonzini <pbonzini@redhat.com>
    Signed-off-by: Markus Armbruster <armbru@redhat.com>
    Message-Id: <20190523143508.25387-2-armbru@redhat.com>
    Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
    [Rebased with conflicts resolved automatically, except for
    accel/tcg/tcg-all.c]
* exec: Introduce qemu_maxrampagesize() and rename qemu_getrampagesize() (David Hildenbrand, 2019-04-25; 1 file, -1/+2)

    Rename qemu_getrampagesize() to qemu_minrampagesize(). While at it,
    properly rename find_max_supported_pagesize() to
    find_min_backend_pagesize().

    s390x is actually interested in the maximum RAM page size, so
    introduce and use qemu_maxrampagesize().

    Add a TODO, indicating that looking at any mapped memory backends is
    not 100% correct in some cases.

    Signed-off-by: David Hildenbrand <david@redhat.com>
    Message-Id: <20190417113143.5551-3-david@redhat.com>
    Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
    Reviewed-by: Igor Mammedov <imammedo@redhat.com>
    Signed-off-by: Cornelia Huck <cohuck@redhat.com>
* COLO: Load dirty pages into SVM's RAM cache first (Zhang Chen, 2018-10-19; 1 file, -0/+1)

    We should not load PVM's state directly into SVM, because errors may
    occur while SVM is receiving data, which would break SVM. We need to
    ensure that all data has been received before loading the state into
    SVM, so we use extra memory to cache this data (PVM's RAM).

    The RAM cache on the secondary side is initially the same as
    SVM/PVM's memory. During checkpointing, we first cache PVM's dirty
    pages in this RAM cache, so the cache always matches PVM's memory at
    every checkpoint; we then flush the cached RAM to SVM after we
    receive all of PVM's state.

    Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
    Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
    Signed-off-by: Zhang Chen <zhangckid@gmail.com>
    Signed-off-by: Zhang Chen <chen.zhang@intel.com>
    Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
    Signed-off-by: Jason Wang <jasowang@redhat.com>
* hostmem-file: add the 'pmem' option (Junyan He, 2018-08-10; 1 file, -0/+3)

    When QEMU emulates vNVDIMM labels and migrates vNVDIMM devices, it
    needs to know whether the backend storage is real persistent memory,
    in order to decide whether special operations should be performed to
    ensure data persistence.

    This boolean option 'pmem' allows users to specify whether the
    backend storage of memory-backend-file is real persistent memory. If
    'pmem=on', QEMU will set the RAM_PMEM flag in the RAM block of the
    corresponding memory region. If 'pmem' is set but QEMU was built
    without libpmem support, an error is generated.

    Signed-off-by: Junyan He <junyan.he@intel.com>
    Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
    Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
    Reviewed-by: Igor Mammedov <imammedo@redhat.com>
    Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
    Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
    Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
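    A typical command line exercising the new option (the object id,
    nvdimm device, and DAX path are illustrative; the machine must be
    configured with NVDIMM support):

        qemu-system-x86_64 ... \
            -object memory-backend-file,id=mem1,share=on,pmem=on,mem-path=/dev/dax0.0,size=4G \
            -device nvdimm,id=nvdimm1,memdev=mem1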
* memory, exec: switch file ram allocation functions to 'flags' parameters (Junyan He, 2018-08-10; 1 file, -2/+23)

    As more flag parameters besides the existing 'share' are going to be
    added to the following functions:

        memory_region_init_ram_from_file
        qemu_ram_alloc_from_fd
        qemu_ram_alloc_from_file

    let's switch them to a 'flags' parameter so as to ease future flag
    additions.

    The existing 'share' flag is converted to the RAM_SHARED bit in
    ram_flags; other flag bits are ignored by the above functions right
    now.

    Signed-off-by: Junyan He <junyan.he@intel.com>
    Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
    Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
    Reviewed-by: Igor Mammedov <imammedo@redhat.com>
    Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
    Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
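    A sketch of the bool-to-flags conversion at a caller (the argument
    list is approximated from the commit description; only RAM_SHARED is
    honoured at this point in the series):

        uint32_t ram_flags = share ? RAM_SHARED : 0;
        block = qemu_ram_alloc_from_fd(size, mr, ram_flags, fd, &err);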
* move public invalidate APIs out of translate-all.{c,h}, clean up (Paolo Bonzini, 2018-06-28; 1 file, -0/+2)

    Place them in exec.c, exec-all.h and ram_addr.h. This removes
    knowledge of translate-all.h (which is an internal header) from
    several files outside accel/tcg and removes knowledge of AddressSpace
    from translate-all.c (as it only operates on ram_addr_t).

    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* postcopy: drop ram_pages parameter from postcopy_ram_incoming_init() (David Hildenbrand, 2018-06-27; 1 file, -1/+0)

    Not needed. Don't expose last_ram_page().

    Signed-off-by: David Hildenbrand <david@redhat.com>
    Message-Id: <20180620202736.21399-1-david@redhat.com>
    Reviewed-by: Juan Quintela <quintela@redhat.com>
    Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
    Signed-off-by: Juan Quintela <quintela@redhat.com>
* mem: add share parameter to memory-backend-ram (Marcel Apfelbaum, 2018-02-19; 1 file, -1/+2)

    Currently only the file-backed memory backend can be created with a
    "share" flag in order to allow sharing guest RAM with other processes
    on the host.

    Add the "share" flag also to the RAM memory backend in order to allow
    remapping parts of the guest RAM to different host virtual addresses.
    This is needed by the RDMA devices in order to remap non-contiguous
    QEMU virtual addresses to a contiguous virtual address range.

    Move the "share" flag to the host-memory base class, modify
    phys_mem_alloc to include the new parameter, and add a new interface
    memory_region_init_ram_shared_nomigrate(). There are no functional
    changes if the new flag is not used.

    Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
    Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
* cpu_physical_memory_sync_dirty_bitmap: Another alignment fix (Dr. David Alan Gilbert, 2018-01-16; 1 file, -2/+3)

    This code has an optimised, word-aligned version and a boring
    unaligned version. My commit f70d345 fixed one alignment issue, but
    there's another.

    The optimised version operates on 'longs', dealing with (typically)
    64 pages at a time, replacing the whole long by a 0 and counting the
    bits. If the RAMBlock is less than 64 pages in length, that long can
    contain bits representing two different RAMBlocks, but the code will
    update the bmap belonging to the 1st RAMBlock only, while having
    updated the total dirty page count for both.

    This probably didn't matter prior to 6b6712ef, which split the dirty
    bitmap by RAMBlock, but now that they're separate RAMBlocks we end up
    with a count that doesn't match the state in the bitmaps.

    Symptom: migration shows a few dirty pages left to be sent
    constantly. Seen on aarch64 and x86 with x86+ovmf.

    Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
    Reported-by: Wei Huang <wei@redhat.com>
    Fixes: 6b6712efccd383b48a909bee0b29e079a57601ec
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* migration: add bitmap for received page (Alexey Perevalov, 2017-10-23; 1 file, -0/+10)

    This patch adds the ability to track already-received pages. This is
    necessary for calculating vCPU block time in the postcopy migration
    feature and for recovery after postcopy migration failure. It is also
    necessary for solving the shared-memory issue in postcopy live
    migration.

    Information about received pages will be transferred to the software
    virtual bridge (e.g. OVS-VSWITCHD) to avoid fallocate (unmap) for
    already-received pages. The fallocate syscall is required for
    remapped shared memory because remapping itself blocks
    ioctl(UFFDIO_COPY); the ioctl in this case fails with an EEXIST error
    (the struct page exists after the remap).

    The bitmap is placed in RAMBlock alongside the other
    postcopy/precopy-related bitmaps.

    Reviewed-by: Peter Xu <peterx@redhat.com>
    Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
    Signed-off-by: Peter Xu <peterx@redhat.com>
    Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
    Signed-off-by: Juan Quintela <quintela@redhat.com>
* cpu_physical_memory_sync_dirty_bitmap: Fix alignment check (Dr. David Alan Gilbert, 2017-08-01; 1 file, -3/+4)

    This code has an optimised, word-aligned version and a boring
    unaligned version. Recently 084140bd498909 fixed a missing offset
    addition in the core of both versions. However, the offset isn't
    necessarily aligned, and thus the choice between the two versions
    needs fixing up to also take the offset into account.

    Symptom: a few stuck unsent pages during migration; not normally
    noticed unless bandwidth is very low, in which case the migration may
    get stuck, never ending and never performing a 2nd sync; noticed by a
    hanging postcopy-test on a very heavily loaded system.

    Fixes: 084140bd498909
    Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
    Reported-by: Alex Bennée <alex.benee@linaro.org>
    Tested-by: Alex Bennée <alex.benee@linaro.org>

    --
    v2: Move 'page' inside the if (comment from Paolo)

    Message-Id: <20170724165125.29887-1-dgilbert@redhat.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* exec: fix access to ram_list.dirty_memory when sync dirty bitmap (Haozhong Zhang, 2017-06-28; 1 file, -3/+6)

    In cpu_physical_memory_sync_dirty_bitmap(rb, start, ...), the 2nd
    argument 'start' is relative to the start of the RAMBlock 'rb'. When
    it's used to access the dirty memory bitmap of ram_list (i.e.
    ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION]->blocks[]), an offset
    to the start of all RAM (i.e. rb->offset) should be added to it; this
    has however been missing since c/s 6b6712efcc. For a RAMBlock of a
    host memory backend whose offset is not zero,
    cpu_physical_memory_sync_dirty_bitmap() synchronizes the incorrect
    part of the ram_list dirty memory bitmap to the per-RAMBlock dirty
    bitmap. As a result, a guest with a host memory backend may crash
    after migration.

    Fix it by adding the offset of the RAMBlock when accessing the dirty
    memory bitmap of ram_list in cpu_physical_memory_sync_dirty_bitmap().

    Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
    Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
    Message-Id: <20170628083704.24997-1-haozhong.zhang@intel.com>
    Reviewed-by: Juan Quintela <quintela@redhat.com>
    Tested-by: Juan Quintela <quintela@redhat.com>
    Tested-by: Stefan Hajnoczi <stefanha@redhat.com>
    Signed-off-by: Juan Quintela <quintela@redhat.com>
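    The fix described above, in sketch form: the global dirty bitmap is
    indexed by absolute ram_addr_t, so the block-relative 'start' must be
    rebased by the block's offset before computing the bitmap index:

        /* was: start >> TARGET_PAGE_BITS - misses rb->offset */
        unsigned long first_page = (rb->offset + start) >> TARGET_PAGE_BITS;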
* exec: split qemu_ram_alloc_from_file() (Marc-André Lureau, 2017-06-15; 1 file, -0/+3)

    Add qemu_ram_alloc_from_fd(), which can be used to allocate a
    RAMBlock from an fd only.

    Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
    Message-Id: <20170602141229.15326-4-marcandre.lureau@redhat.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* ram: Split dirty bitmap by RAMBlock (Juan Quintela, 2017-05-04; 1 file, -3/+10)

    Both the ram bitmap and the unsent bitmap are split by RAMBlock.

    Signed-off-by: Juan Quintela <quintela@redhat.com>
    Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
    Reviewed-by: Peter Xu <peterx@redhat.com>

    --
    Fix compilation when DEBUG_POSTCOPY is enabled (thanks Hailiang)
* Merge remote-tracking branch 'remotes/sstabellini/tags/xen-20170421-v2-tag' into staging (Peter Maydell, 2017-04-26; 1 file, -2/+2)
|\
    Xen 2017/04/21 + fix

    # gpg: Signature made Tue 25 Apr 2017 19:10:37 BST
    # gpg: using RSA key 0x894F8F4870E1AE90
    # gpg: Good signature from "Stefano Stabellini <stefano.stabellini@eu.citrix.com>"
    # gpg: aka "Stefano Stabellini <sstabellini@kernel.org>"
    # Primary key fingerprint: D04E 33AB A51F 67BA 07D3 0AEA 894F 8F48 70E1 AE90

    * remotes/sstabellini/tags/xen-20170421-v2-tag: (21 commits)
      move xen-mapcache.c to hw/i386/xen/
      move xen-hvm.c to hw/i386/xen/
      move xen-common.c to hw/xen/
      add xen-9p-backend to MAINTAINERS under Xen
      xen/9pfs: build and register Xen 9pfs backend
      xen/9pfs: send responses back to the frontend
      xen/9pfs: implement in/out_iov_from_pdu and vmarshal/vunmarshal
      xen/9pfs: receive requests from the frontend
      xen/9pfs: connect to the frontend
      xen/9pfs: introduce Xen 9pfs backend
      9p: introduce a type for the 9p header
      xen: import ring.h from xen
      configure: use pkg-config for obtaining xen version
      xen: additionally restrict xenforeignmemory operations
      xen: use libxendevice model to restrict operations
      xen: use 5 digit xen versions
      xen: use libxendevicemodel when available
      configure: detect presence of libxendevicemodel
      xen: create wrappers for all other uses of xc_hvm_XXX() functions
      xen: rename xen_modified_memory() to xen_hvm_modified_memory()
      ...

    Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * xen: rename xen_modified_memory() to xen_hvm_modified_memory() (Paul Durrant, 2017-03-22; 1 file, -2/+2)

    This patch is a purely cosmetic change that avoids a name collision
    in a subsequent patch.

    Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
    Reviewed-by: Anthony Perard <anthony.perard@citrix.com>
    Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
* | memory: add support for getting and using a dirty bitmap copy (Gerd Hoffmann, 2017-04-24; 1 file, -0/+7)

    This patch adds support for getting and using a local copy of the
    dirty bitmap.

    memory_region_snapshot_and_clear_dirty() will create a snapshot of
    the dirty bitmap for the specified range, clear the dirty bitmap, and
    return the copy. The returned bitmap can be a bit larger than
    requested: the range is expanded so the code can copy unsigned longs
    from the bitmap and avoid atomic bit update operations.

    memory_region_snapshot_get_dirty() will return the dirty status of
    pages, pretty much like memory_region_get_dirty(), but using the copy
    returned by memory_region_snapshot_and_clear_dirty().

    Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
    Message-id: 20170421091632.30900-3-kraxel@redhat.com
    Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
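    A usage sketch of the pair of calls described above, roughly as
    display code might use them (the stride variable is illustrative;
    display code typically uses the DIRTY_MEMORY_VGA client):

        DirtyBitmapSnapshot *snap =
            memory_region_snapshot_and_clear_dirty(mr, 0,
                                                   memory_region_size(mr),
                                                   DIRTY_MEMORY_VGA);
        for (hwaddr a = 0; a < memory_region_size(mr); a += stride) {
            if (memory_region_snapshot_get_dirty(mr, snap, a, stride)) {
                /* redraw the scanline covering [a, a + stride) */
            }
        }
        g_free(snap);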
* | ram: Remove migration_bitmap_extend() (Juan Quintela, 2017-04-21; 1 file, -2/+0)

    We have disabled memory hotplug, so we don't need to handle
    migration_bitmap there.

    Signed-off-by: Juan Quintela <quintela@redhat.com>
    Reviewed-by: Eric Blake <eblake@redhat.com>
    Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
* | ram: rename last_ram_offset() to last_ram_pages() (Juan Quintela, 2017-04-21; 1 file, -1/+1)

    We always use it as pages anyway.

    Signed-off-by: Juan Quintela <quintela@redhat.com>
    Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
* | ram: Pass RAMBlock to bitmap_sync (Juan Quintela, 2017-04-21; 1 file, -0/+2)

    We change the meaning of start to be the offset from the beginning of
    the block.

    Signed-off-by: Juan Quintela <quintela@redhat.com>
    Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
* | ram: Change num_dirty_pages_period type to uint64_t (Juan Quintela, 2017-04-21; 1 file, -1/+1)
|/
    Signed-off-by: Juan Quintela <quintela@redhat.com>
    Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
    Reviewed-by: Peter Xu <peterx@redhat.com>
* Change the method to calculate dirty-pages-rate (Chao Fan, 2017-03-16; 1 file, -1/+4)

    In the function cpu_physical_memory_sync_dirty_bitmap() in
    include/exec/ram_addr.h:

        if (src[idx][offset]) {
            unsigned long bits = atomic_xchg(&src[idx][offset], 0);
            unsigned long new_dirty;
            new_dirty = ~dest[k];
            dest[k] |= bits;
            new_dirty &= bits;
            num_dirty += ctpopl(new_dirty);
        }

    After this code executes, only the pages that are not yet dirty in
    the bitmap (dest) but are dirty in
    dirty_memory[DIRTY_MEMORY_MIGRATION] are counted.

    For example: when ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] =
    0b00001111 and atomic_rcu_read(&migration_bitmap_rcu)->bmap =
    0b00000011, new_dirty will be 0b00001100, and this function will
    return 2 rather than the expected 4. The dirty pages in
    dirty_memory[DIRTY_MEMORY_MIGRATION] are all new, so they should all
    be counted.

    Signed-off-by: Chao Fan <fanc.fnst@cn.fujitsu.com>
    Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
    Reviewed-by: Juan Quintela <quintela@redhat.com>
    Signed-off-by: Juan Quintela <quintela@redhat.com>
* exec, kvm, target-ppc: Move getrampagesize() to common code (Alexey Kardashevskiy, 2017-03-03; 1 file, -0/+1)

    getrampagesize() returns the largest supported page size and is
    mainly used to know if huge pages are enabled. However, it is
    implemented in target-ppc/kvm.c and not available in TCG or on other
    architectures.

    This renames and moves gethugepagesize() to mmap-alloc.c, where its
    fd-based analog is already implemented, and renames and moves
    getrampagesize() to exec.c, as that seems to be the common place for
    helpers like this.

    Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
    Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* ramblock-notifier: new (Paolo Bonzini, 2017-01-16; 1 file, -44/+2)

    This adds a notifier interface for RAM block additions and removals.

    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* RAMBlocks: Store page size (Dr. David Alan Gilbert, 2016-10-13; 1 file, -0/+1)

    Store the page size in each RAMBlock; we need it later.

    Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
    Reviewed-by: Juan Quintela <quintela@redhat.com>
    Signed-off-by: Juan Quintela <quintela@redhat.com>
* memory: remove qemu_get_ram_fd, qemu_set_ram_fd, qemu_ram_block_host_ptr (Paolo Bonzini, 2016-05-29; 1 file, -3/+0)

    Remove direct uses of ram_addr_t and optimize
    memory_region_{get,set}_fd now that a MemoryRegion knows its RAMBlock
    directly.

    Reviewed-by: Marc-André Lureau <marcandre.lureau@gmail.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* memory: drop find_ram_block() (Gonglei, 2016-05-23; 1 file, -1/+1)

    On the one hand, we already have qemu_get_ram_block(), whose function
    is similar. On the other hand, we can use mr->ram_block directly;
    searching for the RAMBlock by ram_addr is wasteful.

    Signed-off-by: Gonglei <arei.gonglei@huawei.com>
    Reviewed-by: Fam Zheng <famz@redhat.com>
    Message-Id: <1462845901-89716-2-git-send-email-arei.gonglei@huawei.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* exec: Pass RAMBlock pointer to qemu_ram_free (Fam Zheng, 2016-03-07; 1 file, -1/+1)

    The only caller now knows exactly which RAMBlock to free, so it's not
    necessary to do the lookup.

    Reviewed-by: Gonglei <arei.gonglei@huawei.com>
    Signed-off-by: Fam Zheng <famz@redhat.com>
    Message-Id: <1456813104-25902-6-git-send-email-famz@redhat.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* exec: Return RAMBlock pointer from allocating functions (Fam Zheng, 2016-03-07; 1 file, -11/+11)

    Previously we returned RAMBlock.offset; now return the pointer to the
    whole structure. ram_block_add() now returns void; errors are passed
    entirely through errp.

    Reviewed-by: Gonglei <arei.gonglei@huawei.com>
    Signed-off-by: Fam Zheng <famz@redhat.com>
    Message-Id: <1456813104-25902-2-git-send-email-famz@redhat.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* memory: fix usage of find_next_bit and find_next_zero_bit (Paolo Bonzini, 2016-02-10; 1 file, -19/+36)

    The last two arguments to these functions are the last and first bit
    to check relative to the base. The code was incorrectly passing the
    first bit and the number of bits. Fix this in
    cpu_physical_memory_get_dirty and cpu_physical_memory_all_dirty. This
    requires a few changes in the iteration; change the code in
    cpu_physical_memory_set_dirty_range to match.

    Fixes: 5b82b70
    Cc: Stefan Hajnoczi <stefanha@redhat.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
    Tested-by: Leon Alrae <leon.alrae@imgtec.com>
    Tested-by: Thomas Huth <thuth@redhat.com>
    Message-id: 1455113505-11237-1-git-send-email-pbonzini@redhat.com
    Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
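    The calling convention being fixed, as a sketch (variable names
    illustrative):

        /* find_next_bit(addr, size, offset): 'size' is the limit in
         * bits from the start of the bitmap (one past the last bit to
         * check) and 'offset' is the first bit to check - not a count
         * relative to 'offset'. */
        next = find_next_bit(bitmap, base + npages /* limit */,
                             base + start         /* first bit */);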
* memory: RCU ram_list.dirty_memory[] for safe RAM hotplug (Stefan Hajnoczi, 2016-02-09; 1 file, -24/+165)

    Although accesses to ram_list.dirty_memory[] use atomics, so multiple
    threads can safely dirty the bitmap, the data structure is not fully
    thread-safe yet.

    This patch handles the RAM hotplug case where ram_list.dirty_memory[]
    is grown. ram_list.dirty_memory[] is changed from a regular bitmap to
    an RCU array of pointers to fixed-size bitmap blocks. Threads can
    continue accessing bitmap blocks while the array is being extended.
    See the comments in the code for an in-depth explanation of struct
    DirtyMemoryBlocks.

    I have tested that live migration with virtio-blk dataplane works.

    Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    Message-Id: <1453728801-5398-2-git-send-email-stefanha@redhat.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
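    The shape of the structure described above, as a sketch (the real
    definition and its invariants live in this header):

        typedef struct {
            struct rcu_head rcu;      /* reclaim old arrays via call_rcu */
            unsigned long *blocks[];  /* fixed-size dirty bitmap chunks */
        } DirtyMemoryBlocks;

    Growing the array copies only the block pointers into a larger array
    and publishes it with an RCU store; readers still holding the old
    array continue to see valid, shared bitmap blocks.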
* memory: add early bail out from cpu_physical_memory_set_dirty_range (Paolo Bonzini, 2016-02-09; 1 file, -0/+4)

    This condition is true in the common case, so we can cut out the body
    of the function. In addition, this makes it easier for the compiler
    to do at least partial inlining, even if it decides that fully
    inlining the function is unreasonable.

    Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* ram: Split host_from_stream_offset() into two helper functions (zhanghailiang, 2016-02-05; 1 file, -2/+6)

    Split host_from_stream_offset() into two parts: one gets the
    RAMBlock, whose idstr may be read from the migration stream; the
    other gets the HVA (host address) from the block and the offset.
    Besides, move the range check into a new helper,
    offset_in_ramblock().

    Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
    Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
    Reviewed-by: Amit Shah <amit.shah@redhat.com>
    Message-Id: <1452829066-9764-2-git-send-email-zhang.zhanghailiang@huawei.com>
    Signed-off-by: Amit Shah <amit.shah@redhat.com>
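    The helper named above, as it appears in this header (modulo later
    cleanups):

        static inline bool offset_in_ramblock(RAMBlock *b, ram_addr_t offset)
        {
            return b && b->host && offset < b->used_length;
        }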