summaryrefslogtreecommitdiffstats
path: root/util/oslib-posix.c
Commit message (Collapse)AuthorAgeFilesLines
* util: Fix broken build on HaikuThomas Huth2022-07-181-4/+0Star
| | | | | | | | | | | | A recent commit moved some Haiku-specific code parts from oslib-posix.c to cutils.c, but failed to move the corresponding header #include statement, too, so "make vm-build-haiku.x86_64" is currently broken. Fix it by moving the header #include, too. Fixes: 06680b15b4 ("include: move qemu_*_exec_dir() to cutils") Message-Id: <20220718172026.139004-1-thuth@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
* include: move qemu_*_exec_dir() to cutilsMarc-André Lureau2022-05-281-84/+2Star
| | | | | | | | | The function is required by get_relocated_path() (already in cutils), and used by qemu-ga and may be generally useful. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20220525144140.591926-2-marcandre.lureau@redhat.com>
* util: rename qemu_*block() socket functionsMarc-André Lureau2022-05-031-4/+4
| | | | | | | | | | | | | | | | | | The qemu_*block() functions are meant to be be used with sockets (the win32 implementation expects SOCKET) Over time, those functions where used with Win32 SOCKET or file-descriptors interchangeably. But for portability, they must only be used with socket-like file-descriptors. FDs can use g_unix_set_fd_nonblocking() instead. Rename the functions with "socket" in the name to prevent bad usages. This is effectively reverting commit f9e8cacc5557e43 ("oslib-posix: rename socket_set_nonblock() to qemu_set_nonblock()"). Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
* Replace fcntl(O_NONBLOCK) with g_unix_set_fd_nonblocking()Marc-André Lureau2022-05-031-14/+2Star
| | | | | | Suggested-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
* Replace qemu_pipe() with g_unix_open_pipe()Marc-André Lureau2022-05-031-22/+0Star
| | | | | | | | GLib g_unix_open_pipe() is essentially like qemu_pipe(), available since 2.30. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
* block: move fcntl_setfl()Marc-André Lureau2022-05-031-15/+0Star
| | | | | | | It is only used by block/file-posix.c, move it there. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
* util: replace qemu_get_local_state_pathname()Marc-André Lureau2022-04-211-5/+2Star
| | | | | | | | | Simplify the function to only return the directory path. Callers are adjusted to use the GLib function to build paths, g_build_filename(). Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20220420132624.2439741-39-marcandre.lureau@redhat.com>
* util: use qemu_create() in qemu_write_pidfile()Marc-André Lureau2022-04-211-2/+1Star
| | | | | | | | | qemu_open_old(O_CREATE) should be replaced with qemu_create() which handles Error reporting. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20220420132624.2439741-38-marcandre.lureau@redhat.com>
* util: use qemu_write_full() in qemu_write_pidfile()Marc-André Lureau2022-04-211-1/+1
| | | | | | | | Mostly for correctness. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20220420132624.2439741-37-marcandre.lureau@redhat.com>
* qga: move qga_get_host_name()Marc-André Lureau2022-04-211-35/+0Star
| | | | | | | | | The function is specific to qemu-ga, no need to share it in QEMU. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com> Message-Id: <20220420132624.2439741-32-marcandre.lureau@redhat.com>
* include: move qemu_msync() to osdepMarc-André Lureau2022-04-211-0/+18
| | | | | | | | | The implementation depends on the OS. (and longer-term goal is to move cutils to a common subproject) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20220420132624.2439741-21-marcandre.lureau@redhat.com>
* Remove qemu-common.h include from most unitsMarc-André Lureau2022-04-061-1/+0Star
| | | | | | Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20220323155743.1585078-33-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* Move fcntl_setfl() to oslib-posixMarc-André Lureau2022-04-061-0/+15
| | | | | | | | | | It is only implemented for POSIX anyway. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220323155743.1585078-30-marcandre.lureau@redhat.com> [Add braces around if statements. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* Replace qemu_real_host_page variables with inlined functionsMarc-André Lureau2022-04-061-4/+4
| | | | | | | | | | | | Replace the global variables with inlined helper functions. getpagesize() is very likely annotated with a "const" function attribute (at least with glibc), and thus optimization should apply even better. This avoids the need for a constructor initialization too. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20220323155743.1585078-12-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* util: Put qemu_vfree() in memalign.cPeter Maydell2022-03-071-6/+0Star
| | | | | | | | | | qemu_vfree() is the companion free function to qemu_memalign(); put it in memalign.c so the allocation and free functions are together. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20220226180723.1706285-9-peter.maydell@linaro.org Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
* util: Share qemu_try_memalign() implementation between POSIX and WindowsPeter Maydell2022-03-071-29/+0Star
| | | | | | | | | | | | | | | The qemu_try_memalign() functions for POSIX and Windows used to be significantly different, but these days they are identical except for the actual allocation function called, and the POSIX version already has to have ifdeffery for different allocation functions. Move to a single implementation in memalign.c, which uses the Windows _aligned_malloc if we detect that function in meson. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20220226180723.1706285-7-peter.maydell@linaro.org Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
* util: Return valid allocation for qemu_try_memalign() with zero sizePeter Maydell2022-03-071-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | Currently qemu_try_memalign()'s behaviour if asked to allocate 0 bytes is rather variable: * on Windows, we will assert * on POSIX platforms, we get the underlying behaviour of the posix_memalign() or equivalent function, which may be either "return a valid non-NULL pointer" or "return NULL" Explictly check for 0 byte allocations, so we get consistent behaviour across platforms. We handle them by incrementing the size so that we return a valid non-NULL pointer that can later be passed to qemu_vfree(). This is permitted behaviour for the posix_memalign() API and is the most usual way that underlying malloc() etc implementations handle a zero-sized allocation request, because it won't trip up calling code that assumes NULL means an error. (This includes our own qemu_memalign(), which will abort on NULL.) This change is a preparation for sharing the qemu_try_memalign() code between Windows and POSIX. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
* util: Unify implementations of qemu_memalign()Peter Maydell2022-03-071-14/+0Star
| | | | | | | | | | | | | | | | | | | | | | | We implement qemu_memalign() in both oslib-posix.c and oslib-win32.c, but the two versions are essentially the same: they call qemu_try_memalign(), and abort() after printing an error message if it fails. The only difference is that the win32 version prints the GetLastError() value whereas the POSIX version prints strerror(errno). However, this is a bug in the win32 version: in commit dfbd0b873a85021 in 2020 we changed the implementation of qemu_try_memalign() from using VirtualAlloc() (which sets the GetLastError() value) to using _aligned_malloc() (which sets errno), but didn't update the error message to match. Replace the two separate functions with a single version in a new memalign.c file, which drops the unnecessary extra qemu_oom_check() function and instead prints a more useful message including the requested size and alignment as well as the errno string. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20220226180723.1706285-4-peter.maydell@linaro.org Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
* util: Make qemu_oom_check() a static functionPeter Maydell2022-03-071-1/+1
| | | | | | | | | | | The qemu_oom_check() function, which we define in both oslib-posix.c and oslib-win32.c, is now used only locally in that file; make it static. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20220226180723.1706285-3-peter.maydell@linaro.org
* include: Move qemu_madvise() and related #defines to new qemu/madvise.hPeter Maydell2022-02-211-0/+1
| | | | | | | | | | | The function qemu_madvise() and the QEMU_MADV_* constants associated with it are used in only 10 files. Move them out of osdep.h to a new qemu/madvise.h header that is included where it is needed. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20220208200856.3558249-2-peter.maydell@linaro.org
* util/oslib-posix: Fix missing unlock in the error path of os_mem_prealloc()David Hildenbrand2022-02-061-0/+1
| | | | | | | | | | | | | | | | | | We're missing an unlock in case installing the signal handler failed. Fortunately, we barely see this error in real life. Fixes: a960d6642d39 ("util/oslib-posix: Support concurrent os_mem_prealloc() invocation") Fixes: CID 1468941 Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Pankaj Gupta <pankaj.gupta@ionos.com> Cc: Daniel P. Berrangé <berrange@redhat.com> Cc: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20220111120830.119912-1-david@redhat.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* util/oslib-posix: Forward SIGBUS to MCE handler under LinuxDavid Hildenbrand2022-01-081-3/+34
| | | | | | | | | | | | | | | | | Temporarily modifying the SIGBUS handler is really nasty, as we might be unlucky and receive an MCE SIGBUS while having our handler registered. Unfortunately, there is no way around messing with SIGBUS when MADV_POPULATE_WRITE is not applicable or not around. Let's forward SIGBUS that don't belong to us to the already registered handler and document the situation. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-8-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* util/oslib-posix: Support concurrent os_mem_prealloc() invocationDavid Hildenbrand2022-01-071-0/+9
| | | | | | | | | | | | | | | | | | | | | | Add a mutex to protect the SIGBUS case, as we cannot mess concurrently with the sigbus handler and we have to manage the global variable sigbus_memset_context. The MADV_POPULATE_WRITE path can run concurrently. Note that page_mutex and page_cond are shared between concurrent invocations, which shouldn't be a problem. This is a preparation for future virtio-mem prealloc code, which will call os_mem_prealloc() asynchronously from an iothread when handling guest requests. Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-7-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* util/oslib-posix: Avoid creating a single thread with MADV_POPULATE_WRITEDavid Hildenbrand2022-01-071-0/+8
| | | | | | | | | | | | | Let's simplify the case when we only want a single thread and don't have to mess with signal handlers. Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-6-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* util/oslib-posix: Don't create too many threads with small memory or little ↵David Hildenbrand2022-01-071-2/+10
| | | | | | | | | | | | | | | | | pages Let's limit the number of threads to something sane, especially that - We don't have more threads than the number of pages we have - We don't have threads that initialize small (< 64 MiB) memory Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-5-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* util/oslib-posix: Introduce and use MemsetContext for touch_all_pages()David Hildenbrand2022-01-071-26/+47
| | | | | | | | | | | | | | | | | Let's minimize the number of global variables to prepare for os_mem_prealloc() getting called concurrently and make the code a bit easier to read. The only consumer that really needs a global variable is the sigbus handler, which will require protection via a mutex in the future either way as we cannot concurrently mess with the SIGBUS handler. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-4-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* util/oslib-posix: Support MADV_POPULATE_WRITE for os_mem_prealloc()David Hildenbrand2022-01-071-21/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Let's sense support and use it for preallocation. MADV_POPULATE_WRITE does not require a SIGBUS handler, doesn't actually touch page content, and avoids context switches; it is, therefore, faster and easier to handle than our current approach. While MADV_POPULATE_WRITE is, in general, faster than manual prefaulting, and especially faster with 4k pages, there is still value in prefaulting using multiple threads to speed up preallocation. More details on MADV_POPULATE_WRITE can be found in the Linux commits 4ca9b3859dac ("mm/madvise: introduce MADV_POPULATE_(READ|WRITE) to prefault page tables") and eb2faa513c24 ("mm/madvise: report SIGBUS as -EFAULT for MADV_POPULATE_(READ|WRITE)"), and in the man page proposal [1]. This resolves the TODO in do_touch_pages(). In the future, we might want to look into using fallocate(), eventually combined with MADV_POPULATE_READ, when dealing with shared file/fd mappings and not caring about memory bindings. [1] https://lkml.kernel.org/r/20210816081922.5155-1-david@redhat.com Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-3-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* util/oslib-posix: Let touch_all_pages() return an errorDavid Hildenbrand2022-01-071-12/+16
| | | | | | | | | | | | | | | | | Let's prepare touch_all_pages() for returning differing errors. Return an error from the thread and report the last processed error. Translate SIGBUS to -EFAULT, as a SIGBUS can mean all different kind of things (memory error, read error, out of memory). When allocating memory fails via the current SIGBUS-based mechanism, we'll get: os_mem_prealloc: preallocating memory failed: Bad address Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-2-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* memory: Introduce RAM_NORESERVE and wire it up in qemu_ram_mmap()David Hildenbrand2021-06-151-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Let's introduce RAM_NORESERVE, allowing mmap'ing with MAP_NORESERVE. The new flag has the following semantics: " RAM is mmap-ed with MAP_NORESERVE. When set, reserving swap space (or huge pages if applicable) is skipped: will bail out if not supported. When not set, the OS will do the reservation, if supported for the memory type. " Allow passing it into: - memory_region_init_ram_nomigrate() - memory_region_init_resizeable_ram() - memory_region_init_ram_from_file() ... and teach qemu_ram_mmap() and qemu_anon_ram_alloc() about the flag. Bail out if the flag is not supported, which is the case right now for both, POSIX and win32. We will add Linux support next and allow specifying RAM_NORESERVE via memory backends. The target use case is virtio-mem, which dynamically exposes memory inside a large, sparse memory area to the VM. Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20210510114328.21835-9-david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* util/mmap-alloc: Pass flags instead of separate bools to qemu_ram_mmap()David Hildenbrand2021-06-151-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | Let's pass flags instead of bools to prepare for passing other flags and update the documentation of qemu_ram_mmap(). Introduce new QEMU_MAP_ flags that abstract the mmap() PROT_ and MAP_ flag handling and simplify it. We expose only flags that are currently supported by qemu_ram_mmap(). Maybe, we'll see qemu_mmap() in the future as well that can implement these flags. Note: We don't use MAP_ flags as some flags (e.g., MAP_SYNC) are only defined for some systems and we want to always be able to identify these flags reliably inside qemu_ram_mmap() -- for example, to properly warn when some future flags are not available or effective on a system. Also, this way we can simplify PROT_ handling as well. Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20210510114328.21835-8-david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* oslib-posix: Remove OpenBSD workaround for fcntl("/dev/null", F_SETFL, ↵Brad Smith2021-06-041-11/+0Star
| | | | | | | | | | | | | | | O_NONBLOCK) failure OpenBSD prior to 6.3 required a workaround to utilize fcntl(F_SETFL) on memory devices. Since modern verions of OpenBSD that are only officialy supported and buildable on do not have this issue I am garbage collecting this workaround. Signed-off-by: Brad Smith <brad@comstyle.com> Message-Id: <YGYECGXQhdamEJgC@humpty.home.comstyle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* memory: alloc RAM from file at offsetJagannathan Raman2021-02-091-1/+1
| | | | | | | | | | | | | Allow RAM MemoryRegion to be created from an offset in a file, instead of allocating at offset of 0 by default. This is needed to synchronize RAM between QEMU & remote process. Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Signed-off-by: John G Johnson <john.g.johnson@oracle.com> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 609996697ad8617e3b01df38accc5c208c24d74e.1611938319.git.jag.raman@oracle.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* memory: add readonly support to memory_region_init_ram_from_file()Stefan Hajnoczi2021-02-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is currently no way to open(O_RDONLY) and mmap(PROT_READ) when creating a memory region from a file. This functionality is needed since the underlying host file may not allow writing. Add a bool readonly argument to memory_region_init_ram_from_file() and the APIs it calls. Extend memory_region_init_ram_from_file() rather than introducing a memory_region_init_rom_from_file() API so that callers can easily make a choice between read/write and read-only at runtime without calling different APIs. No new RAMBlock flag is introduced for read-only because it's unclear whether RAMBlocks need to know that they are read-only. Pass a bool readonly argument instead. Both of these design decisions can be changed in the future. It just seemed like the simplest approach to me. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20210104171320.575838-2-stefanha@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
* util/oslib: Assert qemu_try_memalign() alignment is a power of 2Philippe Mathieu-Daudé2021-01-071-0/+2
| | | | | | | | | | | | | | | | | qemu_try_memalign() expects a power of 2 alignment: - posix_memalign(3): The address of the allocated memory will be a multiple of alignment, which must be a power of two and a multiple of sizeof(void *). - _aligned_malloc() The alignment value, which must be an integer power of 2. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20201021173803.2619054-3-philmd@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* cfi: Initial support for cfi-icall in QEMUDaniele Buono2021-01-021-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | LLVM/Clang, supports runtime checks for forward-edge Control-Flow Integrity (CFI). CFI on indirect function calls (cfi-icall) ensures that, in indirect function calls, the function called is of the right signature for the pointer type defined at compile time. For this check to work, the code must always respect the function signature when using function pointer, the function must be defined at compile time, and be compiled with link-time optimization. This rules out, for example, shared libraries that are dynamically loaded (given that functions are not known at compile time), and code that is dynamically generated at run-time. This patch: 1) Introduces the CONFIG_CFI flag to support cfi in QEMU 2) Introduces a decorator to allow the definition of "sensitive" functions, where a non-instrumented function may be called at runtime through a pointer. The decorator will take care of disabling cfi-icall checks on such functions, when cfi is enabled. 3) Marks functions currently in QEMU that exhibit such behavior, in particular: - The function in TCG that calls pre-compiled TBs - The function in TCI that interprets instructions - Functions in the plugin infrastructures that jump to callbacks - Functions in util that directly call a signal handler Signed-off-by: Daniele Buono <dbuono@linux.vnet.ibm.com> Acked-by: Alex Bennée <alex.bennee@linaro.org Message-Id: <20201204230615.2392-3-dbuono@linux.vnet.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* oslib-posix: relocate path to /varPaolo Bonzini2020-09-301-2/+4
| | | | | Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* oslib-posix: default exec_dir to bindirPaolo Bonzini2020-09-301-15/+8Star
| | | | | | | | | If the exec_dir cannot be retrieved, just assume it's the installation directory that was specified at configure time. This makes it simpler to reason about what the callers will do if they get back an empty path. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* oslib: do not call g_strdup from qemu_get_exec_dirPaolo Bonzini2020-09-301-3/+5
| | | | | | | | Just return the directory without requiring the caller to free it. This also removes a bogus check for NULL in os_find_datadir and module_load_one; g_strdup of a static variable cannot return NULL. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* util: rename qemu_open() to qemu_open_old()Daniel P. Berrangé2020-09-161-1/+1
| | | | | | | | | | | We want to introduce a new version of qemu_open() that uses an Error object for reporting problems and make this it the preferred interface. Rename the existing method to release the namespace for the new impl. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
* util: add qemu_get_host_physmem utility functionAlex Bennée2020-07-271-0/+15
| | | | | | | | | | | | | | | | This will be used in a future patch. For POSIX systems _SC_PHYS_PAGES isn't standardised but at least appears in the man pages for Open/FreeBSD. The result is advisory so any users of it shouldn't just fail if we can't work it out. The win32 stub currently returns 0 until someone with a Windows system can develop and test a patch. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Cc: BALATON Zoltan <balaton@eik.bme.hu> Cc: Christian Ehrhardt <christian.ehrhardt@canonical.com> Message-Id: <20200724064509.331-5-alex.bennee@linaro.org>
* util: Implement qemu_get_thread_id() for OpenBSDDavid CARLIER2020-07-201-0/+2
| | | | | | | | | | | | Implement qemu_get_thread_id() for OpenBSD hosts, using getthrid(). Signed-off-by: David Carlier <devnexen@gmail.com> Reviewed-by: Brad Smith <brad@comstyle.com> Message-id: CA+XhMqxD6gQDBaj8tX0CMEj3si7qYKsM8u1km47e_-U7MC37Pg@mail.gmail.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> [PMM: tidied up commit message] Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* net: check if the file descriptor is valid before using itLaurent Vivier2020-07-151-8/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | qemu_set_nonblock() checks that the file descriptor can be used and, if not, crashes QEMU. An assert() is used for that. The use of assert() is used to detect programming error and the coredump will allow to debug the problem. But in the case of the tap device, this assert() can be triggered by a misconfiguration by the user. At startup, it's not a real problem, but it can also happen during the hot-plug of a new device, and here it's a problem because we can crash a perfectly healthy system. For instance: # ip link add link virbr0 name macvtap0 type macvtap mode bridge # ip link set macvtap0 up # TAP=/dev/tap$(ip -o link show macvtap0 | cut -d: -f1) # qemu-system-x86_64 -machine q35 -device pcie-root-port,id=pcie-root-port-0 -monitor stdio 9<> $TAP (qemu) netdev_add type=tap,id=hostnet0,vhost=on,fd=9 (qemu) device_add driver=virtio-net-pci,netdev=hostnet0,id=net0,bus=pcie-root-port-0 (qemu) device_del net0 (qemu) netdev_del hostnet0 (qemu) netdev_add type=tap,id=hostnet1,vhost=on,fd=9 qemu-system-x86_64: .../util/oslib-posix.c:247: qemu_set_nonblock: Assertion `f != -1' failed. Aborted (core dumped) To avoid that, add a function, qemu_try_set_nonblock(), that allows to report the problem without crashing. In the same way, we also update the function for vhostfd in net_init_tap_one() and for fd in net_init_socket() (both descriptors are provided by the user and can be wrong). Signed-off-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
* util: Introduce qemu_get_host_name()Michal Privoznik2020-07-141-0/+35
| | | | | | | | | | | This function offers operating system agnostic way to fetch host name. It is implemented for both POSIX-like and Windows systems. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Cc: qemu-stable@nongnu.org Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
* util/oslib-posix.c: Implement qemu_init_exec_dir() for HaikuDavid CARLIER2020-07-131-0/+19
| | | | | | | | | | | | | The qemu_init_exec_dir() function is inherently non-portable; provide an implementation for Haiku hosts. Signed-off-by: David Carlier <devnexen@gmail.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 20200703145614.16684-9-peter.maydell@linaro.org [PMM: Expanded commit message] Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* osdep.h: Always include <sys/signal.h> if it existsDavid CARLIER2020-07-131-1/+0Star
| | | | | | | | | | | | | | | | | | | | Regularize our handling of <sys/signal.h>: currently we include it in osdep.h, but only for OpenBSD, and we include it without an ifdef guard in a couple of C files. This causes problems for Haiku, which doesn't have that header. Instead, check in configure whether sys/signal.h exists, and if it does then always include it from osdep.h. Signed-off-by: David Carlier <devnexen@gmail.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 20200703145614.16684-5-peter.maydell@linaro.org [PMM: Expanded commit message; rename to HAVE_SYS_SIGNAL_H] Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* util/oslib-posix : qemu_init_exec_dir implementation for MacDavid CARLIER2020-06-231-0/+15
| | | | | | | | | | | | | | From 3025a0ce3fdf7d3559fc35a52c659f635f5c750c Mon Sep 17 00:00:00 2001 From: David Carlier <devnexen@gmail.com> Date: Tue, 26 May 2020 21:35:27 +0100 Subject: [PATCH] util/oslib-posix : qemu_init_exec_dir implementation for Mac Using dyld API to get the full path of the current process. Signed-off-by: David Carlier <devnexen@gmail.com> Message-id: CA+XhMqxwC10XHVs4Z-JfE0-WLAU3ztDuU9QKVi31mjr59HWCxg@mail.gmail.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* util/oslib: Returns the real thread identifier on FreeBSD and NetBSDDavid Carlier2020-06-101-0/+9
| | | | | | | | | | getpid is good enough in a mono thread context, however thr_self/_lwp_self reflects the real current thread identifier from a given process. Signed-off-by: David Carlier <devnexen@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: David Carlier <devnexen@gmail.com>
* oslib-posix: take lock before qemu_cond_broadcastBauerchen2020-04-111-0/+3
| | | | | | | | | | In touch_all_pages, if the mutex is not taken around qemu_cond_broadcast, qemu_cond_broadcast may be called before all touch page threads enter qemu_cond_wait. In this case, the touch page threads wait forever for the main thread to wake them up, causing a deadlock. Signed-off-by: Bauerchen <bauerchen@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* oslib-posix: initialize mutex and condition variablePaolo Bonzini2020-03-161-0/+7
| | | | | | | | | | The mutex and condition variable were never initialized, causing -mem-prealloc to abort with an assertion failure. Fixes: 037fb5eb3941c80a2b7c36a843e47207ddb004d4 Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com> Cc: bauerchen <bauerchen@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* mem-prealloc: optimize large guest startupbauerchen2020-02-251-8/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [desc]: Large memory VM starts slowly when using -mem-prealloc, and there are some areas to optimize in current method; 1、mmap will be used to alloc threads stack during create page clearing threads, and it will attempt mm->mmap_sem for write lock, but clearing threads have hold read lock, this competition will cause threads createion very slow; 2、methods of calcuating pages for per threads is not well;if we use 64 threads to split 160 hugepage,63 threads clear 2page,1 thread clear 34 page,so the entire speed is very slow; to solve the first problem,we add a mutex in thread function,and start all threads when all threads finished createion; and the second problem, we spread remainder to other threads,in situation that 160 hugepage and 64 threads, there are 32 threads clear 3 pages,and 32 threads clear 2 pages. [test]: 320G 84c VM start time can be reduced to 10s 680G 84c VM start time can be reduced to 18s Signed-off-by: bauerchen <bauerchen@tencent.com> Reviewed-by: Pan Rui <ruippan@tencent.com> Reviewed-by: Ivan Ren <ivanren@tencent.com> [Simplify computation of the number of pages per thread. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>