summaryrefslogtreecommitdiffstats
path: root/block/io_uring.c
Commit message (Collapse)AuthorAgeFilesLines
* aio-posix: split poll check from ready handlerStefan Hajnoczi2022-01-121-8/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adaptive polling measures the execution time of the polling check plus handlers called when a polled event becomes ready. Handlers can take a significant amount of time, making it look like polling was running for a long time when in fact the event handler was running for a long time. For example, on Linux the io_submit(2) syscall invoked when a virtio-blk device's virtqueue becomes ready can take 10s of microseconds. This can exceed the default polling interval (32 microseconds) and cause adaptive polling to stop polling. By excluding the handler's execution time from the polling check we make the adaptive polling calculation more accurate. As a result, the event loop now stays in polling mode where previously it would have fallen back to file descriptor monitoring. The following data was collected with virtio-blk num-queues=2 event_idx=off using an IOThread. Before: 168k IOPS, IOThread syscalls: 9837.115 ( 0.020 ms): IO iothread1/620155 io_submit(ctx_id: 140512552468480, nr: 16, iocbpp: 0x7fcb9f937db0) = 16 9837.158 ( 0.002 ms): IO iothread1/620155 write(fd: 103, buf: 0x556a2ef71b88, count: 8) = 8 9837.161 ( 0.001 ms): IO iothread1/620155 write(fd: 104, buf: 0x556a2ef71b88, count: 8) = 8 9837.163 ( 0.001 ms): IO iothread1/620155 ppoll(ufds: 0x7fcb90002800, nfds: 4, tsp: 0x7fcb9f1342d0, sigsetsize: 8) = 3 9837.164 ( 0.001 ms): IO iothread1/620155 read(fd: 107, buf: 0x7fcb9f939cc0, count: 512) = 8 9837.174 ( 0.001 ms): IO iothread1/620155 read(fd: 105, buf: 0x7fcb9f939cc0, count: 512) = 8 9837.176 ( 0.001 ms): IO iothread1/620155 read(fd: 106, buf: 0x7fcb9f939cc0, count: 512) = 8 9837.209 ( 0.035 ms): IO iothread1/620155 io_submit(ctx_id: 140512552468480, nr: 32, iocbpp: 0x7fca7d0cebe0) = 32 174k IOPS (+3.6%), IOThread syscalls: 9809.566 ( 0.036 ms): IO iothread1/623061 io_submit(ctx_id: 140539805028352, nr: 32, iocbpp: 0x7fd0cdd62be0) = 32 9809.625 ( 0.001 ms): IO iothread1/623061 write(fd: 103, buf: 0x5647cfba5f58, count: 8) = 8 9809.627 ( 0.002 ms): IO iothread1/623061 write(fd: 104, buf: 0x5647cfba5f58, count: 8) = 8 9809.663 ( 0.036 ms): IO iothread1/623061 io_submit(ctx_id: 140539805028352, nr: 32, iocbpp: 0x7fd0d0388b50) = 32 Notice that ppoll(2) and eventfd read(2) syscalls are eliminated because the IOThread stays in polling mode instead of falling back to file descriptor monitoring. As usual, polling is not implemented on Windows so this patch ignores the new io_poll_read() callback in aio-win32.c. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20211207132336.36627-2-stefanha@redhat.com [Fixed up aio_set_event_notifier() calls in tests/unit/test-fdmon-epoll.c added after this series was queued. --Stefan] Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* block/io_uring: resubmit when result is -EAGAINFabian Ebner2021-07-291-1/+15
| | | | | | | | | | | | | | | Linux SCSI can throw spurious -EAGAIN in some corner cases in its completion path, which will end up being the result in the completed io_uring request. Resubmitting such requests should allow block jobs to complete, even if such spurious errors are encountered. Co-authored-by: Stefan Hajnoczi <stefanha@gmail.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Fabian Ebner <f.ebner@proxmox.com> Message-id: 20210729091029.65369-1-f.ebner@proxmox.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* io_uring: do not use pointer after freePaolo Bonzini2020-11-171-1/+1
| | | | | | | | | | | Even though only the pointer value is only printed, it is untidy and Coverity complains. Cc: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20201113154102.1460459-1-pbonzini@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* io_uring: use io_uring_cq_ready() to check for ready cqesStefano Garzarella2020-06-051-6/+3Star
| | | | | | | | | | In qemu_luring_poll_cb() we are not using the cqe peeked from the CQ ring. We are using io_uring_peek_cqe() only to see if there are cqes ready, so we can replace it with io_uring_cq_ready(). Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20200519134942.118178-1-sgarzare@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* io_uring: retry io_uring_submit() if it fails with errno=EINTRStefano Garzarella2020-06-051-1/+1
| | | | | | | | | | | | | | | | | | As recently documented [1], io_uring_enter(2) syscall can return an error (errno=EINTR) if the operation was interrupted by a delivery of a signal before it could complete. This should happen when IORING_ENTER_GETEVENTS flag is used, for example during io_uring_submit_and_wait() or during io_uring_submit() when IORING_SETUP_IOPOLL is enabled. We shouldn't have this problem for now, but it's better to prevent it. [1] https://github.com/axboe/liburing/commit/344355ec6619de8f4e64584c9736530b5346e4f4 Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20200519133041.112138-1-sgarzare@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* block/io_uring: Remove superfluous semicolonPhilippe Mathieu-Daudé2020-02-181-1/+1
| | | | | | | Fixes: 6663a0a3376 Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200218094402.26625-5-philmd@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block/io_uring: adds userspace completion pollingAarushi Mehta2020-01-301-1/+16
| | | | | | | | | Signed-off-by: Aarushi Mehta <mehta.aaru20@gmail.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20200120141858.587874-11-stefanha@redhat.com Message-Id: <20200120141858.587874-11-stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* block: add trace events for io_uringAarushi Mehta2020-01-301-3/+20
| | | | | | | | | Signed-off-by: Aarushi Mehta <mehta.aaru20@gmail.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20200120141858.587874-10-stefanha@redhat.com Message-Id: <20200120141858.587874-10-stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* block/io_uring: implements interfaces for io_uringAarushi Mehta2020-01-301-0/+401
Aborts when sqe fails to be set as sqes cannot be returned to the ring. Adds slow path for short reads for older kernels Signed-off-by: Aarushi Mehta <mehta.aaru20@gmail.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20200120141858.587874-5-stefanha@redhat.com Message-Id: <20200120141858.587874-5-stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>