path: root/job.c
Commit log, newest first. Each entry lists the commit message, author, date, and diffstat (files changed, -removed/+added lines).

* block: remove bdrv_try_set_aio_context and replace it with bdrv_try_change_aio_context
  Emanuele Giuseppe Esposito, 2022-10-27 (1 file changed, -1/+1)

  No functional change intended.

  Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
  Reviewed-by: Kevin Wolf <kwolf@redhat.com>
  Message-Id: <20221025084952.2139888-11-eesposit@redhat.com>
  Signed-off-by: Kevin Wolf <kwolf@redhat.com>

* job: remove unused functions
  Emanuele Giuseppe Esposito, 2022-10-07 (1 file changed, -101/+6)

  These public functions are not used anywhere and can thus be dropped.

  Also, since this is the final job API that doesn't use the AioContext
  lock and replaces it with job_lock, adjust all remaining function
  documentation to clearly specify whether the job lock is taken or not.

  Also document the locking requirements for a few functions where the
  second version is not removed.

  Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
  Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
  Reviewed-by: Kevin Wolf <kwolf@redhat.com>
  Message-Id: <20220926093214.506243-22-eesposit@redhat.com>
  Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
  Signed-off-by: Kevin Wolf <kwolf@redhat.com>

* job.c: enable job lock/unlock and remove Aiocontext locks
  Emanuele Giuseppe Esposito, 2022-10-07 (1 file changed, -79/+32)

  Change job_{lock/unlock} and the lock guard macros to use job_mutex.

  Now that they are not nops anymore, remove the AioContext lock to
  avoid deadlocks. Therefore:
  - when possible, remove the AioContext lock/unlock pair completely
  - if it is used by some other function too, reduce the locking section
    as much as possible, leaving the job API outside
  - change AIO_WAIT_WHILE into AIO_WAIT_WHILE_UNLOCKED, since we are not
    using the AioContext lock anymore

  The only functions that still need the AioContext lock are:
  - the JobDriver callbacks, already documented in job.h
  - job_cancel_sync() in replication.c, which is called with the
    AioContext lock taken; but the job code now uses
    AIO_WAIT_WHILE_UNLOCKED, so we need to release the lock

  Reduce the locking section to only cover the callback invocation and
  document the functions that take the AioContext lock, to avoid taking
  it twice.

  Also remove real_job_{lock/unlock}, as they are replaced by the public
  functions.

  Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
  Message-Id: <20220926093214.506243-19-eesposit@redhat.com>
  Reviewed-by: Kevin Wolf <kwolf@redhat.com>
  Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
  Signed-off-by: Kevin Wolf <kwolf@redhat.com>

* jobs: protect job.aio_context with BQL and job_mutex
  Emanuele Giuseppe Esposito, 2022-10-07 (1 file changed, -0/+12)

  In order to make it thread safe, implement a "fake rwlock", where we
  allow reads under BQL *or* job_mutex held, but writes only under BQL
  *and* job_mutex.

  The only write we have is in child_job_set_aio_ctx, which always
  happens under drain (so the job is paused). For this reason, introduce
  job_set_aio_context and make sure that the context is set under BQL,
  job_mutex and drain. Also make sure all other places where the
  AioContext is read are protected.

  The reads in commit.c and mirror.c are actually safe, because they are
  always done under BQL.

  Note: at this stage, job_{lock/unlock} and the job lock guard macros
  are *nop*.

  Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
  Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
  Message-Id: <20220926093214.506243-14-eesposit@redhat.com>
  Reviewed-by: Kevin Wolf <kwolf@redhat.com>
  Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
  Signed-off-by: Kevin Wolf <kwolf@redhat.com>

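  A minimal sketch of that access rule, with illustrative names (the
  commit introduces job_set_aio_context(); the getter shown here is
  hypothetical):

      /* Read: legal while holding the BQL *or* job_mutex. */
      AioContext *job_get_aio_context(Job *job)
      {
          return job->aio_context;
      }

      /* Write: requires the BQL *and* job_mutex, and only happens
       * while the job is paused under drain. */
      void job_set_aio_context(Job *job, AioContext *ctx)
      {
          assert(qemu_in_main_thread()); /* BQL */
          /* caller also holds job_mutex and keeps the job drained */
          job->aio_context = ctx;
      }
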
* job: detect change of aiocontext within job coroutine
  Paolo Bonzini, 2022-10-07 (1 file changed, -2/+17)

  We want to make sure access of job->aio_context is always done under
  either BQL or job_mutex. The problem is that using
  aio_co_enter(job->aio_context, job->co) in job_start and
  job_enter_cond makes the coroutine immediately resume, so we can't
  hold the job lock. And caching it is not safe either, as it might
  change.

  job_start is under BQL, so it can freely read job->aio_context, but
  job_enter_cond is not. We want to avoid reading job->aio_context in
  job_enter_cond, therefore:
  1) use aio_co_wake(), since it doesn't want an AioContext as argument
     but uses job->co->ctx
  2) detect a possible discrepancy between job->co->ctx and
     job->aio_context by checking, right after the coroutine resumes
     back from yielding, whether job->aio_context has changed. If so,
     reschedule the coroutine to the new context.

  Calling bdrv_try_set_aio_context() will issue the following calls
  (simplified):
  * in terms of bdrv callbacks:
    .drained_begin -> .set_aio_context -> .drained_end
  * in terms of child_job functions:
    child_job_drained_begin -> child_job_set_aio_context ->
    child_job_drained_end
  * in terms of job functions:
    job_pause_locked -> job_set_aio_context -> job_resume_locked

  We can see that after setting the new aio_context, job_resume_locked
  calls job_enter_cond again, which then invokes aio_co_wake(). But
  while job->aio_context has been set in job_set_aio_context,
  job->co->ctx has not changed, so the coroutine would be entering in
  the wrong AioContext.

  Using aio_co_schedule in job_resume_locked() might seem like a valid
  alternative, but the problem is that the bh resuming the coroutine is
  not scheduled immediately, and if in the meanwhile another
  bdrv_try_set_aio_context() is run (see test_propagate_mirror() in
  test-block-iothread.c), we would have the first schedule in the wrong
  AioContext, and the second set of drains won't even manage to schedule
  the coroutine, as job->busy would still be true from the previous
  job_resume_locked().

  The solution is to stick with aio_co_wake() and detect, every time the
  coroutine resumes back from yielding, whether job->aio_context has
  changed. If so, we can reschedule it to the new context.

  Check for the AioContext change in job_do_yield_locked because:
  1) aio_co_reschedule_self must be called from the running coroutine
  2) since child_job_set_aio_context allows changing the AioContext only
     while the job is paused, this is the exact place where the
     coroutine resumes, before running JobDriver's code.

  Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
  Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  Message-Id: <20220926093214.506243-13-eesposit@redhat.com>
  Reviewed-by: Kevin Wolf <kwolf@redhat.com>
  Signed-off-by: Kevin Wolf <kwolf@redhat.com>

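  A sketch of that re-check, modeled loosely on job_do_yield_locked()
  (timer and locking details elided):

      static void coroutine_fn job_do_yield_locked(Job *job, uint64_t ns)
      {
          /* ... arm the sleep timer, set job->busy = false, unlock ... */
          qemu_coroutine_yield();

          /* Back from yield: the job may have been moved to another
           * AioContext while it was paused under drain.  Migrate the
           * coroutine before running any JobDriver code. */
          if (job->aio_context !=
              qemu_coroutine_get_aio_context(qemu_coroutine_self())) {
              aio_co_reschedule_self(job->aio_context);
          }
      }
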
* job: move and update comments from blockjob.c
  Emanuele Giuseppe Esposito, 2022-10-07 (1 file changed, -0/+16)

  This comment applies more to job; it was left in blockjob because in
  the past the whole job logic was implemented there.

  Note: at this stage, job_{lock/unlock} and the job lock guard macros
  are *nop*.

  No functional change intended.

  Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
  Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
  Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
  Message-Id: <20220926093214.506243-7-eesposit@redhat.com>
  Reviewed-by: Kevin Wolf <kwolf@redhat.com>
  Signed-off-by: Kevin Wolf <kwolf@redhat.com>

* job.c: add job_lock/unlock while keeping job.h intact
  Emanuele Giuseppe Esposito, 2022-10-07 (1 file changed, -180/+430)

  By "intact" we mean that all job.h functions implicitly take the lock;
  therefore API callers are unmodified.

  This means that:
  - many static functions that will always be called with the job lock
    held become _locked, and call _locked functions
  - all public functions take the lock internally if needed, and call
    _locked functions
  - all public functions called internally by other functions in job.c
    will have a _locked counterpart (sometimes public), to avoid
    deadlocks (job lock already taken); these functions are not used for
    now
  - some public functions called only from external files (not job.c) do
    not have a _locked() counterpart and take the lock inside; others
    won't need the lock at all, because they use fields only set at
    initialization and never modified

  job_{lock/unlock} is independent from real_job_{lock/unlock}.

  Note: at this stage, job_{lock/unlock} and the job lock guard macros
  are *nop*.

  Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
  Reviewed-by: Kevin Wolf <kwolf@redhat.com>
  Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
  Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
  Message-Id: <20220926093214.506243-6-eesposit@redhat.com>
  Signed-off-by: Kevin Wolf <kwolf@redhat.com>

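  A sketch of that wrapper pattern; the function names here are
  illustrative, not the actual job.h API:

      /* Called with job_mutex held. */
      static bool job_is_busy_locked(Job *job)
      {
          return job->busy;
      }

      /* Public wrapper: takes the lock internally. */
      bool job_is_busy(Job *job)
      {
          JOB_LOCK_GUARD(); /* still a nop at this stage of the series */
          return job_is_busy_locked(job);
      }
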
* job.c: API functions not used outside should be static
  Emanuele Giuseppe Esposito, 2022-10-07 (1 file changed, -3/+19)

  The job_event_* functions can all be static, as they are not used
  outside job.c. The same applies to job_txn_add_job().

  Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
  Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
  Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
  Reviewed-by: Kevin Wolf <kwolf@redhat.com>
  Message-Id: <20220926093214.506243-4-eesposit@redhat.com>
  Signed-off-by: Kevin Wolf <kwolf@redhat.com>

* job.c: make job_mutex and job_lock/unlock() public
  Emanuele Giuseppe Esposito, 2022-10-07 (1 file changed, -12/+23)

  The job mutex will be used to protect the job struct elements and
  list, replacing AioContext locks.

  Right now, use a shared lock for all jobs, in order to keep things
  simple. Once the AioContext lock is gone, we can introduce per-job
  locks.

  To simplify the switch from AioContext locks to the job lock,
  introduce *nop* lock/unlock functions and macros.

  We want to always call job_lock/unlock outside the AioContext locks,
  and not vice versa, otherwise we might get a deadlock. This is not
  straightforward to do, and that's why we start with nop functions.
  Once everything is protected by job_lock/unlock, we can change the
  nops into an actual mutex and remove the AioContext lock.

  Since job_mutex is already being used, add static
  real_job_{lock/unlock} for the existing usage.

  Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
  Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
  Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
  Message-Id: <20220926093214.506243-2-eesposit@redhat.com>
  Signed-off-by: Kevin Wolf <kwolf@redhat.com>

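  A sketch of the nop stage described here: the API exists so call sites
  can be converted incrementally, but it does nothing yet:

      QemuMutex job_mutex;

      void job_lock(void)
      {
          /* nop; will become qemu_mutex_lock(&job_mutex) once all
           * callers are converted and the AioContext lock is gone */
      }

      void job_unlock(void)
      {
          /* nop; will become qemu_mutex_unlock(&job_mutex) */
      }
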
* job: add missing coroutine_fn annotations
  Paolo Bonzini, 2022-10-07 (1 file changed, -1/+1)

  Callers of coroutine_fn must be coroutine_fn themselves, or the call
  must be within "if (qemu_in_coroutine())". Apply coroutine_fn to
  functions where this holds.

  Reviewed-by: Alberto Faria <afaria@redhat.com>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  Message-Id: <20220922084924.201610-22-pbonzini@redhat.com>
  Reviewed-by: Kevin Wolf <kwolf@redhat.com>
  Signed-off-by: Kevin Wolf <kwolf@redhat.com>

* job.h: assertions in the callers of JobDriver function pointers
  Emanuele Giuseppe Esposito, 2022-03-04 (1 file changed, -0/+10)

  Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
  Message-Id: <20220303151616.325444-32-eesposit@redhat.com>
  Signed-off-by: Kevin Wolf <kwolf@redhat.com>

* job.c: add missing notifier initialization
  Emanuele Giuseppe Esposito, 2021-12-28 (1 file changed, -0/+1)

  It seems that the on_idle list is not properly initialized like the
  other notifiers.

  Fixes: 34dc97b9a0e ("blockjob: Wake up BDS when job becomes idle")
  Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
  Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
  Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
  Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
  Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

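  A sketch of the one-line fix in context, next to the existing notifier
  initializations in job_create():

      notifier_list_init(&job->on_finalize_cancelled);
      notifier_list_init(&job->on_finalize_completed);
      notifier_list_init(&job->on_pending);
      notifier_list_init(&job->on_ready);
      notifier_list_init(&job->on_idle);   /* previously missing */
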
* mirror: Do not clear .cancelled
  Hanna Reitz, 2021-10-07 (1 file changed, -1/+3)

  Clearing .cancelled before leaving the main loop when the job has been
  soft-cancelled is no longer necessary, since job_is_cancelled() only
  returns true for jobs that have been force-cancelled.

  Therefore, this only makes a difference in places that call
  job_cancel_requested(). In block/mirror.c, this is done only before
  .cancelled was cleared. In job.c, there are two callers:
  - job_completed_txn_abort() asserts that .cancelled is true, so
    keeping it true will not affect this place.
  - job_complete() refuses to let a job complete that has .cancelled
    set. It is correct to refuse to let the user invoke job-complete on
    mirror jobs that have already been soft-cancelled.

  With this change, there are no places that reset .cancelled to false,
  and so we can be sure that .force_cancel can only be true if
  .cancelled is true as well. Assert this in job_is_cancelled().

  Signed-off-by: Hanna Reitz <hreitz@redhat.com>
  Reviewed-by: Eric Blake <eblake@redhat.com>
  Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
  Message-Id: <20211006151940.214590-13-hreitz@redhat.com>
  Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

* job: Add job_cancel_requested()
  Hanna Reitz, 2021-10-07 (1 file changed, -2/+12)

  Most callers of job_is_cancelled() actually want to know whether the
  job is on its way to immediate termination. For example, we refuse to
  pause jobs that are cancelled; but this only makes sense for jobs that
  are really actually cancelled.

  A mirror job that is cancelled during READY with force=false should
  absolutely be allowed to pause. This "cancellation" (which is actually
  a kind of completion) may take an indefinite amount of time, and so
  should behave like any job during normal operation. For example, with
  on-target-error=stop, the job should stop on write errors. (In
  contrast, force-cancelled jobs should not get write errors, as they
  should just terminate and not do further I/O.)

  Therefore, redefine job_is_cancelled() to only return true for jobs
  that are force-cancelled (which as of HEAD^ means any job that
  interprets the cancellation request as a request for immediate
  termination), and add job_cancel_requested() as the general variant,
  which returns true for any jobs which have been requested to be
  cancelled, whether it be immediately or after an arbitrarily long
  completion phase.

  Finally, here is a justification for how different job_is_cancelled()
  invocations are treated by this patch:

  - block/mirror.c (mirror_run()):
    - The first invocation is a while loop that should loop until the
      job has been cancelled or scheduled for completion. What kind of
      cancel does not matter, only the fact that the job is supposed to
      end.
    - The second invocation wants to know whether the job has been
      soft-cancelled. Calling job_cancel_requested() is a bit too broad,
      but if the job were force-cancelled, we should leave the main loop
      as soon as possible anyway, so this should not matter here.
    - The last two invocations already check force_cancel, so they
      should continue to use job_is_cancelled().

  - block/backup.c, block/commit.c, block/stream.c, anything in tests/:
    These jobs know only force-cancel, so there is no difference between
    job_is_cancelled() and job_cancel_requested(). We can continue using
    job_is_cancelled().

  - job.c:
    - job_pause_point(), job_yield(), job_sleep_ns(): Only
      force-cancelled jobs should be prevented from being paused.
      Continue using job_is_cancelled().
    - job_update_rc(), job_finalize_single(), job_finish_sync(): These
      functions are all called after the job has left its main loop. The
      mirror job (the only job that can be soft-cancelled) will clear
      .cancelled before leaving the main loop if it has been
      soft-cancelled. Therefore, these functions will observe .cancelled
      to be true only if the job has been force-cancelled. We can
      continue to use job_is_cancelled(). (Furthermore, conceptually, a
      soft-cancelled mirror job should not report to have been
      cancelled. It should report completion (see also the
      block-job-cancel QAPI documentation). Therefore, it makes sense
      for these functions not to distinguish between a soft-cancelled
      mirror job and a job that has completed as normal.)
    - job_completed_txn_abort(): All jobs other than @job have been
      force-cancelled. job_is_cancelled() must be true for them.
      Regarding @job itself: job_completed_txn_abort() is mostly called
      when the job's return value is not 0. A soft-cancelled mirror has
      a return value of 0, and so will not end up here then. However,
      job_cancel() invokes job_completed_txn_abort() if the job has been
      deferred to the main loop, which is mostly the case for completed
      jobs (which skip the assertion), but not necessarily. To be safe,
      use job_cancel_requested() in this assertion.
    - job_complete(): This is the function eventually invoked by the
      user (through qmp_block_job_complete() or qmp_job_complete(), or
      job_complete_sync(), which comes from qemu-img). The intention
      here is to prevent a user from invoking job-complete after the job
      has been cancelled. This should also apply to soft cancelling:
      after a mirror job has been soft-cancelled, the user should not be
      able to decide otherwise and have it complete as normal (i.e.
      pivoting to the target).
    - job_cancel(): Both predicates are equivalent here (see the comment
      there), but we want to use job_is_cancelled(), because this shows
      that we call job_completed_txn_abort() only for force-cancelled
      jobs. (As explained for job_update_rc(), soft-cancelled jobs
      should be treated as if they have completed as normal.)

  Buglink: https://gitlab.com/qemu-project/qemu/-/issues/462
  Signed-off-by: Hanna Reitz <hreitz@redhat.com>
  Reviewed-by: Eric Blake <eblake@redhat.com>
  Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
  Message-Id: <20211006151940.214590-9-hreitz@redhat.com>
  Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

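  A sketch of the two predicates as this change leaves them:

      bool job_is_cancelled(Job *job)
      {
          /* Force-cancelled only: on its way to immediate termination.
           * .force_cancel implies .cancelled (asserted since the
           * "mirror: Do not clear .cancelled" patch above). */
          assert(!job->force_cancel || job->cancelled);
          return job->force_cancel;
      }

      bool job_cancel_requested(Job *job)
      {
          /* Any cancel request, soft or forced. */
          return job->cancelled;
      }
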
* job: Do not soft-cancel after a job is done
  Hanna Reitz, 2021-10-07 (1 file changed, -4/+21)

  The only job that supports a soft cancel mode is the mirror job, and
  in such a case it resets its .cancelled field before it leaves its
  .run() function, so it does not really count as cancelled.

  However, it is possible to cancel the job after .run() returns and
  before job_exit() (which is run in the main loop) is executed. Then,
  .cancelled would still be true and the job would count as cancelled.
  This does not seem to be in the interest of the mirror job, so adjust
  job_cancel_async() to not set .cancelled in such a case, and
  job_cancel() to not invoke job_completed_txn_abort().

  Signed-off-by: Hanna Reitz <hreitz@redhat.com>
  Reviewed-by: Eric Blake <eblake@redhat.com>
  Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
  Message-Id: <20211006151940.214590-8-hreitz@redhat.com>
  Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

* jobs: Give Job.force_cancel more meaning
  Hanna Reitz, 2021-10-07 (1 file changed, -1/+5)

  We largely have two cancel modes for jobs:

  First, there is actual cancelling. The job is terminated as soon as
  possible, without trying to reach a consistent result.

  Second, we have mirror in the READY state. Technically, the job is not
  really cancelled, but it is just a different completion mode. The job
  can still run for an indefinite amount of time while it tries to reach
  a consistent result.

  We want to be able to clearly distinguish which cancel mode a job is
  in (when it has been cancelled). We can use Job.force_cancel for this,
  but right now it only reflects cancel requests from the user with
  force=true; yet clearly, jobs that do not even distinguish between
  force=false and force=true are effectively always force-cancelled.

  So this patch has Job.force_cancel signify whether the job will
  terminate as soon as possible (force_cancel=true) or whether it will
  effectively remain running despite being "cancelled"
  (force_cancel=false).

  To this end, we let jobs that provide JobDriver.cancel() tell the
  generic job code whether they will terminate as soon as possible or
  not, and for jobs that do not provide that method we assume they will.

  Signed-off-by: Hanna Reitz <hreitz@redhat.com>
  Reviewed-by: Eric Blake <eblake@redhat.com>
  Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
  Reviewed-by: Kevin Wolf <kwolf@redhat.com>
  Message-Id: <20211006151940.214590-7-hreitz@redhat.com>
  Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

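  A sketch of that policy, modeled on job_cancel_async() (other details
  of the function elided):

      static void job_cancel_async(Job *job, bool force)
      {
          if (job->driver->cancel) {
              /* The driver reports whether it terminates immediately. */
              force = job->driver->cancel(job, force);
          } else {
              /* No .cancel(): the job only knows immediate termination. */
              force = true;
          }
          if (force) {
              job->force_cancel = true;
          }
          job->cancelled = true;
          /* ... user-pause handling elided ... */
      }
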
* job: @force parameter for job_cancel_sync()
  Hanna Reitz, 2021-10-07 (1 file changed, -3/+15)

  Callers should be able to specify whether they want job_cancel_sync()
  to force-cancel the job or not.

  In fact, almost all invocations do not care about consistency of the
  result and just want the job to terminate as soon as possible, so they
  should pass force=true. The replication block driver is the exception,
  specifically the active commit job it runs.

  As for job_cancel_sync_all(), all callers want it to force-cancel all
  jobs, because that is the point of it: to cancel all remaining jobs as
  quickly as possible (generally on process termination). So make it
  invoke job_cancel_sync() with force=true.

  This changes some iotest outputs, because quitting qemu while a mirror
  job is active will now lead to it being cancelled instead of
  completed, which is what we want. (Cancelling a READY mirror job with
  force=false may take an indefinite amount of time, which we do not
  want when quitting. If users want consistent results, they must have
  all jobs be done before they quit qemu.)

  Buglink: https://gitlab.com/qemu-project/qemu/-/issues/462
  Signed-off-by: Hanna Reitz <hreitz@redhat.com>
  Reviewed-by: Eric Blake <eblake@redhat.com>
  Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
  Message-Id: <20211006151940.214590-6-hreitz@redhat.com>
  Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

* job: Force-cancel jobs in a failed transaction
  Hanna Reitz, 2021-10-07 (1 file changed, -1/+6)

  When a transaction is aborted, no result matters, and so all jobs
  within should be force-cancelled.

  Signed-off-by: Hanna Reitz <hreitz@redhat.com>
  Reviewed-by: Eric Blake <eblake@redhat.com>
  Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
  Message-Id: <20211006151940.214590-5-hreitz@redhat.com>
  Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

* job: Context changes in job_completed_txn_abort()
  Hanna Reitz, 2021-10-07 (1 file changed, -5/+17)

  Finalizing the job may cause its AioContext to change. This is noted
  by job_exit(), which points at job_txn_apply() to take this fact into
  account.

  However, job_completed() does not necessarily invoke job_txn_apply()
  (through job_completed_txn_success()), but potentially also
  job_completed_txn_abort(). The latter stores the context in a local
  variable, and so always acquires the same context at its end that it
  has released in the beginning -- which may be a different context from
  the one that job_exit() releases at its end. If it is different, qemu
  aborts ("qemu_mutex_unlock_impl: Operation not permitted").

  Drop the local @outer_ctx variable from job_completed_txn_abort(), and
  instead re-acquire the actual job's context at the end of the
  function, so job_exit() will release the same.

  Signed-off-by: Hanna Reitz <hreitz@redhat.com>
  Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
  Message-Id: <20211006151940.214590-2-hreitz@redhat.com>
  Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

* progressmeter: protect with a mutex
  Emanuele Giuseppe Esposito, 2021-06-25 (1 file changed, -0/+3)

  Progressmeter is protected by the AioContext mutex, which is taken by
  the block jobs and their callers (like blockdev).

  We would like to remove the dependency of block layer code on the
  AioContext mutex, since most drivers and the core I/O code already do
  not rely on it.

  Create a new C file to implement the ProgressMeter API, but keep the
  struct public, to avoid forcing allocation on the heap.

  Also add a mutex to be able to provide an accurate snapshot of the
  progress values to the caller.

  Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
  Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
  Message-Id: <20210614081130.22134-5-eesposit@redhat.com>
  Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

* mirror: stop cancelling in-flight requests on non-force cancel in READY
  Vladimir Sementsov-Ogievskiy, 2021-05-14 (1 file changed, -1/+1)

  If mirror is READY, then the cancel operation does not discard the
  whole result of the operation; instead, it is a documented way to get
  a point-in-time snapshot of the source disk.

  So, we should not cancel any requests if mirror is READY and
  force=false. Let's fix that case.

  Note that the bug we had before this commit is not critical, as the
  only .bdrv_cancel_in_flight implementation is nbd_cancel_in_flight(),
  and it cancels only requests waiting for reconnection, so it should be
  a rare case.

  Fixes: 521ff8b779b11c394dbdc43f02e158dd99df308a
  Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
  Message-Id: <20210421075858.40197-1-vsementsov@virtuozzo.com>
  Signed-off-by: Max Reitz <mreitz@redhat.com>

* job: Allow complete for jobs on standby
  Max Reitz, 2021-04-09 (1 file changed, -2/+2)

  The only job that implements .complete is the mirror job, and it can
  handle completion requests just fine while the job is paused.

  Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1945635
  Signed-off-by: Max Reitz <mreitz@redhat.com>
  Message-Id: <20210409120422.144040-4-mreitz@redhat.com>
  Signed-off-by: Kevin Wolf <kwolf@redhat.com>

* job: add .cancel handler for the driver
  Vladimir Sementsov-Ogievskiy, 2021-02-12 (1 file changed, -0/+3)

  To be used in mirror in the following commit, to cancel in-flight I/O
  on the target so as not to waste time.

  Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
  Message-Id: <20210205163720.887197-5-vsementsov@virtuozzo.com>
  Reviewed-by: Eric Blake <eblake@redhat.com>
  Signed-off-by: Eric Blake <eblake@redhat.com>

* job: call job_enter from job_pause
  Vladimir Sementsov-Ogievskiy, 2021-01-26 (1 file changed, -0/+3)

  If the main job coroutine called job_yield (while some background
  process is in progress), we should give it a chance to call
  job_pause_point(). It will be used in backup when it is moved onto
  async block-copy.

  Note that job_user_pause is not enough: we want to handle
  child_job_drained_begin() as well, which calls job_pause(). Still, if
  the job is already in job_do_yield() in job_pause_point(), we should
  not enter it.

  iotest 109 output is modified: on stop we do bdrv_drain_all(), which
  now triggers job pause immediately (and pause after ready is standby).

  Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
  Message-Id: <20210116214705.822267-10-vsementsov@virtuozzo.com>
  Reviewed-by: Max Reitz <mreitz@redhat.com>
  Signed-off-by: Max Reitz <mreitz@redhat.com>

* trace: switch position of headers to what Meson requires
  Paolo Bonzini, 2020-08-21 (1 file changed, -1/+1)

  Meson doesn't enjoy the same flexibility we have with Make in choosing
  the include path. In particular the tracing headers are using
  $(build_root)/$(<D).

  In order to keep the include directives unchanged, the simplest
  solution is to generate headers with patterns like
  "trace/trace-audio.h" and place forwarding headers in the source tree
  such that for example "audio/trace.h" includes "trace/trace-audio.h".

  This patch is too ugly to be applied to the Makefiles now. It's only a
  way to separate the changes to the tracing header files from the Meson
  rewrite of the tracing logic.

  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

* job: take each job's lock individually in job_txn_apply
  Stefan Reiter, 2020-04-07 (1 file changed, -10/+40)

  All callers of job_txn_apply hold a single job's lock, but different
  jobs within a transaction can have different contexts, thus we need to
  lock each one individually before applying the callback function.

  Similar to job_completed_txn_abort, this also requires releasing the
  caller's context before and reacquiring it after, to avoid recursive
  locks which might break AIO_WAIT_WHILE in the callback. This is safe,
  since existing code would already have to take this into account, lest
  job_completed_txn_abort might have broken.

  This also brings to light a different issue: when a callback function
  in job_txn_apply moves its job to a different AIO context, callers
  will try to release the wrong lock (now that we re-acquire the lock
  correctly; previously it would just continue with the old lock,
  leaving the job unlocked for the rest of the return path). Fix this by
  not caching the job's context.

  This is only necessary for qmp_block_job_finalize, qmp_job_finalize
  and job_exit, since everyone else calls through job_exit.

  One test needed adapting, since it calls job_finalize directly, so it
  manually needs to acquire the correct context.

  Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
  Message-Id: <20200407115651.69472-2-s.reiter@proxmox.com>
  Signed-off-by: Kevin Wolf <kwolf@redhat.com>

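  A simplified sketch of that locking pattern:

      static void job_txn_apply(Job *job, void fn(Job *))
      {
          AioContext *outer_ctx = job->aio_context;
          Job *other_job;

          /* Release the caller's lock first, so taking each job's own
           * context below cannot recurse and break AIO_WAIT_WHILE()
           * inside fn(). */
          aio_context_release(outer_ctx);

          QLIST_FOREACH(other_job, &job->txn->jobs, txn_list) {
              AioContext *inner_ctx = other_job->aio_context;

              aio_context_acquire(inner_ctx);
              fn(other_job);
              aio_context_release(inner_ctx);
          }

          /* Re-read job->aio_context: fn() may have moved the job. */
          aio_context_acquire(job->aio_context);
      }
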
* job: refactor progress to separate object
  Vladimir Sementsov-Ogievskiy, 2020-03-11 (1 file changed, -3/+3)

  We need it as a separate object, to pass to the block-copy object in
  the next commit.

  Cc: qemu-stable@nongnu.org
  Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
  Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
  Reviewed-by: Max Reitz <mreitz@redhat.com>
  Message-Id: <20200311103004.7649-2-vsementsov@virtuozzo.com>
  Signed-off-by: Max Reitz <mreitz@redhat.com>

* job: drop job_drain
  Vladimir Sementsov-Ogievskiy, 2019-09-10 (1 file changed, -11/+1)

  In job_finish_sync, job_enter should be enough for a job to make some
  progress, and draining is the wrong tool for it. So use job_enter
  directly here, and drop job_drain along with all the related code that
  is no longer used.

  Suggested-by: Kevin Wolf <kwolf@redhat.com>
  Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
  Tested-by: John Snow <jsnow@redhat.com>
  Reviewed-by: John Snow <jsnow@redhat.com>
  Signed-off-by: Kevin Wolf <kwolf@redhat.com>

* Include qemu-common.h exactly where needed
  Markus Armbruster, 2019-06-12 (1 file changed, -1/+0)

  No header includes qemu-common.h after this commit, as prescribed by
  qemu-common.h's file comment.

  Signed-off-by: Markus Armbruster <armbru@redhat.com>
  Message-Id: <20190523143508.25387-5-armbru@redhat.com>
  [Rebased with conflicts resolved automatically, except for
  include/hw/arm/xlnx-zynqmp.h hw/arm/nrf51_soc.c hw/arm/msf2-soc.c
  block/qcow2-refcount.c block/qcow2-cluster.c block/qcow2-cache.c
  target/arm/cpu.h target/lm32/cpu.h target/m68k/cpu.h target/mips/cpu.h
  target/moxie/cpu.h target/nios2/cpu.h target/openrisc/cpu.h
  target/riscv/cpu.h target/tilegx/cpu.h target/tricore/cpu.h
  target/unicore32/cpu.h target/xtensa/cpu.h; bsd-user/main.c and
  net/tap-bsd.c fixed up]

* blockjob: Fix coroutine thread after AioContext change
  Kevin Wolf, 2019-05-10 (1 file changed, -1/+1)

  Commit 463e0be10 ('blockjob: add AioContext attached callback') tried
  to make block jobs robust against AioContext changes of their main
  node, but it never made sure that the job coroutine actually runs in
  the new thread.

  Instead of waking up the job coroutine in whatever thread it ran
  before, let's always pass the AioContext where it should be running
  now.

  Signed-off-by: Kevin Wolf <kwolf@redhat.com>

* job: Fix off-by-one assert checks for JobSTT and JobVerbTable
  Liam Merwick, 2018-11-12 (1 file changed, -2/+2)

  In the assert checking the array dereference of JobVerbTable[verb] in
  job_apply_verb(), the check of the index, verb, allows an overrun
  because an index equal to the array size is permitted. Similarly, in
  the assert check of JobSTT[s0][s1] with index s1 in
  job_state_transition(), an off-by-one overrun is not flagged either.

  This is not a run-time issue, as there are no callers actually passing
  in the max value.

  Signed-off-by: Liam Merwick <Liam.Merwick@oracle.com>
  Reviewed-by: Darren Kenny <Darren.Kenny@oracle.com>
  Reviewed-by: Mark Kanda <Mark.Kanda@oracle.com>
  Reviewed-by: Eric Blake <eblake@redhat.com>
  Reviewed-by: John Snow <jsnow@redhat.com>
  Message-id: 1541453919-25973-2-git-send-email-Liam.Merwick@oracle.com
  Signed-off-by: Max Reitz <mreitz@redhat.com>

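  A sketch of the corrected bounds, assuming the checks looked like the
  usual QEMU enum asserts:

      /* was: assert(verb >= 0 && verb <= JOB_VERB__MAX); */
      assert(verb >= 0 && verb < JOB_VERB__MAX);

      /* was: assert(s1 >= 0 && s1 <= JOB_STATUS__MAX); */
      assert(s1 >= 0 && s1 < JOB_STATUS__MAX);
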
* block: Use a single global AioWait
  Kevin Wolf, 2018-09-25 (1 file changed, -2/+1)

  When draining a block node, we recurse to its parent and for subtree
  drains also to its children. A single AIO_WAIT_WHILE() is then used to
  wait for bdrv_drain_poll() to become true, which depends on all of the
  nodes we recursed to. However, if the respective child or parent
  becomes quiescent and calls bdrv_wakeup(), only the AioWait of the
  child/parent is checked, while AIO_WAIT_WHILE() depends on the AioWait
  of the original node.

  Fix this by using a single AioWait for all callers of
  AIO_WAIT_WHILE().

  This may mean that the draining thread gets a few more unnecessary
  wakeups because an unrelated operation got completed, but we already
  wake it up when something _could_ have changed rather than only if it
  has certainly changed. Apart from that, drain is a slow path anyway.
  In theory it would be possible to use wakeups more selectively and
  still correctly, but the gains are likely not worth the additional
  complexity. In fact, this patch is a nice simplification for some
  places in the code.

  Signed-off-by: Kevin Wolf <kwolf@redhat.com>
  Reviewed-by: Eric Blake <eblake@redhat.com>
  Reviewed-by: Max Reitz <mreitz@redhat.com>

* job: Avoid deadlocks in job_completed_txn_abort()
  Kevin Wolf, 2018-09-25 (1 file changed, -5/+11)

  Amongst others, job_finalize_single() calls the
  .prepare/.commit/.abort callbacks of the individual job driver.
  Recently, their use was adapted for all block jobs so that they
  involve code calling AIO_WAIT_WHILE() now. Such code must be called
  under the AioContext lock for the respective job, but without holding
  any other AioContext lock.

  Signed-off-by: Kevin Wolf <kwolf@redhat.com>
  Reviewed-by: Max Reitz <mreitz@redhat.com>

* blockjob: Lie better in child_job_drained_poll()
  Kevin Wolf, 2018-09-25 (1 file changed, -1/+10)

  Block jobs claim in .drained_poll() that they are in a quiescent state
  as soon as job->deferred_to_main_loop is true. This is obviously
  wrong; they still have a completion BH to run. We only get away with
  this because commit 91af091f923 added an unconditional aio_poll(false)
  to the drain functions, but this is bypassing the regular drain
  mechanisms.

  However, just removing this and telling that the job is still active
  doesn't work either: the completion callbacks themselves call drain
  functions (directly, or indirectly with bdrv_reopen), so they would
  deadlock then.

  As a better lie, tell that the job is active as long as the BH is
  pending, but falsely call it quiescent from the point in the BH when
  the completion callback is called. At this point, nested drain calls
  won't deadlock because they ignore the job, and outer drains will wait
  for the job to really reach a quiescent state because the callback is
  already running.

  Signed-off-by: Kevin Wolf <kwolf@redhat.com>
  Reviewed-by: Max Reitz <mreitz@redhat.com>

* job: Use AIO_WAIT_WHILE() in job_finish_sync()
  Kevin Wolf, 2018-09-25 (1 file changed, -8/+6)

  job_finish_sync() needs to release the AioContext lock of the job
  before calling aio_poll(). Otherwise, callbacks called by aio_poll()
  would possibly take the lock a second time and run into a deadlock
  with a nested AIO_WAIT_WHILE() call.

  Also, job_drain() without aio_poll() isn't necessarily enough to make
  progress on a job; it could depend on bottom halves to be executed.

  Combine both open-coded while loops into a single AIO_WAIT_WHILE()
  call that solves both of these problems.

  Signed-off-by: Kevin Wolf <kwolf@redhat.com>
  Reviewed-by: Fam Zheng <famz@redhat.com>
  Reviewed-by: Max Reitz <mreitz@redhat.com>

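  The combined wait boils down to roughly this (a sketch;
  AIO_WAIT_WHILE also drops and retakes the context lock around
  blocking polls, which is what fixes the deadlock):

      AIO_WAIT_WHILE(job->aio_context,
                     (job_enter(job), !job_is_completed(job)));
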
* blockjob: Wake up BDS when job becomes idle
  Kevin Wolf, 2018-09-25 (1 file changed, -0/+7)

  In the context of draining a BDS, the .drained_poll callback of block
  jobs is called. If this returns true (i.e. there is still some
  activity pending), the drain operation may call aio_poll() with
  blocking=true to wait for completion.

  As soon as the pending activity is completed and the job finally
  arrives in a quiescent state (i.e. its coroutine either yields with
  busy=false or terminates), the block job must notify the aio_poll()
  loop to wake up, otherwise we get a deadlock if both are running in
  different threads.

  Signed-off-by: Kevin Wolf <kwolf@redhat.com>
  Reviewed-by: Fam Zheng <famz@redhat.com>
  Reviewed-by: Max Reitz <mreitz@redhat.com>

* job: Fix missing locking due to mismerge
  Kevin Wolf, 2018-09-25 (1 file changed, -0/+4)

  job_completed() had a problem with double locking that was recently
  fixed independently by two different commits:

  "job: Fix nested aio_poll() hanging in job_txn_apply"
  "jobs: add exit shim"

  One fix removed the first aio_context_acquire(), the other fix removed
  the other one. Now we have a bug again and the code is run without any
  locking.

  Add it back in one of the places.

  Signed-off-by: Kevin Wolf <kwolf@redhat.com>
  Reviewed-by: Max Reitz <mreitz@redhat.com>
  Reviewed-by: John Snow <jsnow@redhat.com>

* job: Fix nested aio_poll() hanging in job_txn_apply
  Fam Zheng, 2018-09-25 (1 file changed, -13/+5)

  All callers have acquired ctx already. Doing that again results in
  aio_poll() hang. This fixes the problem that a BDRV_POLL_WHILE() in
  the callback cannot make progress because ctx is recursively locked,
  for example, when drive-backup finishes.

  There are two callers of job_finalize():

      fam@lemon:~/work/qemu [master]$ git grep -w -A1 '^\s*job_finalize'
      blockdev.c:    job_finalize(&job->job, errp);
      blockdev.c-    aio_context_release(aio_context);
      --
      job-qmp.c:     job_finalize(job, errp);
      job-qmp.c-     aio_context_release(aio_context);
      --
      tests/test-blockjob.c:    job_finalize(&job->job, &error_abort);
      tests/test-blockjob.c-    assert(job->job.status == JOB_STATUS_CONCLUDED);

  Ignoring the test, it's easy to see both callers to job_finalize (and
  job_do_finalize) have acquired the context.

  Cc: qemu-stable@nongnu.org
  Reported-by: Gu Nini <ngu@redhat.com>
  Reviewed-by: Eric Blake <eblake@redhat.com>
  Signed-off-by: Fam Zheng <famz@redhat.com>
  Signed-off-by: Kevin Wolf <kwolf@redhat.com>

* jobs: remove .exit callback
  John Snow, 2018-09-25 (1 file changed, -43/+34)

  Now that all of the jobs use the component finalization callbacks,
  there's no use for the heavy-hammer .exit callback anymore. job_exit
  becomes a glorified type shim so that we can call job_completed from
  aio_bh_schedule_oneshot.

  Move these three functions down into job.c to eliminate a forward
  reference.

  Signed-off-by: John Snow <jsnow@redhat.com>
  Reviewed-by: Max Reitz <mreitz@redhat.com>
  Message-id: 20180906130225.5118-12-jsnow@redhat.com
  Reviewed-by: Jeff Cody <jcody@redhat.com>
  Signed-off-by: Max Reitz <mreitz@redhat.com>

* Merge remote-tracking branch 'remotes/xanclic/tags/pull-block-2018-08-31-v2' into staging
  Peter Maydell, 2018-09-24 (1 file changed, -48/+25)

  Block patches:
  - (Block) job exit refactoring, part 1 (removing
    job_defer_to_main_loop())
  - test-bdrv-drain leak fix

  # gpg: Signature made Fri 31 Aug 2018 15:30:33 BST
  # gpg: using RSA key F407DB0061D5CF40
  # gpg: Good signature from "Max Reitz <mreitz@redhat.com>"
  # Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40

  * remotes/xanclic/tags/pull-block-2018-08-31-v2:
    jobs: remove job_defer_to_main_loop
    jobs: remove ret argument to job_completed; privatize it
    block/backup: make function variables consistently named
    jobs: utilize job_exit shim
    block/mirror: utilize job_exit shim
    block/commit: utilize job_exit shim
    jobs: add exit shim
    jobs: canonize Error object
    jobs: change start callback to run callback
    tests: fix bdrv-drain leak

  Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

* jobs: remove job_defer_to_main_loop
  John Snow, 2018-08-31 (1 file changed, -38/+2)

  Now that the job infrastructure is handling the job_completed call for
  all implemented jobs, we can remove the interface that allowed jobs to
  schedule their own completion.

  Signed-off-by: John Snow <jsnow@redhat.com>
  Reviewed-by: Max Reitz <mreitz@redhat.com>
  Message-id: 20180830015734.19765-10-jsnow@redhat.com
  Signed-off-by: Max Reitz <mreitz@redhat.com>

* jobs: remove ret argument to job_completed; privatize it
  John Snow, 2018-08-31 (1 file changed, -5/+6)

  Jobs are now expected to return their retcode on the stack, from the
  .run callback, so we can remove that argument.

  job_cancel does not need to set -ECANCELED because job_completed will
  update the return code itself if the job was canceled.

  While we're here, make job_completed static to job.c and remove it
  from job.h; move the documentation of return code to the .run()
  callback and to the job->ret property, accordingly.

  Signed-off-by: John Snow <jsnow@redhat.com>
  Message-id: 20180830015734.19765-9-jsnow@redhat.com
  Reviewed-by: Max Reitz <mreitz@redhat.com>
  Signed-off-by: Max Reitz <mreitz@redhat.com>

* jobs: add exit shim
  John Snow, 2018-08-31 (1 file changed, -0/+18)

  All jobs do the same thing when they leave their running loop:
  - store the return code in a structure
  - wait to receive this structure in the main thread
  - signal job completion via job_completed

  Few jobs do anything beyond exactly this. Consolidate this exit logic
  for a net reduction in SLOC.

  More seriously, when we utilize job_defer_to_main_loop_bh to call a
  function that calls job_completed, job_finalize_single will run in a
  context where it has recursively taken the aio_context lock, which can
  cause hangs if it puts down a reference that causes a flush. You can
  observe this in practice by looking at mirror_exit's careful placement
  of job_completed and bdrv_unref calls.

  If we centralize job exiting, we can signal job completion from
  outside of the aio_context, which should allow for job cleanup code to
  run with only one lock, which makes cleanup callbacks less tricky to
  write.

  Signed-off-by: John Snow <jsnow@redhat.com>
  Reviewed-by: Max Reitz <mreitz@redhat.com>
  Message-id: 20180830015734.19765-4-jsnow@redhat.com
  Reviewed-by: Jeff Cody <jcody@redhat.com>
  Signed-off-by: Max Reitz <mreitz@redhat.com>

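  A sketch of the shim (locking details and the exact job_completed
  signature at this point in the series elided):

      static void job_exit(void *opaque)
      {
          Job *job = (Job *)opaque;

          /* Runs in the main loop; completes on the job's behalf. */
          job_completed(job, job->ret);
      }

      /* In the coroutine wrapper, once .run() has returned:
       *     aio_bh_schedule_oneshot(qemu_get_aio_context(),
       *                             job_exit, job);
       */
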
* jobs: canonize Error object
  John Snow, 2018-08-31 (1 file changed, -12/+6)

  Jobs presently use both an Error object in the case of the create job,
  and char strings in the case of generic errors elsewhere.

  Unify the two paths as just j->err, and remove the extra argument from
  job_completed. The integer error code for job_completed is kept for
  now, to be removed shortly in a separate patch.

  Signed-off-by: John Snow <jsnow@redhat.com>
  Message-id: 20180830015734.19765-3-jsnow@redhat.com
  [mreitz: Dropped a superfluous g_strdup()]
  Reviewed-by: Eric Blake <eblake@redhat.com>
  Signed-off-by: Max Reitz <mreitz@redhat.com>

* jobs: change start callback to run callback
  John Snow, 2018-08-31 (1 file changed, -3/+3)

  Presently we codify the entry point for a job as the "start" callback,
  but a more apt name would be "run" to clarify the idea that when this
  function returns we consider the job to have "finished", except for
  any cleanup which occurs in separate callbacks later.

  As part of this clarification, change the signature to include an
  error object and a return code. The error ptr is not yet used, and the
  return code, while captured, will be overwritten by actions in the
  job_completed function.

  Signed-off-by: John Snow <jsnow@redhat.com>
  Reviewed-by: Max Reitz <mreitz@redhat.com>
  Message-id: 20180830015734.19765-2-jsnow@redhat.com
  Reviewed-by: Jeff Cody <jcody@redhat.com>
  Signed-off-by: Max Reitz <mreitz@redhat.com>

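  A sketch of the signature change in JobDriver:

      struct JobDriver {
          /* old entry point:
           *     void coroutine_fn (*start)(Job *job);
           * new entry point: */
          int coroutine_fn (*run)(Job *job, Error **errp);

          /* ... other callbacks unchanged ... */
      };
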
* qapi: Drop qapi_event_send_FOO()'s Error ** argument
  Peter Xu, 2018-08-28 (1 file changed, -1/+1)

  The generated qapi_event_send_FOO() take an Error ** argument. They
  can't actually fail, because all they do with the argument is passing
  it to functions that can't fail: the QObject output visitor, and the
  @qmp_emit callback, which is either monitor_qapi_event_queue() or
  event_test_emit().

  Drop the argument, and pass &error_abort to the QObject output visitor
  and @qmp_emit instead.

  Suggested-by: Eric Blake <eblake@redhat.com>
  Suggested-by: Markus Armbruster <armbru@redhat.com>
  Signed-off-by: Peter Xu <peterx@redhat.com>
  Message-Id: <20180815133747.25032-4-peterx@redhat.com>
  Reviewed-by: Markus Armbruster <armbru@redhat.com>
  [Commit message rewritten, update to qapi-code-gen.txt corrected]
  Signed-off-by: Markus Armbruster <armbru@redhat.com>

* block: for jobs, do not clear user_paused until after the resume
  Jeff Cody, 2018-08-21 (1 file changed, -1/+1)

  The function job_cancel_async() will always cause an assert for
  blockjob user resume. We set job->user_paused to false, and then call
  job->driver->user_resume(). In the case of blockjobs, this is the
  block_job_user_resume() function.

  In that function, we assert that job.user_paused is set to true.
  Unfortunately, right before calling this function, it has explicitly
  been set to false.

  The fix is pretty simple: set job->user_paused to false only after the
  job user_resume() function has been called.

  Reviewed-by: John Snow <jsnow@redhat.com>
  Reviewed-by: Eric Blake <eblake@redhat.com>
  Signed-off-by: Jeff Cody <jcody@redhat.com>
  Message-id: bb183b77d8f2dd6bd67b8da559a90ac1e74b2052.1534868459.git.jcody@redhat.com
  Signed-off-by: Jeff Cody <jcody@redhat.com>

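  A sketch of the reordering described in the message:

      if (job->user_paused) {
          /* Do not clear user_paused first: the blockjob's
           * user_resume() callback asserts that it is still true. */
          if (job->driver->user_resume) {
              job->driver->user_resume(job);
          }
          job->user_paused = false;
      }
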
* job: Add job_progress_increase_remaining()
  Max Reitz, 2018-06-18 (1 file changed, -0/+5)

  Signed-off-by: Max Reitz <mreitz@redhat.com>
  Message-id: 20180613181823.13618-12-mreitz@redhat.com
  Reviewed-by: Kevin Wolf <kwolf@redhat.com>
  Signed-off-by: Max Reitz <mreitz@redhat.com>

* job: Add error message for failing jobs
  Kevin Wolf, 2018-05-30 (1 file changed, -2/+14)

  So far we relied on job->ret and strerror() to produce an error
  message for failed jobs. Not surprisingly, this tends to result in
  completely useless messages.

  This adds a Job.error field that can contain an error string for a
  failing job, and a parameter to job_completed() that sets the field.
  As a default, if NULL is passed, we continue to use
  strerror(job->ret). All existing callers are changed to pass NULL.
  They can be improved in separate patches.

  Signed-off-by: Kevin Wolf <kwolf@redhat.com>
  Reviewed-by: Max Reitz <mreitz@redhat.com>
  Reviewed-by: Jeff Cody <jcody@redhat.com>

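  A sketch of the fallback described above, assuming the field is a
  heap-allocated string:

      /* When no explicit message was supplied with job_completed(): */
      if (!job->error) {
          /* job->ret holds a negative errno at this point */
          job->error = g_strdup(strerror(-job->ret));
      }
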
* job: Add query-jobs QMP command
  Kevin Wolf, 2018-05-23 (1 file changed, -1/+1)

  This adds a minimal query-jobs implementation that shouldn't pose many
  design questions. It can later be extended to expose more information,
  and especially job-specific information.

  Signed-off-by: Kevin Wolf <kwolf@redhat.com>