diff options
Diffstat (limited to 'docs/devel')
-rw-r--r-- | docs/devel/ci-jobs.rst.inc | 116 | ||||
-rw-r--r-- | docs/devel/ci.rst | 11 | ||||
-rw-r--r-- | docs/devel/index-tcg.rst | 1 | ||||
-rw-r--r-- | docs/devel/replay.rst | 306 | ||||
-rw-r--r-- | docs/devel/replay.txt | 46 | ||||
-rw-r--r-- | docs/devel/submitting-a-patch.rst | 36 | ||||
-rw-r--r-- | docs/devel/testing.rst | 2 |
7 files changed, 449 insertions, 69 deletions
diff --git a/docs/devel/ci-jobs.rst.inc b/docs/devel/ci-jobs.rst.inc index 92e25872aa..1f28fec0d0 100644 --- a/docs/devel/ci-jobs.rst.inc +++ b/docs/devel/ci-jobs.rst.inc @@ -1,3 +1,5 @@ +.. _ci_var: + Custom CI/CD variables ====================== @@ -28,7 +30,113 @@ For further information about how to set these variables, please refer to:: https://docs.gitlab.com/ee/user/project/push_options.html#push-options-for-gitlab-cicd -Here is a list of the most used variables: +Setting aliases in your git config +---------------------------------- + +You can use aliases to make it easier to push branches with different +CI configurations. For example define an alias for triggering CI: + +.. code:: + + git config --local alias.push-ci "push -o ci.variable=QEMU_CI=1" + git config --local alias.push-ci-now "push -o ci.variable=QEMU_CI=2" + +Which lets you run: + +.. code:: + + git push-ci + +to create the pipeline, or: + +.. code:: + + git push-ci-now + +to create and run the pipeline + + +Variable naming and grouping +---------------------------- + +The variables used by QEMU's CI configuration are grouped together +in a handful of namespaces + + * QEMU_JOB_nnnn - variables to be defined in individual jobs + or templates, to influence the shared rules defined in the + .base_job_template. + + * QEMU_CI_nnn - variables to be set by contributors in their + repository CI settings, or as git push variables, to influence + which jobs get run in a pipeline + + * nnn - other misc variables not falling into the above + categories, or using different names for historical reasons + and not yet converted. + +Maintainer controlled job variables +----------------------------------- + +The following variables may be set when defining a job in the +CI configuration file. + +QEMU_JOB_CIRRUS +~~~~~~~~~~~~~~~ + +The job makes use of Cirrus CI infrastructure, requiring the +configuration setup for cirrus-run to be present in the repository + +QEMU_JOB_OPTIONAL +~~~~~~~~~~~~~~~~~ + +The job is expected to be successful in general, but is not run +by default due to need to conserve limited CI resources. It is +available to be started manually by the contributor in the CI +pipelines UI. + +QEMU_JOB_ONLY_FORKS +~~~~~~~~~~~~~~~~~~~ + +The job results are only of interest to contributors prior to +submitting code. They are not required as part of the gating +CI pipeline. + +QEMU_JOB_SKIPPED +~~~~~~~~~~~~~~~~ + +The job is not reliably successsful in general, so is not +currently suitable to be run by default. Ideally this should +be a temporary marker until the problems can be addressed, or +the job permanently removed. + +QEMU_JOB_PUBLISH +~~~~~~~~~~~~~~~~ + +The job is for publishing content after a branch has been +merged into the upstream default branch. + +QEMU_JOB_AVOCADO +~~~~~~~~~~~~~~~~ + +The job runs the Avocado integration test suite + +Contributor controlled runtime variables +---------------------------------------- + +The following variables may be set by contributors to control +job execution + +QEMU_CI +~~~~~~~ + +By default, no pipelines will be created on contributor forks +in order to preserve CI credits + +Set this variable to 1 to create the pipelines, but leave all +the jobs to be manually started from the UI + +Set this variable to 2 to create the pipelines and run all +the jobs immediately, as was historicaly behaviour QEMU_CI_AVOCADO_TESTING ~~~~~~~~~~~~~~~~~~~~~~~ @@ -38,6 +146,12 @@ these artifacts are not already cached, downloading them make the jobs reach the timeout limit). Set this variable to have the tests using the Avocado framework run automatically. +Other misc variables +-------------------- + +These variables are primarily to control execution of jobs on +private runners + AARCH64_RUNNER_AVAILABLE ~~~~~~~~~~~~~~~~~~~~~~~~ If you've got access to an aarch64 host that can be used as a gitlab-CI diff --git a/docs/devel/ci.rst b/docs/devel/ci.rst index d106610096..ed88a2010b 100644 --- a/docs/devel/ci.rst +++ b/docs/devel/ci.rst @@ -1,12 +1,13 @@ +.. _ci: + == CI == -QEMU has configurations enabled for a number of different CI services. -The most up to date information about them and their status can be -found at:: - - https://wiki.qemu.org/Testing/CI +Most of QEMU's CI is run on GitLab's infrastructure although a number +of other CI services are used for specialised purposes. The most up to +date information about them and their status can be found on the +`project wiki testing page <https://wiki.qemu.org/Testing/CI>`_. .. include:: ci-definitions.rst.inc .. include:: ci-jobs.rst.inc diff --git a/docs/devel/index-tcg.rst b/docs/devel/index-tcg.rst index 0b0ad12c22..7b9760b26f 100644 --- a/docs/devel/index-tcg.rst +++ b/docs/devel/index-tcg.rst @@ -13,3 +13,4 @@ are only implementing things for HW accelerated hypervisors. multi-thread-tcg tcg-icount tcg-plugins + replay diff --git a/docs/devel/replay.rst b/docs/devel/replay.rst new file mode 100644 index 0000000000..0244be8b9c --- /dev/null +++ b/docs/devel/replay.rst @@ -0,0 +1,306 @@ +.. + Copyright (c) 2022, ISP RAS + Written by Pavel Dovgalyuk and Alex Bennée + +======================= +Execution Record/Replay +======================= + +Core concepts +============= + +Record/replay functions are used for the deterministic replay of qemu +execution. Execution recording writes a non-deterministic events log, which +can be later used for replaying the execution anywhere and for unlimited +number of times. Execution replaying reads the log and replays all +non-deterministic events including external input, hardware clocks, +and interrupts. + +Several parts of QEMU include function calls to make event log recording +and replaying. +Devices' models that have non-deterministic input from external devices were +changed to write every external event into the execution log immediately. +E.g. network packets are written into the log when they arrive into the virtual +network adapter. + +All non-deterministic events are coming from these devices. But to +replay them we need to know at which moments they occur. We specify +these moments by counting the number of instructions executed between +every pair of consecutive events. + +Academic papers with description of deterministic replay implementation: + +* `Deterministic Replay of System's Execution with Multi-target QEMU Simulator for Dynamic Analysis and Reverse Debugging <https://www.computer.org/csdl/proceedings/csmr/2012/4666/00/4666a553-abs.html>`_ +* `Don't panic: reverse debugging of kernel drivers <https://dl.acm.org/citation.cfm?id=2786805.2803179>`_ + +Modifications of qemu include: + + * wrappers for clock and time functions to save their return values in the log + * saving different asynchronous events (e.g. system shutdown) into the log + * synchronization of the bottom halves execution + * synchronization of the threads from thread pool + * recording/replaying user input (mouse, keyboard, and microphone) + * adding internal checkpoints for cpu and io synchronization + * network filter for recording and replaying the packets + * block driver for making block layer deterministic + * serial port input record and replay + * recording of random numbers obtained from the external sources + +Instruction counting +-------------------- + +QEMU should work in icount mode to use record/replay feature. icount was +designed to allow deterministic execution in absence of external inputs +of the virtual machine. We also use icount to control the occurrence of the +non-deterministic events. The number of instructions elapsed from the last event +is written to the log while recording the execution. In replay mode we +can predict when to inject that event using the instruction counter. + +Locking and thread synchronisation +---------------------------------- + +Previously the synchronisation of the main thread and the vCPU thread +was ensured by the holding of the BQL. However the trend has been to +reduce the time the BQL was held across the system including under TCG +system emulation. As it is important that batches of events are kept +in sequence (e.g. expiring timers and checkpoints in the main thread +while instruction checkpoints are written by the vCPU thread) we need +another lock to keep things in lock-step. This role is now handled by +the replay_mutex_lock. It used to be held only for each event being +written but now it is held for a whole execution period. This results +in a deterministic ping-pong between the two main threads. + +As the BQL is now a finer grained lock than the replay_lock it is almost +certainly a bug, and a source of deadlocks, to take the +replay_mutex_lock while the BQL is held. This is enforced by an assert. +While the unlocks are usually in the reverse order, this is not +necessary; you can drop the replay_lock while holding the BQL, without +doing a more complicated unlock_iothread/replay_unlock/lock_iothread +sequence. + +Checkpoints +----------- + +Replaying the execution of virtual machine is bound by sources of +non-determinism. These are inputs from clock and peripheral devices, +and QEMU thread scheduling. Thread scheduling affect on processing events +from timers, asynchronous input-output, and bottom halves. + +Invocations of timers are coupled with clock reads and changing the state +of the virtual machine. Reads produce non-deterministic data taken from +host clock. And VM state changes should preserve their order. Their relative +order in replay mode must replicate the order of callbacks in record mode. +To preserve this order we use checkpoints. When a specific clock is processed +in record mode we save to the log special "checkpoint" event. +Checkpoints here do not refer to virtual machine snapshots. They are just +record/replay events used for synchronization. + +QEMU in replay mode will try to invoke timers processing in random moment +of time. That's why we do not process a group of timers until the checkpoint +event will be read from the log. Such an event allows synchronizing CPU +execution and timer events. + +Two other checkpoints govern the "warping" of the virtual clock. +While the virtual machine is idle, the virtual clock increments at +1 ns per *real time* nanosecond. This is done by setting up a timer +(called the warp timer) on the virtual real time clock, so that the +timer fires at the next deadline of the virtual clock; the virtual clock +is then incremented (which is called "warping" the virtual clock) as +soon as the timer fires or the CPUs need to go out of the idle state. +Two functions are used for this purpose; because these actions change +virtual machine state and must be deterministic, each of them creates a +checkpoint. ``icount_start_warp_timer`` checks if the CPUs are idle and if so +starts accounting real time to virtual clock. ``icount_account_warp_timer`` +is called when the CPUs get an interrupt or when the warp timer fires, +and it warps the virtual clock by the amount of real time that has passed +since ``icount_start_warp_timer``. + +Virtual devices +=============== + +Record/replay mechanism, that could be enabled through icount mode, expects +the virtual devices to satisfy the following requirement: +everything that affects +the guest state during execution in icount mode should be deterministic. + +Timers +------ + +Timers are used to execute callbacks from different subsystems of QEMU +at the specified moments of time. There are several kinds of timers: + + * Real time clock. Based on host time and used only for callbacks that + do not change the virtual machine state. For this reason real time + clock and timers does not affect deterministic replay at all. + * Virtual clock. These timers run only during the emulation. In icount + mode virtual clock value is calculated using executed instructions counter. + That is why it is completely deterministic and does not have to be recorded. + * Host clock. This clock is used by device models that simulate real time + sources (e.g. real time clock chip). Host clock is the one of the sources + of non-determinism. Host clock read operations should be logged to + make the execution deterministic. + * Virtual real time clock. This clock is similar to real time clock but + it is used only for increasing virtual clock while virtual machine is + sleeping. Due to its nature it is also non-deterministic as the host clock + and has to be logged too. + +All virtual devices should use virtual clock for timers that change the guest +state. Virtual clock is deterministic, therefore such timers are deterministic +too. + +Virtual devices can also use realtime clock for the events that do not change +the guest state directly. When the clock ticking should depend on VM execution +speed, use virtual clock with EXTERNAL attribute. It is not deterministic, +but its speed depends on the guest execution. This clock is used by +the virtual devices (e.g., slirp routing device) that lie outside the +replayed guest. + +Block devices +------------- + +Block devices record/replay module (``blkreplay``) intercepts calls of +bdrv coroutine functions at the top of block drivers stack. + +All block completion operations are added to the queue in the coroutines. +When the queue is flushed the information about processed requests +is recorded to the log. In replay phase the queue is matched with +events read from the log. Therefore block devices requests are processed +deterministically. + +Bottom halves +------------- + +Bottom half callbacks, that affect the guest state, should be invoked through +``replay_bh_schedule_event`` or ``replay_bh_schedule_oneshot_event`` functions. +Their invocations are saved in record mode and synchronized with the existing +log in replay mode. + +Disk I/O events are completely deterministic in our model, because +in both record and replay modes we start virtual machine from the same +disk state. But callbacks that virtual disk controller uses for reading and +writing the disk may occur at different moments of time in record and replay +modes. + +Reading and writing requests are created by CPU thread of QEMU. Later these +requests proceed to block layer which creates "bottom halves". Bottom +halves consist of callback and its parameters. They are processed when +main loop locks the global mutex. These locks are not synchronized with +replaying process because main loop also processes the events that do not +affect the virtual machine state (like user interaction with monitor). + +That is why we had to implement saving and replaying bottom halves callbacks +synchronously to the CPU execution. When the callback is about to execute +it is added to the queue in the replay module. This queue is written to the +log when its callbacks are executed. In replay mode callbacks are not processed +until the corresponding event is read from the events log file. + +Sometimes the block layer uses asynchronous callbacks for its internal purposes +(like reading or writing VM snapshots or disk image cluster tables). In this +case bottom halves are not marked as "replayable" and do not saved +into the log. + +Saving/restoring the VM state +----------------------------- + +All fields in the device state structure (including virtual timers) +should be restored by loadvm to the same values they had before savevm. + +Avoid accessing other devices' state, because the order of saving/restoring +is not defined. It means that you should not call functions like +``update_irq`` in ``post_load`` callback. Save everything explicitly to avoid +the dependencies that may make restoring the VM state non-deterministic. + +Stopping the VM +--------------- + +Stopping the guest should not interfere with its state (with the exception +of the network connections, that could be broken by the remote timeouts). +VM can be stopped at any moment of replay by the user. Restarting the VM +after that stop should not break the replay by the unneeded guest state change. + +Replay log format +================= + +Record/replay log consists of the header and the sequence of execution +events. The header includes 4-byte replay version id and 8-byte reserved +field. Version is updated every time replay log format changes to prevent +using replay log created by another build of qemu. + +The sequence of the events describes virtual machine state changes. +It includes all non-deterministic inputs of VM, synchronization marks and +instruction counts used to correctly inject inputs at replay. + +Synchronization marks (checkpoints) are used for synchronizing qemu threads +that perform operations with virtual hardware. These operations may change +system's state (e.g., change some register or generate interrupt) and +therefore should execute synchronously with CPU thread. + +Every event in the log includes 1-byte event id and optional arguments. +When argument is an array, it is stored as 4-byte array length +and corresponding number of bytes with data. +Here is the list of events that are written into the log: + + - EVENT_INSTRUCTION. Instructions executed since last event. Followed by: + + - 4-byte number of executed instructions. + + - EVENT_INTERRUPT. Used to synchronize interrupt processing. + - EVENT_EXCEPTION. Used to synchronize exception handling. + - EVENT_ASYNC. This is a group of events. When such an event is generated, + it is stored in the queue and processed in icount_account_warp_timer(). + Every such event has it's own id from the following list: + + - REPLAY_ASYNC_EVENT_BH. Bottom-half callback. This event synchronizes + callbacks that affect virtual machine state, but normally called + asynchronously. Followed by: + + - 8-byte operation id. + + - REPLAY_ASYNC_EVENT_INPUT. Input device event. Contains + parameters of keyboard and mouse input operations + (key press/release, mouse pointer movement). Followed by: + + - 9-16 bytes depending of input event. + + - REPLAY_ASYNC_EVENT_INPUT_SYNC. Internal input synchronization event. + - REPLAY_ASYNC_EVENT_CHAR_READ. Character (e.g., serial port) device input + initiated by the sender. Followed by: + + - 1-byte character device id. + - Array with bytes were read. + + - REPLAY_ASYNC_EVENT_BLOCK. Block device operation. Used to synchronize + operations with disk and flash drives with CPU. Followed by: + + - 8-byte operation id. + + - REPLAY_ASYNC_EVENT_NET. Incoming network packet. Followed by: + + - 1-byte network adapter id. + - 4-byte packet flags. + - Array with packet bytes. + + - EVENT_SHUTDOWN. Occurs when user sends shutdown event to qemu, + e.g., by closing the window. + - EVENT_CHAR_WRITE. Used to synchronize character output operations. Followed by: + + - 4-byte output function return value. + - 4-byte offset in the output array. + + - EVENT_CHAR_READ_ALL. Used to synchronize character input operations, + initiated by qemu. Followed by: + + - Array with bytes that were read. + + - EVENT_CHAR_READ_ALL_ERROR. Unsuccessful character input operation, + initiated by qemu. Followed by: + + - 4-byte error code. + + - EVENT_CLOCK + clock_id. Group of events for host clock read operations. Followed by: + + - 8-byte clock value. + + - EVENT_CHECKPOINT + checkpoint_id. Checkpoint for synchronization of + CPU, internal threads, and asynchronous input events. + - EVENT_END. Last event in the log. diff --git a/docs/devel/replay.txt b/docs/devel/replay.txt deleted file mode 100644 index e641c35add..0000000000 --- a/docs/devel/replay.txt +++ /dev/null @@ -1,46 +0,0 @@ -Record/replay mechanism, that could be enabled through icount mode, expects -the virtual devices to satisfy the following requirements. - -The main idea behind this document is that everything that affects -the guest state during execution in icount mode should be deterministic. - -Timers -====== - -All virtual devices should use virtual clock for timers that change the guest -state. Virtual clock is deterministic, therefore such timers are deterministic -too. - -Virtual devices can also use realtime clock for the events that do not change -the guest state directly. When the clock ticking should depend on VM execution -speed, use virtual clock with EXTERNAL attribute. It is not deterministic, -but its speed depends on the guest execution. This clock is used by -the virtual devices (e.g., slirp routing device) that lie outside the -replayed guest. - -Bottom halves -============= - -Bottom half callbacks, that affect the guest state, should be invoked through -replay_bh_schedule_event or replay_bh_schedule_oneshot_event functions. -Their invocations are saved in record mode and synchronized with the existing -log in replay mode. - -Saving/restoring the VM state -============================= - -All fields in the device state structure (including virtual timers) -should be restored by loadvm to the same values they had before savevm. - -Avoid accessing other devices' state, because the order of saving/restoring -is not defined. It means that you should not call functions like -'update_irq' in post_load callback. Save everything explicitly to avoid -the dependencies that may make restoring the VM state non-deterministic. - -Stopping the VM -=============== - -Stopping the guest should not interfere with its state (with the exception -of the network connections, that could be broken by the remote timeouts). -VM can be stopped at any moment of replay by the user. Restarting the VM -after that stop should not break the replay by the unneeded guest state change. diff --git a/docs/devel/submitting-a-patch.rst b/docs/devel/submitting-a-patch.rst index e51259eb9c..d3876ec1b7 100644 --- a/docs/devel/submitting-a-patch.rst +++ b/docs/devel/submitting-a-patch.rst @@ -204,23 +204,25 @@ log`` for these keywords for example usage. Test your patches ~~~~~~~~~~~~~~~~~ -Although QEMU has `continuous integration -services <Testing#Continuous_Integration>`__ that attempt to test -patches submitted to the list, it still saves everyone time if you have -already tested that your patch compiles and works. Because QEMU is such -a large project, it's okay to use configure arguments to limit what is -built for faster turnaround during your development time; but it is -still wise to also check that your patches work with a full build before -submitting a series, especially if your changes might have an unintended -effect on other areas of the code you don't normally experiment with. -See `Testing <Testing>`__ for more details on what tests are available. -Also, it is a wise idea to include a testsuite addition as part of your -patches - either to ensure that future changes won't regress your new -feature, or to add a test which exposes the bug that the rest of your -series fixes. Keeping separate commits for the test and the fix allows -reviewers to rebase the test to occur first to prove it catches the -problem, then again to place it last in the series so that bisection -doesn't land on a known-broken state. +Although QEMU uses various :ref:`ci` services that attempt to test +patches submitted to the list, it still saves everyone time if you +have already tested that your patch compiles and works. Because QEMU +is such a large project the default configuration won't create a +testing pipeline on GitLab when a branch is pushed. See the :ref:`CI +variable documentation<ci_var>` for details on how to control the +running of tests; but it is still wise to also check that your patches +work with a full build before submitting a series, especially if your +changes might have an unintended effect on other areas of the code you +don't normally experiment with. See :ref:`testing` for more details on +what tests are available. + +Also, it is a wise idea to include a testsuite addition as part of +your patches - either to ensure that future changes won't regress your +new feature, or to add a test which exposes the bug that the rest of +your series fixes. Keeping separate commits for the test and the fix +allows reviewers to rebase the test to occur first to prove it catches +the problem, then again to place it last in the series so that +bisection doesn't land on a known-broken state. .. _submitting_your_patches: diff --git a/docs/devel/testing.rst b/docs/devel/testing.rst index 5b60a31807..3f6ebd5073 100644 --- a/docs/devel/testing.rst +++ b/docs/devel/testing.rst @@ -1,3 +1,5 @@ +.. _testing: + Testing in QEMU =============== |