path: root/util/throttle.c
Commit message | Author | Age | Files | Lines
* throttle: Assert that bkt->max is valid in throttle_compute_wait() | Alberto Garcia | 2017-09-26 | 1 | -0/+1
  If bkt->max == 0 and bkt->burst_length > 1 then we could have a division by 0 in throttle_do_compute_wait(). That configuration is however not permitted and is already detected by throttle_is_valid(), but let's assert it in throttle_compute_wait() to make it explicit.

  Found by Coverity (CID: 1381016).

  Signed-off-by: Alberto Garcia <berto@igalia.com>
  Reviewed-by: Eric Blake <eblake@redhat.com>
  Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: convert ThrottleGroup to object with QOM | Manos Pitsidianakis | 2017-09-05 | 1 | -0/+151
  ThrottleGroup is converted to an object. This will allow the future throttle block filter drive easy creation and configuration of throttle groups in QMP and cli.

  A new QAPI struct, ThrottleLimits, is introduced to provide a shared struct for all throttle configuration needs in QMP.

  ThrottleGroups can be created via CLI as

    -object throttle-group,id=foo,x-iops-total=100,x-..

  where x-* are individual limit properties. Since we can't add non-scalar properties in -object this interface must be used instead. However, setting these properties must be disabled after initialization because certain combinations of limits are forbidden and thus configuration changes should be done in one transaction. The individual properties will go away when support for non-scalar values in CLI is implemented and thus are marked as experimental.

  ThrottleGroup also has a `limits` property that uses the ThrottleLimits struct. It can be used to create ThrottleGroups or set the configuration in existing groups as follows:

    { "execute": "object-add", "arguments": { "qom-type": "throttle-group", "id": "foo", "props" : { "limits": { "iops-total": 100 } } } }
    { "execute" : "qom-set", "arguments" : { "path" : "foo", "property" : "limits", "value" : { "iops-total" : 99 } } }

  This also means a group's configuration can be fetched with qom-get.

  Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
  Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
  Reviewed-by: Alberto Garcia <berto@igalia.com>
  Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* throttle: Make burst_length 64bit and add range checks | Alberto Garcia | 2017-08-29 | 1 | -0/+5
  LeakyBucket.burst_length is defined as an unsigned integer but the code never checks for overflows and it only makes sure that the value is not 0.

  In practice this means that the user can set something like throttling.iops-total-max-length=4294967300 despite being larger than UINT_MAX and the final value after casting to unsigned int will be 4.

  This patch changes the data type to uint64_t. This does not increase the storage size of LeakyBucket, and allows us to assign the value directly from qemu_opt_get_number() or BlockIOThrottle and then do the checks directly in throttle_is_valid().

  The value of burst_length does not have a specific upper limit, but since the bucket size is defined by max * burst_length we have to prevent overflows. Instead of going for UINT64_MAX or something similar this patch reuses THROTTLE_VALUE_MAX, which allows I/O bursts of 1 GiB/s for 10 days in a row.

  Signed-off-by: Alberto Garcia <berto@igalia.com>
  Message-id: 1b2e3049803f71cafb2e1fa1be4fb47147a0d398.1503580370.git.berto@igalia.com
  Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
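The truncation described above, and one way to guard the max * burst_length product, can be sketched as follows. This is an illustration only, not the code from the patch: `sketch_truncate()`, `sketch_burst_length_ok()` and `SKETCH_VALUE_MAX` are hypothetical names, and the value 1e15 is assumed from the upper limit mentioned in another commit of this log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: stands in for THROTTLE_VALUE_MAX, taken to be 1e15. */
#define SKETCH_VALUE_MAX 1000000000000000ULL

/* Old behaviour: a 64-bit option value stored in a 32-bit field
 * wraps modulo 2^32. */
uint32_t sketch_truncate(uint64_t value)
{
    return (uint32_t)value;
}

/* Hypothetical range check: burst_length must be nonzero and
 * max * burst_length must not exceed SKETCH_VALUE_MAX. */
bool sketch_burst_length_ok(uint64_t max, uint64_t burst_length)
{
    /* max itself is assumed to have been range-checked already */
    assert(max <= SKETCH_VALUE_MAX);

    if (burst_length == 0) {
        return false;
    }
    /* Compare by division so the product is never computed
     * and therefore cannot overflow. */
    if (max != 0 && burst_length > SKETCH_VALUE_MAX / max) {
        return false;
    }
    return true;
}
```

With these helpers, `sketch_truncate(4294967300ULL)` yields 4, matching the example in the commit message, while a burst length of 60 with a max of 100 passes the range check.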
* throttle: Make LeakyBucket.avg and LeakyBucket.max integer types | Alberto Garcia | 2017-08-29 | 1 | -4/+3
  Both the throttling limits set with the throttling.iops-* and throttling.bps-* options and their QMP equivalents defined in the BlockIOThrottle struct are integer values.

  Those limits are also reported in the BlockDeviceInfo struct and they are integers there as well.

  Therefore there's no reason to store them internally as double and do the conversion every time we're setting or querying them, so this patch uses uint64_t for those types. Let's also use an unsigned type because we don't allow negative values anyway.

  LeakyBucket.level and LeakyBucket.burst_level do however remain double because their value changes depending on the fraction of time elapsed since the previous I/O operation.

  Signed-off-by: Alberto Garcia <berto@igalia.com>
  Message-id: f29b840422767b5be2c41c2dfdbbbf6c5f8fedf8.1503580370.git.berto@igalia.com
  Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* throttle: Remove throttle_fix_bucket() / throttle_unfix_bucket() | Alberto Garcia | 2017-08-29 | 1 | -39/+23
  The throttling code can internally change the value of bkt->max if it hasn't been set by the user. The problem with this is that if we want to retrieve the original value we have to undo this change first. This is ugly and unnecessary: this patch removes the throttle_fix_bucket() and throttle_unfix_bucket() functions completely and moves the logic to throttle_compute_wait().

  Signed-off-by: Alberto Garcia <berto@igalia.com>
  Reviewed-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
  Message-id: 5b0b9e1ac6eb208d709eddc7b09e7669a523bff3.1503580370.git.berto@igalia.com
  Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* throttle: Make throttle_is_valid() a bit less verbose | Alberto Garcia | 2017-08-29 | 1 | -8/+7
  Use a pointer to the bucket instead of repeating cfg->buckets[i] all the time. This makes the code more concise and will help us expand the checks later and save a few line breaks.

  Signed-off-by: Alberto Garcia <berto@igalia.com>
  Message-id: 763ffc40a26b17d54cf93f5a999e4656049fcf0c.1503580370.git.berto@igalia.com
  Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* throttle: Update the throttle_fix_bucket() documentation | Alberto Garcia | 2017-08-29 | 1 | -8/+3
  The way the throttling algorithm works is that requests start being throttled once the bucket level exceeds the burst limit. When we get there the bucket leaks at the level set by the user (bkt->avg), and that leak rate is what prevents guest I/O from exceeding the desired limit.

  If we don't allow bursts (i.e. bkt->max == 0) then we can start throttling requests immediately. The problem with keeping the threshold at 0 is that it only allows one request at a time, and as soon as there's a bit of I/O from the guest every other request will be throttled and performance will suffer considerably. That can even make the guest unable to reach the throttle limit if that limit is high enough, and that happens regardless of the block scheduler used by the guest.

  Increasing that threshold gives flexibility to the guest, allowing it to perform short bursts of I/O before being throttled. Increasing the threshold too much does not make a difference in the long run (because it's the leak rate that defines the actual throughput) but it does allow the guest to perform longer initial bursts and exceed the throttle limit for a short while.

  A burst value of bkt->avg / 10 allows the guest to perform 100ms' worth of I/O at the target rate without being throttled.

  Signed-off-by: Alberto Garcia <berto@igalia.com>
  Message-id: 31aae6645f0d1fbf3860fb2b528b757236f0c0a7.1503580370.git.berto@igalia.com
  Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
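The threshold rule described in this message can be illustrated with a small sketch for the no-burst case (bkt->max == 0). It is a simplified model rather than the code in throttle.c: `SketchBucket`, `sketch_threshold()` and `sketch_wait_ns()` are hypothetical names, and only the avg / 10 threshold and the leak at the avg rate are taken from the text above.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NS_PER_SEC 1000000000.0

/* Hypothetical, simplified bucket: no burst settings at all. */
typedef struct {
    uint64_t avg;   /* leak rate set by the user, in units per second */
    double level;   /* current fill level, in the same units */
} SketchBucket;

/* Threshold above which requests are throttled: 100 ms' worth of
 * I/O at the target rate. */
double sketch_threshold(const SketchBucket *b)
{
    return (double)b->avg / 10;
}

/* Nanoseconds to wait until the level has leaked back down to the
 * threshold, given that the bucket leaks at b->avg units per second. */
int64_t sketch_wait_ns(const SketchBucket *b)
{
    double extra;

    assert(b->level >= 0);
    if (b->avg == 0) {
        return 0;   /* no limit configured, never wait */
    }
    extra = b->level - sketch_threshold(b);
    if (extra <= 0) {
        return 0;   /* still below the threshold */
    }
    return (int64_t)(extra * SKETCH_NS_PER_SEC / (double)b->avg);
}
```

For a bucket with avg = 1000 the threshold is 100 units; at level 150 the 50 extra units take 50 ms to leak away at 1000 units per second, so the computed wait is 50,000,000 ns.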
* block: remove timer canceling in throttle_config() | Manos Pitsidianakis | 2017-07-18 | 1 | -14/+0
  throttle_config() cancels the timers of the calling BlockBackend. This doesn't make sense because other BlockBackends in the group remain untouched. There's no need to cancel the timers in the one specific BlockBackend so let's not do that. Throttled requests will run as scheduled and future requests will follow the new configuration. This also allows a throttle group's configuration to be changed even when it has no members.

  Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
  Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
  Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: add clock_type field to ThrottleGroup | Manos Pitsidianakis | 2017-07-18 | 1 | -1/+3
  Clock type in throttling is currently inferred by the ThrottleTimer's clock type even though it is a per-ThrottleGroup property; it doesn't make sense to have different clock types in the same group. Moving this to a field in ThrottleGroup can simplify some of the throttle functions.

  Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
  Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
  Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* throttle: make throttle_config(throttle_get_config()) symmetric | Stefan Hajnoczi | 2017-04-21 | 1 | -0/+14
  Throttling has a weird property that throttle_get_config() does not always return the same throttling settings that were given with throttle_config(). In other words, the set and get functions aren't symmetric.

  If .max is 0 then the throttling code assigns a default value of .avg / 10 in throttle_config(). This is an implementation detail of the throttling algorithm. When throttle_get_config() is called the .max value returned should still be 0.

  Users are exposed to this quirk via "info block" or "query-block" monitor commands. This has caused confusion because it looks like a bug when an unexpected value is reported.

  This patch hides the .max value adjustment in throttle_get_config() and updates test-throttle.c appropriately.

  Reported-by: Nini Gu <ngu@redhat.com>
  Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
  Reviewed-by: Alberto Garcia <berto@igalia.com>
  Message-id: 20170301115026.22621-4-stefanha@redhat.com
  Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* throttle: Don't allow burst limits to be lower than the normal limits | Alberto Garcia | 2016-08-05 | 1 | -0/+5
  Setting FOO_max to a value that is lower than FOO does not make sense, and it produces odd results depending on the value of FOO_max_length. Although the user should not set that configuration in the first place it's better to reject it explicitly.

  https://bugzilla.redhat.com/show_bug.cgi?id=1355665

  Signed-off-by: Alberto Garcia <berto@igalia.com>
  Reported-by: Gu Nini <ngu@redhat.com>
  Reviewed-by: Eric Blake <eblake@redhat.com>
  Message-id: 663d5aca406060e31f80d8113f77b6feee63b919.1469693110.git.berto@igalia.com
  Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
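The rule above, together with the older "burst limit without a normal limit" rule from the 2015-08-05 entry in this log, can be sketched as a single check. `sketch_limits_valid()` and its error strings are hypothetical and only illustrate the two conditions; the real checks live in throttle_is_valid() and report errors through QEMU's Error API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if the (avg, max) pair is acceptable. On failure,
 * *errp points to a static message describing the problem. */
bool sketch_limits_valid(uint64_t avg, uint64_t max, const char **errp)
{
    assert(errp != NULL);
    *errp = NULL;

    /* A burst limit is meaningless without a normal limit. */
    if (max != 0 && avg == 0) {
        *errp = "burst limit set without a normal limit";
        return false;
    }
    /* A burst limit below the normal limit makes no sense. */
    if (max != 0 && max < avg) {
        *errp = "burst limit lower than the normal limit";
        return false;
    }
    return true;
}
```

Here a pair such as (avg = 100, max = 50) is rejected, while (100, 0) and (100, 100) are accepted because max == 0 means "no burst limit" and an equal burst limit is not lower than the normal one.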
* throttle: refuse iops-size without iops-total/read/write | Stefan Hajnoczi | 2016-06-07 | 1 | -0/+8
  In a similar vein to commit ee2bdc33c913b7d765baa5aa338c29fb30a05c9a ("throttle: refuse bps_max/iops_max without bps/iops") it is likely that the user made a configuration error if iops-size has been set but no iops limit has been set.

  Print an error message so the user can check their throttling configuration. They should either remove iops-size if they don't want any throttling or specify one of iops-total, iops-read, or iops-write.

  Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
  Reviewed-by: Alberto Garcia <berto@igalia.com>
  Message-id: 1464828031-25601-1-git-send-email-stefanha@redhat.com
* include/qemu/osdep.h: Don't include qapi/error.h | Markus Armbruster | 2016-03-22 | 1 | -0/+1
  Commit 57cb38b included qapi/error.h into qemu/osdep.h to get the Error typedef. Since then, we've moved to include qemu/osdep.h everywhere. Its file comment explains: "To avoid getting into possible circular include dependencies, this file should not include any other QEMU headers, with the exceptions of config-host.h, compiler.h, os-posix.h and os-win32.h, all of which are doing a similar job to this file and are under similar constraints." qapi/error.h doesn't do a similar job, and it doesn't adhere to similar constraints: it includes qapi-types.h. That's in excess of 100KiB of crap most .c files don't actually need.

  Add the typedef to qemu/typedefs.h, and include that instead of qapi/error.h. Include qapi/error.h in .c files that need it and don't get it now. Include qapi-types.h in qom/object.h for uint16List.

  Update scripts/clean-includes accordingly. Update it further to match reality: replace config.h by config-target.h, add sysemu/os-posix.h, sysemu/os-win32.h. Update the list of includes in the qemu/osdep.h comment quoted above similarly.

  This reduces the number of objects depending on qapi/error.h from "all of them" to less than a third. Unfortunately, the number depending on qapi-types.h shrinks only a little. More work is needed for that one.

  Signed-off-by: Markus Armbruster <armbru@redhat.com>
  [Fix compilation without the spice devel packages. - Paolo]
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* throttle: Add support for burst periods | Alberto Garcia | 2016-02-22 | 1 | -13/+60
  This patch adds support for burst periods to the throttling code. With this feature the user can keep performing bursts as defined by the LeakyBucket.max rate for a configurable period of time.

  Signed-off-by: Alberto Garcia <berto@igalia.com>
  Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
  Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* throttle: Use throttle_config_init() to initialize ThrottleConfig | Alberto Garcia | 2016-02-22 | 1 | -0/+10
  We can currently initialize ThrottleConfig by zeroing all its fields, but this will change with the new fields to define the length of the burst periods.

  This patch introduces a new throttle_config_init() function and uses it to replace all memset() calls that initialize ThrottleConfig directly.

  Signed-off-by: Alberto Garcia <berto@igalia.com>
  Reviewed-by: Kevin Wolf <kwolf@redhat.com>
  Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
  Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* throttle: Merge all functions that check the configuration into one | Alberto Garcia | 2016-02-22 | 1 | -32/+8
  There's no need to keep throttle_conflicting(), throttle_is_valid() and throttle_max_is_missing_limit() as separate functions, so this patch merges all three into one.

  As a consequence, check_throttle_config() becomes redundant and can be replaced with throttle_is_valid().

  Signed-off-by: Alberto Garcia <berto@igalia.com>
  Reviewed-by: Kevin Wolf <kwolf@redhat.com>
  Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
  Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* throttle: Make throttle_is_valid() set errp | Alberto Garcia | 2016-02-22 | 1 | -1/+4
  The caller does not need to set it, and this will allow us to refactor this function later.

  Signed-off-by: Alberto Garcia <berto@igalia.com>
  Reviewed-by: Kevin Wolf <kwolf@redhat.com>
  Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
  Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* throttle: Make throttle_max_is_missing_limit() set errp | Alberto Garcia | 2016-02-22 | 1 | -1/+4
  The caller does not need to set it, and this will allow us to refactor this function later.

  Signed-off-by: Alberto Garcia <berto@igalia.com>
  Reviewed-by: Kevin Wolf <kwolf@redhat.com>
  Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
  Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* throttle: Make throttle_conflicting() set errp | Alberto Garcia | 2016-02-22 | 1 | -2/+9
  The caller does not need to set it, and this will allow us to refactor this function later.

  Signed-off-by: Alberto Garcia <berto@igalia.com>
  Reviewed-by: Kevin Wolf <kwolf@redhat.com>
  Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
  Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* throttle: Make throttle_compute_timer() static | Alberto Garcia | 2016-02-22 | 1 | -4/+4
  This function is only used internally in throttle.c

  Signed-off-by: Alberto Garcia <berto@igalia.com>
  Reviewed-by: Kevin Wolf <kwolf@redhat.com>
  Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
  Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* util: Clean up includes | Peter Maydell | 2016-02-04 | 1 | -0/+1
  Clean up includes so that osdep.h is included first and headers which it implies are not included manually.

  This commit was created with scripts/clean-includes.

  Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
  Message-id: 1454089805-5470-6-git-send-email-peter.maydell@linaro.org
* blockdev: Error out on negative throttling option values | Fam Zheng | 2016-01-20 | 1 | -10/+6
  extract_common_blockdev_options() uses qemu_opt_get_number() to parse the bps/iops numbers to uint64_t, then converts to double and stores in ThrottleConfig.

  The actual parsing is done by strtoull() in parse_option_number(). Negative numbers are wrapped to large positive ones, and stored.

  We used to reject negative numbers since 7d81c1413c9, but this regressed when the option parsing code was changed later. Now fix this again.

  This time, define an arbitrary large upper limit (1e15), and check the values so both negative and impractically big numbers are caught and reported.

  Signed-off-by: Fam Zheng <famz@redhat.com>
  Reviewed-by: Markus Armbruster <armbru@redhat.com>
  Reviewed-by: Alberto Garcia <berto@igalia.com>
  Signed-off-by: Kevin Wolf <kwolf@redhat.com>
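The wrap-around described above follows from how the C library's strtoull() handles a leading minus sign: the parsed value is negated in the unsigned return type. The sketch below shows the effect and why an upper bound catches it; `sketch_parse()`, `sketch_value_ok()` and `SKETCH_VALUE_MAX` are hypothetical names, with the bound set to the 1e15 mentioned in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: the "arbitrary large upper limit (1e15)" from the text. */
#define SKETCH_VALUE_MAX 1000000000000000ULL

/* Parse a decimal option value the way a strtoull()-based parser
 * would. A leading '-' is accepted and the result is negated as an
 * unsigned value, so "-1" becomes the largest 64-bit number. */
uint64_t sketch_parse(const char *s)
{
    assert(s != NULL);
    return (uint64_t)strtoull(s, NULL, 10);
}

/* A single upper-bound check rejects both impractically large values
 * and wrapped negative ones. */
bool sketch_value_ok(uint64_t value)
{
    return value <= SKETCH_VALUE_MAX;
}
```

Because any small negative input wraps to a value close to UINT64_MAX, well above 1e15, no separate sign check is needed once the bound is enforced.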
* throttle: refuse bps_max/iops_max without bps/iops | Stefan Hajnoczi | 2015-08-05 | 1 | -0/+15
  The bps_max/iops_max values are meaningless without corresponding bps/iops values. Report an error if bps_max/iops_max is given without bps/iops.

  Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
  Reviewed-by: Alberto Garcia <berto@igalia.com>
  Message-id: 1438683733-21111-2-git-send-email-stefanha@redhat.com
* timer: rename NSEC_PER_SEC due to Mac OS X header clash | Stefan Hajnoczi | 2015-07-20 | 1 | -2/+2
  Commit e0cf11f31c24cfb17f44ed46c254d84c78e7f6e9 ("timer: Use a single definition of NSEC_PER_SEC for the whole codebase") renamed NANOSECONDS_PER_SECOND to NSEC_PER_SEC.

  On Mac OS X there is a <dispatch/time.h> system header which also defines NSEC_PER_SEC. This causes compiler warnings.

  Let's use the old name instead. It's longer but it doesn't clash.

  Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
  Message-id: 1436364609-7929-1-git-send-email-stefanha@redhat.com
  Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* timer: Use a single definition of NSEC_PER_SEC for the whole codebase | Alberto Garcia | 2015-07-02 | 1 | -2/+2
  Signed-off-by: Alberto Garcia <berto@igalia.com>
  Message-id: c6e55468856ba0b8f95913c4da111cc0ef266541.1434113783.git.berto@igalia.com
  Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* throttle: Update throttle infrastructure copyright | Alberto Garcia | 2015-06-12 | 1 | -3/+5
  Signed-off-by: Alberto Garcia <berto@igalia.com>
  Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
  Message-id: 07dcd4ed02f0110b13b3140f477b761b8bb8e270.1433779731.git.berto@igalia.com
  Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* throttle: Extract timers from ThrottleState into a separate structure | Benoît Canet | 2015-06-12 | 1 | -30/+43
  Group throttling will share ThrottleState between multiple bs. As a consequence the ThrottleState will be accessed by multiple AioContexts.

  Timers are tied to their AioContext so they must be moved out of the ThrottleState structure.

  This commit paves the way for each bs of a common ThrottleState to have its own timer.

  Signed-off-by: Benoit Canet <benoit.canet@nodalink.com>
  Signed-off-by: Alberto Garcia <berto@igalia.com>
  Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
  Message-id: 6cf9ea96d8b32ae2f8769cead38f68a6a0c8c909.1433779731.git.berto@igalia.com
  Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* throttle: add throttle_detach/attach_aio_context() | Stefan Hajnoczi | 2014-06-04 | 1 | -4/+23
  Block I/O throttling uses timers and currently always adds them to the main loop. Throttling will break if bdrv_set_aio_context() is used to move a BlockDriverState to a different AioContext.

  This patch adds throttle_detach/attach_aio_context() interfaces and uses them to move the throttling timers to the new AioContext. Note that bdrv_set_aio_context() already drains all requests so we're sure no throttled requests are pending.

  The test cases need to be updated since the throttle_init() interface has changed.

  Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
  Reviewed-by: Benoit Canet <benoit@irqsave.net>
* throttle: Add a new throttling API implementing continuous leaky bucket. | Benoît Canet | 2013-09-06 | 1 | -0/+396
  Implement the continuous leaky bucket algorithm devised on IRC as a separate module.

  Signed-off-by: Benoit Canet <benoit@irqsave.net>
  Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>