From 0770a7a6466cc2dbf4ac91841173ad4488e1fbc7 Mon Sep 17 00:00:00 2001
From: Alberto Garcia
Date: Thu, 24 Aug 2017 16:24:44 +0300
Subject: throttle: Update the throttle_fix_bucket() documentation

The way the throttling algorithm works is that requests start being
throttled once the bucket level exceeds the burst limit. When we get
there the bucket leaks at the level set by the user (bkt->avg), and
that leak rate is what prevents guest I/O from exceeding the desired
limit.

If we don't allow bursts (i.e. bkt->max == 0) then we can start
throttling requests immediately. The problem with keeping the
threshold at 0 is that it only allows one request at a time, and as
soon as there's a bit of I/O from the guest every other request will
be throttled and performance will suffer considerably. That can even
make the guest unable to reach the throttle limit if that limit is
high enough, and that happens regardless of the block scheduler used
by the guest.

Increasing that threshold gives flexibility to the guest, allowing it
to perform short bursts of I/O before being throttled. Increasing the
threshold too much does not make a difference in the long run (because
it's the leak rate that defines the actual throughput) but it does
allow the guest to perform longer initial bursts and exceed the
throttle limit for a short while.

A burst value of bkt->avg / 10 allows the guest to perform 100ms'
worth of I/O at the target rate without being throttled.
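
As a rough illustration of that arithmetic (a sketch with hypothetical
helper names, not code from util/throttle.c): with a limit of 1000
units per second the implicit bucket holds 100 units, which is what
the guest can transfer in 100ms at the target rate.

```c
#include <assert.h>

/* Hypothetical helper: size of the implicit bucket used when no burst
 * limit is configured (bkt->max == 0), i.e. one tenth of bkt->avg. */
static double implicit_bucket_size(double avg)
{
    return avg / 10;
}

/* Hypothetical helper: time in milliseconds needed to transfer
 * `units` at a rate of `rate` units per second. */
static double ms_at_rate(double units, double rate)
{
    assert(rate > 0);
    return units * 1000.0 / rate;
}
```

For any non-zero rate, ms_at_rate(implicit_bucket_size(rate), rate)
comes out at roughly 100, independent of the configured limit.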

Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 31aae6645f0d1fbf3860fb2b528b757236f0c0a7.1503580370.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 util/throttle.c | 11 +++--------
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/util/throttle.c b/util/throttle.c
index b2a52b8b34..9a6bda813c 100644
--- a/util/throttle.c
+++ b/util/throttle.c
@@ -366,14 +366,9 @@ static void throttle_fix_bucket(LeakyBucket *bkt)
     /* zero bucket level */
     bkt->level = bkt->burst_level = 0;
 
-    /* The following is done to cope with the Linux CFQ block scheduler
-     * which regroup reads and writes by block of 100ms in the guest.
-     * When they are two process one making reads and one making writes cfq
-     * make a pattern looking like the following:
-     * WWWWWWWWWWWRRRRRRRRRRRRRRWWWWWWWWWWWWWwRRRRRRRRRRRRRRRRR
-     * Having a max burst value of 100ms of the average will help smooth the
-     * throttling
-     */
+    /* If bkt->max is 0 we still want to allow short bursts of I/O
+     * from the guest, otherwise every other request will be throttled
+     * and performance will suffer considerably. */
     min = bkt->avg / 10;
     if (bkt->avg && !bkt->max) {
         bkt->max = min;
-- 
cgit v1.2.3-55-g7522


From fa36f1b2ebcd9a7b2a58c8e12dfb1cc8596c23c0 Mon Sep 17 00:00:00 2001
From: Alberto Garcia
Date: Thu, 24 Aug 2017 16:24:45 +0300
Subject: throttle: Make throttle_is_valid() a bit less verbose

Use a pointer to the bucket instead of repeating cfg->buckets[i] all
the time. This makes the code more concise and will help us expand the
checks later and save a few line breaks.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 763ffc40a26b17d54cf93f5a999e4656049fcf0c.1503580370.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 util/throttle.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/util/throttle.c b/util/throttle.c
index 9a6bda813c..bde56fe3de 100644
--- a/util/throttle.c
+++ b/util/throttle.c
@@ -324,32 +324,31 @@ bool throttle_is_valid(ThrottleConfig *cfg, Error **errp)
     }
 
     for (i = 0; i < BUCKETS_COUNT; i++) {
-        if (cfg->buckets[i].avg < 0 ||
-            cfg->buckets[i].max < 0 ||
-            cfg->buckets[i].avg > THROTTLE_VALUE_MAX ||
-            cfg->buckets[i].max > THROTTLE_VALUE_MAX) {
+        LeakyBucket *bkt = &cfg->buckets[i];
+        if (bkt->avg < 0 || bkt->max < 0 ||
+            bkt->avg > THROTTLE_VALUE_MAX || bkt->max > THROTTLE_VALUE_MAX) {
             error_setg(errp, "bps/iops/max values must be within [0, %lld]",
                        THROTTLE_VALUE_MAX);
             return false;
         }
 
-        if (!cfg->buckets[i].burst_length) {
+        if (!bkt->burst_length) {
             error_setg(errp, "the burst length cannot be 0");
             return false;
         }
 
-        if (cfg->buckets[i].burst_length > 1 && !cfg->buckets[i].max) {
+        if (bkt->burst_length > 1 && !bkt->max) {
             error_setg(errp, "burst length set without burst rate");
             return false;
         }
 
-        if (cfg->buckets[i].max && !cfg->buckets[i].avg) {
+        if (bkt->max && !bkt->avg) {
             error_setg(errp, "bps_max/iops_max require corresponding"
                        " bps/iops values");
             return false;
         }
 
-        if (cfg->buckets[i].max && cfg->buckets[i].max < cfg->buckets[i].avg) {
+        if (bkt->max && bkt->max < bkt->avg) {
             error_setg(errp, "bps_max/iops_max cannot be lower than bps/iops");
             return false;
         }
-- 
cgit v1.2.3-55-g7522


From 2a8be39ebad013e506e31b069ddcce8993a957bf Mon Sep 17 00:00:00 2001
From: Alberto Garcia
Date: Thu, 24 Aug 2017 16:24:46 +0300
Subject: throttle: Remove throttle_fix_bucket() / throttle_unfix_bucket()

The throttling code can internally change the value of bkt->max if it
hasn't been set by the user. The problem with this is that if we want
to retrieve the original value we have to undo this change first. This
is ugly and unnecessary: this patch removes the throttle_fix_bucket()
and throttle_unfix_bucket() functions completely and moves the logic
to throttle_compute_wait().
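
The resulting decision logic can be sketched in isolation as follows.
This is a simplified, standalone approximation: SketchBucket,
sketch_wait_ns() and sketch_compute_wait() are hypothetical names, and
the wait formula (extra units divided by the limit, expressed in
nanoseconds) is an assumption about what throttle_do_compute_wait()
does rather than a copy of it.

```c
#include <stdint.h>

#define SKETCH_NS_PER_SEC 1000000000.0

/* Hypothetical, trimmed-down stand-in for LeakyBucket */
typedef struct {
    double avg;            /* average goal in units per second */
    double max;            /* burst rate, 0 if not set by the user */
    double level;          /* main bucket level */
    double burst_level;    /* burst bucket level */
    unsigned burst_length; /* burst period in seconds */
} SketchBucket;

/* Assumed formula: time needed to leak `extra` units at `limit` per second */
static int64_t sketch_wait_ns(double limit, double extra)
{
    return (int64_t) (extra * SKETCH_NS_PER_SEC / limit);
}

static int64_t sketch_compute_wait(const SketchBucket *bkt)
{
    double bucket_size, burst_bucket_size, extra;

    if (!bkt->avg) {
        return 0; /* throttling disabled for this bucket */
    }
    if (!bkt->max) {
        /* No burst limit: derive a small bucket without touching bkt->max */
        bucket_size = bkt->avg / 10;
        burst_bucket_size = 0;
    } else {
        bucket_size = bkt->max * bkt->burst_length;
        burst_bucket_size = bkt->max / 10;
    }

    extra = bkt->level - bucket_size;
    if (extra > 0) {
        return sketch_wait_ns(bkt->avg, extra); /* main bucket is full */
    }
    if (bkt->burst_length > 1) {
        extra = bkt->burst_level - burst_bucket_size;
        if (extra > 0) {
            return sketch_wait_ns(bkt->max, extra); /* burst bucket is full */
        }
    }
    return 0;
}
```

Because the derived sizes live in local variables, bkt->max keeps the
value set by the user and can be reported back without any undo step.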

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Message-id: 5b0b9e1ac6eb208d709eddc7b09e7669a523bff3.1503580370.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 util/throttle.c | 62 +++++++++++++++++++++------------------------------------
 1 file changed, 23 insertions(+), 39 deletions(-)

diff --git a/util/throttle.c b/util/throttle.c
index bde56fe3de..4e80a7ea54 100644
--- a/util/throttle.c
+++ b/util/throttle.c
@@ -95,23 +95,36 @@ static int64_t throttle_do_compute_wait(double limit, double extra)
 int64_t throttle_compute_wait(LeakyBucket *bkt)
 {
     double extra; /* the number of extra units blocking the io */
+    double bucket_size;   /* I/O before throttling to bkt->avg */
+    double burst_bucket_size; /* Before throttling to bkt->max */
 
     if (!bkt->avg) {
         return 0;
     }
 
-    /* If the bucket is full then we have to wait */
-    extra = bkt->level - bkt->max * bkt->burst_length;
+    if (!bkt->max) {
+        /* If bkt->max is 0 we still want to allow short bursts of I/O
+         * from the guest, otherwise every other request will be throttled
+         * and performance will suffer considerably. */
+        bucket_size = bkt->avg / 10;
+        burst_bucket_size = 0;
+    } else {
+        /* If we have a burst limit then we have to wait until all I/O
+         * at burst rate has finished before throttling to bkt->avg */
+        bucket_size = bkt->max * bkt->burst_length;
+        burst_bucket_size = bkt->max / 10;
+    }
+
+    /* If the main bucket is full then we have to wait */
+    extra = bkt->level - bucket_size;
     if (extra > 0) {
         return throttle_do_compute_wait(bkt->avg, extra);
     }
 
-    /* If the bucket is not full yet we have to make sure that we
-     * fulfill the goal of bkt->max units per second. */
+    /* If the main bucket is not full yet we still have to check the
+     * burst bucket in order to enforce the burst limit */
     if (bkt->burst_length > 1) {
-        /* We use 1/10 of the max value to smooth the throttling.
-         * See throttle_fix_bucket() for more details. */
-        extra = bkt->burst_level - bkt->max / 10;
+        extra = bkt->burst_level - burst_bucket_size;
         if (extra > 0) {
             return throttle_do_compute_wait(bkt->max, extra);
         }
@@ -357,31 +370,6 @@ bool throttle_is_valid(ThrottleConfig *cfg, Error **errp)
     return true;
 }
 
-/* fix bucket parameters */
-static void throttle_fix_bucket(LeakyBucket *bkt)
-{
-    double min;
-
-    /* zero bucket level */
-    bkt->level = bkt->burst_level = 0;
-
-    /* If bkt->max is 0 we still want to allow short bursts of I/O
-     * from the guest, otherwise every other request will be throttled
-     * and performance will suffer considerably. */
-    min = bkt->avg / 10;
-    if (bkt->avg && !bkt->max) {
-        bkt->max = min;
-    }
-}
-
-/* undo internal bucket parameter changes (see throttle_fix_bucket()) */
-static void throttle_unfix_bucket(LeakyBucket *bkt)
-{
-    if (bkt->max < bkt->avg) {
-        bkt->max = 0;
-    }
-}
-
 /* Used to configure the throttle
  *
  * @ts: the throttle state we are working on
@@ -396,8 +384,10 @@ void throttle_config(ThrottleState *ts,
 
     ts->cfg = *cfg;
 
+    /* Zero bucket level */
     for (i = 0; i < BUCKETS_COUNT; i++) {
-        throttle_fix_bucket(&ts->cfg.buckets[i]);
+        ts->cfg.buckets[i].level = 0;
+        ts->cfg.buckets[i].burst_level = 0;
     }
 
     ts->previous_leak = qemu_clock_get_ns(clock_type);
@@ -410,13 +400,7 @@ void throttle_config(ThrottleState *ts,
  */
 void throttle_get_config(ThrottleState *ts, ThrottleConfig *cfg)
 {
-    int i;
-
     *cfg = ts->cfg;
-
-    for (i = 0; i < BUCKETS_COUNT; i++) {
-        throttle_unfix_bucket(&cfg->buckets[i]);
-    }
 }
 
 
-- 
cgit v1.2.3-55-g7522


From d00e6923b1e2c1bec7840b0a0706764493648527 Mon Sep 17 00:00:00 2001
From: Alberto Garcia
Date: Thu, 24 Aug 2017 16:24:47 +0300
Subject: throttle: Make LeakyBucket.avg and LeakyBucket.max integer types

Both the throttling limits set with the throttling.iops-* and
throttling.bps-* options and their QMP equivalents defined in the
BlockIOThrottle struct are integer values.

Those limits are also reported in the BlockDeviceInfo struct and they
are integers there as well.

Therefore there's no reason to store them internally as double and do
the conversion every time we're setting or querying them, so this patch
uses uint64_t for those types. Let's also use an unsigned type because
we don't allow negative values anyway.

LeakyBucket.level and LeakyBucket.burst_level do however remain double
because their value changes depending on the fraction of time elapsed
since the previous I/O operation.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: f29b840422767b5be2c41c2dfdbbbf6c5f8fedf8.1503580370.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 include/qemu/throttle.h | 4 ++--
 tests/test-throttle.c   | 3 ++-
 util/throttle.c         | 7 +++----
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/qemu/throttle.h b/include/qemu/throttle.h
index 66a8ac10a4..6e31155fd4 100644
--- a/include/qemu/throttle.h
+++ b/include/qemu/throttle.h
@@ -77,8 +77,8 @@ typedef enum {
  */
 
 typedef struct LeakyBucket {
-    double  avg;              /* average goal in units per second */
-    double  max;              /* leaky bucket max burst in units */
+    uint64_t avg;             /* average goal in units per second */
+    uint64_t max;             /* leaky bucket max burst in units */
     double  level;            /* bucket level in units */
     double  burst_level;      /* bucket level in units (for computing bursts) */
     unsigned burst_length;    /* max length of the burst period, in seconds */
diff --git a/tests/test-throttle.c b/tests/test-throttle.c
index 768f11dfed..41c0dd2529 100644
--- a/tests/test-throttle.c
+++ b/tests/test-throttle.c
@@ -284,13 +284,14 @@ static void test_enabled(void)
     for (i = 0; i < BUCKETS_COUNT; i++) {
         throttle_config_init(&cfg);
         set_cfg_value(false, i, 150);
+        g_assert(throttle_is_valid(&cfg, NULL));
         g_assert(throttle_enabled(&cfg));
     }
 
     for (i = 0; i < BUCKETS_COUNT; i++) {
         throttle_config_init(&cfg);
         set_cfg_value(false, i, -150);
-        g_assert(!throttle_enabled(&cfg));
+        g_assert(!throttle_is_valid(&cfg, NULL));
     }
 }
 
diff --git a/util/throttle.c b/util/throttle.c
index 4e80a7ea54..80660ffd2c 100644
--- a/util/throttle.c
+++ b/util/throttle.c
@@ -106,13 +106,13 @@ int64_t throttle_compute_wait(LeakyBucket *bkt)
         /* If bkt->max is 0 we still want to allow short bursts of I/O
          * from the guest, otherwise every other request will be throttled
          * and performance will suffer considerably. */
-        bucket_size = bkt->avg / 10;
+        bucket_size = (double) bkt->avg / 10;
         burst_bucket_size = 0;
     } else {
         /* If we have a burst limit then we have to wait until all I/O
          * at burst rate has finished before throttling to bkt->avg */
         bucket_size = bkt->max * bkt->burst_length;
-        burst_bucket_size = bkt->max / 10;
+        burst_bucket_size = (double) bkt->max / 10;
     }
 
     /* If the main bucket is full then we have to wait */
@@ -338,8 +338,7 @@ bool throttle_is_valid(ThrottleConfig *cfg, Error **errp)
 
     for (i = 0; i < BUCKETS_COUNT; i++) {
         LeakyBucket *bkt = &cfg->buckets[i];
-        if (bkt->avg < 0 || bkt->max < 0 ||
-            bkt->avg > THROTTLE_VALUE_MAX || bkt->max > THROTTLE_VALUE_MAX) {
+        if (bkt->avg > THROTTLE_VALUE_MAX || bkt->max > THROTTLE_VALUE_MAX) {
             error_setg(errp, "bps/iops/max values must be within [0, %lld]",
                        THROTTLE_VALUE_MAX);
             return false;
-- 
cgit v1.2.3-55-g7522


From 67335a4558d3cad2173aac0ce13b6c096b077c41 Mon Sep 17 00:00:00 2001
From: Alberto Garcia
Date: Thu, 24 Aug 2017 16:24:48 +0300
Subject: throttle: Make burst_length 64bit and add range checks

LeakyBucket.burst_length is defined as an unsigned integer but the
code never checks for overflows and it only makes sure that the value
is not 0.

In practice this means that the user can set something like
throttling.iops-total-max-length=4294967300 even though that is larger
than UINT_MAX, and the final value after casting to unsigned int will
be 4.
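
The truncation can be reproduced with a small sketch. The helper name
is hypothetical and the result assumes a platform where unsigned int
is 32 bits wide, so the stored value is the request modulo 2^32.

```c
#include <stdint.h>

/* Hypothetical helper: mimics storing a 64-bit option value in the
 * old `unsigned burst_length` field. Assumes 32-bit unsigned int. */
static unsigned old_burst_length(uint64_t requested)
{
    /* Conversion to an unsigned type is well defined in C: the value
     * is reduced modulo UINT_MAX + 1, silently dropping high bits. */
    return (unsigned) requested;
}
```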

This patch changes the data type to uint64_t. This does not increase
the storage size of LeakyBucket, and allows us to assign the value
directly from qemu_opt_get_number() or BlockIOThrottle and then do the
checks directly in throttle_is_valid().

The value of burst_length does not have a specific upper limit,
but since the bucket size is defined by max * burst_length we have
to prevent overflows. Instead of going for UINT64_MAX or something
similar this patch reuses THROTTLE_VALUE_MAX, which allows I/O bursts
of 1 GiB/s for 10 days in a row.
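
A minimal sketch of that bound, assuming THROTTLE_VALUE_MAX is
1000000000000000 (check include/qemu/throttle.h for the real
definition); SKETCH_VALUE_MAX and burst_fits() are hypothetical names
used only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value of THROTTLE_VALUE_MAX, not taken from this patch */
#define SKETCH_VALUE_MAX 1000000000000000ULL

/* Returns true if max * burst_length stays within SKETCH_VALUE_MAX.
 * Dividing instead of multiplying keeps the check itself from
 * overflowing; max == 0 means no burst rate is configured. */
static bool burst_fits(uint64_t max, uint64_t burst_length)
{
    return max == 0 || burst_length <= SKETCH_VALUE_MAX / max;
}
```

Under that assumption, 1 GiB/s (1 << 30 bytes per second) for 10 days
(864000 seconds) passes the check, while the same rate for 11 days
does not.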

Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 1b2e3049803f71cafb2e1fa1be4fb47147a0d398.1503580370.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 include/qemu/throttle.h | 2 +-
 util/throttle.c         | 5 +++++
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/include/qemu/throttle.h b/include/qemu/throttle.h
index 6e31155fd4..8e01885d29 100644
--- a/include/qemu/throttle.h
+++ b/include/qemu/throttle.h
@@ -81,7 +81,7 @@ typedef struct LeakyBucket {
     uint64_t max;             /* leaky bucket max burst in units */
     double  level;            /* bucket level in units */
     double  burst_level;      /* bucket level in units (for computing bursts) */
-    unsigned burst_length;    /* max length of the burst period, in seconds */
+    uint64_t burst_length;    /* max length of the burst period, in seconds */
 } LeakyBucket;
 
 /* The following structure is used to configure a ThrottleState
diff --git a/util/throttle.c b/util/throttle.c
index 80660ffd2c..b8c524336c 100644
--- a/util/throttle.c
+++ b/util/throttle.c
@@ -354,6 +354,11 @@ bool throttle_is_valid(ThrottleConfig *cfg, Error **errp)
             return false;
         }
 
+        if (bkt->max && bkt->burst_length > THROTTLE_VALUE_MAX / bkt->max) {
+            error_setg(errp, "burst length too high for this burst rate");
+            return false;
+        }
+
         if (bkt->max && !bkt->avg) {
             error_setg(errp, "bps_max/iops_max require corresponding"
                        " bps/iops values");
-- 
cgit v1.2.3-55-g7522


From e916a6e88a4ff6c39cd6f62fb162a561c6b89de8 Mon Sep 17 00:00:00 2001
From: Eduardo Habkost
Date: Tue, 29 Aug 2017 18:20:53 -0300
Subject: oslib-posix: Print errors before aborting on qemu_alloc_stack()

If QEMU is running on a system that's out of memory and mmap()
fails, QEMU aborts with no error message at all, making it hard
to debug the reason for the failure.

Add perror() calls that will print error information before
aborting.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20170829212053.6003-1-ehabkost@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 util/oslib-posix.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/util/oslib-posix.c b/util/oslib-posix.c
index cacf0ef5e3..80086c549f 100644
--- a/util/oslib-posix.c
+++ b/util/oslib-posix.c
@@ -530,6 +530,7 @@ void *qemu_alloc_stack(size_t *sz)
     ptr = mmap(NULL, *sz, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
     if (ptr == MAP_FAILED) {
+        perror("failed to allocate memory for stack");
         abort();
     }
 
@@ -544,6 +545,7 @@ void *qemu_alloc_stack(size_t *sz)
     guardpage = ptr;
 #endif
     if (mprotect(guardpage, pagesz, PROT_NONE) != 0) {
+        perror("failed to set up stack guard page");
         abort();
     }
 
-- 
cgit v1.2.3-55-g7522