path: root/drivers/gpu/drm/i915/intel_pm.c
author     Chris Wilson    2018-01-22 14:55:41 +0100
committer  Chris Wilson    2018-01-22 19:27:04 +0100
commit     c1beabcf143c72ecdfe76bf50aa6e385a6192d08 (patch)
tree       3eed763f4d93143cee7314f218ade1dd0fbee7b5 /drivers/gpu/drm/i915/intel_pm.c
parent     drm/i915: Per-engine scratch VMA is mandatory (diff)
download   kernel-qcow2-linux-c1beabcf143c72ecdfe76bf50aa6e385a6192d08.tar.gz
           kernel-qcow2-linux-c1beabcf143c72ecdfe76bf50aa6e385a6192d08.tar.xz
           kernel-qcow2-linux-c1beabcf143c72ecdfe76bf50aa6e385a6192d08.zip
drm/i915: Increase render/media power gating hysteresis for gen9+
On gen9+, after an idle period the HW will disable the entire power well to conserve power (by preventing current leakage). It takes around 100 microseconds to bring the power well back online afterwards. With the current hysteresis value of 25us (really 25 * 1280ns), we do not have sufficient time to respond to an interrupt and schedule the next execution before the HW powers itself down. (At present, we prevent this by grabbing the forcewake for prolonged periods of time, but that overkill is fixed by the next patch.) The minimum we want to set the power gating hysteresis to is the length of time it takes us to service the GPU, which across a broad spectrum of machines is about 250us. (Note this also brings guc latency into the same ballpark as execlists.)

v2: Include some notes on where I plucked the numbers from.

Testcase: igt/gem_exec_nop/sequential
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180122135541.32222-1-chris@chris-wilson.co.uk
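As a back-of-the-envelope check of the numbers above (an illustrative sketch only, not kernel code; the helper name is made up), the hysteresis register counts in 1280ns units per the commit message, so the old and new values translate to roughly 32us and 320us respectively:

    #include <stdio.h>

    /* Hysteresis unit quoted in the commit message: 1 count = 1280ns. */
    #define PG_HYSTERESIS_UNIT_NS 1280ULL

    static unsigned long long pg_hysteresis_to_us(unsigned int count)
    {
            return count * PG_HYSTERESIS_UNIT_NS / 1000;
    }

    int main(void)
    {
            /* Old value: ~32us, shorter than the ~250us CS service latency. */
            printf("25  counts -> %lluus\n", pg_hysteresis_to_us(25));
            /* New value: ~320us, comfortably above the observed service latency. */
            printf("250 counts -> %lluus\n", pg_hysteresis_to_us(250));
            return 0;
    }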
Diffstat (limited to 'drivers/gpu/drm/i915/intel_pm.c')
-rw-r--r--   drivers/gpu/drm/i915/intel_pm.c   26
1 file changed, 23 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 1db79a860b96..0b92ea1dbd40 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -6626,9 +6626,29 @@ static void gen9_enable_rc6(struct drm_i915_private *dev_priv)
I915_WRITE(GEN6_RC_SLEEP, 0);
- /* 2c: Program Coarse Power Gating Policies. */
- I915_WRITE(GEN9_MEDIA_PG_IDLE_HYSTERESIS, 25);
- I915_WRITE(GEN9_RENDER_PG_IDLE_HYSTERESIS, 25);
+ /*
+ * 2c: Program Coarse Power Gating Policies.
+ *
+ * Bspec's guidance is to use 25us (really 25 * 1280ns) here. What we
+ * use instead is a more conservative estimate for the maximum time
+ * it takes us to service a CS interrupt and submit a new ELSP - that
+ * is the time which the GPU is idle waiting for the CPU to select the
+ * next request to execute. If the idle hysteresis is less than that
+ * interrupt service latency, the hardware will automatically gate
+ * the power well and we will then incur the wake up cost on top of
+ * the service latency. A similar guide from intel_pstate is that we
+ * do not want the enable hysteresis to be less than the wakeup latency.
+ *
+ * igt/gem_exec_nop/sequential provides a rough estimate for the
+ * service latency, and puts it around 10us for Broadwell (and other
+ * big core) and around 40us for Broxton (and other low power cores).
+ * [Note that for legacy ringbuffer submission, this is less than 1us!]
+ * However, the wakeup latency on Broxton is closer to 100us. To be
+ * conservative, we have to factor in a context switch on top (due
+ * to ksoftirqd).
+ */
+ I915_WRITE(GEN9_MEDIA_PG_IDLE_HYSTERESIS, 250);
+ I915_WRITE(GEN9_RENDER_PG_IDLE_HYSTERESIS, 250);
/* 3a: Enable RC6 */
I915_WRITE(GEN6_RC6_THRESHOLD, 37500); /* 37.5/125ms per EI */
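The comment added above describes a simple cost model: if the power well gates before the CPU has submitted the next request (i.e. the hysteresis is shorter than the CS service latency), the ~100us power-well wake-up is paid on top of that service latency. A rough sketch of that model, using the ballpark figures from the comment (illustrative values, not measurements, and not kernel code):

    #include <stdio.h>

    #define WAKEUP_COST_US 100 /* gen9 power well ramp-up, per the commit message */

    static unsigned int effective_latency_us(unsigned int service_us,
                                             unsigned int hysteresis_us)
    {
            /* The power well only gates if we stay idle longer than the hysteresis. */
            if (hysteresis_us < service_us)
                    return service_us + WAKEUP_COST_US;
            return service_us;
    }

    int main(void)
    {
            /* Broxton-like low power core, ~40us service latency (from the comment). */
            printf("hysteresis 25 * 1280ns  (~32us):  %uus\n",
                   effective_latency_us(40, 32));   /* 140us: wake-up cost incurred */
            printf("hysteresis 250 * 1280ns (~320us): %uus\n",
                   effective_latency_us(40, 320));  /* 40us: power well stays up */
            return 0;
    }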