drm/i915: Only emit one semaphore per request

Ideally we only need one semaphore per ring to accommodate waiting on multiple engines in parallel. However, since we do not know which fences we will finally be waiting on, we emit a semaphore for every fence. It turns out to be quite easy to trick ourselves into exhausting our ringbuffer causing an error, just by feeding in a batch that depends on several thousand contexts.

Since we never can be waiting on more than one semaphore in parallel (other than perhaps the desire to busywait on multiple engines), just pick the first fence for our semaphore. If we pick the wrong fence to busywait on, we just miss an opportunity to reduce latency.

An adaption might be to use sched.flags as either a semaphore counter, or to track the first busywait on each engine, converting it back to a single use bit prior to closing the request.

v2: Track first semaphore used per-engine (this caters for our basic igt that semaphores are working).

Reported-by: Mika Kuoppala <mika.kuoppala@intel.com>
Testcase: igt/gem_exec_fence/long-history
Fixes: e88619646971 ("drm/i915: Use HW semaphores for inter-engine synchronisation on gen8+")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190401162641.10963-3-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
author: Chris Wilson 2019-04-01 18:26:41 +0200
committer: Chris Wilson 2019-04-02 16:52:09 +0200
commit: 7881e6057586b0bdaaffef13d9f88c95a86ba484 (patch)
tree: 5f119a17c621eebeb9458ab61c6d228b241c98f7 /drivers/gpu/drm/i915/i915_request.c
parent: drm/i915: Split out i915_priolist_types into its own header (diff)
1 files changed, 9 insertions, 2 deletions
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index e9c2094ab8ea..82094b9f5ba7 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -783,6 +783,12 @@ emit_semaphore_wait(struct i915_request *to,
 	GEM_BUG_ON(!from->timeline->has_initial_breadcrumb);
 	GEM_BUG_ON(INTEL_GEN(to->i915) < 8);
 
+	/* Just emit the first semaphore we see as request space is limited. */
+	if (to->sched.semaphores & from->engine->mask)
+		return i915_sw_fence_await_dma_fence(&to->submit,
+						     &from->fence, 0,
+						     I915_FENCE_GFP);
+
 	/* We need to pin the signaler's HWSP until we are finished reading. */
 	err = i915_timeline_read_hwsp(from, to, &hwsp_offset);
 	if (err)
@@ -814,7 +820,8 @@ emit_semaphore_wait(struct i915_request *to,
 	*cs++ = 0;
 
 	intel_ring_advance(to, cs);
-	to->sched.flags |= I915_SCHED_HAS_SEMAPHORE;
+	to->sched.semaphores |= from->engine->mask;
+	to->sched.flags |= I915_SCHED_HAS_SEMAPHORE_CHAIN;
 	return 0;
 }
 
@@ -1126,7 +1133,7 @@ void i915_request_add(struct i915_request *request)
 		 * far in the distance past over useful work, we keep a history
 		 * of any semaphore use along our dependency chain.
 		 */
-		if (!(request->sched.flags & I915_SCHED_HAS_SEMAPHORE))
+		if (!(request->sched.flags & I915_SCHED_HAS_SEMAPHORE_CHAIN))
 			attr.priority |= I915_PRIORITY_NOSEMAPHORE;
 
 		/*
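
The per-engine tracking in the hunks above can be modelled outside the driver. The following is a minimal sketch, not the driver's code: `struct sketch_request`, `sketch_await()` and the two counters are hypothetical stand-ins, and the software fallback is reduced to a counter where the patch calls `i915_sw_fence_await_dma_fence()`. It only illustrates that ring usage is bounded by the number of engines rather than by the number of fences.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the waiting request; not the real i915 struct. */
struct sketch_request {
	uint32_t semaphores;   /* one bit per engine we already busywait on */
	unsigned int emitted;  /* semaphore commands written into the ring */
	unsigned int sw_waits; /* fences left to the software fallback */
};

/*
 * Await a fence whose signaler runs on the engine(s) in from_engine_mask.
 * Returns true if a semaphore was emitted, false if the fence was handed
 * to the (here only counted) software wait because this request already
 * holds a semaphore for that engine.
 */
static bool sketch_await(struct sketch_request *to, uint32_t from_engine_mask)
{
	assert(from_engine_mask != 0);

	/* Only the first semaphore per engine is emitted; ring space is limited. */
	if (to->semaphores & from_engine_mask) {
		to->sw_waits++;
		return false;
	}

	to->emitted++;
	to->semaphores |= from_engine_mask;
	return true;
}
```

Feeding several thousand fences from a single engine into `sketch_await()` increments `emitted` once and `sw_waits` for every later fence, which mirrors the situation described for igt/gem_exec_fence/long-history under the assumptions of this model.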