summaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/i915/intel_ringbuffer.c
Commit message (Collapse)AuthorAgeFilesLines
* drm/i915: Fix PIPE_CONTROL DW/QW write through global GTT on IVB+Ville Syrjälä2013-02-201-1/+2
| | | | | | | | | | | | | | | | The bit controlling whether PIPE_CONTROL DW/QW write targets the global GTT or PPGTT moved moved from DW 2 bit 2 to DW 1 bit 24 on IVB. I verified on IVB that the fix is in fact effective. Without the fix none of the scratch writes actually landed in the pipe control page. With the fix the writes show up correctly. v2: move PIPE_CONTROL_GLOBAL_GTT_IVB setup to where other flags are set Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: Print the pipe control page GTT addressVille Syrjälä2013-02-201-0/+3
| | | | | | | | | We already print the HWS addresses during init, so do the same for the pipe control page. Reduces guesswork when looking at hex addresses later. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* Merge branch 'fbcon-locking-fixes' of ↵Dave Airlie2013-02-081-6/+18
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | ssh://people.freedesktop.org/~airlied/linux into drm-next This pulls in most of Linus tree up to -rc6, this fixes the worst lockdep reported issues and re-enables fbcon lockdep. (not the fbcon maintainer) * 'fbcon-locking-fixes' of ssh://people.freedesktop.org/~airlied/linux: (529 commits) Revert "Revert "console: implement lockdep support for console_lock"" fbcon: fix locking harder fb: Yet another band-aid for fixing lockdep mess fb: rework locking to fix lock ordering on takeover
| * drm/i915: GFX_MODE Flush TLB Invalidate Mode must be '1' for scanline waitsChris Wilson2013-01-231-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On SNB, if bit 13 of GFX_MODE, Flush TLB Invalidate Mode, is not set to 1, the hardware can not program the scanline values. Those scanline values then control when the signal is sent from the display engine to the render ring for MI_WAIT_FOR_EVENTs. Note setting this bit means that TLB invalidations must be performed explicitly through the appropriate bits being set in PIPE_CONTROL. References: https://bugzilla.kernel.org/show_bug.cgi?id=52311 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
| * drm/i915: Disable AsyncFlip performance optimisationsChris Wilson2013-01-231-6/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a required workarounds for all products, especially on gen6+ where it causes the command streamer to fail to parse instructions following a WAIT_FOR_EVENT. We use WAIT_FOR_EVENT for synchronising between the GPU and the display engines, and so this bit being unset may cause hangs. References: https://bugzilla.kernel.org/show_bug.cgi?id=52311 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* | drm/i915: use gem_set_seqno() on hardware initMika Kuoppala2013-01-221-2/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When machine was rebooted or module was reloaded, gem_hw_init() set last_seqno to be identical to next_seqno. This lead to situation that waits for first ever request always passed immediately regardless if it was actually executed. Use gem_set_seqno() to be consistent how hw is initialized on init, wrap and on resume. Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* | drm/i915: move wedged to the other gpu error handling stuffDaniel Vetter2013-01-201-2/+4
| | | | | | | | | | | | | | | | | | | | | | And to make Ben Widawsky happier, use the gpu_error instead of the entire device as the argument in some functions. Drop the outdated comment on ->wedged for now, a follow-up patch will change the semantics and add a proper comment again. Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* | drm/i915: extract hangcheck/reset/error_state state into substructDaniel Vetter2013-01-201-1/+1
| | | | | | | | | | | | | | | | | | This has been sprinkled all over the place in dev_priv. I think it'd be good to also move all the code into a separate file like i915_gem_error.c, but that's for another patch. Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* | drm/i915: Remove use on gma_bus_addr on gen6+Ben Widawsky2013-01-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | We have enough info to not use the intel_gtt bridge stuff. v2: Move setup of mappable_base above the legacy init stuff because we still need that on older platforms. (Daniel) v3: Remove the dev_priv hunk which was rebased in by accident Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> (v2) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* | Merge tag 'drm-intel-next-2012-12-21' of ↵Dave Airlie2013-01-171-18/+81
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://people.freedesktop.org/~danvet/drm-intel into drm-next Daniel writes: - seqno wrap fixes and debug infrastructure from Mika Kuoppala and Chris Wilson - some leftover kill-agp on gen6+ patches from Ben - hotplug improvements from Damien - clear fb when allocated from stolen, avoids dirt on the fbcon (Chris) - Stolen mem support from Chris Wilson, one of the many steps to get to real fastboot support. - Some DDI code cleanups from Paulo. - Some refactorings around lvds and dp code. - some random little bits&pieces * tag 'drm-intel-next-2012-12-21' of git://people.freedesktop.org/~danvet/drm-intel: (93 commits) drm/i915: Return the real error code from intel_set_mode() drm/i915: Make GSM void drm/i915: Move GSM mapping into dev_priv drm/i915: Move even more gtt code to i915_gem_gtt drm/i915: Make next_seqno debugs entry to use i915_gem_set_seqno drm/i915: Introduce i915_gem_set_seqno() drm/i915: Always clear semaphore mboxes on seqno wrap drm/i915: Initialize hardware semaphore state on ring init drm/i915: Introduce ring set_seqno drm/i915: Missed conversion to gtt_pte_t drm/i915: Bug on unsupported swizzled platforms drm/i915: BUG() if fences are used on unsupported platform drm/i915: fixup overlay stolen memory leak drm/i915: clean up PIPECONF bpc #defines drm/i915: add intel_dp_set_signal_levels drm/i915: remove leftover display.update_wm assignment drm/i915: check for the PCH when setting pch_transcoder drm/i915: Clear the stolen fb before enabling drm/i915: Access to snooped system memory through the GTT is incoherent drm/i915: Remove stale comment about intel_dp_detect() ... Conflicts: drivers/gpu/drm/i915/intel_display.c
| * drm/i915: Initialize hardware semaphore state on ring initMika Kuoppala2012-12-191-15/+9Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Hardware status page needs to have proper seqno set as our initial seqno can be arbitrary. If initial seqno is close to wrap boundary on init and i915_seqno_passed() (31bit space) refers to hw status page which contains zero, errorneous result will be returned. v2: clear mboxes and set hws page directly instead of going through rings. Suggested by Chris Wilson. v3: hws needs to be updated for all gens. Noticed by Chris Wilson. References: https://bugs.freedesktop.org/show_bug.cgi?id=58230 Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
| * drm/i915: Introduce ring set_seqnoMika Kuoppala2012-12-191-0/+20
| | | | | | | | | | | | | | | | | | In preparation for setting per ring initial seqno values add ring::set_seqno(). Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
| * drm/i915: Don't emit semaphore wait if wrap happenedMika Kuoppala2012-12-111-5/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If wrap just happened we need to prevent emitting waits for pre wrap values. Detect this and emit no-ops instead. v2: Use olr > seqno to detect wrap instead of *seqno == 0 as suggested by Chris Wilson. v3: Use last used seqno to detect the wraparound. From Chris Wilson v4: Fixed unnecessary last_seqno assigment References: https://bugs.freedesktop.org/show_bug.cgi?id=57967 Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
| * drm/i915: Add intel_ring_handle_seqno wrapMika Kuoppala2012-12-061-0/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If there are pre-wrap values in semaphore-mbox registers after wrap, syncing against some after-wrap request will complete immediately. Fix this by emitting ring commands to set mbox registers to zero when the wrap happens. v2: Use __intel_ring_begin to emit ring commands, from Chris Wilson. Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Add a small comment to handle_seqno_wrap.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
| * drm/i915: Split intel_ring_beginMika Kuoppala2012-12-061-15/+22
| | | | | | | | | | | | | | | | | | | | In preparation for handling ring seqno wrapping, split intel_ring_begin into helper part which doesn't allocate seqno. Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
| * drm/i915: Allocate ringbuffers from stolen memoryChris Wilson2012-11-301-1/+5
| | | | | | | | | | | | | | Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Acked-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* | drm/i915: Implement workaround for broken CS tlb on i830/845Daniel Vetter2012-12-171-8/+68
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that Chris Wilson demonstrated that the key for stability on early gen 2 is to simple _never_ exchange the physical backing storage of batch buffers I've tried a stab at a kernel solution. Doesn't look too nefarious imho, now that I don't try to be too clever for my own good any more. v2: After discussing the various techniques, we've decided to always blit batches on the suspect devices, but allow userspace to opt out of the kernel workaround assume full responsibility for providing coherent batches. The principal reason is that avoiding the blit does improve performance in a few key microbenchmarks and also in cairo-trace replays. Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: - Drop the hunk which uses HAS_BROKEN_CS_TLB to implement the ring wrap w/a. Suggested by Chris Wilson. - Also add the ACTHD check from Chris Wilson for the error state dumping, so that we still catch batches when userspace opts out of the w/a.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* | drm/i915: Don't allow ring tail to reach the same cacheline as headVille Syrjälä2012-12-031-2/+2
|/ | | | | | | | | | | | | | | | | | | | | | From BSpec: "If the Ring Buffer Head Pointer and the Tail Pointer are on the same cacheline, the Head Pointer must not be greater than the Tail Pointer." The easiest way to enforce this is to reduce the reported ring space. References: Gen2 BSpec "1. Programming Environment" / 1.4.4.6 "Ring Buffer Use" Gen3 BSpec "vol1c Memory Interface Functions" / 2.3.4.5 "Ring Buffer Use" Gen4+ BSpec "vol1c Memory Interface and Command Stream" / 5.3.4.5 "Ring Buffer Use" v2: Include the exact BSpec references in the description v3: s/64/I915_RING_FREE_SPACE, and add the BSpec information to the code Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: Rearrange code to only have a single method for waiting upon the ringChris Wilson2012-11-291-25/+48
| | | | | | | | | | Replace the wait for the ring to be clear with the more common wait for the ring to be idle. The principle advantage is one less exported intel_ring_wait function, and the removal of a hardcoded value. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: Preallocate next seqno before touching the ringChris Wilson2012-11-291-23/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Based on the work by Mika Kuoppala, we realised that we need to handle seqno wraparound prior to committing our changes to the ring. The most obvious point then is to grab the seqno inside intel_ring_begin(), and then to reuse that seqno for all ring operations until the next request. As intel_ring_begin() can fail, the callers must already be prepared to handle such failure and so we can safely add further checks. This patch looks like it should be split up into the interface changes and the tweaks to move seqno wrapping from the execbuffer into the core seqno increment. However, I found no easy way to break it into incremental steps without introducing further broken behaviour. v2: Mika found a silly mistake and a subtle error in the existing code; inside i915_gem_retire_requests() we were resetting the sync_seqno of the target ring based on the seqno from this ring - which are only related by the order of their allocation, not retirement. Hence we were applying the optimisation that the rings were synchronised too early, fortunately the only real casualty there is the handling of seqno wrapping. v3: Do not forget to reset the sync_seqno upon module reinitialisation, ala resume. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=863861 Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> [v2] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: Use LRI to update the semaphore registersChris Wilson2012-11-211-5/+2Star
| | | | | | | | | | | | | | | The bspec was recently updated to remove the ability to update the semaphore using the MI_SEMAPHORE_BOX command, the ability to wait upon the semaphore value remained. Instead the advice is to update the register using the MI_LOAD_REGISTER_IMM command. In cursory testing, semaphores continue to function - the question is whether this fixes some of the deadlocks where the semaphore registers contained stale values? Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel J Blueman <daniel@quora.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: Restore physical HWS_PGA after resumeChris Wilson2012-11-161-10/+35
| | | | | | | | | | | | | | | | | | | By always setting up the HWS register for both physical and virtual address variations during render ring we can reduce the number of different special cases that get set up at varying different times during module load. Fixes regression from commit c630119f43471a8ece356b01dabf07f944f453b3 Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Wed Oct 17 11:32:57 2012 +0200 drm/i915: don't save/restore HWS_PGA reg for kms Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: drop the double-OP_STOREDW usage in blt_ring_flushDaniel Vetter2012-11-111-1/+1
| | | | | | | | | This has been introduced in "drm/i915: TLB invalidation with MI_FLUSH_DW requires a post-sync op". Reported-by: Fengguang Wu <fengguang.wu@intel.com> Reported-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: PIPE_CONTROL TLB invalidate requires CS stallJesse Barnes2012-11-111-1/+1
| | | | | | | | | | | | "If ENABLED, PIPE_CONTROL command will flush the in flight data written out by render engine to Global Observation point on flush done. Also Requires stall bit ([20] of DW1) set." So set the stall bit to ensure proper invalidation. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Antti Koskipää <antti.koskipaa@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: TLB invalidation with MI_FLUSH_DW requires a post-sync op v3Jesse Barnes2012-11-111-4/+18
| | | | | | | | | | | | | | So store into the scratch space of the HWS to make sure the invalidate occurs. v2: use GTT address space for store, clean up #defines (Chris) v3: use correct #define in blt ring flush (Chris) Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Antti Koskipää <antti.koskipaa@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> References: https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/1063252 Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915/ringbuffer: exclude last 2 cachelines on 845g on all callpathsMika Kuoppala2012-11-111-1/+1
| | | | | | | | | | | | | | Make intel_render_ring_init_dri and intel_init_ring_buffer symmetrical with regards of workaround introduced by: commit 27c1cbd06a7620b354cbb363834f3bb8df4f410d Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Mon Apr 9 13:59:46 2012 +0100 drm/i915/ringbuffer: Exclude last 2 cachlines of ring on 845g Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* Merge tag 'v3.7-rc2' into drm-intel-next-queuedDaniel Vetter2012-10-221-3/+2Star
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Linux 3.7-rc2 Backmerge to solve two ugly conflicts: - uapi. We've already added new ioctl definitions for -next. Do I need to say more? - wc support gtt ptes. We've had to revert this for snb+ for 3.7 and also fix a few other things in the code. Now we know how to make it work on snb+, but to avoid losing the other fixes do the backmerge first before re-enabling wc gtt ptes on snb+. And a few other minor things, among them git getting confused in intel_dp.c and seemingly causing a conflict out of nothing ... Conflicts: drivers/gpu/drm/i915/i915_reg.h drivers/gpu/drm/i915/intel_display.c drivers/gpu/drm/i915/intel_dp.c drivers/gpu/drm/i915/intel_modes.c include/drm/i915_drm.h Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
| * Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds2012-10-041-19/+133
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull drm merge (part 1) from Dave Airlie: "So first of all my tree and uapi stuff has a conflict mess, its my fault as the nouveau stuff didn't hit -next as were trying to rebase regressions out of it before we merged. Highlights: - SH mobile modesetting driver and associated helpers - some DRM core documentation - i915 modesetting rework, haswell hdmi, haswell and vlv fixes, write combined pte writing, ilk rc6 support, - nouveau: major driver rework into a hw core driver, makes features like SLI a lot saner to implement, - psb: add eDP/DP support for Cedarview - radeon: 2 layer page tables, async VM pte updates, better PLL selection for > 2 screens, better ACPI interactions The rest is general grab bag of fixes. So why part 1? well I have the exynos pull req which came in a bit late but was waiting for me to do something they shouldn't have and it looks fairly safe, and David Howells has some more header cleanups he'd like me to pull, that seem like a good idea, but I'd like to get this merge out of the way so -next dosen't get blocked." Tons of conflicts mostly due to silly include line changes, but mostly mindless. A few other small semantic conflicts too, noted from Dave's pre-merged branch. * 'drm-next' of git://people.freedesktop.org/~airlied/linux: (447 commits) drm/nv98/crypt: fix fuc build with latest envyas drm/nouveau/devinit: fixup various issues with subdev ctor/init ordering drm/nv41/vm: fix and enable use of "real" pciegart drm/nv44/vm: fix and enable use of "real" pciegart drm/nv04/dmaobj: fixup vm target handling in preparation for nv4x pcie drm/nouveau: store supported dma mask in vmmgr drm/nvc0/ibus: initial implementation of subdev drm/nouveau/therm: add support for fan-control modes drm/nouveau/hwmon: rename pwm0* to pmw1* to follow hwmon's rules drm/nouveau/therm: calculate the pwm divisor on nv50+ drm/nouveau/fan: rewrite the fan tachometer driver to get more precision, faster drm/nouveau/therm: move thermal-related functions to the therm subdev drm/nouveau/bios: parse the pwm divisor from the perf table drm/nouveau/therm: use the EXTDEV table to detect i2c monitoring devices drm/nouveau/therm: rework thermal table parsing drm/nouveau/gpio: expose the PWM/TOGGLE parameter found in the gpio vbios table drm/nouveau: fix pm initialization order drm/nouveau/bios: check that fixed tvdac gpio data is valid before using it drm/nouveau: log channel debug/error messages from client object rather than drm client drm/nouveau: have drm debugging macros build on top of core macros ...
| * | UAPI: (Scripted) Convert #include "..." to #include <path/...> in drivers/gpu/David Howells2012-10-021-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert #include "..." to #include <path/...> in drivers/gpu/. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Dave Airlie <airlied@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Dave Jones <davej@redhat.com>
| * | UAPI: (Scripted) Remove redundant DRM UAPI header #inclusions from drivers/gpu/.David Howells2012-10-021-1/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove redundant DRM UAPI header #inclusions from drivers/gpu/. Remove redundant #inclusions of core DRM UAPI headers (drm.h, drm_mode.h and drm_sarea.h). They are now #included via drmP.h and drm_crtc.h via a preceding patch. Without this patch and the patch to make include the UAPI headers from the core headers, after the UAPI split, the DRM C sources cannot find these UAPI headers because the DRM code relies on specific -I flags to make #include "..." work on headers in include/drm/ - but that does not work after the UAPI split without adding more -I flags. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Dave Airlie <airlied@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Dave Jones <davej@redhat.com>
* | | drm/i915: Allow DRM_ROOT_ONLY|DRM_MASTER to submit privileged batchbuffersChris Wilson2012-10-171-9/+39
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the introduction of per-process GTT space, the hardware designers thought it wise to also limit the ability to write to MMIO space to only a "secure" batch buffer. The ability to rewrite registers is the only way to program the hardware to perform certain operations like scanline waits (required for tear-free windowed updates). So we either have a choice of adding an interface to perform those synchronized updates inside the kernel, or we permit certain processes the ability to write to the "safe" registers from within its command stream. This patch exposes the ability to submit a SECURE batch buffer to DRM_ROOT_ONLY|DRM_MASTER processes. v2: Haswell split up bit8 into a ppgtt bit (still bit8) and a security bit (bit 13, accidentally not set). Also add a comment explaining why secure batches need a global gtt binding. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v1) [danvet: added hsw fixup.] Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* | drm/i915: Replace the array of pages with a scatterlistChris Wilson2012-09-201-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rather than have multiple data structures for describing our page layout in conjunction with the array of pages, we can migrate all users over to a scatterlist. One major advantage, other than unifying the page tracking structures, this offers is that we replace the vmalloc'ed array (which can be up to a megabyte in size) with a chain of individual pages which helps reduce memory pressure. The disadvantage is that we then do not have a simple array to iterate, or to access randomly. The common case for this is in the relocation processing, which will typically fit within a single scatterlist page and so be almost the same cost as the simple array. For iterating over the array, the extra function call could be optimised away, but in reality is an insignificant cost of either binding the pages, or performing the pwrite/pread. v2: Fix drm_clflush_sg() to not invoke wbinvd as well! And fix the trivial compile error from rebasing. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* | drm/i915: add workarounds to gen7_render_ring_flushPaulo Zanoni2012-09-031-5/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | From Bspec, Vol 2a, Section 1.9.3.4 "PIPE_CONTROL", intro section detailing the various workarounds: "[DevIVB {W/A}, DevHSW {W/A}]: Pipe_control with CS-stall bit set must be issued before a pipe-control command that has the State Cache Invalidate bit set." Note that public Bspec has different numbering, it's Vol2Part1, Section 1.10.4.1 "PIPE_CONTROL" there. There's also a second workaround for the PIPE_CONTROL command itself: "[DevIVB, DevVLV, DevHSW] {WA}: Every 4th PIPE_CONTROL command, not counting the PIPE_CONTROL with only read-cache-invalidate bit(s) set, must have a CS_STALL bit set" For simplicity we simply set the CS_STALL bit on every pipe_control on gen7+ Note that this massively helps on some hsw machines, together with the following patch to unconditionally set the CS_STALL bit on every pipe_control it prevents a gpu hang every few seconds. This is a regression that has been introduced in the pipe_control cleanup: commit 6c6cf5aa9c583478b19e23149feaa92d01fb8c2d Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Fri Jul 20 18:02:28 2012 +0100 drm/i915: Only apply the SNB pipe control w/a to gen6 It looks like the massive snb pipe_control workaround also papered over any issues on ivb and hsw. Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> [danvet: squashed both workarounds together, pimped commit message with Bsepc citations, regression commit citation and changed the comment in the code a bit to clarify that we unconditionally set CS_STALL to avoid being hurt by trying to be clever.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* | drm/i915: add workarounds directly to gen6_render_ring_flushPaulo Zanoni2012-09-031-15/+6Star
| | | | | | | | | | | | | | Since gen 7+ now run the new gen7_render_ring_flush function. Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* | drm/i915: add gen7_render_ring_flushPaulo Zanoni2012-09-031-1/+49
| | | | | | | | | | | | | | | | For now, just a copy of gen6_render_ring_flush. Different gens have different workarounds, so we want different functions. Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* | drm/i915: Only pwrite through the GTT if there is space in the apertureChris Wilson2012-08-241-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | Avoid stalling and waiting for the GPU by checking to see if there is sufficient inactive space in the aperture for us to bind the buffer prior to writing through the GTT. If there is inadequate space we will have to stall waiting for the GPU, and incur overheads moving objects about. Instead, only incur the clflush overhead on the target object by writing through shmem. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* | Merge tag 'v3.6-rc2' into drm-intel-nextDaniel Vetter2012-08-171-16/+28
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | Backmerge Linux 3.6-rc2 to resolve a few funny conflicts before we put even more madness on top: - drivers/gpu/drm/i915/i915_irq.c: Just a spurious WARN removed in -fixes, that has been changed in a variable-rename in -next, too. - drivers/gpu/drm/i915/intel_ringbuffer.c: -next remove scratch_addr (since all their users have been extracted in another fucntion), -fixes added another user for a hw workaroudn. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
| * drm/i915: Apply post-sync write for pipe control invalidatesChris Wilson2012-08-141-18/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When invalidating the TLBs it is documentated as requiring a post-sync write. Failure to do so seems to result in a GPU hang. Exposure to this hang on IVB seems to be a result of removing the extra stalls required for SNB pipecontrol workarounds: commit 6c6cf5aa9c583478b19e23149feaa92d01fb8c2d Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Fri Jul 20 18:02:28 2012 +0100 drm/i915: Only apply the SNB pipe control w/a to gen6 Note: Manually switch the pipe_control cmd to 4 dwords to avoid a (silent) functional conflict with -next. This way will get a loud (but conflict with next (since the scratch_addr has been deleted there). Reported-and-tested-by: yex.tian@intel.com Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53322 Acked-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: added note about merge conflict with -next.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
| * drm/i915: correctly order the ring init sequenceDaniel Vetter2012-08-081-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | We may only start to set up the new register values after having confirmed that the ring is truely off. Otherwise the hw might lose the newly written register values. This is caught later on in the init sequence, when we check whether the register writes have stuck. Cc: stable@vger.kernel.org Reviewed-by: Jani Nikula <jani.nikula@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50522 Tested-by: Yang Guang <guang.a.yang@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* | drm/i915: Lazily apply the SNB+ seqno w/aChris Wilson2012-08-101-6/+4Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Avoid the forcewake overhead when simply retiring requests, as often the last seen seqno is good enough to satisfy the retirment process and will be promptly re-run in any case. Only ensure that we force the coherent seqno read when we are explicitly waiting upon a completion event to be sure that none go missing, and also for when we are reporting seqno values in case of error or debugging. This greatly reduces the load for userspace using the busy-ioctl to track active buffers, for instance halving the CPU used by X in pushing the pixels from a software render (flash). The effect will be even more magnified with userptr and so providing a zero-copy upload path in that instance, or in similar instances where X is simply compositing DRI buffers. v2: Reverse the polarity of the tachyon stream. Daniel suggested that 'force' was too generic for the parameter name and that 'lazy_coherency' better encapsulated the semantics of it being an optimization and its purpose. Also notice that gen6_get_seqno() is only used by gen6/7 chipsets and so the test for IS_GEN6 || IS_GEN7 is redundant in that function. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* | drm/i915: Only apply the SNB pipe control w/a to gen6Chris Wilson2012-08-081-13/+20
| | | | | | | | | | | | | | | | | | The requirements for the sync flush to be emitted prior to the render cache flush is only true for SandyBridge. On IvyBridge and friends we can just emit the flushes with an inline CS stall. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* | drm/i915: Macro to determine DPF supportBen Widawsky2012-07-251-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Originally I had a macro specifically for DPF support, and Daniel, with good reason asked me to change it to this. It's not the way I would have gone (and indeed I didn't), but for now there is no distinction as all platforms with L3 also have DPF. Note: The good reasons are that dpf is a l3$ feature (at least on currrent hw), hence I don't expect one to go without the other. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: added note] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* | drm/i915: Split i915_gem_flush_ring() into seperate invalidate/flush funcsChris Wilson2012-07-251-0/+38
| | | | | | | | | | | | | | | | | | By moving the function to intel_ringbuffer and currying the appropriate parameter, hopefully we make the callsites easier to read and understand. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* | drm/i915: Remove the per-ring write listChris Wilson2012-07-251-2/+0Star
|/ | | | | | | | This is now handled by a global flag to ensure we emit a flush before the next serialisation point (if we failed to queue one previously). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: missing error case in init status pageBen Widawsky2012-07-201-0/+1
| | | | | Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: Add comments to explain the BSD tail write workaroundChris Wilson2012-07-201-8/+19
| | | | | | | | | | | | | | | | | Having had to dive into the bspec to understand what each stage of the workaround meant, and how that the ring broadcasting IDLE corresponded with the GT powering down the ring (i.e. rc6) add comments to aide the next reader. And since the register "is used to control all aspects of PSMI and power saving functions" that makes it quite interesting to inspect with regards to RC6 hangs, so add it to the error-state. v2: Rediscover the piece of magic, set the RNCID to 0 before waiting for the ring to wake up. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: don't return a spurious -EIO from intel_ring_beginDaniel Vetter2012-07-051-14/+4Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The issue with this check is that it results in userspace receiving an -EIO while the gpu reset hasn't completed, resulting in fallback to sw rendering or worse. Now there's also a stern comment in intel_ring_wait_seqno saying that intel_ring_begin should not return -EAGAIN, ever, because some callers can't handle that. But after an audit of the callsites I don't see any issues. I guess the last problematic spot disappeared with the removal of the pipelined fencing code. So do the right thing and call check_wedge, which should properly decide whether an -EAGAIN or -EIO is appropriate if wedged is set. Note that the early check for a wedged gpu before touching the ring is rather important (and it took me quite some time of acting like the densest doofus to figure that out): If we don't do that and the gpu died for good, not having been resurrect by the reset code, userspace can merrily fill up the entire ring until it notices that something is amiss. Allowing userspace to emit more render, despite that we know that it will fail can't lead to anything good (and by experience can lead to all sorts of havoc, including angering the OOM gods and hard-hanging the hw for good). v2: Fix EAGAIN mispell, noticed by Chris Wilson. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: non-interruptible sleeps can't handle -EAGAINDaniel Vetter2012-07-051-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | So don't return -EAGAIN, even in the case of a gpu hang. Remap it to -EIO instead. Note that this isn't really an issue with interruptability, but more that we have quite a few codepaths (mostly around kms stuff) that simply can't handle any errors and hence not even -EAGAIN. Instead of adding proper failure paths so that we could restart these ioctls we've opted for the cheap way out of sleeping non-interruptibly. Which works everywhere but when the gpu dies, which this patch fixes. So essentially interruptible == false means 'wait for the gpu or die trying'.' This patch is a bit ugly because intel_ring_begin is all non-interruptible and hence only returns -EIO. But as the comment in there says, auditing all the callsites would be a pain. To avoid duplicating code, reuse i915_gem_check_wedge in __wait_seqno and intel_wait_ring_buffer. Also use the opportunity to clarify the different cases in i915_gem_check_wedge a bit with comments. v2: Don't access dev_priv->mm.interruptible from check_wedge - we might not hold dev->struct_mutex, making this racy. Instead pass interruptible in as a parameter. I've noticed this because I've hit a BUG_ON(!mutex_is_locked) at the top of check_wedge. This has been added in commit b4aca0106c466b5a0329318203f65bac2d91b682 Author: Ben Widawsky <ben@bwidawsk.net> Date: Wed Apr 25 20:50:12 2012 -0700 drm/i915: extract some common olr+wedge code although that commit is missing any justification for this. I guess it's just copy&paste, because the same commit add the same BUG_ON check to check_olr, where it indeed makes sense. But in check_wedge everything we access is protected by other means, so this is superflous. And because it now gets in the way (we add a new caller in __wait_seqno, which can be called without dev->struct_mutext) let's just remove it. v3: Group all the i915_gem_check_wedge refactoring into this patch, so that this patch here is all about not returning -EAGAIN to callsites that can't handle syscall restarting. v4: Add clarification what interuptible == fales means in our code, requested by Ben Widawsky. v5: Fix EAGAIN mispell noticed by Chris Wilson. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: "Flush Me Harder" required on gen6+Daniel Vetter2012-06-281-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The prep to remove the flushing list in commit cc889e0f6ce6a63c62db17d702ecfed86d58083f Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Wed Jun 13 20:45:19 2012 +0200 drm/i915: disable flushing_list/gpu_write_list causes quite some decent regressions. We can fix this by setting the CS_STALL bit to ensure that the following seqno write happens only after the cache flush has completed. But only do that when the caller actually wants the flush (and not also when we invalidate caches before starting the next batch). I've looked through all our ancient scrolls about gen6+ pipe control workarounds, and this seems to be indeed a legal combination: We're allowed to set the CS_STALL bit when we flush the render cache (which we do). While yelling at this code, also pass back the return value from intel_emit_post_sync_nonzero_flush properly. v2: Instead of emitting more pipe controls, set the CS_STALL bit on the write flush as suggested by Chris Wilson. It seems to work, too. Cc: Eric Anholt <eric@anholt.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51436 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51429 Tested-by: Lu Hua <huax.lu@intel.com> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* Merge tag 'v3.5-rc4' into drm-intel-next-queuedDaniel Vetter2012-06-251-3/+18
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I want to merge the "no more fake agp on gen6+" patches into drm-intel-next (well, the last pieces). But a patch in 3.5-rc4 also adds a new use of dev->agp. Hence the backmarge to sort this out, for otherwise drm-intel-next merged into Linus' tree would conflict in the relevant code, things would compile but nicely OOPS at driver load :( Conflicts in this merge are just simple cases of "both branches changed/added lines at the same place". The only tricky part is to keep the order correct wrt the unwind code in case of errors in intel_ringbuffer.c (and the MI_DISPLAY_FLIP #defines in i915_reg.h together, obviously). Conflicts: drivers/gpu/drm/i915/i915_reg.h drivers/gpu/drm/i915/intel_ringbuffer.c Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>