openslx/kernel-qcow2-linux.git - In-kernel qcow2 (Kernel part)

	Commit message	Author	Age	Files	Lines
*	drm/msm: Attach the GPU MMU when it is created	Jordan Crouse	2017-08-22	1	-29/+56
	Currently the GPU MMU is attached in the adreno_gpu code but as more and more of the GPU initialization moves to the generic GPU path we have a need to map and use GPU memory earlier and earlier. There isn't any reason to defer attaching the MMU until later so attach it right after the address space is created so it can be used immediately. Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com>
*	drm/msm: Separate locking of buffer resources from struct_mutex	Sushmita Susheelendra	2017-06-17	1	-3/+1
	Buffer object specific resources like pages, domains, sg list need not be protected with struct_mutex. They can be protected with a buffer object level lock. This simplifies locking and makes it easier to avoid potential recursive locking scenarios for SVM involving mmap_sem and struct_mutex. This also removes unnecessary serialization when creating buffer objects, and also between buffer object creation and GPU command submission. Signed-off-by: Sushmita Susheelendra <ssusheel@codeaurora.org> [robclark: squash in handling new locking for shrinker] Signed-off-by: Rob Clark <robdclark@gmail.com>
*	drm/msm: remove address-space id	Rob Clark	2017-06-16	1	-2/+0
	Now that the msm_gem supports an arbitrary number of vma's, we no longer need to assign an id (index) to each address space. So rip out the associated code. Signed-off-by: Rob Clark <robdclark@gmail.com>
*	drm/msm: pass address-space to _get_iova() and friends	Rob Clark	2017-06-16	1	-3/+3
	No functional change, that will come later. But this will make it easier to deal with dynamically created address spaces (ie. per-process pagetables for gpu). Signed-off-by: Rob Clark <robdclark@gmail.com>
*	drm/msm: fix locking inconsistency for gpu->hw_init()	Rob Clark	2017-06-16	1	-0/+2
	Most, but not all, paths were calling this with struct_mutex held. The fast-path in msm_gem_get_iova() (plus some sub-code-paths that only run the first time) was masking this issue. So let's just always hold struct_mutex for hw_init(). And sprinkle some WARN_ON()'s and might_lock() to avoid this sort of problem in the future. Signed-off-by: Rob Clark <robdclark@gmail.com>
*	drm/msm: Add a struct to pass configuration to msm_gpu_init()	Jordan Crouse	2017-06-16	1	-7/+6
	The amount of information that we need to pass into msm_gpu_init() is steadily increasing, so add a new struct to stabilize the function call and make it easier to add new configuration down the line. Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com>
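	The idea in the commit above can be sketched in plain C. This is only an illustration of the "pass one config struct" pattern: struct gpu_config, struct gpu, gpu_init() and every field name are hypothetical stand-ins, not the driver's actual definitions.

```c
#include <stddef.h>

/* Hypothetical configuration bundle; the fields are illustrative only. */
struct gpu_config {
	const char *ioname;     /* name of the register region */
	const char *irqname;    /* name of the interrupt line */
	unsigned long va_start; /* start of the GPU virtual address range */
	unsigned long va_end;   /* end of the GPU virtual address range */
	unsigned int ringsz;    /* ringbuffer size in bytes */
};

/* Hypothetical GPU object filled in by the init function. */
struct gpu {
	const char *name;
	unsigned long va_size;
	unsigned int ringsz;
};

/*
 * One pointer instead of a growing parameter list: adding a field to
 * struct gpu_config does not change this signature or its callers.
 * Returns 0 on success, -1 on invalid input.
 */
static int gpu_init(struct gpu *gpu, const char *name,
		    const struct gpu_config *config)
{
	if (!gpu || !config || config->va_end <= config->va_start)
		return -1;

	gpu->name = name;
	gpu->va_size = config->va_end - config->va_start;
	gpu->ringsz = config->ringsz;
	return 0;
}
```

	Callers that do not care about a newly added field can leave it zero-initialized, which is what keeps the call sites stable.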
*	drm/msm/gpu: check legacy clk names in get_clocks()	Rob Clark	2017-05-27	1	-2/+2
	Otherwise if someone was using old bindings with "core_clk" instead of "core" as the clock name, we'd never find it and gpu would be stuck at 27MHz (or whatever its slowest rate is). Fixes: 98db803 ("msm/drm: gpu: Dynamically locate the clocks from the device tree") Signed-off-by: Rob Clark <robdclark@gmail.com>
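	A minimal sketch of the lookup this fix describes, assuming the clock names are available as a plain string array. find_core_clock() is a hypothetical helper written for illustration; only the names "core" and "core_clk" come from the commit message.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the index of the core clock in a clock-names list, accepting
 * both the current name "core" and the legacy name "core_clk".
 * Returns -1 if neither name is present.
 */
static int find_core_clock(const char *const *names, size_t count)
{
	for (size_t i = 0; i < count; i++) {
		if (!strcmp(names[i], "core") || !strcmp(names[i], "core_clk"))
			return (int)i;
	}
	return -1;
}
```

	Without the second comparison, a device tree that still uses the legacy name would yield -1 and the clock rate would never be raised.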
*	msm/drm: gpu: Dynamically locate the clocks from the device tree	Jordan Crouse	2017-04-08	1	-23/+55
	Instead of using a fixed list of clock names use the clock-names list in the device tree to discover and get the list of clocks that we need. Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com>
*	drm/msm: Hard code the GPU "slow frequency"	Jordan Crouse	2017-04-08	1	-2/+6
	Some A3XX and A4XX GPU targets required that the GPU clock be programmed to a non zero value when it was disabled so 27MHz was chosen as the "invalid" frequency. Even though newer targets do not have the same clock restrictions we still write 27MHz on clock disable and expect the clock subsystem to round down to zero. For unknown reasons even though the slow clock speed is always 27MHz and it isn't actually a functional level the legacy device tree frequency tables always defined it and then did gymnastics to work around it. Instead of playing the same silly games just hard code the "slow" clock speed in the code as 27MHz and save ourselves a bit of infrastructure. Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com>
*	drm/msm: Make sure to detach the MMU during GPU cleanup	Jordan Crouse	2017-04-08	1	-3/+0
	We should be detaching the MMU before destroying the address space. To do this cleanly, the detach has to happen in adreno_gpu_cleanup() because it needs access to structs in adreno_gpu.c. Plus it is better symmetry to have the attach and detach at the same code level. Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com>
*	drm/msm/gpu: use pm-runtime	Rob Clark	2017-04-08	1	-72/+27
	We need to use pm-runtime properly when IOMMU is using device_link() to control its own clocks. Signed-off-by: Rob Clark <robdclark@gmail.com>
*	drm/msm: drop _clk suffix from clk names	Rob Clark	2017-02-06	1	-4/+3
	Suggested by Rob Herring. We still support the old names for compatibility with downstream android dt files. Cc: Rob Herring <robh@kernel.org> Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Acked-by: Rob Herring <robh@kernel.org>
*	drm/msm: gpu: Add A5XX target support	Jordan Crouse	2016-11-28	1	-3/+10
	Add support for the A5XX family of Adreno GPUs. Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com>
*	drm/msm: Remove 'src_clk' from adreno configuration	Jordan Crouse	2016-11-28	1	-23/+13
	The adreno code inherited a silly workaround from downstream from the bad old days before decent clock control. grp_clk[0] (named 'src_clk') doesn't actually exist - it was used as a proxy for whatever the core clock actually was (usually 'core_clk'). All targets should be able to correctly request 'core_clk' and get the right thing back so zap the anachronism and directly use grp_clk[0] to control the clock rate. Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com>
*	drm/msm: convert iova to 64b	Rob Clark	2016-11-28	1	-1/+1
	For a5xx the gpu is 64b so we need to change iova to 64b everywhere. On the display side, iova is still 32b so it can ignore the upper bits. (Although all the armv8 devices have an iommu that can map 64b pa to 32b iova.) Signed-off-by: Rob Clark <robdclark@gmail.com>
*	drm/msm: support multiple address spaces	Rob Clark	2016-11-27	1	-7/+12
	We can have various combinations of 64b and 32b address space, ie. 64b CPU but 32b display and gpu, or 64b CPU and GPU but 32b display. So best to decouple the device iova's from mmap offset. Signed-off-by: Rob Clark <robdclark@gmail.com>
*	dma-buf: Rename struct fence to dma_fence	Chris Wilson	2016-10-25	1	-1/+1
	I plan to usurp the short name of struct fence for a core kernel struct, and so I need to rename the specialised fence/timeline for DMA operations to make room. A consensus was reached in https://lists.freedesktop.org/archives/dri-devel/2016-July/113083.html that making clear this fence applies to DMA operations was a good thing. Since then the patch has grown a bit as usage increases, so hopefully it remains a good thing!
	(v2...: rebase, rerun spatch)
	v3: Compile on msm, spotted a manual fixup that I broke.
	v4: Try again for msm, sorry Daniel
	coccinelle script:
	@@
	@@
	- struct fence
	+ struct dma_fence
	@@
	@@
	- struct fence_ops
	+ struct dma_fence_ops
	@@
	@@
	- struct fence_cb
	+ struct dma_fence_cb
	@@
	@@
	- struct fence_array
	+ struct dma_fence_array
	@@
	@@
	- enum fence_flag_bits
	+ enum dma_fence_flag_bits
	@@
	@@
	(
	- fence_init
	+ dma_fence_init
	|
	- fence_release
	+ dma_fence_release
	|
	- fence_free
	+ dma_fence_free
	|
	- fence_get
	+ dma_fence_get
	|
	- fence_get_rcu
	+ dma_fence_get_rcu
	|
	- fence_put
	+ dma_fence_put
	|
	- fence_signal
	+ dma_fence_signal
	|
	- fence_signal_locked
	+ dma_fence_signal_locked
	|
	- fence_default_wait
	+ dma_fence_default_wait
	|
	- fence_add_callback
	+ dma_fence_add_callback
	|
	- fence_remove_callback
	+ dma_fence_remove_callback
	|
	- fence_enable_sw_signaling
	+ dma_fence_enable_sw_signaling
	|
	- fence_is_signaled_locked
	+ dma_fence_is_signaled_locked
	|
	- fence_is_signaled
	+ dma_fence_is_signaled
	|
	- fence_is_later
	+ dma_fence_is_later
	|
	- fence_later
	+ dma_fence_later
	|
	- fence_wait_timeout
	+ dma_fence_wait_timeout
	|
	- fence_wait_any_timeout
	+ dma_fence_wait_any_timeout
	|
	- fence_wait
	+ dma_fence_wait
	|
	- fence_context_alloc
	+ dma_fence_context_alloc
	|
	- fence_array_create
	+ dma_fence_array_create
	|
	- to_fence_array
	+ to_dma_fence_array
	|
	- fence_is_array
	+ dma_fence_is_array
	|
	- trace_fence_emit
	+ trace_dma_fence_emit
	|
	- FENCE_TRACE
	+ DMA_FENCE_TRACE
	|
	- FENCE_WARN
	+ DMA_FENCE_WARN
	|
	- FENCE_ERR
	+ DMA_FENCE_ERR
	)
	(
	...
	)
	Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
	Reviewed-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
	Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
	Acked-by: Christian König <christian.koenig@amd.com>
	Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
	Link: http://patchwork.freedesktop.org/patch/msgid/20161025120045.28839-1-chris@chris-wilson.co.uk
*	drm/msm: move fence allocation out of msm_gpu_submit()	Rob Clark	2016-09-15	1	-11/+2
	Prep work for next patch. Signed-off-by: Rob Clark <robdclark@gmail.com>
*	drm/msm: print offender task name on hangcheck recovery	Rob Clark	2016-05-08	1	-4/+19
	Track the pid per submit, so we can print the name of the task which submitted the batch that caused the gpu to hang. Signed-off-by: Rob Clark <robdclark@gmail.com>
*	drm/msm: fix leak in failed submit path	Rob Clark	2016-05-08	1	-3/+1
	Signed-off-by: Rob Clark <robdclark@gmail.com>
*	drm/msm: drop return from gpu->submit()	Rob Clark	2016-05-08	1	-2/+2
	At this point, there is nothing left to fail. And submit already has a fence assigned and is added to the submit_list. Any problems from here on out are asynchronous (ie. hangcheck/recovery). Signed-off-by: Rob Clark <robdclark@gmail.com>
*	drm/msm: 'struct fence' conversion	Rob Clark	2016-05-08	1	-10/+17
	Signed-off-by: Rob Clark <robdclark@gmail.com>
*	drm/msm: introduce msm_fence_context	Rob Clark	2016-05-08	1	-9/+14
	Better encapsulate the per-timeline stuff into fence-context. For now there is just a single fence-context, but eventually we'll also have one per-CRTC to enable fully explicit fencing. Signed-off-by: Rob Clark <robdclark@gmail.com>
*	drm/msm/gpu: simplify tracking in-flight bo's	Rob Clark	2016-05-08	1	-29/+22
	Since we already track the array of bo's in the submit object, just unconditionally take and drop ref's per submit (rather than only taking ref's if bo is not already active). This simplifies later patches. Signed-off-by: Rob Clark <robdclark@gmail.com>
*	drm/msm: move fence code to it's own file	Rob Clark	2016-05-08	1	-0/+1
	Signed-off-by: Rob Clark <robdclark@gmail.com>
*	drm/msm: Fix IOMMU clean up path in case msm_iommu_new() fails	Stephane Viau	2015-10-22	1	-0/+8
	msm_iommu_new() can fail and this change makes sure that we detect the failure and free the allocated domain before going any further. Signed-off-by: Stephane Viau <sviau@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com>
*	drm/msm: restart queued submits after hang	Rob Clark	2015-06-11	1	-3/+46
	Track the list of in-flight submits. If the gpu hangs, retire up to and including the offending submit, and then re-submit the remainder. This way, for concurrently running piglit tests (for example), one failing test doesn't cause unrelated tests to fail simply because its submit was queued up after one that triggered a hang. Signed-off-by: Rob Clark <robdclark@gmail.com>
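	The recovery policy described above (retire up to and including the offender, replay the rest) can be sketched as follows. This is a simplified model, not the driver's code: struct submit, its fields and recover_submits() are hypothetical, the in-flight list is modelled as an array ordered oldest first, and fence wraparound is ignored.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-flight submit record. */
struct submit {
	uint32_t fence;  /* sequence number assigned at submit time */
	int retired;     /* set when the submit is dropped during recovery */
	int resubmitted; /* set when the submit is queued again */
};

/*
 * Walk the in-flight submits after a hang. Everything up to and
 * including the submit that hung (fence <= hung_fence) is retired;
 * everything queued after it is replayed. Returns how many submits
 * were replayed.
 */
static size_t recover_submits(struct submit *submits, size_t count,
			      uint32_t hung_fence)
{
	size_t replayed = 0;

	for (size_t i = 0; i < count; i++) {
		if (submits[i].fence <= hung_fence) {
			submits[i].retired = 1;
		} else {
			submits[i].resubmitted = 1;
			replayed++;
		}
	}
	return replayed;
}
```

	With fences 1..4 in flight and a hang on fence 2, submits 1 and 2 are retired while 3 and 4 run again, so unrelated work queued behind the offender is not lost.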
*	drm/msm: adreno a306 support	Rob Clark	2015-06-11	1	-0/+1
	As found in apq8016 (used in DragonBoard 410c) and msm8916. Note that numerically a306 is actually 307 (since a305c already claimed 306). Nice and confusing. Signed-off-by: Rob Clark <robdclark@gmail.com>
*	drm/msm: clarify downstream bus scaling	Rob Clark	2015-06-11	1	-1/+1
	A few spots in the driver have support for downstream android CONFIG_MSM_BUS_SCALING. This is mainly to simplify backporting the driver for various devices which do not have sufficient upstream kernel support. But the intentionally dead code seems to cause some confusion. Rename the #define to make this more clear. Signed-off-by: Rob Clark <robdclark@gmail.com>
*	drm/msm: fix potential deadlock in gpu init	Rob Clark	2014-08-04	1	-0/+3
	Somewhere along the way, the firmware loader sprouted another lock dependency, resulting in possible deadlock scenario: &dev->struct_mutex --> &sb->s_type->i_mutex_key#2 --> &mm->mmap_sem which is problematic vs things like gem mmap. So introduce a separate mutex to synchronize gpu init. Signed-off-by: Rob Clark <robdclark@gmail.com>
*	drm/msm: use upstream iommu	Rob Clark	2014-08-04	1	-8/+1
	Downstream kernel IOMMU had a non-standard way of dealing with multiple devices and multiple ports/contexts. We don't need that on upstream kernel, so rip out the crazy. Note that we have to move the pinning of the ringbuffer to after the IOMMU is attached. No idea how that managed to work properly on the downstream kernel. For now, I am leaving the IOMMU port name stuff in place, to simplify things for folks trying to backport latest drm/msm to device kernels. Once we no longer have to care about pre-DT kernels, we can drop this and instead backport upstream IOMMU driver. Signed-off-by: Rob Clark <robdclark@gmail.com>
*	drm/msm: add perf logging debugfs	Rob Clark	2014-06-02	1	-0/+103
	Signed-off-by: Rob Clark <robdclark@gmail.com>
*	drm/msm: add rd logging debugfs	Rob Clark	2014-06-02	1	-0/+4
	To ease debugging, add debugfs file which can be cat/tail'd to log submits, along with fence #. If GPU hangs, you can look at 'gpu' debugfs file to find last completed fence and current register state, and compare with logged rd file to narrow down the DRAW_INDX which triggered the GPU hang. Signed-off-by: Rob Clark <robdclark@gmail.com>
*	drm/msm: crank down gpu when inactive	Rob Clark	2014-03-31	1	-3/+82
	Shut down the clks when the gpu has nothing to do. A short inactivity timer is used to provide a low pass filter for power transitions. Signed-off-by: Rob Clark <robdclark@gmail.com>
*	drm/msm: bigger synchronization hammer	Rob Clark	2014-02-07	1	-3/+0
	Because we use a list_head in the bo to track its position in a submit, we need to serialize at a higher layer. Otherwise there are problems when multiple contexts are SUBMIT'ing in parallel cmdstreams referencing a shared bo. Signed-off-by: Rob Clark <robdclark@gmail.com>
*	drm/msm: add support for non-IOMMU systems	Rob Clark	2014-01-09	1	-8/+11
	Add a VRAM carveout that is used for systems which do not have an IOMMU. The VRAM carveout uses CMA. The arch code must setup a CMA pool for the device (preferably in highmem.. a 256m-512m VRAM pool in lowmem is not cool). The user can configure the VRAM pool size using msm.vram module param. Technically, the abstraction of IOMMU behind msm_mmu is not strictly needed, but it simplifies the GEM code a bit, and will be useful later when I add support for a2xx devices with GPUMMU, so I decided to keep this part. It appears to be possible to configure the GPU to restrict access to addresses within the VRAM pool, but this is not done yet. So for now the GPU will refuse to load if there is no sort of mmu. Once address based limits are supported and tested to confirm that we aren't giving the GPU access to arbitrary memory, this restriction can be lifted. Signed-off-by: Rob Clark <robdclark@gmail.com>
*	drm/msm: fix bus scaling	Rob Clark	2014-01-09	1	-15/+5
	This got a bit broken with original patches when re-arranging things to move dependencies on mach-msm inside #ifndef OF. Signed-off-by: Rob Clark <robdclark@gmail.com>
*	drm/msm: rework inactive-work	Rob Clark	2013-11-01	1	-2/+2
	Re-arrange things a bit so that we can get work requested after a bo fence passes, like pageflip, done before retiring bo's. Without any sort of bo cache in userspace, some games can trigger hundreds of transient bo's, which can cause retire to take a long time (5-10ms). Obviously we want a bo cache.. but this cleanup will make things a bit easier for atomic as well and makes things a bit cleaner. Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: David Brown <davidb@codeaurora.org>
*	drm/msm: fix potential NULL pointer dereference	Wei Yongjun	2013-09-12	1	-1/+2
	The dereference to 'pdata' should be moved below the NULL test. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
*	drm/msm: workaround for missing irq	Rob Clark	2013-09-11	1	-2/+5
	Occasionally we seem to miss an IRQ from the ME (microengine). I'm not entirely sure the root cause, but for now we can unwedge things by retiring from the hangcheck timer. Signed-off-by: Rob Clark <robdclark@gmail.com>
*	drm/msm: hangcheck harder	Rob Clark	2013-09-10	1	-1/+8
	If gpu locks up with the rptr shortly beyond the wrap-around point in the ringbuffer, because the rptr was not reset (but wptr is, by virtue of resetting rb->cur), we could end up in a scenario where we think there is not enough space in the ringbuffer for the next cmds. And since the CP won't reset rptr until after processing an IB, this leaves things in a sort of deadlock. So reset rptr too. And a bit more spiffing up of hangcheck to make things easier to debug. Signed-off-by: Rob Clark <robdclark@gmail.com>
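	The arithmetic behind that scenario can be illustrated with a generic ring-buffer free-space calculation. This is a sketch under stated assumptions: ring_free() is a hypothetical helper using the common "keep one slot empty" convention with indices counted in dwords, not necessarily the formula the driver uses.

```c
#include <stdint.h>

/*
 * Free dwords in a ring of `size` dwords, where rptr is the consumer
 * index and wptr is the producer index. One slot is always kept empty
 * so that rptr == wptr unambiguously means "ring is empty".
 */
static uint32_t ring_free(uint32_t rptr, uint32_t wptr, uint32_t size)
{
	return (rptr + (size - 1) - wptr) % size;
}
```

	With size = 1024, a recovery that resets wptr to 0 but leaves a stale rptr of 2 gives ring_free(2, 0, 1024) == 1, so almost nothing appears to fit even though the ring is logically empty. Resetting rptr as well gives ring_free(0, 0, 1024) == 1023.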
*	drm/msm: handle read vs write fences	Rob Clark	2013-09-10	1	-2/+7
	The userspace API already had everything needed to handle read vs write synchronization. This patch actually bothers to hook it up properly, so that we don't need to (for example) stall on userspace read access to a buffer that gpu is also still reading. Signed-off-by: Rob Clark <robdclark@gmail.com>
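	The rule described above (a CPU read only needs to wait for pending GPU writes, while a CPU write must wait for pending GPU reads and writes) can be sketched like this. struct bo_fences, fence_to_wait() and must_stall() are hypothetical names for illustration, and fence wraparound is ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-buffer bookkeeping: last fence of each access type. */
struct bo_fences {
	uint32_t read_fence;  /* last GPU submit that reads the buffer */
	uint32_t write_fence; /* last GPU submit that writes the buffer */
};

/* Fence the CPU has to wait for before touching the buffer. */
static uint32_t fence_to_wait(const struct bo_fences *bo, bool cpu_writes)
{
	if (cpu_writes) {
		/* A CPU write conflicts with any GPU access. */
		return bo->read_fence > bo->write_fence ?
			bo->read_fence : bo->write_fence;
	}
	/* A CPU read only conflicts with GPU writes. */
	return bo->write_fence;
}

/* True if the CPU access must stall given the last completed fence. */
static bool must_stall(const struct bo_fences *bo, bool cpu_writes,
		       uint32_t completed_fence)
{
	return fence_to_wait(bo, cpu_writes) > completed_fence;
}
```

	For a buffer with read_fence = 10 and write_fence = 5, and fence 7 already completed, a CPU read proceeds immediately while a CPU write still has to wait for fence 10.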
*	drm/msm: add basic hangcheck/recovery mechanism	Rob Clark	2013-08-24	1	-0/+52
	A basic, no-frills recovery mechanism in case the gpu gets wedged. We could try to be a bit more fancy and restart the next submit after the one that got wedged, but for now keep it simple. This is enough to recover things if, for example, the gpu hangs mid way through a piglit run. Signed-off-by: Rob Clark <robdclark@gmail.com>
*	drm/msm: add a3xx gpu support	Rob Clark	2013-08-24	1	-0/+411
	Add initial support for a3xx 3d core.
	So far, with hardware that I've seen to date, we can have:
	 + zero, one, or two z180 2d cores
	 + a3xx or a2xx 3d core, which share a common CP (the firmware for the CP seems to implement some different PM4 packet types but the basics of cmdstream submission are the same)
	Which means that the eventual complete "class" hierarchy, once support for all past and present hw is in place, becomes:
	 + msm_gpu
	   + adreno_gpu
	     + a3xx_gpu
	     + a2xx_gpu
	   + z180_gpu
	This commit splits out the parts that will eventually be common between a2xx/a3xx into adreno_gpu, and the parts that are even common to z180 into msm_gpu.
	Note that there is no cmdstream validation required. All memory access from the GPU is via IOMMU/MMU. So as long as you don't map silly things to the GPU, there isn't much damage that the GPU can do.
	Signed-off-by: Rob Clark <robdclark@gmail.com>