path: root/drivers/gpu/drm/msm/msm_gpu.h
* drm/msm: add a5xx specific debugfs (Rob Clark, 2018-02-20, 1 file, -0/+2)
    Add some debugfs to dump out PFP and ME microcontroller state, as well as
    some of the queues (MEQ and ROQ). Also add a debugfs file to trigger a GPU
    reset (and reloading the firmware on next submit).
    Signed-off-by: Rob Clark <robdclark@gmail.com>
* drm/msm: Add devfreq support for the GPU (Jordan Crouse, 2018-01-10, 1 file, -0/+7)
    Add support for devfreq to dynamically control the GPU frequency. By
    default try to use the 'simple_ondemand' governor which can adjust the
    frequency based on GPU load.
    v2: Fix __aeabi_uldivmod issue from the 0 day bot and use
    devfreq_recommended_opp() as suggested by Rob.
    Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
    Signed-off-by: Rob Clark <robdclark@gmail.com>
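    For illustration, a minimal sketch of such a devfreq hookup. The dev_to_gpu()
    helper, the gpu->core_clk handle, the gpu->devfreq/devfreq_time fields and
    msm_gpu_busy_time() are all assumptions for this sketch, not the driver's
    actual code:

        #include <linux/clk.h>
        #include <linux/devfreq.h>
        #include <linux/ktime.h>
        #include <linux/pm_opp.h>

        static int msm_devfreq_target(struct device *dev, unsigned long *freq,
                        u32 flags)
        {
                struct msm_gpu *gpu = dev_to_gpu(dev);  /* hypothetical helper */
                struct dev_pm_opp *opp;

                /* clamp the requested rate to a valid OPP from the table */
                opp = devfreq_recommended_opp(dev, freq, flags);
                if (IS_ERR(opp))
                        return PTR_ERR(opp);
                dev_pm_opp_put(opp);

                return clk_set_rate(gpu->core_clk, *freq);
        }

        static int msm_devfreq_get_dev_status(struct device *dev,
                        struct devfreq_dev_status *status)
        {
                struct msm_gpu *gpu = dev_to_gpu(dev);
                ktime_t now = ktime_get();

                status->current_frequency = clk_get_rate(gpu->core_clk);
                status->busy_time = msm_gpu_busy_time(gpu); /* hypothetical busy counter */
                status->total_time = ktime_us_delta(now, gpu->devfreq_time);
                gpu->devfreq_time = now;

                return 0;
        }

        static struct devfreq_dev_profile msm_devfreq_profile = {
                .polling_ms      = 10,
                .target          = msm_devfreq_target,
                .get_dev_status  = msm_devfreq_get_dev_status,
        };

        /* at init time, register with the load-based governor */
        void msm_devfreq_init(struct msm_gpu *gpu)
        {
                gpu->devfreq = devfreq_add_device(&gpu->pdev->dev,
                                &msm_devfreq_profile, "simple_ondemand", NULL);
        }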
* drm/msm/gpu: Remove unused bus scaling code (Jordan Crouse, 2018-01-10, 1 file, -6/+1)
    Remove the downstream bus scaling code. It isn't needed except for
    compatibility with a downstream or vendor kernel. Get it out of the way to
    clear space for devfreq support.
    Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
    Signed-off-by: Rob Clark <robdclark@gmail.com>
* drm/msm: Make the value of RB_CNTL (almost) generic (Jordan Crouse, 2017-10-28, 1 file, -0/+5)
    We use a global ringbuffer size and block size for all targets, and at
    least for 5XX preemption we need to know the value of RB_CNTL in several
    locations, so it makes sense to calculate it once and use it everywhere.
    The only monkey wrench is that we need to disable the RPTR shadow for A430
    targets, but that only needs to be done once and doesn't affect A5XX, so we
    can OR in the value at init time.
    Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
    Signed-off-by: Rob Clark <robdclark@gmail.com>
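    For illustration, the shape of the idea as a sketch. The ringbuffer size,
    block size and the AXXX_* field macros are placeholders for whatever the
    adreno register headers actually define:

        /* Build CP_RB_CNTL once at probe time rather than at every place
         * that programs it. */
        static void adreno_set_rb_cntl(struct adreno_gpu *adreno_gpu)
        {
                adreno_gpu->rb_cntl =
                        AXXX_CP_RB_CNTL_BUFSZ(ilog2(MSM_GPU_RB_SIZE / 8)) |
                        AXXX_CP_RB_CNTL_BLKSZ(ilog2(MSM_GPU_RB_BLKSIZE / 8));

                /* A430 can't use the RPTR shadow; OR in the disable bit once
                 * here, it is harmless for A5XX */
                if (adreno_is_a430(adreno_gpu))
                        adreno_gpu->rb_cntl |= AXXX_CP_RB_CNTL_NO_UPDATE;
        }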
* drm/msm: Support multiple ringbuffers (Jordan Crouse, 2017-10-28, 1 file, -24/+18)
    Add the infrastructure to support the idea of multiple ringbuffers. Assign
    each ringbuffer an id and use that as an index for the various ring
    specific operations.
    The biggest delta is to support legacy fences. Each fence gets its own
    sequence number but the legacy functions expect to use a unique integer.
    To handle this we return a unique identifier for each submission but map
    it to a specific ring/sequence under the covers. Newer users use a
    dma_fence pointer anyway so they don't care about the actual sequence ID
    or ring.
    The actual mechanics for multiple ringbuffers are very target specific so
    this code just allows for the possibility but still only defines one
    ringbuffer for each target family.
    Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
    Signed-off-by: Rob Clark <robdclark@gmail.com>
* drm/msm: Move memptrs to msm_gpu (Jordan Crouse, 2017-10-28, 1 file, -2/+15)
    When we move to multiple ringbuffers we're going to store the data in the
    memptrs on a per-ring basis. In order to prepare for that, move the
    current memptrs from the adreno namespace into msm_gpu. This is way
    cleaner and immediately lets us kill off some sub functions so there is
    much less cost later when we do move to per-ring structs.
    Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
    Signed-off-by: Rob Clark <robdclark@gmail.com>
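    For context, the "memptrs" are a small block of GPU-visible memory that
    the CP writes back into and the CPU polls. Roughly, with the field set
    being illustrative rather than the exact in-tree struct:

        #include <linux/types.h>

        /* shared CPU/GPU scratch area: the GPU writes these, the CPU reads them */
        struct msm_rbmemptrs {
                volatile uint32_t rptr;   /* ringbuffer read-pointer shadow */
                volatile uint32_t fence;  /* seqno of the last retired submit */
        };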
* drm/msm: Add per-instance submit queues (Jordan Crouse, 2017-10-28, 1 file, -0/+15)
    Currently the behavior of a command stream is provided by the user
    application during submission and the application is expected to
    internally maintain the settings for each 'context' or 'rendering queue'
    and specify the correct ones. This works okay for simple cases but as
    applications become more complex we will want to set context specific
    flags and do various permission checks to allow certain contexts to enable
    additional privileges.
    Add kernel-side submit queues to be analogous to 'contexts' or 'rendering
    queues' on the application side. Each file descriptor instance will
    maintain its own list of queues. Queues cannot be shared between file
    descriptors.
    For backwards compatibility context id '0' is defined as a default context
    specifying no priority and no special flags. This is intended to be the
    usual configuration for 99% of applications so that a garden variety
    application can function correctly without creating a queue. Only those
    applications requiring the specific benefit of different queues need to
    create one.
    Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
    Signed-off-by: Rob Clark <robdclark@gmail.com>
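    A sketch of what a per-file-descriptor queue list boils down to. The
    struct fields, the msm_file_private members and the helper name are
    illustrative, and locking is omitted:

        #include <linux/kref.h>
        #include <linux/list.h>
        #include <linux/slab.h>

        struct msm_gpu_submitqueue {
                int id;                 /* id 0 is the implicit default queue */
                u32 prio;
                u32 flags;
                struct list_head node;  /* entry in the owning file's list */
                struct kref ref;
        };

        static int msm_submitqueue_create(struct msm_file_private *ctx,
                        u32 prio, u32 flags, u32 *id)
        {
                struct msm_gpu_submitqueue *queue;

                queue = kzalloc(sizeof(*queue), GFP_KERNEL);
                if (!queue)
                        return -ENOMEM;

                kref_init(&queue->ref);
                queue->flags = flags;
                queue->prio = prio;
                queue->id = ctx->queueid++;  /* per-file counter, never shared */

                if (id)
                        *id = queue->id;

                list_add_tail(&queue->node, &ctx->submitqueues);

                return 0;
        }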
* drm/msm: remove address-space id (Rob Clark, 2017-06-16, 1 file, -1/+0)
    Now that the msm_gem supports an arbitrary number of vma's, we no longer
    need to assign an id (index) to each address space. So rip out the
    associated code.
    Signed-off-by: Rob Clark <robdclark@gmail.com>
* drm/msm: Add a struct to pass configuration to msm_gpu_init() (Jordan Crouse, 2017-06-16, 1 file, -1/+10)
    The amount of information that we need to pass into msm_gpu_init() is
    steadily increasing, so add a new struct to stabilize the function call
    and make it easier to add new configuration down the line.
    Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
    Signed-off-by: Rob Clark <robdclark@gmail.com>
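    Roughly the shape of the struct in question; the field list here is
    illustrative:

        #include <linux/types.h>

        struct msm_gpu_config {
                const char *ioname;   /* name of the MMIO resource */
                const char *irqname;  /* name of the interrupt */
                uint64_t va_start;    /* GPU virtual address range */
                uint64_t va_end;
                unsigned int ringsz;  /* ringbuffer size */
        };

    New knobs then become new struct fields rather than additional
    msm_gpu_init() arguments.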
* drm/msm: Remove idle function hook (Jordan Crouse, 2017-06-16, 1 file, -1/+0)
    There isn't any generic code that uses ->idle so remove it.
    Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
    Signed-off-by: Rob Clark <robdclark@gmail.com>
* msm/drm: gpu: Dynamically locate the clocks from the device tree (Jordan Crouse, 2017-04-08, 1 file, -1/+3)
    Instead of using a fixed list of clock names use the clock-names list in
    the device tree to discover and get the list of clocks that we need.
    Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
    Signed-off-by: Rob Clark <robdclark@gmail.com>
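    A sketch of a DT-driven clock lookup of this kind, with error handling
    trimmed and the msm_gpu fields (grp_clks, nr_clocks) assumed:

        #include <linux/clk.h>
        #include <linux/of.h>
        #include <linux/platform_device.h>

        static int get_clocks(struct platform_device *pdev, struct msm_gpu *gpu)
        {
                struct device_node *node = pdev->dev.of_node;
                int i, count;

                /* clock-names tells us how many clocks to expect */
                count = of_property_count_strings(node, "clock-names");
                if (count < 1)
                        return 0;

                gpu->grp_clks = devm_kcalloc(&pdev->dev, count,
                                sizeof(struct clk *), GFP_KERNEL);
                if (!gpu->grp_clks)
                        return -ENOMEM;

                for (i = 0; i < count; i++) {
                        const char *name = NULL;

                        of_property_read_string_index(node, "clock-names",
                                        i, &name);
                        gpu->grp_clks[i] = devm_clk_get(&pdev->dev, name);
                }

                gpu->nr_clocks = count;

                return 0;
        }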
* drm/msm: Hard code the GPU "slow frequency" (Jordan Crouse, 2017-04-08, 1 file, -1/+1)
    Some A3XX and A4XX GPU targets required that the GPU clock be programmed
    to a non-zero value when it was disabled, so 27MHz was chosen as the
    "invalid" frequency. Even though newer targets do not have the same clock
    restrictions we still write 27MHz on clock disable and expect the clock
    subsystem to round down to zero.
    For unknown reasons, even though the slow clock speed is always 27MHz and
    it isn't actually a functional level, the legacy device tree frequency
    tables always defined it and then did gymnastics to work around it.
    Instead of playing the same silly games just hard code the "slow" clock
    speed in the code as 27MHz and save ourselves a bit of infrastructure.
    Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
    Signed-off-by: Rob Clark <robdclark@gmail.com>
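    In code terms the trick reduces to something like the following sketch;
    the clock handle and function name are assumptions:

        #include <linux/clk.h>

        /* park the core clock at the hard-coded "slow" rate before disabling
         * it; targets whose clock can actually reach 0 simply round down */
        #define GPU_SLOW_CLK 27000000

        static void msm_gpu_disable_core_clk(struct msm_gpu *gpu)
        {
                clk_set_rate(gpu->grp_clks[0], GPU_SLOW_CLK);
                clk_disable_unprepare(gpu->grp_clks[0]);
        }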
* drm/msm/gpu: use pm-runtime (Rob Clark, 2017-04-08, 1 file, -6/+6)
    We need to use pm-runtime properly when IOMMU is using device_link() to
    control its own clocks.
    Signed-off-by: Rob Clark <robdclark@gmail.com>
* drm/msm: gpu: Add A5XX target support (Jordan Crouse, 2016-11-28, 1 file, -1/+1)
    Add support for the A5XX family of Adreno GPUs.
    Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
    Signed-off-by: Rob Clark <robdclark@gmail.com>
* drm/msm: Remove 'src_clk' from adreno configuration (Jordan Crouse, 2016-11-28, 1 file, -1/+1)
    The adreno code inherited a silly workaround from downstream from the bad
    old days before decent clock control. grp_clk[0] (named 'src_clk') doesn't
    actually exist - it was used as a proxy for whatever the core clock
    actually was (usually 'core_clk'). All targets should be able to correctly
    request 'core_clk' and get the right thing back so zap the anachronism and
    directly use grp_clk[0] to control the clock rate.
    Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
    Signed-off-by: Rob Clark <robdclark@gmail.com>
* drm/msm: gpu: Add new gpu register read/write functions (Jordan Crouse, 2016-11-28, 1 file, -0/+39)
    Add some new functions to manipulate GPU registers. gpu_read64 and
    gpu_write64 can read/write a 64 bit value to two 32 bit registers. For
    4XX and older these are normally perfcounter registers, but future targets
    will use 64 bit addressing so there will be many more spots where a 64 bit
    read and write are needed. gpu_rmw() does a read/modify/write on a 32 bit
    register given a mask and bits to OR in.
    Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
    Signed-off-by: Rob Clark <robdclark@gmail.com>
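    A sketch of what such helpers look like when expressed in terms of the
    existing gpu_read()/gpu_write() accessors; not necessarily the exact
    in-tree implementation:

        /* read/modify/write a 32-bit register: clear the masked bits,
         * then OR in the new ones */
        static inline void gpu_rmw(struct msm_gpu *gpu, u32 reg, u32 mask, u32 or)
        {
                gpu_write(gpu, reg, (gpu_read(gpu, reg) & ~mask) | or);
        }

        /* a 64-bit value split across a LO/HI register pair */
        static inline u64 gpu_read64(struct msm_gpu *gpu, u32 lo, u32 hi)
        {
                return (u64)gpu_read(gpu, lo) | ((u64)gpu_read(gpu, hi) << 32);
        }

        static inline void gpu_write64(struct msm_gpu *gpu, u32 lo, u32 hi, u64 val)
        {
                gpu_write(gpu, lo, lower_32_bits(val));
                gpu_write(gpu, hi, upper_32_bits(val));
        }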
* drm/msm: gpu: Return error on hw_init failure (Jordan Crouse, 2016-11-28, 1 file, -1/+1)
    When the GPU hardware init function fails (like say, ME_INIT timed out)
    return error instead of blindly continuing on. This gives us a small
    chance of saving the system before it goes boom.
    Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
    Signed-off-by: Rob Clark <robdclark@gmail.com>
* drm/msm: convert iova to 64b (Rob Clark, 2016-11-28, 1 file, -1/+1)
    For a5xx the gpu is 64b so we need to change iova to 64b everywhere. On
    the display side, iova is still 32b so it can ignore the upper bits.
    (Although all the armv8 devices have an iommu that can map 64b pa to 32b
    iova.)
    Signed-off-by: Rob Clark <robdclark@gmail.com>
* drm/msm: support multiple address spaces (Rob Clark, 2016-11-27, 1 file, -1/+1)
    We can have various combinations of 64b and 32b address space, ie. 64b CPU
    but 32b display and gpu, or 64b CPU and GPU but 32b display. So best to
    decouple the device iova's from mmap offset.
    Signed-off-by: Rob Clark <robdclark@gmail.com>
* drm/msm: move fence allocation out of msm_gpu_submit() (Rob Clark, 2016-09-15, 1 file, -1/+1)
    Prep work for next patch.
    Signed-off-by: Rob Clark <robdclark@gmail.com>
* drm/msm: drop return from gpu->submit() (Rob Clark, 2016-05-08, 1 file, -1/+1)
    At this point, there is nothing left to fail. And submit already has a
    fence assigned and is added to the submit_list. Any problems from here on
    out are asynchronous (ie. hangcheck/recovery).
    Signed-off-by: Rob Clark <robdclark@gmail.com>
* drm/msm: introduce msm_fence_context (Rob Clark, 2016-05-08, 1 file, -2/+5)
    Better encapsulate the per-timeline stuff into fence-context. For now
    there is just a single fence-context, but eventually we'll also have one
    per-CRTC to enable fully explicit fencing.
    Signed-off-by: Rob Clark <robdclark@gmail.com>
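    Roughly the data such a per-timeline fence-context encapsulates; the field
    set is illustrative rather than the exact in-tree struct:

        #include <linux/spinlock.h>
        #include <linux/types.h>
        #include <linux/wait.h>

        /* one timeline: seqnos are handed out at submit, completed at retire */
        struct msm_fence_context {
                struct drm_device *dev;
                const char *name;         /* e.g. "gpu", later one per CRTC */
                u64 context;              /* from dma_fence_context_alloc() */
                u32 last_fence;           /* last seqno allocated */
                u32 completed_fence;      /* last seqno the hw has finished */
                wait_queue_head_t event;  /* woken on completion, for legacy waits */
                spinlock_t spinlock;
        };

        static inline bool fence_completed(struct msm_fence_context *fctx, u32 fence)
        {
                /* signed compare so seqno wrap-around is handled */
                return (int32_t)(fctx->completed_fence - fence) >= 0;
        }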
* drm/msm: restart queued submits after hang (Rob Clark, 2015-06-11, 1 file, -0/+2)
    Track the list of in-flight submits. If the gpu hangs, retire up to and
    including the offending submit, and then re-submit the remainder. This
    way, for concurrently running piglit tests (for example), one failing test
    doesn't cause unrelated tests to fail simply because its submit was queued
    up after one that triggered a hang.
    Signed-off-by: Rob Clark <robdclark@gmail.com>
* drm/msm: adreno a306 support (Rob Clark, 2015-06-11, 1 file, -1/+1)
    As found in apq8016 (used in DragonBoard 410c) and msm8916. Note that
    numerically a306 is actually 307 (since a305c already claimed 306). Nice
    and confusing.
    Signed-off-by: Rob Clark <robdclark@gmail.com>
* drm/msm: clarify downstream bus scaling (Rob Clark, 2015-06-11, 1 file, -1/+1)
    A few spots in the driver have support for downstream android
    CONFIG_MSM_BUS_SCALING. This is mainly to simplify backporting the driver
    for various devices which do not have sufficient upstream kernel support.
    But the intentionally dead code seems to cause some confusion. Rename the
    #define to make this more clear.
    Signed-off-by: Rob Clark <robdclark@gmail.com>
* drm/msm/adreno: move decision about what gpu to load (Rob Clark, 2014-09-10, 1 file, -1/+1)
    Move this into adreno_device, and decide based on gpu revision rather than
    just assuming a3xx.
    Signed-off-by: Rob Clark <robdclark@gmail.com>
* drm/msm/adreno: split adreno device out into its own file (Rob Clark, 2014-09-10, 1 file, -2/+2)
    We'd rather not duplicate these parts as support for additional gpu
    generations is added.
    Signed-off-by: Rob Clark <robdclark@gmail.com>
* drm/msm: add perf logging debugfs (Rob Clark, 2014-06-02, 1 file, -0/+31)
    Signed-off-by: Rob Clark <robdclark@gmail.com>
* drm/msm: crank down gpu when inactive (Rob Clark, 2014-03-31, 1 file, -1/+15)
    Shut down the clks when the gpu has nothing to do. A short inactivity
    timer is used to provide a low pass filter for power transitions.
    Signed-off-by: Rob Clark <robdclark@gmail.com>
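    The low pass filter amounts to re-arming a short timer on every retire and
    only dropping the clocks when it actually fires. A sketch with assumed
    field and function names:

        #include <linux/jiffies.h>
        #include <linux/timer.h>
        #include <linux/workqueue.h>

        #define DRM_MSM_INACTIVE_PERIOD 66  /* ms, a handful of vblanks */

        /* called whenever the GPU retires work: push the deadline out again */
        static void update_inactive_timer(struct msm_gpu *gpu)
        {
                mod_timer(&gpu->inactive_timer,
                          jiffies + msecs_to_jiffies(DRM_MSM_INACTIVE_PERIOD));
        }

        /* fires only once the GPU has been idle for the whole period */
        static void inactive_handler(struct timer_list *t)
        {
                struct msm_gpu *gpu = from_timer(gpu, t, inactive_timer);

                /* clock calls can sleep, so do the power-down from a worker */
                queue_work(system_wq, &gpu->inactive_work);
        }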
* drm/msm: add support for non-IOMMU systems (Rob Clark, 2014-01-09, 1 file, -1/+1)
    Add a VRAM carveout that is used for systems which do not have an IOMMU.
    The VRAM carveout uses CMA. The arch code must setup a CMA pool for the
    device (preferably in highmem.. a 256m-512m VRAM pool in lowmem is not
    cool). The user can configure the VRAM pool size using msm.vram module
    param.
    Technically, the abstraction of IOMMU behind msm_mmu is not strictly
    needed, but it simplifies the GEM code a bit, and will be useful later
    when I add support for a2xx devices with GPUMMU, so I decided to keep this
    part.
    It appears to be possible to configure the GPU to restrict access to
    addresses within the VRAM pool, but this is not done yet. So for now the
    GPU will refuse to load if there is no sort of mmu. Once address based
    limits are supported and tested to confirm that we aren't giving the GPU
    access to arbitrary memory, this restriction can be lifted.
    Signed-off-by: Rob Clark <robdclark@gmail.com>
* drm/msm: fix bus scaling (Rob Clark, 2014-01-09, 1 file, -0/+4)
    This got a bit broken with original patches when re-arranging things to
    move dependencies on mach-msm inside #ifndef OF.
    Signed-off-by: Rob Clark <robdclark@gmail.com>
* drm/msm: add basic hangcheck/recovery mechanism (Rob Clark, 2013-08-24, 1 file, -0/+10)
    A basic, no-frills recovery mechanism in case the gpu gets wedged. We
    could try to be a bit more fancy and restart the next submit after the one
    that got wedged, but for now keep it simple. This is enough to recover
    things if, for example, the gpu hangs mid way through a piglit run.
    Signed-off-by: Rob Clark <robdclark@gmail.com>
* drm/msm: add a3xx gpu support (Rob Clark, 2013-08-24, 1 file, -0/+114)
    Add initial support for a3xx 3d core. So far, with hardware that I've seen
    to date, we can have:
      + zero, one, or two z180 2d cores
      + a3xx or a2xx 3d core, which share a common CP (the firmware for the CP
        seems to implement some different PM4 packet types but the basics of
        cmdstream submission are the same)
    Which means that the eventual complete "class" hierarchy, once support for
    all past and present hw is in place, becomes:
      + msm_gpu
        + adreno_gpu
          + a3xx_gpu
          + a2xx_gpu
        + z180_gpu
    This commit splits out the parts that will eventually be common between
    a2xx/a3xx into adreno_gpu, and the parts that are even common to z180 into
    msm_gpu.
    Note that there is no cmdstream validation required. All memory access
    from the GPU is via IOMMU/MMU. So as long as you don't map silly things to
    the GPU, there isn't much damage that the GPU can do.
    Signed-off-by: Rob Clark <robdclark@gmail.com>