summaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/amd/amdgpu
Commit message (Collapse)AuthorAgeFilesLines
* Revert "drm/amdgpu: add DRIVER_SYNCOBJ_TIMELINE to amdgpu"Alex Deucher2019-06-061-2/+1Star
| | | | | | | | This reverts commit 8d8a5a64a8904ea32bbf7292b89c11156d64f9a1. Wait until KHR exposes the VLK support. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: fix a race in GPU reset with IB test (v2)Alex Deucher2019-06-063-59/+61
| | | | | | | | | | | | | | | | | | | | | Split late_init into two functions, one (do_late_init) which just does the hw init, and late_init which calls do_late_init and schedules the IB test work. Call do_late_init in the GPU reset code to run the init code, but not schedule the IB test code. The IB test code is called directly in the gpu reset code so no need to run the IB tests in a separate work thread. If we do, we end up racing. v2: Rework late_init. Pull out the mgpu fan boost and xgmi pstate code into late_init so they get called in all cases. rename the late_init worker thread to delayed work since it's just the IB tests now which can happen later. Schedule the work at init and resume time. It's not needed at reset time because the IB tests are called directly. Reviewed-by: Christian König <christian.koenig@amd.com> Cc: Xinhui Pan <xinhui.pan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: cancel late_init_work before gpu resetxinhui pan2019-06-061-0/+2
| | | | | | | | | gpu reset will run late_init and schedule the late_init_work. if we keep triggering gpu reset in a short time, there are potenial races. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/ttm: Make LRU removal optional v2Christian König2019-05-314-8/+9
| | | | | | | | | | | | | | | We are already doing this for DMA-buf imports and also for amdgpu VM BOs for quite a while now. If this doesn't run into any problems we are probably going to stop removing BOs from the LRU altogether. v2: drop BUG_ON from ttm_bo_add_to_lru Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/sriov: Correct some register program methodEmily Deng2019-05-312-9/+9
| | | | | | | | For the VF, some registers only could be programmed with RLC. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: Trigger Huang <Trigger.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdkfd: Return proper error code for gws alloc APIOak Zeng2019-05-311-1/+1
| | | | | | Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu:Fix the unpin warning about csb bufferEmily Deng2019-05-311-3/+1Star
| | | | | | | | | | As it will destroy clear_state_obj, and also will unpin it in the gfx_v9_0_sw_fini, so don't need to call amdgpu_bo_free_kernel in gfx_v9_0_sw_fini, or it will have unpin warning. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: ras injection use gpu addressxinhui pan2019-05-311-0/+16
| | | | | | | | injection need a valid gpu address. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* Merge branch 'drm-next-5.3' of git://people.freedesktop.org/~agd5f/linux ↵Dave Airlie2019-05-3173-1111/+3119
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | into drm-next New stuff for 5.3: - Add new thermal sensors for vega asics - Various RAS fixes - Add sysfs interface for memory interface utilization - Use HMM rather than mmu notifier for user pages - Expose xgmi topology via kfd - SR-IOV fixes - Fixes for manual driver reload - Add unique identifier for vega asics - Clean up user fence handling with UVD/VCE/VCN blocks - Convert DC to use core bpc attribute rather than a custom one - Add GWS support for KFD - Vega powerplay improvements - Add CRC support for DCE 12 - SR-IOV support for new security policy - Various cleanups From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190529220944.14464-1-alexander.deucher@amd.com
| * drm/amdgpu: Need to set the baco cap before baco resetEmily Deng2019-05-282-14/+13Star
| | | | | | | | | | | | | | | | | | | | | | | | | | For passthrough, after rebooted the VM, driver will do a baco reset before doing other driver initialization during loading driver. For doing the baco reset, it will first check the baco reset capability. So first need to set the cap from the vbios information or baco reset won't be enabled. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu/soc15: skip reset on initAlex Deucher2019-05-281-0/+5
| | | | | | | | | | | | | | Not necessary on soc15 and breaks driver reload on server cards. Acked-by: Amber Lin <Amber.Lin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: add DRIVER_SYNCOBJ_TIMELINE to amdgpuChunming Zhou2019-05-281-1/+2
| | | | | | | | | | | | | | Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Flora Cui <Flora.Cui@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: Add function to add/remove gws to kfd processOak Zeng2019-05-282-5/+100
| | | | | | | | | | | | | | | | | | | | GWS bo is shared between all kfd processes. Add function to add gws to kfd process's bo list so gws can be evicted from and restored for process. Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: Add interface to alloc gws from amdgpuOak Zeng2019-05-282-0/+36
| | | | | | | | | | | | | | | | | | Add amdgpu_amdkfd interface to alloc and free gws from amdgpu Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdkfd: Add gws number to kfd topology node propertiesOak Zeng2019-05-283-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | Add amdgpu_amdkfd interface to get num_gws and add num_gws to /sys/class/kfd/kfd/topology/nodes/x/properties. Only report num_gws if MEC FW support GWS barriers. Currently it is determined by a module parameter which will be replaced with MEC FW version check when firmware is ready. Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amd/doc: Add RAS documentation to guideTom St Denis2019-05-241-2/+2
| | | | | | | | | | | | | | Acked-by: Slava Abramov <slava.abramov@amd.com> Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amd/doc: Add XGMI sysfs documentationTom St Denis2019-05-241-0/+28
| | | | | | | | | | | | | | Acked-by: Slava Abramov <slava.abramov@amd.com> Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amd/display: Switch the custom "max bpc" property to the DRM propNicholas Kazlauskas2019-05-242-6/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [Why] The custom "max bpc" property was added to limit color depth while the DRM one was still being merged. It's been a few kernel versions since then and this TODO was still sticking around. [How] Attach the DRM max bpc property to the connector and drop all of our custom property management. Set the max bpc to 8 by default since DRM defaults to the max in the range which would be 16 in this case. No behavioral changes are intended with this patch, it should just be a refactor. v2: Don't force 8bpc when no state is given Cc: Leo Li <sunpeng.li@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: Add Unique Identifier sysfs file unique_id v2Kent Russell2019-05-242-0/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | Add a file that provides a Unique ID for the GPU. This will persist across machines and is guaranteed to be unique. This is only available for GFX9 and newer, so older ASICs will not have this file in the sysfs pool v2: Store it in adev for ASICs that don't have a hwmgr Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: Improve error handling for HMMFelix Kuehling2019-05-241-4/+18
| | | | | | | | | | | | | | | | | | | | | | | | Use unsigned long for number of pages. Check that pfns are valid after hmm_vma_fault. If they are not, return an error instead of continuing with invalid page pointers and PTEs. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: more descriptive message if HMM not enabledPhilip Yang2019-05-241-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | If using old kernel config file, CONFIG_ZONE_DEVICE is not selected, so CONFIG_HMM and CONFIG_HMM_MIRROR is not enabled, the current driver error message "Failed to register MMU notifier" is not clear. Inform user with more descriptive message on how to fix the missing kernel config option. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109808 Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: support userptr cross VMAs case with HMMPhilip Yang2019-05-241-35/+91
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | userptr may cross two VMAs if the forked child process (not call exec after fork) malloc buffer, then free it, and then malloc larger size buf, kerenl will create new VMA adjacent to old VMA which was cloned from parent process, some pages of userptr are in the first VMA, the rest pages are in the second VMA. HMM expects range only have one VMA, loop over all VMAs in the address range, create multiple ranges to handle this case. See is_mergeable_anon_vma in mm/mmap.c for details. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdkfd: support concurrent userptr update for HMMPhilip Yang2019-05-241-6/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Userptr restore may have concurrent userptr invalidation after hmm_vma_fault adds the range to the hmm->ranges list, needs call hmm_vma_range_done to remove the range from hmm->ranges list first, then reschedule the restore worker. Otherwise hmm_vma_fault will add same range to the list, this will cause loop in the list because range->next point to range itself. Add function untrack_invalid_user_pages to reduce code duplication. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: fix HMM config dependency issuePhilip Yang2019-05-243-24/+19Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Only select HMM_MIRROR will get kernel config dependency warnings if CONFIG_HMM is missing in the config. Add depends on HMM will solve the issue. Add conditional compilation to fix compilation errors if HMM_MIRROR is not enabled as HMM config is not enabled. Remove unused function amdgpu_ttm_tt_mark_user_pages. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: replace get_user_pages with HMM mirror helpersPhilip Yang2019-05-249-278/+182Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use HMM helper function hmm_vma_fault() to get physical pages backing userptr and start CPU page table update track of those pages. Then use hmm_vma_range_done() to check if those pages are updated before amdgpu_cs_submit for gfx or before user queues are resumed for kfd. If userptr pages are updated, for gfx, amdgpu_cs_ioctl will restart from scratch, for kfd, restore worker is rescheduled to retry. HMM simplify the CPU page table concurrent update check, so remove guptasklock, mmu_invalidations, last_set_pages fields from amdgpu_ttm_tt struct. HMM does not pin the page (increase page ref count), so remove related operations like release_pages(), put_page(), mark_page_dirty(). Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: use HMM callback to replace mmu notifierPhilip Yang2019-05-244-98/+72Star
| | | | | | | | | | | | | | | | | | | | | | | | | | Replace our MMU notifier with hmm_mirror_ops.sync_cpu_device_pagetables callback. Enable CONFIG_HMM and CONFIG_HMM_MIRROR as a dependency in DRM_AMDGPU_USERPTR Kconfig. It supports both KFD userptr and gfx userptr paths. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: Use heavy weight for tlb invalidation on xgmi configurationshaoyunl2019-05-241-27/+26Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a bug found in vml2 xgmi logic: mtype is always sent as NC on the VMC to TC interface for a page walk, regardless of whether the request is being sent to local or remote GPU. NC means non-coherent and will cause the VMC return data to be cached in the TCC (versus UC – uncached will not cache the data). Since the page table updates are being done by SDMA/HDP, then TCC will never be updated and the GC VML2 will continue to hit on the TCC and never get the updated page tables and result in a fault. Heave weigh tlb invalidation does a WB/INVAL of the L1/L2 GL data caches so TCC will not be hit on next request Signed-off-by: shaoyunl <Shaoyun.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: use pcie_bandwidth_available rather than open coding itAlex Deucher2019-05-241-39/+2Star
| | | | | | | | | | | | | | | | | | It does the same thing we were doing already. I though it needed work for gen3/4 speeds, but that seems to be covered already. Reviewed-by: Evan Quan <evan.quan@amd.com> Acked-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: use div64_ul for 32-bit compatibility v1Slava Abramov2019-05-241-2/+2
| | | | | | | | | | | | | | | | | | v1: replace casting to unsigned long with div64_ul Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Slava Abramov <slava.abramov@amd.com> Tested-by: Slava Abramov <slava.abramov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: fix spelling mistake "retrived" -> "retrieved"Colin Ian King2019-05-241-1/+1
| | | | | | | | | | | | | | There is a spelling mistake in a DRM_ERROR error message. Fix this. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu/vega20: use mode1 reset for RAS and XGMIAlex Deucher2019-05-241-0/+9
| | | | | | | | | | | | | | | | If RAS or XGMI are enabled, you have to use mode1 reset rather than BACO. Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amd/powerplay: support ppfeatures sysfs interface on sw smu routineEvan Quan2019-05-241-2/+8
| | | | | | | | | | | | | | | | | | Support ppfeatures sysfs interface on Vega20 sw smu routine. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: Report firmware versions with sysfs v2Ori Messinger2019-05-243-0/+70
| | | | | | | | | | | | | | | | | | | | | | Firmware versions can be found as separate sysfs files at: /sys/class/drm/cardX/device/fw_version (where X is the card number) The firmware versions are displayed in hexadecimal. v2: Moved sysfs files to subfolder Signed-off-by: Ori Messinger <ori.messinger@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: make VCN DPG pause mode detached from general VCNLeo Liu2019-05-243-129/+135
| | | | | | | | | | | | | | | | It should be attached to VCN 1.0 Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: move the VCN DPG mode read and write to VCNLeo Liu2019-05-242-21/+21
| | | | | | | | | | | | | | | | Since this is VCN specific and only used by VCN Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu/sriov: Need to initialize the HDP_NONSURFACE_BAStETiecheng Zhou2019-05-241-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | it requires to initialize HDP_NONSURFACE_BASE, so as to avoid using the value left by a previous VM under sriov scenario. v2: it should not hurt baremetal, generalize it for both sriov and baremetal Signed-off-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Tiecheng Zhou <Tiecheng.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: suppress repeating tmo reportMonk Liu2019-05-241-0/+2
| | | | | | | | | | | | | | | | | | | | only report once per TMO job and the timer would be restarted upon the job finished if it's just slow. Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: remove static GDS, GWS and OA allocationChristian König2019-05-247-137/+28Star
| | | | | | | | | | | | | | | | As far as we know this was never used by userspace and so should be removed. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: check no_user_fence flag for enginesLeo Liu2019-05-241-4/+2Star
| | | | | | | | | | | | | | | | | | To replace checking ring type and make them generic Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu/VCN: set no_user_fence flag to trueLeo Liu2019-05-241-0/+3
| | | | | | | | | | | | | | | | | | There is no user fence support for VCN Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu/VCE: set no_user_fence flag to trueLeo Liu2019-05-243-0/+4
| | | | | | | | | | | | | | | | | | There is no user fence support for VCE Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu/UVD: set no_user_fence flag to trueLeo Liu2019-05-244-0/+7
| | | | | | | | | | | | | | | | | | There is no user fence support for UVD Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: add no_user_fence flag to ring funcsLeo Liu2019-05-241-0/+1
| | | | | | | | | | | | | | | | | | So we can generalize the no user fence supported engine Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: sdma handle ras resumexinhui pan2019-05-241-1/+18
| | | | | | | | | | | | | | | | | | | | | | During S3/S4 bootloader will re-init ras state behind us. Resume might fail or raise a gpu reset. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Tested-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: gfx handle ras resumexinhui pan2019-05-241-1/+19
| | | | | | | | | | | | | | | | | | | | | | During S3/S4 bootloader will re-init ras state behind us. Resume might fail or raise a gpu reset. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Tested-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: gmc handle ras resumexinhui pan2019-05-241-1/+18
| | | | | | | | | | | | | | | | | | | | | | During S3/S4 bootloader will re-init ras state behind us. Resume might fail or raise a gpu reset. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Tested-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: enable ras suspend/resumexinhui pan2019-05-241-0/+4
| | | | | | | | | | | | | | | | | | | | suspend/resume will change ras state behind us. Let driver get notified. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Tested-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: ras support suspend/resumexinhui pan2019-05-243-8/+20
| | | | | | | | | | | | | | | | | | | | add ras suspend function. rename ras_post_init to amdgpu_ras_resume. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Tested-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: add badpages sysfs interafcexinhui pan2019-05-242-0/+147
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | add badpages node. it will output badpages list in format gpu pfn : gpu page size : flags example 0x00000000 : 0x00001000 : R 0x00000001 : 0x00001000 : R 0x00000002 : 0x00001000 : R 0x00000003 : 0x00001000 : R 0x00000004 : 0x00001000 : R 0x00000005 : 0x00001000 : R 0x00000006 : 0x00001000 : R 0x00000007 : 0x00001000 : P 0x00000008 : 0x00001000 : P 0x00000009 : 0x00001000 : P flags can be one of below characters R: reserved. P: pending for reserve. F: failed to reserve for some reasons. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: Fix S3 test issueJames Zhu2019-05-241-9/+5Star
| | | | | | | | | | | | | | | | | | | | | | During S3 test, when system wake up and resume, ras interface is already allocated. Move workaround before ras jumps to resume step in gfx_v9_0_ecc_late_init, and make sure workaround applied during resume. Also remove unused mmGB_EDC_MODE clearing. Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>