Commit a657104

Yang Wang authored and alexdeucher committed
drm/amdgpu: fix gpu idle power consumption issue for gfx v12
Older versions of the MES firmware may cause abnormal GPU power consumption. When performing inference tasks on the GPU (e.g., with Ollama using ROCm), the GPU may show abnormal power consumption in idle state and incorrect GPU load information.

This issue has been fixed in firmware version 0x8b and newer.

Closes: ROCm/ROCm#5706
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 4e22a5f)
1 parent 52289ce commit a657104

1 file changed

Lines changed: 4 additions & 1 deletion

drivers/gpu/drm/amd/amdgpu/mes_v12_0.c

@@ -731,6 +731,9 @@ static int mes_v12_0_set_hw_resources(struct amdgpu_mes *mes, int pipe)
 	int i;
 	struct amdgpu_device *adev = mes->adev;
 	union MESAPI_SET_HW_RESOURCES mes_set_hw_res_pkt;
+	uint32_t mes_rev = (pipe == AMDGPU_MES_SCHED_PIPE) ?
+		(mes->sched_version & AMDGPU_MES_VERSION_MASK) :
+		(mes->kiq_version & AMDGPU_MES_VERSION_MASK);
 
 	memset(&mes_set_hw_res_pkt, 0, sizeof(mes_set_hw_res_pkt));
 
@@ -785,7 +788,7 @@ static int mes_v12_0_set_hw_resources(struct amdgpu_mes *mes, int pipe)
 	 * handling support, other queue will not use the oversubscribe timer.
 	 * handling mode - 0: disabled; 1: basic version; 2: basic+ version
 	 */
-	mes_set_hw_res_pkt.oversubscription_timer = 50;
+	mes_set_hw_res_pkt.oversubscription_timer = mes_rev < 0x8b ? 0 : 50;
 	mes_set_hw_res_pkt.unmapped_doorbell_handling = 1;
 
 	if (amdgpu_mes_log_enable) {
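
The version gate in this change can be tried outside the kernel with a small standalone sketch. Everything prefixed `demo_` or `DEMO_` below is a hypothetical stand-in rather than a kernel symbol, and the mask value `0x00000fff` is an assumption about `AMDGPU_MES_VERSION_MASK`. Only the `0x8b` threshold and the `0`/`50` timer values come from the diff.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel definitions; the mask value is assumed. */
#define DEMO_MES_VERSION_MASK   0x00000fffu
/* First firmware revision the commit message describes as fixed. */
#define DEMO_MIN_FIXED_MES_REV  0x8bu

enum demo_mes_pipe { DEMO_SCHED_PIPE = 0, DEMO_KIQ_PIPE = 1 };

/* Reduced model of the two version fields the patch reads from struct amdgpu_mes. */
struct demo_mes {
	uint32_t sched_version;
	uint32_t kiq_version;
};

/* Pick the version word for the given pipe and keep only the revision bits. */
static uint32_t demo_mes_rev(const struct demo_mes *mes, enum demo_mes_pipe pipe)
{
	assert(pipe == DEMO_SCHED_PIPE || pipe == DEMO_KIQ_PIPE);
	return (pipe == DEMO_SCHED_PIPE) ?
		(mes->sched_version & DEMO_MES_VERSION_MASK) :
		(mes->kiq_version & DEMO_MES_VERSION_MASK);
}

/* Mirror of the patched assignment: 0 for firmware older than 0x8b, else 50. */
static uint32_t demo_oversubscription_timer(uint32_t mes_rev)
{
	return mes_rev < DEMO_MIN_FIXED_MES_REV ? 0 : 50;
}
```

Under these assumptions, a scheduler-pipe version word of `0x0001008a` masks down to revision `0x8a` and yields a timer value of 0, while `0x0002008b` masks down to `0x8b` and keeps the previous value of 50.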
