NetBSD Problem Report #58994
From www@netbsd.org Wed Jan 15 09:20:40 2025
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
key-exchange X25519 server-signature RSA-PSS (2048 bits)
client-signature RSA-PSS (2048 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 54F061A923B
for <gnats-bugs@gnats.NetBSD.org>; Wed, 15 Jan 2025 09:20:40 +0000 (UTC)
Message-Id: <20250115092038.D752D1A923C@mollari.NetBSD.org>
Date: Wed, 15 Jan 2025 09:20:38 +0000 (UTC)
From: kikadf.01@gmail.com
Reply-To: kikadf.01@gmail.com
To: gnats-bugs@NetBSD.org
Subject: System panic with amdgpu and startx
X-Send-Pr-Version: www-1.0
>Number: 58994
>Category: kern
>Synopsis: System panic with amdgpu and startx
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Jan 15 09:25:00 +0000 2025
>Last-Modified: Wed Mar 12 09:45:01 +0000 2025
>Originator: Robert Bagdan
>Release: 10.1
>Organization:
-
>Environment:
NetBSD yamato 10.1 NetBSD 10.1 (GENERIC) #0: Mon Dec 16 13:08:11 UTC 2024 mkrepro@mkrepro.NetBSD.org:/usr/src/sys/arch/amd64/compile/GENERIC amd64
>Description:
I have a PC with an AMD Ryzen 7700X CPU and an AMD RX 550 GPU. If I boot NetBSD with the amdgpu driver loaded, the system panics when I run startx.
My boot entry from boot.cfg:
menu=Boot with amdgpu:rndseed /var/db/entropy-file;load drmkms_sched;load amdgpu;boot
The GPU-relevant parts of dmesg:
[ 1.018418] amdgpu0 at pci1 dev 0 function 0: ATI Technologies Radeon 540/540X/550/550X / RX 540X/550/550X (rev. 0xc7)
[ 2.732670] [drm] initializing kernel modesetting (POLARIS12 0x1002:0x699F 0x1002:0x0B04 0xC7).
[ 2.732670] [drm] register mmio base: 0xF6F00000
[ 2.732670] [drm] register mmio size: 262144
[ 2.732670] [drm] PCIE atomic ops is not supported
[ 2.732670] [drm] add ip block number 0 <vi_common>
[ 2.732670] [drm] add ip block number 1 <gmc_v8_0>
[ 2.732670] [drm] add ip block number 2 <tonga_ih>
[ 2.732670] [drm] add ip block number 3 <gfx_v8_0>
[ 2.732670] [drm] add ip block number 4 <sdma_v3_0>
[ 2.732670] [drm] add ip block number 5 <powerplay>
[ 2.741941] [drm] add ip block number 6 <dm>
[ 2.741941] [drm] add ip block number 7 <uvd_v6_0>
[ 2.741941] [drm] add ip block number 8 <vce_v3_0>
[ 2.741941] ATOM BIOS: xxx-xxx-xxx
[ 2.741941] [drm] UVD is enabled in VM mode
[ 2.751729] [drm] UVD ENC is enabled in VM mode
[ 2.751729] [drm] VCE enabled in VM mode
[ 2.751729] [drm] vm size is 256 GB, 2 levels, block size is 10-bit, fragment size is 9-bit
[ 2.751729] amdgpu0: VRAM: 2048M 0x000000F400000000 - 0x000000F47FFFFFFF (2048M used)
[ 2.751729] amdgpu0: GART: 256M 0x000000FF00000000 - 0x000000FF0FFFFFFF
[ 2.751729] [drm] Detected VRAM RAM=2048M, BAR=4096M
[ 2.762173] [drm] RAM width 64bits GDDR5
[ 2.762173] Zone kernel: Available graphics memory: 9007199248635332 KiB
[ 2.762173] Zone dma32: Available graphics memory: 2097152 KiB
[ 2.762173] [drm] amdgpu: 2048M of VRAM memory ready
[ 2.762173] [drm] amdgpu: 3072M of GTT memory ready.
[ 2.762173] [drm] GART: num cpu pages 65536, num gpu pages 65536
[ 2.762173] [drm] PCIE GART of 256M enabled (table at 0x000000F400300000).
[ 2.772190] amdgpu0: interrupting at msi10 vec 0 (amdgpu0)
[ 2.772190] [drm] Chained IB support enabled!
[ 2.772190] hwmgr_sw_init smu backed is polaris10_smu
[ 2.772190] [drm] Found UVD firmware Version: 1.130 Family ID: 16
[ 2.781274] [drm] Found VCE firmware Version: 35.1a Binary ID: 3
[ 2.871273] [drm] DM_PPLIB: values for Engine clock
[ 2.871273] [drm] DM_PPLIB: 214000
[ 2.881273] [drm] DM_PPLIB: 551000
[ 2.891272] [drm] DM_PPLIB: 734000
[ 2.891272] [drm] DM_PPLIB: 980000
[ 2.901272] [drm] DM_PPLIB: 1000000
[ 2.911272] [drm] DM_PPLIB: 1020000
[ 2.911272] [drm] DM_PPLIB: 1046000
[ 2.921272] [drm] DM_PPLIB: 1071000
[ 2.931272] [drm] DM_PPLIB: Validation clocks:
[ 2.931272] [drm] DM_PPLIB: engine_max_clock: 107100
[ 2.941272] [drm] DM_PPLIB: memory_max_clock: 150000
[ 2.951272] [drm] DM_PPLIB: level : 8
[ 2.951272] [drm] DM_PPLIB: values for Memory clock
[ 2.961271] [drm] DM_PPLIB: 300000
[ 2.971271] [drm] DM_PPLIB: 625000
[ 2.971271] [drm] DM_PPLIB: 1500000
[ 2.981271] [drm] DM_PPLIB: Validation clocks:
[ 2.991271] [drm] DM_PPLIB: engine_max_clock: 107100
[ 2.991271] [drm] DM_PPLIB: memory_max_clock: 150000
[ 3.001271] [drm] DM_PPLIB: level : 8
[ 3.011271] [drm] Display Core initialized with v3.2.69!
[ 3.061270] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[ 3.061270] [drm] Driver supports precise vblank timestamp query.
[ 3.101269] [drm] UVD and UVD ENC initialized successfully.
[ 3.201268] [drm] VCE initialized successfully.
[ 3.211268] amdgpufb0 at amdgpu0
[ 3.221268] [drm] Initialized amdgpu 3.36.0 20150101 for amdgpu0 on minor 0
[ 3.231268] amdgpufb0: framebuffer at 0xfb00830000, size 1920x1080, depth 32, stride 7680
[ 3.331266] wsdisplay0 at amdgpufb0 kbdmux 1: console (default, vt100 emulation), using wskbd0
Full dmesg output: https://pastebin.com/raw/PtCdSVAR
Full Xorg log from the crash: https://pastebin.com/raw/7brJjEh1
When the OS crashes, the crash dump fails with an error, so /var/crash is empty:
dump ld0: I/O error
i/o error
I don't know whether this dump error is related to the amdgpu crash, but here is the disk:
# nvmectl devlist
nvme0: WD Blue SN580 2TB
nvme0ns1 (1907729MB)
In normal use of NetBSD I have no problems with the disk, and as far as I can tell dmesg shows no issues with it.
I took a photo of my screen when amdgpu crashed: https://imgur.com/vJQkClM
>How-To-Repeat:
Load amdgpu kernel module, run startx.
>Fix:
>Audit-Trail:
From: Robert Bagdan <kikadf.01@gmail.com>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/58994: System panic with amdgpu and startx
Date: Wed, 12 Mar 2025 10:42:58 +0100
I read a thread on UnitedBSD where someone resolved a similar issue by
rebuilding the kernel with the two radeon-related lines commented out
and the two amdgpu lines uncommented. I did the same, modifying the
GENERIC config:
#radeon* at pci? dev ? function ?
#radeondrmkmsfb* at radeonfbbus?
amdgpu* at pci? dev ? function ?
amdgpufb* at amdgpufbbus?
but this does not work for me with the modesetting Xorg driver: I still
get a system panic when I start X, as before. (I don't understand why
this rebuild would work for anybody, though; as far as I can see it is
the same driver, only with amdgpu now built into the kernel rather than
loaded as a module.)
This is with the latest NetBSD-10 branch:
amdgpu0 at pci1 dev 0 function 0: ATI Technologies Radeon 540/540X/550/550X / RX 540X/550/550X (rev. 0xc7)
[drm] initializing kernel modesetting (POLARIS12 0x1002:0x699F 0x1002:0x0B04 0xC7).