NetBSD Problem Report #56565

From martin@aprisoft.de  Tue Dec 21 15:15:13 2021
Return-Path: <martin@aprisoft.de>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id CE5AF1A923B
	for <gnats-bugs@gnats.NetBSD.org>; Tue, 21 Dec 2021 15:15:13 +0000 (UTC)
Message-Id: <20211221151504.21F855CC847@emmas.aprisoft.de>
Date: Tue, 21 Dec 2021 16:15:04 +0100 (CET)
From: martin@NetBSD.org
Reply-To: martin@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: amdgpu crashes
X-Send-Pr-Version: 3.95

>Number:         56565
>Category:       kern
>Synopsis:       amdgpu crashes
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Dec 21 15:20:00 +0000 2021
>Closed-Date:    Wed Feb 16 20:22:07 +0000 2022
>Last-Modified:  Wed Feb 16 20:22:07 +0000 2022
>Originator:     Martin Husemann
>Release:        NetBSD 9.99.92
>Organization:
The NetBSD Foundation, Inc.
>Environment:
System: NetBSD martins.aprisoft.de 9.99.92 NetBSD 9.99.92 (GENERIC) #92: Mon Dec 20 14:35:23 CET 2021 martin@martins.aprisoft.de:/usr/src/sys/arch/amd64/compile/GENERIC amd64
Architecture: x86_64
Machine: amd64
>Description:

My machine has the following graphics card:

041:00:0: ATI Technologies Radeon RX 470/480/570/570X/580/580X/590 (VGA display, revision 0xc7)
041:00:0: 0x67df1002 (0x030000c7)

When I enable amdgpu in the kernel config (and build the firmware for it),
I get a kernel fault at init time.

It crashes in:


ttm_bo_validate+0x11a
0xffffffff811da570 is in ttm_bo_validate (../../../../external/bsd/drm2/dist/drm/ttm/ttm_bo.c:926).
921	void ttm_bo_mem_put(struct ttm_buffer_object *bo, struct ttm_mem_reg *mem)
922	{
923		struct ttm_mem_type_manager *man = &bo->bdev->man[mem->mem_type];
924	
925		if (mem->mm_node)
926			(*man->func->put_node)(man, mem);
927	}


ttm_bo_init_reserved+0x35a
0xffffffff811da908 is in ttm_bo_init_reserved (../../../../external/bsd/drm2/dist/drm/ttm/ttm_bo.c:1395).

amdgpu_bo_do_create+0x1cd
0xffffffff8090d4ec is in amdgpu_bo_do_create (../../../../external/bsd/drm2/dist/drm/amd/amdgpu/amdgpu_object.c:585).


Unfortunately I couldn't get a crash dump.


>How-To-Repeat:
See above.

>Fix:
n/a

>Release-Note:

>Audit-Trail:
From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/56565 CVS commit: src/sys/external/bsd/drm2/dist/drm/ttm
Date: Mon, 14 Feb 2022 09:25:39 +0000

 Module Name:	src
 Committed By:	riastradh
 Date:		Mon Feb 14 09:25:39 UTC 2022

 Modified Files:
 	src/sys/external/bsd/drm2/dist/drm/ttm: ttm_bo.c

 Log Message:
 drm/ttm: Avoid uninitialized mem in error branch.

 Not sure why this error branch is getting hit, but let's not make the
 problem worse by choking on stack garbage.

 Candidate fix for symptom of PR kern/56565, PR kern/56711.
 Underlying problem -- that ttm_bo_mem_space fails with ENOMEM --
 remains.


 To generate a diff of this commit:
 cvs rdiff -u -r1.30 -r1.31 src/sys/external/bsd/drm2/dist/drm/ttm/ttm_bo.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.
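
The failure mode named in the log message above (an error branch that
reads a stack variable nobody initialized) can be sketched in isolation.
This is a simplified model, not the committed diff: struct mem_reg,
struct mem_manager, put_node, mem_put, mem_space_fails and
validate_sketch are hypothetical stand-ins for the TTM types and
functions, and the exact placement of the initialization in ttm_bo.c is
an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins for the TTM types; not the real definitions. */
struct mem_reg {
	void *mm_node;	/* non-NULL only when a node has been allocated */
	int   mem_type;
};

struct mem_manager {
	int put_calls;	/* counts how often a node was released */
};

/* Hypothetical release hook, standing in for man->func->put_node. */
static void
put_node(struct mem_manager *man, struct mem_reg *mem)
{
	assert(mem->mm_node != NULL);
	man->put_calls++;
	mem->mm_node = NULL;
}

/* Same shape as ttm_bo_mem_put in the gdb listing in the description. */
static void
mem_put(struct mem_manager *man, struct mem_reg *mem)
{
	if (mem->mm_node)
		put_node(man, mem);
}

/* Stand-in for ttm_bo_mem_space failing before it fills in mem. */
static int
mem_space_fails(struct mem_reg *mem)
{
	(void)mem;
	return -ENOMEM;
}

static int
validate_sketch(struct mem_manager *man)
{
	struct mem_reg mem;
	int ret;

	/* Without these two lines the error branch reads stack garbage. */
	mem.mm_node = NULL;
	mem.mem_type = 0;

	ret = mem_space_fails(&mem);
	if (ret)
		goto out;
	/* ... the success path would move the buffer here ... */
out:
	mem_put(man, &mem);	/* a no-op when nothing was allocated */
	return ret;
}
```

With the initialization in place, validate_sketch returns -ENOMEM and
never reaches put_node. The underlying allocation failure is still
reported to the caller, which matches the note that the ENOMEM itself
remains to be explained.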

State-Changed-From-To: open->closed
State-Changed-By: andvar@NetBSD.org
State-Changed-When: Wed, 16 Feb 2022 20:22:07 +0000
State-Changed-Why:
fixed with linux_dma_resv.c r1.22 and other recent drm changes by riastradh

>Unformatted:

Copyright © 1994-2020 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.