NetBSD Problem Report #51675

From dholland@netbsd.org  Tue Nov 29 23:53:32 2016
Return-Path: <dholland@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 2DAFF7A2CC
	for <gnats-bugs@gnats.NetBSD.org>; Tue, 29 Nov 2016 23:53:32 +0000 (UTC)
Message-Id: <20161129235326.53AB6855E0@mail.netbsd.org>
Date: Tue, 29 Nov 2016 23:53:26 +0000 (UTC)
From: dholland@netbsd.org
Reply-To: dholland@netbsd.org
To: gnats-bugs@NetBSD.org
Subject: radeondrmkms wedge with supertuxkart
X-Send-Pr-Version: 3.95

>Number:         51675
>Category:       kern
>Synopsis:       radeondrmkms wedge with supertuxkart
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    riastradh
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Nov 29 23:55:00 +0000 2016
>Closed-Date:    Tue Apr 16 14:34:51 +0000 2019
>Last-Modified:  Tue Apr 16 14:34:51 +0000 2019
>Originator:     David A. Holland
>Release:        NetBSD 7.99.42 (20161125)
>Organization:
>Environment:
System: NetBSD valkyrie 7.99.42 NetBSD 7.99.42 (VALKYRIE) #20: Fri Nov 25 18:33:19 EST 2016  dholland@valkyrie:/usr/src/sys/arch/amd64/compile/VALKYRIE amd64
Architecture: x86_64
Machine: amd64
>Description:

Now that we think 50349 is fixed I updated another machine with the
same radeon card to -current, and discovered that supertuxkart wedges
the X server and/or console.

In particular, when you open supertuxkart and start a race, the X
server hangs cold. It doesn't seem to wedge the rest of the system,
although my options for investigating the wedge state are somewhat
limited at the moment.

Once when I did this supertuxkart dumped core instead, which may or
may not be a different issue.

While poking around I found that running teapot from MesaDemos dumps
core, as does glxgears; running with env RADEON_THREAD=FALSE (as
suggested in 49838) to disable the thread in the gallium driver makes
glxgears run, but not teapot. (This much requires a fix to the gallium
driver build, or it doesn't load; but that's been committed.)

I haven't tried the RADEON_THREAD=FALSE setting with supertuxkart yet,
but I will on the next attempt.

>How-To-Repeat:

>Fix:

>Release-Note:

>Audit-Trail:

Responsible-Changed-From-To: kern-bug-people->riastradh
Responsible-Changed-By: dholland@NetBSD.org
Responsible-Changed-When: Tue, 29 Nov 2016 23:56:05 +0000
Responsible-Changed-Why:
fyi
(maybe we should have a drmkms-bug-people)


State-Changed-From-To: open->closed
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Fri, 18 Jan 2019 02:53:38 +0000
State-Changed-Why:
The system crashes are fixed by sys/external/bsd/drm2/pci/drm_pci.c -r1.32.
I made a new PR for the X server segfaults: 53888.


State-Changed-From-To: closed->pending-pullups
State-Changed-By: mrg@NetBSD.org
State-Changed-When: Fri, 18 Jan 2019 06:47:39 +0000
State-Changed-Why:
this is likely a problem on both -7 and -8.  let's do pullups.


State-Changed-From-To: pending-pullups->needs-pullups
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Fri, 18 Jan 2019 07:55:55 +0000
State-Changed-Why:
concur, but they haven't been filed yet


From: David Holland <dholland@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/51675 (radeondrmkms wedge with supertuxkart)
Date: Fri, 18 Jan 2019 09:07:33 +0000

 On Fri, Jan 18, 2019 at 07:55:56AM +0000, dholland@NetBSD.org wrote:
  > concur, but they haven't been filed yet

 The merge fails on both -8 and -7 because the comment following the
 change is different; the following patch applies (to both), but
 someone who knows more about the code should eyeball the results to
 make sure it's still valid.

 Index: drm_pci.c
 ===================================================================
 RCS file: /cvsroot/src/sys/external/bsd/drm2/pci/drm_pci.c,v
 retrieving revision 1.17.2.1
 diff -u -r1.17.2.1 drm_pci.c
 --- drm_pci.c	1 Aug 2017 23:12:06 -0000	1.17.2.1
 +++ drm_pci.c	18 Jan 2019 08:19:17 -0000
 @@ -145,6 +145,14 @@
  			continue;
  		}

 +		/*
 +		 * If it's a 64-bit mapping, don't interpret the second
 +		 * half of it as another BAR in the next iteration of
 +		 * the loop -- move on to the next unit.
 +		 */
 +		if (PCI_MAPREG_MEM_TYPE(type) == PCI_MAPREG_MEM_TYPE_64BIT)
 +			unit++;
 +
  		/* Inquire about it.  We'll map it in drm_core_ioremap.  */
  		if (pci_mapreg_info(pa->pa_pc, pa->pa_tag, reg, type,
  			&bm->bm_base, &bm->bm_size, &bm->bm_flags) != 0) {


 -- 
 David A. Holland
 dholland@netbsd.org
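
The reasoning in the patch comment can be illustrated with a small standalone sketch. It is not the drm_pci.c code: the macro names, the function scan_mem_bars, and the rule that an all-zero slot is unimplemented are assumptions made for illustration. Only the bit layout (bit 0 marks I/O space, bits 2:1 encode the memory type, value 2 meaning 64-bit) comes from the PCI specification. A 64-bit BAR occupies two consecutive slots, so the walk has to step over the second slot rather than decode it as a BAR of its own.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_BARS 6	/* a type 0 PCI header has six BAR slots */

/* Hypothetical helper macros; the bit layout follows the PCI spec. */
#define BAR_IS_IO(v)	(((v) & 0x1u) != 0)	/* bit 0: I/O space */
#define BAR_MEM_TYPE(v)	(((v) >> 1) & 0x3u)	/* bits 2:1: memory type */
#define BAR_MEM_TYPE_64	0x2u			/* 64-bit memory BAR */

/*
 * Walk the BAR slots and record the slot index at which each memory
 * BAR starts.  Returns the number of memory BARs found.
 */
size_t
scan_mem_bars(const uint32_t regs[NUM_BARS], unsigned starts[NUM_BARS])
{
	size_t n = 0;

	for (unsigned unit = 0; unit < NUM_BARS; unit++) {
		uint32_t v = regs[unit];

		/* Not a memory BAR: skip it. */
		if (BAR_IS_IO(v))
			continue;

		/* Simplifying assumption: all-zero means unimplemented. */
		if (v == 0)
			continue;

		starts[n++] = unit;

		/*
		 * A 64-bit BAR keeps the upper 32 address bits in the
		 * next slot.  Step over that slot so it is not decoded
		 * as another BAR in the next iteration.
		 */
		if (BAR_MEM_TYPE(v) == BAR_MEM_TYPE_64)
			unit++;
	}
	return n;
}
```

For example, with the slots { 0xE000000C, 0x00000004, 0xFE000004, 0, 0x0000E001, 0xFEA00000 }, slot 1 holds the upper half of the first 64-bit BAR. Without the extra unit++, that value would itself decode as a 64-bit memory BAR; with it, the walk reports memory BARs starting at slots 0, 2 and 5.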

From: matthew green <mrg@eterna.com.au>
To: gnats-bugs@NetBSD.org
Cc: riastradh@NetBSD.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org,
    dholland@netbsd.org
Subject: re: kern/51675 (radeondrmkms wedge with supertuxkart)
Date: Sat, 19 Jan 2019 07:20:55 +1100

 >  The merge fails on both -8 and -7 because the comment following the
 >  change is different; the following patch applies (to both), but
 >  someone who knows more about the code should eyeball the results to
 >  make sure it's still valid.

 i've been running with this change on -8 for 11 hours now.


 .mrg.

State-Changed-From-To: needs-pullups->pending-pullups
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Sun, 20 Jan 2019 06:43:23 +0000
State-Changed-Why:
pullup-8 #1165
pullup-7 #1673


State-Changed-From-To: pending-pullups->closed
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Tue, 16 Apr 2019 14:34:51 +0000
State-Changed-Why:
Pulled up.


>Unformatted:
