NetBSD Problem Report #54845

From gson@gson.org  Wed Jan  8 14:30:36 2020
Return-Path: <gson@gson.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 56FCC7A154
	for <gnats-bugs@gnats.NetBSD.org>; Wed,  8 Jan 2020 14:30:36 +0000 (UTC)
Message-Id: <20200108143031.86B86253F3E@guava.gson.org>
Date: Wed,  8 Jan 2020 16:30:31 +0200 (EET)
From: gson@gson.org (Andreas Gustafsson)
Reply-To: gson@gson.org (Andreas Gustafsson)
To: gnats-bugs@NetBSD.org
Subject: sparc panics in sleepq_remove
X-Send-Pr-Version: 3.95

>Number:         54845
>Category:       port-sparc
>Synopsis:       sparc panics in sleepq_remove
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    ad
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Jan 08 14:35:00 +0000 2020
>Closed-Date:    Wed May 20 11:02:37 +0000 2020
>Last-Modified:  Wed May 20 11:02:37 +0000 2020
>Originator:     Andreas Gustafsson
>Release:        NetBSD-current
>Organization:

>Environment:
System: NetBSD
Architecture: sparc
Machine: sparc
>Description:

The last few sparc test runs on the TNF testbed have panicked with:

  panic: TAILQ_PREREMOVE head 0xf0002008 elm 0xf074c040 /tmp/bracket/build/2020.01.06.01.37.57-sparc/src/sys/kern/kern_sleepq.c:117

For example:

  http://releng.netbsd.org/b5reports/sparc/2020/2020.01.06.01.37.57/test.log
  http://releng.netbsd.org/b5reports/sparc/2020/2020.01.06.21.03.24/test.log
  http://releng.netbsd.org/b5reports/sparc/2020/2020.01.07.10.20.07/test.log

>How-To-Repeat:

>Fix:

>Release-Note:

>Audit-Trail:
From: Andrew Doran <ad@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: port-sparc-maintainer@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org
Subject: Re: port-sparc/54845: sparc panics in sleepq_remove
Date: Wed, 8 Jan 2020 15:11:26 +0000

 > The last few sparc test runs on the TNF testbed have panicked with:
 > 
 >   panic: TAILQ_PREREMOVE head 0xf0002008 elm 0xf074c040 /tmp/bracket/build/2020.01.06.01.37.57-sparc/src/sys/kern/kern_sleepq.c:117
 > 
 > For example:
 > 
 >   http://releng.netbsd.org/b5reports/sparc/2020/2020.01.06.01.37.57/test.log
 >   http://releng.netbsd.org/b5reports/sparc/2020/2020.01.06.21.03.24/test.log
 >   http://releng.netbsd.org/b5reports/sparc/2020/2020.01.07.10.20.07/test.log

 I recall seeing a very similar PR from you a month or two ago, on sparc.  It
 looks like something has corrupted the xcall condition variable at the top
 of struct cpu_data/cpu_info.  Another assembly wrap around problem perhaps?

 Andrew


 [ 4836.8778090] panic: TAILQ_PREREMOVE head 0xf0002008 elm 0xf074c040 /tmp/bracket/build/2020.01.06.21.03.24-sparc/src/sys/kern/kern_sleepq.c:117
 [ 4836.8778090] cpu0: Begin traceback...
 [ 4836.8778090] 0x0(0xf04542d0, 0xf53e0c10, 0xf0552000, 0xf0552c00, 0x104, 0xf0552c78) at netbsd:panic+0x20
 [ 4836.8778090] panic(0xf04542d0, 0xf0002008, 0xf074c040, 0xf049ca28, 0x75, 0xf052fc00) at netbsd:sleepq_remove+0x21c
 [ 4836.8778090] sleepq_remove(0xf0002008, 0xf074c040, 0xf083f3a0, 0xf049a400, 0x206, 0xf074c040) at netbsd:cv_wakeup_one+0x84
 [ 4836.8778090] cv_wakeup_one(0xf0002008, 0xf049a530, 0xf0449530, 0xf0526b3c, 0xf074c040, 0xf052fc00) at netbsd:xc_broadcast+0x1ec
 [ 4836.8778090] xc_broadcast(0x0, 0x3, 0x1, 0xf0537038, 0x0, 0xf0536380) at netbsd:xc_barrier+0x14
 [ 4836.8778090] xc_barrier(0x0, 0xf0a6a960, 0xf083f3a0, 0xf049a400, 0x20e, 0x7c) at netbsd:pool_cache_invalidate_groups+0x5c
 [ 4836.8778090] pool_cache_invalidate_groups(0xf06f86e0, 0xf0858d58, 0xf0be3220, 0xf0858d64, 0xf0a6a960, 0x0) at netbsd:pool_cache_invalidate+0xa4
 [ 4836.8778090] pool_cache_invalidate(0xf06f86e0, 0xf085aae8, 0xf083f3a0, 0xf04a2800, 0xf0858d58, 0x0) at netbsd:pool_reclaim+0x64
 [ 4836.8778090] pool_reclaim(0xf06f86e0, 0xf049a530, 0xf06f8754, 0xf0526b3c, 0xf083f3a0, 0xf0533f40) at netbsd:pool_drain+0x6c
 [ 4836.8778090] pool_drain(0xf53e0f4c, 0xf049a530, 0xf53e0f4c, 0xf0527800, 0xf0552400, 0xf06f86e0) at netbsd:uvmpd_pool_drain_thread+0xec
 [ 4836.8778090] uvmpd_pool_drain_thread(0xf054d1d0, 0xf05232a8, 0xf0536800, 0xf052dbc0, 0x0, 0x0) at netbsd:lwp_trampoline+0x8
 [ 4836.8778090] cpu0: End traceback...
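
The panic message above comes from a consistency check made just before an
element is removed from a TAILQ. The following is a minimal userland sketch
of that kind of check, assuming it verifies that an element with no successor
is the one the head's tqh_last points at (my understanding of the debug macro
in sys/queue.h, not confirmed in this PR). The names struct waiter,
struct waitq, preremove_ok() and simulate_remove() are hypothetical and are
not kernel code; the stray write only imitates the suspected corruption of
the queue head.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-in for a sleeping LWP on a sleep queue. */
struct waiter {
	int id;
	TAILQ_ENTRY(waiter) link;
};
TAILQ_HEAD(waitq, waiter);

/*
 * Sketch of a pre-remove sanity check: if elm has no successor, it
 * must be the tail, so the head's tqh_last must point at elm's own
 * next pointer.  Returns false where a debug kernel would panic.
 */
static bool
preremove_ok(struct waitq *head, struct waiter *elm)
{
	if (elm->link.tqe_next == NULL &&
	    head->tqh_last != &elm->link.tqe_next)
		return false;
	return true;
}

/*
 * Build a two-element queue, optionally overwrite the head's tail
 * pointer (imitating a stray write over the queue head), and report
 * whether removing the last element would pass the check.
 */
static bool
simulate_remove(bool corrupt_head)
{
	struct waitq q;
	struct waiter a = { .id = 1 }, b = { .id = 2 };
	struct waiter *stray = NULL;

	TAILQ_INIT(&q);
	TAILQ_INSERT_TAIL(&q, &a, link);
	TAILQ_INSERT_TAIL(&q, &b, link);
	assert(preremove_ok(&q, &b));	/* an intact queue passes */

	if (corrupt_head)
		q.tqh_last = &stray;	/* simulated corruption */

	return preremove_ok(&q, &b);
}
```

Calling simulate_remove(false) passes the check, while simulate_remove(true)
fails it. The second case is the situation the panic would report if
something had overwritten the queue head inside the condition variable.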

From: Andreas Gustafsson <gson@gson.org>
To: Andrew Doran <ad@netbsd.org>
Cc: gnats-bugs@netbsd.org
Subject: Re: port-sparc/54845: sparc panics in sleepq_remove
Date: Wed, 8 Jan 2020 17:38:10 +0200

 Andrew Doran wrote:
 >  I recall seeing a very similar PR from you a month or two ago, on sparc.

 Ah yes, that would be PR 54734, "sparc panics running ATF tests".
 There have been so many regressions in the last couple of months that
 it's hard to keep track of them all.  Whether or not this is the same
 bug, at least it is new in the sense that things worked in between:

   2019.11.22.23.38.15 
   2019.11.23.17.32.10 
   2019.12.01.13.20.42 
   2019.12.02.08.33.52 panicked
   2019.12.04.19.51.32 panicked
   2019.12.06.08.40.33 panicked
   2019.12.06.21.45.14 panicked
   2019.12.07.16.00.36 panicked
   2019.12.07.19.38.29 panicked
   2019.12.12.16.49.20 panicked
   2019.12.30.22.13.47 
   2020.01.02.14.33.55 
   2020.01.03.03.44.42 
   2020.01.04.02.21.15 
   2020.01.05.00.03.27 
   2020.01.06.01.37.57 panicked
   2020.01.06.21.03.24 panicked
   2020.01.07.10.20.07 panicked

 -- 
 Andreas Gustafsson, gson@gson.org

Responsible-Changed-From-To: port-sparc-maintainer->ad
Responsible-Changed-By: ad@NetBSD.org
Responsible-Changed-When: Wed, 08 Jan 2020 20:50:06 +0000
Responsible-Changed-Why:
I'll take a look.


From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-sparc/54845: sparc panics in sleepq_remove
Date: Wed, 20 May 2020 13:57:41 +0300

 The sparc tests are running to completion again since this commit:

   2020.05.17.17.12.28 ad src/sys/uvm/uvm_page.c 1.236

 -- 
 Andreas Gustafsson, gson@gson.org

State-Changed-From-To: open->closed
State-Changed-By: gson@NetBSD.org
State-Changed-When: Wed, 20 May 2020 11:02:37 +0000
State-Changed-Why:
Fixed, thanks.


>Unformatted:
