NetBSD Problem Report #52858

From martin@duskware.de  Sun Dec 24 20:26:42 2017
Return-Path: <martin@duskware.de>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id BC3527A188
	for <gnats-bugs@gnats.NetBSD.org>; Sun, 24 Dec 2017 20:26:42 +0000 (UTC)
From: martin@NetBSD.org
Reply-To: martin@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: kernel lock up
X-Send-Pr-Version: 3.95

>Number:         52858
>Category:       kern
>Synopsis:       kernel lock up
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    riastradh
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Dec 24 20:30:00 +0000 2017
>Closed-Date:    Sun Jul 26 16:18:37 +0000 2020
>Last-Modified:  Sun Jul 26 16:18:37 +0000 2020
>Originator:     Martin Husemann
>Release:        NetBSD 8.99.9
>Organization:
The NetBSD Foundation, Inc.
>Environment:
System: NetBSD thirdstage.duskware.de 8.99.9 NetBSD 8.99.9 (MODULAR) #40: Sat Dec 23 21:52:55 CET 2017 martin@thirdstage.duskware.de:/usr/src/sys/arch/sparc64/compile/MODULAR sparc64
Architecture: sparc64
Machine: sparc64
>Description:

My machine "randomly" locked up (happened only once, no idea how to reproduce)

ddb backtrace shows:

intr_list_handler(10473bc10, 7, e0047620, 8000000000000000, 1042d60, 10473bb30) at netbsd:intr_list_handler+0x10
sparc_interrupt(1, 7, e00476d0, 8000000000000000, 6, e0048000) at netbsd:sparc_interrupt+0x294
sparc_interrupt(103b915b0, 70000000001, ff070000000001, 18b8400, 6, e0048000) at netbsd:sparc_interrupt+0x294
pool_grow(103b915b0, 2, 18d8800, 18b8400, 0, 2000) at netbsd:pool_grow+0x508
pool_catchup(103b91500, 103b915b1, 18e1800, 18e0c00, 8e7, 105823a20) at netbsd:pool_catchup+0x20
pool_get(18e0d00, 2, 105b1f000, 103b915b0, 105120780, 103b91500) at netbsd:pool_get+0x550
pool_cache_get_slow(103b91740, 7, e0047bb8, 104a41bc0, 2, 103b91500) at netbsd:pool_cache_get_slow+0x1b8
pool_cache_get_paddr(103b91500, 2, 104a41bc0, 1858730, 7, 103b91740) at netbsd:pool_cache_get_paddr+0x298
bge_newbuf_std(1046a2000, 105, 104a41b20, 104a2bed8, 1ce9000, 1046a2828) at netbsd:bge_newbuf_std+0x190
bge_intr(1046a2000, 6, 60000, 1cf34c000, 600e, 6) at netbsd:bge_intr+0xbfc
intr_biglock_wrapper(103b4e548, 0, e0047ed0, 18e0c00, 1042dc0, 103b91500) at netbsd:intr_biglock_wrapper+0x10
sparc_interrupt(1c9e098, 105823a20, ff070000000001, 18d2800, 0, 103ae8280) at netbsd:sparc_interrupt+0x294



>How-To-Repeat:
n/a

>Fix:
n/a

>Release-Note:

>Audit-Trail:
From: coypu@sdf.org
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/52858: kernel lock up
Date: Mon, 25 Dec 2017 02:45:10 +0000

 It looks like we can spin forever in pool_catchup if a PR_WAITOK
 allocation goes to sleep and is followed by a PR_NOWAIT allocation.


 Single CPU, no kpreemption arch

 [lwp #1]
    |
 [  ??  ]
    |
 [pool_grow with PR_WAITOK
 [set PR_GROWING
 [allocation, decide to sleep
    |
   zzZzzZ                          [lwp #2]
                                      |
                                   [  ??  ]
                                      |
                                   [pool_catchup with PR_NOWAIT
                                   [see PR_GROWING already set,
                                   [spin forever returning ERESTART
                                   [(nothing ever preempts me or
                                    increases the pool items)
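
 The scenario in the diagram can be modelled with a small user-space
 sketch. This is only a toy model, not the code in subr_pool.c: the
 names toy_pool, toy_pool_grow, toy_pool_catchup and the TOY_*
 constants are hypothetical, and the max_spins bound exists only so
 that the model terminates where the kernel would not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real definitions from the kernel. */
enum { TOY_OK = 0, TOY_ERESTART = 1 };
enum { TOY_WAITOK = 0x1, TOY_NOWAIT = 0x2 };

struct toy_pool {
	bool growing;       /* models PR_GROWING */
	unsigned nitems;    /* items currently available */
	unsigned minitems;  /* level that catchup tries to reach */
};

/*
 * Models the behaviour described in the report: while another grow is
 * in progress, the caller is told to restart, even if it cannot sleep.
 */
static int
toy_pool_grow(struct toy_pool *pp, int flags)
{
	assert(pp != NULL);
	(void)flags;            /* the model ignores WAITOK vs NOWAIT */
	if (pp->growing)
		return TOY_ERESTART;
	pp->growing = true;
	pp->nitems += 8;        /* pretend one page of items was added */
	pp->growing = false;
	return TOY_OK;
}

/*
 * Retries toy_pool_grow until the pool has enough items. Returns the
 * number of ERESTART retries. With no preemption, nobody else can run
 * to clear pp->growing, so only max_spins stops the loop.
 */
static unsigned
toy_pool_catchup(struct toy_pool *pp, unsigned max_spins)
{
	unsigned spins = 0;

	while (pp->nitems < pp->minitems && spins < max_spins) {
		if (toy_pool_grow(pp, TOY_NOWAIT) == TOY_ERESTART)
			spins++;
	}
	return spins;
}
```

 Calling toy_pool_catchup on a pool whose growing flag was left set by
 a sleeping lwp uses up the whole max_spins budget without adding an
 item, which corresponds to lwp #2 spinning in the diagram.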

Responsible-Changed-From-To: kern-bug-people->riastradh
Responsible-Changed-By: maya@NetBSD.org
Responsible-Changed-When: Mon, 25 Dec 2017 02:57:20 +0000
Responsible-Changed-Why:
Ping, you might know about this. Also, note that christos added a change to get correct ERESTART behaviour following your commit.


From: Ryota Ozaki <ozaki-r@netbsd.org>
To: "gnats-bugs@NetBSD.org" <gnats-bugs@netbsd.org>
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/52858: kernel lock up
Date: Tue, 26 Dec 2017 18:12:10 +0900

 On Mon, Dec 25, 2017 at 11:50 AM,  <coypu@sdf.org> wrote:
 > The following reply was made to PR kern/52858; it has been noted by GNATS.
 >
 > From: coypu@sdf.org
 > To: gnats-bugs@NetBSD.org
 > Cc:
 > Subject: Re: kern/52858: kernel lock up
 > Date: Mon, 25 Dec 2017 02:45:10 +0000
 >
 >  It looks like we can spin forever in pool_catchup if we have PR_WAITOK
 >  allocation sleeping followed by a PR_NOWAIT allocation.
 >
 >
 >  Single CPU, no kpreemption arch
 >
 >  [lwp #1]
 >     |
 >  [  ??  ]
 >     |
 >  [pool_grow with PR_WAITOK
 >  [set PR_GROWING
 >  [allocation, decide to sleep
 >     |
 >    zzZzzZ                          [lwp #2]
 >                                       |
 >                                    [  ??  ]
 >                                       |
 >                                    [pool_catchup with PR_NOWAIT
 >                                    [see PR_GROWING already set,
 >                                    [spin forever returning ERESTART
 >                                    [(nothing ever preempts me or
 >                                    increases the pool items)
 >

 FYI: similar backtrace here (on amd64 though):
   http://mail-index.netbsd.org/source-changes-d/2017/12/26/msg009751.html

   ozaki-r

State-Changed-From-To: open->feedback
State-Changed-By: prlw1@NetBSD.org
State-Changed-When: Sun, 26 Jul 2020 15:28:17 +0000
State-Changed-Why:
The bug discussed in the link ozaki-r added above looks very similar and
was fixed by

  http://cvsweb.netbsd.org/cgi-bin/cvsweb.cgi/src/sys/kern/subr_pool.c#rev1.220

Issue resolved?


State-Changed-From-To: feedback->closed
State-Changed-By: martin@NetBSD.org
State-Changed-When: Sun, 26 Jul 2020 16:18:37 +0000
State-Changed-Why:
I never had a way to trigger it reliably and have not seen it again,
so I cannot verify the fix.


>Unformatted:
