NetBSD Problem Report #55895
From gson@gson.org Sun Dec 27 12:01:13 2020
Return-Path: <gson@gson.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id C61901A921F
for <gnats-bugs@gnats.NetBSD.org>; Sun, 27 Dec 2020 12:01:13 +0000 (UTC)
Message-Id: <20201227120108.99507253EDE@guava.gson.org>
Date: Sun, 27 Dec 2020 14:01:08 +0200 (EET)
From: gson@gson.org (Andreas Gustafsson)
Reply-To: gson@gson.org (Andreas Gustafsson)
To: gnats-bugs@NetBSD.org
Subject: panic: kernel diagnostic assertion "(flags & (PR_NOWAIT|PR_LIMITFAIL)) != 0" failed
X-Send-Pr-Version: 3.95
>Number: 55895
>Category: port-sparc
>Synopsis: panic: kernel diagnostic assertion "(flags & (PR_NOWAIT|PR_LIMITFAIL)) != 0" failed
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: chs
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Dec 27 12:05:00 +0000 2020
>Closed-Date: Thu Aug 05 07:53:52 +0000 2021
>Last-Modified: Thu Aug 05 07:53:52 +0000 2021
>Originator: Andreas Gustafsson
>Release: NetBSD-current
>Organization:
>Environment:
System: NetBSD
Architecture: sparc
Machine: sparc
>Description:
The TNF sparc testbed has recorded six random panics with the panic
message in the subject line in the last few months.
Here is a link to the log and the panic message for each one:
http://releng.netbsd.org/b5reports/sparc/commits-2020.08.html#2020.08.10.11.09.15
detect_unused_tests: [ 50699.6707740] panic: kernel diagnostic assertion "(flags & (PR_NOWAIT|PR_LIMITFAIL)) != 0" failed: file "/tmp/bracket/build/2020.08.10.11.09.15-sparc/src/sys/kern/subr_pool.c", line 1181
http://releng.netbsd.org/b5reports/sparc/commits-2020.09.html#2020.09.12.12.11.19
grow_16M_v1_4096: [ 13917.1767735] panic: kernel diagnostic assertion "(flags & (PR_NOWAIT|PR_LIMITFAIL)) != 0" failed: file "/tmp/build/2020.09.12.12.11.19-sparc/src/sys/kern/subr_pool.c", line 1181
http://releng.netbsd.org/b5reports/sparc/commits-2020.09.html#2020.09.13.13.03.15
ldp_regen: [ 26053.9547610] panic: kernel diagnostic assertion "(flags & (PR_NOWAIT|PR_LIMITFAIL)) != 0" failed: file "/tmp/bracket/build/2020.09.13.13.03.15-sparc/src/sys/kern/subr_pool.c", line 1181
http://releng.netbsd.org/b5reports/sparc/commits-2020.10.html#2020.10.01.02.00.04
[ 18.0342095] panic: kernel diagnostic assertion "(flags & (PR_NOWAIT|PR_LIMITFAIL)) != 0" failed: file "/tmp/build/2020.10.01.02.00.04-sparc/src/sys/kern/subr_pool.c", line 1181
http://releng.netbsd.org/b5reports/sparc/commits-2020.11.html#2020.11.05.00.41.04
crossping: [ 27861.1173080] panic: kernel diagnostic assertion "(flags & (PR_NOWAIT|PR_LIMITFAIL)) != 0" failed: file "/tmp/build/2020.11.05.00.41.04-sparc/src/sys/kern/subr_pool.c", line 1181
http://releng.netbsd.org/b5reports/sparc/commits-2020.12.html#2020.12.26.22.28.35
ipsec_tunnel_ipv4_ah_hmacripemd160: [ 10851.7832570] panic: kernel diagnostic assertion "(flags & (PR_NOWAIT|PR_LIMITFAIL)) != 0" failed: file "/tmp/build/2020.12.26.22.28.35-sparc/src/sys/kern/subr_pool.c", line 1181
I'm filing this as category "kern" rather than "port-sparc" because I
suspect it's an MI issue that just happens to hit the sparc testbed
because it has less (emulated) RAM than most. A crash with the same
panic message has also been reported on evbearm6:
https://mail-index.netbsd.org/current-users/2018/11/04/msg034522.html
and analyzed:
https://mail-index.netbsd.org/current-users/2018/11/04/msg034523.html
>How-To-Repeat:
>Fix:
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: kern-bug-people->chs
Responsible-Changed-By: chs@NetBSD.org
Responsible-Changed-When: Thu, 07 Jan 2021 13:04:42 +0000
Responsible-Changed-Why:
I'll fix it
From: Chuck Silvers <chuq@chuq.com>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, netbsd-bugs@netbsd.org,
gnats-admin@netbsd.org, Andreas Gustafsson <gson@gson.org>
Subject: Re: port-sparc/55895 (panic: kernel diagnostic assertion "(flags &
(PR_NOWAIT|PR_LIMITFAIL)) != 0" failed)
Date: Thu, 7 Jan 2021 10:54:33 -0800
all of these assertion failures have this stack trace:
ipsec_tunnel_ipv4_ah_hmacripemd160: [ 10851.7832570] panic: kernel diagnostic assertion "(flags & (PR_NOWAIT|PR_LIMITFAIL)) != 0" failed: file "/tmp/build/2020.12.26.22.28.35-sparc/src/sys/kern/subr_pool.c", line 1181
[ 10851.7955840] cpu0: Begin traceback...
[ 10851.7955840] 0x0(0xf0474818, 0xf584e920, 0xf0577c00, 0xf0578c00, 0x104, 0xf0578b50) at netbsd:kern_assert+0x38
[ 10851.7955840] kern_assert(0xf0474818, 0xf0474808, 0xf04c5438, 0xf04c4c58, 0x49d, 0x0) at netbsd:pool_get+0x818
[ 10851.7955840] pool_get(0xf055d108, 0x1, 0xf04c4c58, 0xf0474808, 0xf055d180, 0x0) at netbsd:pmap_pmap_pool_ctor+0xc4
[ 10851.7955840] pmap_pmap_pool_ctor(0x0, 0xf0a72220, 0x1, 0x0, 0xf055cd38, 0xf0a72000) at netbsd:pool_cache_get_slow+0x18c
[ 10851.7955840] pool_cache_get_slow(0xf055ccc0, 0xf055cec0, 0xf0a72220, 0xf584ea7c, 0x0, 0x1) at netbsd:pool_cache_get_paddr+0x14c
[ 10851.7955840] pool_cache_get_paddr(0xf055ccc0, 0x1, 0x0, 0x0, 0xf055cec0, 0xf055ccc0) at netbsd:pmap_create+0x10
[ 10851.7955840] pmap_create(0xf0d68a5c, 0x3, 0x0, 0xf029115c, 0x0, 0xf0d68ad0) at netbsd:uvmspace_init+0x5c
[ 10851.8042170] uvmspace_init(0xf0d68a50, 0x0, 0x1000, 0xf0000000, 0x1, 0xf05730c0) at netbsd:uvmspace_alloc+0x28
[ 10851.8042170] uvmspace_alloc(0xf0d68a50, 0xf0000000, 0x1, 0xf584ec04, 0x1000, 0xf584d000) at netbsd:uvmspace_exec+0x54
[ 10851.8042170] uvmspace_exec(0xf0d4b180, 0x1000, 0xf0000000, 0x1, 0x0, 0xf0d68108) at netbsd:execve_runproc+0x838
[ 10851.8042170] execve_runproc(0xf0d4b180, 0xf584ecf8, 0x0, 0x0, 0xf0d4b180, 0xf0d9ba90) at netbsd:execve1+0x44
[ 10851.8042170] execve1(0xf0d4b180, 0x1, 0xeffff350, 0xffffffff, 0xeffff050, 0xeffff99c) at netbsd:sys_execve+0x24
[ 10851.8042170] sys_execve(0xf0d4b180, 0xf584ef30, 0xf584ef28, 0xeffff350, 0x0, 0x8573836) at netbsd:syscall+0xe0
[ 10851.8042170] syscall(0xc3b, 0xf584efb0, 0xedc5e124, 0x3b, 0x3, 0xf0d4b180) at netbsd:memfault_sun4m+0x3f8
[ 10851.8042170] cpu0: End traceback...
here is the pool_get() call:
	upt = pool_get(&L1_pool, flags);
and here is the page allocator (the pool's back-end alloc function) for L1_pool:
void *
pgt_page_alloc(struct pool *pp, int flags)
{
	int cacheit = (CACHEINFO.c_flags & CACHE_PAGETABLES) != 0;
	struct vm_page *pg;
	vaddr_t va;
	paddr_t pa;

	/* Allocate a page of physical memory */
	if ((pg = uvm_pagealloc(NULL, 0, NULL, 0)) == NULL)
		return (NULL);
	...
}
the problem is that this allocator does not retry the page allocation if
uvm_pagealloc() fails but (flags & PR_WAITOK) is set. pool_get() then sees
an allocation failure for a request that was allowed to wait, which is
exactly the case the assertion rejects.
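To make the failure mode concrete, here is a small user-space model in C. It is not kernel code: the DEMO_* flag values and both function names are made up for illustration (the real flags come from the kernel's pool headers). It only mirrors the condition quoted in the panic message, namely that a failed allocation is acceptable only when the caller passed PR_NOWAIT or PR_LIMITFAIL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values for this model only, not the kernel's. */
enum {
	DEMO_PR_WAITOK    = 0x1,
	DEMO_PR_NOWAIT    = 0x2,
	DEMO_PR_LIMITFAIL = 0x4
};

/*
 * Model of a back-end allocator that ignores PR_WAITOK, as described
 * above: it gives up as soon as no page is free.
 */
static void *
demo_alloc_no_retry(int flags, int free_pages)
{
	static char page[4096];

	assert(free_pages >= 0);
	(void)flags;		/* the wait flag is never consulted */
	return free_pages > 0 ? page : NULL;
}

/*
 * Model of the condition in the panic message: a NULL result is only
 * legitimate when the caller allowed the allocation to fail.
 * Returns true when that invariant holds.
 */
static bool
demo_failure_is_legitimate(int flags, const void *result)
{
	if (result != NULL)
		return true;
	return (flags & (DEMO_PR_NOWAIT | DEMO_PR_LIMITFAIL)) != 0;
}
```

With no free pages and DEMO_PR_WAITOK, demo_alloc_no_retry() returns NULL and demo_failure_is_legitimate() reports false, which corresponds to the situation the kernel assertion traps.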
From: "Chuck Silvers" <chs@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/55895 CVS commit: src/sys/arch/sparc/sparc
Date: Mon, 11 Jan 2021 06:12:43 +0000
Module Name: src
Committed By: chs
Date: Mon Jan 11 06:12:43 UTC 2021
Modified Files:
src/sys/arch/sparc/sparc: pmap.c
Log Message:
in pgt_page_alloc(), wait and retry the page allocation if PR_WAITOK.
fixes PR 55895.
To generate a diff of this commit:
cvs rdiff -u -r1.369 -r1.370 src/sys/arch/sparc/sparc/pmap.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
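The wait-and-retry idea from the log message can be sketched in user space as follows. This is not the committed diff (see the cvs rdiff command above for that). Everything prefixed model_ or MODEL_ is a hypothetical stand-in: model_pagealloc() plays the role of the page allocation attempt, and model_wait() plays the role of sleeping until memory becomes available, which here simply makes one page appear.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values for this model only, not the kernel's. */
enum { MODEL_PR_WAITOK = 0x1, MODEL_PR_NOWAIT = 0x2 };

/* Hypothetical stand-in for the state of physical memory. */
struct model_vm {
	int free_pages;		/* pages currently available */
	int wait_calls;		/* how often the caller had to wait */
};

static char model_page[4096];

/* One allocation attempt: fails when nothing is free. */
static void *
model_pagealloc(struct model_vm *vm)
{
	if (vm->free_pages == 0)
		return NULL;
	vm->free_pages--;
	return model_page;
}

/* Stand-in for sleeping until memory is freed: one page appears. */
static void
model_wait(struct model_vm *vm)
{
	vm->wait_calls++;
	vm->free_pages++;
}

/* Retry the allocation while the caller allows waiting. */
static void *
model_page_alloc_retry(struct model_vm *vm, int flags)
{
	void *pg;

	assert(vm != NULL);
	for (;;) {
		pg = model_pagealloc(vm);
		if (pg != NULL)
			return pg;
		if ((flags & MODEL_PR_WAITOK) == 0)
			return NULL;	/* caller accepted failure */
		model_wait(vm);		/* wait, then try again */
	}
}
```

In this model a MODEL_PR_NOWAIT request still fails immediately when memory is exhausted, while a MODEL_PR_WAITOK request never returns NULL, so the condition behind the assertion cannot be violated by the allocator.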
State-Changed-From-To: open->feedback
State-Changed-By: chs@NetBSD.org
State-Changed-When: Mon, 11 Jan 2021 06:17:53 +0000
State-Changed-Why:
does the patch I committed fix the problem for you?
(I understand that because the bug only triggers quite infrequently,
it may take several months before it's clear whether it's really gone.)
State-Changed-From-To: feedback->closed
State-Changed-By: gson@NetBSD.org
State-Changed-When: Thu, 05 Aug 2021 07:53:52 +0000
State-Changed-Why:
The last recorded failure on the TNF sparc testbed was with source date
2021.01.11.02.18.40. Looks like it's fixed.
>Unformatted: