NetBSD Problem Report #47437

From campbell@mumble.net  Sat Jan 12 17:09:54 2013
Return-Path: <campbell@mumble.net>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
	by www.NetBSD.org (Postfix) with ESMTP id CE3FF63E9BD
	for <gnats-bugs@gnats.NetBSD.org>; Sat, 12 Jan 2013 17:09:54 +0000 (UTC)
Message-Id: <20130112170936.AC844604DD@jupiter.mumble.net>
Date: Sat, 12 Jan 2013 17:09:36 +0000 (UTC)
From: Taylor R Campbell <campbell+netbsd@mumble.net>
Reply-To: Taylor R Campbell <campbell+netbsd@mumble.net>
To: gnats-bugs@gnats.NetBSD.org
Subject: sometimes boot fails with KASSERT(pmap_tlb_pendcount < ncpu)
X-Send-Pr-Version: 3.95

>Number:         47437
>Category:       port-amd64
>Synopsis:       sometimes boot fails with KASSERT(pmap_tlb_pendcount < ncpu)
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    port-amd64-maintainer
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Jan 12 17:10:00 +0000 2013
>Closed-Date:    Thu Aug 10 17:22:23 +0000 2017
>Last-Modified:  Thu Aug 10 17:22:23 +0000 2017
>Originator:     Taylor R Campbell <campbell+netbsd@mumble.net>
>Release:        NetBSD 6.99.16
>Organization:
>Environment:
System: NetBSD ... 6.99.16 NetBSD 6.99.16 (RIAKERN) #1: Wed Jan  9 19:58:47 UTC 2013 root@...:/home/riastradh/netbsd/current/obj.amd64/sys/arch/amd64/compile/RIAKERN amd64
Architecture: amd64
Machine: amd64
>Description:

	Sometimes when I boot a many-core machine, during autoconf I
	get a panic after the ACPI CPU devices are configured.  I've
	seen the panic several times; last night I caught it on the
	serial console for the first time with ddb and grabbed a stack
	trace.  I believe it always happens after all the acpicpuN
	devices are attached, but I'm not sure.

...
acpicpu21 at cpu21: ACPI CPU
acpicpu22 at cpu22: ACPI CPU
acpicpu23 at cpu23: ACPI CPU
panic: kernel diagnostic assertion "pmap_tlb_pendcount < ncpu" failed: file "/home/riastradh/netbsd/current/src/sys/arch/x86/x86/pmap_tlb.c", line 434
fatal breakpoint trap in supervisor mode
trap type 1 code 0 rip ffffffff8025623d cs 8 rflags 246 cr2 0 ilevel 0 rsp fffffe813a5793a0
curlwp 0xfffffe887568e080 pid 0 lid 16 lowest kstack 0xfffffe813a576000
Stopped in pid 0.16 (system) at netbsd:breakpoint+0x5:  leave
db{0}> bt
breakpoint() at netbsd:breakpoint+0x5
vpanic() at netbsd:vpanic+0x1f2
kern_assert() at netbsd:kern_assert+0x48
pmap_tlb_shootnow() at netbsd:pmap_tlb_shootnow+0x394
pmap_update() at netbsd:pmap_update+0x3b
_x86_memio_unmap() at netbsd:_x86_memio_unmap+0xd2
AcpiExSystemMemorySpaceHandler() at netbsd:AcpiExSystemMemorySpaceHandler+0x245
AcpiEvAddressSpaceDispatch() at netbsd:AcpiEvAddressSpaceDispatch+0x157
AcpiExAccessRegion() at netbsd:AcpiExAccessRegion+0x30b
AcpiExFieldDatumIo() at netbsd:AcpiExFieldDatumIo+0x1b1
AcpiExWriteWithUpdateRule() at netbsd:AcpiExWriteWithUpdateRule+0x116
AcpiExInsertIntoField() at netbsd:AcpiExInsertIntoField+0x1d0
AcpiExWriteDataToField() at netbsd:AcpiExWriteDataToField+0x1be
AcpiExStoreObjectToNode() at netbsd:AcpiExStoreObjectToNode+0x277
AcpiExStore() at netbsd:AcpiExStore+0x1d0
AcpiExOpcode_1A_1T_1R() at netbsd:AcpiExOpcode_1A_1T_1R+0x238
AcpiDsExecEndOp() at netbsd:AcpiDsExecEndOp+0x22d
AcpiPsParseLoop() at netbsd:AcpiPsParseLoop+0xe9
AcpiPsParseAml() at netbsd:AcpiPsParseAml+0x27a
AcpiPsExecuteMethod() at netbsd:AcpiPsExecuteMethod+0x2af
AcpiNsEvaluate() at netbsd:AcpiNsEvaluate+0x305
AcpiEvAsynchExecuteGpeMethod() at netbsd:AcpiEvAsynchExecuteGpeMethod+0x15f
sysmon_task_queue_thread() at netbsd:sysmon_task_queue_thread+0x44
db{0}> 

>How-To-Repeat:

	Boot my many-core machine a few times.

>Fix:

	Yes, please!

>Release-Note:

>Audit-Trail:
From: "Juergen Hannken-Illjes" <hannken@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/47437 CVS commit: src/sys/arch/x86/x86
Date: Fri, 24 Jul 2015 15:20:37 +0000

 Module Name:	src
 Committed By:	hannken
 Date:		Fri Jul 24 15:20:37 UTC 2015

 Modified Files:
 	src/sys/arch/x86/x86: pmap_tlb.c

 Log Message:
 pmap_tlb_processpacket() uses x86_ipi(..., LAPIC_DEST_ALLEXCL, ...)
 when the cpuset "target" equals "kcpuset_running".  During boot, while some
 CPUs are not running yet, this results in more IPIs than expected, and the
 "pmap_tlb_pendcount"-related KASSERTs fire.

 Compare the cpuset "target" against "kcpuset_attached", as this set represents
 the CPUs LAPIC_DEST_ALLEXCL will notify.

 Should fix PR port-amd64/47437


 To generate a diff of this commit:
 cvs rdiff -u -r1.6 -r1.7 src/sys/arch/x86/x86/pmap_tlb.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.
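
 The reasoning in the log message above can be illustrated with a small
 user-space toy model.  This is only a sketch, not the code from
 pmap_tlb.c: the type cpuset_t and the functions expected_acks() and
 ipis_sent() are hypothetical names invented for this illustration, CPU
 sets are modelled as plain 64-bit masks, and the broadcast shorthand is
 assumed (as the log message states) to reach every attached CPU except
 the sender.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a kernel CPU set: bit N set means CPU N is a member. */
typedef uint64_t cpuset_t;

/* Count the members of a CPU set. */
static int
cpuset_count(cpuset_t set)
{
	int n = 0;

	while (set != 0) {
		n += (int)(set & 1);
		set >>= 1;
	}
	return n;
}

/* Number of remote CPUs the sender expects to acknowledge the shootdown. */
static int
expected_acks(cpuset_t target, int self)
{
	assert(self >= 0 && self < 64);
	return cpuset_count(target & ~((cpuset_t)1 << self));
}

/*
 * Number of IPIs actually delivered in this model.  If "target" equals
 * "compare_set", the broadcast shorthand is used and every attached CPU
 * other than the sender is interrupted; otherwise only the remote CPUs
 * in "target" are interrupted one by one.
 */
static int
ipis_sent(cpuset_t target, cpuset_t compare_set, cpuset_t attached, int self)
{
	assert(self >= 0 && self < 64);
	if (target == compare_set)
		return cpuset_count(attached & ~((cpuset_t)1 << self));
	return cpuset_count(target & ~((cpuset_t)1 << self));
}
```

 In this model, with 24 attached CPUs of which only 4 are running,
 comparing the target against the running set takes the broadcast path
 and delivers 23 interrupts where 3 acknowledgements are expected.
 Comparing against the attached set takes the broadcast path only when
 the target really covers every attached CPU, so the two counts agree.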

State-Changed-From-To: open->closed
State-Changed-By: maxv@NetBSD.org
State-Changed-When: Thu, 10 Aug 2017 17:22:23 +0000
State-Changed-Why:
This issue was obviously fixed - I'm closing this PR.


>Unformatted:
