NetBSD Problem Report #41106

From www@NetBSD.org  Tue Mar 31 01:48:40 2009
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by www.NetBSD.org (Postfix) with ESMTP id ED65563B946
	for <gnats-bugs@gnats.netbsd.org>; Tue, 31 Mar 2009 01:48:39 +0000 (UTC)
Message-Id: <20090331014839.B690263B8C8@www.NetBSD.org>
Date: Tue, 31 Mar 2009 01:48:39 +0000 (UTC)
From: dmarquess@gmail.com
Reply-To: dmarquess@gmail.com
To: gnats-bugs@NetBSD.org
Subject: GENERIC.MP memory management faults on API CS20/HP DS20L
X-Send-Pr-Version: www-1.0

>Number:         41106
>Category:       port-alpha
>Synopsis:       GENERIC.MP memory management faults on API CS20/HP DS20L
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    port-alpha-maintainer
>State:          closed
>Class:          doc-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Mar 31 01:50:00 +0000 2009
>Closed-Date:    Wed Sep 16 20:31:36 +0000 2009
>Last-Modified:  Wed Sep 16 20:31:36 +0000 2009
>Originator:     Dustin Marquess
>Release:        netbsd-5-200903290002Z
>Organization:
>Environment:
NetBSD 5.0_RC3 (GENERIC-$Revision: 1.325 $) #0: Sun Mar 29 21:24:52 PDT 2009
        builds@wb27:/home/builds/ab/netbsd-5/alpha/200903290002Z-obj/home/builds/ab/netbsd-5/src/sys/arch/alpha/compile/GENERIC.MP
hp AlphaServer DS20L 833 MHz, s/n 6969696969
8192 byte page size, 2 processors.
total memory = 2048 MB
(2792 KB reserved for PROM, 2045 MB used by NetBSD)
avail memory = 2006 MB
>Description:
When booting a GENERIC.MP kernel, the kernel always receives a memory management fault trap while scanning the SCSI bus:

fxp1 at pci1 dev 3 function 0: i82559 Ethernet, rev 8
fxp1: interrupting at dec 6600 irq 32
fxp1: Ethernet address 00:02:56:00:08:b6
inphy1 at fxp1 phy 1: i82555 10/100 media interface, rev. 4
inphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
ahc0 at pci1 dev 4 function 0: Adaptec 29160 Ultra160 SCSI adapter
ahc0: interrupting at dec 6600 irq 36
ahc0: aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
scsibus1 at ahc0: 16 targets, 8 luns per target

scsibus0: waiting 2 seconds for devices to settle...

CPU 1: fatal kernel trap:

CPU 1    trap entry = 0x2 (memory management fault)
CPU 1    a0         = 0x8
CPU 1    a1         = 0x1
CPU 1    a2         = 0x0
CPU 1    pc         = 0xfffffc0000634414
CPU 1    ra         = 0xfffffc0000634400
CPU 1    pv         = 0xfffffc0000636900
CPU 1    curlwp     = 0xfffffc007f8a3400
CPU 1        pid = 0, comm = system

scsibus1: waiting 2 seconds for devices to settle...
panic: trap
Stopped in pid 0.12 (system) at netbsd:cpu_Debugger+0x4:        ret     zero,(ra)
db{1}> bt
cpu_Debugger() at netbsd:cpu_Debugger+0x4
panic() at netbsd:panic+0x244
trap() at netbsd:trap+0x35c
XentMM() at netbsd:XentMM+0x20
--- memory management fault (from ipl 5) ---
uvm_pagealloc_strat() at netbsd:uvm_pagealloc_strat+0x64
uvm_km_alloc_poolpage() at netbsd:uvm_km_alloc_poolpage+0x40
pool_page_alloc_meta() at netbsd:pool_page_alloc_meta+0x24
pool_grow() at netbsd:pool_grow+0x64
pool_get() at netbsd:pool_get+0x5c
pool_cache_put_slow() at netbsd:pool_cache_put_slow+0x1f4
pool_cache_put_paddr() at netbsd:pool_cache_put_paddr+0x168
pmap_do_tlb_shootdown() at netbsd:pmap_do_tlb_shootdown+0x178
alpha_ipi_process() at netbsd:alpha_ipi_process+0xb8
interrupt() at netbsd:interrupt+0x88
XentInt() at netbsd:XentInt+0x1c
--- interrupt (from ipl 0) ---
printf() at netbsd:printf+0xf8
trap() at netbsd:trap+0x168
XentMM() at netbsd:XentMM+0x20
--- memory management fault ---
uvm_pageidlezero() at netbsd:uvm_pageidlezero+0x34
idle_loop() at netbsd:idle_loop+0x1a4
cpu_spinup_trampoline() at netbsd:cpu_spinup_trampoline+0x5c
--- root of call graph ---

Problem does not occur with a GENERIC kernel, only GENERIC.MP
>How-To-Repeat:
Boot a GENERIC.MP kernel.
>Fix:
Unknown

>Release-Note:

>Audit-Trail:
From: "Michael L. Hitch" <mhitch@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/41106 CVS commit: src/sys/arch/alpha/alpha
Date: Sun, 6 Sep 2009 18:06:24 +0000

 Module Name:	src
 Committed By:	mhitch
 Date:		Sun Sep  6 18:06:24 UTC 2009

 Modified Files:
 	src/sys/arch/alpha/alpha: cpu.c

 Log Message:
 There's now some per-cpu initialization that occurs before the secondary
 cpus are told to begin running.  Since the secondary cpus weren't being
 added to the cpu_info list until then, that initialization wasn't being
 done and resulted in crashes on the secondary cpus.  Add the secondary
 cpus to the cpu_info_list after they've been started (but waiting to be
 told to start running).  This fixes the problem specifically stated in
 PR port-alpha/41106.  MP alphas will now at least boot and begin running,
 but will eventually crash in various ways later.


 To generate a diff of this commit:
 cvs rdiff -u -r1.85 -r1.86 src/sys/arch/alpha/alpha/cpu.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->analyzed
State-Changed-By: mhitch@NetBSD.org
State-Changed-When: Sun, 06 Sep 2009 18:44:10 +0000
State-Changed-Why:
I've analyzed the problem (and it was quite a bit easier than I thought it would be).


From: "Michael L. Hitch" <mhitch@lightning.msu.montana.edu>
To: gnats-bugs@NetBSD.org
Cc: port-alpha-maintainer@netbsd.org, gnats-admin@netbsd.org, 
    netbsd-bugs@netbsd.org
Subject: Re: port-alpha/41106: GENERIC.MP memory management faults on API
 CS20/HP DS20L
Date: Sun, 6 Sep 2009 11:46:28 -0600 (MDT)

 On Tue, 31 Mar 2009, dmarquess@gmail.com wrote:

 > db{1}> bt
 > cpu_Debugger() at netbsd:cpu_Debugger+0x4
 > panic() at netbsd:panic+0x244
 > trap() at netbsd:trap+0x35c
 > XentMM() at netbsd:XentMM+0x20
 > --- memory management fault (from ipl 5) ---
 > uvm_pagealloc_strat() at netbsd:uvm_pagealloc_strat+0x64
 ...
 > Problem does not occur with a GENERIC kernel, only GENERIC.MP

    It will also work with GENERIC.MP if you disable all the secondary cpus 
 (which rather defeats the purpose of GENERIC.MP).

    I finally got a chance to start looking into this and found out that 
 there's some per-cpu initialization going on before the secondary cpus are
 told they can start running.  Prior to that point, only the primary cpu 
 has been added to the list of cpus, so the per-cpu setup doesn't occur on 
 the secondary cpus.  I've got a fix for that, which at least lets me boot 
 up and start running with multiple cpus, but eventually crashes later when 
 it gets busy.  I'm in the process of trying to track that down now.

 --
 Michael L. Hitch			mhitch@montana.edu
 Computer Consultant
 Information Technology Center
 Montana State University	Bozeman, MT	USA

From: Soren Jacobsen <snj@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/41106 CVS commit: [netbsd-5] src/sys/arch/alpha/alpha
Date: Wed, 16 Sep 2009 04:12:49 +0000

 Module Name:	src
 Committed By:	snj
 Date:		Wed Sep 16 04:12:49 UTC 2009

 Modified Files:
 	src/sys/arch/alpha/alpha [netbsd-5]: cpu.c

 Log Message:
 Pull up following revision(s) (requested by mhitch in ticket #949):
 	sys/arch/alpha/alpha/cpu.c: revision 1.86
 There's now some per-cpu initialization that occurs before the secondary
 cpus are told to begin running.  Since the secondary cpus weren't being
 added to the cpu_info list until then, that initialization wasn't being
 done and resulted in crashes on the secondary cpus.  Add the secondary
 cpus to the cpu_info_list after they've been started (but waiting to be
 told to start running).  This fixes the problem specifically stated in
 PR port-alpha/41106.  MP alphas will now at least boot and begin running,
 but will eventually crash in various ways later.


 To generate a diff of this commit:
 cvs rdiff -u -r1.82 -r1.82.10.1 src/sys/arch/alpha/alpha/cpu.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Soren Jacobsen <snj@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/41106 CVS commit: [netbsd-5-0] src/sys/arch/alpha/alpha
Date: Wed, 16 Sep 2009 04:14:33 +0000

 Module Name:	src
 Committed By:	snj
 Date:		Wed Sep 16 04:14:33 UTC 2009

 Modified Files:
 	src/sys/arch/alpha/alpha [netbsd-5-0]: cpu.c

 Log Message:
 Pull up following revision(s) (requested by mhitch in ticket #949):
 	sys/arch/alpha/alpha/cpu.c: revision 1.86
 There's now some per-cpu initialization that occurs before the secondary
 cpus are told to begin running.  Since the secondary cpus weren't being
 added to the cpu_info list until then, that initialization wasn't being
 done and resulted in crashes on the secondary cpus.  Add the secondary
 cpus to the cpu_info_list after they've been started (but waiting to be
 told to start running).  This fixes the problem specifically stated in
 PR port-alpha/41106.  MP alphas will now at least boot and begin running,
 but will eventually crash in various ways later.


 To generate a diff of this commit:
 cvs rdiff -u -r1.82 -r1.82.14.1 src/sys/arch/alpha/alpha/cpu.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: analyzed->closed
State-Changed-By: mhitch@NetBSD.org
State-Changed-When: Wed, 16 Sep 2009 20:31:36 +0000
State-Changed-Why:
Problem found, fixed, and pulled up to the netbsd-5* branches.
On to the next.....


>Unformatted:
