NetBSD Problem Report #38591

From rafal@pobox.com  Tue May  6 00:50:53 2008
Return-Path: <rafal@pobox.com>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by narn.NetBSD.org (Postfix) with ESMTP id 427C763B293
	for <gnats-bugs@gnats.NetBSD.org>; Tue,  6 May 2008 00:50:53 +0000 (UTC)
Message-Id: <20080506004116.CED0114DFA@fearless-vampire-killer.waterside.net>
Date: Mon, 05 May 2008 20:41:16 -0400 (EDT)
From: rafal@netbsd.org
Reply-To: rafal@netbsd.org
To: gnats-bugs@gnats.NetBSD.org
Subject: port-hpcarm kernels unbootable after merge of matt-armv6 branch
X-Send-Pr-Version: 3.95

>Number:         38591
>Category:       port-hpcarm
>Synopsis:       port-hpcarm kernels unbootable after merge of matt-armv6 branch
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    port-hpcarm-maintainer
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue May 06 00:55:00 +0000 2008
>Closed-Date:    Fri Jun 13 13:41:26 +0000 2008
>Last-Modified:  Sat Jun 14 06:14:05 +0000 2008
>Originator:     rafal@netbsd.org
>Release:        NetBSD 4.99.62
>Organization:
TNF
>Environment:
System: NetBSD j720 4.99.62 NetBSD 4.99.62 (J720) #0: Mon May  5 15:57:08 EDT 2008 rafal@fearless-vampire-killer.waterside.net:/extra/netbsd-src/hpcarm-bughunt.head/src/sys/arch/hpcarm/compile/J720 hpcarm
Architecture: arm
Machine: hpcarm
>Description:
	Boot a stock kernel (e.g. JORNADA720) and watch it hang after
	autoconfig.  Boot a LOCKDEBUG/DIAGNOSTIC kernel and watch it die
	early in the boot with an uninitialized lock panic.

	Here's a boot log from a non-LOCKDEBUG / non-DIAGNOSTIC kernel with
	a bit of extra debug output and some of the issues noted below
	fixed.  In particular, it disables interrupts in hpcarm's initarm();
	without that it would not have gotten this far either.  However, it
	still has bad interrupt masks:

--------HPCBOOT--------
FileManager: FAT
hpcboot build number: 15
HP Jornada 720 (US/UK) (cpu=0x0c108000 machine=0x02c20201)
[progress] 2
[0] 0xc0000000 size 0x08000000
[1] 0xc8000000 size 0x08000000
[2] 0xd0000000 size 0x08000000
[3] 0xd8000000 size 0x08000000
_WIN32_WCE = 400
GetVersionEx
Windows CE 3.0
GetSystemInfo:
wProcessorArchitecture      0x5
wProcessorLevel             0x4
wProcessorRevision          0x9
dwPageSize                  0x1000
dwAllocationGranularity     0x00010000
dwProcessorType             0xa11
Display: 640x240 16bpp
Reg0 :6901b119
Reg1 :c002327f
Reg2 :c002327f
Reg3 :00000001
Reg5 :c0023007
Reg6 :0004f014
Reg13:0a000000
Reg14:cc8443ad
CPSR :400000df
[progress] 3
[progress] 4
open file "\Storage Card\netbsd-4.99.61.gz"(1833675 byte).
[progress] 5
Loader: ELF
[progress] 6
file size: 0x2f13d4+0xbf94+[ksyms: header 0x5d0, symtab 0x5f1a0, strtab 0x32610 = 0x91d80]+[extra: 0x4fb0] = 0x394d30 bytes
address translation table 928 pages. (0x1d00 bytes)
allocated 928 page. mapped 928 page.
[progress] 7
2nd bootloader vaddr=0x0048f000 paddr=0xc1d24000
2nd bootloader copy done.
[progress] 8
seg[0] paddr 0xc0040000 file size 0x2f13d4 mem size 0x2f13d4
	->load 0xc0040000+0x002f13d4=0xc03313d4 ofs=0x00008000+0x2f13d4
seg[1] paddr 0xc03393e0 file size 0xbf94 mem size 0x6d144
	->load 0xc03393e0+0x0006d144=0xc03a6524 ofs=0x002f93e0+0xbf94
	->zero 0xc0345374+0x000611b0=0xc03a6524
ksyms
	->load 0xc03a6524+0x000005d0=0xc03a6af4
	->load 0xc03a6af4+0x0005f1a0=0xc0405c94 ofs=0x00322378+0x5f1a0
	->load 0xc0405c94+0x00032610=0xc04382a4 ofs=0x00381518+0x32610
[progress] 9
load link 918, zero clear link 1
kernel entry address: 0xc0040000
framebuffer: 640x240 type=5 linebytes=1280 addr=0x48200000
console = 2
[progress] 10
sp for bootloader = c1d22000 + 00001000 = c1d23000
kernsize=0x366524
Allocating page tables
IRQ stack: p0xc0010000 v0xc0010000
ABT stack: p0xc0011000 v0xc0011000
UND stack: p0xc0012000 v0xc0012000
SVC stack: p0xc0013000 v0xc0013000
Creating L1 page table
Mapping kernel
pmap_map_chunk: pa=0xc0040000 va=0xc0040000 size=0x400000 resid=0x400000 prot=0x3 cache=1
LLLLLLLLLLLLSSSLLLL
Constructing L2 page tables
pmap_map_chunk: pa=0xc0010000 va=0xc0010000 size=0x1000 resid=0x1000 prot=0x3 cache=1
P
pmap_map_chunk: pa=0xc0011000 va=0xc0011000 size=0x1000 resid=0x1000 prot=0x3 cache=1
P
pmap_map_chunk: pa=0xc0012000 va=0xc0012000 size=0x1000 resid=0x1000 prot=0x3 cache=1
P
pmap_map_chunk: pa=0xc0013000 va=0xc0013000 size=0x2000 resid=0x2000 prot=0x3 cache=1
PP
pmap_map_chunk: pa=0xc0000000 va=0xc0000000 size=0x4000 resid=0x4000 prot=0x3 cache=2
PPPP
pmap_map_chunk: pa=0xc0000000 va=0xc0000000 size=0x10000 resid=0x10000 prot=0x3 cache=2
L
devmap: 80050000 -> 80050023 @ d000d000
pmap_map_chunk: pa=0x80050000 va=0xd000d000 size=0x24 resid=0x1000 prot=0x3 cache=0
P
pmap_map_chunk: pa=0xe0000000 va=0xc0018000 size=0x8000 resid=0x8000 prot=0x3 cache=1
PPPPPPPP
done.
init subsystems: stacks vectors c028018c c0280778 c027fd88
undefined freemempos=c0021000
MMU enabled. control=c000107d
kernsize=0x400000 (including 0x91d80 symbols)
Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
    2006, 2007, 2008
    The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
    The Regents of the University of California.  All rights reserved.

NetBSD 4.99.61 (J720) #5: Sun May  4 18:16:17 EDT 2008
	rafal@fearless-vampire-killer.waterside.net:/extra/netbsd-src/hpcarm-bughunt.61/src/sys/arch/hpcarm/compile/J720
total memory = 32768 KB
avail memory = 27164 KB
timecounter: Timecounters tick every 10.000 msec
mainbus0 (root)
cpu0 at mainbus0: SA-1110 step B-5 (SA-1 core)
cpu0: DC enabled IC enabled WB enabled LABT
cpu0: 16KB/32B 32-way Instruction cache
cpu0: 8KB/32B 32-way write-back Data cache
saip0 at mainbus0
saost0 at saip0 addr 0x90000000-0x9000001f
saost0: SA-11x0 OS Timer
sacom0 at saip0 addr 0x80050000-0x80050023 intr 17
sacom0: SA-11x0 UART3
sacom0: console
sacc0 at saip0 addr 0x40000000-0x40001fff
sacc0: SA-1111 rev 1.1
sacpcic0 at sacc0
pcmcia0 at sacpcic0
pcmcia1 at sacpcic0
sed0 at saip0
sed0: Epson SED1356
sed0: framebuffer address: 0x48200000
sed0: WARNING: powerhook_establish is deprecated
hpcfb0 at sed0
hpcfb0: WARNING: powerhook_establish is deprecated
wsdisplay0 at hpcfb0 kbdmux 1
wsmux1: connecting to wsdisplay0
hpcfb: 640x240 pixels, 65536 colors, 80x24 chars
hpcfb: 640x240 pixels, 65536 colors, 80x24 chars
wsdisplay0: screen 0-1 added (std, vt100 emulation)
j720ssp0 at saip0 addr 0x80070000-0x800700ff
j720kbd0 at j720ssp0
hpckbd0 at j720kbd0
wskbd0 at hpckbd0 mux 1
wskbd0: connecting to wsdisplay0
j720tp0 at j720ssp0
wsmouse0 at j720tp0 mux 0
wskbd at j720tp0 not configured
j720lcd0 at j720ssp0: brightness 38, contrast 135
j720pwr0 at j720ssp0
hpcapm0 at j720pwr0: pseudo power management module
apmdev0 at hpcapm0: Power Management spec V1.2
ipl_none=00000000 ipl_bio=00020203 ipl_net=00020203 ipl_tty=00020203
ipl_vm=00020203 ipl_audio=00020203 ipl_imp=00020002 ipl_high=00020002ipl_serial=00020002
timecounter: Timecounter "clockinterrupt" frequency 100 Hz quality 0
clock: hz=100 stathz=64
timecounter: Timecounter "saost_count" frequency 3686400 Hz quality 100
sacpcic0: card present
IPsec: Initialized Security Association Processing.
ne0 at pcmcia0 function 0: <Ethernet, Adapter, 2.0>
ne0: Ethernet address 00:e0:98:04:2d:ca
ne0: WARNING: powerhook_establish is deprecated
sacpcic0: card present
wdc0 at pcmcia1 function 0: <SanDisk, SDP, 5/3 0.6>
wdc0: i/o mapped mode
atabus0 at wdc0 channel 0
wd0 at atabus0 drive 0: <SanDisk SDCFB-2048>
wd0: drive supports 4-sector PIO transfers, LBA addressing
wd0: 1953 MB, 3970 cyl, 16 head, 63 sec, 512 bytes/sect x 4001760 sectors
wd0: drive supports PIO mode 4
boot device: wd0
root on wd0a dumps on wd0b
root file system type: ffs
WARNING: no TOD clock present
WARNING: using filesystem time
WARNING: CHECK AND RESET THE DATE!
Enter pathname of shell or RETURN for /bin/sh: 
# fsck -p
/dev/rwd0a: file system is clean; not checking
# fsck -fp
/dev/rwd0a: 22535 files, 
*HANG*

	Here's a boot log from a LOCKDEBUG / DIAGNOSTIC kernel built from
	sources from earlier today (a totally clean tree):

--------HPCBOOT--------
FileManager: FAT
hpcboot build number: 15
HP Jornada 720 (US/UK) (cpu=0x0c108000 machine=0x02c20201)
[progress] 2
[0] 0xc0000000 size 0x08000000
[1] 0xc8000000 size 0x08000000
[2] 0xd0000000 size 0x08000000
[3] 0xd8000000 size 0x08000000
_WIN32_WCE = 400
GetVersionEx
Windows CE 3.0
GetSystemInfo:
wProcessorArchitecture      0x5
wProcessorLevel             0x4
wProcessorRevision          0x9
dwPageSize                  0x1000
dwAllocationGranularity     0x00010000
dwProcessorType             0xa11
Display: 640x240 16bpp
Reg0 :6901b119
Reg1 :c002327f
Reg2 :c002327f
Reg3 :00000001
Reg5 :c0023007
Reg6 :000277f0
Reg13:0a000000
Reg14:ccc443ad
CPSR :400000df
[progress] 3
[progress] 4
open file "\Storage Card\netbsd.gz"(1961945 byte).
[progress] 5
Loader: ELF
[progress] 6
file size: 0x33177c+0xc0c8+[ksyms: header 0x5d0, symtab 0x63680, strtab 0x32cc3 = 0x96913]+[extra: 0x4fb0] = 0x3da8c3 bytes
address translation table 1008 pages. (0x1f80 bytes)
allocated 1008 page. mapped 1008 page.
[progress] 7
2nd bootloader vaddr=0x004df000 paddr=0xc1d27000
2nd bootloader copy done.
[progress] 8
seg[0] paddr 0xc0040000 file size 0x33177c mem size 0x33177c
	->load 0xc0040000+0x0033177c=0xc037177c ofs=0x00008000+0x33177c
seg[1] paddr 0xc0379780 file size 0xc0c8 mem size 0x7745c
	->load 0xc0379780+0x0007745c=0xc03f0bdc ofs=0x00339780+0xc0c8
	->zero 0xc0385848+0x0006b394=0xc03f0bdc
ksyms
	->load 0xc03f0bdc+0x000005d0=0xc03f11ac
	->load 0xc03f11ac+0x00063680=0xc045482c ofs=0x00362c88+0x63680
	->load 0xc045482c+0x00032cc3=0xc04874ef ofs=0x003c6308+0x32cc3
[progress] 9
load link 986, zero clear link 1
kernel entry address: 0xc0040000
framebuffer: 640x240 type=5 linebytes=1280 addr=0x48200000
console = 2
[progress] 10
sp for bootloader = c1d25000 + 00001000 = c1d26000
kernsize=0x3b0bdc
Allocating page tables
IRQ stack: p0xc0010000 v0xc0010000
ABT stack: p0xc0011000 v0xc0011000
UND stack: p0xc0012000 v0xc0012000
SVC stack: p0xc0013000 v0xc0013000
Creating L1 page table
Mapping kernel
pmap_map_chunk: pa=0xc0040000 va=0xc0040000 size=0x44c000 resid=0x44c000 prot=0x3 cache=1
LLLLLLLLLLLLSSSLLLLLLLLPPPPPPPPPPPP
Constructing L2 page tables
pmap_map_chunk: pa=0xc0010000 va=0xc0010000 size=0x1000 resid=0x1000 prot=0x3 cache=1
P
pmap_map_chunk: pa=0xc0011000 va=0xc0011000 size=0x1000 resid=0x1000 prot=0x3 cache=1
P
pmap_map_chunk: pa=0xc0012000 va=0xc0012000 size=0x1000 resid=0x1000 prot=0x3 cache=1
P
pmap_map_chunk: pa=0xc0013000 va=0xc0013000 size=0x2000 resid=0x2000 prot=0x3 cache=1
PP
pmap_map_chunk: pa=0xc0000000 va=0xc0000000 size=0x4000 resid=0x4000 prot=0x3 cache=2
PPPP
pmap_map_chunk: pa=0xc0000000 va=0xc0000000 size=0x10000 resid=0x10000 prot=0x3 cache=2
L
devmap: 80050000 -> 80050023 @ d000d000
pmap_map_chunk: pa=0x80050000 va=0xd000d000 size=0x24 resid=0x1000 prot=0x3 cache=0
P
pmap_map_chunk: pa=0xe0000000 va=0xc0018000 size=0x8000 resid=0x8000 prot=0x3 cache=1
PPPPPPPP
done.
init subsystems: stacks vectors c02a9fec c02aa5fc c02a9bf0
undefined freemempos=c0021000
MMU enabled. control=c000107d
kernsize=0x44c000 (including 0x96913 symbols)
panic: lockdebug_lookup: uninitialized lock (lock=0x14)
Begin traceback...
0xc0014bd0
	scp=0xc0014bd0 rlv=0xc0014ba0 (0xc0014ba0)
	rsp=0xc0014ba8 rfp=0xc022a0fc
0xe1a0c00d
	scp=0xe1a0c00d rlv=0xc03455a0 (netbsd:__kernassert+0x39ad4)
	rsp=0xc03bfbe0 rfp=0xc0382218
*HANG*

	It looks like there are a few problems:

	(1) Interrupts are not disabled early in startup, so interrupts
	    can sneak in earlier than expected (I suspect that's the cause
	    of the LOCKDEBUG assertion firing above).

	(2) Interrupt masks don't make sense -- IPL_HIGH doesn't block all
	    interrupts, while IPL_NONE does, etc.

	(3) It looks like interrupts are enabled in arch/arm/sa11x0/sa11x0.c
	    but interrupt masks aren't set up beforehand (maybe a non-issue).

	(4) Mutex init of the SA-1111 PCMCIA controller mutex should be moved
	    to arch/arm/sa11x0/sa11x1_pcic.c (sacpcic_attach_common) from
	    sapcic_kthread_create in arch/arm/sa11x0/sa11xx_pcic.c; otherwise
	    the mutex is initialized twice, which DIAGNOSTIC/LOCKDEBUG flags.

	Others?  This is a catch-all bug, since I'm not sure what else might
	have been broken by the transition to using the arm/arm32 interrupt
	code (on top of pre-existing breakage that somehow did not surface
	before).
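
	To make (2) concrete, here is a minimal sketch (not the kernel's
	code) that checks the masks printed in the first boot log above for
	internal consistency.  It assumes the levels are ordered as printed,
	from ipl_none up to ipl_serial, and since the log does not say
	whether a set bit means "blocked" or "enabled", it tests both
	readings.  The names ipl_mask, log_masks, first_not_superset and
	first_not_subset are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct ipl_mask {
	const char *name;
	uint32_t mask;
};

/* Values copied from the boot log; the low-to-high order is an assumption. */
static const struct ipl_mask log_masks[] = {
	{ "ipl_none",   0x00000000 },
	{ "ipl_bio",    0x00020203 },
	{ "ipl_net",    0x00020203 },
	{ "ipl_tty",    0x00020203 },
	{ "ipl_vm",     0x00020203 },
	{ "ipl_audio",  0x00020203 },
	{ "ipl_imp",    0x00020002 },
	{ "ipl_high",   0x00020002 },
	{ "ipl_serial", 0x00020002 },
};
#define NMASKS (sizeof(log_masks) / sizeof(log_masks[0]))

/*
 * "Blocked" reading: each level should block at least what the level
 * below it blocks.  Returns the index of the first level that drops a
 * bit set in the previous level, or -1 if the table is consistent.
 */
static int
first_not_superset(const struct ipl_mask *m, size_t n)
{
	assert(m != NULL);
	for (size_t i = 1; i < n; i++) {
		if ((m[i - 1].mask & ~m[i].mask) != 0)
			return (int)i;
	}
	return -1;
}

/*
 * "Enabled" reading: each level should enable at most what the level
 * below it enables.  Returns the index of the first level that gains a
 * bit not set in the previous level, or -1 if the table is consistent.
 */
static int
first_not_subset(const struct ipl_mask *m, size_t n)
{
	assert(m != NULL);
	for (size_t i = 1; i < n; i++) {
		if ((m[i].mask & ~m[i - 1].mask) != 0)
			return (int)i;
	}
	return -1;
}
```

	With the logged values, the "blocked" reading first breaks at
	ipl_imp (bits 0x201 are set at ipl_audio but not at ipl_imp), and
	the "enabled" reading breaks right away at ipl_bio, so the table is
	inconsistent under either interpretation.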

>How-To-Repeat:
	Try to boot the JORNADA720 kernel on a Jornada 720; try to boot a
	LOCKDEBUG / DIAGNOSTIC version of said kernel.

#
# Customized (debug-added) kernel config file for the Jornada 720
#
# 	$NetBSD: J720,v 1.3 2006/10/02 03:28:30 chs Exp $
#

include		"arch/hpcarm/conf/JORNADA720"

options 	DEBUG_BEFOREMMU
#options	INTR_DEBUG
options		DIAGNOSTIC
options		LOCKDEBUG
options 	VERBOSE_INIT_ARM
options		DDB_ONPANIC=2

>Fix:
	None yet; I'm working on it as time allows.  The big issue to sort
	out is fixing the interrupt masks.  The rest is hopefully just small
	fry after that, but the interrupts-enabled-early issue made it
	harder than necessary to even get that far.
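
	For reference, one common way to get sane masks is to derive them
	from the level each handler is attached at, so that a higher level
	always blocks everything a lower level does.  The sketch below
	illustrates that idea only; it is not the code that was eventually
	committed.  NIPL, calc_block_masks and the irq_ipl[] table are
	hypothetical names, a set bit is assumed to mean "blocked", and
	level 0 stands in for IPL_NONE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NIPL 4	/* hypothetical number of priority levels; 0 is "none" */

/*
 * irq_ipl[i] is the level of the handler attached to IRQ line i, or -1
 * if nothing is attached.  On return, block[l] holds the set of IRQ
 * lines that must be masked while running at level l.
 */
static void
calc_block_masks(const int *irq_ipl, size_t nirq, uint32_t block[NIPL])
{
	assert(irq_ipl != NULL && nirq <= 32);

	for (size_t l = 0; l < NIPL; l++)
		block[l] = 0;

	for (size_t i = 0; i < nirq; i++) {
		if (irq_ipl[i] < 0)
			continue;	/* no handler on this line */
		/* Handlers never run at level 0, so level 0 blocks nothing. */
		assert(irq_ipl[i] >= 1 && irq_ipl[i] < NIPL);
		/* A handler at level L is blocked at L and every level above. */
		for (size_t l = (size_t)irq_ipl[i]; l < NIPL; l++)
			block[l] |= UINT32_C(1) << i;
	}
}
```

	Built this way, the lowest level blocks nothing, the highest level
	blocks every attached line, and each level's mask is a superset of
	the one below it by construction.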

>Release-Note:

>Audit-Trail:
From: Rafal Boni <rafal@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/38591 CVS commit: src/sys/arch
Date: Fri, 13 Jun 2008 13:24:10 +0000 (UTC)

 Module Name:	src
 Committed By:	rafal
 Date:		Fri Jun 13 13:24:10 UTC 2008

 Modified Files:
 	src/sys/arch/arm/sa11x0: sa11x0.c sa11x0_irq.S sa11x0_irqhandler.c
 	src/sys/arch/hpcarm/hpcarm: autoconf.c hpc_machdep.c
 	src/sys/arch/hpcarm/include: irqhandler.h

 Log Message:
 Let hpcarm kernels boot again after the merge of the armv6 branch.  Fixes
 PR port-hpcarm/38591

 XXX: There is still a hard hang that I've seen on both shark and hpcarm in
 the process exit path; I don't know much beyond that yet.


 To generate a diff of this commit:
 cvs rdiff -r1.22 -r1.23 src/sys/arch/arm/sa11x0/sa11x0.c
 cvs rdiff -r1.13 -r1.14 src/sys/arch/arm/sa11x0/sa11x0_irq.S
 cvs rdiff -r1.15 -r1.16 src/sys/arch/arm/sa11x0/sa11x0_irqhandler.c
 cvs rdiff -r1.16 -r1.17 src/sys/arch/hpcarm/hpcarm/autoconf.c
 cvs rdiff -r1.85 -r1.86 src/sys/arch/hpcarm/hpcarm/hpc_machdep.c
 cvs rdiff -r1.8 -r1.9 src/sys/arch/hpcarm/include/irqhandler.h

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->closed
State-Changed-By: rafal@NetBSD.org
State-Changed-When: Fri, 13 Jun 2008 13:41:26 +0000
State-Changed-Why:
Fixed; I still see some hangs around the process exit path (on both Shark
and hpcarm) but will open another bug for that.


>Unformatted:
 		NetBSD 4.99.62 (J720) #0: Mon May  5 15:57:08 EDT 2008
 		Clean source from May 5th, ~14:00 EDT
