NetBSD Problem Report #17756

Received: (qmail 22719 invoked by uid 605); 29 Jul 2002 14:07:12 -0000
Message-Id: <20020729140708.B163A712@WintelKiller.HEH.Uni-Oldenburg.DE>
Date: Mon, 29 Jul 2002 16:07:08 +0200 (MEST)
From: Thilo.Manske@HEH.Uni-Oldenburg.DE
Sender: gnats-bugs-owner@netbsd.org
Reply-To: Thilo.Manske@HEH.Uni-Oldenburg.DE
To: gnats-bugs@gnats.netbsd.org
Subject: diskless booting/NFS broken since 1.6D and E
X-Send-Pr-Version: 3.95

>Number:         17756
>Category:       port-prep
>Synopsis:       diskless booting/NFS broken since 1.6D and E
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    port-prep-maintainer
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Jul 29 14:08:00 +0000 2002
>Closed-Date:    Tue Aug 20 08:17:29 +0000 2002
>Last-Modified:  Tue Aug 20 08:17:29 +0000 2002
>Originator:     Thilo Manske
>Release:        current since a few weeks (1.6D, maybe earlier)
>Organization:
This is Thilo's Unix signature! Have fun with it.
>Environment:
System: NetBSD Berta 1.6A NetBSD 1.6A (Berta) #85: Fri May 24 12:29:37 MEST 2002 thilo@Berta:/usr/src/sys/arch/prep/compile/Berta prep
Architecture: powerpc
Machine: prep
>Description:
When my diskless IBM 7248/100 boots 1.6D or later kernels it panics
after it mounts the root device:

boot device: <unknown>                                                     
root on pcn0          
nfs_boot: trying DHCP/BOOTP
nfs_boot: DHCP next-server: 10.2.0.1
nfs_boot: my_name=Berta             
nfs_boot: my_domain=T  
nfs_boot: my_addr=10.2.0.32
nfs_boot: my_mask=255.255.255.0
nfs_boot: gateway=10.2.0.1     
root on 10.2.0.1:/diskless/nfs/berta
init: not found                     
panic: no init 

A tcpdump of the last part of the boot sequence looks like this:

15:47:33.682988 Berta.T.1023 > Seti.T.sunrpc:  udp 76
15:47:33.927604 Seti.T.sunrpc > Berta.T.1023:  udp 28
15:47:33.928265 Berta.T.1023 > Seti.T.1016:  udp 84
15:47:34.060538 Seti.T.1016 > Berta.T.1023:  udp 68
15:47:34.061220 Berta.T.1023 > Seti.T.sunrpc:  udp 76
15:47:34.064098 Seti.T.sunrpc > Berta.T.1023:  udp 28
15:47:34.080794 Berta.T.1 > Seti.T.nfs: 96 getattr fh 0,14/1931
15:47:34.081046 Seti.T.nfs > Berta.T.1: reply ok 112 getattr DIR 755 ids 0/0 sz 0x000000200
15:47:34.082853 Berta.T.2 > Seti.T.nfs: 92 fsinfo [|nfs]
15:47:34.083012 Seti.T.nfs > Berta.T.2: reply ok 164 fsinfo rtmax 65536 rtpref 32768 wtmax 65536 wtpref 32768 dtpref 32768 [|nfs]
15:47:34.083930 Berta.T.3 > Seti.T.nfs: 116 readdir fh 0,14/1931 8192 bytes @ 0x000000000
15:47:34.107357 Seti.T.nfs > Berta.T.3: reply ok 852 readdir
15:47:34.163226 arp who-has Seti.T tell Berta.T
15:47:34.163265 arp reply Seti.T is-at 0:e0:29:47:90:ff
15:47:34.163688 Berta.T.1417109505 > Seti.T.nfs: 104 lookup fh 0,14/1931 "dev"
15:47:34.163893 Seti.T.nfs > Berta.T.1417109505: reply ok 236 lookup fh 0,14/1931
15:47:34.164767 Berta.T.1417109506 > Seti.T.nfs: 96 getattr fh 0,14/1931
15:47:34.164909 Seti.T.nfs > Berta.T.1417109506: reply ok 112 getattr DIR 755 ids 0/0 sz 0x000003800
15:47:34.166649 Berta.T.1417109507 > Seti.T.nfs: 108 lookup fh 0,14/1931 "console"
15:47:34.166822 Seti.T.nfs > Berta.T.1417109507: reply ok 236 lookup fh 0,14/1931
15:47:34.167722 Berta.T.1417109508 > Seti.T.nfs: 96 getattr fh 0,14/1931
15:47:34.167857 Seti.T.nfs > Berta.T.1417109508: reply ok 112 getattr CHR 600 ids 0/0 sz 0x000000000
15:47:34.170448 Berta.T.1417109509 > Seti.T.nfs: 104 lookup fh 0,14/1931 "M-^?M-^?M-^?"
15:47:34.170623 Seti.T.nfs > Berta.T.1417109509: reply ok 116 lookup ERROR: No such file or directory

Please note the garbled filename in the last NFS lookup (it's ff ff ff in
hex).
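
As a rough illustration of why three 0xff bytes show up as "M-^?M-^?M-^?" in the trace above, the sketch below applies the usual meta/control escaping (the same notation `cat -v` produces). The function name `meta_control_repr` is hypothetical, and the assumption that tcpdump escaped the filename this way is inferred from the output, not stated in this report.

```python
def meta_control_repr(name: bytes) -> str:
    """Render bytes using meta/control notation (hypothetical helper).

    Bytes with the high bit set are prefixed with "M-" and reduced to
    7 bits; control characters and DEL are shown as "^" followed by the
    character XOR 0x40.
    """
    out = []
    for c in name:
        if c >= 0x80:
            out.append("M-")   # high bit set: "meta" prefix
            c &= 0x7F          # keep the low 7 bits
        if c < 0x20 or c == 0x7F:
            out.append("^")    # control character or DEL
            c ^= 0x40          # 0x7f becomes '?', 0x01 becomes 'A'
        out.append(chr(c))
    return "".join(out)


if __name__ == "__main__":
    # The garbled lookup name from the trace: ff ff ff
    print(meta_control_repr(b"\xff\xff\xff"))
    # An ordinary name passes through unchanged
    print(meta_control_repr(b"dev"))
```

Each 0xff byte has the high bit set (hence "M-") and leaves 0x7f, i.e. DEL, after masking (hence "^?").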

1.6A from May worked fine. The problem doesn't appear on the other architectures
I've tried so far (sgimips, sparc, alpha), but I haven't tried another powerpc
platform yet.

>How-To-Repeat:
NFS-mount the root file system on a prep machine (diskless boot) with a
1.6D or later kernel?
>Fix:
>Release-Note:
>Audit-Trail:

From: Thilo Manske <Thilo.Manske@HEH.Uni-Oldenburg.DE>
To: Chuck Silvers <chuq@chuq.com>
Cc: gnats-bugs@netbsd.org
Subject: Re: port-prep/17756: diskless booting/NFS broken since 1.6D and E
Date: Wed, 31 Jul 2002 14:54:35 +0200

 On Mon, Jul 29 2002 at 13:29:01 -0700, Chuck Silvers wrote:
 > > > if that's not it, could you do some binary searching to find which
 > > > commit introduced the problem?
 > > I'll try my best, but that'll need some time...
 > 
 > great, that's a big help.
 Ok, after ~10 kernels I think I have nailed it down to this:

 |Module Name:	syssrc
 |Committed By:	matt
 |Date:		Wed Jun 26 01:16:24 UTC 2002
 |
 |Modified Files:
 |	syssrc/sys/arch/powerpc/include/mpc6xx: vmparam.h
 |
 |Log Message:
 |When not using the OLD pmap, bump kernel KVA space to 512MB (OLD pmap stays
 |at 256MB).
 |
 |
 |To generate a diff of this commit:
 |cvs rdiff -r1.1 -r1.2 syssrc/sys/arch/powerpc/include/mpc6xx/vmparam.h

 I.e., a kernel built from sources checked out as of 'Jun 27' works
 with vmparam.h rev 1.1 but not with rev 1.2.
 (That file has no RCS tag, by the way.)

 -- 
 This is Thilo's Unix signature! Have fun with it.
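
The binary search over source checkouts described in the message above can be sketched roughly as follows. This is only an illustration: `first_bad` and the `is_bad` predicate are hypothetical names, and `is_bad` stands in for the manual work of checking out sources, building a kernel, and test-booting it. The sketch assumes the candidates are ordered oldest to newest, the oldest is known good, the newest is known bad, and the breakage persists once introduced.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def first_bad(candidates: Sequence[T], is_bad: Callable[[T], bool]) -> T:
    """Return the earliest candidate for which is_bad() is true.

    Assumes candidates[0] is known good and candidates[-1] is known bad,
    and that every candidate after the first bad one is also bad.
    """
    lo, hi = 0, len(candidates) - 1   # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(candidates[mid]):
            hi = mid                  # breakage is at mid or earlier
        else:
            lo = mid                  # breakage is after mid
    return candidates[hi]


if __name__ == "__main__":
    # Hypothetical example: 1000 checkouts, breakage introduced at index 637.
    tests_run = 0

    def kernel_panics(checkout: int) -> bool:
        global tests_run
        tests_run += 1                # one kernel build and test boot
        return checkout >= 637

    print(first_bad(range(1000), kernel_panics), tests_run)
```

Each test halves the remaining range, so about ten test kernels are enough to isolate one change among roughly a thousand candidates.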

From: Thilo Manske <Thilo.Manske@HEH.Uni-Oldenburg.DE>
To: gnats-bugs@gnats.netbsd.org
Cc:  
Subject: Re: port-prep/17756: diskless booting/NFS broken since 1.6D and E
Date: Tue, 13 Aug 2002 09:30:24 +0200

 The problem seems to have been fixed silently - a kernel made with
 yesterday's sources works. (This PR can be closed now.)

 -- 
 This is Thilo's Unix signature! Have fun with it.
State-Changed-From-To: open->closed 
State-Changed-By: chs 
State-Changed-When: Tue Aug 20 01:16:23 PDT 2002 
State-Changed-Why:  
closing at submitter's request. 
I wish we knew what was going on here. 
if it happens again, please let us know! 
>Unformatted:
