NetBSD Problem Report #49154

From www@NetBSD.org  Tue Aug 26 12:45:26 2014
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
	by mollari.NetBSD.org (Postfix) with ESMTPS id B4F62AF4FC
	for <gnats-bugs@gnats.NetBSD.org>; Tue, 26 Aug 2014 12:45:26 +0000 (UTC)
Message-Id: <20140826124525.43751AF57B@mollari.NetBSD.org>
Date: Tue, 26 Aug 2014 12:45:25 +0000 (UTC)
From: jdbaker@mylinuxisp.com
Reply-To: jdbaker@consolidated.net
To: gnats-bugs@NetBSD.org
Subject: Can't find boot device if net-booted from sk(4) gigabit ethernet, prompts for root device
X-Send-Pr-Version: www-1.0

>Number:         49154
>Notify-List:    jdbaker@consolidated.net
>Category:       kern
>Synopsis:       Can't find boot device if net-booted from sk(4) gigabit ethernet, prompts for root device
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Aug 26 12:50:00 +0000 2014
>Last-Modified:  Tue Jan 29 06:31:40 +0000 2019
>Originator:     John D. Baker
>Release:        NetBSD/i386-7.99.1
>Organization:
>Environment:
NetBSD plexor 7.99.1 NetBSD 7.99.1 (PLEXOR) #6: Sun Aug 24 20:23:32 CDT 2014  sysop@verthandi.technoskunk.fur:/d0/build/current/obj/i386/sys/arch/i386/compile/PLEXOR i386

>Description:
When netbooting a system with an sk(4) PCI gigabit ethernet card, the
kernel is unable to find the boot device and subsequently prompts the
user for the root device, dump device, and file system type.

Excerpts from 'dmesg':

NetBSD 7.99.1 (PLEXOR) #6: Sun Aug 24 20:23:32 CDT 2014
        sysop@verthandi.technoskunk.fur:/d0/build/current/obj/i386/sys/arch/i386/compile/PLEXOR
total memory = 510 MB
avail memory = 492 MB
kern.module.path=/stand/i386/7.99.1/modules
timecounter: Timecounters tick every 10.000 msec
userconf: configure system autoconfiguration:
uc> disable eap
[ 88] eap* disabled
uc> exit
Continuing...
timecounter: Timecounter "i8254" frequency 1193182 Hz quality 100
Dell Computer Corporation OptiPlex GX110               
[...]
skc0 at pci1 dev 10 function 0: irq 9
skc0: interrupt moderation is 0 us
skc0: 3Com Gigabit NIC (3C2000) rev. (0x1)
sk0 at skc0 port A: Ethernet address 00:0a:5e:24:c2:08
makphy0 at sk0 phy 0: Marvell 88E1011 Gigabit PHY, rev. 3
makphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
[...]
IPsec: Initialized Security Association Processing.
Kernelized RAIDframe activated
findroot: netboot interface not found.
findroot: netboot interface not found.
boot device: <unknown>
root device: sk0
dump device: 
file system (default generic): 
root on sk0
nfs_boot: trying DHCP/BOOTP
skc0: interrupt moderation is 1000 us
ehci0: unrecoverable error, controller halted
ehci0: blocking intrs 0x10


On the machine from which these logs were taken, having the sk(4) card
installed appears to interfere with an eap(4) audio card such that the
machine panics during boot.  That driver was disabled via userconf (see
the "disable eap" line above) to permit the boot to proceed.  The sk(4)
card also appears to adversely affect the USB portion of a USB2/Firewire
card installed in the same machine: the ehci(4) driver would sometimes
report "strange port".  Both of these issues belong in their own PRs,
since the root-device behavior has also been observed on another machine
using this card, with no such side effects.

(This machine was available for testing; the other is not.)
>How-To-Repeat:
Attempt to netboot/NFS-root a machine using an sk(4) card as shown
above.
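
When repeating the test, a captured 'dmesg' can be checked for the
symptom with a small script.  The following is only a sketch: the
function name netboot_symptoms() is hypothetical, and the strings it
looks for are taken verbatim from the excerpt in the Description.

```python
import re


def netboot_symptoms(dmesg_text):
    """Summarise the root-device lines found in a dmesg capture.

    Hypothetical helper; it only pattern-matches the messages quoted
    in this PR and knows nothing else about the kernel.
    """
    result = {"findroot_failed": False, "boot_device": None, "root_device": None}
    for line in dmesg_text.splitlines():
        line = line.strip()
        # Message printed (twice in the excerpt) when the netboot
        # interface could not be matched to a device.
        if line == "findroot: netboot interface not found.":
            result["findroot_failed"] = True
        # "boot device: <unknown>" and "root device: sk0"
        m = re.match(r"(boot|root) device:\s*(.*)$", line)
        if m:
            result[m.group(1) + "_device"] = m.group(2).strip()
    return result


SAMPLE_ROOT = """\
findroot: netboot interface not found.
findroot: netboot interface not found.
boot device: <unknown>
root device: sk0
dump device:
"""

if __name__ == "__main__":
    print(netboot_symptoms(SAMPLE_ROOT))
```

A capture showing findroot_failed set and a boot device of "<unknown>"
matches the behavior reported here.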
>Fix:

>Release-Note:

>Audit-Trail:
From: "John D. Baker" <jdbaker@mylinuxisp.com>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/49154: Can't find boot device if net-booted from sk(4)
 gigabit ethernet, prompts for root device
Date: Tue, 26 Aug 2014 08:00:27 -0500 (CDT)

 Looking back through the 'dmesg' output with eap(4) enabled reveals
 that skc0, eap0, ehci0, and the onboard uhci0 all share the same
 interrupt line (irq 9):

 [...]
 eap0 at pci1 dev 7 function 0: Ensoniq AudioPCI 97 ES1371-B (rev. 0x09)
 eap0: interrupting at irq 9                                            
 eap0: ac97: Crystal CS4297A codec; headphone, 20 bit DAC, 18 bit ADC, Crystal Semi 3D
 eap0: ac97: ext id 0x200<AMAP>
 audio0 at eap0: full duplex, playback, capture, mmap, independent
 eap0: attaching secondary DAC                                    
 audio1 at eap0: full duplex, playback, capture, mmap, independent
 [...]
 ehci0 at pci1 dev 8 function 2: NEC USB2 Host Controller (rev. 0x04)
 ehci0: interrupting at irq 9                                        
 usb0 at ehci0: USB revision 2.0
 [...]
 skc0 at pci1 dev 10 function 0: irq 9 
 skc0: 3Com Gigabit NIC (3C2000) rev. (0x1)
 sk0 at skc0 port A: Ethernet address 00:0a:5e:24:c2:08
 makphy0 at sk0 phy 0: Marvell 88E1011 Gigabit PHY, rev. 3
 makphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
 [...]
 uhci0 at pci0 dev 31 function 2: Intel 82801AA USB Controller (rev. 0x02)
 uhci0: interrupting at irq 9                                             
 usb1 at uhci0: USB revision 1.0

 Perhaps that is significant in the failure to find the root device?
 (It definitely is in the eap(4) panic case, but that's out of scope.)
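
The interrupt sharing noted above can be pulled out of a 'dmesg'
capture mechanically.  This is a minimal sketch, assuming only the two
attach-line forms that appear in the excerpts ("devN: interrupting at
irq N" and "devN at pciN dev N function N: irq N"); the function name
devices_by_irq() is hypothetical.

```python
import re
from collections import defaultdict

# Assumed line forms, both copied from the excerpts in this PR:
#   "eap0: interrupting at irq 9"
#   "skc0 at pci1 dev 10 function 0: irq 9"
IRQ_RE = re.compile(
    r"^\s*(\w+?\d+)(?: at \S+ dev \d+ function \d+)?: "
    r"(?:interrupting at )?irq (\d+)\s*$"
)


def devices_by_irq(dmesg_text):
    """Map each IRQ number to the device names reported on it."""
    groups = defaultdict(list)
    for line in dmesg_text.splitlines():
        m = IRQ_RE.match(line)
        if not m:
            continue
        dev, irq = m.group(1), int(m.group(2))
        if dev not in groups[irq]:  # skip repeated lines for one device
            groups[irq].append(dev)
    return dict(groups)


SAMPLE_IRQ = """\
 eap0 at pci1 dev 7 function 0: Ensoniq AudioPCI 97 ES1371-B (rev. 0x09)
 eap0: interrupting at irq 9
 ehci0: interrupting at irq 9
 skc0 at pci1 dev 10 function 0: irq 9
 uhci0: interrupting at irq 9
"""

if __name__ == "__main__":
    for irq, devs in sorted(devices_by_irq(SAMPLE_IRQ).items()):
        print("irq %d: %s" % (irq, ", ".join(devs)))
```

Any IRQ listed with more than one device is shared; for the lines
quoted above that is irq 9 with eap0, ehci0, skc0, and uhci0.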

 -- 
 |/"\ John D. Baker, KN5UKS               NetBSD     Darwin/MacOS X
 |\ / jdbaker[snail]mylinuxisp[flyspeck]com    OpenBSD            FreeBSD
 | X  No HTML/proprietary data in email.   BSD just sits there and works!
 |/ \ GPGkeyID:  D703 4A7E 479F 63F8 D3F4  BD99 9572 8F23 E4AD 1645

>Unformatted:
