NetBSD Problem Report #51279

From www@NetBSD.org  Sun Jun 26 19:22:28 2016
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 445037A476
	for <gnats-bugs@gnats.NetBSD.org>; Sun, 26 Jun 2016 19:22:28 +0000 (UTC)
Message-Id: <20160626192227.1E1907AAB3@mollari.NetBSD.org>
Date: Sun, 26 Jun 2016 19:22:27 +0000 (UTC)
From: bsiegert@NetBSD.org
Reply-To: bsiegert@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: bootxx_ffsv2 hangs
X-Send-Pr-Version: www-1.0

>Number:         51279
>Category:       port-amd64
>Synopsis:       bootxx_ffsv2 hangs
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    port-amd64-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Jun 26 19:25:00 +0000 2016
>Last-Modified:  Tue Dec 27 18:50:00 +0000 2016
>Originator:     Benny Siegert
>Release:        NetBSD 7.99.32/amd64, snapshot from 2016-06-26
>Organization:
The NetBSD Foundation
>Environment:
>Description:
I just installed NetBSD-current from a snapshot dated 2016-06-26.

The system is a "Skull Canyon" Intel NUC with Skylake chipset. The disk I installed on is an NVMe SSD. It shows up as nvme0 and ld0.

After installation, here is the full content of my screen when the system hangs:

Fn: diskn
1: Windows
2: NetBSD

[here is where I pressed 2]

NetBSD/x86 ffsv2 Primary Bootstrap

Then it hangs.

I can successfully boot from a USB stick (which uses bootxx_ffsv1) and enter "boot hd1a:/netbsd" to boot the installed system.
>How-To-Repeat:
Install NetBSD-current (amd64) on a recent Intel NUC.
>Fix:
Would using bootxx_ffsv1 help? The FS is v2 though.

>Audit-Trail:
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-amd64/51279: bootxx_ffsv2 hangs
Date: Mon, 27 Jun 2016 04:33:35 +0000

 On Sun, Jun 26, 2016 at 07:25:00PM +0000, bsiegert@NetBSD.org wrote:
  > NetBSD/x86 ffsv2 Primary Bootstrap
  > 
  > Then it hangs.
  > 
  > I can successfully boot from a USB stick (which uses bootxx_ffsv1)
  > and enter "boot hd1a:/netbsd" to boot the installed system.

 Hrm... it is odd, because bootxx_ffsv1 and bootxx_ffsv2 are almost
 identical. I would have guessed that the problem is actually somewhere
 very early in kernel initialization... except that if you can boot the
 same kernel image with bootxx_ffsv1, it can't be.

 Is the usb stick from the same build? And do you still have the
 objects from that build lying around?

 (In general bootxx_ffsv1 won't work on an ffsv2, either, so I'm sort
 of surprised the usb stick can read the volume...)

 (of course, the problem could be: the volume isn't actually ffsv2, and
 in that case maybe that's why bootxx_ffsv2 can't read it... though
 you'd expect errors rather than wedging.)

 -- 
 David A. Holland
 dholland@netbsd.org

From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-amd64/51279: bootxx_ffsv2 hangs
Date: Mon, 27 Jun 2016 11:49:21 +0200

 On Mon, Jun 27, 2016 at 04:35:00AM +0000, David Holland wrote:
 >  (In general bootxx_ffsv1 won't work on an ffsv2, either, so I'm sort
 >  of surprised the usb stick can read the volume...)

 The v1 vs. v2 is only about the boot sector reading /boot, which is already
 fully done at the point you get the prompt. And /boot should be able to
 work with various file system types.

 Can you please verify you have /boot on your hard disk?

 Martin

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-amd64/51279: bootxx_ffsv2 hangs
Date: Wed, 29 Jun 2016 17:55:27 +0000

 On Mon, Jun 27, 2016 at 09:50:01AM +0000, Martin Husemann wrote:
  >  On Mon, Jun 27, 2016 at 04:35:00AM +0000, David Holland wrote:
  >  >  (In general bootxx_ffsv1 won't work on an ffsv2, either, so I'm sort
  >  >  of surprised the usb stick can read the volume...)
  >  
  >  The v1 vs. v2 is only about the boot sector reading /boot, which is already
  >  full done at the point you get the prompt. And /boot should be able to
  >  work with various file system types.

 right, I knew that. blah

  >  Can you please verify you have /boot on your hard disk?

 excellent point.

 -- 
 David A. Holland
 dholland@netbsd.org

From: Benny Siegert <bsiegert@gmail.com>
To: gnats-bugs@NetBSD.org
Cc: port-amd64-maintainer@netbsd.org,
 gnats-admin@netbsd.org,
 netbsd-bugs@netbsd.org,
 bsiegert@NetBSD.org
Subject: Re: port-amd64/51279: bootxx_ffsv2 hangs
Date: Thu, 30 Jun 2016 22:59:08 +0200

 Thanks for all the replies so far!

 >>> (In general bootxx_ffsv1 won't work on an ffsv2, either, so I'm sort
 >>> of surprised the usb stick can read the volume...)
 >>
 >> The v1 vs. v2 is only about the boot sector reading /boot, which is already
 >> full done at the point you get the prompt. And /boot should be able to
 >> work with various file system types.

 It makes no difference if bootxx_ffsv1 or v2 is used. Both give the same result.

 >> Can you please verify you have /boot on your hard disk?
 >
 > excellent point.

 I do have a /boot, the same as on the install image.

 I tried the /boot from 7.0.1 (as that’s what is on the USB stick that does boot), and it does not work either.

 --Benny.

From: Martin Husemann <martin@duskware.de>
To: Benny Siegert <bsiegert@gmail.com>
Cc: gnats-bugs@NetBSD.org
Subject: Re: port-amd64/51279: bootxx_ffsv2 hangs
Date: Fri, 1 Jul 2016 11:07:50 +0200

 On Thu, Jun 30, 2016 at 10:59:08PM +0200, Benny Siegert wrote:
 > It makes no difference if bootxx_ffsv1 or v2 is used. Both give the same result.

 This is very strange.

 Does the ld0 disk appear somehow in the output of sysctl machdep.diskinfo?
 I guess it should be wd1 there, as you said boot hd1a:/netbsd works from
 usb.

 I wonder if the bios keeps it at index 1 (0x81) even if booting from it,
 but passes 0x80 to the boot block? Debugging this will probably involve
 creating special boot blocks with "look how far I got" printouts. Can
 you reboot that machine freely for testing? If so, we should take this
 off-list/gnats and report results back later.

 Martin

From: Benny Siegert <bsiegert@gmail.com>
To: gnats-bugs@NetBSD.org
Cc: port-amd64-maintainer@netbsd.org,
 gnats-admin@netbsd.org,
 netbsd-bugs@netbsd.org,
 bsiegert@NetBSD.org
Subject: Re: port-amd64/51279: bootxx_ffsv2 hangs
Date: Sun, 7 Aug 2016 11:44:30 +0200

 > On Thu, Jun 30, 2016 at 10:59:08PM +0200, Benny Siegert wrote:
 >> It makes no difference if bootxx_ffsv1 or v2 is used. Both give the same result.
 >
 > This is very strange.
 >
 > Does the ld0 disk appear somehow in the output of sysctl machdep.diskinfo?
 > I guess it should be wd1 there, as you said boot hd1a:/netbsd works from
 > usb.

 I got grub2 (pc) to boot NetBSD working in the meantime. When booting from grub, machdep.diskinfo has
 ld0 sd0

 So ld0 is 0x80? The 0x81 comes from the fact (I think) that the BIOS inserts the USB as 0x80 when booting from it.

 > I wonder if the bios keeps it at index 1 (0x81) even if booting from it,
 > but passes 0x80 to the boot block?

 Probably not?
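
The numbering being discussed here can be sketched in a few lines. This is only an illustration: it assumes the usual PC BIOS convention that hard disks are numbered upward from 0x80 and that the boot loader's hdN name refers to BIOS drive 0x80+N. The helper names hd_to_bios and bios_to_hd are made up for this sketch and are not part of any NetBSD tool.

```python
import re

BIOS_HD_BASE = 0x80  # assumption: the first BIOS hard disk is drive 0x80


def hd_to_bios(name):
    """Map a boot-loader disk name such as 'hd1a' to a BIOS drive number."""
    # Optional trailing letter is the partition (a-p); it does not affect
    # which BIOS drive is addressed.
    m = re.fullmatch(r"hd(\d+)([a-p])?", name)
    if m is None:
        raise ValueError(f"not an hdN name: {name!r}")
    return BIOS_HD_BASE + int(m.group(1))


def bios_to_hd(number):
    """Map a BIOS hard-disk number such as 0x81 back to an 'hdN' name."""
    if number < BIOS_HD_BASE:
        raise ValueError(f"not a BIOS hard-disk number: {number:#x}")
    return f"hd{number - BIOS_HD_BASE}"
```

Under that assumption, the "boot hd1a:/netbsd" command that works from the USB stick addresses BIOS drive 0x81, with the stick itself occupying 0x80.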

 > Debugging this will probably involve
 > creating special boot blocks with "look how far I got" printouts. Can
 > you reboot that machine freely for testing?

 Given that I have neither network nor X, I cannot run it productively, so sure, I’d love to! :)

From: mlelstv@serpens.de (Michael van Elst)
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-amd64/51279: bootxx_ffsv2 hangs
Date: Sun, 7 Aug 2016 11:08:35 +0000 (UTC)

 bsiegert@gmail.com (Benny Siegert) writes:

 >I got grub2 (pc) to boot NetBSD working in the meantime. When booting from grub, machdep.diskinfo has
 >ld0 sd0

 >So ld0 is 0x80? The 0x81 comes from the fact (I think) that the BIOS inserts the USB as 0x80 when booting from it.


 machdep.diskinfo is the list of disk devices that exist when the kernel
 tries to determine the boot device.



 sysctl first prints the list of BIOS disks as

      biosnumber:numsectors(ncyl,nhead,nsec),flags

 then follows the list of matched netbsd devices as

      devunit[:biosnumber[,biosnumber,...]]



 Example:

 machdep.diskinfo: 80:976773168(1024/36/30),2 81:976773168(1023/64/32),2  wd0:80 wd1 sd0:81

 The BIOS knows about one boot disk (the first SATA drive wd0) and
 a USB boot disk (external USB enclosure).

 wd0 matches with BIOS disk 80.
 sd0 matches with BIOS disk 81.


 If you just get 'ld0 sd0', then the bootloader didn't provide the list
 of BIOS devices to the kernel and obviously there aren't any matches.
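
The format described above can be taken apart mechanically. The following is a minimal sketch, not the kernel's or sysctl's own code: the function name parse_diskinfo is hypothetical, the BIOS number is assumed to be hexadecimal, the flags field is kept as an uninterpreted string, and the geometry is accepted with either "/" or "," separators because the format line and the example differ on that point.

```python
import re

# biosnumber:numsectors(ncyl/nhead/nsec),flags
BIOS_RE = re.compile(
    r"^([0-9a-fA-F]+):(\d+)\((\d+)[/,](\d+)[/,](\d+)\),(\w+)$"
)
# devunit[:biosnumber[,biosnumber,...]]
DEV_RE = re.compile(
    r"^([a-z]+\d+)(?::([0-9a-fA-F]+(?:,[0-9a-fA-F]+)*))?$"
)


def parse_diskinfo(value):
    """Split a machdep.diskinfo value into BIOS disks and NetBSD devices."""
    bios, devices = {}, {}
    for token in value.split():
        m = BIOS_RE.match(token)
        if m:
            bios[int(m.group(1), 16)] = {
                "sectors": int(m.group(2)),
                "geometry": tuple(int(m.group(i)) for i in (3, 4, 5)),
                "flags": m.group(6),  # kept as text, meaning not assumed
            }
            continue
        m = DEV_RE.match(token)
        if m:
            matched = m.group(2)
            # A device with no ":biosnumber" suffix matched no BIOS disk.
            devices[m.group(1)] = (
                [int(n, 16) for n in matched.split(",")] if matched else []
            )
    return bios, devices
```

Fed the example line, this yields two BIOS disks (0x80 and 0x81) with wd0 matched to 0x80, sd0 matched to 0x81, and wd1 unmatched. Fed just 'ld0 sd0', the BIOS table comes back empty and neither device has a match, which is the situation described above.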


 -- 
 -- 
                                 Michael van Elst
 Internet: mlelstv@serpens.de
                                 "A potential Snark may lurk in every tree."

From: Benny Siegert <bsiegert@gmail.com>
To: gnats-bugs@NetBSD.org
Cc: port-amd64-maintainer@netbsd.org,
 gnats-admin@netbsd.org,
 netbsd-bugs@netbsd.org,
 bsiegert@NetBSD.org
Subject: Re: port-amd64/51279: bootxx_ffsv2 hangs
Date: Tue, 6 Sep 2016 10:10:21 +0200

 >
 > machdep.diskinfo is the list of disk devices that exist when the kernel
 > tries to determine the boot device.
 >
 > If you just get 'ld0 sd0', then the bootloader didn't provide the list
 > of BIOS devices to the kernel and obviously there aren't any matches.

 OK, booting from the USB stick again (boot hd1a:/netbsd), I get this:

 machdep.diskinfo: 80:3987456(988/64/63),2 81:500118192(1023/255/63),2 ld0:81 sd0:80
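
As a rough sanity check on which BIOS disk is which, the sector counts in this output can be converted to sizes. The sketch below assumes 512-byte sectors, which the output itself does not state; the function name bios_disk_sizes is made up for the illustration.

```python
import re

SECTOR_SIZE = 512  # assumption: the BIOS reports 512-byte sectors


def bios_disk_sizes(diskinfo):
    """Return {bios_number: size_in_bytes} for each BIOS disk entry."""
    sizes = {}
    # BIOS entries look like "80:3987456(988/64/63),2"; device entries such
    # as "ld0:81" have no "(" after the number and are skipped.
    for number, sectors in re.findall(r"\b([0-9a-fA-F]+):(\d+)\(", diskinfo):
        sizes[int(number, 16)] = int(sectors) * SECTOR_SIZE
    return sizes


line = "80:3987456(988/64/63),2 81:500118192(1023/255/63),2 ld0:81 sd0:80"
for number, size in sorted(bios_disk_sizes(line).items()):
    print(f"BIOS disk {number:#x}: {size / 10**9:.1f} GB")
```

Under that assumption, disk 0x80 comes out at about 2 GB and disk 0x81 at about 256 GB, which is consistent with sd0 (0x80) being the USB stick and ld0 (0x81) being the NVMe SSD.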


From: David Laight <david@l8s.co.uk>
To: gnats-bugs@NetBSD.org
Cc: port-amd64-maintainer@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org, bsiegert@NetBSD.org
Subject: Re: port-amd64/51279: bootxx_ffsv2 hangs
Date: Tue, 27 Dec 2016 18:06:19 +0000

 On Mon, Jun 27, 2016 at 09:50:01AM +0000, Martin Husemann wrote:
 > The following reply was made to PR port-amd64/51279; it has been noted by GNATS.
 > 
 > From: Martin Husemann <martin@duskware.de>
 > To: gnats-bugs@NetBSD.org
 > Cc: 
 > Subject: Re: port-amd64/51279: bootxx_ffsv2 hangs
 > Date: Mon, 27 Jun 2016 11:49:21 +0200
 > 
 >  On Mon, Jun 27, 2016 at 04:35:00AM +0000, David Holland wrote:
 >  >  (In general bootxx_ffsv1 won't work on an ffsv2, either, so I'm sort
 >  >  of surprised the usb stick can read the volume...)
 >  
 >  The v1 vs. v2 is only about the boot sector reading /boot, which is already
 >  full done at the point you get the prompt. And /boot should be able to
 >  work with various file system types.
 >  
 >  Can you please verify you have /boot on your hard disk?

 You might be 'falling over' an undocumented restriction that / must be
 at the start of the bios partition.

 	David

 -- 
 David Laight: david@l8s.co.uk

From: coypu@SDF.ORG
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-amd64/51279: bootxx_ffsv2 hangs
Date: Tue, 27 Dec 2016 18:47:31 +0000

 I apparently forgot to respond to the actual PR.
 Many NVMe devices can't do legacy boot, which is likely the problem here.
