NetBSD Problem Report #56363

From www@netbsd.org  Sun Aug 15 09:50:09 2021
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id C72B61A921F
	for <gnats-bugs@gnats.NetBSD.org>; Sun, 15 Aug 2021 09:50:08 +0000 (UTC)
Message-Id: <20210815095007.EA1051A9239@mollari.NetBSD.org>
Date: Sun, 15 Aug 2021 09:50:07 +0000 (UTC)
From: tnn@nygren.pp.se
Reply-To: tnn@nygren.pp.se
To: gnats-bugs@NetBSD.org
Subject: boot: Inode not directory
X-Send-Pr-Version: www-1.0

>Number:         56363
>Category:       port-sparc64
>Synopsis:       boot: Inode not directory
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    port-sparc64-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Aug 15 09:55:00 +0000 2021
>Last-Modified:  Sat Aug 21 04:40:01 +0000 2021
>Originator:     Tobias Nygren
>Release:        9.99.88
>Organization:
>Environment:
>Description:
Attempted to use sysinst to install NetBSD on a Sun Netra 240 with a 73 GB SCSI disk. The installed system was unbootable, with the boot block reporting
"Inode not directory."
This may be due to a too-large "/" file system.
Reducing "/" from 56 GB to 1 GB made the installation succeed.
OBP version is 4.30.4.a.

>How-To-Repeat:
Install from NetBSD-daily ISO image.

>Fix:
The cause of the error message is not obvious to the end user, but
is likely due to firmware limitations.
It would be good to improve the boot block error message to hint at the cause, and possibly to have sysinst warn the user if "/" exceeds 4 GB on this port.

>Audit-Trail:
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-sparc64/56363: boot: Inode not directory
Date: Sun, 15 Aug 2021 12:02:54 +0200

 On Sun, Aug 15, 2021 at 09:55:00AM +0000, tnn@nygren.pp.se wrote:
 > It would be good to improve the boot block error message to hint
 > at the cause, and possibly to have sysinst warn the user if "/" exceeds
 > 4 GB on this port.

 Why 4GB? Are you using the latest available firmware (that might not be
 so easy to answer)?

 I hate introducing artificial limits like this when there is no
 clearly documented limit for the actual firmware (and I have been using
 bigger SCSI disks in various sparc64 machines ~forever).

 Martin

From: Tobias Nygren <tnn@nygren.pp.se>
To: gnats-bugs@netbsd.org
Cc: Martin Husemann <martin@duskware.de>
Subject: Re: port-sparc64/56363: boot: Inode not directory
Date: Sun, 15 Aug 2021 12:16:50 +0200

 On Sun, 15 Aug 2021 10:05:01 +0000 (UTC)
 Martin Husemann <martin@duskware.de> wrote:

 > The following reply was made to PR port-sparc64/56363; it has been noted by GNATS.
 > 
 > From: Martin Husemann <martin@duskware.de>
 > To: gnats-bugs@netbsd.org
 > Cc: 
 > Subject: Re: port-sparc64/56363: boot: Inode not directory
 > Date: Sun, 15 Aug 2021 12:02:54 +0200
 > 
 >  On Sun, Aug 15, 2021 at 09:55:00AM +0000, tnn@nygren.pp.se wrote:
 >  > It would be good to improve the boot block error message to hint
 >  > at the cause, and possibly to have sysinst warn the user if "/" exceeds
 >  > 4 GB on this port.
 >  
 >  Why 4GB? Are you using the latest available firmware (that might not be
 >  so easy to answer)?

 Yes, as far as I know it is the latest & last release. The Netra 240 is
 a modern server by NetBSD/sparc64 standards. (dual USIII)

 >  I hate introducing artificial limits like this when there is no
 >  clearly documented limit for the actual firmware (and I have been using
 >  bigger SCSI disks in various sparc64 machines ~forever).

 It doesn't need to be a hard limit. A warning would suffice, or at least
 a better error message. As it currently stands, the default partition
 sizes suggested by sysinst produce an unusable system with no clue
 how to work around it, other than googling it and eventually,
 hopefully, finding this very PR.

From: Martin Husemann <martin@duskware.de>
To: Tobias Nygren <tnn@nygren.pp.se>
Cc: gnats-bugs@netbsd.org
Subject: Re: port-sparc64/56363: boot: Inode not directory
Date: Sun, 15 Aug 2021 12:26:44 +0200

 On Sun, Aug 15, 2021 at 12:16:50PM +0200, Tobias Nygren wrote:
 > Yes, as far as I know it is the latest & last release. The Netra 240 is
 > a modern server by NetBSD/sparc64 standards. (dual USIII)

 Yes, and I'm pretty surprised by the failure mode.

 It would be good to have a concrete list of troubled firmware versions
 and their limits and match that in sysinst or something.

 My v210 boots from 34GB scsi disks.

 Martin

From: David Brownlee <abs@absd.org>
To: gnats-bugs@netbsd.org
Cc: port-sparc64-maintainer@netbsd.org, gnats-admin@netbsd.org, 
	netbsd-bugs@netbsd.org, tnn@nygren.pp.se
Subject: Re: port-sparc64/56363: boot: Inode not directory
Date: Thu, 19 Aug 2021 13:53:26 +0100

 On Sun, 15 Aug 2021 at 11:30, Martin Husemann <martin@duskware.de> wrote:
 >
 > The following reply was made to PR port-sparc64/56363; it has been noted by GNATS.
 >
 > From: Martin Husemann <martin@duskware.de>
 > To: Tobias Nygren <tnn@nygren.pp.se>
 > Cc: gnats-bugs@netbsd.org
 > Subject: Re: port-sparc64/56363: boot: Inode not directory
 > Date: Sun, 15 Aug 2021 12:26:44 +0200
 >
 >  On Sun, Aug 15, 2021 at 12:16:50PM +0200, Tobias Nygren wrote:
 >  > Yes, as far as I know it is the latest & last release. The Netra 240 is
 >  > a modern server by NetBSD/sparc64 standards. (dual USIII)
 >
 >  Yes, and I'm pretty surprised by the failure mode.
 >
 >  It would be good to have a concrete list of troubled firmware versions
 >  and their limits and match that in sysinst or something.
 >
 >  My v210 boots from 34GB scsi disks.

 Is there any way to ensure that the blocks loaded from /ofwboot are at
 the end of a partition for testing purposes? Maybe have some code
 create repeated copies of ofwboot until it finds one with a block over
 a required limit, then remove the others and use that one?

 David

From: Martin Husemann <martin@duskware.de>
To: David Brownlee <abs@absd.org>
Cc: gnats-bugs@netbsd.org
Subject: Re: port-sparc64/56363: boot: Inode not directory
Date: Thu, 19 Aug 2021 15:56:02 +0200

 On Thu, Aug 19, 2021 at 01:53:26PM +0100, David Brownlee wrote:
 > Is there any way to ensure that the blocks loaded from /ofwboot are at
 > the end of a partition for testing purposes? Maybe have some code
 > create repeated copies of ofwboot until it finds one with a block over
 > a required limit, then remove the others and use that one?

 I am not sure I understand your idea - I don't think there is any
 non-destructive way to probe for this kind of limit.

 Martin

From: David Brownlee <abs@absd.org>
To: Martin Husemann <martin@duskware.de>
Cc: gnats-bugs@netbsd.org
Subject: Re: port-sparc64/56363: boot: Inode not directory
Date: Thu, 19 Aug 2021 15:16:04 +0100

 On Thu, 19 Aug 2021 at 14:56, Martin Husemann <martin@duskware.de> wrote:
 >
 > On Thu, Aug 19, 2021 at 01:53:26PM +0100, David Brownlee wrote:
 > > Is there any way to ensure that the blocks loaded from /ofwboot are at
 > > the end of a partition for testing purposes? Maybe have some code
 > > create repeated copies of ofwboot until it finds one with a block over
 > > a required limit, then remove the others and use that one?
 >
 > I am not sure I understand your idea - I don't think there is any
 > non-destructive way to probe for this kind of limit.

 I was thinking of a brute force approach where it just kept creating
 copies (and using up disk space), until it found one where at least
 one block was above a specified limit (or potentially all blocks below
 a limit), at which point all files except that one could be deleted.
 The code to check which blocks are in use for each file should be
 already available in installboot :)

 It would not be intended for this to be default behaviour, but it would
 probably make sense to build in the following:
 - work in a temporary subdirectory so it is easier to clean up
 - add optional max number of copies to try before aborting
 - add min free space in filesystem before aborting

 David
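
A minimal sketch of the brute-force idea above, written in Python purely for illustration. It is not installboot code: `place_copy_above` and its parameters are hypothetical names, and `blocks_of` stands in for whatever routine reports the disk blocks a file occupies, which the message suggests installboot already has. The three safeguards listed above (temporary subdirectory, maximum number of copies, minimum free space) appear as arguments.

```python
import os
import shutil
from typing import Callable, Iterable, Optional


def place_copy_above(
    source: str,
    workdir: str,
    limit_block: int,
    blocks_of: Callable[[str], Iterable[int]],  # hypothetical block lookup
    max_copies: int = 1000,
    min_free_bytes: int = 64 * 1024 * 1024,
) -> Optional[str]:
    """Copy `source` into `workdir` until one copy has a block >= limit_block.

    Returns the path of the qualifying copy (all other copies are removed),
    or None if max_copies or the free-space floor is reached first.
    """
    os.makedirs(workdir, exist_ok=True)
    made = []
    found = None
    for i in range(max_copies):
        # Stop before the file system gets too full.
        if shutil.disk_usage(workdir).free < min_free_bytes:
            break
        path = os.path.join(workdir, f"copy{i:06d}")
        shutil.copyfile(source, path)
        made.append(path)
        if any(b >= limit_block for b in blocks_of(path)):
            found = path
            break
    # Clean up every copy except the one we want to keep.
    for path in made:
        if path != found:
            os.remove(path)
    return found
```

The variant that wants all blocks below a limit would replace `any(b >= limit_block ...)` with `all(b < limit_block ...)`.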

From: Martin Husemann <martin@duskware.de>
To: David Brownlee <abs@absd.org>
Cc: gnats-bugs@netbsd.org
Subject: Re: port-sparc64/56363: boot: Inode not directory
Date: Thu, 19 Aug 2021 16:49:46 +0200

 On Thu, Aug 19, 2021 at 03:16:04PM +0100, David Brownlee wrote:
 > I was thinking of a brute force approach where it just kept creating
 > copies (and using up disk space), until it found one where at least
 > one block was above a specified limit

 The usual approach would be to write the block number and some random
 other value to each block, and to read each block first (checking whether
 it already holds a lower number because firmware has truncated something)
 before writing the test content.

 This could only be done inside ofwboot (and not the kernel or userland),
 so would have to happen in a separate path before installation. And it
 would wipe the whole physical disk (if lucky), or at least the OFW
 addressable part of it. And take hours.

 We could add such a "destructive probe mode" to ofwboot and collect results
 and firmware versions, but I doubt it is worth the effort.

 Martin
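
The read-before-write probe described above can be sketched as a small Python simulation. It is only a model: `TruncatingDisk` is a hypothetical toy device that silently drops the high bits of a block number, which is one possible failure mode and not a confirmed description of this firmware. A real probe would, as noted, have to run inside ofwboot and would overwrite the disk.

```python
class TruncatingDisk:
    """Toy disk whose 'firmware' keeps only the low `addr_bits` bits of a
    block number (an assumed failure mode, for illustration only)."""

    def __init__(self, nblocks, addr_bits):
        self.nblocks = nblocks
        self.mask = (1 << addr_bits) - 1
        self.store = {}

    def read(self, blk):
        # Returns None for a block that has never been written.
        return self.store.get(blk & self.mask)

    def write(self, blk, data):
        self.store[blk & self.mask] = data


def destructive_probe(disk, nonce):
    """Return the first block number that aliases a lower one, or None.

    Each block is read before it is written. Finding our own nonce with a
    lower block number means the address was truncated somewhere.
    """
    for blk in range(disk.nblocks):
        seen = disk.read(blk)
        if seen is not None and seen[1] == nonce and seen[0] < blk:
            return blk
        disk.write(blk, (blk, nonce))
    return None
```

The nonce plays the role of the "random other value": it distinguishes blocks written by this run from stale data that happens to look like a block number.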

From: matthew green <mrg@eterna.com.au>
To: gnats-bugs@netbsd.org
Cc: port-sparc64-maintainer@netbsd.org, gnats-admin@netbsd.org,
    netbsd-bugs@netbsd.org, tnn@nygren.pp.se
Subject: re: port-sparc64/56363: boot: Inode not directory
Date: Fri, 20 Aug 2021 04:20:30 +1000

 >  We could add such a "destructive probe mode" to ofwboot and collect results
 >  and firmware versions, but I doubt it is worth the effort.

 this message comes from bootblk, so i'm going to assume that
 there is a failed read. hopefully ofwboot will get an actual
 error rather than a false success, and we can detect this read
 failure without trying to write to the disk.

 ie, we could do it non-destructively i think.


 .mrg.

From: Martin Husemann <martin@duskware.de>
To: matthew green <mrg@eterna.com.au>
Cc: gnats-bugs@netbsd.org
Subject: Re: port-sparc64/56363: boot: Inode not directory
Date: Thu, 19 Aug 2021 20:25:21 +0200

 On Fri, Aug 20, 2021 at 04:20:30AM +1000, matthew green wrote:
 > >  We could add such a "destructive probe mode" to ofwboot and collect results
 > >  and firmware versions, but I doubt it is worth the effort.
 > 
 > this message comes from bootblk, so i'm going to assume that
 > there is a failed read. hopefully ofwboot will get an actual
 > error rather than a false success, and we can detect this read
 > failure without trying to write to the disk.

 I think it loads the wrong block w/o error.

 Martin

From: Tobias Nygren <tnn@nygren.pp.se>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-sparc64/56363: boot: Inode not directory
Date: Thu, 19 Aug 2021 21:00:48 +0200

 On Thu, 19 Aug 2021 18:30:02 +0000 (UTC)
 Martin Husemann <martin@duskware.de> wrote:

 >  I think it loads the wrong block w/o error.

 I will dust off my Forth and confirm exactly what goes wrong and how.
 If I create an SD card with every block containing its block number
 and load that in the SCSI2SD, I should be able to tell what the
 boundary is and whether it is indeed a different block that gets loaded.
 If, say, block 8388608 has the contents of block 0, or it's a block
 full of NULs, that is a condition we might detect non-destructively.
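
A sketch of the self-describing test image proposed above, in Python for illustration. The 512-byte block size, the 8-byte big-endian stamp, the 0xff padding and all function names are assumptions made for this example. With 512-byte blocks, block 8388608 starts at byte offset 8388608 * 512 = 2^32, which is the 4 GB figure suggested in the report.

```python
import struct

BLOCK_SIZE = 512  # assumed sector size


def stamp(blk):
    """One block whose first 8 bytes hold its own number (big-endian).

    The rest is padded with 0xff so that block 0 is not all NULs.
    """
    return struct.pack(">Q", blk).ljust(BLOCK_SIZE, b"\xff")


def write_image(path, nblocks):
    """Write an image file in which every block carries its block number."""
    with open(path, "wb") as f:
        for blk in range(nblocks):
            f.write(stamp(blk))


def classify(requested, data):
    """Describe what a read of block `requested` actually returned."""
    if len(data) != BLOCK_SIZE:
        return "short read"
    if data == bytes(BLOCK_SIZE):
        return "zeros"
    got = struct.unpack(">Q", data[:8])[0]
    if data != stamp(got):
        return "garbage"
    return "ok" if got == requested else f"aliased to block {got}"
```

For instance, `classify(8388608, stamp(0))` returns `"aliased to block 0"`, the wrap-around case mentioned above, while an all-NUL block is reported as `"zeros"`.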

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-sparc64/56363: boot: Inode not directory
Date: Fri, 20 Aug 2021 04:17:27 +0000

 On Thu, Aug 19, 2021 at 02:00:03PM +0000, Martin Husemann wrote:
  >  On Thu, Aug 19, 2021 at 01:53:26PM +0100, David Brownlee wrote:
  >  > Is there any way to ensure that the blocks loaded from /ofwboot are at
  >  > the end of a partition for testing purposes? Maybe have some code
  >  > create repeated copies of ofwboot until it finds one with a block over
  >  > a required limit, then remove the others and use that one?
  >  
  >  I am not sure I understand your idea - I don't think there is any
  >  non-destructive way to probe for this kind of limit.

 I think the idea was to rerun installboot repeatedly until you get an
 installation that has blocks outside some suspect test range, see if
 it works, and then repeat, bisecting to find the limit.

 You don't really want to fill the disk with copies of the bootblock
 for that, though, since if it silently reads the wrong block you stand
 a decent chance of getting the right material anyway. Better to fill
 the disk with blocks of illegal instructions.

 There's still no good way to control where things end up, but
 I guess you could introduce a hack to ffs to always pick cylinder
 groups in reverse order from the end. Or since you're going to be
 booting from alternate media anyway, just use makefs to prepare a
 suitable image and then dd it on.

 -- 
 David A. Holland
 dholland@netbsd.org
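
The bisection step mentioned above can be sketched as follows, in Python for illustration. `find_limit` and `boots_at` are hypothetical names. `boots_at(offset)` stands for one manual trial: prepare an image whose boot-critical data sits at `offset`, write it to the disk, and note whether the machine boots. The sketch assumes the outcome flips exactly once between `lo` and `hi`, which nobody in this thread has confirmed.

```python
def find_limit(lo, hi, boots_at):
    """Largest offset in [lo, hi) for which boots_at(offset) is true.

    Assumes boots_at(lo) is true, boots_at(hi) is false, and that the
    outcome changes exactly once in between.
    """
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if boots_at(mid):
            lo = mid  # still works here, so the limit is at or above mid
        else:
            hi = mid  # fails here, so the limit is below mid
    return lo
```

Each call to `boots_at` is a full install-and-reboot cycle, so the logarithmic number of trials matters: a 73 GB disk at byte granularity needs about 37. As noted above, a trial is only trustworthy if a silently wrong read cannot return valid data by accident.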

From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-sparc64/56363: boot: Inode not directory
Date: Fri, 20 Aug 2021 08:08:25 +0200

 The (Forth code) boot block (which itself gets loaded fine) has trouble
 reading the / directory. I don't see how the suggested method could
 help or how installboot would be related.

 Martin

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-sparc64/56363: boot: Inode not directory
Date: Sat, 21 Aug 2021 04:35:33 +0000

 On Fri, Aug 20, 2021 at 06:10:02AM +0000, Martin Husemann wrote:
  >  The (Forth code) boot block (which itself gets loaded fine) has trouble
  >  reading the / directory. I don't see how the suggested method could
  >  help or how installboot would be related.

 I must have misunderstood then...

 -- 
 David A. Holland
 dholland@netbsd.org

Copyright © 1994-2020 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.