NetBSD Problem Report #54900

From martin@aprisoft.de  Mon Jan 27 10:02:59 2020
Return-Path: <martin@aprisoft.de>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 195D47A0D9
	for <gnats-bugs@gnats.NetBSD.org>; Mon, 27 Jan 2020 10:02:59 +0000 (UTC)
Message-Id: <20200127100248.5DCFE5CC8D0@emmas.aprisoft.de>
Date: Mon, 27 Jan 2020 11:02:48 +0100 (CET)
From: martin@NetBSD.org
Reply-To: martin@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: gpt(8) can not show MBR partitions on non 512 byte/sector disks
X-Send-Pr-Version: 3.95

>Number:         54900
>Category:       bin
>Synopsis:       gpt(8) can not show MBR partitions on non 512 byte/sector disks
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    bin-bug-people
>State:          analyzed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Jan 27 10:05:00 +0000 2020
>Closed-Date:    
>Last-Modified:  Tue Jan 28 17:35:01 +0000 2020
>Originator:     Martin Husemann
>Release:        NetBSD 9.99.42
>Organization:
The NetBSD Foundation, Inc.
>Environment:
System: NetBSD seven-days-to-the-wolves.aprisoft.de 9.99.42 NetBSD 9.99.42 (GENERIC) #345: Mon Jan 27 10:48:14 CET 2020 martin@seven-days-to-the-wolves.aprisoft.de:/work/src/sys/arch/amd64/compile/GENERIC amd64
Architecture: x86_64
Machine: amd64
>Description:

The gpt(8) utility can show MBR partitions on "normal" 512 byte/sector
disks. This does not work for disks with bigger sectors:

# fdisk sd3
Disk: /dev/rsd3
NetBSD disklabel disk geometry:
cylinders: 512, heads: 64, sectors/track: 32 (2048 sectors/cylinder)
total sectors: 1048576, bytes/sector: 2048

BIOS disk geometry:
cylinders: 66, heads: 255, sectors/track: 63 (16065 sectors/cylinder)
total sectors: 1048576

Partitions aligned to 16065 sector boundaries, offset 63

Partition table:
0: Primary DOS with 32 bit FAT - LBA (sysid 12)
    start 64, size 262144 (512 MB, Cyls 0/1/2-16/82/2)
1: <UNUSED>
2: <UNUSED>
3: <UNUSED>
No active partition.
Drive serial number: 2573559659 (0x9965676b)
# gpt show  sd3
gpt: /dev/rsd3: map entry doesn't fit media: new start + new size < start + size
(a + 36 < a + 1ff6)


>How-To-Repeat:
Format a 2k block disk (easily testable with vnd(4)) with fdisk and try to
show the partitioning with gpt.

>Fix:
n/a

>Release-Note:

>Audit-Trail:

State-Changed-From-To: open->analyzed
State-Changed-By: kre@NetBSD.org
State-Changed-When: Tue, 28 Jan 2020 16:08:17 +0000
State-Changed-Why:

It turns out that the sector size is a red herring here, while I have
not set up a test case to demonstrate the same symptoms with a (normal)
512 byte sector size, I have no doubt that it could be done.

The primary issue here is that gpt(8) goes ahead and looks for the
GPT tables even when it has found an MBR that is not a PMBR.  To me
that makes no sense at all.

The disc which provoked this report (image supplied by martin@) had
once been GPT formatted.   The old GPT tables (with the magic "EFI PART"
string at the beginning of the 2nd sector, and a correct checksum for
the GPT header, and partition table) still exist in the image, they had
not been clobbered by changing to an MBR (which of course only alters
sector 0) nor by the partition, or its filesystem, created subsequently
which was placed after where the old GPT partition table ended.  The old
GPT partitions had had their space reused however.
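
To illustrate why such leftovers are still detectable, here is a minimal
sketch in C.  It is not code from gpt(8); the helper name and its
parameters are made up for this example.  It only tests for the 8-byte
signature at the start of the second sector and, unlike a real GPT
parser, does not verify the header or table checksums.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The 8-byte signature that opens a GPT header. */
static const char GPT_SIG[8] = { 'E', 'F', 'I', ' ', 'P', 'A', 'R', 'T' };

/*
 * Hypothetical helper (not part of gpt(8)): return 1 if the disk image
 * carries a GPT header signature at LBA 1, 0 otherwise.  secsz is the
 * sector size in bytes (512, 2048, ...).
 */
static int
has_gpt_signature(const uint8_t *img, size_t imglen, size_t secsz)
{
	assert(img != NULL);
	/* Need a sane sector size and at least sectors 0 and 1. */
	if (secsz < sizeof(GPT_SIG) || imglen / secsz < 2)
		return 0;
	return memcmp(img + secsz, GPT_SIG, sizeof(GPT_SIG)) == 0;
}
```

Note that the sector size matters for where to look: on the 2048
byte/sector image from this report the signature sits at byte offset
2048, not 512.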

The code in gpt/gpt.c does:

        if (map_init(gpt, devsz) == -1)
                goto close;     

        index = 1;      
        if (gpt_mbr(gpt, 0LL, &index, 0U) == -1)
                goto close;
        if ((found = gpt_gpt(gpt, 1LL, 1)) == -1)
                goto close;
        if (gpt_gpt(gpt, devsz - 1LL, found) == -1) 
                goto close;     

This initializes gpt(8)'s drive map, then looks for an MBR (which
succeeds, and adds the MBR header and the partitions found [1 of them
in this case] to the drive map).

It then looks for GPT headers.  This doesn't summarily fail as it
normally would on an MBR partitioned drive (nb: "fail" here is
not a -1 return, which indicates a detected error, but a 0 return,
which as the code suggests, means !found) but instead starts to
add the partitions it finds into the map that already contains the MBR
partitions.   In this case the GPT header itself is "fine", and is added,
the GPT table is "fine", and is also added, then the GPT partitions are
scanned and (intended to be) added.   This is where it barfs: the first
of the (ancient) GPT partitions overlaps with the (more recent) MBR
partition, and the resulting overlap leads to the not very meaningful:

	gpt: /dev/rsd3: map entry doesn't fit media:
		new start + new size < start + size (a + 36 < a + 1ff6)

(line wrapped manually for this message).   "new" is the free space
available in the drive map, (start)A + (size)36 = 40 (hex) which is
64 (dec), which is where the (sole allocated) MBR partition starts.
1FF6 (2K each in this case) sectors won't fit.
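
The arithmetic can be checked with a small sketch in C.  This is not the
actual code in gpt(8)'s map handling; the function name and parameters
are hypothetical and only mirror the comparison printed in the error
message.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the check behind the error message: does
 * an entry [start, start + size) end within the free region
 * [free_start, free_start + free_size)?  Returns 1 if it fits.
 */
static int
entry_fits(uint64_t free_start, uint64_t free_size,
    uint64_t start, uint64_t size)
{
	/* Precondition: neither sum may wrap around. */
	assert(free_size <= UINT64_MAX - free_start);
	assert(size <= UINT64_MAX - start);
	return start + size <= free_start + free_size;
}
```

With the values from the message, entry_fits(0xa, 0x36, 0xa, 0x1ff6)
compares 0x2000 against 0x40 and so returns 0: the old GPT partition
would run far past sector 64, where the MBR partition begins.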

At this point we get the "goto close" (the middle one) above, which
deletes everything achieved so far, returns NULL, and main() exits
rather than doing anything (we never get to even starting the "show"
code - it makes no difference which (valid) gpt sub-command was selected).


I am going to leave the fix for this for one of the gpt(8) experts, but
my recommendation would be to make the code above something like:

        if (map_init(gpt, devsz) == -1)
                goto close;     

        index = 1;      
        if (gpt_mbr(gpt, 0LL, &index, 0U) == -1)
                goto close;
	if (map_find(gpt, MAP_TYPE_MBR) == NULL) {
		if ((found = gpt_gpt(gpt, 1LL, 1)) == -1)
			goto close;
		if (gpt_gpt(gpt, devsz - 1LL, found) == -1) 
			goto close;     
	}

so that if an MBR was located by gpt_mbr() we don't also look for
GPT tables.   An alternative would be to look for a MAP_TYPE_PMBR
and only do the gpt_gpt() if there was a PMBR, but I'm not certain
that every GPT user will always include a PMBR, even on systems
where MBRs have never been used for anything (eg: sun3).
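
The two policies differ only for a drive with no MBR of any kind.  A
minimal sketch of the decision in C, detached from the real map code
(the enum and both function names are invented for illustration; in
gpt(8) the answer would come from map_find() as shown above):

```c
#include <assert.h>

/* Hypothetical summary of what the MBR probe found in sector 0. */
enum sector0 { S0_NONE, S0_MBR, S0_PMBR };

/* Recommended here: skip the GPT scan only when a real MBR was found. */
static int
scan_gpt_unless_mbr(enum sector0 s0)
{
	assert(s0 <= S0_PMBR);
	return s0 != S0_MBR;
}

/* Alternative: look for GPT tables only behind a protective MBR. */
static int
scan_gpt_only_if_pmbr(enum sector0 s0)
{
	assert(s0 <= S0_PMBR);
	return s0 == S0_PMBR;
}
```

For S0_NONE the first variant still scans for GPT tables and the second
does not, which is exactly the case of a GPT user without a PMBR.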


I'm not sure how one would mandate this (except possibly by documenting
the need) but I'd also suggest obliterating at least the primary GPT
header (sector 1 normally) whenever converting (by any method at all)
what was once a GPT formatted drive into something else.
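
What "obliterating" amounts to can be sketched in C on an in-memory
image.  This is an illustration only, not an existing tool or gpt(8)
code; the helper name and the wipe_backup flag are assumptions.  The
flag also covers the backup header, which a GPT disk keeps in its last
sector.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: zero the primary GPT header (LBA 1) and, if
 * wipe_backup is nonzero, the backup header in the last sector of an
 * in-memory disk image.  Returns 0 on success, -1 if the image is too
 * small to have distinct first, second and last sectors.
 */
static int
wipe_gpt_headers(uint8_t *img, size_t imglen, size_t secsz, int wipe_backup)
{
	size_t nsec;

	assert(img != NULL);
	if (secsz == 0)
		return -1;
	nsec = imglen / secsz;
	if (nsec < 3)
		return -1;
	memset(img + secsz, 0, secsz);		/* primary header */
	if (wipe_backup)			/* backup header */
		memset(img + (nsec - 1) * secsz, 0, secsz);
	return 0;
}
```

Sector 0 is left alone, so an MBR written there survives.  Leaving
wipe_backup at 0 keeps the backup header available for recovery.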


From: John Nemeth <jnemeth@cue.bc.ca>
To: gnats-bugs@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org,
        kre@NetBSD.org, martin@NetBSD.org
Cc: 
Subject: Re: bin/54900 (gpt(8) can not show MBR partitions on non 512 byte/sector disks)
Date: Tue, 28 Jan 2020 08:37:14 -0800

 On Jan 28,  4:08pm, kre@NetBSD.org wrote:
 }
 } Synopsis: gpt(8) can not show MBR partitions on non 512 byte/sector disks
 } 
 } State-Changed-From-To: open->analyzed
 } State-Changed-By: kre@NetBSD.org
 } State-Changed-When: Tue, 28 Jan 2020 16:08:17 +0000
 } State-Changed-Why:
 } 
 } It turns out that the sector size is a red herring here, while I have
 } not set up a test case to demonstrate the same symptoms with a (normal)
 } 512 byte sector size, I have no doubt that it could be done.

      Interesting.  I did some tests using a blank "disk" (dd
 if=/dev/zero of=disk ...).  I may have located other issues that
 I need to explore.

 } [snip]
 } 
 } so that if an MBR was located by gpt_mbr() we don't also look for
 } GPT tables.   An alternative would be to look for a MAP_TYPE_PMBR
 } and only do the gpt_gpt() if there was a PMBR, but I'm not certain
 } that every GPT user will always include a PMBR, even on systems
 } where MBRs have never been used for anything (eg: sun3).

      In the UEFI world, GPT is gospel.  The PMBR, as the name
 suggests, is just there to protect the GPT from things that don't
 know about UEFI.  gpt(8) may need adjusting to take this into
 account.  That could be interesting as gpt(8) is really just a
 linked list manipulator where the nodes happen to correspond to
 partitions.  It really wants nodes to be sequential and to fit
 within size.

 } I'm not sure how one would mandate this (except possibly by documenting
 } the need) but I'd also suggest obliterating at least the primary GPT
 } header (sector 1 normally) whenever converting (by any method at all)
 } what was once a GPT formatted drive into something else.

      Failure to completely destroy the GPT when switching to a
 legacy MBR is operator error.  Technically both the primary (sector 1)
 and secondary (last sector) should be destroyed.  If the secondary
 isn't destroyed, UEFI compliant tools would be completely valid in
 restoring the primary and ignoring any other partitioning scheme.

 }-- End of excerpt from kre@NetBSD.org

From: Robert Elz <kre@munnari.OZ.AU>
To: John Nemeth <jnemeth@cue.bc.ca>
Cc: gnats-bugs@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org,
        martin@NetBSD.org
Subject: Re: bin/54900 (gpt(8) can not show MBR partitions on non 512 byte/sector disks)
Date: Wed, 29 Jan 2020 00:32:07 +0700

     Date:        Tue, 28 Jan 2020 08:37:14 -0800
     From:        John Nemeth <jnemeth@cue.bc.ca>
     Message-ID:  <202001281637.00SGbEZa017421@server.cornerstoneservice.ca>

   |      In the UEFI world, GPT is gospel.  The PMBR, as the name
   | suggests, is just there to protect the GPT from things that don't
   | know about UEFI.

 Yes, I know.   However, it seems to me that if an MBR exists, and
 no PMBR, then at least on NetBSD (where we know about MBRs) we can
 safely conclude that GPT partitioning is not in use.   If there is
 no MBR (of any kind) but GPT tables exist, I am less confident.

   | gpt(8) may need adjusting to take this into account.

 In what way do you mean?

   | That could be interesting as gpt(8) is really just a
   | linked list manipulator where the nodes happen to correspond to
   | partitions.  It really wants nodes to be sequential and to fit
   | within size.

 Sure, that's reasonable, but IMO mixing up the partition info from
 two unrelated partitioning schemes, and hoping that they will turn out
 to co-exist in that model, by magic, is not.

   |      Failure to completely destroy the GPT when switching to a
   | legacy MBR is operator error.

 Which operator?

 Consider that you're given (or buy) a drive which happens to have been
 used with GPT partitioning.   You stick it in your 15 year old NetBSD 4
 system (which needs a space upgrade, discs were small back then) and
 proceed to use it.   No GPT of course, just good old MBR/disklabel (which
 is OK, as even though the drive is big by 2006 standards, it is well within
 what 32 bit block numbers can handle, maybe 128GB or thereabouts).

 Unfortunately, that system dies (to be expected really, it was quite old),
 but this drive remains fine (it was comparatively new after all), so we
 install it in our replacement system which will be running NetBSD 9.
 We hear about this wonderful new partitioning scheme, and decide to try it:

 	gpt migrate wd4

 Oops.   "map entry doesn't fit media".    What do I do now?   Whose fault
 is it that things ended up like this?   Personally, I blame gpt(8).

   | Technically both the primary (sector 1)
   | and secondary (last sector) should be destroyed.

 Agreed - I didn't include the secondary GPT header, just in case
 someone realises they made a mistake, and wants to recover.   But
 if leaving it is dangerous, then by all means, recommend destroying
 it as well.   Just don't assume that any of these recommendations
 are actually acted upon.

 kre


>Unformatted:
