NetBSD Problem Report #56956

From www@netbsd.org  Sun Aug  7 05:08:48 2022
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 1E4A41A9239
	for <gnats-bugs@gnats.NetBSD.org>; Sun,  7 Aug 2022 05:08:48 +0000 (UTC)
Message-Id: <20220807050846.A4F201A923A@mollari.NetBSD.org>
Date: Sun,  7 Aug 2022 05:08:46 +0000 (UTC)
From: wataash@wataash.com
Reply-To: wataash@wataash.com
To: gnats-bugs@NetBSD.org
Subject: i386/pxeboot: cannot load netbsd kernel larger than 32 MiB with TFTP
X-Send-Pr-Version: www-1.0

>Number:         56956
>Category:       port-i386
>Synopsis:       i386/pxeboot: cannot load netbsd kernel larger than 32 MiB with TFTP
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    port-i386-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Aug 07 05:10:00 +0000 2022
>Last-Modified:  Sun Aug 07 07:30:01 +0000 2022
>Originator:     Wataru Ashihara
>Release:        9.2 (http://ftp.jaist.ac.jp/pub/NetBSD/NetBSD-9.2/i386/installation/misc/pxeboot_ia32.bin)
>Organization:
>Environment:
>Description:
When PXE booting, loading the "netbsd" binary over TFTP fails with the following error:

    >> NetBSD/x86 PXE boot, Revision 5.1 (Wed May 12 13:15:55 UTC 2021) (from NetBSD
     9.2)
    >> Memory: 625/2094828 k
    Press return to boot now, any other key for boot menu
    booting netbsd - starting in 0 seconds.     
    PXE BIOS Version 2.1
    Using PCI device at bus 0 device 3 function 0
    Ethernet address 52:54:00:ff:08:00
    28306576+1562064+535088read section headers: Unknown error: code 60
    boot: Input/output error
    Boot fai

This happens when the x86 machine receives TFTP data block 65536. Instead of
replying with an ACK, the machine shows the error above.

    x86 machine                                    TFTP server
                 <-- Data packet, Block: 65533
                 --> Acknowledgement, Block: 65533
                 <-- Data packet, Block: 65534
                 --> Acknowledgement, Block: 65534
                 <-- Data packet, Block: 65535
                 --> Acknowledgement, Block: 65535
                 <-- Data packet, Block: 65536
                 (no ack, showing error above)

This happens when "netbsd" is larger than 512 B * 65536 = 32 MiB, for example
when the kernel is compiled without optimization (-O0).
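
The size limit can be checked with a small calculation. The following is a
minimal sketch, not code from libsa: the helper names tftp_blocks_needed and
tftp_fits_without_wrap are hypothetical, and it assumes the default 512-byte
block size and a transfer that always ends with a short (possibly empty) final
block, as described in RFC 1350. Under those assumptions the exact threshold
is 65535 * 512 bytes, just under 32 MiB.

```c
#include <stdint.h>

/* Default TFTP data block size in bytes (RFC 1350). */
#define TFTP_BLOCK_SIZE 512u

/*
 * Number of DATA blocks needed to transfer file_size bytes.
 * The transfer ends with a block shorter than block_size, so a file
 * that is an exact multiple of block_size needs one extra empty block.
 * (Hypothetical helper, for illustration only.)
 */
static uint64_t
tftp_blocks_needed(uint64_t file_size, uint32_t block_size)
{
	return file_size / block_size + 1;
}

/*
 * Nonzero if every block number fits in the 16-bit th_block field
 * without wrapping around. (Hypothetical helper.)
 */
static int
tftp_fits_without_wrap(uint64_t file_size, uint32_t block_size)
{
	return tftp_blocks_needed(file_size, block_size) <= UINT16_MAX;
}
```

For example, tftp_fits_without_wrap(32u * 1024 * 1024, TFTP_BLOCK_SIZE) is 0,
because a 32 MiB file needs block number 65537.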

QEMU with gdb showed that the error occurs here:
https://github.com/NetBSD/src/blob/0caacf7bc9bb96e96c948dd08a8dab0542c0220b/sys/lib/libsa/tftp.c#L121

>How-To-Repeat:

>Fix:
Use NFS as a workaround.

>Audit-Trail:
From: Rin Okuyama <rokuyama.rk@gmail.com>
To: gnats-bugs@netbsd.org, port-i386-maintainer@netbsd.org,
 gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, wataash@wataash.com
Cc: 
Subject: Re: port-i386/56956: i386/pxeboot: cannot load netbsd kernel larger
 than 32 MiB with TFTP
Date: Sun, 7 Aug 2022 15:02:40 +0900

 This is because th_block in the TFTP header is 16 bits wide.

 This patch should work around the problem, although I'm not
 100% sure whether this is the correct fix:

 ----
 Index: tftp.c
 ===================================================================
 RCS file: /cvsroot/src/sys/lib/libsa/tftp.c,v
 retrieving revision 1.38
 diff -p -u -r1.38 tftp.c
 --- tftp.c	7 Aug 2022 05:51:55 -0000	1.38
 +++ tftp.c	7 Aug 2022 05:57:15 -0000
 @@ -114,7 +114,7 @@ recvtftp(struct iodesc *d, void *pkt, si
   	t = (struct tftphdr *)pkt;
   	switch (ntohs(t->th_opcode)) {
   	case DATA:
 -		if (ntohs(t->th_block) != d->xid) {
 +		if (ntohs(t->th_block) != (u_short)d->xid) {
   			/*
   			 * Expected block?
   			 */
 ----

 Note that this patch is against tftp.c,v 1.38, which I committed
 just now.
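
The effect of the cast in the patch above can be sketched in isolation. This
is only an illustration, not the libsa code: the function names are
hypothetical, and the expected block counter is assumed to be an unsigned long
that keeps counting past 65535, while the block number on the wire is a 16-bit
value that wraps to 0.

```c
#include <stdint.h>

/*
 * Comparison without the cast: the 16-bit wire value is promoted to
 * the wider type, so it can never equal an expected value above 65535.
 * (Hypothetical helper, for illustration only.)
 */
static int
block_matches_uncast(uint16_t wire_block, unsigned long expected)
{
	return wire_block == expected;
}

/*
 * Comparison with the cast: the expected counter is truncated to
 * 16 bits first, so it wraps the same way the wire value does.
 * (Hypothetical helper, for illustration only.)
 */
static int
block_matches_cast(uint16_t wire_block, unsigned long expected)
{
	return wire_block == (uint16_t)expected;
}
```

When the expected counter reaches 65536 and the wire value has wrapped to 0,
block_matches_uncast(0, 65536UL) is 0 while block_matches_cast(0, 65536UL)
is 1.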

 Also note that, since seeking is impossible with TFTP, NFS boot
 should be much faster than TFTP boot.

 Thanks,
 rin

From: mlelstv@serpens.de (Michael van Elst)
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-i386/56956: i386/pxeboot: cannot load netbsd kernel larger than 32 MiB with TFTP
Date: Sun, 7 Aug 2022 06:17:44 -0000 (UTC)

 wataash@wataash.com writes:

 >    PXE BIOS Version 2.1
 >    Using PCI device at bus 0 device 3 function 0
 >    Ethernet address 52:54:00:ff:08:00
 >    28306576+1562064+535088read section headers: Unknown error: code 60
 >    boot: Input/output error
 >    Boot fai

 >This happens when x86 machine receives TFTP data block 65536. Instead of
 >replying the ack, the machine shows the error above.


 That's a limitation of the TFTP protocol.

 You could try to extend this a little bit with the TFTP blocksize
 option, but that's not necessarily supported by all TFTP servers.

 The standalone library (libsa) also doesn't support IP fragmentation
 (it doesn't even validate the ip_off field), so on Ethernet (with
 1500-byte frames) you could at most double the blocksize and thus the
 maximum file size.
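
The arithmetic behind this limit can be sketched as follows. This is a rough
illustration under stated assumptions, not libsa code: the helper names are
hypothetical, the IPv4 header is assumed to carry no options (20 bytes), and
the block counter is assumed not to wrap, so at most 65535 blocks are sent and
the last one is short. Whether a given server or PXE stack accepts a
particular block size is not covered here.

```c
#include <assert.h>
#include <stdint.h>

#define IPV4_HEADER_LEN      20u /* assumes no IP options */
#define UDP_HEADER_LEN        8u
#define TFTP_DATA_HEADER_LEN  4u /* 2-byte opcode + 2-byte block number */

/*
 * Largest TFTP block size that fits in one unfragmented IP packet
 * for the given MTU. (Hypothetical helper, for illustration only.)
 */
static uint32_t
tftp_max_unfragmented_blksize(uint32_t mtu)
{
	const uint32_t overhead =
	    IPV4_HEADER_LEN + UDP_HEADER_LEN + TFTP_DATA_HEADER_LEN;

	assert(mtu > overhead);
	return mtu - overhead;
}

/*
 * Largest file that can be sent without the 16-bit block number
 * wrapping: 65534 full blocks plus a final block of blksize - 1 bytes.
 * (Hypothetical helper, for illustration only.)
 */
static uint64_t
tftp_max_file_size(uint32_t blksize)
{
	return (uint64_t)UINT16_MAX * blksize - 1;
}
```

With a 1500-byte MTU this gives a block size of 1468 bytes. A block size of
1024 doubles the limit to about 64 MiB; 1468, if accepted, would give a little
under 92 MiB.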

 It might be more versatile to support other protocols to fetch a
 kernel.

 Another workaround might be to compress the kernel with gzip.

From: Wataru Ashihara <wataash@wataash.com>
To: Rin Okuyama <rokuyama.rk@gmail.com>, gnats-bugs@netbsd.org,
 port-i386-maintainer@netbsd.org, gnats-admin@netbsd.org,
 netbsd-bugs@netbsd.org
Cc: 
Subject: Re: port-i386/56956: i386/pxeboot: cannot load netbsd kernel larger
 than 32 MiB with TFTP
Date: Sun, 7 Aug 2022 15:57:27 +0900

 Thanks! I'll try your patch.

From: Wataru Ashihara <wataash@wataash.com>
To: gnats-bugs@netbsd.org, port-i386-maintainer@netbsd.org,
 gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc: 
Subject: Re: port-i386/56956: i386/pxeboot: cannot load netbsd kernel larger
 than 32 MiB with TFTP
Date: Sun, 7 Aug 2022 16:26:58 +0900

 Thank you for telling me a lot!
