NetBSD Problem Report #54560
From gson@gson.org Sun Sep 22 14:38:40 2019
Return-Path: <gson@gson.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 74BC07A158
for <gnats-bugs@gnats.NetBSD.org>; Sun, 22 Sep 2019 14:38:40 +0000 (UTC)
Message-Id: <20190922143834.A7CD1989652@guava.gson.org>
Date: Sun, 22 Sep 2019 17:38:34 +0300 (EEST)
From: gson@gson.org (Andreas Gustafsson)
Reply-To: gson@gson.org (Andreas Gustafsson)
To: gnats-bugs@NetBSD.org
Subject: PXE netboot regression
X-Send-Pr-Version: 3.95
>Number: 54560
>Category: kern
>Synopsis: PXE netboot regression
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: manu
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Sep 22 14:40:00 +0000 2019
>Closed-Date: Fri Sep 27 13:00:03 +0000 2019
>Last-Modified: Fri Oct 04 11:35:01 +0000 2019
>Originator: Andreas Gustafsson
>Release: NetBSD-current, source date >= 2019.09.13.05.13.54
>Organization:
>Environment:
System: NetBSD
Architecture: x86_64
Machine: amd64
>Description:
I have an automated testing setup that netboots an INSTALL kernel and
performs a scripted installation of NetBSD/amd64 on physical hardware
using sysinst.
This recently stopped working. As pxeboot starts to load the
kernel over TFTP, it restarts, leading to repeating output on the
console like this:
>> NetBSD/x86 PXE boot, Revision 5.1 (Fri Sep 13 05:13:54 UTC 2019) (from NetBSD 9.99.11)
>> Memory: 552/3668992 k
Press return to boot now, any other key for boot menu
booting netbsd - starting in 0 seconds.
PXE BIOS Version 2.1
Using PCI device at bus 3 device 0 function 0
Ethernet address 98:4b:e1:67:68:98
|
>> NetBSD/x86 PXE boot, Revision 5.1 (Fri Sep 13 05:13:54 UTC 2019) (from NetBSD 9.99.11)
>> Memory: 552/3668992 k
Press return to boot now, any other key for boot menu
booting netbsd - starting in 0 seconds.
PXE BIOS Version 2.1
Using PCI device at bus 3 device 0 function 0
Ethernet address 98:4b:e1:67:68:98
|
>> NetBSD/x86 PXE boot, Revision 5.1 (Fri Sep 13 05:13:54 UTC 2019) (from NetBSD 9.99.11)
>> Memory: 552/3668992 k
Press return to boot now, any other key for boot menu
booting netbsd - starting in 0 seconds.
PXE BIOS Version 2.1
Using PCI device at bus 3 device 0 function 0
Ethernet address 98:4b:e1:67:68:98
|
After this loop has repeated several more times, the system gives up
on netbooting and boots from its second priority boot device (in this
case, the hard disk).
A full console log is at:
http://www.gson.org/netbsd/bugs/build/amd64-baremetal/2019/2019.09.13.05.13.54/install.log
A packet capture of the network traffic is at:
http://www.gson.org/netbsd/bugs/pxeboot/2019.09.13.05.13.54.pcap
The problem started around the time of manu's changes to add multiboot 2
support on September 13.
>How-To-Repeat:
Attempt to netboot an amd64 system using pxeboot.
>Fix:
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: kern-bug-people->manu
Responsible-Changed-By: gson@NetBSD.org
Responsible-Changed-When: Sun, 22 Sep 2019 14:41:19 +0000
Responsible-Changed-Why:
Over to committer.
From: manu@netbsd.org (Emmanuel Dreyfus)
To: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org, netbsd-bugs@netbsd.org,
gnats-admin@netbsd.org, gson@NetBSD.org,
gson@gson.org (Andreas Gustafsson)
Cc:
Subject: Re: kern/54560 (PXE netboot regression)
Date: Sun, 22 Sep 2019 16:57:43 +0200
Hello
Could you try this patch?
Index: sys/arch/i386/stand/pxeboot/Makefile
===================================================================
RCS file: /cvsroot/src/sys/arch/i386/stand/pxeboot/Makefile,v
retrieving revision 1.26
diff -U4 -r1.26 Makefile
--- sys/arch/i386/stand/pxeboot/Makefile 13 Sep 2019 02:19:46 -0000 1.26
+++ sys/arch/i386/stand/pxeboot/Makefile 22 Sep 2019 14:56:48 -0000
@@ -28,8 +28,9 @@
KERNMISCMAKEFLAGS="LIBKERN_ARCH=i386"
.endif
CPPFLAGS+= -DSLOW # for libz
+CPPFLAGS+= -DNO_MULTIBOOT2 # kern/54560
.if (${BASE} == "pxeboot_ia32")
# Take config values from patchable header
CPPFLAGS+= -DSUPPORT_SERIAL=boot_params.bp_consdev
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
manu@netbsd.org
From: Andreas Gustafsson <gson@NetBSD.org>
To: manu@netbsd.org (Emmanuel Dreyfus)
Cc: gnats-bugs@netbsd.org
Subject: Re: kern/54560 (PXE netboot regression)
Date: Mon, 23 Sep 2019 18:51:54 +0300
Emmanuel Dreyfus wrote:
> Could you try this patch?
With the patch, the system netboots successfully (my first try failed,
but that was probably caused by an unrelated issue).
--
Andreas Gustafsson, gson@NetBSD.org
From: manu@netbsd.org (Emmanuel Dreyfus)
To: gnats-bugs@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org,
gson@gson.org (Andreas Gustafsson)
Cc:
Subject: Re: kern/54560 (PXE netboot regression)
Date: Tue, 24 Sep 2019 01:28:25 +0200
Andreas Gustafsson <gson@NetBSD.org> wrote:
> With the patch, the system netboots successfully (my first try failed,
> but that was probably caused by an unrelated issue).
I am not sure there is any real multiboot + PXE usage. Perhaps we can just
disable multiboot 2 for PXE without looking further into what the
problem actually is.
In case you are curious and want to look into it further, my first hunch
is that the multiboot 2 code increases the pxeboot_ia32.bin file beyond
64 kB. You can easily test whether this is the problem: remove CPPFLAGS+=
-DNO_MULTIBOOT2 and instead add
CPPFLAGS+= -DNO_GPT
CPPFLAGS+= -DNO_RAIDFRAME
You will get a pxeboot_ia32.bin below 64 kB with multiboot 2 enabled. If
that boots, then we have our answer.
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
manu@netbsd.org
From: Andreas Gustafsson <gson@gson.org>
To: manu@netbsd.org (Emmanuel Dreyfus)
Cc: gnats-bugs@netbsd.org
Subject: Re: kern/54560 (PXE netboot regression)
Date: Tue, 24 Sep 2019 12:25:02 +0300
Emmanuel Dreyfus wrote:
> I am not sure there is any real multiboot + PXE usage. Perhaps we can just
> disable multiboot 2 for PXE without looking further into what the
> problem actually is.
>
> In case you are curious and want to look into it further, my first hunch
> is that the multiboot 2 code increases the pxeboot_ia32.bin file beyond
> 64 kB. You can easily test whether this is the problem: remove CPPFLAGS+=
> -DNO_MULTIBOOT2 and instead add
>
> CPPFLAGS+= -DNO_GPT
> CPPFLAGS+= -DNO_RAIDFRAME
>
> You will get a pxeboot_ia32.bin below 64 kB with multiboot 2 enabled. If
> that boots, then we have our answer.
That did not boot, but your hunch may be correct anyway. Here's a
summary of my test results, with the size of pxeboot_ia32.bin in each:
Source date          CPPFLAGS                 Size (bytes)  Result
==================================================================
2019.09.13.01.34.19  none                     61664         pass
2019.09.13.05.13.54  none                     73952         fail
2019.09.22.18.31.59  -DNO_MULTIBOOT2          61664         pass
2019.09.21.15.56.09  -DNO_GPT -DNO_RAIDFRAME  65760         fail
That is, the test failed when and only when pxeboot_ia32.bin was
larger than 65536 bytes.
I guess the next question is, why does multiboot 2 use more space
than GPT and raidframe combined?
--
Andreas Gustafsson, gson@gson.org
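The correlation in the table above can be checked mechanically. The
following Python sketch is not part of the original report: it copies the
four rows as data and tests whether a build fails exactly when
pxeboot_ia32.bin exceeds 65536 bytes. The names RESULTS, LIMIT,
consistent_with_limit and overshoot are made up for illustration.

```python
# Rows copied from the test-results table:
# (source date, CPPFLAGS, size of pxeboot_ia32.bin in bytes, result)
RESULTS = [
    ("2019.09.13.01.34.19", "none", 61664, "pass"),
    ("2019.09.13.05.13.54", "none", 73952, "fail"),
    ("2019.09.22.18.31.59", "-DNO_MULTIBOOT2", 61664, "pass"),
    ("2019.09.21.15.56.09", "-DNO_GPT -DNO_RAIDFRAME", 65760, "fail"),
]

LIMIT = 64 * 1024  # 65536 bytes, the suspected ceiling


def consistent_with_limit(results, limit=LIMIT):
    """Return True if every build fails exactly when its size exceeds limit."""
    return all(
        (size > limit) == (result == "fail")
        for _date, _flags, size, result in results
    )


def overshoot(results, limit=LIMIT):
    """Map each failing source date to the bytes by which it exceeds limit."""
    return {
        date: size - limit
        for date, _flags, size, result in results
        if result == "fail"
    }
```

On these four rows the check holds, and the failing builds exceed the
limit by 8416 and 224 bytes respectively. Four data points are consistent
with a 64 kB ceiling but do not prove it. The same figures put the cost of
the multiboot 2 code at 73952 - 61664 = 12288 bytes, against
73952 - 65760 = 8192 bytes saved by -DNO_GPT -DNO_RAIDFRAME, assuming
nothing else changed size between those source dates.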
From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/54560: PXE netboot regression
Date: Fri, 27 Sep 2019 11:41:09 +0300
As discussed on tech-kern under the subject "x86 bootstrap features",
it may be possible to increase the 64k size limit. With the following
patch, the boot succeeds for me:
Index: Makefile
===================================================================
RCS file: /cvsroot/src/sys/arch/i386/stand/pxeboot/Makefile,v
retrieving revision 1.27
diff -u -r1.27 Makefile
--- Makefile 23 Sep 2019 13:42:37 -0000 1.27
+++ Makefile 27 Sep 2019 05:27:41 -0000
@@ -66,7 +66,7 @@
#CFLAGS= -O2 -fomit-frame-pointer -fno-defer-pop
CFLAGS+= -Wall -Wmissing-prototypes -Wstrict-prototypes -Wno-main
-SAMISCCPPFLAGS+= -DHEAP_START=0x10000 -DHEAP_LIMIT=0x30000
+SAMISCCPPFLAGS+= -DHEAP_START=0x20000 -DHEAP_LIMIT=0x40000
SAMISCMAKEFLAGS+= SA_USE_CREAD=yes # Read compressed kernels
CPPFLAGS+= -DPASS_BIOSGEOM
Index: start_pxe.S
===================================================================
RCS file: /cvsroot/src/sys/arch/i386/stand/pxeboot/start_pxe.S,v
retrieving revision 1.6
diff -u -r1.6 start_pxe.S
--- start_pxe.S 18 Mar 2011 17:46:26 -0000 1.6
+++ start_pxe.S 27 Sep 2019 05:27:41 -0000
@@ -69,7 +69,7 @@
# set up %ss and %sp
movl $_end, %eax /* top of bss */
shrl $4, %eax /* as a segment */
- addw $0x1001, %ax /* and + 64k */
+ addw $0x2001, %ax /* and + 128k */
movw %ax, %ss /* for stack */
movw $0xfffc, %sp /* %sp at top of it */
--
Andreas Gustafsson, gson@gson.org
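The two hunks above change three constants. As a rough illustration (not
taken from the thread), the Python sketch below redoes the arithmetic: a
real-mode segment value maps to a linear address as segment * 16, so
adding 0x1001 to a segment moves it up by 64k + 16 bytes, and 0x2001 by
128k + 16 bytes. HYPOTHETICAL_END is an invented, 16-byte-aligned
stand-in for _end, and treating HEAP_START as the ceiling on image size is
an assumption based on the 64k limit discussed earlier, not something the
patch itself states.

```python
# Constants taken from the patch (old values and new values).
OLD = {"heap_start": 0x10000, "heap_limit": 0x30000, "ss_addend": 0x1001}
NEW = {"heap_start": 0x20000, "heap_limit": 0x40000, "ss_addend": 0x2001}

# Made-up value for _end, chosen only to make the arithmetic concrete.
HYPOTHETICAL_END = 0x12340


def linear(segment, offset=0):
    """Real-mode linear address: segment * 16 + offset."""
    return (segment << 4) + offset


def stack_segment(end_addr, addend):
    """Mirror 'shrl $4, %eax' followed by a 16-bit 'addw $addend, %ax'."""
    return (((end_addr >> 4) & 0xFFFF) + addend) & 0xFFFF


def summarize(cfg, end_addr=HYPOTHETICAL_END):
    """Compute the heap size and stack placement for one set of constants."""
    ss = stack_segment(end_addr, cfg["ss_addend"])
    return {
        "heap_size": cfg["heap_limit"] - cfg["heap_start"],
        "stack_base": linear(ss),
        "stack_top": linear(ss, 0xFFFC),  # %sp is set to 0xfffc
        "gap_above_end": linear(ss) - end_addr,
    }


def fits_below_heap(image_size, cfg):
    """Assumption: the image must not reach HEAP_START."""
    return image_size <= cfg["heap_start"]
```

With these inputs the heap window stays 0x20000 bytes in both
configurations and simply moves up by 64k, while the gap between _end and
the base of the stack segment grows from 0x10010 to 0x20010 bytes. Under
the stated assumption, the failing 73952-byte image does not fit below the
old HEAP_START but does fit below the new one.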
From: "Andreas Gustafsson" <gson@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/54560 CVS commit: src/sys/arch/i386/stand/pxeboot
Date: Fri, 27 Sep 2019 08:57:10 +0000
Module Name: src
Committed By: gson
Date: Fri Sep 27 08:57:10 UTC 2019
Modified Files:
src/sys/arch/i386/stand/pxeboot: Makefile start_pxe.S
Log Message:
Increase pxeboot code size limit from 64k to 128k. Fixes PR kern/54560.
The start_pxe.S part was suggested by mlelstv.
To generate a diff of this commit:
cvs rdiff -u -r1.27 -r1.28 src/sys/arch/i386/stand/pxeboot/Makefile
cvs rdiff -u -r1.6 -r1.7 src/sys/arch/i386/stand/pxeboot/start_pxe.S
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: open->closed
State-Changed-By: gson@NetBSD.org
State-Changed-When: Fri, 27 Sep 2019 13:00:03 +0000
State-Changed-Why:
Fix committed.
From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/54560 CVS commit: [netbsd-9] src/sys/arch/i386/stand/pxeboot
Date: Thu, 3 Oct 2019 17:26:00 +0000
Module Name: src
Committed By: martin
Date: Thu Oct 3 17:26:00 UTC 2019
Modified Files:
src/sys/arch/i386/stand/pxeboot [netbsd-9]: Makefile start_pxe.S
Log Message:
Pull up following revision(s) (requested by manu in ticket #277):
sys/arch/i386/stand/pxeboot/start_pxe.S: revision 1.7
sys/arch/i386/stand/pxeboot/Makefile: revision 1.28
Increase pxeboot code size limit from 64k to 128k. Fixes PR kern/54560.
The start_pxe.S part was suggested by mlelstv.
To generate a diff of this commit:
cvs rdiff -u -r1.25.6.1 -r1.25.6.2 src/sys/arch/i386/stand/pxeboot/Makefile
cvs rdiff -u -r1.6 -r1.6.60.1 src/sys/arch/i386/stand/pxeboot/start_pxe.S
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/54560 CVS commit: [netbsd-8] src/sys/arch/i386/stand/pxeboot
Date: Fri, 4 Oct 2019 11:34:18 +0000
Module Name: src
Committed By: martin
Date: Fri Oct 4 11:34:18 UTC 2019
Modified Files:
src/sys/arch/i386/stand/pxeboot [netbsd-8]: Makefile start_pxe.S
Log Message:
Pull up following revision(s) (requested by manu in ticket #1400):
sys/arch/i386/stand/pxeboot/start_pxe.S: revision 1.7
sys/arch/i386/stand/pxeboot/Makefile: revision 1.28
Increase pxeboot code size limit from 64k to 128k. Fixes PR kern/54560.
The start_pxe.S part was suggested by mlelstv.
To generate a diff of this commit:
cvs rdiff -u -r1.24.10.1 -r1.24.10.2 src/sys/arch/i386/stand/pxeboot/Makefile
cvs rdiff -u -r1.6 -r1.6.48.1 src/sys/arch/i386/stand/pxeboot/start_pxe.S
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
>Unformatted: