NetBSD Problem Report #53220
From liman@cafax.se Sat Apr 28 06:02:43 2018
Return-Path: <liman@cafax.se>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 321027A1D0
for <gnats-bugs@gnats.NetBSD.org>; Sat, 28 Apr 2018 06:02:43 +0000 (UTC)
Message-Id: <20180428060238.E52309313A@netbsd8.cafax.se>
Date: Sat, 28 Apr 2018 08:02:38 +0200 (MEST)
From: Lars-Johan Liman <liman@cafax.se>
Reply-To: liman@cafax.se
To: gnats-bugs@NetBSD.org
Subject: sysinst dumps core with extended partitioning.
X-Send-Pr-Version: 3.95
>Number: 53220
>Category: install
>Synopsis: sysinst on Xen domU dumps core on a pre-partitioned gpt image.
>Confidential: no
>Severity: non-critical
>Priority: low
>Responsible: martin
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sat Apr 28 06:05:00 +0000 2018
>Last-Modified: Mon May 07 04:10:00 +0000 2018
>Originator: Lars-Johan Liman
>Release: NetBSD 8.0_RC1
>Organization:
Cafax AB
>Environment:
System: NetBSD netbsd8.cafax.se 8.0_RC1 NetBSD 8.0_RC1 (XEN3_DOMU.201804191727Z$
Architecture: x86_64
Machine: amd64
>Description:
If you want to install an amd64 Xen domU on a disk image that is
pre-partitioned with gpt, and fire up a virtual machine with the
INSTALL kernel, and select "extended partitioning", sysinst
immediately dumps core. This is probably not the intended
behaviour ... ;-)
>How-To-Repeat:
1. Create blank disk image.
dd if=/dev/null of=myimage.img count=...
2. Create gpt partitions on the image.
vnconfig vndX myimage.img
gpt destroy /dev/rvndX
gpt add -s NNNN -t ffs /dev/rvndX
gpt add -s NNNN -t swap /dev/rvndX
gpt add -s NNNN -t ffs /dev/rvndX
gpt add -s NNNN -t ffs /dev/rvndX
gpt label -i 1 -l ROOT /dev/r$drive
gpt label -i 2 -l SWAP /dev/r$drive
gpt label -i 3 -l USR /dev/r$drive
gpt label -i 4 -l VAR /dev/r$drive
vnconfig -u vndX
3. Boot a Xen virtual machine with the 8.0_RC1
netbsd-INSTALL_XEN3_DOMU kernel.
4. Navigate through the sysinst menus
a: Install messages in English
a: Install NetBSD to hard disk
b: Yes (we shall continue!)
b: Extended partitioning
[1] Segmentation fault (core dumped) sysinst
>Fix:
No known fix, but a workaround is to _not_ use sysinst, but to
shutdown the domU, and manually install NetBSD from the dom0 by
mounting the diskimage and filling it by hand.
1. vnconfig vndX myimage.img
2. newfs /dev/rvndXa (etc.)
3. mount /dev/vndXa (etc.)
4. pax -zrvpe -f amd64/binary/sets/base.tgz (etc.)
5. Fix fstab, rc.conf, localtime, passwd (etc.)
6. vnconfig -u vndX
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: install-manager->martin
Responsible-Changed-By: martin@NetBSD.org
Responsible-Changed-When: Sat, 28 Apr 2018 06:16:30 +0000
Responsible-Changed-Why:
Take
From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: install/53220: sysinst dumps core with extended partitioning.
Date: Sat, 28 Apr 2018 14:17:41 +0700
I have seen that as well, though on bare metal rather
that a DomU (though to sysinst there should be little,
if any, difference.)
I used a similar workaround - just install manually.
You could have done it from inside the DomU
though, just tell sysinst not to install, but just give
you a shell - after that, the same procedure
except there's no need to vnconfig anything
(which is fortunate as on bare metal there's
no way to do it outside the running system.)
I have been meaning to look into it - sometime - but it is
not a high priority for me (system is installed now...)
kre
From: Lars-Johan Liman <liman@cafax.se>
To: gnats-bugs@NetBSD.org
Cc: martin@NetBSD.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: install/53220: sysinst dumps core with extended partitioning.
Date: Sat, 28 Apr 2018 21:02:51 +0200
Hi Robert!
(I don't know if you remember me, but we've run into each other at IETFs
in the past, in DNSOP and DNSEXT. I'm one of the 2 m tall DNS Swedes ... ;-)
kre@munnari.OZ.AU:
> The following reply was made to PR install/53220; it has been noted by GNATS.
> From: Robert Elz <kre@munnari.OZ.AU>
> To: gnats-bugs@NetBSD.org
> Cc:
> Subject: Re: install/53220: sysinst dumps core with extended partitioning.
> Date: Sat, 28 Apr 2018 14:17:41 +0700
> I have seen that as well, though on bare metal rather
> that a DomU (though to sysinst there should be little,
> if any, difference.)
Aye. I suspected as much, but I had no bare metal to test it on.
> I used a similar workaround - just install manually.
> You could have done it from inside the DomU
> though, just tell sysinst not to install, but just give
> you a shell - after that, the same procedure
> except there's no need to vnconfig anything
> (which is fortunate as on bare metal there's
> no way to do it outside the running system.)
(Sure, it's just that outside the VM i have access to all my
"convenience tools" whereas inside it, the land is "barren and
cold" ... ;-)
> I have been meaning to look into it - sometime - but it is
> not a high priority for me (system is installed now...)
Understood. But maybe the famous $someone should take a look at it
before 8.0-REL goes out the door.
Cheers,
/Liman
(a.k.a. hostmaster ,at, i-root.servers.net,
or liman ,at, netnod.se (my daywork))
#-------------------------------------------------------------------------
# Lars-Johan Liman, M.Sc. ! E-mail: liman ,at, cafax.se
# Cafax AB ! HTTP : //www.cafax.se/
# Computer Consultants, Sweden ! Voice : +46 8 - 564 702 30
#-------------------------------------------------------------------------
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: install/53220: sysinst dumps core with extended partitioning.
Date: Sun, 29 Apr 2018 11:19:05 +0200
Sorry, but I can not reproduce it on bare metal with the 8.0 RC sysinst.
For me it just works fine:
This is the extended partition manager. All disks, partitions, etc. are
listed below. MBR partitions, if desired, must be created before editing
the disklabel. To use RAID, LVM, or CGD: 1) Create BSD partitions with the
appropriate type; 2) Create RAID/LVM VG/CGD using these partitions; 3) Save
changes; 4) Create partitions for RAID/CGD or Logical Volumes for LVM.
>a: vnd0 GPT-labeled disk UNCHANGED
b: dk0: 4.2BSD 1024M
c: dk1: swap 1024M
d: dk2: 4.2BSD 1024M
e: dk3: 4.2BSD 1024M
f: Create cryptographic volume (CGD)
g: Create virtual disk image (VND)
h: Create volume group (LVM VG)
i: Create software RAID
j: Update device list
k: Save changes
x: Finish partitioning
Robert is correct, this is supposed to be done on the installed
domU, as sysinst will insist on creating the partitions you install into
yourself in the extended partitioning menu (this is a different bug which
is going to be fixed, along with other gpt related issues, but the
fix unlikely will be ready for 8.0, so it will likely be in 8.1 only).
Martin
From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: install/53220: sysinst dumps core with extended partitioning.
Date: Sun, 29 Apr 2018 23:19:21 +0700
Date: Sun, 29 Apr 2018 09:20:01 +0000 (UTC)
From: Martin Husemann <martin@duskware.de>
Message-ID: <20180429092001.275F57A225@mollari.NetBSD.org>
| Sorry, but I can not reproduce it on bare metal with the 8.0 RC sysinst.
I will see if I can set up the conditions that cause the problem.
kre
From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: install/53220: sysinst dumps core with extended partitioning.
Date: Mon, 30 Apr 2018 02:25:27 +0700
Date: Sun, 29 Apr 2018 16:25:01 +0000 (UTC)
From: Robert Elz <kre@munnari.OZ.AU>
Message-ID: <20180429162501.76E3D7A261@mollari.NetBSD.org>
| I will see if I can set up the conditions that cause the problem.
OK, done. core dump from sysinst easy to make...
total 3136
-r--r--r-- 1 root wheel 2646 Mar 30 14:24 .profile
drwxr-xr-x 2 root wheel 512 Mar 30 14:24 bin
drwxr-xr-x 9 root wheel 115968 Apr 30 02:02 dev
drwxr-xr-x 2 root wheel 0 Apr 30 02:02 etc
drwxr-xr-x 2 root wheel 512 Mar 30 14:24 kern
drwxr-xr-x 3 root wheel 512 Mar 30 14:24 libexec
drwxr-xr-x 2 root wheel 512 Mar 30 14:24 mnt
drwxr-xr-x 2 root wheel 512 Mar 30 14:24 mnt2
drwxr-xr-x 2 root wheel 1024 Mar 30 14:24 sbin
-r-xr-xr-x 75 root wheel 2485376 Mar 30 14:24 sysinst
-rw------- 1 root wheel 435904 Apr 30 02:03 sysinst.core
-r--r--r-- 1 root wheel 36898 Mar 30 14:24 sysinstmsgs.de
[...]
I used an 8.0_RC1 build I made on April 21, but I think anything recentish
would do (the previous time it happened for me was a NetBSD-current (8.99.?)
from (very) late January.
This time I booted a XEN DomU (as that's the easied to play in), the
previous time was bare metal booting from an install CD (an install
CD in a format I make it, which might not be quite the same as the
normal one, but which has the same sysinst, ... there was no
problem booting it - at least when the UEFI firmware was switched
to legacy mode - it was made before the code to allow UEFI booting
from cd was added - I have not yet tried that)
The procedure I used just now was:
First set up the disk (in my bare metal test this was done, despite my
request that it not be) by the people who sold me the system...
Boot netbsd-INSTALL kernel from
${RELEASE}/amd64/binary/kernel/netbsd-INSTALL_XEN3_DOMU.gz
exit sysinst, and in the root shell that appears:
gpt create -A xbd0
gpt add -b 2048 -s 204800 -l EFI -t efi xbd0
gpt add -s 1576960 -t windows -l w^HWindows xbd0
gpt add -b 1783808 -l windows_data -t windows xbd0
shutdown
That leaves the disk looking like:
start size index contents
0 1 PMBR (active)
1 1 Pri GPT header
2 32 Pri GPT table
34 2014 Unused
2048 204800 1 GPT part - EFI System
206848 1576960 2 GPT part - Windows basic data
1783808 3144467 3 GPT part - Windows basic data
4928275 32 Sec GPT table
4928307 1 Sec GPT header
Any method that leaves the drive (real or virtual) looking like
that should be OK. Note there's no free space (well, 2014 blocks...)
There's also (in my case) no NetBSD partitions.
My bare metal install looked just the same - the drive (well drives)
were much bigger, but full of windows crap filesystems (mostly
just empty).
Then attempt a NetBSD install:
Boot netbsd-INSTALL kernel
In Sysinst
a - messages in English
a - Install NetBSD to hard disk
b - yes
b - extended partitioning
(BOOM)
I then repeated that, using the same (untouched) drive, using a
NetBSD-current (8.99.14) from late March (the latest one I had
lying around without doing a rebuild.) Same steps in sysinst,
same result.
If you're still unable to reproduce this, let me know, and I'll build
a version of this that includes a sysinst that's built wwith debug
symbols, and see if I can see what is up in the core file.
And from Martin's message:
| Robert is correct, this is supposed to be done on the installed
| domU, as sysinst will insist
That's not quite what I said, I said it can be done in the installed
DomU (as I did just above) not that it should be. What sysinst
insists on is irrelevant, as once it is unable to proceed, it simply
isn't used any more - I almost never use it, when I am doing
DomU builds (my must common type) I simply have DESTDIR be
the filesystem that is to be the DomU filesystem, and build
directly into it. Then I populate /dev fix up what needs to be
fixed in /etc, and boot it.
If I was going to go ahead and install the system just above, I'd
do it the same way - either in the DomU after sysinst has crashed,
I'd gpt remove gpt remove ... then gpt add gpt add ... to set up
the filesystems, mount them, and then fetch and untar the install
sets (which is exactly what I did on the bare metal system,
except for the "fetch" part as they were on the CD already waiting
to be unpacked).
Alternatively, when using XEN, the whole install can be done my
mounting the (to be) DomU filesystem(s) in the Dom0 and then
doing the set unpacks there (where the tooks are better as a
full running NetBSD is available).
Either way works - which is better just depends what you prefer
(the Xen Dom0 way means all the doc is also easily available.)
kre
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: install/53220: sysinst dumps core with extended partitioning.
Date: Sun, 29 Apr 2018 21:36:36 +0200
On Sun, Apr 29, 2018 at 07:30:01PM +0000, Robert Elz wrote:
> If you're still unable to reproduce this, let me know, and I'll build
> a version of this that includes a sysinst that's built wwith debug
> symbols, and see if I can see what is up in the core file.
Indeed I am still unable to reproduce it. Must be something very strange
going on.
Could you just make the sysinst.core file available?
Btw: the official builds all include debug.tgz which has all the debug
symbols needed.
Martin
From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: install/53220: sysinst dumps core with extended partitioning.
Date: Mon, 30 Apr 2018 03:51:32 +0700
Date: Sun, 29 Apr 2018 19:40:01 +0000 (UTC)
From: Martin Husemann <martin@duskware.de>
Message-ID: <20180429194001.3C0807A261@mollari.NetBSD.org>
| Indeed I am still unable to reproduce it. Must be something very strange
| going on.
Yes, very weird. It is so simple to make happen...
| Could you just make the sysinst.core file available?
Sure, both of them:
ftp://munnari.oz.au/martin/8.0_rc1-sysinst.core
ftp://munnari.oz.au/martin/8.99.14-sysinst.core
| Btw: the official builds all include debug.tgz which has all the debug
| symbols needed.
Hmm - mine don't. What's different? I simply do
build.sh [-x] [-u] [-P] release
with MK_CONF=/dev/null
(and settings for the obj, dest, and release dirs that suit
my peculiar needs). It is supposed to be as similar to
the official builds as possible (at least when I include -x,
I often omit that as I don't generally need it when I am
just testing (though my source sets have extsrc.tgz
instead of gnusrc.tgz - II build the source sets with my
own script - variant of the normal one - as well.)
My cd (dvd really) building differs as I add all kinds of
other stuff (source sets, incl pkgsrc, soetimes some
distfiles, ...) in the iso image - if I'm going to burn a dvd I
mught as well fill it up with stuff that might be useful!
I only mentioned that, as my build script for that is
very old, and still uses the emulated floppy method
of booting (or however it used to be done) so that
the kernel boots info a MFS miniroot, rather than using
the cd image as the root filesys, which I think is the
more modern way. I should update it someday...
kre
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: install/53220: sysinst dumps core with extended partitioning.
Date: Mon, 30 Apr 2018 10:56:57 +0200
On Sun, Apr 29, 2018 at 08:55:01PM +0000, Robert Elz wrote:
> Sure, both of them:
>
> ftp://munnari.oz.au/martin/8.0_rc1-sysinst.core
> ftp://munnari.oz.au/martin/8.99.14-sysinst.core
Hmm, both cores don't work for me - they have the $pc in some unmapped
code:
Core was generated by `sysinst'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00000000004b11cc in ?? ()
(gdb) bt
#0 0x00000000004b11cc in ?? ()
#1 0x0000000000000000 in ?? ()
(gdb) x/16i $pc
=> 0x4b11cc: Cannot access memory at address 0x4b11cc
> | Btw: the official builds all include debug.tgz which has all the debug
> | symbols needed.
>
> Hmm - mine don't. What's different? I simply do
>
> build.sh [-x] [-u] [-P] release
Add: -V MKDEBUG=yes
Martin
From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: install/53220: sysinst dumps core with extended partitioning.
Date: Mon, 30 Apr 2018 23:24:34 +0700
Date: Mon, 30 Apr 2018 09:00:01 +0000 (UTC)
From: Martin Husemann <martin@duskware.de>
Message-ID: <20180430090001.5CE367A25E@mollari.NetBSD.org>
| Hmm, both cores don't work for me - they have the $pc in some unmapped
| code:
|
| #1 0x0000000000000000 in ?? ()
| (gdb) x/16i $pc
| => 0x4b11cc: Cannot access memory at address 0x4b11cc
That would explain the SEGV which generates the core file.
sysinst does some indirect function calls:
error = (*item->func)(found, numfound);
for example, which could result in a jump into oblivion.
the pm_edit function in particular takes 3 function pointers
as args, and as that's about where we're going when it all
falls apart, that's where I'd be suspicious.
| Add: -V MKDEBUG=yes
OK, I will do that. I'd have thought that the "official builds" should
avoid using non-default options - would it perhaps be the right thing
to do to make that be on by default, if that is what the releases are
to use?
I'll do an
From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: install/53220: sysinst dumps core with extended partitioning.
Date: Tue, 01 May 2018 11:26:05 +0700
Some progress on this ...
The sysinst code contains this (disks.c)...
static bool
is_gpt(const char *dev)
{
check_available_binaries();
if (!have_gpt)
return false;
return !run_program(RUN_SILENT | RUN_ERROR_OK,
"sh -c 'gpt show %s |grep -e Pri\\ GPT\\ table'", dev);
}
which looks kind of OK, except there is no "grep" in the MFS miniroot
so that is always going to fail (return false). "sed" is installed, and can
generally used to do what grep can do, with the notable exception of
exiting with a status indicating whether a string (or RE) exists in the file.
I (in a local copy only, as I don't really think this is the right way) changed
this to
return !run_program(RUN_SILENT | RUN_ERROR_OK,
"sh -c 'test -n \"$(gpt show %s | while read line; do "
"case \"${line}\" in (*\"Pri GPT table\"*) echo FOUND;exit 0;;"
"esac; done)\"'", dev);
which is somewhat convoluted, but does work to determine if there
is a GPT label on the drive (variations that attempt to be simpler
will probably fail.) Provided gpt (which is tested for) and sh
(kind of mandatory) exist (and given "test" is built into sh) this
should be OK.
A much better solution would be to have a new sub-command for
gpt like
gpt [-q] query dev
which would (if the already existing -q option is not given)
print something like "GPT Primary Label @%d on %s\n"
or "No GPT on %s" as appropriate, and then (regardless of -q)
exit(0) if there is a GPT label, and exit(1) otherwise, then
sysinst could just do:
return !run_program(RUN_SILENT | RUN_ERROR_OK,
"sh -c 'gpt -q query %s'", dev);
(the name "query" for the command could be anything of course).
(It could also just forget the "sh -c" and use (for the 2nd line):
"gpt -q qery %s", dev);
But this one isn't the real problem, though it probably doesn't help.
This is also in sysinst - in partman.c ...
#define remove_vnd_options() (void)0
if (!have_vnd)
remove_vnd_options();
else if (!(vnds = calloc(MAX_VND, sizeof(*vnds))))
have_vnd = 0;
Combine that with:
have_vnd = binary_available("vnconfig");
when "vnconfig" was renamed as "vndconfig" ages ago, and you
have a recipe for potential disaster, when someone eventually
decides that enough time has passed, and the old vnconfig
link to vndconfig should simply be dropped (changing the 4
occurrences of vnconfig into vndconfig should be easy!)
Of course, the same recipe for disaster (not potential) exists when
there is no insalled vnconfig (or vndconfig) program at all - which is
the case on the MFS mini-roon that's embedded in the INSTALL
kernel.
Fixing this would mean actually writing the function, which
given the simplicity of the other similar functions does't look
like it should be too hard, eg, this example...
void
remove_cgd_options()
{
/*
* No CGD available, remove the following menu entries:
*/
remove_menu_option(MENU_pmdiskentry, MSG_encrypt);
remove_menu_option(MENU_pmpartentry, MSG_encrypt);
}
(and similar for gpt, raid, lvm, cgd, color -- several of which also have
no support on the MFS embedded mini-root)
kre
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: install/53220: sysinst dumps core with extended partitioning.
Date: Tue, 1 May 2018 11:20:06 +0200
On Tue, May 01, 2018 at 04:30:01AM +0000, Robert Elz wrote:
> which looks kind of OK, except there is no "grep" in the MFS miniroot
> so that is always going to fail (return false).
OK, you are using the ram disk based installers, while I tested with the
CD image.
> A much better solution would be to have a new sub-command for
> gpt like
>
> gpt [-q] query dev
I did a very lazy poor man's version of that.
> when "vnconfig" was renamed as "vndconfig" ages ago, and you
> have a recipe for potential disaster, when someone eventually
> decides that enough time has passed, and the old vnconfig
> link to vndconfig should simply be dropped (changing the 4
> occurrences of vnconfig into vndconfig should be easy!)
I replaced the name.
> Of course, the same recipe for disaster (not potential) exists when
> there is no insalled vnconfig (or vndconfig) program at all - which is
> the case on the MFS mini-roon that's embedded in the INSTALL
> kernel.
We have the "have_vnd" variable to note that.
> Fixing this would mean actually writing the function, which
> given the simplicity of the other similar functions does't look
> like it should be too hard, eg, this example...
>
> void
> remove_cgd_options()
> {
> /*
> * No CGD available, remove the following menu entries:
> */
> remove_menu_option(MENU_pmdiskentry, MSG_encrypt);
> remove_menu_option(MENU_pmpartentry, MSG_encrypt);
> }
Indeed, but I'm not quite sure what menus exactly you want to remove
in this case.
Martin
From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: install/53220: sysinst dumps core with extended partitioning.
Date: Tue, 01 May 2018 19:57:58 +0700
Date: Tue, 1 May 2018 09:25:01 +0000 (UTC)
From: Martin Husemann <martin@duskware.de>
Message-ID: <20180501092501.9703B7A219@mollari.NetBSD.org>
| OK, you are using the ram disk based installers, while I tested with the
| CD image.
Yes, the original PR text from Lars-Johan Liman says:
3. Boot a Xen virtual machine with the 8.0_RC1
netbsd-INSTALL_XEN3_DOMU kernel.
And then, when I repeated it:
Boot netbsd-INSTALL kernel from
${RELEASE}/amd64/binary/kernel/netbsd-INSTALL_XEN3_DOMU.gz
though any netbsd-INSTALL kernel does the same (my earlier bare
metal test did not use the XEN3_DOMU kernel, obviously.)
| > gpt [-q] query dev
| I did a very lazy poor man's version of that.
That should be fine. I had in the back of my mind the potential to
add options to "gpt query" to test other things which a script might
want to know, but if that need ever eventuates, it can be done then,
and the "header" command can continue exit(1) for no GPT.
| > Of course, the same recipe for disaster (not potential) exists when
| > there is no insalled vnconfig (or vndconfig) program at all - which is
| > the case on the MFS mini-roon that's embedded in the INSTALL
| > kernel.
|
| We have the "have_vnd" variable to note that.
We do, so when have_vnd is false, the code does...
if (!have_vnd)
(void)0();
If someone was looking for a way to manufacture a core file
in a way that gdb would fail to be of much help, I can't think of
many easier strategies...
| Indeed, but I'm not quite sure what menus exactly you want to remove
| in this case.
No idea, I did not look that deeply, but if there are none, then just not
removing the not existing menus would be better than what it is doing now.
That is, change:
if (!have_vnd)
remove_vnd_options();
else if (!(vnds = calloc(MAX_VND, sizeof(*vnds))))
have_vnd = 0;
into
if (have_vnd && !(vnds = calloc(MAX_VND, sizeof(*vnds))))
have_vnd = 0;
It is worth noting that that small segment is the only use of have_vnd
anywhere, so why it sets have_vnd = 0 when it fails I have no idea.
The
if (firstrun) {
/* ... */
firstrun = 0;
}
block makes sure it is never done again, so that "have_vnd = 0"
is assigning to a variable that is never examined again. The
other have_xxx variables are (well, some of them at least, I did
not do an exhaustive search) also used elsewhere.
That is, it sure looks as if the remove_xxx_options() code is really
intended to remove stuff from the menus, so that other code which
will not work wiuthout vndconfig can never later attempt to execute it.
And I do see ...
if (have_lvm) {
pm_upddevlist_adv(m, arg, &i,
&(pm_upddevlist_adv_t) {MSG_create_cnd, PM_VND_T, &vnds_t_in
fo, 0, NULL});
pm_upddevlist_adv(m, arg, &i,
&(pm_upddevlist_adv_t) {MSG_create_vg, PM_LVM_T, &lvms_t_inf
o, 0,
&(pm_upddevlist_adv_t) {MSG_create_lv, PM_LVMLV_T, &lv_t_inf
o, 0, NULL}});
}
and also:
case PM_VND_T:
return pm_edit(PMV_MENU_END, pm_vnd_edit_menufmt,
pm_vnd_set_value, pm_vnd_check, pm_vnd_init,
NULL, ((part_entry_t *)arg)[m->cursel].dev_ptr,
and in pm_menufmt()
case PM_VND_T:
pm_vnd_menufmt(m, opt, arg);
break;
all of which might be related.
At a minimum, if vnds are needed for lvm config (why? I thought lvm
used vg not vnd ?) then perhaps
have_lvm = binary_available("lvm");
should become
have_lvm = have vnd && binary_available("lvm");
But I suspect that things are just more messed up than that.
kre
ps: aside: I also don't understand why from msg.mi.en the
message "Create virtual disk image (VND)" (or from msg.mi.de
"Erstellen Sie virtuelle Disk-Image (VND)" etc) is "message create_cnd"
instead of "message create_vnd" - looks weird - but that's just a name.
From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: install/53220: sysinst dumps core with extended partitioning.
Date: Wed, 02 May 2018 04:03:30 +0700
Ignore the vnd red herring, that turned out to be relevant to nothing (though
the change from vnconfig to vndconfig was certainly the right thing to do).
The actual cause for the core dump is the pm_lvm_find() function,
it assumes that lvms has been allocated to point to a suitable block
of storage, but when we !have_lvm:
if (!have_lvm)
remove_lvm_options();
else if (!(lvms = calloc(MAX_LVM_VG, sizeof(*lvms))))
have_lvm = 0;
the menu entries related to lvm get dleeted, so there's no way to refer to
them, but lvms is just left uninitialized.
Later partman() does
menu_num_entries = pm_upddevlist(&(menudesc){.opts = menu_entrie
and pm_upddevlist() does
changed = 0;
/* Mark all devices as not found */
SLIST_FOREACH(pm_i, &pm_head, l)
if (pm_i->found > 0)
pm_i->found = 0;
/* Detect all present devices */
(void)find_disks("partman");
pm_lvm_find();
pm_clean();
pm_lvm_find() does
for (i = 0; i < MAX_LVM_VG; i++) {
if (! lvms[i].blocked)
continue;
(and much more) using that uninit'd lvms.
This doesn't cause a problem with a fully populated root filesys
sitting under sysinst, as in that case have_lvm == 1, lvms gets calloc()'d
and all is fine.
To test this, I actually made pm_lvm_find() simply return 0 if (!have_lvm )
and with that change, the partman menu appeared (and at that point I
stopped testing... I did not have things set up in a way that the sets were
available, and playing with the partition editor would have meant changing
the setup I have used over and over again while hunting this down (just
in case there turns out to be more to hunt it is nice to keep as is)
However I think a better fix is:
Index: partman.c
===================================================================
RCS file: /cvsroot/src/usr.sbin/sysinst/partman.c,v
retrieving revision 1.19
diff -u -r1.19 partman.c
--- partman.c 1 May 2018 08:27:39 -0000 1.19
+++ partman.c 1 May 2018 20:54:13 -0000
@@ -2609,7 +2609,8 @@
pm_i->found = 0;
/* Detect all present devices */
(void)find_disks("partman");
- pm_lvm_find();
+ if (have_lvm)
+ pm_lvm_find();
pm_clean();
if (m == NULL || arg == NULL)
In addition it might also be a good idea to turn pm_lvm_find() into a
function returning void, rather than its current int return, which is
always 0, which is always (the fragment above is the only use)
ignored - though that's just futzing around.
kre
From: "Robert Elz" <kre@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/53220 CVS commit: src/usr.sbin/sysinst
Date: Tue, 1 May 2018 21:26:41 +0000
Module Name: src
Committed By: kre
Date: Tue May 1 21:26:41 UTC 2018
Modified Files:
src/usr.sbin/sysinst: partman.c
Log Message:
PR install/53220
Don't call pm_lvm_find() unless have_lvm
pm_lvm_find() assumes that data structs (lvms) has been allocated
for it to use, which doesn't happen if !have_lvm
This avoids a sysinst core dump when the lvm command is not installed
(such as when installing from the embedded RAM root filesys in an
INSTALL kernel.)
To generate a diff of this commit:
cvs rdiff -u -r1.19 -r1.20 src/usr.sbin/sysinst/partman.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: Lars-Johan Liman <liman@cafax.se>
To: gnats-bugs@NetBSD.org
Cc: martin@NetBSD.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: install/53220: sysinst dumps core with extended partitioning.
Date: Wed, 02 May 2018 09:01:58 -0400
Martin, Robert,
> | > gpt [-q] query dev
> | I did a very lazy poor man's version of that.
kre@munnari.OZ.AU:
> That should be fine.
What does "gpt show" return when there is no gpt table? Isn't the exit
status from that sufficient for testing?
(I'm currently behind a firewall from h-ll, so I can't test.)
Cheers,
/Liman
From: Robert Elz <kre@munnari.OZ.AU>
To: Lars-Johan Liman <liman@cafax.se>
Cc: gnats-bugs@NetBSD.org, martin@NetBSD.org, netbsd-bugs@netbsd.org
Subject: Re: install/53220: sysinst dumps core with extended partitioning.
Date: Thu, 03 May 2018 06:55:18 +0700
Date: Wed, 02 May 2018 09:01:58 -0400
From: Lars-Johan Liman <liman@cafax.se>
Message-ID: <22po2e1a9l.fsf@netnod.se>
| What does "gpt show" return when there is no gpt table?
Nothing too interesting.
| Isn't the exit status from that sufficient for testing?
No, it exits 0 for that,
I guess it could be changed, but the show command works,
it isn't failing - I think Martin's decision to make the header
command exit(1) is more appropriate - dumping the header
if there is none can be seen as an error.
kre
From: Martin Husemann <martin@duskware.de>
To: Robert Elz <kre@munnari.OZ.AU>
Cc: Lars-Johan Liman <liman@cafax.se>, gnats-bugs@NetBSD.org
Subject: Re: install/53220: sysinst dumps core with extended partitioning.
Date: Thu, 3 May 2018 08:13:10 +0200
On Thu, May 03, 2018 at 06:55:18AM +0700, Robert Elz wrote:
> | Isn't the exit status from that sufficient for testing?
>
> No, it exits 0 for that,
My initial idea was to change that, but:
"gpt show" also works on MBR labeld disks and prints reasonable output
there, so it seems wrong to return EXIT_FAILURE in that case.
Martin
From: Lars-Johan Liman <liman@cafax.se>
To: Martin Husemann <martin@duskware.de>
Cc: Robert Elz <kre@munnari.OZ.AU>, gnats-bugs@NetBSD.org
Subject: Re: install/53220: sysinst dumps core with extended partitioning.
Date: Thu, 03 May 2018 10:48:17 -0400
martin@duskware.de:
> On Thu, May 03, 2018 at 06:55:18AM +0700, Robert Elz wrote:
>> | Isn't the exit status from that sufficient for testing?
>>
>> No, it exits 0 for that,
> My initial idea was to change that, but:
> "gpt show" also works on MBR labeld disks and prints reasonable output
> there, so it seems wrong to return EXIT_FAILURE in that case.
Ah. Right. Agreed.
From: "Soren Jacobsen" <snj@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/53220 CVS commit: [netbsd-8] src/usr.sbin/sysinst
Date: Mon, 7 May 2018 04:06:31 +0000
Module Name: src
Committed By: snj
Date: Mon May 7 04:06:31 UTC 2018
Modified Files:
src/usr.sbin/sysinst [netbsd-8]: partman.c
Log Message:
Pull up following revision(s) (requested by kre in ticket #801):
usr.sbin/sysinst/partman.c: 1.20-1.21
PR install/53220
Don't call pm_lvm_find() unless have_lvm
pm_lvm_find() assumes that data structs (lvms) has been allocated
for it to use, which doesn't happen if !have_lvm
This avoids a sysinst core dump when the lvm command is not installed
(such as when installing from the embedded RAM root filesys in an
INSTALL kernel.)
--
Change return type of pm_lvm_find() from int to void.
It always returns (returned) 0 which was ignored by the one call.
To generate a diff of this commit:
cvs rdiff -u -r1.15.6.2 -r1.15.6.3 src/usr.sbin/sysinst/partman.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
>Unformatted:
(Contact us)
$NetBSD: query-full-pr,v 1.43 2018/01/16 07:36:43 maya Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2017
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.