NetBSD Problem Report #56118
From chris@groessler.org Tue Apr 20 16:32:22 2021
Return-Path: <chris@groessler.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 2D7831A9247
for <gnats-bugs@gnats.NetBSD.org>; Tue, 20 Apr 2021 16:32:22 +0000 (UTC)
Message-Id: <20210420151505.83DDFF1610@hppa.groessler.org>
Date: Tue, 20 Apr 2021 17:15:05 +0200 (CEST)
From: chris@groessler.org
Reply-To: chris@groessler.org
To: gnats-bugs@NetBSD.org
Subject: sporadic app crashes in HPPA -current
X-Send-Pr-Version: 3.95
>Number: 56118
>Category: port-hppa
>Synopsis: sporadic app crashes in HPPA -current
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: port-hppa-maintainer
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Tue Apr 20 16:35:00 +0000 2021
>Closed-Date: Mon Aug 29 14:50:13 +0000 2022
>Last-Modified: Mon Aug 29 14:50:13 +0000 2022
>Originator: Christian Groessler <chris@groessler.org>
>Release: NetBSD 9.99.81
>Organization:
private
>Environment:
System: NetBSD hppa.groessler.org 9.99.81 NetBSD 9.99.81 (CPGHPPA) #0: Wed Apr 14 23:36:28 CEST 2021 chris@blitz:/data/home/chris/tmp/netbsd/obj/data/home/chris/tmp/netbsd/src/sys/arch/hppa/compile/CPGHPPA hppa
Architecture: hppa
Machine: hppa
>Description:
'makemandb', run by the 'daily' script crashes with SIGSEGV. Running in gdb gives:
hppa# gdb /usr/sbin/makemandb #-Q
GNU gdb (GDB) 11.0.50.20200914-git
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "hppa--netbsd".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/sbin/makemandb...
Reading symbols from /usr/libdata/debug//usr/sbin/makemandb.debug...
(gdb) set args -Q
(gdb) r
Starting program: /usr/sbin/makemandb -Q
[New process 29082]
Thread 1 "" received signal SIGSEGV, Segmentation fault.
0x00035228 in __canonicalize_funcptr_for_compare ()
(gdb) bt
#0 0x00035228 in __canonicalize_funcptr_for_compare ()
#1 0x00013650 in mdoc_parse_Sh (n=<optimized out>, rec=0xb0001708) at /data/home/chris/tmp/netbsd/src/usr.sbin/makemandb/makemandb.c:1082
#2 0x00013670 in mdoc_parse_Sh (n=<optimized out>, rec=0xb0001708) at /data/home/chris/tmp/netbsd/src/usr.sbin/makemandb/makemandb.c:1098
#3 0x00012bac in proff_node (n=<optimized out>, rec=0xb0001708, roff=0xafe97060, func=0x38248 <mdocs>) at /data/home/chris/tmp/netbsd/src/usr.sbin/makemandb/makemandb.c:1171
#4 0x00036ac8 in begin_parse (fd=4, rec=0xb0001708, mp=0xafe9a000, file=0xaf952bb0 "/usr/share/man/man9lua/systm.9lua") at /data/home/chris/tmp/netbsd/src/usr.sbin/makemandb/makemandb.c:892
#5 update_db (rec=0xb0001708, mp=0xafe9a000, db=0xafeab208) at /data/home/chris/tmp/netbsd/src/usr.sbin/makemandb/makemandb.c:825
#6 main (argc=<optimized out>, argv=<optimized out>) at /data/home/chris/tmp/netbsd/src/usr.sbin/makemandb/makemandb.c:434
(gdb) x/4i $pc
=> 0x35228 <__canonicalize_funcptr_for_compare+48>: probei,r (r3),3,r20
0x3522c <__canonicalize_funcptr_for_compare+52>: cmpiclr,<> 0,r20,r0
0x35230 <__canonicalize_funcptr_for_compare+56>: b,l,n 0x3527c <__canonicalize_funcptr_for_compare+132>,r0
0x35234 <__canonicalize_funcptr_for_compare+60>: ldw 0(r3),r20
(gdb) inf reg r3 r20
r3 0x6c696e68 1818848872
r20 0xfff 4095
(gdb)
I've also built bash from pkgsrc/shells/bash, and this crashes at startup. Don't know if this is related, but for completeness:
hppa# gdb /usr/pkg/bin/bash
Reading symbols from /usr/pkg/bin/bash...
(No debugging symbols found in /usr/pkg/bin/bash)
(gdb) r
Starting program: /usr/pkg/bin/bash
Program received signal SIGSEGV, Segmentation fault.
0x000799fc in hash_search ()
(gdb) bt
#0 0x000799fc in hash_search ()
#1 0x0004e8a0 in find_tempenv_variable ()
#2 0x000c17b4 in ?? ()
#3 0xaf12c168 in jemalloc_secure_getenv (name=0xaf1d9624 "MALLOC_CONF") at /data/home/chris/tmp/netbsd/src/external/bsd/jemalloc/lib/../dist/src/jemalloc.c:744
#4 malloc_conf_init () at /data/home/chris/tmp/netbsd/src/external/bsd/jemalloc/lib/../dist/src/jemalloc.c:970
#5 0xaf12d3ac in malloc_init_hard_a0_locked () at /data/home/chris/tmp/netbsd/src/external/bsd/jemalloc/lib/../dist/src/jemalloc.c:1318
#6 0xaf12db20 in malloc_init_hard () at /data/home/chris/tmp/netbsd/src/external/bsd/jemalloc/lib/../dist/src/jemalloc.c:1554
#7 0xaf1cc1d0 in je_prof_thread_name_set () from /usr/lib/libc.so.12
#8 0xaf03fe0c in ?? () from /usr/lib/libc.so.12
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) x/4i $pc
=> 0x799fc <hash_search+496>: ldw 8(r5),ret0
0x79a00 <hash_search+500>: cmpib,<>,n 0,ret0,0x7984c <hash_search+64>
0x79a04 <hash_search+504>: ldb 0(r6),r8
0x79a08 <hash_search+508>: b,l 0x7990c <hash_search+256>,r0
(gdb) inf reg r5
r5 0xe84f1304 3897496324
(gdb)
My custom config file (CPGHPPA) just hard-codes the root partition.
[~/tmp/netbsd/src/sys/arch/hppa/conf]$ diff -u GENERIC CPGHPPA
--- GENERIC 2021-01-26 12:25:42.614136556 +0100
+++ CPGHPPA 2021-01-26 12:23:37.911318089 +0100
@@ -1,4 +1,4 @@
-# $NetBSD: GENERIC,v 1.37 2021/01/21 06:51:55 nia Exp $
+# $NetBSD: GENERIC,v 1.36 2020/09/27 13:48:51 roy Exp $
#
# GENERIC machine description file
#
@@ -23,7 +23,7 @@
options INCLUDE_CONFIG_FILE # embed config file in kernel binary
options SYSCTL_INCLUDE_DESCR # Include sysctl descriptions in kernel
-#ident "GENERIC-$Revision: 1.37 $"
+#ident "GENERIC-$Revision: 1.36 $"
maxusers 32 # estimated number of users
@@ -80,7 +80,6 @@
include "conf/compat_netbsd20.config"
#options COMPAT_LINUX # binary compatibility with Linux
-#options COMPAT_OSSAUDIO # binary compatibility with Linux
# File systems
file-system FFS # UFS
@@ -193,7 +192,8 @@
#options VGA_RASTERCONSOLE
# Kernel root file system and dump configuration.
-config netbsd root on ? type ?
+config netbsd root on sd0d type ffs
+#config netbsd root on ? type ?
#config netbsd root on sd0a type ffs
#config netbsd root on ? type nfs
>How-To-Repeat:
Build -current for HPPA and run 'makemandb -Q'.
>Fix:
>Release-Note:
>Audit-Trail:
From: coypu@sdf.org
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: port-hppa/56118: sporadic app crashes in HPPA -current
Date: Wed, 21 Apr 2021 11:01:09 +0000
Is this with GCC 10?
From: Christian Groessler <chris@groessler.org>
To: gnats-bugs@netbsd.org, port-hppa-maintainer@netbsd.org,
gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc:
Subject: Re: port-hppa/56118: sporadic app crashes in HPPA -current
Date: Wed, 21 Apr 2021 14:42:16 +0200
No, 9.3.0
hppa$ cc --version
cc (nb1 20200907) 9.3.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
The system was cross-compiled from amd64 Linux:
[~/tmp/netbsd]$ ./tools/bin/hppa--netbsd-gcc --version
hppa--netbsd-gcc (NetBSD nb1 20200907) 9.3.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
On 4/21/21 1:05 PM, coypu@sdf.org wrote:
> The following reply was made to PR port-hppa/56118; it has been noted by GNATS.
>
> From: coypu@sdf.org
> To: gnats-bugs@netbsd.org
> Cc:
> Subject: Re: port-hppa/56118: sporadic app crashes in HPPA -current
> Date: Wed, 21 Apr 2021 11:01:09 +0000
>
> Is this with GCC 10?
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: port-hppa/56118: sporadic app crashes in HPPA -current
Date: Sat, 10 Jul 2021 04:28:57 +0000
On Tue, Apr 20, 2021 at 04:35:00PM +0000, chris@groessler.org wrote:
> #0 0x00035228 in __canonicalize_funcptr_for_compare ()
The name of that function scares me.
Seems that it comes from libgcc and it says
/* WARNING: The code is this function depends on internal and undocumented
details of the GNU linker and dynamic loader as implemented for parisc
linux. */
> #1 0x00013650 in mdoc_parse_Sh (n=<optimized out>, rec=0xb0001708) at /data/home/chris/tmp/netbsd/src/usr.sbin/makemandb/makemandb.c:1082
and sure enough, this is comparing function pointers.
I don't know enough about pa-risc to have any idea what it's about,
but probably something our dynamic linker's doing doesn't match the
expectations of the libgcc code.
--
David A. Holland
dholland@netbsd.org
From: Nick Hudson <nick.hudson@gmx.co.uk>
To: gnats-bugs@netbsd.org, port-hppa-maintainer@netbsd.org,
gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc:
Subject: Re: port-hppa/56118: sporadic app crashes in HPPA -current
Date: Sun, 11 Jul 2021 06:50:40 +0100
On 20/04/2021 17:35, chris@groessler.org wrote:
>
> Thread 1 "" received signal SIGSEGV, Segmentation fault.
> 0x00035228 in __canonicalize_funcptr_for_compare ()
> (gdb) bt
> #0 0x00035228 in __canonicalize_funcptr_for_compare ()
> #1 0x00013650 in mdoc_parse_Sh (n=<optimized out>, rec=0xb0001708) at /data/home/chris/tmp/netbsd/src/usr.sbin/makemandb/makemandb.c:1082
> #2 0x00013670 in mdoc_parse_Sh (n=<optimized out>, rec=0xb0001708) at /data/home/chris/tmp/netbsd/src/usr.sbin/makemandb/makemandb.c:1098
> #3 0x00012bac in proff_node (n=<optimized out>, rec=0xb0001708, roff=0xafe97060, func=0x38248 <mdocs>) at /data/home/chris/tmp/netbsd/src/usr.sbin/makemandb/makemandb.c:1171
> #4 0x00036ac8 in begin_parse (fd=4, rec=0xb0001708, mp=0xafe9a000, file=0xaf952bb0 "/usr/share/man/man9lua/systm.9lua") at /data/home/chris/tmp/netbsd/src/usr.sbin/makemandb/makemandb.c:892
> #5 update_db (rec=0xb0001708, mp=0xafe9a000, db=0xafeab208) at /data/home/chris/tmp/netbsd/src/usr.sbin/makemandb/makemandb.c:825
> #6 main (argc=<optimized out>, argv=<optimized out>) at /data/home/chris/tmp/netbsd/src/usr.sbin/makemandb/makemandb.c:434
> (gdb) x/4i $pc
> => 0x35228 <__canonicalize_funcptr_for_compare+48>: probei,r (r3),3,r20
> 0x3522c <__canonicalize_funcptr_for_compare+52>: cmpiclr,<> 0,r20,r0
> 0x35230 <__canonicalize_funcptr_for_compare+56>: b,l,n 0x3527c <__canonicalize_funcptr_for_compare+132>,r0
> 0x35234 <__canonicalize_funcptr_for_compare+60>: ldw 0(r3),r20
> (gdb) inf reg r3 r20
> r3 0x6c696e68 1818848872
This looks more like 'h' 'n' 'i' 'l' than a userland address.
It'd be good to see the assembly before to see where r3 is coming from.
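One quick way to check the observation above is to print the register value byte by byte. The sketch below is only an illustration: the helper name `word_to_ascii` is made up for this note and is not part of any NetBSD source.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: write the four bytes of a 32-bit word into out[],
 * most significant byte first (the order they would occupy in memory on a
 * big-endian machine such as HPPA). Non-printable bytes are shown as '.'.
 * Returns the number of printable bytes found.
 */
int
word_to_ascii(uint32_t word, char out[5])
{
	int printable = 0;

	assert(out != NULL);
	for (int i = 0; i < 4; i++) {
		unsigned char c = (unsigned char)(word >> (24 - 8 * i));

		if (isprint(c)) {
			out[i] = (char)c;
			printable++;
		} else {
			out[i] = '.';
		}
	}
	out[4] = '\0';
	return printable;
}
```

Calling `word_to_ascii(0x6c696e68, buf)` fills `buf` with "linh" and reports four printable bytes, which is what one would expect from stray text rather than from a pointer.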
[snip]
> Starting program: /usr/pkg/bin/bash
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x000799fc in hash_search ()
> (gdb) bt
> #0 0x000799fc in hash_search ()
> #1 0x0004e8a0 in find_tempenv_variable ()
> #2 0x000c17b4 in ?? ()
> #3 0xaf12c168 in jemalloc_secure_getenv (name=0xaf1d9624 "MALLOC_CONF") at /data/home/chris/tmp/netbsd/src/external/bsd/jemalloc/lib/../dist/src/jemalloc.c:744
> #4 malloc_conf_init () at /data/home/chris/tmp/netbsd/src/external/bsd/jemalloc/lib/../dist/src/jemalloc.c:970
> #5 0xaf12d3ac in malloc_init_hard_a0_locked () at /data/home/chris/tmp/netbsd/src/external/bsd/jemalloc/lib/../dist/src/jemalloc.c:1318
> #6 0xaf12db20 in malloc_init_hard () at /data/home/chris/tmp/netbsd/src/external/bsd/jemalloc/lib/../dist/src/jemalloc.c:1554
> #7 0xaf1cc1d0 in je_prof_thread_name_set () from /usr/lib/libc.so.12
> #8 0xaf03fe0c in ?? () from /usr/lib/libc.so.12
> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
> (gdb) x/4i $pc
> => 0x799fc <hash_search+496>: ldw 8(r5),ret0
> 0x79a00 <hash_search+500>: cmpib,<>,n 0,ret0,0x7984c <hash_search+64>
> 0x79a04 <hash_search+504>: ldb 0(r6),r8
> 0x79a08 <hash_search+508>: b,l 0x7990c <hash_search+256>,r0
> (gdb) inf reg r5
> r5 0xe84f1304 3897496324
> (gdb)
hmm, general heap corruption? I'll try locally.
Nick
From: Nick Hudson <nick.hudson@gmx.co.uk>
To: gnats-bugs@netbsd.org, port-hppa-maintainer@netbsd.org,
gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc:
Subject: Re: port-hppa/56118: sporadic app crashes in HPPA -current
Date: Fri, 16 Jul 2021 08:43:36 +0100
> I've also built bash from pkgsrc/shells/bash, and this crashes at startup. Don't know if this is related, but for completeness:
[snip]
The bash problem is kinda interesting. bash provides its own getenv and
this is being used by early initialization code in libc / jemalloc when
it does getenv("MALLOC_CONF") before _start has been called in the main
program. _start sets the dp register which needs to contain the GOT of
the main program...
This diff allows bash to start.
I'm still thinking about if it's the right thing to do.
Nick
Index: arch/hppa/hppa_reloc.c
===================================================================
RCS file: /cvsroot/src/libexec/ld.elf_so/arch/hppa/hppa_reloc.c,v
retrieving revision 1.47
diff -u -p -r1.47 hppa_reloc.c
--- arch/hppa/hppa_reloc.c	16 May 2020 16:43:00 -0000	1.47
+++ arch/hppa/hppa_reloc.c	16 Jul 2021 06:56:51 -0000
@@ -52,6 +52,7 @@ __RCSID("$NetBSD: hppa_reloc.c,v 1.47 20
 caddr_t _rtld_bind(const Obj_Entry *, const Elf_Addr);
 void _rtld_bind_start(void);
 void __rtld_setup_hppa_pltgot(const Obj_Entry *, Elf_Addr *);
+void _rtld_set_dp(Elf_Addr *);
 
 /*
  * It is possible for the compiler to emit relocations for unaligned data.
@@ -381,6 +382,12 @@ _rtld_setup_pltgot(const Obj_Entry *obj)
 {
 	Elf_Word *got = obj->pltgot;
 
+
+	if (obj->mainprog) {
+		dbg(("setting DP to %p", obj->pltgot);
+		_rtld_set_dp(obj->pltgot);
+	}
+
 	assert(got[-2] == PLT_STUB_MAGIC1);
 	assert(got[-1] == PLT_STUB_MAGIC2);
 
Index: arch/hppa/rtld_start.S
===================================================================
RCS file: /cvsroot/src/libexec/ld.elf_so/arch/hppa/rtld_start.S,v
retrieving revision 1.13
diff -u -p -r1.13 rtld_start.S
--- arch/hppa/rtld_start.S	10 May 2020 06:42:38 -0000	1.13
+++ arch/hppa/rtld_start.S	16 Jul 2021 06:56:51 -0000
@@ -231,3 +231,9 @@ ENTRY(_rtld_bind_start,HPPA_FRAME_SIZE)
 	bv	%r0(%r21)
 	nop
 EXIT(_rtld_bind_start)
+
+
+LEAF_ENTRY_NOPROFILE(_rtld_set_dp)
+	bv	%r0(%rp)
+	 copy	%arg0, %dp
+EXIT(_rtld_set_dp)
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: port-hppa/56118: sporadic app crashes in HPPA -current
Date: Sat, 17 Jul 2021 03:44:25 +0000
On Fri, Jul 16, 2021 at 07:45:01AM +0000, Nick Hudson wrote:
> The bash problem is kinda interesting. bash provides its own getenv and
> this is being used by early initialization code in libc / jemalloc when
> it does getenv("MALLOC_CONF") before _start has been called in the main
> program. _start sets the dp register which needs to contain the GOT of
> the main program...
>
> This diff allows bash to start.
>
> I'm still thinking about if it's the right thing to do.
as I said in chat I kinda think it is -- while according to standards
randomly replacing bits of libc is undefined, according to tradition
it's supposed to work, so if we're going to be calling getenv before
_start that initialization had better happen before _start too. :-|
I would mark it as ugly and explain why it needs to be there, and
maybe revisit it if the situation improves at some point (not that
this is too likely) but go ahead.
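The ordering issue discussed here can be modelled in portable C. This is only a simulation under made-up names (`sketch_dp`, `sketch_set_dp`, `sketch_prog_getenv`); it does not touch the real dp register or the real rtld code, and it assumes nothing beyond what the messages above describe.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the main program's data that dp is supposed to point at. */
struct sketch_got {
	const char *malloc_conf;	/* value the program's getenv would return */
};

/* Stand-in for the dp register; NULL models "not set up yet". */
static const struct sketch_got *sketch_dp;

/* Models the step added by the diff: the loader sets dp early. */
void
sketch_set_dp(const struct sketch_got *got)
{
	sketch_dp = got;
}

/*
 * Stand-in for a getenv supplied by the main program. It reaches its data
 * through sketch_dp, so it only works once dp has been initialized.
 * Returns 0 on success, -1 where the real program would fault.
 */
int
sketch_prog_getenv(const char **value)
{
	assert(value != NULL);
	if (sketch_dp == NULL)
		return -1;
	*value = sketch_dp->malloc_conf;
	return 0;
}
```

In this model, a call to `sketch_prog_getenv` made before `sketch_set_dp` fails, and the same call made afterwards succeeds. That is the reason for moving the dp setup ahead of any code that might call into the main program.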
(also because gnats sucks I put a decoded copy here:
https://www.netbsd.org/~dholland/gnatsblobs/56118/diff)
--
David A. Holland
dholland@netbsd.org
From: Christian Groessler <chris@groessler.org>
To: gnats-bugs@netbsd.org, port-hppa-maintainer@netbsd.org,
gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc:
Subject: Re: port-hppa/56118: sporadic app crashes in HPPA -current
Date: Tue, 27 Jul 2021 14:39:43 +0200
I've tried a variation of the diff, since it didn't work for me (missing
closing bracket):
Index: hppa_reloc.c
===================================================================
RCS file:
/nfs/swamp/zeug/netbsd-rsync/main/src/libexec/ld.elf_so/arch/hppa/hppa_reloc.c,v
retrieving revision 1.47
diff -u -r1.47 hppa_reloc.c
--- hppa_reloc.c 16 May 2020 16:43:00 -0000 1.47
+++ hppa_reloc.c 27 Jul 2021 12:37:23 -0000
@@ -52,6 +52,7 @@
caddr_t _rtld_bind(const Obj_Entry *, const Elf_Addr);
void _rtld_bind_start(void);
void __rtld_setup_hppa_pltgot(const Obj_Entry *, Elf_Addr *);
+void _rtld_set_dp(Elf_Addr *);
/*
* It is possible for the compiler to emit relocations for unaligned data.
@@ -381,6 +382,12 @@
{
Elf_Word *got = obj->pltgot;
+
+ if (obj->mainprog) {
+ hdbg(("setting DP to %p", obj->pltgot));
+ _rtld_set_dp(obj->pltgot);
+ }
+
assert(got[-2] == PLT_STUB_MAGIC1);
assert(got[-1] == PLT_STUB_MAGIC2);
Index: rtld_start.S
===================================================================
RCS file:
/nfs/swamp/zeug/netbsd-rsync/main/src/libexec/ld.elf_so/arch/hppa/rtld_start.S,v
retrieving revision 1.13
diff -u -r1.13 rtld_start.S
--- rtld_start.S 10 May 2020 06:42:38 -0000 1.13
+++ rtld_start.S 27 Jul 2021 12:37:23 -0000
@@ -231,3 +231,9 @@
bv %r0(%r21)
nop
EXIT(_rtld_bind_start)
+
+
+LEAF_ENTRY_NOPROFILE(_rtld_set_dp)
+ bv %r0(%rp)
+ copy %arg0, %dp
+EXIT(_rtld_set_dp)
But now it's worse:
hppa# cd /var/tmp/
hppa# ls
base.tgz debug.tgz games.tgz kern-GENERIC.tgz
misc.tgz rescue.tgz text.tgz xbase.tgz
xdebug.tgz xfont.tgz
comp.tgz etc.tgz kern-CPGHPPA.tgz man.tgz
modules.tgz tests.tgz vi.recover xcomp.tgz
xetc.tgz xserver.tgz
hppa# ls -l /bin/ls
-r-xr-xr-x 1 root wheel 37780 Apr 14 23:35 /bin/ls
hppa# tar -xzUf base.tgz -C /
hppa# ls -l /bin/ls
[1] Segmentation fault (core dumped) ls -l /bin/ls
hppa# hash -re
hash: Illegal option -e
hppa# hash -r
hppa# ls -l /bin/ls
[1] Segmentation fault (core dumped) ls -l /bin/ls
hppa# tar
[1] Segmentation fault (core dumped) tar
regards,
chris
From: "David H. Gutteridge" <david@gutteridge.ca>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: port-hppa/56118: sporadic app crashes in HPPA -current
Date: Mon, 28 Mar 2022 19:55:48 -0400
Probably this isn't news to anyone on this PR, but it's still an issue
with 9.99.95 kernel+userland. I just ran into this myself with some of
the same reproducers. In my case, this is on a B180L+.
Dave
From: Tom Lane <tgl@sss.pgh.pa.us>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: port-hppa/56118: sporadic app crashes in HPPA -current
Date: Sun, 05 Jun 2022 17:36:23 -0400
chris@groessler.org writes:
> Thread 1 "" received signal SIGSEGV, Segmentation fault.
> 0x00035228 in __canonicalize_funcptr_for_compare ()
> ...
> (gdb) x/4i $pc
> => 0x35228 <__canonicalize_funcptr_for_compare+48>: probei,r (r3),3,r20
> 0x3522c <__canonicalize_funcptr_for_compare+52>: cmpiclr,<> 0,r20,r0
> 0x35230 <__canonicalize_funcptr_for_compare+56>: b,l,n 0x3527c <__canonicalize_funcptr_for_compare+132>,r0
> 0x35234 <__canonicalize_funcptr_for_compare+60>: ldw 0(r3),r20
I poked into this a little bit. I think what is happening is that
makemandb assumes that it can do this function pointer comparison:
} else if (mdocs[n->tok] == pmdoc_Xr) {
without regard for whether "mdocs[n->tok]" is a valid pointer;
as Nick Hudson points out downthread, the value being passed to
__canonicalize_funcptr_for_compare looks a lot more like a bit of
ASCII text than it does a pointer.
Now, __canonicalize_funcptr_for_compare does try to defend itself
against being given a bogus pointer, but it's crashing exactly where
it tries to do that. The PROBEI instruction corresponds to this bit
in external/gpl3/gcc/dist/libgcc/config/pa/fptr.c:
if (!_dl_read_access_allowed ((unsigned int)plabel))
return (unsigned int) fptr;
So seemingly, _dl_read_access_allowed fails on sufficiently-bogus
pointers. I looked up PROBEI in the HPPA Architecture Reference,
and I found out that it only knows how to give the correct answer
if the target address is in the data TLB table. If it isn't (which
of course it wouldn't be in this case), PROBEI causes a "non-access
data TLB miss fault". The manual then says:
If this instruction causes a non-access data TLB miss
fault/non-access data page fault, the operating system's handler is
required to search its page tables for the given address. If found,
it does the appropriate TLB insert and returns to the interrupting
instruction. If not found, the handler must decode the target field
of the instruction, set that GR to 0, set the IPSW[N] bit to 1, and
return to the interrupting instruction.
I don't have a good idea where to look in the NetBSD sources for
HPPA TLB miss handling, but I will bet lunch that it is not handling
this case correctly, because surely PROBEI will never actually try
to dereference the target address, as the SIGSEGV trap would seem to
indicate. I suspect the TLB miss logic just treats any invalid
address as SIGSEGV, without this special case for PROBEI.
(Note there is also a PROBE instruction with the same requirement.)
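The handler behaviour required by the manual text quoted above can be sketched roughly as follows. Everything here is a simplified model: `struct sketch_frame`, `SKETCH_PSW_N`, and the assumption that the target register number sits in the low five bits of the instruction word are placeholders for illustration and are not taken from the NetBSD sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified trap frame: 32 general registers plus IPSW. */
struct sketch_frame {
	uint32_t gr[32];
	uint32_t ipsw;
};

/* Placeholder for the IPSW nullify bit; check the real value in the manual. */
#define SKETCH_PSW_N	0x00200000u

/*
 * Assumption for this sketch: the target register number occupies the
 * low five bits of the instruction word.
 */
static unsigned
sketch_target_reg(uint32_t insn)
{
	return insn & 0x1fu;
}

/*
 * Model of the quoted rule for a non-access data TLB miss.
 * page_mapped stands in for the result of the page-table search.
 * Returns true if a mapping was found (real code would insert the TLB
 * entry and retry), false if the "not found" path was taken.
 */
bool
sketch_handle_na_miss(struct sketch_frame *tf, uint32_t insn, bool page_mapped)
{
	assert(tf != NULL);

	if (page_mapped)
		return true;

	/* Not found: zero the target GR and set IPSW[N]; no SIGSEGV. */
	tf->gr[sketch_target_reg(insn)] = 0;
	tf->ipsw |= SKETCH_PSW_N;
	return false;
}
```

The point of the model is the second branch: an unmapped address is reported back to the probing code as a zero in the target register, not as a fatal signal.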
So the direct cause of the crash looks to me to be missing kernel
functionality. However, I'm also fairly suspicious of makemandb's
code, because it seems to me that comparing ASCII text to a function
pointer is someday going to result in a false match and ensuing
misbehavior. I wonder if that is basically an uninitialized-variable
problem.
The crash in bash does seem quite unrelated.
regards, tom lane
From: Tom Lane <tgl@sss.pgh.pa.us>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: port-hppa/56118: sporadic app crashes in HPPA -current
Date: Sun, 05 Jun 2022 22:34:39 -0400
I wrote:
> I poked into this a little bit. I think what is happening is that
> makemandb assumes that it can do this function pointer comparison:
> } else if (mdocs[n->tok] == pmdoc_Xr) {
> without regard for whether "mdocs[n->tok]" is a valid pointer;
Actually ... this code fragment is utterly broken, and seemingly
has been for a while. mdocs[] is a constant array of function
pointers:
static const proff_nf mdocs[MDOC_MAX - MDOC_Dd] = {
NULL, /* Dd */
NULL, /* Dt */
and it is supposed to be indexed by token type minus MDOC_Dd,
not just token type. (If you're hoping that MDOC_Dd is zero,
it ain't.) So we're not even indexing the array correctly,
nor do we have any guards for an out-of-range token type,
which is how come we're managing to pass garbage to
__canonicalize_funcptr_for_compare. The one other usage of
mdocs[] in proff_node() gets these indexing considerations right.
So a minimal fix would look like
Index: makemandb.c
===================================================================
RCS file: /cvsroot/src/usr.sbin/makemandb/makemandb.c,v
retrieving revision 1.62
diff -u -r1.62 makemandb.c
--- makemandb.c 6 Apr 2022 03:23:38 -0000 1.62
+++ makemandb.c 6 Jun 2022 02:14:22 -0000
@@ -1078,14 +1078,16 @@
if (n->type == ROFFT_TEXT) {
mdoc_parse_section(n->sec, n->string, rec);
- } else if (mdocs[n->tok] == pmdoc_Xr) {
+ } else if (n->tok >= MDOC_Dd && n->tok < MDOC_MAX &&
+ mdocs[n->tok - MDOC_Dd] == pmdoc_Xr) {
/*
* When encountering other inline macros,
* call pmdoc_macro_handler.
*/
pmdoc_macro_handler(n, rec, MDOC_Xr);
xr_found = 1;
- } else if (mdocs[n->tok] == pmdoc_Pp) {
+ } else if (n->tok >= MDOC_Dd && n->tok < MDOC_MAX &&
+ mdocs[n->tok - MDOC_Dd] == pmdoc_Pp) {
pmdoc_macro_handler(n, rec, MDOC_Pp);
}
I applied this version and I find that "makemandb -Q" completes now
on my HPPA box, which it did not before. However, I don't have a
whole lot of faith in that being a 100% fix, because I don't think
it's entirely guaranteed that the program's GOT is swapped into the
TLB buffers when we reach this code. Seeing that pmdoc_Xr and
pmdoc_Pp appear only once in mdocs[], this code could be simplified
without change of semantics to
Index: makemandb.c
===================================================================
RCS file: /cvsroot/src/usr.sbin/makemandb/makemandb.c,v
retrieving revision 1.62
diff -u -r1.62 makemandb.c
--- makemandb.c 6 Apr 2022 03:23:38 -0000 1.62
+++ makemandb.c 6 Jun 2022 02:17:21 -0000
@@ -1078,14 +1078,14 @@
if (n->type == ROFFT_TEXT) {
mdoc_parse_section(n->sec, n->string, rec);
- } else if (mdocs[n->tok] == pmdoc_Xr) {
+ } else if (n->tok == MDOC_Xr) {
/*
* When encountering other inline macros,
* call pmdoc_macro_handler.
*/
pmdoc_macro_handler(n, rec, MDOC_Xr);
xr_found = 1;
- } else if (mdocs[n->tok] == pmdoc_Pp) {
+ } else if (n->tok == MDOC_Pp) {
pmdoc_macro_handler(n, rec, MDOC_Pp);
}
and on the whole I'd recommend that coding.
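To make the two variants concrete, here is a small stand-alone sketch of an offset-indexed handler table. The enum values, handler names, and table are invented for illustration and only mimic the shape of mdocs[]; they are not the makemandb definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token numbering: the first macro token is not zero. */
enum tok {
	TOK_FIRST = 100,	/* plays the role of MDOC_Dd */
	TOK_DD = TOK_FIRST,
	TOK_XR,
	TOK_PP,
	TOK_MAX			/* one past the last token, like MDOC_MAX */
};

typedef int (*handler_fn)(void);

static int handle_xr(void) { return 1; }
static int handle_pp(void) { return 2; }

/* The table has TOK_MAX - TOK_FIRST entries: index with tok - TOK_FIRST. */
static const handler_fn handlers[TOK_MAX - TOK_FIRST] = {
	NULL,		/* TOK_DD */
	handle_xr,	/* TOK_XR */
	handle_pp,	/* TOK_PP */
};

/* Guarded lookup: returns NULL for tokens outside the table. */
handler_fn
lookup_handler(int tok)
{
	if (tok < TOK_FIRST || tok >= TOK_MAX)
		return NULL;
	return handlers[tok - TOK_FIRST];
}

/* Minimal-fix style: range check, rebased index, pointer comparison. */
int
is_xr_by_table(int tok)
{
	return lookup_handler(tok) == handle_xr;
}

/* Simplified style: compare the token itself; no table access at all. */
int
is_xr_by_token(int tok)
{
	return tok == TOK_XR;
}

/* Both styles agree because handle_xr appears exactly once in the table. */
void
check_styles_agree(void)
{
	for (int tok = TOK_FIRST - 2; tok < TOK_MAX + 2; tok++)
		assert(is_xr_by_table(tok) == is_xr_by_token(tok));
}
```

The token comparison never reads the table, so it cannot hand a garbage value to a function-pointer comparison, whatever the token is.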
I wonder, however, why nobody has noticed that this code doesn't
work as intended on any platform. I don't know enough about roff
to devise a test case for it, but maybe one is needed.
Meanwhile, I still think that the HPPA kernel support for PROBE[I]
is broken.
regards, tom lane
State-Changed-From-To: open->analyzed
State-Changed-By: skrll@NetBSD.org
State-Changed-When: Mon, 06 Jun 2022 09:40:50 +0000
State-Changed-Why:
After fixing some bugs along the way this PR is about missing probe[rw]{,i}
functionality.
From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/56118 CVS commit: [netbsd-9] src/usr.sbin/makemandb
Date: Mon, 6 Jun 2022 11:34:12 +0000
Module Name: src
Committed By: martin
Date: Mon Jun 6 11:34:12 UTC 2022
Modified Files:
src/usr.sbin/makemandb [netbsd-9]: makemandb.c
Log Message:
Pull up following revision(s) (requested by skrll in ticket #1465):
usr.sbin/makemandb/makemandb.c: revision 1.63
Don't index outside the mdocs array of function pointers. Analysis and
suggested fixes from Tom Lane. I played it safe and went with (my
variation of) the minimal fix.
PR port-hppa/56118: sporadic app crashes in HPPA -current
To generate a diff of this commit:
cvs rdiff -u -r1.60 -r1.60.2.1 src/usr.sbin/makemandb/makemandb.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Nick Hudson" <skrll@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/56118 CVS commit: src/sys/arch/hppa/hppa
Date: Thu, 9 Jun 2022 16:38:23 +0000
Module Name: src
Committed By: skrll
Date: Thu Jun 9 16:38:23 UTC 2022
Modified Files:
src/sys/arch/hppa/hppa: pmap.c trap.c
Log Message:
Handle 'NA' (non-access) traps for the lpa and probe instructions. The
change is inspired by OpenBSD with a bunch of my own, mainly stylistic,
changes.
Thanks to Tom Lane for the analysis.
PR/56118: sporadic app crashes in HPPA -current
To generate a diff of this commit:
cvs rdiff -u -r1.117 -r1.118 src/sys/arch/hppa/hppa/pmap.c
cvs rdiff -u -r1.118 -r1.119 src/sys/arch/hppa/hppa/trap.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/56118 CVS commit: [netbsd-9] src/sys/arch/hppa/hppa
Date: Fri, 10 Jun 2022 16:28:16 +0000
Module Name: src
Committed By: martin
Date: Fri Jun 10 16:28:16 UTC 2022
Modified Files:
src/sys/arch/hppa/hppa [netbsd-9]: trap.c
Log Message:
Pull up following revision(s) (requested by skrll in ticket #1466):
sys/arch/hppa/hppa/trap.c: revision 1.119
Handle 'NA' (non-access) traps for the lpa and probe instructions. The
change is inspired by OpenBSD with a bunch of my own, mainly stylistic,
changes.
Thanks to Tom Lane for the analysis.
PR/56118: sporadic app crashes in HPPA -current
To generate a diff of this commit:
cvs rdiff -u -r1.111.4.1 -r1.111.4.2 src/sys/arch/hppa/hppa/trap.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: analyzed->closed
State-Changed-By: skrll@NetBSD.org
State-Changed-When: Mon, 29 Aug 2022 14:50:13 +0000
State-Changed-Why:
Problem fixed.
>Unformatted:
-current from Apr-14-2021, +- 2 days
Copyright © 1994-2020
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.