NetBSD Problem Report #56118

From chris@groessler.org  Tue Apr 20 16:32:22 2021
Return-Path: <chris@groessler.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 2D7831A9247
	for <gnats-bugs@gnats.NetBSD.org>; Tue, 20 Apr 2021 16:32:22 +0000 (UTC)
Message-Id: <20210420151505.83DDFF1610@hppa.groessler.org>
Date: Tue, 20 Apr 2021 17:15:05 +0200 (CEST)
From: chris@groessler.org
Reply-To: chris@groessler.org
To: gnats-bugs@NetBSD.org
Subject: sporadic app crashes in HPPA -current 
X-Send-Pr-Version: 3.95

>Number:         56118
>Category:       port-hppa
>Synopsis:       sporadic app crashes in HPPA -current
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    port-hppa-maintainer
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Apr 20 16:35:00 +0000 2021
>Closed-Date:    Mon Aug 29 14:50:13 +0000 2022
>Last-Modified:  Mon Aug 29 14:50:13 +0000 2022
>Originator:     Christian Groessler <chris@groessler.org>
>Release:        NetBSD 9.99.81
>Organization:
private

>Environment:


System: NetBSD hppa.groessler.org 9.99.81 NetBSD 9.99.81 (CPGHPPA) #0: Wed Apr 14 23:36:28 CEST 2021 chris@blitz:/data/home/chris/tmp/netbsd/obj/data/home/chris/tmp/netbsd/src/sys/arch/hppa/compile/CPGHPPA hppa
Architecture: hppa
Machine: hppa
>Description:
	'makemandb', run by the 'daily' script, crashes with SIGSEGV. Running it in gdb gives:

hppa# gdb /usr/sbin/makemandb #-Q
GNU gdb (GDB) 11.0.50.20200914-git
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "hppa--netbsd".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/sbin/makemandb...
Reading symbols from /usr/libdata/debug//usr/sbin/makemandb.debug...
(gdb) set args -Q
(gdb) r
Starting program: /usr/sbin/makemandb -Q
[New process 29082]

Thread 1 "" received signal SIGSEGV, Segmentation fault.
0x00035228 in __canonicalize_funcptr_for_compare ()
(gdb) bt
#0  0x00035228 in __canonicalize_funcptr_for_compare ()
#1  0x00013650 in mdoc_parse_Sh (n=<optimized out>, rec=0xb0001708) at /data/home/chris/tmp/netbsd/src/usr.sbin/makemandb/makemandb.c:1082
#2  0x00013670 in mdoc_parse_Sh (n=<optimized out>, rec=0xb0001708) at /data/home/chris/tmp/netbsd/src/usr.sbin/makemandb/makemandb.c:1098
#3  0x00012bac in proff_node (n=<optimized out>, rec=0xb0001708, roff=0xafe97060, func=0x38248 <mdocs>) at /data/home/chris/tmp/netbsd/src/usr.sbin/makemandb/makemandb.c:1171
#4  0x00036ac8 in begin_parse (fd=4, rec=0xb0001708, mp=0xafe9a000, file=0xaf952bb0 "/usr/share/man/man9lua/systm.9lua") at /data/home/chris/tmp/netbsd/src/usr.sbin/makemandb/makemandb.c:892
#5  update_db (rec=0xb0001708, mp=0xafe9a000, db=0xafeab208) at /data/home/chris/tmp/netbsd/src/usr.sbin/makemandb/makemandb.c:825
#6  main (argc=<optimized out>, argv=<optimized out>) at /data/home/chris/tmp/netbsd/src/usr.sbin/makemandb/makemandb.c:434
(gdb) x/4i $pc
=> 0x35228 <__canonicalize_funcptr_for_compare+48>: probei,r (r3),3,r20
   0x3522c <__canonicalize_funcptr_for_compare+52>: cmpiclr,<> 0,r20,r0
   0x35230 <__canonicalize_funcptr_for_compare+56>: b,l,n 0x3527c <__canonicalize_funcptr_for_compare+132>,r0
   0x35234 <__canonicalize_funcptr_for_compare+60>: ldw 0(r3),r20
(gdb) inf reg r3 r20
r3             0x6c696e68          1818848872
r20            0xfff               4095
(gdb)

I've also built bash from pkgsrc/shells/bash, and this crashes at startup. Don't know if this is related, but for completeness:

hppa# gdb /usr/pkg/bin/bash
GNU gdb (GDB) 11.0.50.20200914-git
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "hppa--netbsd".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/pkg/bin/bash...
(No debugging symbols found in /usr/pkg/bin/bash)
(gdb) r
Starting program: /usr/pkg/bin/bash 

Program received signal SIGSEGV, Segmentation fault.
0x000799fc in hash_search ()
(gdb) bt 
#0  0x000799fc in hash_search ()
#1  0x0004e8a0 in find_tempenv_variable ()
#2  0x000c17b4 in ?? ()
#3  0xaf12c168 in jemalloc_secure_getenv (name=0xaf1d9624 "MALLOC_CONF") at /data/home/chris/tmp/netbsd/src/external/bsd/jemalloc/lib/../dist/src/jemalloc.c:744
#4  malloc_conf_init () at /data/home/chris/tmp/netbsd/src/external/bsd/jemalloc/lib/../dist/src/jemalloc.c:970
#5  0xaf12d3ac in malloc_init_hard_a0_locked () at /data/home/chris/tmp/netbsd/src/external/bsd/jemalloc/lib/../dist/src/jemalloc.c:1318
#6  0xaf12db20 in malloc_init_hard () at /data/home/chris/tmp/netbsd/src/external/bsd/jemalloc/lib/../dist/src/jemalloc.c:1554
#7  0xaf1cc1d0 in je_prof_thread_name_set () from /usr/lib/libc.so.12
#8  0xaf03fe0c in ?? () from /usr/lib/libc.so.12
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) x/4i $pc
=> 0x799fc <hash_search+496>:       ldw 8(r5),ret0
   0x79a00 <hash_search+500>:       cmpib,<>,n 0,ret0,0x7984c <hash_search+64>
   0x79a04 <hash_search+504>:       ldb 0(r6),r8
   0x79a08 <hash_search+508>:       b,l 0x7990c <hash_search+256>,r0
(gdb) inf reg r5
r5             0xe84f1304          3897496324
(gdb) 


My custom config file (CPGHPPA) just hard-codes the root partition.
[~/tmp/netbsd/src/sys/arch/hppa/conf]$ diff -u GENERIC CPGHPPA 
--- GENERIC     2021-01-26 12:25:42.614136556 +0100
+++ CPGHPPA     2021-01-26 12:23:37.911318089 +0100
@@ -1,4 +1,4 @@
-# $NetBSD: GENERIC,v 1.37 2021/01/21 06:51:55 nia Exp $
+# $NetBSD: GENERIC,v 1.36 2020/09/27 13:48:51 roy Exp $
 #
 # GENERIC machine description file
 #
@@ -23,7 +23,7 @@
 options        INCLUDE_CONFIG_FILE     # embed config file in kernel binary
 options        SYSCTL_INCLUDE_DESCR    # Include sysctl descriptions in kernel

-#ident                 "GENERIC-$Revision: 1.37 $"
+#ident                 "GENERIC-$Revision: 1.36 $"

 maxusers       32              # estimated number of users

@@ -80,7 +80,6 @@
 include        "conf/compat_netbsd20.config"

 #options       COMPAT_LINUX    # binary compatibility with Linux
-#options       COMPAT_OSSAUDIO # binary compatibility with Linux

 # File systems
 file-system    FFS             # UFS
@@ -193,7 +192,8 @@
 #options       VGA_RASTERCONSOLE

 # Kernel root file system and dump configuration.
-config         netbsd  root on ? type ?
+config         netbsd  root on sd0d type ffs
+#config                netbsd  root on ? type ?
 #config                netbsd  root on sd0a type ffs
 #config                netbsd  root on ? type nfs


>How-To-Repeat:
	Build -current for HPPA and run 'makemandb -Q'.
>Fix:


>Release-Note:

>Audit-Trail:
From: coypu@sdf.org
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-hppa/56118: sporadic app crashes in HPPA -current
Date: Wed, 21 Apr 2021 11:01:09 +0000

 Is this with GCC 10?

From: Christian Groessler <chris@groessler.org>
To: gnats-bugs@netbsd.org, port-hppa-maintainer@netbsd.org,
        gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc: 
Subject: Re: port-hppa/56118: sporadic app crashes in HPPA -current
Date: Wed, 21 Apr 2021 14:42:16 +0200

 No, 9.3.0


 hppa$ cc --version
 cc (nb1 20200907) 9.3.0
 Copyright (C) 2019 Free Software Foundation, Inc.
 This is free software; see the source for copying conditions. There is NO
 warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.


 The system was cross-compiled from amd64 Linux:

 [~/tmp/netbsd]$ ./tools/bin/hppa--netbsd-gcc --version
 hppa--netbsd-gcc (NetBSD nb1 20200907) 9.3.0
 Copyright (C) 2019 Free Software Foundation, Inc.
 This is free software; see the source for copying conditions. There is NO
 warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.


 On 4/21/21 1:05 PM, coypu@sdf.org wrote:
 > The following reply was made to PR port-hppa/56118; it has been noted by GNATS.
 >
 > From: coypu@sdf.org
 > To: gnats-bugs@netbsd.org
 > Cc:
 > Subject: Re: port-hppa/56118: sporadic app crashes in HPPA -current
 > Date: Wed, 21 Apr 2021 11:01:09 +0000
 >
 >   Is this with GCC 10?

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-hppa/56118: sporadic app crashes in HPPA -current
Date: Sat, 10 Jul 2021 04:28:57 +0000

 On Tue, Apr 20, 2021 at 04:35:00PM +0000, chris@groessler.org wrote:
  > #0  0x00035228 in __canonicalize_funcptr_for_compare ()

 The name of that function scares me.

 Seems that it comes from libgcc and it says

 /* WARNING: The code is this function depends on internal and undocumented
    details of the GNU linker and dynamic loader as implemented for parisc
    linux.  */

  > #1  0x00013650 in mdoc_parse_Sh (n=<optimized out>, rec=0xb0001708) at /data/home/chris/tmp/netbsd/src/usr.sbin/makemandb/makemandb.c:1082

 and sure enough, this is comparing function pointers.

 I don't know enough about pa-risc to have any idea what it's about,
 but probably something our dynamic linker's doing doesn't match the
 expectations of the libgcc code.

 -- 
 David A. Holland
 dholland@netbsd.org

From: Nick Hudson <nick.hudson@gmx.co.uk>
To: gnats-bugs@netbsd.org, port-hppa-maintainer@netbsd.org,
 gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc: 
Subject: Re: port-hppa/56118: sporadic app crashes in HPPA -current
Date: Sun, 11 Jul 2021 06:50:40 +0100

 On 20/04/2021 17:35, chris@groessler.org wrote:

 >
 > Thread 1 "" received signal SIGSEGV, Segmentation fault.
 > 0x00035228 in __canonicalize_funcptr_for_compare ()
 > (gdb) bt
 > #0  0x00035228 in __canonicalize_funcptr_for_compare ()
 > #1  0x00013650 in mdoc_parse_Sh (n=<optimized out>, rec=0xb0001708) at /data/home/chris/tmp/netbsd/src/usr.sbin/makemandb/makemandb.c:1082
 > #2  0x00013670 in mdoc_parse_Sh (n=<optimized out>, rec=0xb0001708) at /data/home/chris/tmp/netbsd/src/usr.sbin/makemandb/makemandb.c:1098
 > #3  0x00012bac in proff_node (n=<optimized out>, rec=0xb0001708, roff=0xafe97060, func=0x38248 <mdocs>) at /data/home/chris/tmp/netbsd/src/usr.sbin/makemandb/makemandb.c:1171
 > #4  0x00036ac8 in begin_parse (fd=4, rec=0xb0001708, mp=0xafe9a000, file=0xaf952bb0 "/usr/share/man/man9lua/systm.9lua") at /data/home/chris/tmp/netbsd/src/usr.sbin/makemandb/makemandb.c:892
 > #5  update_db (rec=0xb0001708, mp=0xafe9a000, db=0xafeab208) at /data/home/chris/tmp/netbsd/src/usr.sbin/makemandb/makemandb.c:825
 > #6  main (argc=<optimized out>, argv=<optimized out>) at /data/home/chris/tmp/netbsd/src/usr.sbin/makemandb/makemandb.c:434
 > (gdb) x/4i $pc
 > => 0x35228 <__canonicalize_funcptr_for_compare+48>: probei,r (r3),3,r20
 >     0x3522c <__canonicalize_funcptr_for_compare+52>: cmpiclr,<> 0,r20,r0
 >     0x35230 <__canonicalize_funcptr_for_compare+56>: b,l,n 0x3527c <__canonicalize_funcptr_for_compare+132>,r0
 >     0x35234 <__canonicalize_funcptr_for_compare+60>: ldw 0(r3),r20
 > (gdb) inf reg r3 r20
 > r3             0x6c696e68          1818848872

 This looks more like 'h' 'n' 'i' 'l' than a userland address.

 It'd be good to see the preceding assembly to find out where r3 is coming from.
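
 As a quick check of that reading, the sketch below splits a 32-bit register value into its four bytes, most significant first, which is the order they occupy in memory on big-endian HPPA. The helper name word_to_ascii is made up for illustration and does not come from the PR.

```c
#include <ctype.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the PR): write the four bytes of a
 * 32-bit word into out[], most significant byte first, replacing
 * non-printable bytes with '.'.  out must hold at least 5 chars.
 */
static void
word_to_ascii(uint32_t word, char out[5])
{
	for (int i = 0; i < 4; i++) {
		unsigned char c = (unsigned char)(word >> (24 - 8 * i));

		out[i] = isprint(c) ? (char)c : '.';
	}
	out[4] = '\0';
}
```

 Called with the r3 value 0x6c696e68, this yields the string "linh", i.e. the same four letters in memory order, which fits the idea that r3 holds a fragment of text rather than a pointer.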

 [snip]

 > Starting program: /usr/pkg/bin/bash
 >
 > Program received signal SIGSEGV, Segmentation fault.
 > 0x000799fc in hash_search ()
 > (gdb) bt
 > #0  0x000799fc in hash_search ()
 > #1  0x0004e8a0 in find_tempenv_variable ()
 > #2  0x000c17b4 in ?? ()
 > #3  0xaf12c168 in jemalloc_secure_getenv (name=0xaf1d9624 "MALLOC_CONF") at /data/home/chris/tmp/netbsd/src/external/bsd/jemalloc/lib/../dist/src/jemalloc.c:744
 > #4  malloc_conf_init () at /data/home/chris/tmp/netbsd/src/external/bsd/jemalloc/lib/../dist/src/jemalloc.c:970
 > #5  0xaf12d3ac in malloc_init_hard_a0_locked () at /data/home/chris/tmp/netbsd/src/external/bsd/jemalloc/lib/../dist/src/jemalloc.c:1318
 > #6  0xaf12db20 in malloc_init_hard () at /data/home/chris/tmp/netbsd/src/external/bsd/jemalloc/lib/../dist/src/jemalloc.c:1554
 > #7  0xaf1cc1d0 in je_prof_thread_name_set () from /usr/lib/libc.so.12
 > #8  0xaf03fe0c in ?? () from /usr/lib/libc.so.12
 > Backtrace stopped: previous frame identical to this frame (corrupt stack?)
 > (gdb) x/4i $pc
 > => 0x799fc <hash_search+496>:       ldw 8(r5),ret0
 >     0x79a00 <hash_search+500>:       cmpib,<>,n 0,ret0,0x7984c <hash_search+64>
 >     0x79a04 <hash_search+504>:       ldb 0(r6),r8
 >     0x79a08 <hash_search+508>:       b,l 0x7990c <hash_search+256>,r0
 > (gdb) inf reg r5
 > r5             0xe84f1304          3897496324
 > (gdb)

 hmm, general heap corruption? I'll try locally.


 Nick

From: Nick Hudson <nick.hudson@gmx.co.uk>
To: gnats-bugs@netbsd.org, port-hppa-maintainer@netbsd.org,
 gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc: 
Subject: Re: port-hppa/56118: sporadic app crashes in HPPA -current
Date: Fri, 16 Jul 2021 08:43:36 +0100

 > I've also built bash from pkgsrc/shells/bash, and this crashes at startup. Don't know if this is related, but for completeness:

 [snip]

 The bash problem is kinda interesting. bash provides its own getenv and
 this is being used by early initialization code in libc / jemalloc when
 it does getenv("MALLOC_CONF") before _start has been called in the main
 program. _start sets the dp register which needs to contain the GOT of
 the main program...

 This diff allows bash to start.

 I'm still thinking about if it's the right thing to do.

 Nick


 Index: arch/hppa/hppa_reloc.c
 ===================================================================
 RCS file: /cvsroot/src/libexec/ld.elf_so/arch/hppa/hppa_reloc.c,v
 retrieving revision 1.47
 diff -u -p -r1.47 hppa_reloc.c
 --- arch/hppa/hppa_reloc.c	16 May 2020 16:43:00 -0000	1.47
 +++ arch/hppa/hppa_reloc.c	16 Jul 2021 06:56:51 -0000
 @@ -52,6 +52,7 @@ __RCSID("$NetBSD: hppa_reloc.c,v 1.47 20
  caddr_t _rtld_bind(const Obj_Entry *, const Elf_Addr);
  void _rtld_bind_start(void);
  void __rtld_setup_hppa_pltgot(const Obj_Entry *, Elf_Addr *);
 +void _rtld_set_dp(Elf_Addr *);
  
  /*
   * It is possible for the compiler to emit relocations for unaligned data.
 @@ -381,6 +382,12 @@ _rtld_setup_pltgot(const Obj_Entry *obj)
  {
  	Elf_Word *got = obj->pltgot;
  
 +
 +	if (obj->mainprog) {
 +		dbg(("setting DP to %p", obj->pltgot);
 +		_rtld_set_dp(obj->pltgot);
 +	}
 +
  	assert(got[-2] == PLT_STUB_MAGIC1);
  	assert(got[-1] == PLT_STUB_MAGIC2);
  
 Index: arch/hppa/rtld_start.S
 ===================================================================
 RCS file: /cvsroot/src/libexec/ld.elf_so/arch/hppa/rtld_start.S,v
 retrieving revision 1.13
 diff -u -p -r1.13 rtld_start.S
 --- arch/hppa/rtld_start.S	10 May 2020 06:42:38 -0000	1.13
 +++ arch/hppa/rtld_start.S	16 Jul 2021 06:56:51 -0000
 @@ -231,3 +231,9 @@ ENTRY(_rtld_bind_start,HPPA_FRAME_SIZE)
  	bv	%r0(%r21)
  	nop
  EXIT(_rtld_bind_start)
 +
 +
 +LEAF_ENTRY_NOPROFILE(_rtld_set_dp)
 +	bv	%r0(%rp)
 +	 copy	%arg0, %dp
 +EXIT(_rtld_set_dp)

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-hppa/56118: sporadic app crashes in HPPA -current
Date: Sat, 17 Jul 2021 03:44:25 +0000

 On Fri, Jul 16, 2021 at 07:45:01AM +0000, Nick Hudson wrote:
  >  The bash problem is kinda interesting. bash provides its own getenv and
  >  this is being used by early initialization code in libc / jemalloc when
  >  it does getenv("MALLOC_CONF") before _start has been called in the main
  >  program. _start sets the dp register which needs to contain the GOT of
  >  the main program...
  >  
  >  This diff allows bash to start.
  >  
  >  I'm still thinking about if it's the right thing to do.

 as I said in chat I kinda think it is -- while according to standards
 randomly replacing bits of libc is undefined, according to tradition
 it's supposed to work, so if we're going to be calling getenv before
 _start that initialization had better happen before _start too. :-|
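
 The ordering hazard can be modelled in portable C. The sketch below is only an analogy under stated assumptions: sketch_dp stands in for the dp register, sketch_set_dp for whatever sets it (_start today, the dynamic linker with the proposed diff), and sketch_getenv for a program-supplied getenv that reaches its data through dp. All names and the table contents are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for data the main program reaches through dp. */
struct sketch_env {
	const char *name;
	const char *value;
};

static const struct sketch_env sketch_table[] = {
	{ "MALLOC_CONF", "narenas:1" },	/* illustrative entry only */
};

/* Models the dp register: NULL until "startup" has run. */
static const struct sketch_env *sketch_dp;

/* Models the step that loads dp with the main program's data pointer. */
static void
sketch_set_dp(void)
{
	sketch_dp = sketch_table;
}

/*
 * Models a program-supplied getenv that finds its table via dp.
 * On real hardware a stale dp would mean a wild load; here that is
 * reported as a failed lookup (NULL) so the sketch stays well-defined.
 */
static const char *
sketch_getenv(const char *name)
{
	size_t n = sizeof(sketch_table) / sizeof(sketch_table[0]);

	if (sketch_dp == NULL)
		return NULL;
	for (size_t i = 0; i < n; i++)
		if (strcmp(sketch_dp[i].name, name) == 0)
			return sketch_dp[i].value;
	return NULL;
}
```

 In this model a lookup made before sketch_set_dp() fails and the same lookup afterwards succeeds, which is the reason for moving the dp setup ahead of any code that may call the program's getenv.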

 I would mark it as ugly and explain why it needs to be there, and
 maybe revisit it if the situation improves at some point (not that
 this is too likely) but go ahead.

 (also because gnats sucks I put a decoded copy here:
 https://www.netbsd.org/~dholland/gnatsblobs/56118/diff)

 -- 
 David A. Holland
 dholland@netbsd.org

From: Christian Groessler <chris@groessler.org>
To: gnats-bugs@netbsd.org, port-hppa-maintainer@netbsd.org,
        gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc: 
Subject: Re: port-hppa/56118: sporadic app crashes in HPPA -current
Date: Tue, 27 Jul 2021 14:39:43 +0200

 I've tried a variation of the diff, since it didn't work for me (missing 
 closing bracket):

 Index: hppa_reloc.c
 ===================================================================
 RCS file: 
 /nfs/swamp/zeug/netbsd-rsync/main/src/libexec/ld.elf_so/arch/hppa/hppa_reloc.c,v
 retrieving revision 1.47
 diff -u -r1.47 hppa_reloc.c
 --- hppa_reloc.c        16 May 2020 16:43:00 -0000      1.47
 +++ hppa_reloc.c        27 Jul 2021 12:37:23 -0000
 @@ -52,6 +52,7 @@
   caddr_t _rtld_bind(const Obj_Entry *, const Elf_Addr);
   void _rtld_bind_start(void);
   void __rtld_setup_hppa_pltgot(const Obj_Entry *, Elf_Addr *);
 +void _rtld_set_dp(Elf_Addr *);

   /*
    * It is possible for the compiler to emit relocations for unaligned data.
 @@ -381,6 +382,12 @@
   {
          Elf_Word *got = obj->pltgot;

 +
 +       if (obj->mainprog) {
 +               hdbg(("setting DP to %p", obj->pltgot));
 +               _rtld_set_dp(obj->pltgot);
 +       }
 +
          assert(got[-2] == PLT_STUB_MAGIC1);
          assert(got[-1] == PLT_STUB_MAGIC2);

 Index: rtld_start.S
 ===================================================================
 RCS file: 
 /nfs/swamp/zeug/netbsd-rsync/main/src/libexec/ld.elf_so/arch/hppa/rtld_start.S,v
 retrieving revision 1.13
 diff -u -r1.13 rtld_start.S
 --- rtld_start.S        10 May 2020 06:42:38 -0000      1.13
 +++ rtld_start.S        27 Jul 2021 12:37:23 -0000
 @@ -231,3 +231,9 @@
          bv      %r0(%r21)
          nop
   EXIT(_rtld_bind_start)
 +
 +
 +LEAF_ENTRY_NOPROFILE(_rtld_set_dp)
 +       bv      %r0(%rp)
 +        copy   %arg0, %dp
 +EXIT(_rtld_set_dp)



 But now it's worse:

 hppa# cd /var/tmp/
 hppa# ls
 base.tgz         debug.tgz        games.tgz        kern-GENERIC.tgz 
 misc.tgz         rescue.tgz       text.tgz         xbase.tgz 
 xdebug.tgz       xfont.tgz
 comp.tgz         etc.tgz          kern-CPGHPPA.tgz man.tgz 
 modules.tgz      tests.tgz        vi.recover       xcomp.tgz 
 xetc.tgz         xserver.tgz
 hppa# ls -l /bin/ls
 -r-xr-xr-x  1 root  wheel  37780 Apr 14 23:35 /bin/ls
 hppa# tar -xzUf base.tgz -C /
 hppa# ls -l /bin/ls
 [1]   Segmentation fault (core dumped) ls -l /bin/ls
 hppa# hash -re
 hash: Illegal option -e
 hppa# hash -r
 hppa# ls -l /bin/ls
 [1]   Segmentation fault (core dumped) ls -l /bin/ls
 hppa# tar
 [1]   Segmentation fault (core dumped) tar


 regards,
 chris

From: "David H. Gutteridge" <david@gutteridge.ca>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-hppa/56118: sporadic app crashes in HPPA -current
Date: Mon, 28 Mar 2022 19:55:48 -0400

 Probably this isn't news to anyone on this PR, but it's still an issue
 with 9.99.95 kernel+userland. I just ran into this myself with some of
 the same reproducers. In my case, this is on a B180L+.

 Dave

From: Tom Lane <tgl@sss.pgh.pa.us>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-hppa/56118: sporadic app crashes in HPPA -current
Date: Sun, 05 Jun 2022 17:36:23 -0400

 chris@groessler.org writes:
 > Thread 1 "" received signal SIGSEGV, Segmentation fault.
 > 0x00035228 in __canonicalize_funcptr_for_compare ()
 > ...
 > (gdb) x/4i $pc
 > => 0x35228 <__canonicalize_funcptr_for_compare+48>: probei,r (r3),3,r20
 >    0x3522c <__canonicalize_funcptr_for_compare+52>: cmpiclr,<> 0,r20,r0
 >    0x35230 <__canonicalize_funcptr_for_compare+56>: b,l,n 0x3527c <__canonicalize_funcptr_for_compare+132>,r0
 >    0x35234 <__canonicalize_funcptr_for_compare+60>: ldw 0(r3),r20

 I poked into this a little bit.  I think what is happening is that
 makemandb assumes that it can do this function pointer comparison:

         } else if (mdocs[n->tok] == pmdoc_Xr) {

 without regard for whether "mdocs[n->tok]" is a valid pointer;
 as Nick Hudson points out downthread, the value being passed to
 __canonicalize_funcptr_for_compare looks a lot more like a bit of
 ASCII text than it does a pointer.

 Now, __canonicalize_funcptr_for_compare does try to defend itself
 against being given a bogus pointer, but it's crashing exactly where
 it tries to do that.  The PROBEI instruction corresponds to this bit
 in external/gpl3/gcc/dist/libgcc/config/pa/fptr.c:

   if (!_dl_read_access_allowed ((unsigned int)plabel))
     return (unsigned int) fptr;

 So seemingly, _dl_read_access_allowed fails on sufficiently-bogus
 pointers.  I looked up PROBEI in the HPPA Architecture Reference,
 and I found out that it only knows how to give the correct answer
 if the target address is in the data TLB table.  If it isn't (which
 of course it wouldn't be in this case), PROBEI causes a "non-access
 data TLB miss fault".  The manual then says:

     If this instruction causes a non-access data TLB miss
     fault/non-access data page fault, the operating system's handler is
     required to search its page tables for the given address. If found,
     it does the appropriate TLB insert and returns to the interrupting
     instruction. If not found, the handler must decode the target field
     of the instruction, set that GR to 0, set the IPSW[N] bit to 1, and
     return to the interrupting instruction.

 I don't have a good idea where to look in the NetBSD sources for
 HPPA TLB miss handling, but I will bet lunch that it is not handling
 this case correctly, because surely PROBEI will never actually try
 to dereference the target address, as the SIGSEGV trap would seem to
 indicate.  I suspect the TLB miss logic just treats any invalid
 address as SIGSEGV, without this special case for PROBEI.
 (Note there is also a PROBE instruction with the same requirement.)
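
 The rule quoted from the manual can be sketched as a small model. This is not the NetBSD trap handler: the frame layout, the sketch_ names, the assumed position of the target field (low five bits of the instruction word) and the assumed value of the IPSW N bit are illustrative assumptions that would need checking against the architecture manual and the port's headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified trap frame: 32 general registers plus IPSW. */
struct sketch_frame {
	uint32_t gr[32];
	uint32_t ipsw;
};

/* Assumed value of the nullify (N) bit in the IPSW. */
#define SKETCH_PSW_N			0x00200000u

/* Assumed: the target register field is the low five bits of PROBE/PROBEI. */
#define SKETCH_PROBE_TARGET(insn)	((insn) & 0x1fu)

/*
 * Model of the non-access data TLB miss rule for PROBE/PROBEI.
 * `mapped` stands in for the result of the page table search.
 * In both cases control returns to the interrupted instruction.
 */
static void
sketch_nonaccess_miss(struct sketch_frame *tf, uint32_t insn, bool mapped)
{
	if (mapped) {
		/* A real handler would insert the translation here. */
		return;
	}
	/* Not mapped: answer "no access" instead of raising SIGSEGV. */
	tf->gr[SKETCH_PROBE_TARGET(insn)] = 0;
	tf->ipsw |= SKETCH_PSW_N;
}
```

 For the instruction in the trace, probei,r (r3),3,r20, the target is r20, so a handler following this rule would zero r20 and nullify the probe, letting the library code see "no read access" rather than taking a fault.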

 So the direct cause of the crash looks to me to be missing kernel
 functionality.  However, I'm also fairly suspicious of makemandb's
 code, because it seems to me that comparing ASCII text to a function
 pointer is someday going to result in a false match and ensuing
 misbehavior.  I wonder if that is basically an uninitialized-variable
 problem.

 The crash in bash does seem quite unrelated.

 			regards, tom lane

From: Tom Lane <tgl@sss.pgh.pa.us>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-hppa/56118: sporadic app crashes in HPPA -current
Date: Sun, 05 Jun 2022 22:34:39 -0400

 I wrote:
 > I poked into this a little bit.  I think what is happening is that
 > makemandb assumes that it can do this function pointer comparison:
 >         } else if (mdocs[n->tok] == pmdoc_Xr) {
 > without regard for whether "mdocs[n->tok]" is a valid pointer;

 Actually ... this code fragment is utterly broken, and seemingly
 has been for a while.  mdocs[] is a constant array of function
 pointers:

 static  const proff_nf mdocs[MDOC_MAX - MDOC_Dd] = {
         NULL, /* Dd */
         NULL, /* Dt */

 and it is supposed to be indexed by token type minus MDOC_Dd,
 not just token type.  (If you're hoping that MDOC_Dd is zero,
 it ain't.)  So we're not even indexing the array correctly,
 nor do we have any guards for an out-of-range token type,
 which is how come we're managing to pass garbage to
 __canonicalize_funcptr_for_compare.  The one other usage of
 mdocs[] in proff_node() gets these indexing considerations right.
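
 To make the indexing problem concrete, here is a minimal standalone sketch with made-up token values and handler names (nothing below is taken from makemandb.c or mandoc). It shows a table that is sized and indexed by token minus the first token, with a range check before the lookup.

```c
#include <stddef.h>

/* Made-up token numbering: like MDOC_Dd, the first macro token is not 0. */
enum sketch_tok { SK_FIRST = 100, SK_Dd = SK_FIRST, SK_Xr, SK_Pp, SK_MAX };

typedef void (*sketch_handler)(void);

static void sketch_xr(void) { }
static void sketch_pp(void) { }

/* Table sized and indexed by (token - SK_FIRST), as mdocs[] is. */
static const sketch_handler sketch_tab[SK_MAX - SK_FIRST] = {
	NULL,		/* Dd */
	sketch_xr,	/* Xr */
	sketch_pp,	/* Pp */
};

/* Return the handler for tok, or NULL when tok is outside the table. */
static sketch_handler
sketch_lookup(int tok)
{
	if (tok < SK_FIRST || tok >= SK_MAX)
		return NULL;
	return sketch_tab[tok - SK_FIRST];
}
```

 Indexing sketch_tab[tok] directly, without the subtraction and the range check, would read far past the end of the three-entry table for every valid token, which is the kind of access that hands garbage to the pointer comparison.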

 So a minimal fix would look like

 Index: makemandb.c
 ===================================================================
 RCS file: /cvsroot/src/usr.sbin/makemandb/makemandb.c,v
 retrieving revision 1.62
 diff -u -r1.62 makemandb.c
 --- makemandb.c	6 Apr 2022 03:23:38 -0000	1.62
 +++ makemandb.c	6 Jun 2022 02:14:22 -0000
 @@ -1078,14 +1078,16 @@

  	if (n->type == ROFFT_TEXT) {
  		mdoc_parse_section(n->sec, n->string, rec);
 -	} else if (mdocs[n->tok] == pmdoc_Xr) {
 +	} else if (n->tok >= MDOC_Dd && n->tok < MDOC_MAX &&
 +		   mdocs[n->tok - MDOC_Dd] == pmdoc_Xr) {
  		/*
  		 * When encountering other inline macros,
  		 * call pmdoc_macro_handler.
  		 */
  		pmdoc_macro_handler(n, rec, MDOC_Xr);
  		xr_found = 1;
 -	} else if (mdocs[n->tok] == pmdoc_Pp) {
 +	} else if (n->tok >= MDOC_Dd && n->tok < MDOC_MAX &&
 +		   mdocs[n->tok - MDOC_Dd] == pmdoc_Pp) {
  		pmdoc_macro_handler(n, rec, MDOC_Pp);
  	}

 I applied this version and I find that "makemandb -Q" completes now
 on my HPPA box, which it did not before.  However, I don't have a
 whole lot of faith in that being a 100% fix, because I don't think
 it's entirely guaranteed that the program's GOT is swapped into the
 TLB buffers when we reach this code.  Seeing that pmdoc_Xr and
 pmdoc_Pp appear only once in mdocs[], this code could be simplified
 without change of semantics to

 Index: makemandb.c
 ===================================================================
 RCS file: /cvsroot/src/usr.sbin/makemandb/makemandb.c,v
 retrieving revision 1.62
 diff -u -r1.62 makemandb.c
 --- makemandb.c	6 Apr 2022 03:23:38 -0000	1.62
 +++ makemandb.c	6 Jun 2022 02:17:21 -0000
 @@ -1078,14 +1078,14 @@

  	if (n->type == ROFFT_TEXT) {
  		mdoc_parse_section(n->sec, n->string, rec);
 -	} else if (mdocs[n->tok] == pmdoc_Xr) {
 +	} else if (n->tok == MDOC_Xr) {
  		/*
  		 * When encountering other inline macros,
  		 * call pmdoc_macro_handler.
  		 */
  		pmdoc_macro_handler(n, rec, MDOC_Xr);
  		xr_found = 1;
 -	} else if (mdocs[n->tok] == pmdoc_Pp) {
 +	} else if (n->tok == MDOC_Pp) {
  		pmdoc_macro_handler(n, rec, MDOC_Pp);
  	}

 and on the whole I'd recommend that coding.

 I wonder, however, why nobody has noticed that this code doesn't
 work as intended on any platform.  I don't know enough about roff
 to devise a test case for it, but maybe one is needed.

 Meanwhile, I still think that the HPPA kernel support for PROBE[I]
 is broken.

 			regards, tom lane

State-Changed-From-To: open->analyzed
State-Changed-By: skrll@NetBSD.org
State-Changed-When: Mon, 06 Jun 2022 09:40:50 +0000
State-Changed-Why:
After fixing some bugs along the way, this PR is now about the missing
probe[rw]{,i} functionality.


From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/56118 CVS commit: [netbsd-9] src/usr.sbin/makemandb
Date: Mon, 6 Jun 2022 11:34:12 +0000

 Module Name:	src
 Committed By:	martin
 Date:		Mon Jun  6 11:34:12 UTC 2022

 Modified Files:
 	src/usr.sbin/makemandb [netbsd-9]: makemandb.c

 Log Message:
 Pull up following revision(s) (requested by skrll in ticket #1465):

 	usr.sbin/makemandb/makemandb.c: revision 1.63

 Don't index outside the mdocs array of function pointers.  Analysis and
 suggested fixes from Tom Lane.  I played it safe and went with (my
 variation of) the minimal fix.

 PR port-hppa/56118: sporadic app crashes in HPPA -current


 To generate a diff of this commit:
 cvs rdiff -u -r1.60 -r1.60.2.1 src/usr.sbin/makemandb/makemandb.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Nick Hudson" <skrll@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/56118 CVS commit: src/sys/arch/hppa/hppa
Date: Thu, 9 Jun 2022 16:38:23 +0000

 Module Name:	src
 Committed By:	skrll
 Date:		Thu Jun  9 16:38:23 UTC 2022

 Modified Files:
 	src/sys/arch/hppa/hppa: pmap.c trap.c

 Log Message:
 Handle 'NA' (non-access) traps for the lpa and probe instructions.  The
 change is inspired by OpenBSD with a bunch of my own, mainly stylistic,
 changes.

 Thanks to Tom Lane for the analysis.

 PR/56118: sporadic app crashes in HPPA -current


 To generate a diff of this commit:
 cvs rdiff -u -r1.117 -r1.118 src/sys/arch/hppa/hppa/pmap.c
 cvs rdiff -u -r1.118 -r1.119 src/sys/arch/hppa/hppa/trap.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/56118 CVS commit: [netbsd-9] src/sys/arch/hppa/hppa
Date: Fri, 10 Jun 2022 16:28:16 +0000

 Module Name:	src
 Committed By:	martin
 Date:		Fri Jun 10 16:28:16 UTC 2022

 Modified Files:
 	src/sys/arch/hppa/hppa [netbsd-9]: trap.c

 Log Message:
 Pull up following revision(s) (requested by skrll in ticket #1466):

 	sys/arch/hppa/hppa/trap.c: revision 1.119

 Handle 'NA' (non-access) traps for the lpa and probe instructions.  The
 change is inspired by OpenBSD with a bunch of my own, mainly stylistic,
 changes.

 Thanks to Tom Lane for the analysis.

 PR/56118: sporadic app crashes in HPPA -current


 To generate a diff of this commit:
 cvs rdiff -u -r1.111.4.1 -r1.111.4.2 src/sys/arch/hppa/hppa/trap.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: analyzed->closed
State-Changed-By: skrll@NetBSD.org
State-Changed-When: Mon, 29 Aug 2022 14:50:13 +0000
State-Changed-Why:
Problem fixed.


>Unformatted:
 	-current from Apr-14-2021, +- 2 days
