NetBSD Problem Report #54307

From john@athena.zia.io  Tue Jun 18 17:29:59 2019
Return-Path: <john@athena.zia.io>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id CEBDB7A14B
	for <gnats-bugs@gnats.NetBSD.org>; Tue, 18 Jun 2019 17:29:58 +0000 (UTC)
Message-Id: <201906181621.x5IGL61k016433@athena.zia.io>
Date: Tue, 18 Jun 2019 16:21:06 GMT
From: john@ziaspace.com
Reply-To: john@ziaspace.com
To: gnats-bugs@NetBSD.org
Subject: Lots of jemalloc assertions in latest -current
X-Send-Pr-Version: 3.95

>Number:         54307
>Category:       port-alpha
>Synopsis:       Lots of jemalloc assertions in latest -current
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    rin
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Jun 18 17:30:00 +0000 2019
>Closed-Date:    Thu Oct 15 14:50:53 +0000 2020
>Last-Modified:  Tue Jul 06 12:25:01 +0000 2021
>Originator:     John Klos
>Release:        NetBSD 8.99.45
>Organization:

>Environment:


System: NetBSD athena.zia.io 8.99.45 NetBSD 8.99.45 (ATHENA-$Revision: 8.99f $) #0: Sun Jun 16 21:12:18 UTC 2019 john@athena.zia.io:/usr/obj-alpha/sys/arch/alpha/compile/ATHENA alpha
Architecture: alpha
Machine: alpha
>Description:

Try to compile, say, pkgsrc/devel/GConf or net/libsoup:

checking for msgfmt... /usr/pkgsrc/devel/GConf/work/.tools/bin/msgfmt
checking for gmsgfmt... /usr/pkgsrc/devel/GConf/work/.tools/bin/msgfmt
<jemalloc>: /usr/src/external/bsd/jemalloc/lib/../dist/src/rtree.c:205: Failed assertion: "!dependent || leaf != NULL"
[1]   Abort trap (core dumped) ${XGETTEXT} --version |
      Done(1)                 grep "(GNU " 2>/dev/null
configure: error: GNU gettext tools not found; required for intltool
*** Error code 1

>How-To-Repeat:

>Fix:


>Release-Note:

>Audit-Trail:
From: coypu@sdf.org
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-alpha/54307: Lots of jemalloc assertions in latest -current
Date: Tue, 6 Aug 2019 16:20:33 +0000

 Is this still an issue after the jemalloc changes since?

From: Rin Okuyama <rokuyama.rk@gmail.com>
To: "gnats-bugs@NetBSD.org" <gnats-bugs@NetBSD.org>
Cc: Christos Zoulas <christos@zoulas.com>
Subject: Re: port-alpha/54307: Lots of jemalloc assertions in latest -current
Date: Fri, 9 Aug 2019 16:59:02 +0900

 jemalloc is still broken for alpha. For example,

 ----
 # cd /usr/tests/bin; atf-run | atf-report
 Tests root: /usr/tests/bin

 atf-report: ERROR: 29: Unexpected token `<<EOF>>'; expected tps-count or
 atf-report: info field
 [1]   Segmentation fault (core dumped) atf-run |
        Done(1)                 atf-report
 ----

 This is probably because alpha has 43-bit virtual addresses, whereas
 jemalloc assumes 48 bits.

 The attached patch seems to fix the problem as far as I can see.
 At least pkgsrc/devel/GConf and net/libsoup, the packages named in
 the PR, build for me.

 Other 64-bit ports are probably affected too, even though the
 problems are not as visible as on alpha...

 Thanks,
 rin

 Index: external/bsd/jemalloc/include/jemalloc/internal/jemalloc_internal_defs.h
 ===================================================================
 RCS file: /home/netbsd/src/external/bsd/jemalloc/include/jemalloc/internal/jemalloc_internal_defs.h,v
 retrieving revision 1.10
 diff -p -u -r1.10 jemalloc_internal_defs.h
 --- external/bsd/jemalloc/include/jemalloc/internal/jemalloc_internal_defs.h	14 May 2019 16:22:09 -0000	1.10
 +++ external/bsd/jemalloc/include/jemalloc/internal/jemalloc_internal_defs.h	7 Aug 2019 15:30:59 -0000
 @@ -47,7 +47,11 @@
    */
   #ifdef _LP64
   /* XXX: I will take care of this later */
 +#  ifdef __alpha__
 +#define LG_VADDR 43	/* bit 42 indicates direct map, 42--63 are same */
 +#  else
   #define LG_VADDR 48
 +#  endif
   #else
   #define LG_VADDR 32
   #endif
 Index: external/bsd/jemalloc/lib/Makefile.inc
 ===================================================================
 RCS file: /home/netbsd/src/external/bsd/jemalloc/lib/Makefile.inc,v
 retrieving revision 1.10
 diff -p -u -r1.10 Makefile.inc
 --- external/bsd/jemalloc/lib/Makefile.inc	23 Jul 2019 06:31:20 -0000	1.10
 +++ external/bsd/jemalloc/lib/Makefile.inc	6 Aug 2019 01:13:00 -0000
 @@ -38,12 +38,14 @@ witness.c
   .SUFFIXES: .3
   .PATH.3: ${JEMALLOC}/dist/doc
   .for i in ${JEMALLOC_SRCS}
 -# helps in tracking bad malloc/pointer usage, but has a serious
 -# performance penalty:
 -#   CPPFLAGS.${i}+=-I${JEMALLOC}/include -DJEMALLOC_PROTECT_NOSTD -DJEMALLOC_DEBUG
   CPPFLAGS.${i}+=-I${JEMALLOC}/include -DJEMALLOC_PROTECT_NOSTD
   COPTS.${i}+= -fvisibility=hidden -funroll-loops
   COPTS.${i}+= ${${ACTIVE_CC} == "clang":? -Wno-atomic-alignment :}
 +.if ${MACHINE} == "alpha"
 +# helps in tracking bad malloc/pointer usage, but has a serious
 +# performance penalty:
 +CPPFLAGS.${i}+=-DJEMALLOC_DEBUG
 +.endif
   .endfor

   COPTS.background_thread.c+=-Wno-error=stack-protector
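
The effect of the LG_VADDR constant in the patch above can be illustrated with a minimal sketch. This is not jemalloc's actual rtree code: the helper names are hypothetical, and the 8 KB page size (13 offset bits) is an assumption about the alpha port. It only shows how the constant decides which bits of a 64-bit pointer count as lookup-key bits.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not jemalloc's real code: keep only the low
 * lg_vaddr bits of a 64-bit address, then drop the lg_page page-offset
 * bits. What remains is what a radix tree keyed on the address would use.
 */
static uint64_t
sketch_rtree_key(uint64_t addr, unsigned lg_vaddr, unsigned lg_page)
{
	assert(lg_page < lg_vaddr && lg_vaddr <= 64);
	uint64_t mask = (lg_vaddr == 64) ?
	    UINT64_MAX : ((UINT64_C(1) << lg_vaddr) - 1);
	return (addr & mask) >> lg_page;
}

/* Number of key bits the tree has to cover for a given address width. */
static unsigned
sketch_rtree_key_bits(unsigned lg_vaddr, unsigned lg_page)
{
	assert(lg_page < lg_vaddr);
	return lg_vaddr - lg_page;
}
```

Under these assumptions, 48 significant bits leave 35 key bits and 43 leave 30. A user address such as 0x3fffdb0de00 (taken from the backtrace later in this PR) produces the same key under both settings, so the sketch only shows what the constant controls, not why the assertion fired.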

From: "Christos Zoulas" <christos@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/54307 CVS commit: src/external/bsd/jemalloc/include/jemalloc/internal
Date: Fri, 9 Aug 2019 04:10:39 -0400

 Module Name:	src
 Committed By:	christos
 Date:		Fri Aug  9 08:10:39 UTC 2019

 Modified Files:
 	src/external/bsd/jemalloc/include/jemalloc/internal:
 	    jemalloc_internal_defs.h

 Log Message:
 PR/54307: Rin Okuyama: Lots of jemalloc assertions in latest -current


 To generate a diff of this commit:
 cvs rdiff -u -r1.10 -r1.11 \
     src/external/bsd/jemalloc/include/jemalloc/internal/jemalloc_internal_defs.h

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->needs-pullups
State-Changed-By: rin@NetBSD.org
State-Changed-When: Fri, 09 Aug 2019 08:20:51 +0000
State-Changed-Why:
Fix committed.

John, is it working for you?

If so, we have to pullup to netbsd-9 asap.


From: John Klos <john@ziaspace.com>
To: gnats-bugs@netbsd.org
Cc: port-alpha-maintainer@netbsd.org, netbsd-bugs@netbsd.org,
        gnats-admin@netbsd.org, rin@NetBSD.org
Subject: Re: port-alpha/54307 (Lots of jemalloc assertions in latest -current)
Date: Fri, 9 Aug 2019 19:45:14 +0000 (UTC)

 > John, is it working for you?
 >
 > If so, we have to pullup to netbsd-9 asap.

 Once a build of firefox52 is finished, I'll move the system from netbsd-8 
 to netbsd-9 or -current and test this. That should happen within a day or 
 so.

 John

From: John Klos <john@ziaspace.com>
To: Rin Okuyama <rokuyama.rk@gmail.com>
Cc: gnats-bugs@netbsd.org, port-alpha-maintainer@netbsd.org,
        gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: port-alpha/54307 (Lots of jemalloc assertions in latest -current)
Date: Sat, 10 Aug 2019 02:38:51 +0000 (UTC)

 -current from 9-August-2019, 22:04 UTC has tons of segmentation faults 
 from unaligned accesses:

 tar xpf comp.tar
 [ 279.9183081] pid 38 (tar): unaligned access: va=0x3fffded8944 pc=0x3fffd7d2c08 ra=0x3fffd7d2cc4 sp=0x1fffff258 op=ldq
 [ 279.9183081] pid 38 (tar): unaligned access: va=0x3fffded8944 pc=0x3fffd824870 ra=0x3fffd7d2ed4 sp=0x1fffff208 op=ldq
 [ 279.9183081] pid 38 (tar): unaligned access: va=0x3fffded894c pc=0x3fffd8248b8 ra=0x3fffd7d2ed4 sp=0x1fffff208 op=ldq
 [ 279.9192850] pid 38 (tar): unaligned access: va=0x3fffded8984 pc=0x3fffd8248f0 ra=0x0 sp=0x1fffff208 op=ldq
 [1]   Segmentation fault (core dumped) tar xpf comp.tar

 It's hardly usable.
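
For readers decoding the kernel messages above: ldq loads an 8-byte quadword and expects an 8-byte-aligned address. A minimal sketch of that check follows; the function name is hypothetical and is not taken from the NetBSD sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if va is a multiple of size, where size is
 * the access width in bytes and must be a power of two (8 for ldq).
 */
static bool
is_naturally_aligned(uint64_t va, uint64_t size)
{
	assert(size != 0 && (size & (size - 1)) == 0);
	return (va & (size - 1)) == 0;
}
```

Applied to the log, va=0x3fffded8944, va=0x3fffded894c and va=0x3fffded8984 are each 4 bytes past an 8-byte boundary, which is why every ldq is reported as unaligned.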

From: Rin Okuyama <rokuyama.rk@gmail.com>
To: John Klos <john@ziaspace.com>
Cc: gnats-bugs@netbsd.org, port-alpha-maintainer@netbsd.org,
 gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: port-alpha/54307 (Lots of jemalloc assertions in latest -current)
Date: Sat, 10 Aug 2019 13:15:58 +0900

 Hmm, I could not reproduce this failure on my DS10.

 My environment is

 - kernel around Aug 9
 - userland around Aug 3 + patched libc

 And

 ds10$ sysctl -a | grep unaligned
 machdep.unaligned_print = 1
 machdep.unaligned_fix = 1
 machdep.unaligned_sigbus = 0

 (Nothing added to /etc/sysctl.conf)

 ds10$ tar --version
 bsdtar 3.4.0 - libarchive 3.4.0 zlib/1.2.10 liblzma/5.2.4 bz2lib/1.0.8

 I will lose access to my DS10 for next few days. After that,
 I will try new kernel and userland.

 Thanks,
 rin

 On 2019/08/10 11:38, John Klos wrote:
 > -current from 9-August-2019, 22:04 UTC has tons of segmentation faults from unaligned accesses:
 > 
 > tar xpf comp.tar
 > [ 279.9183081] pid 38 (tar): unaligned access: va=0x3fffded8944 pc=0x3fffd7d2c08 ra=0x3fffd7d2cc4 sp=0x1fffff258 op=ldq
 > [ 279.9183081] pid 38 (tar): unaligned access: va=0x3fffded8944 pc=0x3fffd824870 ra=0x3fffd7d2ed4 sp=0x1fffff208 op=ldq
 > [ 279.9183081] pid 38 (tar): unaligned access: va=0x3fffded894c pc=0x3fffd8248b8 ra=0x3fffd7d2ed4 sp=0x1fffff208 op=ldq
 > [ 279.9192850] pid 38 (tar): unaligned access: va=0x3fffded8984 pc=0x3fffd8248f0 ra=0x0 sp=0x1fffff208 op=ldq
 > [1]   Segmentation fault (core dumped) tar xpf comp.tar
 > 
 > It's hardly usable.

From: John Klos <john@ziaspace.com>
To: gnats-bugs@netbsd.org
Cc: port-alpha-maintainer@netbsd.org, gnats-admin@netbsd.org,
        netbsd-bugs@netbsd.org
Subject: Re: port-alpha/54307 (Lots of jemalloc assertions in latest -current)
Date: Sat, 10 Aug 2019 18:32:21 +0000 (UTC)

 > ds10$ sysctl -a | grep unaligned
 > machdep.unaligned_print = 1
 > machdep.unaligned_fix = 1
 > machdep.unaligned_sigbus = 0

 Same.

 > ds10$ tar --version
 > bsdtar 3.4.0 - libarchive 3.4.0 zlib/1.2.10 liblzma/5.2.4 bz2lib/1.0.8
 >
 > I will lose access to my DS10 for next few days. After that,
 > I will try new kernel and userland.

 sets/base.tar.xz
 [  63.8202487] pid 1316 (tar): unaligned access: va=0x3fffd5a9006 pc=0x3fffd7d2c08 ra=0x3fffd7d2cc4 sp=0x1ffffeb48 op=ldq
 Segmentation fault (core dumped)
 sets/comp.tar.xz
 [  71.2329500] pid 1045 (tar): unaligned access: va=0x3fffd5a6c19 pc=0x3fffd7d2c08 ra=0x3fffd7d2cc4 sp=0x1ffffeb38 op=ldq
 Segmentation fault (core dumped)
 sets/games.tar.xz
 Segmentation fault (core dumped)
 sets/man.tar.xz
 Segmentation fault (core dumped)
 sets/misc.tar.xz
 [  80.5420379] pid 1295 (tar): unaligned access: va=0x3fffd5aa634 pc=0x3fffd7d2c08 ra=0x3fffd7d2cc4 sp=0x1ffffeb38 op=ldq
 Segmentation fault (core dumped)
 sets/modules.tar.xz
 Segmentation fault (core dumped)
 sets/tests.tar.xz
 Segmentation fault (core dumped)
 sets/text.tar.xz
 [  86.0944825] pid 1331 (tar): unaligned access: va=0x3fffd5a7871 pc=0x3fffd7d2c08 ra=0x3fffd7d2cc4 sp=0x1ffffeb38 op=ldq
 Segmentation fault (core dumped)
 sets/xbase.tar.xz
 Segmentation fault (core dumped)
 sets/xcomp.tar.xz
 Segmentation fault (core dumped)
 sets/xetc.tar.xz
 sets/xfont.tar.xz
 Segmentation fault (core dumped)
 sets/xserver.tar.xz
 [ 100.1787124] pid 1379 (tar): unaligned access: va=0x3fffd5aeb8e pc=0x3fffd7d2c08 ra=0x3fffd7d2cc4 sp=0x1ffffeb38 op=ldq
 Segmentation fault (core dumped)


 In addition, with DIAGNOSTIC:

 [ 2754.6460539] panic: kernel diagnostic assertion "l->l_stat == LSSTOP || l->l_stat == LSSUSPENDED" failed: file "/usr/current/src/sys/kern/kern_sleepq.c", line 128
 [ 2754.6460539] cpu0: Begin traceback...
 [ 2754.6460539] alpha trace requires known PC =eject=
 [ 2754.6460539] cpu0: End traceback...
 [ 2754.6460539] cpu1: shutting down...
 [ 2754.6460539] dumping to dev 4,1 offset 8388607


 I'll have to put this machine back on -8, but within a week or so I hope 
 to set up a system which can run -current indefinitely.

From: Rin Okuyama <rokuyama.rk@gmail.com>
To: John Klos <john@ziaspace.com>, gnats-bugs@netbsd.org
Cc: port-alpha-maintainer@netbsd.org, gnats-admin@netbsd.org,
 netbsd-bugs@netbsd.org
Subject: Re: port-alpha/54307 (Lots of jemalloc assertions in latest -current)
Date: Wed, 14 Aug 2019 11:01:27 +0900

 On alpha, jemalloc seems to work only when JEMALLOC_DEBUG is defined.

 If it is defined, everything works fine as far as I can see.
 Otherwise, /bin/sh dumps core when entering multiuser.

 With working libc,

 ds10# LD_PRELOAD=/lib/libc.so.12.213.orig sh /etc/rc
 [1]   Segmentation fault (core dumped) LD_PRELOAD=/lib/libc.so.12.213.orig sh /etc/rc
 ds10# gdb sh sh.core
 GNU gdb (GDB) 8.3
 Copyright (C) 2019 Free Software Foundation, Inc.
 License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
 This is free software: you are free to change and redistribute it.
 There is NO WARRANTY, to the extent permitted by law.
 Type "show copying" and "show warranty" for details.
 This GDB was configured as "alpha--netbsd".
 Type "show configuration" for configuration details.
 For bug reporting instructions, please see:
 <http://www.gnu.org/software/gdb/bugs/>.
 Find the GDB manual and other documentation resources online at:
      <http://www.gnu.org/software/gdb/documentation/>.

 For help, type "help".
 Type "apropos word" to search for commands related to "word"...
 Reading symbols from sh...
 Reading symbols from /usr/libdata/debug//bin/sh.debug...
 [New process 1]
 Core was generated by `sh'.
 Program terminated with signal SIGSEGV, Segmentation fault.
 #0  bitmap_unset (bit=0, binfo=<optimized out>,
      bitmap=0x3fffdd61b48 <free+984>)
      at /build/src/external/bsd/jemalloc/lib/../include/jemalloc/internal/bitmap.h:345
 345             *gp = g;
 (gdb) bt
 #0  bitmap_unset (bit=0, binfo=<optimized out>,
      bitmap=0x3fffdd61b48 <free+984>)
      at /build/src/external/bsd/jemalloc/lib/../include/jemalloc/internal/bitmap.h:345
 #1  arena_slab_reg_dalloc (ptr=<optimized out>,
      slab_data=0x3fffdd61b48 <free+984>, slab=0x3fffdd61b08 <free+920>)
      at /build/src/external/bsd/jemalloc/lib/../dist/src/arena.c:273
 #2  arena_dalloc_bin_locked_impl (tsdn=0x3fffdec2010, arena=0x3fffd6009c0,
      slab=0x3fffdd61b08 <free+920>, ptr=<optimized out>, junked=<optimized out>)
      at /build/src/external/bsd/jemalloc/lib/../dist/src/arena.c:1543
 #3  0x000003fffdd02ed4 in je_tcache_bin_flush_small (tsd=0x3fffdec2010,
      tcache=<optimized out>, tbin=0x3fffdec2360, binind=16, rem=<optimized out>)
      at /build/src/external/bsd/jemalloc/lib/../dist/src/tcache.c:149
 #4  0x000003fffdd61b08 in tcache_dalloc_small (slow_path=false, binind=0,
      ptr=0x3fffdb0de00, tcache=<optimized out>, tsd=<optimized out>)
      at /build/src/external/bsd/jemalloc/lib/../include/jemalloc/internal/tcache_inlines.h:178
 #5  arena_dalloc (slow_path=false, alloc_ctx=<synthetic pointer>,
      tcache=<optimized out>, ptr=0x3fffdb0de00, tsdn=<optimized out>)
      at /build/src/external/bsd/jemalloc/lib/../include/jemalloc/internal/arena_inlines_b.h:224
 #6  idalloctm (slow_path=false, is_internal=false,
      alloc_ctx=<synthetic pointer>, tcache=<optimized out>, ptr=0x3fffdb0de00,
      tsdn=<optimized out>)
      at /build/src/external/bsd/jemalloc/lib/../include/jemalloc/internal/jemalloc_internal_inlines_c.h:118
 #7  ifree (slow_path=false, tcache=<optimized out>, ptr=0x3fffdb0de00,
      tsd=0x3fffdec2010)
      at /build/src/external/bsd/jemalloc/lib/../dist/src/jemalloc.c:2255
 #8  free (ptr=0x3fffdb0de00)
      at /build/src/external/bsd/jemalloc/lib/../dist/src/jemalloc.c:2426
 #9  0x000000012001c404 in rststackmark (mark=0x1fffff358)
      at /build/src/bin/sh/memalloc.c:198
 #10 0x000000012001ba80 in cmdloop (top=<optimized out>)
      at /build/src/bin/sh/main.c:306
 #11 0x000000012000b548 in dotcmd (argc=<optimized out>, argv=<optimized out>)
      at /build/src/bin/sh/eval.c:1505
 #12 0x000000012000ad10 in evalcommand (cmd=<optimized out>,
      flgs=<optimized out>, backcmd=<optimized out>)
      at /build/src/bin/sh/eval.c:1314
 #13 0x0000000120008654 in evaltree (n=0x3fffdb80030, flags=<optimized out>)
      at /build/src/bin/sh/eval.c:357
 #14 0x000000012001bbd4 in cmdloop (top=<optimized out>)
      at /build/src/bin/sh/main.c:304
 #15 0x0000000120032650 in main (argc=<optimized out>, argv=<optimized out>)
      at /build/src/bin/sh/main.c:246
 (gdb)

 I have no ideas at the moment...

 On 2019/08/11 3:32, John Klos wrote:
 > In addition, with DIAGNOSTIC:
 > 
 > [ 2754.6460539] panic: kernel diagnostic assertion "l->l_stat == LSSTOP || l->l_stat == LSSUSPENDED" failed: file "/usr/current/src/sys/kern/kern_sleepq.c", line 128
 > [ 2754.6460539] cpu0: Begin traceback...
 > [ 2754.6460539] alpha trace requires known PC =eject=
 > [ 2754.6460539] cpu0: End traceback...
 > [ 2754.6460539] cpu1: shutting down...
 > [ 2754.6460539] dumping to dev 4,1 offset 8388607

 This may be a different problem. Please file another PR
 if you can reproduce this panic.

 Thanks,
 rin

State-Changed-From-To: needs-pullups->open
State-Changed-By: rin@NetBSD.org
State-Changed-When: Wed, 14 Aug 2019 14:00:52 +0000
State-Changed-Why:
Still needs further analysis.


From: "Rin Okuyama" <rin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/54307 CVS commit: src/external/bsd/jemalloc/lib
Date: Fri, 1 Nov 2019 20:53:10 +0000

 Module Name:	src
 Committed By:	rin
 Date:		Fri Nov  1 20:53:10 UTC 2019

 Modified Files:
 	src/external/bsd/jemalloc/lib: Makefile.inc

 Log Message:
 Workaround for random crash of userland binaries, as reported in
 PR port-alpha/54307.

 If rtree.c and tcache.c are compiled with -O0, userland just works
 without problems as far as I can see. Alternately, you can specify
 -DJEMALLOC_DEBUG to avoid random crash. Smells like compiler bug,
 or wrong coding which relies on some undefined behavior.

 Anyway, we need to pull this up into netbsd-9 asap.


 To generate a diff of this commit:
 cvs rdiff -u -r1.10 -r1.11 src/external/bsd/jemalloc/lib/Makefile.inc

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Rin Okuyama" <rin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/54307 CVS commit: src/doc
Date: Fri, 1 Nov 2019 20:55:56 +0000

 Module Name:	src
 Committed By:	rin
 Date:		Fri Nov  1 20:55:56 UTC 2019

 Modified Files:
 	src/doc: HACKS

 Log Message:
 Describe workaround for PR port-alpha/54307.


 To generate a diff of this commit:
 cvs rdiff -u -r1.194 -r1.195 src/doc/HACKS

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: christos@zoulas.com (Christos Zoulas)
To: gnats-bugs@netbsd.org, port-alpha-maintainer@netbsd.org, 
	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, john@ziaspace.com
Cc: 
Subject: Re: PR/54307 CVS commit: src/external/bsd/jemalloc/lib
Date: Fri, 1 Nov 2019 17:04:42 -0400

 On Nov 1,  8:55pm, rin@netbsd.org ("Rin Okuyama") wrote:
 -- Subject: PR/54307 CVS commit: src/external/bsd/jemalloc/lib

 |  Workaround for random crash of userland binaries, as reported in
 |  PR port-alpha/54307.
 |  
 |  If rtree.c and tcache.c are compiled with -O0, userland just works
 |  without problems as far as I can see. Alternately, you can specify
 |  -DJEMALLOC_DEBUG to avoid random crash. Smells like compiler bug,
 |  or wrong coding which relies on some undefined behavior.
 |  
 |  Anyway, we need to pull this up into netbsd-9 asap.

 HEAD builds fine with gcc-8. Can you test it (and if the bug is still there)?

 Thanks,

 christos

From: Rin Okuyama <rokuyama.rk@gmail.com>
To: Christos Zoulas <christos@zoulas.com>, gnats-bugs@netbsd.org
Cc: 
Subject: Re: PR/54307 CVS commit: src/external/bsd/jemalloc/lib
Date: Sat, 2 Nov 2019 06:35:03 +0900

 On 2019/11/02 6:04, Christos Zoulas wrote:
 > On Nov 1,  8:55pm, rin@netbsd.org ("Rin Okuyama") wrote:
 > -- Subject: PR/54307 CVS commit: src/external/bsd/jemalloc/lib
 > 
 > |  Workaround for random crash of userland binaries, as reported in
 > |  PR port-alpha/54307.
 > |
 > |  If rtree.c and tcache.c are compiled with -O0, userland just works
 > |  without problems as far as I can see. Alternately, you can specify
 > |  -DJEMALLOC_DEBUG to avoid random crash. Smells like compiler bug,
 > |  or wrong coding which relies on some undefined behavior.
 > |
 > |  Anyway, we need to pull this up into netbsd-9 asap.
 > 
 > HEAD builds fine with gcc-8. Can you test it (and if the bug is still there)?

 I tested it. These files need to be compiled with -O0 even with gcc-8.

 Can I send pullup request? Or would you examine further?

 Thanks,
 rin

From: Rin Okuyama <rokuyama.rk@gmail.com>
To: "gnats-bugs@NetBSD.org" <gnats-bugs@NetBSD.org>
Cc: 
Subject: Re: port-alpha/54307 (Lots of jemalloc assertions in latest -current)
Date: Sun, 3 Nov 2019 16:32:50 +0900

 -------- Forwarded Message --------
 Subject: CVS commit: src/doc
 Date: Sun, 3 Nov 2019 07:10:42 +0000
 From: Rin Okuyama <rin@netbsd.org>
 Reply-To: source-changes-d@NetBSD.org
 To: source-changes-full@NetBSD.org

 Module Name:	src
 Committed By:	rin
 Date:		Sun Nov  3 07:10:42 UTC 2019

 Modified Files:
 	src/doc: HACKS

 Log Message:
 Describe that both GCC 7.4 and 8.3 fail in the last entry.


 To generate a diff of this commit:
 cvs rdiff -u -r1.195 -r1.196 src/doc/HACKS

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.


State-Changed-From-To: open->pending-pullups
State-Changed-By: rin@NetBSD.org
State-Changed-When: Sun, 03 Nov 2019 07:39:45 +0000
State-Changed-Why:
[pullup-9 #392]
https://releng.netbsd.org/cgi-bin/req-9.cgi?show=392

I requested a pullup of the workaround to netbsd-9 for now.
Otherwise, NetBSD 9.0_BETA is not usable at all.


From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/54307 CVS commit: [netbsd-9] src
Date: Sun, 3 Nov 2019 11:41:58 +0000

 Module Name:	src
 Committed By:	martin
 Date:		Sun Nov  3 11:41:57 UTC 2019

 Modified Files:
 	src/doc [netbsd-9]: HACKS
 	src/external/bsd/jemalloc/include/jemalloc/internal [netbsd-9]:
 	    jemalloc_internal_defs.h
 	src/external/bsd/jemalloc/lib [netbsd-9]: Makefile.inc

 Log Message:
 Pull up following revision(s) (requested by rin in ticket #392):

 	doc/HACKS: revision 1.195
 	doc/HACKS: revision 1.196
 	external/bsd/jemalloc/lib/Makefile.inc: revision 1.11
 	external/bsd/jemalloc/include/jemalloc/internal/jemalloc_internal_defs.h: revision 1.11

 PR/54307: Rin Okuyama: Lots of jemalloc assertions in latest -current

 Workaround for random crash of userland binaries, as reported in
 PR port-alpha/54307.

 If rtree.c and tcache.c are compiled with -O0, userland just works
 without problems as far as I can see. Alternately, you can specify
 -DJEMALLOC_DEBUG to avoid random crash. Smells like compiler bug,
 or wrong coding which relies on some undefined behavior.

 Anyway, we need to pull this up into netbsd-9 asap.

 Describe workaround for PR port-alpha/54307.

 Describe that both GCC 7.4 and 8.3 fail in the last entry.


 To generate a diff of this commit:
 cvs rdiff -u -r1.190 -r1.190.2.1 src/doc/HACKS
 cvs rdiff -u -r1.10 -r1.10.4.1 \
     src/external/bsd/jemalloc/include/jemalloc/internal/jemalloc_internal_defs.h
 cvs rdiff -u -r1.10 -r1.10.2.1 src/external/bsd/jemalloc/lib/Makefile.inc

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: pending-pullups->analyzed
State-Changed-By: rin@NetBSD.org
State-Changed-When: Sun, 03 Nov 2019 11:50:34 +0000
State-Changed-Why:
Workaround pulled-up.
Still needs more analysis.


From: Christos Zoulas <christos@zoulas.com>
To: gnats-bugs@netbsd.org
Cc: port-alpha-maintainer@netbsd.org,
 gnats-admin@netbsd.org,
 netbsd-bugs@netbsd.org,
 john@ziaspace.com
Subject: Re: PR/54307 CVS commit: src/external/bsd/jemalloc/lib
Date: Sun, 3 Nov 2019 10:39:01 -0500

 Thank you!

 > On Nov 1, 2019, at 5:40 PM, Rin Okuyama <rokuyama.rk@gmail.com> wrote:
 >
 > The following reply was made to PR port-alpha/54307; it has been noted by GNATS.
 >
 > From: Rin Okuyama <rokuyama.rk@gmail.com>
 > To: Christos Zoulas <christos@zoulas.com>, gnats-bugs@netbsd.org
 > Cc:
 > Subject: Re: PR/54307 CVS commit: src/external/bsd/jemalloc/lib
 > Date: Sat, 2 Nov 2019 06:35:03 +0900
 >
 > On 2019/11/02 6:04, Christos Zoulas wrote:
 >> On Nov 1,  8:55pm, rin@netbsd.org ("Rin Okuyama") wrote:
 >> -- Subject: PR/54307 CVS commit: src/external/bsd/jemalloc/lib
 >>
 >> |  Workaround for random crash of userland binaries, as reported in
 >> |  PR port-alpha/54307.
 >> |
 >> |  If rtree.c and tcache.c are compiled with -O0, userland just works
 >> |  without problems as far as I can see. Alternately, you can specify
 >> |  -DJEMALLOC_DEBUG to avoid random crash. Smells like compiler bug,
 >> |  or wrong coding which relies on some undefined behavior.
 >> |
 >> |  Anyway, we need to pull this up into netbsd-9 asap.
 >>
 >> HEAD builds fine with gcc-8. Can you test it (and if the bug is still there)?
 >
 > I tested it. These files need to be compiled with -O0 even with gcc-8.
 >
 > Can I send pullup request? Or would you examine further?
 >
 > Thanks,
 > rin
 >

From: "Rin Okuyama" <rin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/54307 CVS commit: src
Date: Wed, 7 Oct 2020 07:35:28 +0000

 Module Name:	src
 Committed By:	rin
 Date:		Wed Oct  7 07:35:28 UTC 2020

 Modified Files:
 	src/doc: HACKS
 	src/external/bsd/jemalloc/lib: Makefile.inc

 Log Message:
 PR port-alpha/54307

 GCC 9.3 seems to be able to compile rtree.c with -O2:

 - No new regressions in ATF.
 - System survives over a night, at least, under heavy loads.

 On the other hand, unfortunately, GCC 9.3 still miscompiles tcache.c
 with -O2 or -O1. For example, even ``gcc -g hello.c'' fails with ICE
 if tcache.c is compiled with -O[12] in libc.


 To generate a diff of this commit:
 cvs rdiff -u -r1.212 -r1.213 src/doc/HACKS
 cvs rdiff -u -r1.11 -r1.12 src/external/bsd/jemalloc/lib/Makefile.inc

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

Responsible-Changed-From-To: port-alpha-maintainer->rin
Responsible-Changed-By: thorpej@NetBSD.org
Responsible-Changed-When: Tue, 13 Oct 2020 15:32:16 +0000
Responsible-Changed-Why:
Can we consider this PR resolved? We have the entry in HACKS tracking
future work.


State-Changed-From-To: analyzed->feedback
State-Changed-By: thorpej@NetBSD.org
State-Changed-When: Tue, 13 Oct 2020 15:32:16 +0000
State-Changed-Why:
Can we consider this PR resolved? We have the entry in HACKS tracking
future work.


State-Changed-From-To: feedback->closed
State-Changed-By: rin@NetBSD.org
State-Changed-When: Thu, 15 Oct 2020 14:50:53 +0000
State-Changed-Why:
The problem was worked around, and it is tracked by doc/HACKS.

I still suspect that this problem is more than a simple optimization
bug; something may be wrong in our configuration files for jemalloc or GCC.

However, I'm not ready to investigate further at the moment...


From: "Jason R Thorpe" <thorpej@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/54307 CVS commit: src/sys/arch/alpha
Date: Tue, 6 Jul 2021 12:20:52 +0000

 Module Name:	src
 Committed By:	thorpej
 Date:		Tue Jul  6 12:20:52 UTC 2021

 Modified Files:
 	src/sys/arch/alpha/alpha: vm_machdep.c
 	src/sys/arch/alpha/include: param.h

 Log Message:
 - Define STACK_ALIGNBYTES to override the default and ensure that
   stacks are 16-byte aligned, an assumption made by the compiler
   and recommended by the Alpha Architecture Handbook.
 - cpu_lwp_fork(): Ensure 16-byte stack alignment if the caller specified
   one.

 Addresses root cause of PR port-alpha/54307 and PR toolchain/56153.

 Many thanks to rin@ for performing the root cause analysis and testing
 changes.


 To generate a diff of this commit:
 cvs rdiff -u -r1.119 -r1.120 src/sys/arch/alpha/alpha/vm_machdep.c
 cvs rdiff -u -r1.48 -r1.49 src/sys/arch/alpha/include/param.h

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.
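
The 16-byte stack alignment described in the log message above can be sketched as follows. This is only an illustration under stated assumptions: the macro and function names are hypothetical, the alignment value is written as a mask (alignment minus one), and the code is not the actual content of param.h or vm_machdep.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name: mask for a 16-byte alignment requirement. */
#define SKETCH_STACK_ALIGNBYTES	(16 - 1)

/* True if sp already sits on a 16-byte boundary. */
static int
sketch_stack_is_aligned(uint64_t sp)
{
	return (sp & SKETCH_STACK_ALIGNBYTES) == 0;
}

/*
 * Round sp down to the next 16-byte boundary. Rounding down is the safe
 * direction for a stack that grows toward lower addresses.
 */
static uint64_t
sketch_stack_align_down(uint64_t sp)
{
	uint64_t aligned = sp & ~(uint64_t)SKETCH_STACK_ALIGNBYTES;
	assert(sketch_stack_is_aligned(aligned) && aligned <= sp);
	return aligned;
}
```

As a usage example, the value sp=0x1fffff258 logged earlier in this PR is 8 bytes past a 16-byte boundary, and the helper rounds it down to 0x1fffff250. Whether that particular logged value reflects the alignment bug is not established in this thread.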

>Unformatted:
