NetBSD Problem Report #50087

From martin@duskware.de  Sat Jul 25 14:09:31 2015
Return-Path: <martin@duskware.de>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
	by mollari.NetBSD.org (Postfix) with ESMTPS id CADCAA654B
	for <gnats-bugs@gnats.NetBSD.org>; Sat, 25 Jul 2015 14:09:31 +0000 (UTC)
From: martin@NetBSD.org
Reply-To: martin@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: all threaded programs crash on arm
X-Send-Pr-Version: 3.95

>Number:         50087
>Category:       port-arm
>Synopsis:       all threaded programs crash on arm
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    port-arm-maintainer
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Jul 25 14:10:02 +0000 2015
>Closed-Date:    Thu Jul 30 13:21:23 +0000 2015
>Last-Modified:  Tue Nov 24 17:40:00 +0000 2015
>Originator:     Martin Husemann
>Release:        NetBSD 7.99.20
>Organization:
The NetBSD Foundation, Inc.
>Environment:
System: NetBSD space-truckin.duskware.de 7.99.20 NetBSD 7.99.20 (CUBIETRUCK) #205: Sat Jul 25 15:38:26 CEST 2015 martin@night-owl.duskware.de:/usr/src/sys/arch/evbarm/compile/CUBIETRUCK evbarm
Architecture: earmv7hfeb
Machine: evbarm
>Description:

Update an arm machine to -current and then try:

cd /tmp
ls -l / | gzip -9 > test.gz

and watch:

assertion "pthread__tsd_destructors[key] != NULL" failed: file "/usr/src/lib/libpthread/pthread_tsd.c", line 179, function "pthread__add_specific"
[1]   Done                    ls -l / |
      Abort trap (core dumped) gzip -9 >test.gz

and gdb says:

Core was generated by `gzip'.
Program terminated with signal SIGABRT, Aborted.
#0  0x7bec5528 in kill () from /test/usr/lib/libc.so.12
(gdb) bt
#0  0x7bec5528 in kill () from /test/usr/lib/libc.so.12
#1  0x7bd95808 in pthread.assertfunc () from /test/usr/lib/libpthread.so.1
#2  0x7bd9ae68 in pthread.add_specific () from /test/usr/lib/libpthread.so.1
#3  0x7bec387c in ?? () from /test/usr/lib/libc.so.12
#4  0x7be4e100 in ?? () from /test/usr/lib/libc.so.12
#5  0x7be4e16c in malloc () from /test/usr/lib/libc.so.12
#6  0x00013000 in gz_compress (in=in@entry=0, out=out@entry=1, 
    gsizep=gsizep@entry=0x7fffca80, origname=origname@entry=0x15d5c "", 
    mtime=1437832711) at /usr/src/usr.bin/gzip/gzip.c:540
#7  0x000150f8 in handle_stdout () at /usr/src/usr.bin/gzip/gzip.c:1761
#8  0x00015c50 in main (argc=0, argv=<optimized out>)
    at /usr/src/usr.bin/gzip/gzip.c:384
(gdb) up
#2  0x7bd9ae68 in pthread.add_specific () from /test/usr/lib/libpthread.so.1
(gdb) info reg
r0             0x0      0
r1             0x0      0
r2             0x93     147
r3             0xfa76ced        262630637
r4             0x0      0
r5             0x7bff4c00       2080328704
r6             0x7bdaea04       2077944324
r7             0x279e8  162280
r8             0x7bf21884       2079463556
r9             0x0      0
r10            0x2      2
r11            0x7fffc954       2147469652
r12            0x7bec5524       2079085860
sp             0x7fffc938       0x7fffc938
lr             0x7bd9ae68       2077863528
pc             0x7bd9ae68       0x7bd9ae68 <pthread.add_specific+356>
cpsr           0x40020210       1073873424


>How-To-Repeat:
see above

>Fix:
n/a

>Release-Note:

>Audit-Trail:
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-arm/50087: all threaded programs crash on arm
Date: Sat, 25 Jul 2015 16:20:42 +0200

 Here is the surrounding code from objdump:

     ad48:       24319fe5        ldr     r3, [pc, #292]  ; ae74 <pthread__add_specific+0x170>
     ad4c:       033096e7        ldr     r3, [r6, r3]
     ad50:       003093e5        ldr     r3, [r3]
     ad54:       043193e7        ldr     r3, [r3, r4, lsl #2]
     ad58:       000053e3        cmp     r3, #0
     ad5c:       3800000a        beq     ae44 <pthread__add_specific+0x140>
 [..]
     ae44:       40209fe5        ldr     r2, [pc, #64]   ; ae8c <pthread__add_specific+0x188>
     ae48:       b310a0e3        mov     r1, #179        ; 0xb3
     ae4c:       3c009fe5        ldr     r0, [pc, #60]   ; ae90 <pthread__add_specific+0x18c>
     ae50:       3c309fe5        ldr     r3, [pc, #60]   ; ae94 <pthread__add_specific+0x190>
     ae54:       02208fe0        add     r2, pc, r2
     ae58:       00008fe0        add     r0, pc, r0
     ae5c:       142082e2        add     r2, r2, #20
     ae60:       03308fe0        add     r3, pc, r3
     ae64:       4beaffeb        bl      5798 <pthread__assertfunc>

 so unfortunately the non NULL value in r3 has already been overwritten at
 the time of the core dump.

 Martin

From: christos@zoulas.com (Christos Zoulas)
To: gnats-bugs@NetBSD.org, port-arm-maintainer@netbsd.org, 
	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, martin@NetBSD.org
Cc: 
Subject: Re: port-arm/50087: all threaded programs crash on arm
Date: Sat, 25 Jul 2015 10:29:50 -0400

 On Jul 25,  2:25pm, martin@duskware.de (Martin Husemann) wrote:
 -- Subject: Re: port-arm/50087: all threaded programs crash on arm

 Something is probably calling malloc() before pthread is initialized.
 Have you installed the debug sets? Why don't we see line numbers in libc?

 christos

From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-arm/50087: all threaded programs crash on arm
Date: Sat, 25 Jul 2015 16:30:33 +0200

 Copying over libc.so.12.197 from July 20 fixes the issue.

 Martin

From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: joerg@NetBSD.org, christos@NetBSD.org
Subject: Re: port-arm/50087: all threaded programs crash on arm
Date: Sat, 25 Jul 2015 21:09:13 +0200

 On Sat, Jul 25, 2015 at 04:30:33PM +0200, Martin Husemann wrote:
 > Copying over libc.so.12.197 from July 20 fixes the issue.

 Jared pointed out the only possible change:

 Modified Files:
         src/share/mk: bsd.lib.mk

 Log Message:
 Simplify the build of library archives by no longer doing a topological
 sort.


 To generate a diff of this commit:
 cvs rdiff -u -r1.358 -r1.359 src/share/mk/bsd.lib.mk



 ... and indeed reverting this makes libc work again.

 Constructor order interfering with the fragile libc/libpthread startup?

 Martin

From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: joerg@NetBSD.org, christos@NetBSD.org, manu@NetBSD.org
Subject: Re: port-arm/50087: all threaded programs crash on arm
Date: Sun, 26 Jul 2015 14:16:36 +0200

 I tried to untangle the libc->libpthread takeover mess to make it
 independent of jemalloc being called before libpthread initializes its
 TSD size (extracting the keys from libc) and failed.

 I am not sure whether we should band-aid this to work again or better
 just back out the pthread dynamic max keys change, unless someone can
 suggest another way to fix this cleanly.


 Martin

From: christos@zoulas.com (Christos Zoulas)
To: Martin Husemann <martin@duskware.de>, gnats-bugs@NetBSD.org
Cc: joerg@NetBSD.org, manu@NetBSD.org
Subject: Re: port-arm/50087: all threaded programs crash on arm
Date: Sun, 26 Jul 2015 08:26:14 -0400

 On Jul 26,  2:16pm, martin@duskware.de (Martin Husemann) wrote:
 -- Subject: Re: port-arm/50087: all threaded programs crash on arm

 | I tried to untangle the libc->libpthread takeover mess to make it
 | independent of jemalloc being called before libpthread initializes its
 | TSD size (extracting the keys from libc) and failed.

 Where is it called from (malloc before pthread_init)?

 christos

From: christos@zoulas.com (Christos Zoulas)
To: Martin Husemann <martin@duskware.de>, gnats-bugs@NetBSD.org
Cc: joerg@NetBSD.org
Subject: Re: port-arm/50087: all threaded programs crash on arm
Date: Sun, 26 Jul 2015 08:49:51 -0400

 On Jul 25,  9:09pm, martin@duskware.de (Martin Husemann) wrote:
 -- Subject: Re: port-arm/50087: all threaded programs crash on arm

 | ... and indeed reverting this makes libc work again.
 | 
 | Constructor order interfering with the fragile libc/libpthread startup?

 Yes, wow. Let's revert the change until we come up with a better solution.

 christos

From: christos@zoulas.com (Christos Zoulas)
To: gnats-bugs@NetBSD.org, port-arm-maintainer@netbsd.org, 
	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, martin@NetBSD.org
Cc: 
Subject: Re: port-arm/50087: all threaded programs crash on arm
Date: Sun, 26 Jul 2015 11:54:27 -0400

 On Jul 26, 12:20pm, martin@duskware.de (Martin Husemann) wrote:
 -- Subject: Re: port-arm/50087: all threaded programs crash on arm

 Here's a fix.

 christos

 Index: jemalloc.c
 ===================================================================
 RCS file: /cvsroot/src/lib/libc/stdlib/jemalloc.c,v
 retrieving revision 1.37
 diff -u -u -r1.37 jemalloc.c
 --- jemalloc.c	20 Jan 2015 18:31:25 -0000	1.37
 +++ jemalloc.c	26 Jul 2015 15:52:41 -0000
 @@ -783,20 +783,58 @@
  static malloc_mutex_t	arenas_mtx; /* Protects arenas initialization. */
  #endif

 -#ifndef NO_TLS
  /*
   * Map of pthread_self() --> arenas[???], used for selecting an arena to use
   * for allocations.
   */
 -static __thread arena_t	*arenas_map;
 -#define	get_arenas_map()	(arenas_map)
 -#define	set_arenas_map(x)	(arenas_map = x)
 +#ifndef NO_TLS
 +static __thread arena_t	**arenas_map;
  #else
 -#ifdef _REENTRANT
 -static thread_key_t arenas_map_key;
 +static arena_t	**arenas_map;
  #endif
 -#define	get_arenas_map()	thr_getspecific(arenas_map_key)
 -#define	set_arenas_map(x)	thr_setspecific(arenas_map_key, x)
 +
 +#if !defined(NO_TLS) || !defined(_REENTRANT)
 +# define	get_arenas_map()	(arenas_map)
 +# define	set_arenas_map(x)	(arenas_map = x)
 +#else
 +
 +static thread_key_t arenas_map_key = -1;
 +
 +static inline arena_t **
 +get_arenas_map(void)
 +{
 +	if (!__isthreaded)
 +		return arenas_map;
 +
 +	if (arenas_map_key == -1) {
 +		(void)thr_keycreate(&arenas_map_key, NULL);
 +		if (arenas_map != NULL) {
 +			thr_setspecific(arenas_map_key, arenas_map);
 +			arenas_map = NULL;
 +		}
 +	}
 +
 +	return thr_getspecific(arenas_map_key);
 +}
 +
 +static __inline void
 +set_arenas_map(arena_t **a)
 +{
 +	if (!__isthreaded) {
 +		arenas_map = a;
 +		return;
 +	}
 +
 +	if (arenas_map_key == -1) {
 +		(void)thr_keycreate(&arenas_map_key, NULL);
 +		if (arenas_map != NULL) {
 +			_DIAGASSERT(arenas_map == a);
 +			arenas_map = NULL;
 +		}
 +	}
 +
 +	thr_setspecific(arenas_map_key, a);
 +}
  #endif

  #ifdef MALLOC_STATS
 @@ -3654,11 +3692,6 @@
  		opt_narenas_lshift += 2;
  	}

 -#ifdef NO_TLS
 -	/* Initialize arena key. */
 -	(void)thr_keycreate(&arenas_map_key, NULL);
 -#endif
 -
  	/* Determine how many arenas to use. */
  	narenas = ncpus;
  	if (opt_narenas_lshift > 0) {
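
The deferral idea in the patch above can be sketched outside of jemalloc with
plain POSIX calls. This is only an illustration under stated assumptions, not
the committed code: is_threaded stands in for libc's __isthreaded, and
early_slot, slot_key_ready(), slot_get() and slot_set() are hypothetical
names. Because pthread_key_t is an opaque type, the sketch tracks key creation
with a separate flag rather than comparing the key against -1.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for libc's __isthreaded flag. */
static bool is_threaded = false;

/* Holds the value while the process is still single-threaded. */
static void *early_slot = NULL;

static pthread_key_t slot_key;
static bool slot_key_created = false;

/*
 * Create the key on first threaded use and migrate any value stored
 * earlier. Not safe against concurrent first use; a real allocator
 * would have to serialise this step.
 */
static bool
slot_key_ready(void)
{
	if (slot_key_created)
		return true;
	if (pthread_key_create(&slot_key, NULL) != 0)
		return false;
	slot_key_created = true;
	if (early_slot != NULL) {
		(void)pthread_setspecific(slot_key, early_slot);
		early_slot = NULL;
	}
	return true;
}

/* Read the slot; before threading starts no pthread call is made. */
static void *
slot_get(void)
{
	if (!is_threaded || !slot_key_ready())
		return early_slot;
	return pthread_getspecific(slot_key);
}

/* Write the slot; before threading starts only the plain global is used. */
static void
slot_set(void *v)
{
	if (!is_threaded || !slot_key_ready()) {
		early_slot = v;
		return;
	}
	(void)pthread_setspecific(slot_key, v);
}
```

The point of the design is that a caller running before the thread library
is initialized never reaches pthread_key_create() or pthread_setspecific(),
which is the situation the failed assertion in pthread__add_specific exposed.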

From: Martin Husemann <martin@duskware.de>
To: Christos Zoulas <christos@zoulas.com>
Cc: gnats-bugs@NetBSD.org
Subject: Re: port-arm/50087: all threaded programs crash on arm
Date: Sun, 26 Jul 2015 18:39:01 +0200

 On Sun, Jul 26, 2015 at 11:54:27AM -0400, Christos Zoulas wrote:
 > Here's a fix.

 I like the idea (and a quick test says it is working as expected)!

 Martin

From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/50087 CVS commit: src/lib/libc/stdlib
Date: Sun, 26 Jul 2015 17:21:55 +0000

 Module Name:	src
 Committed By:	martin
 Date:		Sun Jul 26 17:21:55 UTC 2015

 Modified Files:
 	src/lib/libc/stdlib: jemalloc.c

 Log Message:
 Defer using pthread keys until we are threaded.
 From Christos, fixes PR port-arm/50087 by allowing malloc calls prior
 to libpthread initialization.


 To generate a diff of this commit:
 cvs rdiff -u -r1.37 -r1.38 src/lib/libc/stdlib/jemalloc.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Martin Husemann <martin@duskware.de>
To: Christos Zoulas <christos@zoulas.com>
Cc: gnats-bugs@NetBSD.org
Subject: Re: port-arm/50087: all threaded programs crash on arm
Date: Mon, 27 Jul 2015 08:45:39 +0200

 Unfortunately it causes quite heavy fallout in the current test runs:

 One example:

 http://www.netbsd.org/~martin/evbearmv7hf-atf/53_atf.html#fs_ffs_t_snapshot_log_snapshotstress

 Core was generated by `t_snapshot_log'.
 Program terminated with signal SIGSEGV, Segmentation fault.
 #0  arena_dalloc (ptr=0x71204400, chunk=0x71200000, arena=0x230a8) at /usr/src/lib/libc/stdlib/jemalloc.c:2590
 #0  arena_dalloc (ptr=0x71204400, chunk=0x71200000, arena=0x230a8) at /usr/src/lib/libc/stdlib/jemalloc.c:2590
 #1  idalloc (ptr=ptr@entry=0x71204400) at /usr/src/lib/libc/stdlib/jemalloc.c:3267
 #2  0x7bb0f310 in free (ptr=0x71204400) at /usr/src/lib/libc/stdlib/jemalloc.c:3944
 #3  0x7bcee434 in mutex_obj_free (lock=0x7182b700) at /usr/src/lib/librump/../../sys/rump/../kern/kern_mutex_obj.c:140
 #4  0x7bdaa5b8 in vnfree (vp=0x7120dc48) at /usr/src/lib/librumpvfs/../../sys/rump/../kern/vfs_vnode.c:296
 #5  0x7bdaa838 in vrelel (vp=0x7120dc48, flags=<optimized out>) at /usr/src/lib/librumpvfs/../../sys/rump/../kern/vfs_vnode.c:703
 #6  0x7bf8cf04 in ufs_remove (v=0x70bffe28) at /usr/src/sys/rump/fs/lib/libffs/../../../../ufs/ufs/ufs_vnops.c:757
 #7  0x7bcbb914 in VOP_REMOVE (dvp=0x75708c60, vp=<optimized out>, cnp=<optimized out>) at /usr/src/lib/librump/../../sys/rump/../kern/vnode_if.c:791
 #8  0x7bdafd78 in do_sys_unlinkat (l=<optimized out>, fdat=fdat@entry=-100, arg=<optimized out>, flags=flags@entry=0, seg=seg@entry=UIO_USERSPACE) at /usr/src/lib/librumpvfs/../../sys/rump/../kern/vfs_syscalls.c:2715
 #9  0x7bdb3644 in sys_unlink (l=<optimized out>, uap=<optimized out>, retval=<optimized out>) at /usr/src/lib/librumpvfs/../../sys/rump/../kern/vfs_syscalls.c:2613
 #10 0x7bd096ec in sy_call (rval=0x70bfff20, uap=0x70bfff1c, l=0x71902d00, sy=0x7bd48f78 <rumpns_sysent+200>) at /usr/src/lib/librump/../../sys/rump/../sys/syscallvar.h:65
 #11 sy_invoke (code=10, rval=0x70bfff20, uap=0x70bfff1c, l=0x71902d00, sy=0x7bd48f78 <rumpns_sysent+200>) at /usr/src/lib/librump/../../sys/rump/../sys/syscallvar.h:94
 #12 rump_syscall (num=num@entry=10, data=data@entry=0x70bfff1c, dlen=dlen@entry=4, retval=retval@entry=0x70bfff20) at /usr/src/lib/librump/../../sys/rump/librump/rumpkern/rump.c:758
 #13 0x7bcfd46c in rump___sysimpl_unlink (path=path@entry=0x70bfff48 "/mnt/a2/d0/f0") at /usr/src/lib/librump/../../sys/rump/librump/rumpkern/rump_syscalls.c:225
 #14 0x00011394 in fs_activity (arg=0x7fffc958) at /usr/src/tests/fs/ffs/../common/snapshot.c:153
 #15 0x7bc25a60 in pthread__create_tramp (cookie=0x757ea000) at /usr/src/lib/libpthread/pthread.c:576

 and the overview:

 http://www.netbsd.org/~martin/evbearmv7hf-atf/53_atf.html#failed-tcs-summary

 (previous run had only two failures, where one is supposed to be fixed in the
 meantime)


 Martin

From: christos@zoulas.com (Christos Zoulas)
To: Martin Husemann <martin@duskware.de>
Cc: gnats-bugs@NetBSD.org
Subject: Re: port-arm/50087: all threaded programs crash on arm
Date: Mon, 27 Jul 2015 04:05:50 -0400

 On Jul 27,  8:45am, martin@duskware.de (Martin Husemann) wrote:
 -- Subject: Re: port-arm/50087: all threaded programs crash on arm

 | Unfortunately it causes quite heavy fallout in the current test runs:

 Let's go back to a working state by backing out both the new jemalloc
 fix and the bsd.lib.mk change and then figure out what's wrong and fix
 it.

 christos

From: christos@zoulas.com (Christos Zoulas)
To: gnats-bugs@NetBSD.org, port-arm-maintainer@netbsd.org, 
	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, martin@NetBSD.org
Cc: 
Subject: Re: port-arm/50087: all threaded programs crash on arm
Date: Mon, 27 Jul 2015 04:21:31 -0400

 On Jul 27,  8:10am, christos@zoulas.com (Christos Zoulas) wrote:
 -- Subject: Re: port-arm/50087: all threaded programs crash on arm

 I can't reproduce the failure on amd64. Could this be an evbarm specific
 issue?

 christos

From: Nick Hudson <skrll@netbsd.org>
To: Christos Zoulas <christos@zoulas.com>, gnats-bugs@NetBSD.org, 
 port-arm-maintainer@netbsd.org, gnats-admin@netbsd.org, 
 netbsd-bugs@netbsd.org, martin@NetBSD.org
Cc: 
Subject: Re: port-arm/50087: all threaded programs crash on arm
Date: Mon, 27 Jul 2015 09:31:21 +0100

 On 07/27/15 09:21, Christos Zoulas wrote:
 > On Jul 27,  8:10am, christos@zoulas.com (Christos Zoulas) wrote:
 > -- Subject: Re: port-arm/50087: all threaded programs crash on arm
 >
 > I can't reproduce the failure on amd64. Could this be an evbarm specific
 > issue?
 >
 > christos
 >
 >
 http://nxr.netbsd.org/xref/src/lib/libc/arch/arm/misc/arm_initfini.c#58

 is this relevant?

 Nick

From: christos@zoulas.com (Christos Zoulas)
To: Nick Hudson <skrll@netbsd.org>, gnats-bugs@NetBSD.org, 
	port-arm-maintainer@netbsd.org, gnats-admin@netbsd.org, 
	netbsd-bugs@netbsd.org, martin@NetBSD.org
Cc: 
Subject: Re: port-arm/50087: all threaded programs crash on arm
Date: Mon, 27 Jul 2015 06:30:52 -0400

 On Jul 27,  9:31am, skrll@netbsd.org (Nick Hudson) wrote:
 -- Subject: Re: port-arm/50087: all threaded programs crash on arm

 | http://nxr.netbsd.org/xref/src/lib/libc/arch/arm/misc/arm_initfini.c#58
 | 
 | is this relevant?
 | 

 Could be, but I added the code to amd64 and I could not reproduce the
 failure.

 christos

From: Martin Husemann <martin@duskware.de>
To: Christos Zoulas <christos@zoulas.com>
Cc: gnats-bugs@NetBSD.org
Subject: Re: port-arm/50087: all threaded programs crash on arm
Date: Tue, 28 Jul 2015 09:04:47 +0200

 We are down to 7 failures as of last night:

 http://www.netbsd.org/~martin/evbearmv7hf-atf/54_atf.html#failed-tcs-summary

 A few spurious/unclear, none directly crashing inside malloc.

 Note that other architectures are hovering in the 1-2 failure range.

 One of the reproducible failures on arm (at least for me) is
 a pthread_rwlock_init() failing:

 void
 rumpuser_rw_init(struct rumpuser_rw **rwp)
 { 
         struct rumpuser_rw *rw;
         size_t allocsz;

         allocsz = (sizeof(*rw)+RUMPUSER_LOCKALIGN) & ~(RUMPUSER_LOCKALIGN-1);

         NOFAIL(rw = aligned_alloc(RUMPUSER_LOCKALIGN, allocsz));
         NOFAIL_ERRNO(pthread_rwlock_init(&rw->pthrw, NULL));

 and that NOFAIL_ERRNO calls abort():

 Core was generated by `t_snapshot_v2'.
 Program terminated with signal SIGABRT, Aborted.
 #0  0x7bb85288 in _lwp_kill () from /usr/lib/libc.so.12
 #0  0x7bb85288 in _lwp_kill () from /usr/lib/libc.so.12
 #1  0x7ba9e290 in abort () at /usr/src/lib/libc/stdlib/abort.c:74
 #2  0x7bc44a34 in rumpuser_rw_init (rwp=0x7b93cc5c) at /usr/src/lib/librumpuser/rumpuser_pth.c:355
 #3  0x7bdaa4ac in vnalloc (mp=mp@entry=0x0) at /usr/src/lib/librumpvfs/../../sys/rump/../kern/vfs_vnode.c:272


 (http://www.netbsd.org/~martin/evbearmv7hf-atf/54_atf.html#fs_ffs_t_snapshot_v2_snapshotstress)

 The other failure is pthread_rwlock_tryrdlock() being called for a NULL
 lock:

 Core was generated by `t_renamerace'.
 Program terminated with signal SIGSEGV, Segmentation fault.
 #0  pthread_rwlock_tryrdlock (ptr=ptr@entry=0x0) at /usr/src/lib/libpthread/pthread_rwlock.c:246
 #0  pthread_rwlock_tryrdlock (ptr=ptr@entry=0x0) at /usr/src/lib/libpthread/pthread_rwlock.c:246
 #1  0x7ba45320 in rumpuser_rw_enter (enum_rumprwlock=0, rw=0x0) at /usr/src/lib/librumpuser/rumpuser_pth.c:380
 #2  0x7bbbb868 in namei_getstartdir (state=<optimized out>, state=<optimized out>) at /usr/src/lib/librumpvfs/../../sys/rump/../kern/vfs_lookup.c:539

 (http://www.netbsd.org/~martin/evbearmv7hf-atf/54_atf.html#fs_vfs_t_renamerace_nfs_renamerace_dirs)

 Martin
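
For readers checking the allocsz line in rumpuser_rw_init() quoted above, the
arithmetic can be tried in isolation. This is a sketch only: padded_size() is
a hypothetical helper, and the alignment of 64 used in the notes below is an
assumed example value, not the actual RUMPUSER_LOCKALIGN.

```c
#include <stddef.h>

/*
 * Same arithmetic as the allocsz expression: add the alignment, then
 * clear the low bits. align must be a power of two. The result is
 * always a multiple of align and never smaller than objsz.
 */
static size_t
padded_size(size_t objsz, size_t align)
{
	return (objsz + align) & ~(align - 1);
}
```

With an alignment of 64, a 40-byte object is padded to 64 bytes and a 65-byte
object to 128 bytes. An object that is already exactly 64 bytes also becomes
128, because the expression adds align rather than align - 1. Either way the
size stays a multiple of the alignment, which is what aligned_alloc() expects.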

From: christos@zoulas.com (Christos Zoulas)
To: Martin Husemann <martin@duskware.de>
Cc: gnats-bugs@NetBSD.org
Subject: Re: port-arm/50087: all threaded programs crash on arm
Date: Tue, 28 Jul 2015 06:50:10 -0400

 On Jul 28,  9:04am, martin@duskware.de (Martin Husemann) wrote:
 -- Subject: Re: port-arm/50087: all threaded programs crash on arm

 | We are down to 7 failures as of last night:
 | 
 | http://www.netbsd.org/~martin/evbearmv7hf-atf/54_atf.html#failed-tcs-summary
 | 
 | A few spurious/unclear, none directly crashing inside malloc.
 | 
 | Note that other architectures are hovering in the 1-2 failure range.
 | 
 | One of the reproducible failures on arm (at least for me) is
 | a pthread_rwlock_init() failing:

 How is that possible? With attr == NULL, pthread_rwlock_init() can't fail.

 christos

From: Martin Husemann <martin@duskware.de>
To: Christos Zoulas <christos@zoulas.com>
Cc: gnats-bugs@NetBSD.org
Subject: Re: port-arm/50087: all threaded programs crash on arm
Date: Thu, 30 Jul 2015 15:19:13 +0200

 The change triggering this has been backed out and we are down to a single
 failure (just like all other architectures) again.

 Martin

State-Changed-From-To: open->closed
State-Changed-By: martin@NetBSD.org
State-Changed-When: Thu, 30 Jul 2015 13:21:23 +0000
State-Changed-Why:
solved by backout


From: Joerg Sonnenberger <joerg@britannica.bec.de>
To: gnats-bugs@NetBSD.org
Cc: port-arm-maintainer@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org, martin@NetBSD.org
Subject: Re: PR/50087 CVS commit: src/lib/libc/stdlib
Date: Sun, 9 Aug 2015 18:44:12 +0200

 On Sun, Jul 26, 2015 at 05:25:01PM +0000, Martin Husemann wrote:
 >  Module Name:	src
 >  Committed By:	martin
 >  Date:		Sun Jul 26 17:21:55 UTC 2015
 >  
 >  Modified Files:
 >  	src/lib/libc/stdlib: jemalloc.c
 >  
 >  Log Message:
 >  Defer using pthread keys until we are threaded.
 >  From Christos, fixes PR port-arm/50087 by allowing malloc calls prior
 >  to libpthread initialization.

 This is the wrong approach, IMO. libpthread should not use malloc until
 it knows that libc is fully initialized, not the other way around.

 Joerg

From: christos@zoulas.com (Christos Zoulas)
To: gnats-bugs@NetBSD.org, port-arm-maintainer@netbsd.org, 
	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, martin@NetBSD.org
Cc: 
Subject: Re: PR/50087 CVS commit: src/lib/libc/stdlib
Date: Sun, 9 Aug 2015 15:18:07 -0400

 On Aug 9,  4:45pm, joerg@britannica.bec.de (Joerg Sonnenberger) wrote:
 -- Subject: Re: PR/50087 CVS commit: src/lib/libc/stdlib

 |  This is the wrong approach, IMO. libpthread should not use malloc until
 |  it knows that libc is fully initialized, not the other way around.

 This is a problem with libc using malloc at the wrong time, not libpthread
 (due to constructor order). The problem seems to be triggered by the
 arm arm_initfini.c. I could reproduce this even on x86_64 by adding that
 file to the libc build.

 christos

From: Joerg Sonnenberger <joerg@britannica.bec.de>
To: gnats-bugs@NetBSD.org
Cc: port-arm-maintainer@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org, martin@NetBSD.org
Subject: Re: PR/50087 CVS commit: src/lib/libc/stdlib
Date: Sun, 9 Aug 2015 22:07:06 +0200

 On Sun, Aug 09, 2015 at 07:20:00PM +0000, Christos Zoulas wrote:
 > The following reply was made to PR port-arm/50087; it has been noted by GNATS.
 > 
 > From: christos@zoulas.com (Christos Zoulas)
 > To: gnats-bugs@NetBSD.org, port-arm-maintainer@netbsd.org, 
 > 	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, martin@NetBSD.org
 > Cc: 
 > Subject: Re: PR/50087 CVS commit: src/lib/libc/stdlib
 > Date: Sun, 9 Aug 2015 15:18:07 -0400
 > 
 >  On Aug 9,  4:45pm, joerg@britannica.bec.de (Joerg Sonnenberger) wrote:
 >  -- Subject: Re: PR/50087 CVS commit: src/lib/libc/stdlib
 >  
 >  |  This is the wrong approach, IMO. libpthread should not use malloc until
 >  |  it knows that libc is fully initialized, not the other way around.
 >  
 >  This is a problem with libc using malloc at the wrong time, not libpthread
 >  (due to constructor order). The problem seems to be triggered by the
 >  arm arm_initfini.c. I could reproduce this even on x86_64 by adding that
 >  file to the libc build.

 Oh, that's the use of sysctlbyname, which will internally call the
 getoid function and that's caching the result. It should be using fixed
 oids to reduce the startup overhead in general.

 Joerg

From: christos@zoulas.com (Christos Zoulas)
To: gnats-bugs@NetBSD.org, port-arm-maintainer@netbsd.org, 
	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, martin@NetBSD.org
Cc: 
Subject: Re: PR/50087 CVS commit: src/lib/libc/stdlib
Date: Mon, 10 Aug 2015 01:10:05 -0400

 On Aug 9,  8:10pm, joerg@britannica.bec.de (Joerg Sonnenberger) wrote:
 -- Subject: Re: PR/50087 CVS commit: src/lib/libc/stdlib

 |  Oh, that's the use of sysctlbyname, which will internally call the
 |  getoid function and that's caching the result. It should be using fixed
 |  oids to reduce the startup overhead in general.

 We were thinking about fixing that, but then we wondered if it was the
 last one... And fixing malloc to be more robust was more expedient. But
 you are right; those should be using fixed oids.

 christos

From: Joerg Sonnenberger <joerg@britannica.bec.de>
To: gnats-bugs@NetBSD.org
Cc: port-arm-maintainer@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org, martin@NetBSD.org
Subject: Re: PR/50087 CVS commit: src/lib/libc/stdlib
Date: Mon, 10 Aug 2015 10:49:39 +0200

 On Mon, Aug 10, 2015 at 05:15:01AM +0000, Christos Zoulas wrote:
 > The following reply was made to PR port-arm/50087; it has been noted by GNATS.
 > 
 > From: christos@zoulas.com (Christos Zoulas)
 > To: gnats-bugs@NetBSD.org, port-arm-maintainer@netbsd.org, 
 > 	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, martin@NetBSD.org
 > Cc: 
 > Subject: Re: PR/50087 CVS commit: src/lib/libc/stdlib
 > Date: Mon, 10 Aug 2015 01:10:05 -0400
 > 
 >  On Aug 9,  8:10pm, joerg@britannica.bec.de (Joerg Sonnenberger) wrote:
 >  -- Subject: Re: PR/50087 CVS commit: src/lib/libc/stdlib
 >  
 >  |  Oh, that's the use of sysctlbyname, which will internally call the
 >  |  getoid function and that's caching the result. It should be using fixed
 >  |  oids to reduce the startup overhead in general.
 >  
 >  We were thinking about fixing that, but then we wondered if it was the
 >  last one... And fixing malloc to be more robust was more expedient. But
 >  you are right; those should be using fixed oids.

 Well, there are two approaches. The first would be to make the ordering
 of constructors explicit by using different names for the sections. The
 second would be to stick a diagassert in malloc for now, so that
 ___dlauxinfo() is not NULL.

 Joerg
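
One way to picture the first approach (explicit constructor ordering) is the
constructor priority attribute of GCC and Clang. On ELF targets this is
typically implemented by placing each constructor in a priority-suffixed
init section that the linker sorts, which is close to "different names for
the sections". This is a sketch under that assumption, not what libc or
libpthread do; the function names and the ctor_order array are hypothetical.

```c
#include <stddef.h>

/* Records the order in which the constructors below ran. */
static int ctor_order[2];
static size_t ctor_count = 0;

/*
 * Lower priority numbers run first. Priorities up to 100 are reserved
 * for the implementation, so the sketch starts at 101.
 */
static void init_allocator_state(void) __attribute__((constructor(101)));
static void init_thread_state(void) __attribute__((constructor(102)));

/* Runs first: whatever the allocator needs before anyone calls malloc(). */
static void
init_allocator_state(void)
{
	ctor_order[ctor_count++] = 1;
}

/* Runs second: may rely on the allocator state set up above. */
static void
init_thread_state(void)
{
	ctor_order[ctor_count++] = 2;
}
```

By the time main() starts, ctor_order holds 1 followed by 2 regardless of the
order in which the two functions appear in the source or in the archive.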

From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/50087 CVS commit: [netbsd-7] src
Date: Tue, 24 Nov 2015 17:37:16 +0000

 Module Name:	src
 Committed By:	martin
 Date:		Tue Nov 24 17:37:16 UTC 2015

 Modified Files:
 	src/include [netbsd-7]: limits.h
 	src/lib/libc/stdlib [netbsd-7]: jemalloc.c
 	src/lib/libpthread [netbsd-7]: pthread.c pthread_int.h
 	    pthread_key_create.3 pthread_tsd.c
 	src/lib/libpthread_dbg [netbsd-7]: pthread_dbg.c

 Log Message:
 Pull up following revision(s) (requested by manu in ticket #829):
 	lib/libpthread_dbg/pthread_dbg.c: revision 1.43 (via patch)
 	lib/libpthread/pthread_int.h: revision 1.91-1.92 (via patch)
 	lib/libc/stdlib/jemalloc.c: revision 1.37-1.38
 	lib/libpthread/pthread_tsd.c: revision 1.12-1.14 (via patch)
 	include/limits.h: revision 1.34 (via patch)
 	lib/libpthread/pthread.c: revision 1.146-1.147 (via patch)
 	lib/libpthread/pthread_key_create.3: revision 1.7 (via patch)

 libpthread:

 Make PTHREAD_KEYS_MAX dynamically adjustable
 NetBSD's PTHREAD_KEYS_MAX is set to 256, which is low compared to
 other systems like Linux (1024) or MacOS X (512). As a result some
 setups tested on Linux will exhibit problems on NetBSD because of
 pthread_keys usage beyond the limit. This happens for instance on
 Apache with various modules loaded, and in this case no particular
 developer can be blamed for going beyond the limit, since several
 modules from different sources contribute to the problem.
 This patch makes the limit configurable through the PTHREAD_KEYS_MAX
 environment variable. If undefined, the default remains unchanged
 (256). In any case, the value cannot be lowered below POSIX-mandated
 _POSIX_THREAD_KEYS_MAX (128).

 While there:
 - use EXIT_FAILURE instead of 1 when calling err(3) in libpthread.
 - Reset _POSIX_THREAD_KEYS_MAX to POSIX mandated 128, instead of 256.

 Fix previous: Can't use calloc/malloc before we complete initialization
 of the thread library, because malloc uses pthread_foo_specific, and it will
 end up initializing itself incorrectly.

 Thanks rump for not letting us use even mmap during initialization.

 libc/jemalloc:

 Fix non _REENTRANT build.
 Defer using pthread keys until we are threaded.
 From Christos, fixes PR port-arm/50087 by allowing malloc calls prior
 to libpthread initialization.


 To generate a diff of this commit:
 cvs rdiff -u -r1.33 -r1.33.8.1 src/include/limits.h
 cvs rdiff -u -r1.34 -r1.34.2.1 src/lib/libc/stdlib/jemalloc.c
 cvs rdiff -u -r1.144 -r1.144.4.1 src/lib/libpthread/pthread.c
 cvs rdiff -u -r1.89 -r1.89.8.1 src/lib/libpthread/pthread_int.h
 cvs rdiff -u -r1.6 -r1.6.24.1 src/lib/libpthread/pthread_key_create.3
 cvs rdiff -u -r1.11 -r1.11.8.1 src/lib/libpthread/pthread_tsd.c
 cvs rdiff -u -r1.42 -r1.42.8.1 src/lib/libpthread_dbg/pthread_dbg.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.
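
The PTHREAD_KEYS_MAX behaviour described in the log message (default 256,
never below 128, override via the environment) can be sketched as follows.
This is not the libpthread code: keys_max_from_string() and
keys_max_from_env() are hypothetical helpers, and both clamping a too-small
value to the floor and ignoring an unparsable value are assumptions about
behaviour the log message does not spell out.

```c
#include <errno.h>
#include <stdlib.h>

#define KEYS_DEFAULT	256	/* default limit, per the log message */
#define KEYS_FLOOR	128	/* _POSIX_THREAD_KEYS_MAX, per the log message */

/* Hypothetical helper: turn an override string into a key limit. */
static long
keys_max_from_string(const char *s)
{
	char *end;
	long v;

	/* Unset or empty: keep the default. */
	if (s == NULL || *s == '\0')
		return KEYS_DEFAULT;

	errno = 0;
	v = strtol(s, &end, 10);
	/* Assumption: an unparsable or out-of-range value is ignored. */
	if (errno != 0 || *end != '\0')
		return KEYS_DEFAULT;

	/* Assumption: a value below the POSIX minimum is raised to it. */
	if (v < KEYS_FLOOR)
		return KEYS_FLOOR;
	return v;
}

/* Hypothetical helper: read the override from the environment. */
static long
keys_max_from_env(void)
{
	return keys_max_from_string(getenv("PTHREAD_KEYS_MAX"));
}
```

Note that the sketch uses only getenv() and strtol(), neither of which
allocates memory. That matters here because the log message points out that
calloc/malloc cannot be used before the thread library has finished
initializing.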

>Unformatted:

Copyright © 1994-2014 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.