NetBSD Problem Report #57241

From martin@aprisoft.de  Wed Feb 22 11:47:49 2023
Return-Path: <martin@aprisoft.de>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id ADDE51A9239
	for <gnats-bugs@gnats.NetBSD.org>; Wed, 22 Feb 2023 11:47:49 +0000 (UTC)
Message-Id: <20230222114740.6FE405CC81D@emmas.aprisoft.de>
Date: Wed, 22 Feb 2023 12:47:40 +0100 (CET)
From: martin@NetBSD.org
Reply-To: martin@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: mips64el--netbsd-install core dumps randomly
X-Send-Pr-Version: 3.95

>Number:         57241
>Category:       toolchain
>Synopsis:       mips64el--netbsd-install core dumps randomly
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    toolchain-manager
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Feb 22 11:50:00 +0000 2023
>Closed-Date:    
>Last-Modified:  Thu Dec 18 05:00:01 +0000 2025
>Originator:     Martin Husemann
>Release:        NetBSD 10.99.2
>Organization:
The NetBSD Foundation, Inc.
>Environment:
System: NetBSD martins.aprisoft.de 10.99.2 NetBSD 10.99.2 (GENERIC) #159: Tue Feb 21 15:27:55 CET 2023 martin@martins.aprisoft.de:/usr/src/sys/arch/amd64/compile/GENERIC amd64
Architecture: x86_64
Machine: amd64
>Description:

Since the switch to binutils 2.39 we see regular objcopy crashes in
the evbmips-mips64el build:

/home/builds/ab/HEAD/evbmips-mips64el/202302201820Z-tools/bin/mips64el--netbsd-objcopy: libdwarf.so.2.0.debug: section `.note.netbsd.pax' can't be allocated in segment 0
LOAD: .MIPS.abiflags .reginfo .dynamic .hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .init .text .MIPS.stubs .fini .rodata .eh_frame_hdr .eh_frame .note.netbsd.ident .note.netbsd.pax
[1]   Bus error (core dumped) /home/builds/ab/HEAD/evbmips-mips64el/20230220...
--- /home/builds/ab/HEAD/evbmips-mips64el/202302201820Z-dest/usr/bin/netstat ---
*** Failed target: /home/builds/ab/HEAD/evbmips-mips64el/202302201820Z-dest/usr/bin/netstat
*** Failed commands:
	${_MKTARGET_INSTALL}
	=> @# "install " /home/builds/ab/HEAD/evbmips-mips64el/202302201820Z-dest/usr/bin/netstat
	${INSTALL_FILE} -o ${BINOWN} -g ${BINGRP} -m ${BINMODE}  ${STRIPFLAG} ${.ALLSRC} ${.TARGET}
	=> /home/builds/ab/HEAD/evbmips-mips64el/202302201820Z-tools/bin/mips64el--netbsd-install -U -M /home/builds/ab/HEAD/evbmips-mips64el/202302201820Z-dest/METALOG -D /home/builds/ab/HEAD/evbmips-mips64el/202302201820Z-dest -h sha256 -N /home/source/ab/HEAD/src/etc -c  -r -o root -g kmem -m 2555   netstat /home/builds/ab/HEAD/evbmips-mips64el/202302201820Z-dest/usr/bin/netstat
*** [/home/builds/ab/HEAD/evbmips-mips64el/202302201820Z-dest/usr/bin/netstat] Error code 138
nbmake[6]: stopped in /home/source/ab/HEAD/src/usr.bin/netstat

It does not happen on every build, but quite often.
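
As a rough illustration of how the `Error code 138` in the log above relates to the `Bus error` line, the sketch below decodes a shell-style exit status. It assumes the usual shell convention of reporting a signal death as 128 plus the signal number, and NetBSD's signal numbering (SIGBUS is 10). The helper name `decode_status` is hypothetical.

```python
# Assumption: signal numbers as defined by NetBSD's <sys/signal.h>.
# Linux numbers differ (SIGBUS is 7 there), so the table is spelled out
# here instead of being taken from the host's signal module.
NETBSD_SIGNALS = {10: "SIGBUS", 11: "SIGSEGV"}


def decode_status(status):
    """Interpret a shell-style exit status (hypothetical helper)."""
    if status > 128:
        signo = status - 128
        return NETBSD_SIGNALS.get(signo, f"signal {signo}")
    return f"exit code {status}"


if __name__ == "__main__":
    print(decode_status(138))
```

Under those assumptions, 138 maps to signal 10, which matches the bus error reported by the shell.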

>How-To-Repeat:
s/a

>Fix:
n/a

>Release-Note:

>Audit-Trail:
From: Christos Zoulas <christos@zoulas.com>
To: gnats-bugs@netbsd.org
Cc: toolchain-manager@netbsd.org, gnats-admin@netbsd.org,
 netbsd-bugs@netbsd.org
Subject: Re: toolchain/57241: mips64el--netbsd-objcopy core dumps randomly
Date: Wed, 22 Feb 2023 09:51:29 -0500

 This comes from:

                if (!ELF_SECTION_IN_SEGMENT_1 (this_hdr, p, check_vma, 0)
                    && !ELF_TBSS_SPECIAL (this_hdr, p))
                  {
                    _bfd_error_handler
                      /* xgettext:c-format */
                      (_("%pB: section `%pA' can't be allocated in segment %d"),
                       abfd, sec, j);
                    print_segment_map (m);
                  }


 instrumenting the macro ELF_SECTION_IN_SEGMENT_1 is "mission difficult". 
 Perhaps check ELF_TBSS_SPECIAL first :-)

 christos

From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: toolchain/57241: binutils crashes for some builds
Date: Wed, 8 Mar 2023 15:15:05 +0100

 Examining this a bit closer, the build cluster gets two core dumps
 during a failed build, one is from the host ld,
 GNU ld (NetBSD Binutils nb1) 2.31.1, and it is quite low on usable information:

 Reading symbols from ld...
 Reading symbols from /usr/libdata/debug//usr/bin/ld.debug...
 [New process 1]
 Core was generated by `ld'.
 Program terminated with signal SIGSEGV, Segmentation fault.
 #0  0x0000000000430e81 in ?? ()
 (gdb) bt
 #0  0x0000000000430e81 in ?? ()
 #1  0x0000000000000000 in ?? ()
 (gdb) info reg
 rax            0x1e484ef           31753455
 rbx            0x63426a            6505066
 rcx            0x0                 0
 rdx            0x100000            1048576
 rsi            0x63426a            6505066
 rdi            0x68                104
 rbp            0x9                 0x9
 rsp            0x7f7fff4293a0      0x7f7fff4293a0
 r8             0x0                 0
 r9             0x0                 0
 r10            0xb0200000000       12103217840128
 r11            0x0                 0
 r12            0x6769752           108435282
 r13            0x63426a            6505066
 r14            0x68                104
 r15            0x0                 0
 rip            0x430e81            0x430e81
 eflags         0x10202             [ IF RF ]
 cs             0x47                71
 ss             0x3f                63
 ds             0x23                35
 es             0x23                35
 fs             0x0                 0
 gs             0x0                 0
 fs_base        <unavailable>
 gs_base        <unavailable>

 The other is assumed to be from objcopy (but I didn't have the exact objcopy
 binary around any more):

 warning: exec file is newer than core file.
 [New process 1]
 Core was generated by `mips64el--netbsd'.
 Program terminated with signal SIGBUS, Bus error.
 #0  0x0000000000409c81 in setup_section (
     ibfd=<error reading variable: Cannot access memory at address 0x884a8f2a>, 
     isection=<error reading variable: Cannot access memory at address 0x884a8f22>, 
     obfdarg=<error reading variable: Cannot access memory at address 0x884a8f1a>)
     at /home/source/ab/HEAD/src/tools/binutils/../../external/gpl3/binutils/dist/binutils/objcopy.c:4042
 4042          flags = check_new_section_flags (flags, obfd, name);
 (gdb) bt
 #0  0x0000000000409c81 in setup_section (
     ibfd=<error reading variable: Cannot access memory at address 0x884a8f2a>, 
     isection=<error reading variable: Cannot access memory at address 0x884a8f22>, 
     obfdarg=<error reading variable: Cannot access memory at address 0x884a8f1a>)
     at /home/source/ab/HEAD/src/tools/binutils/../../external/gpl3/binutils/dist/binutils/objcopy.c:4042
 Backtrace stopped: Cannot access memory at address 0x884a8fba
 (gdb) info reg
 rax            0xf764327d          4150538877
 rbx            0x91ba1316          2444890902
 rcx            0x20                32
 rdx            0xc44d9cab          3293420715
 rsi            0x75a6bf2e3b00      129359032498944
 rdi            0xfe4c0ff2          4266397682
 rbp            0x884a8fb2          0x884a8fb2
 rsp            0x7f7fff76c5d8      0x7f7fff76c5d8
 r8             0xfe10f7cb          4262524875
 r9             0x884a8fb2          2286587826
 r10            0xfe10f7cb          4262524875
 r11            0x49d73267          1238839911
 r12            0x6a2ff5ba          1781527994
 r13            0xdb0254c5          3674363077
 r14            0x7f7fff76c808      140187723548680
 r15            0x75a6bf0f7000      129359030480896
 rip            0x409c81            0x409c81 <setup_section+246>
 eflags         0x10283             [ CF SF IF RF ]
 cs             0x47                71
 ss             0x3f                63
 ds             0x23                35
 es             0x23                35
 fs             0x0                 0
 gs             0x0                 0
 fs_base        <unavailable>
 gs_base        <unavailable>

From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: toolchain/57241: binutils crashes for some builds
Date: Sun, 12 Mar 2023 14:35:05 +0100

 Here is another core dump that makes a bit more sense but still fails
 to give enough data:

 Reading symbols from 202303071400Z-tools/bin/mips64el--netbsd-objcopy...
 [New process 1]
 Core was generated by `mips64el--netbsd'.
 Program terminated with signal SIGBUS, Bus error.
 #0  0x000000000040c631 in copy_main (argc=0, argv=0x0)
     at /home/source/ab/HEAD/src/tools/binutils/../../external/gpl3/binutils/dist/binutils/objcopy.c:5443
 5443              change_start = change_section_address;
 #0  0x000000000040c631 in copy_main (argc=0, argv=0x0)
     at /home/source/ab/HEAD/src/tools/binutils/../../external/gpl3/binutils/dist/binutils/objcopy.c:5443
 #1  0x000000000040c758 in copy_main (argc=0, argv=0x0)
     at /home/source/ab/HEAD/src/tools/binutils/../../external/gpl3/binutils/dist/binutils/objcopy.c:5482
 #2  0x000000000040cbd8 in copy_main (argc=1, argv=0x7f7fe1a05224)
     at /home/source/ab/HEAD/src/tools/binutils/../../external/gpl3/binutils/dist/binutils/objcopy.c:5617
 #3  0x00000000004049b5 in filter_symbols (abfd=0x7f7fff53c4c0, 
     obfd=0x7f7fe1a00c40, osyms=0x1, isyms=0x7f7fe1a01d99, symcount=0)
     at /home/source/ab/HEAD/src/tools/binutils/../../external/gpl3/binutils/dist/binutils/objcopy.c:1688
 #4  0x000000000040448e in filter_symbols (abfd=0x2c334df2, obfd=0x640d16c8, 
     osyms=0xffffffffffffffff, isyms=0x3ea, symcount=4307852197889)
     at /home/source/ab/HEAD/src/tools/binutils/../../external/gpl3/binutils/dist/binutils/objcopy.c:1560
 #5  0x00000000004038ea in add_specific_symbols (
     filename=0x246 <error: Cannot access memory at address 0x246>, 
     htab=0x7f7fe1a0082d, buffer_p=0x1)
     at /home/source/ab/HEAD/src/tools/binutils/../../external/gpl3/binutils/dist
 #6  0x0000000000402eed in parse_flags (s=0x0)
     at /home/source/ab/HEAD/src/tools/binutils/../../external/gpl3/binutils/dist/binutils/objcopy.c:802

From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/57241 CVS commit: src/share/mk
Date: Sun, 12 Mar 2023 17:22:47 +0000

 Module Name:	src
 Committed By:	martin
 Date:		Sun Mar 12 17:22:47 UTC 2023

 Modified Files:
 	src/share/mk: bsd.own.mk

 Log Message:
 PR 57241: switch mips64el back to old binutils for now


 To generate a diff of this commit:
 cvs rdiff -u -r1.1307 -r1.1308 src/share/mk/bsd.own.mk

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Rin Okuyama" <rin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/57241 CVS commit: src/external/gpl3/binutils
Date: Thu, 24 Aug 2023 06:14:57 +0000

 Module Name:	src
 Committed By:	rin
 Date:		Thu Aug 24 06:14:57 UTC 2023

 Modified Files:
 	src/external/gpl3/binutils/lib/libbfd/arch/mips64eb: bfd.h config.h
 	src/external/gpl3/binutils/lib/libbfd/arch/mips64el: bfd.h defs.mk
 	src/external/gpl3/binutils/lib/libiberty/arch/mips64eb: config.h
 	    defs.mk
 	src/external/gpl3/binutils/lib/libiberty/arch/mips64el: defs.mk
 	src/external/gpl3/binutils/lib/libopcodes/arch/mips64eb: config.h
 	src/external/gpl3/binutils/usr.bin/common/arch/mips64eb: config.h
 	src/external/gpl3/binutils/usr.bin/gas/arch/mips64eb: config.h
 	src/external/gpl3/binutils/usr.bin/gas/arch/mips64el: config.h
 	src/external/gpl3/binutils/usr.bin/ld/arch/mips64eb: config.h
 	src/external/gpl3/binutils/usr.bin/ld/arch/mips64el: defs.mk
 	    ldemul-list.h

 Log Message:
 binutils: mknative mips64e[bl] correctly

 The previous version was very broken; SIZEOF_* seem like LP64, even
 though this should be, and had been configured for n32 userland.

 May fix random build failures reported as PR toolchain/57241.

 XXX
 There still remain oddities. mknative again soon with update for
 configure scripts.


 To generate a diff of this commit:
 cvs rdiff -u -r1.13 -r1.14 \
     src/external/gpl3/binutils/lib/libbfd/arch/mips64eb/bfd.h
 cvs rdiff -u -r1.11 -r1.12 \
     src/external/gpl3/binutils/lib/libbfd/arch/mips64eb/config.h
 cvs rdiff -u -r1.13 -r1.14 \
     src/external/gpl3/binutils/lib/libbfd/arch/mips64el/bfd.h
 cvs rdiff -u -r1.11 -r1.12 \
     src/external/gpl3/binutils/lib/libbfd/arch/mips64el/defs.mk
 cvs rdiff -u -r1.11 -r1.12 \
     src/external/gpl3/binutils/lib/libiberty/arch/mips64eb/config.h \
     src/external/gpl3/binutils/lib/libiberty/arch/mips64eb/defs.mk
 cvs rdiff -u -r1.8 -r1.9 \
     src/external/gpl3/binutils/lib/libiberty/arch/mips64el/defs.mk
 cvs rdiff -u -r1.11 -r1.12 \
     src/external/gpl3/binutils/lib/libopcodes/arch/mips64eb/config.h
 cvs rdiff -u -r1.13 -r1.14 \
     src/external/gpl3/binutils/usr.bin/common/arch/mips64eb/config.h
 cvs rdiff -u -r1.10 -r1.11 \
     src/external/gpl3/binutils/usr.bin/gas/arch/mips64eb/config.h
 cvs rdiff -u -r1.10 -r1.11 \
     src/external/gpl3/binutils/usr.bin/gas/arch/mips64el/config.h
 cvs rdiff -u -r1.13 -r1.14 \
     src/external/gpl3/binutils/usr.bin/ld/arch/mips64eb/config.h
 cvs rdiff -u -r1.10 -r1.11 \
     src/external/gpl3/binutils/usr.bin/ld/arch/mips64el/defs.mk
 cvs rdiff -u -r1.5 -r1.6 \
     src/external/gpl3/binutils/usr.bin/ld/arch/mips64el/ldemul-list.h

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Rin Okuyama" <rin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/57241 CVS commit: src/share/mk
Date: Thu, 24 Aug 2023 06:18:07 +0000

 Module Name:	src
 Committed By:	rin
 Date:		Thu Aug 24 06:18:07 UTC 2023

 Modified Files:
 	src/share/mk: bsd.own.mk

 Log Message:
 bsd.own.mk: Switch mips64e[bl] to binutils 2.39 again

 Potential fix for PR toolchain/57241 has been committed.
 Let us see whether this works fine or not.


 To generate a diff of this commit:
 cvs rdiff -u -r1.1358 -r1.1359 src/share/mk/bsd.own.mk

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->analyzed
State-Changed-By: rin@NetBSD.org
State-Changed-When: Thu, 24 Aug 2023 06:23:10 +0000
State-Changed-Why:
Potential fix committed. Let's see if the problem occurs again on
periodic snapshot builds.


State-Changed-From-To: analyzed->open
State-Changed-By: rin@NetBSD.org
State-Changed-When: Thu, 14 Sep 2023 05:09:58 +0000
State-Changed-Why:
Same problem observed again. Let me examine for a while...


From: "Rin Okuyama" <rin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/57241 CVS commit: src/share/mk
Date: Sat, 13 Jul 2024 03:38:13 +0000

 Module Name:	src
 Committed By:	rin
 Date:		Sat Jul 13 03:38:12 UTC 2024

 Modified Files:
 	src/share/mk: bsd.own.mk

 Log Message:
 bsd.own.mk: Switch mips to binutils 2.42

 There is no new regression observed for brief tests on OCTEON ({,n}64eb)
 as well as on MIPSSIM{,64} (all ABI combinations).

 Let us see what happens for PR toolchain/57241; I've never reproduced
 this failure locally. It may be precisely host-environment dependent,
 and therefore very hard to track :(


 To generate a diff of this commit:
 cvs rdiff -u -r1.1387 -r1.1388 src/share/mk/bsd.own.mk

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Taylor R Campbell <riastradh@NetBSD.org>
To: gnats-bugs@NetBSD.org, netbsd-bugs@NetBSD.org
Cc: 
Subject: Re: toolchain/57241: mips64el--netbsd-objcopy core dumps randomly
Date: Tue, 23 Jul 2024 03:22:57 +0000

 We dug into this the other day.

 It's not actually mips64el--netbsd-objcopy that's crashing -- it's
 mips64el--netbsd-install that's crashing, according to all the logs I
 can find.  (The core file name is truncated to 16 characters by struct
 proc::p_comm, hence `mips64el--netbsd', and there is an adjacent
 warning from objcopy which appears to be unrelated.  Perhaps we should
 have a separate PR to track the problem of unclear core provenance.)
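
To illustrate why the core files cannot distinguish the two tools, here is a minimal sketch of the truncation. The 16-character limit is taken from the paragraph above; the constant and helper names are hypothetical.

```python
# Assumed limit, taken from the "16 characters" stated in the text above.
COMM_LIMIT = 16


def core_command_name(program):
    """Return the command name as it would appear after truncation."""
    return program[:COMM_LIMIT]


tools = ["mips64el--netbsd-install", "mips64el--netbsd-objcopy"]
names = {tool: core_command_name(tool) for tool in tools}

if __name__ == "__main__":
    for tool, name in names.items():
        print(f"{tool} -> {name}")
```

Both tool names share the same first 16 characters, so the truncated name alone cannot say which program dumped core.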

 I reviewed the debug data from a core dump of one of the crashes, and it
 crashed here:

 #0  0x000000000040c631 in be32dec (buf=0x72940e194400)
     at /usr/include/sys/endian.h:221
 221     __GEN_ENDIAN_DEC(32, be)
 (gdb) bt
 #0  0x000000000040c631 in be32dec (buf=0x72940e194400)
     at /usr/include/sys/endian.h:221
 #1  0x000000000040c758 in SHA256_Transform (context=0x7f7fff52bd50,
     data=0x72940e194400)
     at /home/source/ab/HEAD/src/tools/compat/../../common/lib/libc/hash/sha2/sha2.c:388
 #2  0x000000000040cbd8 in SHA256_Update (context=0x7f7fff52bd50,
     data=0x72940e194400 <error: Cannot access memory at address 0x72940e194400>, len=2535104)
     at /home/source/ab/HEAD/src/tools/compat/../../common/lib/libc/hash/sha2/sha2.c:487
 #3  0x00000000004049b5 in copy (from_fd=7, from_name=0x7f7fff53d70a "ipftest",
     to_fd=6,
     to_name=0x7f7fff53bf30 "/home/builds/ab/HEAD/evbmips-mips64el/202303111730Z-dest/usr/sbin/ipftest.inst.sUVfSU", size=3179200)
     at /home/source/ab/HEAD/src/tools/binstall/../../usr.bin/xinstall/xinstall.c:927
 #4  0x000000000040448e in install (from_name=0x7f7fff53d70a "ipftest",
     to_name=0x7f7fff53bf30 "/home/builds/ab/HEAD/evbmips-mips64el/202303111730Z-dest/usr/sbin/ipftest.inst.sUVfSU", flags=0)
     at /home/source/ab/HEAD/src/tools/binstall/../../usr.bin/xinstall/xinstall.c:745
 #5  0x00000000004038ea in main (argc=2, argv=0x7f7fff53cf38)
     at /home/source/ab/HEAD/src/tools/binstall/../../usr.bin/xinstall/xinstall.c:434

 This occurs when xinstall computes the SHA-256 hash of the file it's
 installing, which it has just mmapped for reading, here:

 https://nxr.netbsd.org/xref/src/usr.bin/xinstall/xinstall.c?r=1.126#927

 The crash is SIGBUS, which on x86 almost certainly means that the
 mmapped file was truncated while it was being read.
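
The failure mode described above can be sketched outside the build. The example below is an illustration only, not the xinstall code: a child process maps a scratch file, the file is truncated while the mapping is still live, and the child then touches the mapped memory. On typical Unix systems the child is expected to be killed by a signal. The function name `run_truncation_demo` and the 64 KiB file size are assumptions made for the demo.

```python
import os
import signal
import subprocess
import sys
import tempfile

# Child program: map a file, truncate it behind the mapping's back, then
# touch the mapped memory, as a hash routine reading the buffer would.
CHILD = r"""
import mmap, os, sys
fd = os.open(sys.argv[1], os.O_RDWR)
size = os.fstat(fd).st_size            # size observed before the "race"
m = mmap.mmap(fd, size, access=mmap.ACCESS_READ)
os.ftruncate(fd, 0)                    # stands in for the concurrent rewrite
m[0]                                   # this page now lies past end of file
"""


def run_truncation_demo():
    """Return the name of the signal that killed the child, or None."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"x" * 65536)
        path = f.name
    try:
        proc = subprocess.run([sys.executable, "-c", CHILD, path])
    finally:
        os.unlink(path)
    if proc.returncode < 0:
        return signal.Signals(-proc.returncode).name
    return None


if __name__ == "__main__":
    print(run_truncation_demo())
```

In the demo the truncation happens in the same process for determinism; in the build the truncation would come from a different, concurrently running job.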

 This is probably a consequence of our kooky bsd.prog.mk rules for
 handling debug data, where:

 1. the rule for the program `foo' is to link foo with debug data
 2. the rule for `foo.debug' is to copy the debug data out, and then
    strip debug data out of foo _in place_

 This is a bug on its own and we should fix it, like we fixed it in
 bsd.lib.mk -- have one rule to generate foo.full with everything, a
 separate rule to derive debug data from it in foo.debug, and a third
 rule to derive the stripped program from it in foo.

 This could explain the crashes by the following chain of events:

 (a) dependall depends on foo so it builds foo
 (b) dependall depends on foo.debug which depends on foo so it happens
     later but, by rewriting foo in place, updates foo's mtime
 (c) install depends on ${DESTDIR}/usr/bin/foo and
     ${DESTDIR}/usr/libdata/debug/usr/bin/foo.debug so it builds them
     in parallel:
      i. ${DESTDIR}/usr/bin/foo depends on foo which looks up-to-date,
         so make runs install
     ii. ${DESTDIR}/usr/libdata/debug/usr/bin/foo.debug depends on
         foo.debug, _which looks out-of-date because of foo's mtime_,
         so it runs the foo.debug recipe again which rewrites foo in
         place again

 Thus, (c)(i) runs install in parallel with (c)(ii) which may truncate
 foo (in the process of rebuilding foo.debug).
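
The difference between the in-place recipe and the three-rule layout described above can be sketched by listing which files each recipe writes. This is a hypothetical model of the rules, not the actual makefile logic; the dictionaries and the helper `side_effect_writes` are made up for illustration.

```python
# Hypothetical model: each target lists the files its recipe reads and writes.
IN_PLACE = {
    "foo": {"reads": ["foo.o"], "writes": ["foo"]},
    # The foo.debug recipe also strips foo in place.
    "foo.debug": {"reads": ["foo"], "writes": ["foo.debug", "foo"]},
}

THREE_RULE = {
    "foo.full": {"reads": ["foo.o"], "writes": ["foo.full"]},
    "foo.debug": {"reads": ["foo.full"], "writes": ["foo.debug"]},
    "foo": {"reads": ["foo.full"], "writes": ["foo"]},
}


def side_effect_writes(rules):
    """List (target, file) pairs where a recipe writes a file that is not
    its own target. Such writes are invisible to the dependency graph."""
    return [
        (target, written)
        for target, rule in rules.items()
        for written in rule["writes"]
        if written != target
    ]


if __name__ == "__main__":
    print("in place:  ", side_effect_writes(IN_PLACE))
    print("three rule:", side_effect_writes(THREE_RULE))
```

In this model only the in-place layout has a recipe that modifies another target's file, which is the precondition for an install job reading foo while a foo.debug job rewrites it.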

 But it's weird that this only happens on mips64 -- not even mipsn64,
 it seems.  And, from the records available to me, it's happened during
 the install phase of:

 - external/bsd/ipf/bin/ipftest (3x)
 - usr/sbin/crash (1x)
 - usr/bin/systat (1x)

 Three times in ipftest is pretty suspicious.

 One feature that these directories have in common -- which is also
 peculiar to mips64 builds, other than sgimips -- is the use of
 compat/exec.mk:

 https://nxr.netbsd.org/xref/src/compat/exec.mk?r=1.7

 That is, on mips64 builds, which normally use the n32 ABI, these
 programs are built with the 64-bit ABI instead.

 I don't have a theory for how compat/exec.mk could substantially raise
 the probability of races in the bsd.prog.mk foo/foo.debug rules, but
 the evidence suggests something about it does.

 So while we should obviously fix bsd.prog.mk, like we fixed bsd.lib.mk
 already, I suspect there's something else afish with compat/exec.mk
 that we need to understand too.  And until we've spent some more time
 to diagnose compat/exec.mk, and ideally figured that out, I think I'd
 like to leave the bsd.prog.mk bug in so we don't paper over the
 symptoms.

From: Taylor R Campbell <riastradh@NetBSD.org>
To: Roland Illig <rillig@NetBSD.org>
Cc: gnats-bugs@NetBSD.org, netbsd-bugs@NetBSD.org
Subject: Re: toolchain/57241: mips64el--netbsd-install core dumps randomly
Date: Fri, 18 Apr 2025 16:26:30 +0000

 Hi rillig, I wonder whether you might be able to help solve a
 make(1)-related mystery?

 I'm drafting a change to fix the parallel-safety of the foo.debug
 recipe in bsd.prog.mk (a little finicky because it has nontrivial
 interaction with other makefiles like libexec/ld.elf_so/Makefile).

 But before I commit it, I want to make sure I understand the
 underlying cause of PR 57241.

 The immediate symptom is that, e.g., `mips64el--netbsd-install ...
 ipftest ${DESTDIR}/usr/sbin/ipftest' is crashing because its input
 file has been truncated between fstat/mmap and access to file content.
 And it looks like there's a concurrent objcopy from the .debug recipe
 which has truncated ipftest to rewrite it in place.

 But I can't figure out why the concurrent objcopy is happening only in
 the mips64 builds of certain programs like ipftest(8) and crash(8),
 which seem to have in common the use of compat/exec.mk.  (These are
 programs that run with the n64 ABI, in order to read out kernel guts
 on mips64 CPUs, in a userland where _most_ programs run with the n32
 ABI instead because it's more compact and they usually have <4GB RAM.)

 And so I think I need a make(1) wizard to help.


 Here's an example:

 https://releng.netbsd.org/builds/HEAD/202504161330Z/evbmips-mips64el.build.failed
 https://web.archive.org/web/20250418154748/https://releng.netbsd.org/builds/HEAD/202504161330Z/evbmips-mips64el.build.failed

 [1]   Bus error (core dumped) /home/builds/ab/HEAD/evbmips-mips64el/20250416...
 --- /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-dest/usr/sbin/ipftest ---
 ...
 *** Failed target: /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-dest/usr/sbin/ipftest
 *** In directory: /home/source/ab/HEAD/src/external/bsd/ipf/bin/ipftest
 *** Failed commands:
 	${_MKTARGET_INSTALL}
 	=> @# "install " /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-dest/usr/sbin/ipftest
 	${INSTALL_FILE} -o ${BINOWN} -g ${BINGRP} -m ${BINMODE}  ${STRIPFLAG} ${.ALLSRC} ${.TARGET}
 	=> /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-tools/bin/mips64el--netbsd-install -U -M /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-dest/METALOG -D /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-dest -h sha256 -N /home/source/ab/HEAD/src/etc -c  -r -o root -g wheel -m 555   ipftest /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-dest/usr/sbin/ipftest
 *** [/home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-dest/usr/sbin/ipftest] Error code 138
 ...
 /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-tools/bin/mips64el--netbsd-objcopy: libcrypto.so.15.0.debug: section `.note.netbsd.pax' can't be allocated in segment 0
 LOAD: .MIPS.abiflags .reginfo .dynamic .hash .dynsym .dynstr .gnu.version .gnu.version_d .gnu.version_r .rel.dyn .init .text .MIPS.stubs .fini .rodata .eh_frame_hdr .eh_frame .note.netbsd.ident .note.netbsd.pax

 The last part -- a warning message about which I just filed another
 bug, PR port-mips/59320: objcopy: section `.note.netbsd.pax' can't be
 allocated in segment 0 -- is evidence that make(1) is still running
 the buggy ipftest.debug recipe which rewrites ipftest in place:

     507 ${_PROGDEBUG.${_P}}: ${_P}
     508 	${_MKTARGET_CREATE}
     509 	( ${OBJCOPY} --only-keep-debug --compress-debug-sections \
     510 	    ${_P} ${_PROGDEBUG.${_P}} && \
     511 	  ${OBJCOPY} --strip-debug -p -R .gnu_debuglink \
     512 		--add-gnu-debuglink=${_PROGDEBUG.${_P}} ${_P} \
     513 	) || (rm -f ${_PROGDEBUG.${_P}}; false)

 https://nxr.netbsd.org/xref/src/share/mk/bsd.prog.mk?r=1.355#509


 My best guess was that:

 1. When doing dependall, the ipftest.debug recipe above:
    (a) creates ipftest.debug with objcopy at time t0,
    (b) a moment later, modifies ipftest in place with objcopy, at time
        t1 = t0 + eps > t0.

 2. When doing install, make(1) finds that ${DESTDIR}/usr/sbin/ipftest
    and ${DESTDIR}/usr/libdata/debug/usr/sbin/ipftest.debug are both
    out of date, so it tries to run, _in parallel_:

    (a) mips64el--netbsd-install ... ipftest ${DESTDIR}/usr/sbin/ipftest,
        because ipftest exists and is up-to-date

    (b) the .debug recipe above again, because ipftest exists and is
        up-to-date with timestamp t1, but ipftest.debug exists and is
        out-of-date with timestamp t0 < t1

 Except this hypothesis doesn't make sense, for two reasons:

 - The problem empirically _only_ happens in mips64 builds with a few
   programs, and nothing in the hypothesis above is restricted to that.

 - We pass `-p' (--preserve-dates) to objcopy(1) in step (1), so it
   restores the mtime of the input file after truncating and
   overwriting it -- and so by the time of make install, it should look
   like ipftest.debug is up-to-date.
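
The timestamp argument above can be sketched with a toy model. It assumes a simplified make(1) rule (rebuild when any source is strictly newer than the target) and arbitrary integer times; `out_of_date`, `run_debug_recipe` and `debug_rerun_needed` are hypothetical helpers, not make or objcopy internals.

```python
def out_of_date(target_mtime, source_mtimes):
    # Assumed simplification of make(1): rebuild when any source is
    # strictly newer than the target.
    return any(src > target_mtime for src in source_mtimes)


def run_debug_recipe(mtimes, now, preserve_dates):
    """Model the .debug recipe: write ipftest.debug, then rewrite ipftest."""
    original = mtimes["ipftest"]
    mtimes["ipftest.debug"] = now  # t0: objcopy --only-keep-debug
    # t1: objcopy --strip-debug rewrites ipftest in place; with -p the
    # original mtime is restored afterwards.
    mtimes["ipftest"] = original if preserve_dates else now + 1
    return mtimes


def debug_rerun_needed(preserve_dates):
    """Would a later make run consider ipftest.debug out of date?"""
    mtimes = run_debug_recipe({"ipftest": 100}, now=200,
                              preserve_dates=preserve_dates)
    return out_of_date(mtimes["ipftest.debug"], [mtimes["ipftest"]])


if __name__ == "__main__":
    print("without -p:", debug_rerun_needed(False))
    print("with -p:   ", debug_rerun_needed(True))
```

In this model the rerun is only triggered when the modification time is not preserved, which is why the `-p` flag makes the hypothesis hard to sustain as stated.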

 So I can't figure out why, under these circumstances, make install is
 trying to rerun the .debug recipe.  And I can't reproduce it on my
 laptop.

 I tried reading out `make -d g1' and `make -d m' output but it's kind
 of inscrutable to me (I thought `-d g1' would show a graph, with nodes
 and edges for dependency relations, but I can't figure out how to read
 the edges in it).

From: Christos Zoulas <christos@zoulas.com>
To: gnats-bugs@netbsd.org
Cc: toolchain-manager@netbsd.org,
 gnats-admin@netbsd.org,
 netbsd-bugs@netbsd.org,
 "martin@netbsd.org" <martin@NetBSD.org>
Subject: Re: toolchain/57241: mips64el--netbsd-install core dumps randomly
Date: Fri, 18 Apr 2025 14:34:00 -0400

 Why not build the prog.debug file as part of the prog target?

 christos

 --- bsd.prog.mk 28 Nov 2021 15:49:36 -0000      1.340
 +++ bsd.prog.mk 18 Apr 2025 18:33:21 -0000
 @@ -529,6 +529,12 @@
  .if ${MKSTRIPIDENT} != "no"
         ${OBJCOPY} -R .ident ${.TARGET}
  .endif
 +.if defined(_PROGDEBUG.${_P})
 +       (  ${OBJCOPY} --only-keep-debug ${_P} ${_PROGDEBUG.${_P}} \
 +       && ${OBJCOPY} --strip-debug -p -R .gnu_debuglink \
 +               --add-gnu-debuglink=${_PROGDEBUG.${_P}} ${_P} \
 +       ) || (rm -f ${_PROGDEBUG.${_P}}; false)
 +.endif
  
  CLEANFILES+=   ${_P}.d
  .if exists(${_P}.d)
 @@ -554,21 +560,18 @@
  .if ${MKSTRIPIDENT} != "no"
         ${OBJCOPY} -R .ident ${.TARGET}
  .endif
 -.endif # !commands(${_P})
 -.endif # USE_COMBINE
 -
 -${_P}.ro: ${OBJS.${_P}} ${_DPADD.${_P}}
 -       ${_MKTARGET_LINK}
 -       ${CC} ${LDFLAGS:N-pie} -nostdlib -r -Wl,-dc -o ${.TARGET} ${OBJS.${_P}}
 -
  .if defined(_PROGDEBUG.${_P})
 -${_PROGDEBUG.${_P}}: ${_P}
 -       ${_MKTARGET_CREATE}
         (  ${OBJCOPY} --only-keep-debug ${_P} ${_PROGDEBUG.${_P}} \
         && ${OBJCOPY} --strip-debug -p -R .gnu_debuglink \
                 --add-gnu-debuglink=${_PROGDEBUG.${_P}} ${_P} \
         ) || (rm -f ${_PROGDEBUG.${_P}}; false)
  .endif
 +.endif # !commands(${_P})
 +.endif # USE_COMBINE
 +
 +${_P}.ro: ${OBJS.${_P}} ${_DPADD.${_P}}
 +       ${_MKTARGET_LINK}
 +       ${CC} ${LDFLAGS:N-pie} -nostdlib -r -Wl,-dc -o ${.TARGET} ${OBJS.${_P}}
  
  .endif # defined(OBJS.${_P}) && !empty(OBJS.${_P})                     # }
  


 > On Apr 18, 2025, at 12:30=E2=80=AFPM, Taylor R Campbell via gnats =
 <gnats-admin@netbsd.org> wrote:
 >=20
 > The following reply was made to PR toolchain/57241; it has been noted =
 by GNATS.
 >=20
 > From: Taylor R Campbell <riastradh@NetBSD.org>
 > To: Roland Illig <rillig@NetBSD.org>
 > Cc: gnats-bugs@NetBSD.org, netbsd-bugs@NetBSD.org
 > Subject: Re: toolchain/57241: mips64el--netbsd-install core dumps =
 randomly
 > Date: Fri, 18 Apr 2025 16:26:30 +0000
 >=20
 > Hi rillig, I wonder whether you might be able to help solve a
 > make(1)-related mystery?
 >=20
 > I'm drafting a change to fix the parallel-safety of the foo.debug
 > recipe in bsd.prog.mk (a little finicky because it has nontrivial
 > interaction with other makefiles like libexec/ld.elf_so/Makefile).
 >=20
 > But before I commit it, I want to make sure I understand the
 > underlying cause of PR 57241.
 >=20
 > The immediate symptom is that, e.g., `mips64el--netbsd-install ...
 > ipftest ${DESTDIR}/usr/sbin/ipftest' is crashing because its input
 > file has been truncated between fstat/mmap and access to file content.
 > And it looks like there's a concurrent objcopy from the .debug recipe
 > which has truncated ipftest to rewrite it in place.
 >=20
 > But I can't figure out why the concurrent objcopy is happening only in
 > the mips64 builds of certain programs like ipftest(8) and crash(8),
 > which seem to have in common the use of compat/exec.mk.  (These are
 > programs that run with the n64 ABI, in order to read out kernel guts
 > on mips64 CPUs, in a userland where _most_ programs run with the n32
 > ABI instead because it's more compact and they usually have <4GB RAM.)
 >=20
 > And so I think I need a make(1) wizard to help.
 >=20
 >=20
 > Here's an example:
 >=20
 > =
 https://releng.netbsd.org/builds/HEAD/202504161330Z/evbmips-mips64el.build=
 .=3D
 > failed
 > =
 https://web.archive.org/web/20250418154748/https://releng.netbsd.org/build=
 s=3D
 > /HEAD/202504161330Z/evbmips-mips64el.build.failed
 >=20
 > [1]   Bus error (core dumped) =
 /home/builds/ab/HEAD/evbmips-mips64el/2025041=3D
 > 6...
 > --- =
 /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-dest/usr/sbin/ipfte=3D=

 > st ---
 > ...
 > *** Failed target: =
 /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-dest=3D
 > /usr/sbin/ipftest
 > *** In directory: =
 /home/source/ab/HEAD/src/external/bsd/ipf/bin/ipftest
 > *** Failed commands:
 > 	${_MKTARGET_INSTALL}
 > 	=3D3D> @# "install " =
 /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-des=3D
 > t/usr/sbin/ipftest
 > 	${INSTALL_FILE} -o ${BINOWN} -g ${BINGRP} -m ${BINMODE}  =
 ${STRIPFLAG} ${.A=3D
 > LLSRC} ${.TARGET}
 > 	=3D3D> =
 /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-tools/bin/mips64e=3D
 > l--netbsd-install -U -M =
 /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z=3D
 > -dest/METALOG -D =
 /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-dest -=3D
 > h sha256 -N /home/source/ab/HEAD/src/etc -c  -r -o root -g wheel -m =
 555   i=3D
 > pftest =
 /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-dest/usr/sbin/ip=3D
 > ftest
 > *** =
 [/home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-dest/usr/sbin/ipft=3D=

 > est] Error code 138
 > ...
 > =
 /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-tools/bin/mips64el--ne=
 t=3D
 > bsd-objcopy: libcrypto.so.15.0.debug: section `.note.netbsd.pax' can't =
 be a=3D
 > llocated in segment 0
 > LOAD: .MIPS.abiflags .reginfo .dynamic .hash .dynsym .dynstr =
 .gnu.version .=3D
 > gnu.version_d .gnu.version_r .rel.dyn .init .text .MIPS.stubs .fini =
 .rodata=3D
 >  .eh_frame_hdr .eh_frame .note.netbsd.ident .note.netbsd.pax
 >
 > The last part -- a warning message about which I just filed another
 > bug, PR port-mips/59320: objcopy: section `.note.netbsd.pax' can't be
 > allocated in segment 0 -- is evidence that make(1) is still running
 > the buggy ipftest.debug recipe which rewrites ipftest in place:
 >
 >     507 ${_PROGDEBUG.${_P}}: ${_P}
 >     508 	${_MKTARGET_CREATE}
 >     509 	( ${OBJCOPY} --only-keep-debug --compress-debug-sections \
 >     510 	    ${_P} ${_PROGDEBUG.${_P}} && \
 >     511 	  ${OBJCOPY} --strip-debug -p -R .gnu_debuglink \
 >     512 		--add-gnu-debuglink=${_PROGDEBUG.${_P}} ${_P} \
 >     513 	) || (rm -f ${_PROGDEBUG.${_P}}; false)
 >
 > https://nxr.netbsd.org/xref/src/share/mk/bsd.prog.mk?r=1.355#509
 >
 >
 > My best guess was that:
 >
 > 1. When doing dependall, the ipftest.debug recipe above:
 >    (a) creates ipftest.debug with objcopy at time t0,
 >    (b) a moment later, modifies ipftest in place with objcopy, at time
 >        t1 = t0 + eps > t0.
 >
 > 2. When doing install, make(1) finds that ${DESTDIR}/usr/sbin/ipftest
 >    and ${DESTDIR}/usr/libdata/debug/usr/sbin/ipftest.debug are both
 >    out of date, so it tries to run, _in parallel_:
 >
 >    (a) mips64el--netbsd-install ... ipftest ${DESTDIR}/usr/sbin/ipftest,
 >        because ipftest exists and is up-to-date
 >
 >    (b) the .debug recipe above again, because ipftest exists and is
 >        up-to-date with timestamp t1, but ipftest.debug exists and is
 >        out-of-date with timestamp t0 < t1
 >
 > Except this hypothesis doesn't make sense, for two reasons:
 >
 > - The problem empirically _only_ happens in mips64 builds with a few
 >   programs, and nothing in the hypothesis above is restricted to that.
 >
 > - We pass `-p' (--preserve-dates) to objcopy(1) in step (1), so it
 >   restores the mtime of the input file after truncating and
 >   overwriting it -- and so by the time of make install, it should look
 >   like ipftest.debug is up-to-date.
 >
 > So I can't figure out why, under these circumstances, make install is
 > trying to rerun the .debug recipe.  And I can't reproduce it on my
 > laptop.
 >
 > I tried reading out `make -d g1' and `make -d m' output but it's kind
 > of inscrutable to me (I thought `-d g1' would show a graph, with nodes
 > and edges for dependency relations, but I can't figure out how to read
 > the edges in it).
 >


 --Apple-Mail=_F6441B03-E773-42AB-93E9-466FAD395E7C--

From: Taylor R Campbell <riastradh@NetBSD.org>
To: Christos Zoulas <christos@zoulas.com>
Cc: gnats-bugs@netbsd.org, toolchain-manager@netbsd.org,
	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, martin@NetBSD.org
Subject: Re: toolchain/57241: mips64el--netbsd-install core dumps randomly
Date: Fri, 18 Apr 2025 18:47:52 +0000

 > Date: Fri, 18 Apr 2025 14:34:00 -0400
 > From: Christos Zoulas <christos@zoulas.com>
 > 
 > Why not build the prog.debug file as part of the prog target?

 We can do that, but it's probably better to use a separate
 intermediate from which both the debug data and the debug-stripped
 executable are built, like we do in bsd.lib.mk.

 I have a patch that does this, and it works, and it will probably
 paper over the mips build issue.  But I think there has to be
 something more going on with the mips build issue -- possibly another
 bug that might bite us in different ways -- and I want to figure out
 what it is before losing the context that provokes it.

From: Roland Illig <roland.illig@gmx.de>
To: Taylor R Campbell <riastradh@NetBSD.org>
Cc: Roland Illig <rillig@NetBSD.org>, gnats-bugs@NetBSD.org,
	netbsd-bugs@NetBSD.org
Subject: Re: toolchain/57241: mips64el--netbsd-install core dumps randomly
Date: Sat, 19 Apr 2025 07:10:18 +0200 (GMT+02:00)

 ------=_Part_4_221469805.1745039418197
 Content-Type: text/plain; charset=UTF-8
 Content-Transfer-Encoding: 7bit

 Hi Taylor,

 Make supports target-local variables, so you could modify the OBJCOPY command for just a few selected targets, without touching bsd.prog.mk.

 You could also ask releng to run the mips64el build in ktrace mode.

 And you could try make's -T option to see where one job starts and another one ends.

 The edges in the "-d g1" graph are printed in the same "target: source" format as in makefiles.

 Roland

 ------=_Part_4_221469805.1745039418197--

From: "Greg A. Woods" <woods@planix.ca>
To: NetBSD GNATS <gnats-bugs@NetBSD.org>
Cc: 
Subject: Re: toolchain/57241: mips64el--netbsd-install core dumps randomly
Date: Tue, 02 Dec 2025 20:45:33 -0800

 --pgp-sign-Multipart_Tue_Dec__2_20:45:30_2025-1
 Content-Type: text/plain; charset=US-ASCII

 I'm pretty sure I encountered this same problem earlier today while
 running a cross-build on my x86_64 macOS machine:

 Bus error: 10
 --- /Users/woods/build/woods/very.local/trunk-amd64-inet6-destdir/usr/bin/file ---

 *** Failed target: /Users/woods/build/woods/very.local/trunk-amd64-inet6-destdir/usr/bin/file
 *** In directory: /Volumes/work/woods/g-NetBSD-src/external/bsd/file/bin
 *** Failed commands:
         ${_MKTARGET_INSTALL}
         => @echo '   ' "install " /Users/woods/build/woods/very.local/trunk-amd64-inet6-destdir/usr/bin/file
         ${INSTALL_FILE} -o ${BINOWN} -g ${BINGRP} -m ${BINMODE}  ${STRIPFLAG} ${.ALLSRC} ${.TARGET}
         => /Users/woods/build/woods/very.local/trunk-Darwin-25.1.0-i386-amd64-tools/bin/x86_64--netbsd-install -U -M /Users/woods/build/woods/very.local/trunk-amd64-inet6-destdir/METALOG -D /Users/woods/build/woods/very.local/trunk-amd64-inet6-destdir -h sha256 -N /Volumes/work/woods/g-NetBSD-src/etc -c -p -r -o root -g wheel -m 555  -s file /Users/woods/build/woods/very.local/trunk-amd64-inet6-destdir/usr/bin/file
 *** [/Users/woods/build/woods/very.local/trunk-amd64-inet6-destdir/usr/bin/file] Error code 138

 nbmake[8]: stopped making "install" in /Volumes/work/woods/g-NetBSD-src/external/bsd/file/bin


 This is the first time I've seen it, and I've done dozens of successful
 builds on this machine.  But if this is a parallel make problem, as it
 very well could be, then there you go -- they can be rare and hard to
 spot!  I typically run "build.sh" with "-j 24" on this machine; the
 objdir/destdir etc. are on SSD (source is on an external disk array),
 and it has 16 physical CPU cores (so 32 logical CPUs).  So it is pretty
 good at tripping over such problems, and I've encountered more than
 one, though usually in more obvious rules.  It's never a given that
 they will occur.

 It might be nice to fix the underlying core dump too, but that looks
 like it might be a bigger job.

 --
 					Greg A. Woods <gwoods@acm.org>

 Kelowna, BC     +1 250 762-7675           RoboHack <woods@robohack.ca>
 Planix, Inc. <woods@planix.com>     Avoncote Farms <woods@avoncote.ca>

 --pgp-sign-Multipart_Tue_Dec__2_20:45:30_2025-1--

From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/57241 CVS commit: src
Date: Wed, 17 Dec 2025 01:32:55 +0000

 Module Name:	src
 Committed By:	riastradh
 Date:		Wed Dec 17 01:32:55 UTC 2025

 Modified Files:
 	src/libexec/ld.elf_so: Makefile
 	src/share/mk: bsd.prog.mk

 Log Message:
 bsd.prog.mk: Fix parallel builds of debug data.

 Previously, we had one rule to generate foo, and another rule to
 derive foo.debug from it with objcopy -- and then rewrite foo _in
 place_ to strip the debug data with objcopy.

 This is wrong -- one rule should never overwrite another rule's
 target; this violates the contract with make(1), and can lead it to
 run rules in parallel on files that are changing, which in turn can
 lead the rules to behave mysteriously or crash.

 It's still not clear why in our autobuilds, the only cases where we
 saw these crashes were in mips64 builds of programs that use
 compat/exec.mk (mainly external/bsd/ipf/bin/ipftest, but also
 usr.sbin/crash and usr.bin/systat).  But the bug this change fixes is
 well-understood, and now someone has observed it in a non-mips build,
 so I'm throwing in the towel on figuring out what makes these
 particular programs more likely to trigger the problem.

 PR toolchain/57241: mips64el--netbsd-install core dumps randomly
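
 A minimal sketch of the alternative to the in-place rewrite, assuming a
 hypothetical program `foo' whose foo.o is built by the default rules.
 This is an illustration only, not the text of the committed bsd.prog.mk
 change; the `.link' suffix is the one named in a later followup in this
 PR, and the objcopy flags are trimmed for brevity.  Each rule writes only
 its own target and merely reads the intermediate:

```make
# Sketch only: PROG=foo, foo.o and the reduced flag set are assumptions.
PROG=		foo
OBJCOPY?=	objcopy

all: ${PROG} ${PROG}.debug

# 1. Link with debug data into an intermediate that no other rule modifies.
${PROG}.link: ${PROG}.o
	${CC} ${LDFLAGS} -o ${.TARGET} ${PROG}.o

# 2. Extract only the debug data; reads the intermediate, writes ${PROG}.debug.
${PROG}.debug: ${PROG}.link
	${OBJCOPY} --only-keep-debug ${PROG}.link ${.TARGET}

# 3. Produce the stripped program; again only reads the intermediate.
#    --add-gnu-debuglink reads ${PROG}.debug, hence the dependency on it.
${PROG}: ${PROG}.link ${PROG}.debug
	${OBJCOPY} --strip-debug --add-gnu-debuglink=${PROG}.debug \
	    ${PROG}.link ${.TARGET}
```

 With this shape, a parallel make can run the install of ${PROG} and the
 recipe for ${PROG}.debug at the same time without either one seeing a
 file that is being truncated and rewritten.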


 To generate a diff of this commit:
 cvs rdiff -u -r1.151 -r1.152 src/libexec/ld.elf_so/Makefile
 cvs rdiff -u -r1.359 -r1.360 src/share/mk/bsd.prog.mk

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/57241 CVS commit: src/share/mk
Date: Wed, 17 Dec 2025 01:33:21 +0000

 Module Name:	src
 Committed By:	riastradh
 Date:		Wed Dec 17 01:33:21 UTC 2025

 Modified Files:
 	src/share/mk: bsd.kmodule.mk

 Log Message:
 bsd.kmodule.mk: Fix parallelism in debug data generation recipes.

 In the recipe for foo.kmod.debug, don't overwrite foo.kmod in place.
 Instead, generate foo.kmod.link with debug data included, and then
 derive foo.kmod (no debug data) and foo.kmod.debug (only debug data)
 from it in separate recipes.

 Same issue as we had with bsd.lib.mk and bsd.prog.mk in the past.

 PR toolchain/57241: mips64el--netbsd-install core dumps randomly


 To generate a diff of this commit:
 cvs rdiff -u -r1.86 -r1.87 src/share/mk/bsd.kmodule.mk


From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/57241 CVS commit: src/tests
Date: Wed, 17 Dec 2025 23:43:52 +0000

 Module Name:	src
 Committed By:	riastradh
 Date:		Wed Dec 17 23:43:52 UTC 2025

 Modified Files:
 	src/tests/lib/csu: Makefile
 	src/tests/libexec/ld.elf_so: Makefile

 Log Message:
 tests/lib/csu, tests/libexec/ld.elf_so: Apply CTFMERGE=: workaround.

 The workaround for

 PR toolchain/59364: ctf tools needs update

 required an update after fixing

 PR toolchain/57241: mips64el--netbsd-install core dumps randomly

 by splitting the recipes for ${PROG} and ${PROG}.debug into an
 intermediate ${PROG}.link to avoid overwriting ${PROG} inside the
 recipe for ${PROG}.debug.

 This is kludgey (writing the `.link' suffix into a Makefile isn't
 great) but I see only two cases of it so this'll do for now.


 To generate a diff of this commit:
 cvs rdiff -u -r1.13 -r1.14 src/tests/lib/csu/Makefile
 cvs rdiff -u -r1.32 -r1.33 src/tests/libexec/ld.elf_so/Makefile


From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/57241 CVS commit: src/share/mk
Date: Thu, 18 Dec 2025 04:57:55 +0000

 Module Name:	src
 Committed By:	riastradh
 Date:		Thu Dec 18 04:57:55 UTC 2025

 Modified Files:
 	src/share/mk: bsd.kmodule.mk bsd.lib.mk bsd.prog.mk

 Log Message:
 bsd.*.mk: Use objcopy without -p to strip debug data.

 No need to preserve the date with -p/--preserve-dates -- the only
 meaningful effect it has is to cause make(1) to rerun it every time,
 because make(1) treats exactly the same date as out-of-date.

 Followup after:

 PR toolchain/57241: mips64el--netbsd-install core dumps randomly
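
 One way to check the timestamp claim locally is a throwaway makefile
 like the one below.  It is a sketch under assumptions: the file names
 `src' and `out' are hypothetical, and cp followed by touch -r stands in
 for objcopy -p, which gives its output file the dates of its input file.

```make
# Throwaway experiment; `src' and `out' are hypothetical file names.
all: out

src:
	echo data > ${.TARGET}

# cp + touch -r gives `out' exactly the mtime of `src', which is what
# objcopy -p/--preserve-dates does for its output file.
out: src
	cp src ${.TARGET}
	touch -r src ${.TARGET}
	@echo "recipe for ${.TARGET} ran"
```

 Running make twice in an otherwise empty directory shows whether the
 recipe for `out' is repeated on the second run; removing the touch -r
 line models the recipe without -p.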


 To generate a diff of this commit:
 cvs rdiff -u -r1.87 -r1.88 src/share/mk/bsd.kmodule.mk
 cvs rdiff -u -r1.423 -r1.424 src/share/mk/bsd.lib.mk
 cvs rdiff -u -r1.360 -r1.361 src/share/mk/bsd.prog.mk


>Unformatted:

Copyright © 1994-2025 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.