NetBSD Problem Report #55578

From martin@duskware.de  Sun Aug 16 15:43:23 2020
Return-Path: <martin@duskware.de>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 2BE931A9239
	for <gnats-bugs@gnats.NetBSD.org>; Sun, 16 Aug 2020 15:43:23 +0000 (UTC)
From: martin@NetBSD.org
Reply-To: martin@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: make(1) error reporting is broken for -j builds
X-Send-Pr-Version: 3.95

>Number:         55578
>Category:       toolchain
>Synopsis:       make(1) error reporting is broken for -j builds
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    sjg
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Aug 16 15:45:00 +0000 2020
>Last-Modified:  Thu Jan 07 18:40:01 +0000 2021
>Originator:     Martin Husemann
>Release:        NetBSD 9.99.71
>Organization:
The NetBSD Foundation, Inc.
>Environment:
System: NetBSD thirdstage.duskware.de 9.99.71 NetBSD 9.99.71 (MODULAR) #366: Sun Aug 16 14:01:58 CEST 2020 martin@thirdstage.duskware.de:/usr/src/sys/arch/sparc64/compile/MODULAR sparc64
Architecture: sparc64
Machine: sparc64
>Description:

When aborting a build, make(1) used to say: stopped in $dir...

It no longer reports the directory of the initial error, and it is often
impossible to derive that information systematically from the build log
when using build.sh with high -j values.

We now only get the original error message(s), e.g. from gcc, spread over
tens of pages and intermixed with output from other make jobs, plus at the
end a list of completed target names. Example:

                            ^
/usr/src/sys/crypto/aes/arch/arm/aes_neon.c:195:16: error: excess elements in scalar initializer [-Werror]
      0x02,0x0C,0x0B,0x0A,0x09,0x03,0x07,0x04),
                ^~~~
/usr/src/sys/crypto/aes/arch/arm/arm_neon_imm.h:51:30: note: in definition of macro 'VQ_N_U8'
  {h,g,f,e,d,c,b,a, p,o,n,m,l,k,j,i}
                              ^
--- dependall-external ---
--- dependall-nec_vndr ---
--- dependall-tests ---
--- dependall-sbin ---
--- dependall-compat ---
--- dependall-usr.bin ---
--- dependall-csplit ---
--- dependall-external ---
--- dependall-gpl3 ---
--- dependall-libiberty ---
--- dependall-sys ---
--- dependall-evbarm ---
--- dependsalib ---
--- dependall-usr.bin ---
--- dependall-config ---
--- dependall-external ---
--- dependall-ibm-public ---
--- dependall-usr.sbin ---
--- dependall-btpand ---
--- dependall-sys ---
--- dependall-modules ---
--- dependall-compat_raid_80 ---
--- dependall-share ---
--- dependall-i18n ---
--- dependall-usr.bin ---
--- dependall-compress ---
--- dependall-libexec ---
--- dependall-telnetd ---
--- dependall-sys ---
--- dependall-compat_crypto_50 ---
--- dependall-compat_sysv_50 ---
--- dependall-bin ---
--- dependall-usr.sbin ---
--- dependall-dumpfs ---
--- dependall-sys ---
--- dependall-arch ---
--- dependall-gzboot ---
--- dependall-libexec ---
--- dependall-ld.elf_so ---
--- dependall-sys ---
--- dependall-crypto/external ---
--- dependall-libexec ---
--- dependall-lfs_cleanerd ---
--- dependall-usr.bin ---
--- dependall-crunch ---
--- dependall-external ---
--- dependall-gpl2 ---
--- dependall-usr.sbin ---
--- dependall-dumplfs ---
--- dependall-external ---
--- dependall-apache2 ---
--- dependall-bsd ---
--- dependall-byacc ---
--- dependall-games ---
--- dependall-larn ---
--- dependall-external ---
--- dependall-mpl ---

ERROR: Failed to make distribution
*** BUILD ABORTED ***


>How-To-Repeat:

./build.sh -j 24

>Fix:
n/a

>Release-Note:

>Audit-Trail:
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: toolchain/55578: make(1) error reporting is broken for -j builds
Date: Sun, 16 Aug 2020 17:50:06 +0200

 On some build failures I see a

 stopped in: $dir

 but in the case I cited earlier in this PR it was missing. I am not sure
 when it gets lost and when it is printed.

 Martin

Responsible-Changed-From-To: toolchain-manager->sjg
Responsible-Changed-By: rillig@NetBSD.org
Responsible-Changed-When: Tue, 01 Dec 2020 17:55:58 +0000
Responsible-Changed-Why:
Over to maintainer.
https://github.com/NetBSD/src/commit/5463c7ca1692e419c22c74f0f1557635daab2885
https://mail-index.netbsd.org/source-changes/2020/06/19/msg118482.html


From: "Roland Illig" <rillig@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/55578 CVS commit: src
Date: Tue, 1 Dec 2020 17:50:04 +0000

 Module Name:	src
 Committed By:	rillig
 Date:		Tue Dec  1 17:50:04 UTC 2020

 Modified Files:
 	src/distrib/sets/lists/tests: mi
 	src/usr.bin/make/unit-tests: Makefile
 Added Files:
 	src/usr.bin/make/unit-tests: jobs-error-indirect.exp
 	    jobs-error-indirect.mk jobs-error-nested-make.exp
 	    jobs-error-nested-make.mk jobs-error-nested.exp
 	    jobs-error-nested.mk

 Log Message:
 make(1): add tests for suppressing "stopped in"

 These tests demonstrate the unwanted behavior described in PR bin/55578
 and PR bin/55832.


 To generate a diff of this commit:
 cvs rdiff -u -r1.979 -r1.980 src/distrib/sets/lists/tests/mi
 cvs rdiff -u -r1.231 -r1.232 src/usr.bin/make/unit-tests/Makefile
 cvs rdiff -u -r0 -r1.1 src/usr.bin/make/unit-tests/jobs-error-indirect.exp \
     src/usr.bin/make/unit-tests/jobs-error-indirect.mk \
     src/usr.bin/make/unit-tests/jobs-error-nested-make.exp \
     src/usr.bin/make/unit-tests/jobs-error-nested-make.mk \
     src/usr.bin/make/unit-tests/jobs-error-nested.exp \
     src/usr.bin/make/unit-tests/jobs-error-nested.mk

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Roland Illig <roland.illig@gmx.de>
To: Simon Gerraty <sjg@crufty.net>
Cc: gnats-bugs@netbsd.org
Subject: Re: PR/55578
Date: Thu, 7 Jan 2021 09:11:13 +0100 (GMT+01:00)

 07.01.2021 05:13:41 Simon Gerraty <sjg@crufty.net>:
 > So putting .MAKE on a target which isn't a sub-make is wrong.

 You're right. At the time I wrote that test I probably didn't understand
 .MAKE completely. Feel free to fix it.

From: Simon Gerraty <sjg@crufty.net>
To: Roland Illig <roland.illig@gmx.de>
Cc: gnats-bugs@netbsd.org, sjg@crufty.net
Subject: Re: PR/55578
Date: Thu, 07 Jan 2021 10:38:12 -0800

 The following should hopefully strike a good balance between
 not missing a 'stopped in' and suppressing noise.
 The exit(6) is to disambiguate a specific corner case.

 diff -r 264252545986 job.c
 --- a/job.c	Sat Jan 02 09:19:54 2021 -0800
 +++ b/job.c	Thu Jan 07 10:35:18 2021 -0800
 @@ -2908,7 +2908,7 @@
  		       errno == EAGAIN)
  			continue;
  		if (shouldDieQuietly(NULL, 1))
 -			exit(2);
 +			exit(6);	/* we aborted */
  		Fatal("A failure has been detected "
  		      "in another branch of the parallel make");
  	}
 diff -r 264252545986 main.c
 --- a/main.c	Sat Jan 02 09:19:54 2021 -0800
 +++ b/main.c	Thu Jan 07 10:35:18 2021 -0800
 @@ -2159,10 +2159,6 @@
  		Var_Stats();
  	}

 -	/* we generally want to keep quiet if a sub-make died */
 -	if (shouldDieQuietly(gn, -1))
 -		return;
 -
  	if (msg != NULL)
  		printf("%s", msg);
  	printf("\n%s: stopped in %s\n", progname, curdir);
 @@ -2170,6 +2166,10 @@
  	if (errorNode != NULL)
  		return;		/* we've been here! */

 +	/* we generally want to keep quiet if a sub-make died */
 +	if (shouldDieQuietly(gn, -1))
 +		return;
 +
  	if (gn != NULL)
  		SetErrorVars(gn);
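
The effect of moving the shouldDieQuietly() check in main.c can be modeled in isolation. The following is only a simplified sketch, not the code in make(1): finish_old() and finish_new() are hypothetical names, and the die_quietly flag stands in for whatever shouldDieQuietly(gn, -1) would return. Only the "stopped in" format string is taken from the diff above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical model of the ordering BEFORE the patch: the quiet check
 * runs first, so the "stopped in" line is lost whenever it returns true.
 * Returns the number of characters written to buf.
 */
static int
finish_old(char *buf, size_t len, bool die_quietly,
    const char *progname, const char *curdir)
{
	if (die_quietly) {
		buf[0] = '\0';	/* nothing is reported */
		return 0;
	}
	return snprintf(buf, len, "\n%s: stopped in %s\n", progname, curdir);
}

/*
 * Hypothetical model of the ordering AFTER the patch: the directory is
 * always reported, and the quiet check only skips the follow-up work
 * (in the real code, the error-variable handling that comes later).
 */
static int
finish_new(char *buf, size_t len, bool die_quietly,
    const char *progname, const char *curdir)
{
	int n = snprintf(buf, len, "\n%s: stopped in %s\n", progname, curdir);

	if (die_quietly)
		return n;	/* message already emitted, skip the rest */
	/* further error handling would follow here */
	return n;
}
```

With die_quietly set to true, finish_old() leaves the buffer empty while finish_new() still fills in the "stopped in" line, which is the behavior this PR asks for.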


>Unformatted:
