NetBSD Problem Report #54456

From gson@gson.org  Sun Aug 11 11:30:37 2019
Return-Path: <gson@gson.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id D1AF37A187
	for <gnats-bugs@gnats.NetBSD.org>; Sun, 11 Aug 2019 11:30:37 +0000 (UTC)
Message-Id: <20190811113032.4C97598975B@guava.gson.org>
Date: Sun, 11 Aug 2019 14:30:32 +0300 (EEST)
From: gson@gson.org (Andreas Gustafsson)
Reply-To: gson@gson.org (Andreas Gustafsson)
To: gnats-bugs@NetBSD.org
Subject: Building in a directory whose name contains "-j" may fail
X-Send-Pr-Version: 3.95

>Number:         54456
>Category:       misc
>Synopsis:       Building in a directory whose name contains "-j" may fail
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    lukem
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Aug 11 11:35:00 +0000 2019
>Closed-Date:    Sun May 28 15:40:40 +0000 2023
>Last-Modified:  Sun May 28 15:40:40 +0000 2023
>Originator:     Andreas Gustafsson
>Release:        NetBSD-current
>Organization:

>Environment:
System: NetBSD
Architecture: x86_64
Machine: amd64
>Description:

Some of my builds have been failing with "vfork: Resource temporarily
unavailable", suggesting that they are running out of processes.
These are "build.sh -j 12 release" builds and "ulimit -p" is set
to 500, which ought to be enough (more than 40 processes per make
job).

To debug this, I set up a script to run "proctree" from pkgsrc once a
second during the build and save any output with more than 300 lines.
This caught a process with a lot of descendants and the following
command line:

   /tmp/bracket/build/2019.08.08.14.00.32-amd64-job-000012/tools/bin/nbgmake -j -e MACHINE= MAKEINFO=/tmp/bracket/build/2019.08.08.14.00.32-amd64-job-000012/tools/bin/nbmakeinfo LIBGCC= LIBGCC1= LIBGCC1_TEST= LIBGCC2= INSTALL_LIBGCC= EXTRA_PARTS= CPPFLAGS=-DNETBSD_TOOLS -DTARGET_SYSTEM_ROOT=0  -DTARGET_SYSTEM_ROOT_RELOCATABLE AR=ar RANLIB=ranlib BISON=true DESTDIR= INSTALL=/tmp/bracket/build/2019.08.08.14.00.32-amd64-job-000012/tools/bin/x86_64--netbsd-install -c  -r all-gcc 

Note the "-j" option which is not followed by an argument.
"man gmake" says:

   If the -j option is given without an argument, make will not limit
   the number of jobs that can run simultaneously.

The argument-less -j comes from this line in src/tools/Makefile.gnuhost:

   GMAKE_J_ARGS?=  ${MAKEFLAGS:[*]:M*-j*:C/.*(-j ?[0-9]*).*/\1/W}

These particular builds are made in a directory like
"/tmp/bracket/build/2019.08.08.14.00.32-amd64-job-000012".
In Makefile.gnuhost, the directory name gets expanded into
MAKEFLAGS, which then looks like:

   -d e -m /tmp/bracket/build/2019.08.08.14.00.32-amd64-job-000014/src/share/mk -j 12 -J 15,16 .MAKE.LEVEL.ENV=MAKELEVEL HOST_OSTYPE=NetBSD-8.1-amd64 MKOBJDIRS=yes _SRC_TOP_=/tmp/bracket/build/2019.08.08.14.00.32-amd64-job-000014/src _SRC_TOP_OBJ_=/tmp/bracket/build/2019.08.08.14.00.32-amd64-job-000014/obj _THISDIR_=tools/

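which contains both "-j 12" and "amd64-job".  The presence of the
substring "-j" in the latter appears to be confusing the convoluted
make variable expansion used to set GMAKE_J_ARGS such that it yields
just "-j" rather than "-j 12": the leading ".*" in the regular
expression is greedy, so the parenthesized group ends up matching the
last "-j" in the string, which is the one in "amd64-job".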

As a minimal test case, running the following as a makefile prints
"-j 12":

MAKEFLAGS=-j 12 saint-mary
GMAKE_J_ARGS?=  ${MAKEFLAGS:[*]:M*-j*:C/.*(-j ?[0-9]*).*/\1/W}
all:
	@echo ${GMAKE_J_ARGS}

but if you change "saint-mary" to "saint-john", it prints just "-j".
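
The behavior of the minimal test case can be approximated outside of
make.  The following is only a sketch: it uses Python's "re" module to
mimic the :M and :C modifiers, whereas make uses POSIX regular
expressions, so agreement is assumed for these two inputs rather than
guaranteed in general.  The function name old_gmake_j_args is
hypothetical.

```python
import re

# Old pattern from src/tools/Makefile.gnuhost, as quoted in this report.
OLD_PATTERN = r".*(-j ?[0-9]*).*"


def old_gmake_j_args(makeflags: str) -> str:
    # Rough stand-in for :[*]:M*-j* : keep the whole value, treated as a
    # single word, only if it contains "-j" somewhere.
    if "-j" not in makeflags:
        return ""
    # Rough stand-in for :C/.../\1/W : replace the first match with
    # group 1.  The leading ".*" is greedy, so group 1 lands on the
    # last "-j" in the string.
    return re.sub(OLD_PATTERN, r"\1", makeflags, count=1)


print(old_gmake_j_args("-j 12 saint-mary"))  # -j 12
print(old_gmake_j_args("-j 12 saint-john"))  # -j
```

In the second call the last "-j" is the one in "saint-john", and it is
followed by "o" rather than a digit, so the group captures only "-j".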

>How-To-Repeat:

Set "ulimit -p" to 300 or less, and then attempt a "build.sh -j 12
release" (including the tools build, and perhaps changing -j 12 to
whatever is appropriate for your machine) in a directory whose name
contains the substring "-j".

>Fix:

>Release-Note:

>Audit-Trail:
From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: misc/54456: Building in a directory whose name contains "-j" may fail
Date: Sun, 11 Aug 2019 14:45:41 +0300

 Looks like the line setting GMAKE_J_ARGS was moved to Makefile.gnuhost
 from Makefile.gmakehost, where it was initially committed in revision
 1.3 and revised in 1.4:

 ----------------------------
 revision 1.4
 date: 2008-07-07 13:57:03 +0300;  author: apb;  state: Exp;  lines: +2 -2;
 Verify that MAKEFLAGS contains "-j" before trying to manipulate it
 with :C///.
 ----------------------------
 revision 1.3
 date: 2008-07-07 08:43:56 +0300;  author: mrg;  state: Exp;  lines: +4 -2;
 pass the "-j" flag down to gmake.   you can force -j option to
 gmake by setting GMAKE_J_ARGS=-jN.


 discussed with matt@ and a few others.

 XXX: this is kind of hacky, as it will fork off more processes than
 XXX: "-jN" says to, but there's no real way to get parallelism in
 XXX: both the tools/gcc build and the rest of the build without
 XXX: this.
 ----------------------------

 -- 
 Andreas Gustafsson, gson@gson.org

State-Changed-From-To: open->feedback
State-Changed-By: lukem@NetBSD.org
State-Changed-When: Fri, 19 May 2023 10:46:11 +0000
State-Changed-Why:

I've changed the GMAKE_J_ARGS expressions from:
  GMAKE_J_ARGS?= ${MAKEFLAGS:[*]:M*-j*:C/.*(-j ?[0-9]*).*/\1/W}
to
  GMAKE_J_ARGS?= ${MAKEFLAGS:[*]:M*-j*:C/(^|.* )(-j ?[0-9][0-9]*).*/\2/W}
and this seems to fix the issue if I build with -j 32 in a source
directory "src-j9999" or "src-john".

Does this resolve the issue for you?
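
The difference between the two expressions can be sketched in Python.
This is an approximation only: Python's "re" module stands in for the
POSIX regular expressions used by make, and the MAKEFLAGS values and
the /work paths below are made-up examples, not taken from a real
build.

```python
import re

OLD_PATTERN = r".*(-j ?[0-9]*).*"            # make replaces with \1
NEW_PATTERN = r"(^|.* )(-j ?[0-9][0-9]*).*"  # make replaces with \2


def extract(pattern: str, group: int, makeflags: str) -> str:
    # Rough stand-in for :[*]:M*-j* followed by :C/pattern/\group/W.
    if "-j" not in makeflags:
        return ""
    return re.sub(pattern, lambda m: m.group(group), makeflags, count=1)


# Hypothetical MAKEFLAGS values with "-j" inside the source path.
flags_j9999 = "-m /work/src-j9999/share/mk -j 32 -J 15,16 _SRC_TOP_=/work/src-j9999"
flags_john = "-m /work/src-john/share/mk -j 32 -J 15,16 _SRC_TOP_=/work/src-john"

print(extract(OLD_PATTERN, 1, flags_j9999))  # -j9999
print(extract(OLD_PATTERN, 1, flags_john))   # -j
print(extract(NEW_PATTERN, 2, flags_j9999))  # -j 32
print(extract(NEW_PATTERN, 2, flags_john))   # -j 32
```

The new pattern only accepts a "-j" that starts the string or follows
a space, and that is followed by at least one digit, so a "-j" inside
a path component no longer qualifies.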



From: "Luke Mewburn" <lukem@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/54456 CVS commit: src
Date: Fri, 19 May 2023 10:42:34 +0000

 Module Name:	src
 Committed By:	lukem
 Date:		Fri May 19 10:42:34 UTC 2023

 Modified Files:
 	src/external/gpl3/gcc.old/usr.bin/host-libcpp: Makefile
 	src/external/gpl3/gcc/usr.bin/host-libcpp: Makefile
 	src/tools: Makefile.gnuhost

 Log Message:
 Fix passing -j NNN to gmake

 Use a more restrictive pattern to extract -j NNN from MAKEFLAGS
 into GMAKE_J_ARGS, to avoid false positives when the source directory
 has "-j" in the path (e.g. "amd64-job-000012" or "src-j9999").
 Previously this could pass either "-j" or "-j BIGNUM" to gmake
 and result in "vfork: Resource temporarily unavailable" failures.

 PR misc/54456


 To generate a diff of this commit:
 cvs rdiff -u -r1.6 -r1.7 \
     src/external/gpl3/gcc.old/usr.bin/host-libcpp/Makefile
 cvs rdiff -u -r1.5 -r1.6 src/external/gpl3/gcc/usr.bin/host-libcpp/Makefile
 cvs rdiff -u -r1.55 -r1.56 src/tools/Makefile.gnuhost

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

Responsible-Changed-From-To: misc-bug-people->lukem
Responsible-Changed-By: lukem@NetBSD.org
Responsible-Changed-When: Sat, 20 May 2023 08:39:52 +0000
Responsible-Changed-Why:


From: Andreas Gustafsson <gson@gson.org>
To: lukem@NetBSD.org
Cc: gnats-bugs@NetBSD.org
Subject: Re: misc/54456 (Building in a directory whose name contains "-j" may fail)
Date: Wed, 24 May 2023 11:41:25 +0300

 lukem@NetBSD.org wrote:
 >   GMAKE_J_ARGS?= ${MAKEFLAGS:[*]:M*-j*:C/(^|.* )(-j ?[0-9][0-9]*).*/\2/W}
 [...]
 > Does this resolve the issue for you?

 My build setup has changed a lot since I filed the PR, and I am now
 having trouble reproducing the original issue as needed to verify the
 fix.  As far as I'm concerned, the PR can be closed.
 -- 
 Andreas Gustafsson, gson@gson.org

State-Changed-From-To: feedback->needs-pullups
State-Changed-By: lukem@NetBSD.org
State-Changed-When: Wed, 24 May 2023 09:22:55 +0000
State-Changed-Why:
I've requested a pullup to netbsd-10.
I may request one for netbsd-9 too.


From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/54456 CVS commit: [netbsd-10] src
Date: Sun, 28 May 2023 09:49:03 +0000

 Module Name:	src
 Committed By:	martin
 Date:		Sun May 28 09:49:03 UTC 2023

 Modified Files:
 	src/external/gpl3/gcc/usr.bin/host-libcpp [netbsd-10]: Makefile
 	src/tools [netbsd-10]: Makefile.gnuhost

 Log Message:
 Pull up following revision(s) (requested by lukem in ticket #177):

 	tools/Makefile.gnuhost: revision 1.56
 	external/gpl3/gcc/usr.bin/host-libcpp/Makefile: revision 1.6

 Fix passing -j NNN to gmake

 Use a more restrictive pattern to extract -j NNN from MAKEFLAGS
 into GMAKE_J_ARGS, to avoid false positives when the source directory
 has "-j" in the path (e.g. "amd64-job-000012" or "src-j9999").

 Previously this could pass either "-j" or "-j BIGNUM" to gmake
 and result in "vfork: Resource temporarily unavailable" failures.

 PR misc/54456


 To generate a diff of this commit:
 cvs rdiff -u -r1.5 -r1.5.6.1 \
     src/external/gpl3/gcc/usr.bin/host-libcpp/Makefile
 cvs rdiff -u -r1.54 -r1.54.2.1 src/tools/Makefile.gnuhost

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: needs-pullups->closed
State-Changed-By: lukem@NetBSD.org
State-Changed-When: Sun, 28 May 2023 15:40:40 +0000
State-Changed-Why:
This has been pulled up to netbsd-10.
It looks like netbsd-9 would need other build pullups first,
and that's outside the scope of what I can test right now.


>Unformatted:
