NetBSD Problem Report #53548

From ryo@tetera.org  Fri Aug 24 11:23:16 2018
Return-Path: <ryo@tetera.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id D6AF27A1A1
	for <gnats-bugs@gnats.NetBSD.org>; Fri, 24 Aug 2018 11:23:16 +0000 (UTC)
Message-Id: <5b7feaa2.1c69fb81.50400.1375@mx.google.com>
Date: Fri, 24 Aug 2018 20:22:35 +0900
From: ryo@tetera.org
Reply-To: ryoon@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: here document of /bin/sh of today is broken
X-Send-Pr-Version: 3.95

>Number:         53548
>Category:       bin
>Synopsis:       here document of /bin/sh of today is broken
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kre
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Aug 24 11:25:00 +0000 2018
>Closed-Date:    Sat Oct 26 21:12:15 +0000 2019
>Last-Modified:  Sat Oct 26 21:12:15 +0000 2019
>Originator:     Ryo ONODERA
>Release:        NetBSD 8.99.24
>Organization:

>Environment:


System: NetBSD brownie 8.99.24 NetBSD 8.99.24 (DTRACE7) #1: Fri Aug 24 14:58:18 JST 2018 ryoon@brownie:/usr/world/8.99/amd64/obj/sys/arch/amd64/compile/DTRACE7 amd64
Architecture: x86_64
Machine: amd64
>Description:

Here document of /bin/sh of today on NetBSD/amd64-current is broken.
Test shell script is here:

$ cat test2.sh
#! /bin/sh

testfunc () {
all=$(echo $1)
enabled=$(echo $2)
for part in $all; do
  if $(echo $enabled | grep -q -E "(^| )$part($| )"); then
    echo "$part : enabled"
  else
    echo "$part : disabled"
  fi
done
}

items_all='A B C D'
items_enabled='B D'

cat > output.txt << EOF
$(testfunc "$items_all" "$items_enabled")
EOF

/bin/sh case:
$ /bin/sh test2.sh
$ cat output.txt
A : enabled
B : enabled
C : enabled
D : enabled

/bin/ksh case:
$ /bin/ksh test2.sh
$ cat output.txt
A : disabled
B : enabled
C : disabled
D : enabled

/bin/sh case is incorrect.

>How-To-Repeat:
See above.

>Fix:

I have no idea.

>Release-Note:

>Audit-Trail:

Responsible-Changed-From-To: bin-bug-people->kre
Responsible-Changed-By: kre@NetBSD.org
Responsible-Changed-When: Sat, 25 Aug 2018 00:32:27 +0000
Responsible-Changed-Why:
I am looking into this PR


State-Changed-From-To: open->analyzed
State-Changed-By: kre@NetBSD.org
State-Changed-When: Sat, 25 Aug 2018 02:17:28 +0000
State-Changed-Why:
Fix is coming.   I am not sure the here doc part is relevant
(it might be needed to set up the circumstances to trigger the
bug, at least in this case, I am not sure).

The actual problem is related to the short circuit shell
execution in a child shell that has vforked().   In most cases
after a vfork() the shell simply exec's something and then
the exit status of "something" becomes the exit status of
that child.  Or the exec fails, and the shell exits with an
internal shell generated exit status (usually 126 or 127).
Those are all handled.   But it is also possible that the
vfork'd shell can run other commands and wait for them and
should return the exit status from the last of those.  If
all that completes normally, it does - but when the shell
short circuits execution because it knows it has nothing
left to do, it was ignoring that exit status.

Previously (until last weekend's changes) we did none of that,
so the issue did not arise.

The FreeBSD shell, from which I copied these recent changes
(mostly) is less agressive with vfork() than we are, and only
do it when all that can possibly happen is exec() - further
they have abandoned the shell optomisation (from 7th edition
times) where if an exec() fails with ENOEXEC (exec of a file
that does not contain a binary image - and these days a #!
script counts as a binary image) the shell simply forks, resets
itself, and interprets the file as a script.  Instead of that
they simply pretend that instead of the command given having
been
	commandfile arg1 arg2 ...
it was instead
	/bin/sh commandfile arg1 arg2 ...
and exec /bin/sh to run it.   That means that after a vfork()
they always exec something (or fail with a shell error).

The fix is fairly simple, and I am testing it now.   Meanwhile
it is possible to work around the problem by simply disabling
vfork (which can be done with the -F option to sh, or "set -F"
in the script.)

kre


From: "Robert Elz" <kre@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/53548 CVS commit: src/bin/sh
Date: Sat, 25 Aug 2018 02:42:49 +0000

 Module Name:	src
 Committed By:	kre
 Date:		Sat Aug 25 02:42:49 UTC 2018

 Modified Files:
 	src/bin/sh: eval.c

 Log Message:
 PR bin/53548

 Deal with the new shell internal exit reason EXEXIT in the case of
 a shell which has vfork()'d.   It takes a peculiar set of circumstances
 to get into a situation where this is ever relevant, but it can be
 done.   See the PR for details.


 To generate a diff of this commit:
 cvs rdiff -u -r1.160 -r1.161 src/bin/sh/eval.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: analyzed->feedback
State-Changed-By: kre@NetBSD.org
State-Changed-When: Sat, 25 Aug 2018 02:46:44 +0000
State-Changed-Why:
A fix for this problem has been committed, it should be in
/bin/sh from HEAD sources (or builds) from Sat, 25 Aug 2018 02:45 UTC
(or a minute or two earlier) and later.

When you get a chance, can you confirm this fixes the problem?


From: Ryo ONODERA <ryo@tetera.org>
To: gnats-bugs@NetBSD.org, kre@NetBSD.org
Cc: 
Subject: Re: bin/53548 (here document of /bin/sh of today is broken)
Date: Sun, 02 Sep 2018 10:37:26 +0900 (JST)

 Hi,

 It works fine now.
 And pkgsrc/multimedia/mplayer configure stage is also works fine.

 Thank you very much!

 From: kre@NetBSD.org, Date: Sat, 25 Aug 2018 02:46:44 +0000 (UTC)

 > Synopsis: here document of /bin/sh of today is broken
 > 
 > State-Changed-From-To: analyzed->feedback
 > State-Changed-By: kre@NetBSD.org
 > State-Changed-When: Sat, 25 Aug 2018 02:46:44 +0000
 > State-Changed-Why:
 > A fix for this problem has been committed, it should be in
 > /bin/sh from HEAD sources (or builds) from Sat, 25 Aug 2018 02:45 UTC
 > (or a minute or two earlier) and later.
 > 
 > When you get a chance, can you confirm this fixes the problem?

 --
 Ryo ONODERA // ryo@tetera.org
 PGP fingerprint = 82A2 DC91 76E0 A10A 8ABB  FD1B F404 27FA C7D1 15F3

State-Changed-From-To: feedback->needs-pullups
State-Changed-By: kre@NetBSD.org
State-Changed-When: Sun, 02 Sep 2018 02:26:49 +0000
State-Changed-Why:
Thanks for the confirmation.

If/When the primary change gets pulled up to -8,
the correction made for this problem will go with it.


From: coypu@sdf.org
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: bin/53548 (here document of /bin/sh of today is broken)
Date: Wed, 23 Oct 2019 01:00:13 +0000

 > If/When the primary change gets pulled up to -8,

 I expect this is getting less likely with netbsd-9 just out the door :-)
 -8 is probably about to see mostly security/stability fixes.

From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: bin/53548 (here document of /bin/sh of today is broken)
Date: Wed, 23 Oct 2019 15:14:59 +0700

     Date:        Wed, 23 Oct 2019 01:05:01 +0000 (UTC)
     From:        coypu@sdf.org
     Message-ID:  <20191023010501.909667A24A@mollari.NetBSD.org>

   |  > If/When the primary change gets pulled up to -8,
   |  
   |  I expect this is getting less likely with netbsd-9 just out the door :-)
   |  -8 is probably about to see mostly security/stability fixes.

 Yes, I started several times to make a patch for the changes (the
 "primary change", ie: PR bin/48875) for -8, but never succeeded before
 abandoning the attempt (there have been many structural changes to the
 shell, most unrelated to the changes in question, which means making the
 patch is all essentially repeating the original work, while hoping to
 both do it correctly the old way, and not miss anything - including extra
 mods that are no longer needed in -9).

 Since the -8 shell as it is now is correct (wrt this anyway -- just slightly
 surprising in this one aspect) I decided to give up, and just allow that
 shell version to eventually expire when -10 gets released.

 I should have closed the relevant PRs (including this one, which was
 for a bug introduced in the fix for 48875, and so irrelevant to -8 as
 it is now) a while ago.   I will get around to that soon.

 Of course, anyone else who desires is welcome to attempt to take the
 fixes from the -9 sh (or HEAD, they are essentially the same) and
 apply them to the -8 sh.   It is also possible to take the entire -9
 sh and build it with -8, as I remember with almost no changes, perhaps
 none (though the division of labour between the built in kill, and sh,
 wrt printing signal names with kill -l and trap -l, which do the same
 thing, more or less, and so use the same routine, might have altered,
 and if so, requires a change from #if 0 to #if 1 or vice versa, I forget,
 in one place...)

 kre

 ps: even after -9 is released, -8 should still get more than just
 security fixes.   If any real bugs (that actually matter) to its sh
 are reported, I will do my best to fix them.


State-Changed-From-To: needs-pullups->closed
State-Changed-By: kre@NetBSD.org
State-Changed-When: Sat, 26 Oct 2019 21:12:15 +0000
State-Changed-Why:
Since the change for which this was a bug fix is not going to be
pulled up to -8, this clearly isn't needed their either.  No pullup
is needed, so we're done.


>Unformatted:

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.43 2018/01/16 07:36:43 maya Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2017 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.