NetBSD Problem Report #54112
From kre@munnari.OZ.AU Wed Apr 10 06:49:15 2019
Return-Path: <kre@munnari.OZ.AU>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id E000B7A14F
for <gnats-bugs@gnats.NetBSD.org>; Wed, 10 Apr 2019 06:49:15 +0000 (UTC)
Message-Id: <201904100649.x3A6n7kX018291@jinx.noi.kre.to>
Date: Wed, 10 Apr 2019 13:49:07 +0700 (+07)
From: kre@munnari.OZ.AU
To: gnats-bugs@NetBSD.org
Subject: /bin/sh "$@" still broken: when field splitting this time
X-Send-Pr-Version: 3.95
>Number: 54112
>Category: bin
>Synopsis: /bin/sh "$@" still broken: when field splitting this time
>Confidential: no
>Severity: non-critical
>Priority: low
>Responsible: kre
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Apr 10 06:50:00 +0000 2019
>Closed-Date: Wed Apr 10 08:13:42 +0000 2019
>Last-Modified: Wed Apr 10 08:15:00 +0000 2019
>Originator: Robert Elz
>Release: NetBSD 8.99.30 (any version at all)
>Organization:
>Environment:
System: NetBSD jinx.noi.kre.to 8.99.30 NetBSD 8.99.30 (1.1-20190114) #9: Mon Jan 14 13:29:08 ICT 2019 kre@onyx.coe.psu.ac.th:/usr/obj/testing/kernels/amd64/JINX amd64
Architecture: x86_64
Machine: amd64
>Description:
Given:
argc() { echo "$#"; }
set -- ''
Then:
argc ${0+"$@" }
should say "1" but says "0" instead.
This variant:
argc ${0+"$@"}
does give 1 (as it should).
>How-To-Repeat:
As above.
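For convenience, the cases from the Description can be collected into
one script. This is only a sketch of a reproducer, not the ATF test
mentioned under Fix; the expected counts in the comments are the ones
stated in this report, and other shells may differ.

```sh
#!/bin/sh
# Reproducer sketch for the cases in the Description.
# Expected counts are those stated in the report.

argc() { echo "$#"; }

set -- ''          # exactly one positional parameter: the empty string

argc "$@"          # plain "$@": 1
argc ${0+"$@"}     # fully quoted alternative: 1
argc ${0+"$@" }    # trailing unquoted space: should be 1, affected shells print 0
```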
Yes, this is an absurd usage, but still valid, and it should work.
This test case came from a bug-bash bug report, which reported
a problem with expansion of ${0+"$@" "$@"} so I tested it with
our /bin/sh.
Sigh. (I suspect the bash issue is/was something quite different).
The problem is that "$@" produces a "" string (as it should) but
that is represented as the C string "" (ie: nothing). When that
is abutted to the space, we get " ", which we then field split,
which eliminates the space, leaving nothing at all.
When the space is not there, the result is fully quoted, so there
are no field splitting regions, no field splitting actually happens,
and the null string remains untouched.
In all of these the only relevance of the use of $0 is that we
know $0 is set, so the text after the + is expanded to produce
the result -- any set variable would do. Similar things happened
in the case of ${unset-"$@" } of course.
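The splitting behaviour described above can be seen in isolation,
without "$@" at all. A sketch only: the variable names sp and novar
are made up for the illustration, and IFS is assumed to have its
default value.

```sh
argc() { echo "$#"; }

sp=' '
argc $sp           # unquoted blank: field splitting discards it, 0 fields
argc "$sp"         # quoted blank: no field splitting, 1 field
argc ""            # explicit quoted null string: 1 (empty) field

# The unset variant takes the same path as ${0+...}.
unset novar
set -- ''
argc ${novar-"$@" }   # should be 1, as for ${0+"$@" }
```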
>Fix:
Coming very soon. This PR is mostly just so something can be
referenced in the CVS commit message, rather than including all
this noise there, and can also be referenced in the new ATF
tests that will also be added to look for this case.
Most of this has been fixed, and waiting to get processed, for a
month now, but the fix originally broke
set --; argc "${0+$@}"
producing 1 instead of 0 (which POSIX as it is right now requires).
That POSIX is to be changed to make this unspecified (as most
other shells produce 1 for this case) is irrelevant here: this
expansion was fixed a while ago in this shell, and the fix should
be retained (0 is the logically sensible value). The previous fix
for this was one of the ugliest of the old gross hacks, now being
removed. Good riddance! The newer solution isn't exactly a marvel
of astonishing beauty either, unfortunately.
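The case that held the fix up can be checked the same way. A sketch;
the 0 is what this report says POSIX currently requires, and since
most other shells are said to print 1 here, do not expect agreement
between shells.

```sh
argc() { echo "$#"; }

set --             # no positional parameters at all
argc "$@"          # 0: "$@" with no parameters generates no fields
argc "${0+$@}"     # 0 in this shell; most other shells print 1
```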
Note that this change does alter the way we interpret:
set -- ; X=; argc ${0+"$X$@"}
Previously we produced 1, now we produce 0. This one has been an
unspecified case in POSIX for a while now, and different shells handle
it differently. Unlike the previous case, there is no obvious correct
answer here; either way can be argued to be better, so I believe this
change to be acceptable.
The rule now is (for this unspecified case -- so no-one should
rely upon it) that when we have a double quoted string (any subset
of a word), and $@ appears inside the double quotes, and in a context
where "$@" makes sense, if after that string has been expanded all we
have left is the surrounding "" then the whole string (that is, the
remaining pair of quotes) is simply deleted. "$@" is the obvious
simplest example of this (here producing nothing is required, and
you can rely upon that). The unspecified case is when there was
something else inside the same "" pair as the $@ which also expanded
to nothing. "$@$@" is another simple example. That one we used to
produce nothing for (weirdly different from the $X$@ case) whereas
some other shells produce "" (1 arg) for that. With the new rule,
we will continue producing 0 (nothing) for the double $@ example,
and now also for the $X$@ example (or $@$X etc).
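The examples named above, side by side. A sketch: X is just an empty
variable as in the earlier example, and the counts in the comments are
what this report says the new rule gives. Only the first and last are
something to rely upon.

```sh
argc() { echo "$#"; }

set --             # no positional parameters
X=                 # set, but empty

argc "$@"          # 0: required, can be relied upon
argc "$@$@"        # 0 under the new rule; some other shells give 1
argc "$X$@"        # 0 under the new rule (previously 1)
argc "$@$X"        # 0 under the new rule
argc "$X"          # 1: no $@ inside the quotes, so the "" stays
```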
Note: all of this applies only in contexts where the magic "$@"
rules apply, everywhere else we treat $@ the same as $* (different
rules). That's actually mostly another unspecified case, so you
should not rely upon $@ being the same as $* - just use $* and not
$@ when appropriate.
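A variable assignment is one such context: no fields are generated
there, so $* says what is meant. A sketch, with joined as a made-up
variable name, and IFS assumed to start out at its default value:

```sh
set -- a b c

joined="$*"        # parameters joined with the first character of IFS
echo "$joined"     # a b c

IFS=:              # change the separator
joined="$*"
echo "$joined"     # a:b:c
unset IFS          # back to default field splitting
```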
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: bin-bug-people->kre
Responsible-Changed-By: kre@NetBSD.org
Responsible-Changed-When: Wed, 10 Apr 2019 06:50:26 +0000
Responsible-Changed-Why:
I am looking into this PR
State-Changed-From-To: open->analyzed
State-Changed-By: kre@NetBSD.org
State-Changed-When: Wed, 10 Apr 2019 06:51:11 +0000
State-Changed-Why:
Issue is understood, fix in the pipeline (only holdup at the minute
is the new ATF tests that are needed).
State-Changed-From-To: analyzed->closed
State-Changed-By: kre@NetBSD.org
State-Changed-When: Wed, 10 Apr 2019 08:13:42 +0000
State-Changed-Why:
Problem fixed
From: "Robert Elz" <kre@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/54112 CVS commit: src
Date: Wed, 10 Apr 2019 08:13:11 +0000
Module Name: src
Committed By: kre
Date: Wed Apr 10 08:13:11 UTC 2019
Modified Files:
src/bin/sh: expand.c
src/tests/bin/sh: t_expand.sh
Log Message:
PR bin/54112
Fix handling of "$@" (that is, double quoted dollar at), when it
appears in a string which will be subject to field splitting.
Eg:
${0+"$@" }
More common usages, like the simple "$@" or ${0+"$@"} end up
being entirely quoted, so no field splitting happens, and the
problem was avoided.
See the PR for more details.
This ends up making a bunch of old hack code (and some that was
relatively new) vanish - for now it is just #if 0'd or commented out.
Cleanups of that stuff will happen later.
That some of the worst $@ hacks are now gone does not mean that processing
of "$@" does not retain a very special place in every hacker's heart.
RIP extreme ugliness - long live the merely ordinary ugly.
Added a new bin/sh ATF test case to verify that all this remains fixed.
To generate a diff of this commit:
cvs rdiff -u -r1.131 -r1.132 src/bin/sh/expand.c
cvs rdiff -u -r1.20 -r1.21 src/tests/bin/sh/t_expand.sh
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
>Unformatted:
Copyright © 1994-2017
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.