NetBSD Problem Report #50179
From fstd.lkml@gmail.com Thu Aug 27 00:37:53 2015
Return-Path: <fstd.lkml@gmail.com>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
by mollari.NetBSD.org (Postfix) with ESMTPS id 12874A6531
for <gnats-bugs@gnats.NetBSD.org>; Thu, 27 Aug 2015 00:37:53 +0000 (UTC)
Message-Id: <20150827003748.GA7494@frozen.localdomain>
Date: Thu, 27 Aug 2015 02:37:48 +0200
From: Timo Buhrmester <fstd.lkml@gmail.com>
Reply-To: Timo Buhrmester <fstd.lkml@gmail.com>
To: gnats-bugs@NetBSD.org
Subject: sh(1) variable expansion bug
X-Send-Pr-Version: 3.95
>Number: 50179
>Category: bin
>Synopsis: ${var#?} removes two instead of one character if $var starts with a \201-byte
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: bin-bug-people
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Thu Aug 27 00:40:00 +0000 2015
>Closed-Date: Sun Sep 06 00:13:31 +0000 2015
>Last-Modified: Sun Sep 06 00:13:31 +0000 2015
>Originator: Timo Buhrmester
>Release: NetBSD 7.99.20
>Organization:
>Environment:
System: NetBSD frozen.localdomain 7.99.20 NetBSD 7.99.20 (FRZKERN32) #0: Wed Aug 12 01:07:44 CEST 2015 build@frozen.localdomain:/usr/build/obj-frozen32-i386-i386/sys/arch/i386/compile/FRZKERN32 i386
Architecture: i386
Machine: i386
>Description:
We're having a bug in sh(1) that can be triggered with the following lines:
$ orig="$(printf '\201foo')" # ---------------------(1)
$ oneless="${orig#?}" # ---------------------(2)
$ echo "$oneless" # prints 'oo' instead of 'foo'
In (1), a string that starts with a \201-byte (aka 0x81, CTLESC) followed by 'foo' is created and assigned to `orig`.
In (2) we're then trying to nibble off the first byte, so $oneless ought to be 'foo'. However, sh(1) removes *two* bytes, leaving $oneless as just 'oo'. For prefixes other than \201, the shell behaves correctly.
Other shells (ksh from base, bash from pkgsrc and FreeBSD's sh(1)) do not have this behavior.
The code responsible seems to be (src/bin/sh/expand.c, function subevalvar):
| case VSTRIMLEFT: // <------------------- the ${var#prefix} case
| for (loc = startp; loc < str; loc++) {
| c = *loc;
| *loc = '\0';
| if (patmatch(str, startp, varflags & VSQUOTE))
| goto recordleft;
| *loc = c;
| if ((varflags & VSQUOTE) && *loc == CTLESC)
| loc++; // <--------- Oops.
| }
The loop tries to match the `pattern`-part of ${var#pattern}, pointed to by `str`, against bigger and bigger prefixes of $var, pointed to by `startp`.
In the second iteration, *loc is \201, also known as CTLESC, and the VSQUOTE-bit in varflags is also set. So we (mistakenly) `loc++` in the bottom conditional and hence consume the next byte that would follow, the 'f' in 'foo'.
I'm not familiar with when and why the shell needs to insert its own control characters (like CTLESC) into strings, but in this case it clearly should not interpret them.
>How-To-Repeat:
#!/bin/sh
badbyte='\201'
orig="$(printf "${badbyte}foo")" # -------------------------(1)
oneless="${orig#?}" # -------------------------------------(2)
if [ "$oneless" != foo ]; then
printf "Expected 'foo' but got '%s'\n" "$oneless"
fi
>Fix:
None known so far.
>Release-Note:
>Audit-Trail:
From: Timo Buhrmester <fstd.lkml@gmail.com>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: bin/50179: sh(1) variable expansion bug
Date: Thu, 27 Aug 2015 03:19:02 +0200
--pf9I7BMVVzbSWLtt
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
> >Fix:
> None known so far.
Okay, FreeBSD fixed this in 2010, see https://lists.freebsd.org/pipermail/svn-src-head/2010-October/021986.html
Attached is the same patch, processed to apply cleanly against our sh(1).
It Fixes The Problem(TM) for me, on a first glance. Have not yet run any atf-tests, though.
Timo Buhrmester
--pf9I7BMVVzbSWLtt
Content-Type: text/plain; charset=utf-8
Content-Disposition: attachment; filename="sh.nb.patch"
--- bin/sh/expand.c.orig
+++ bin/sh/expand.c
@@ -98,7 +98,7 @@ struct arglist exparg; /* holds expanded arg list */
STATIC void argstr(char *, int);
STATIC char *exptilde(char *, int);
STATIC void expbackq(union node *, int, int);
-STATIC int subevalvar(char *, char *, int, int, int, int);
+STATIC int subevalvar(char *, char *, int, int, int, int, int);
STATIC char *evalvar(char *, int);
STATIC int varisset(char *, int);
STATIC void varvalue(char *, int, int, int);
@@ -495,7 +495,7 @@ expbackq(union node *cmd, int quoted, int flag)
STATIC int
-subevalvar(char *p, char *str, int strloc, int subtype, int startloc, int varflags)
+subevalvar(char *p, char *str, int strloc, int subtype, int startloc, int varflags, int quotes)
{
char *startp;
char *loc = NULL;
@@ -548,10 +548,10 @@ subevalvar(char *p, char *str, int strloc, int subtype, int startloc, int varfla
for (loc = startp; loc < str; loc++) {
c = *loc;
*loc = '\0';
- if (patmatch(str, startp, varflags & VSQUOTE))
+ if (patmatch(str, startp, quotes))
goto recordleft;
*loc = c;
- if ((varflags & VSQUOTE) && *loc == CTLESC)
+ if (quotes && *loc == CTLESC)
loc++;
}
return 0;
@@ -560,11 +560,11 @@ subevalvar(char *p, char *str, int strloc, int subtype, int startloc, int varfla
for (loc = str - 1; loc >= startp;) {
c = *loc;
*loc = '\0';
- if (patmatch(str, startp, varflags & VSQUOTE))
+ if (patmatch(str, startp, quotes))
goto recordleft;
*loc = c;
loc--;
- if ((varflags & VSQUOTE) && loc > startp &&
+ if (quotes && loc > startp &&
*(loc - 1) == CTLESC) {
for (q = startp; q < loc; q++)
if (*q == CTLESC)
@@ -577,10 +577,10 @@ subevalvar(char *p, char *str, int strloc, int subtype, int startloc, int varfla
case VSTRIMRIGHT:
for (loc = str - 1; loc >= startp;) {
- if (patmatch(str, loc, varflags & VSQUOTE))
+ if (patmatch(str, loc, quotes))
goto recordright;
loc--;
- if ((varflags & VSQUOTE) && loc > startp &&
+ if (quotes && loc > startp &&
*(loc - 1) == CTLESC) {
for (q = startp; q < loc; q++)
if (*q == CTLESC)
@@ -593,9 +593,9 @@ subevalvar(char *p, char *str, int strloc, int subtype, int startloc, int varfla
case VSTRIMRIGHTMAX:
for (loc = startp; loc < str - 1; loc++) {
- if (patmatch(str, loc, varflags & VSQUOTE))
+ if (patmatch(str, loc, quotes))
goto recordright;
- if ((varflags & VSQUOTE) && *loc == CTLESC)
+ if (quotes && *loc == CTLESC)
loc++;
}
return 0;
@@ -763,7 +763,7 @@ again: /* jump here after setting a variable with ${var=text} */
STPUTC('\0', expdest);
patloc = expdest - stackblock();
if (subevalvar(p, NULL, patloc, subtype,
- startloc, varflags) == 0) {
+ startloc, varflags, quotes) == 0) {
int amount = (expdest - stackblock() - patloc) + 1;
STADJUST(-amount, expdest);
}
@@ -776,7 +776,7 @@ again: /* jump here after setting a variable with ${var=text} */
case VSQUESTION:
if (set)
break;
- if (subevalvar(p, var, 0, subtype, startloc, varflags)) {
+ if (subevalvar(p, var, 0, subtype, startloc, varflags, quotes)) {
varflags &= ~VSNUL;
/*
* Remove any recorded regions beyond
--pf9I7BMVVzbSWLtt--
From: "Christos Zoulas" <christos@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/50179 CVS commit: src/bin/sh
Date: Thu, 27 Aug 2015 03:46:47 -0400
Module Name: src
Committed By: christos
Date: Thu Aug 27 07:46:47 UTC 2015
Modified Files:
src/bin/sh: expand.c
Log Message:
PR/50179: Timo Buhrmester: sh(1) variable expansion bug
To generate a diff of this commit:
cvs rdiff -u -r1.92 -r1.93 src/bin/sh/expand.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: open->closed
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Sun, 06 Sep 2015 00:13:31 +0000
State-Changed-Why:
Christos committed it
>Unformatted:
(Contact us)
$NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2014
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.