NetBSD Problem Report #55979

From www@netbsd.org  Sat Feb  6 12:07:41 2021
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id DE6A21A9239
	for <gnats-bugs@gnats.NetBSD.org>; Sat,  6 Feb 2021 12:07:41 +0000 (UTC)
Message-Id: <20210206120740.665BC1A923A@mollari.NetBSD.org>
Date: Sat,  6 Feb 2021 12:07:40 +0000 (UTC)
From: jtunney@gmail.com
Reply-To: jtunney@gmail.com
To: gnats-bugs@NetBSD.org
Subject: sh single quotes removes nul characters
X-Send-Pr-Version: www-1.0

>Number:         55979
>Category:       bin
>Synopsis:       sh single quotes removes nul characters
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kre
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Feb 06 12:10:00 +0000 2021
>Closed-Date:    Thu Apr 08 14:53:29 +0000 2021
>Last-Modified:  Thu Apr 08 14:53:29 +0000 2021
>Originator:     Justine Tunney
>Release:        9.1
>Organization:
>Environment:
NetBSD netbsd 9.1 NetBSD 9.1 (GENERIC) #0: Sun Oct 18 19:24:30 UTC 2020  mkrepro@mkrepro.NetBSD.org:/usr/src/sys/arch/amd64/compile/GENERIC amd64
>Description:
/bin/sh and /bin/ksh remove ASCII NUL characters embedded in single quoted strings. This is inconsistent with the behavior of shells on other platforms. POSIX requires this content be preserved:

    2.2.2 Single-Quotes
    Enclosing characters in single-quotes ( '' ) shall preserve the literal
    value of each character within the single-quotes. A single-quote cannot
    occur within single-quotes.

I need it because I'm the author of Cosmopolitan Libc which uses a polyglot executable format where binary content is concatenated to a shell script. I just added support for NetBSD. Right now it doesn't work with /bin/sh so I have to tell users to install bash. See https://github.com/jart/cosmopolitan

This use case is supported by POSIX.

    "The input file may be of any type, but the initial portion of the
     file intended to be parsed according to the shell grammar (XREF to
     XSH 2.10.2 Shell Grammar Rules) shall consist of characters and
     shall not contain the NUL character. The shell shall not enforce
     any line length limits."

    "Earlier versions of this standard required that input files to the
     shell be text files except that line lengths were unlimited.
     However, that was overly restrictive in relation to the fact that
     shells can parse a script without a trailing newline, and in
     relation to a common practice of concatenating a shell script
     ending with an 'exit' or 'exec $command' with a binary data payload
     to form a single-file self-extracting archive."

    http://austingroupbugs.net/view.php?id=1250
    http://austingroupbugs.net/view.php?id=1226#c4394

FreeBSD /bin/sh was recently updated to incorporate this change:

    https://github.com/freebsd/freebsd-src/commit/9a1cd363318b7e9e70ef6af27d1675b371c16b1a

Could NetBSD update its /bin/sh shell? 

Here's an explanation of the format and binaries for testing purposes. They do in fact support NetBSD.

    https://justine.lol/ape.html  <-- design doc
    https://justine.lol/hello.com <-- binary file
>How-To-Repeat:
printf "x='\1\0\1'\nprintf '%%s'"' "$x"\n' | /bin/sh | hexdump -C
00000000  01 01                                             |..|

printf "x='\1\0\1'\nprintf '%%s'"' "$x"\n' | /bin/ksh | hexdump -C
00000000  01 01                                             |..|

>Fix:
Possibly changing something to do with `sqsyntax` or `readtoken1` in your Almquist Shell fork in bin/sh/parse.c

>Release-Note:

>Audit-Trail:

Responsible-Changed-From-To: bin-bug-people->kre
Responsible-Changed-By: kre@NetBSD.org
Responsible-Changed-When: Sat, 06 Feb 2021 18:03:37 +0000
Responsible-Changed-Why:
I am looking into this PR


From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: bin/55979: sh single quotes removes nul characters
Date: Sun, 07 Feb 2021 01:03:16 +0700

     Date:        Sat,  6 Feb 2021 12:10:00 +0000 (UTC)
     From:        jtunney@gmail.com
     Message-ID:  <20210206121000.CFF391A923D@mollari.NetBSD.org>

   | /bin/sh and /bin/ksh remove ASCII NUL characters embedded in single
   | quoted strings. This is inconsistent with the behavior of shells on
   | other platforms. POSIX requires this content be preserved:

 I doubt that, and I will look and see if I can find where it explicitly
 says differently, later.

   | This use case is supported by POSIX.
   |
   |     "The input file may be of any type, but the initial portion of the
   |      file intended to be parsed according to the shell grammar (XREF to
   |      XSH 2.10.2 Shell Grammar Rules) shall consist of characters and
   |      shall not contain the NUL character. The shell shall not enforce
   |      any line length limits."

 Appending stuff to the end of a script is supported (or should be), if
 we're not doing that correctly (which is possible, it is an unusual usage)
 then that should be fixed.   But note from what you just quoted (with
 the unimportant words for this purpose elided)

 		the initial portion of the file intended to be parsed
 		according to the shell grammar [...] shall not contain
 		the NUL character.

 That is, if a NUL is part of the script itself, then it is non-conforming
 (whereas whatever follows the script and is never parsed or executed does
 not have that requirement).

   |     http://austingroupbugs.net/view.php?id=1250
   |     http://austingroupbugs.net/view.php?id=1226#c4394

 Yes, I know those two, and neither has anything to do with (at least
 what I perceive to be) the issue raised by this PR.

   | FreeBSD /bin/sh was recently updated to incorporate this change:

 I will take a look at what they did.

   | Could NetBSD update its /bin/sh shell?

 It depends just what is really required.

   | printf "x='\1\0\1'\nprintf '%%s'"' "$x"\n' | /bin/sh | hexdump -C
   | 00000000  01 01                                             |..|
   |
   | printf "x='\1\0\1'\nprintf '%%s'"' "$x"\n' | /bin/ksh | hexdump -C
   | 00000000  01 01                                             |..|

 bash5 $ printf "x='\1\0\1'\nprintf '%%s'"' "$x"\n' | ksh93 | hexdump -C
 ksh93: syntax error at line 1: `zero byte' unexpected
 bash5 $ printf "x='\1\0\1'\nprintf '%%s'"' "$x"\n' | bosh | hexdump -C
 bash5 $ 
 bash5 $ printf "x='\1\0\1'\nprintf '%%s'"' "$x"\n' | dash | hexdump -C
 00000000  01 01                                             |..|
 00000002
 bash5 $ printf "x='\1\0\1'\nprintf '%%s'"' "$x"\n' | mksh | hexdump -C
 00000000  01 01                                             |..|
 00000002
 bash5 $ printf "x='\1\0\1'\nprintf '%%s'"' "$x"\n' | yash | hexdump -C
 syntax error: the single quotation is not closed

 Which shell (apart from zsh which is decidedly a non-posix shell)
 supports that?

 I don't have a (current) FreeBSD sh to test at the minute.

   | >Fix:
   | Possibly changing something to do with `sqsyntax` or `readtoken1`
   | in your Almquist Shell fork in bin/sh/parse.c

 It would be much more than that, the shell uses standard C strings
 (char *, terminated by \0) everywhere internally, it would be major
 work to make it handle a \0 embedded in a variable value, or similar.

 It is simply impossible to embed \0 in a command arg or environment
 variable, the formats of those things are defined to be \0 terminated
 strings.
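
 As a small illustration of that point (a sketch only; visible_length is
 a hypothetical name and not anything in the sh source): any routine that
 treats a buffer as a C string stops at the first \0, so bytes stored
 after an embedded NUL are invisible to it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only: report how many bytes of a
 * terminated buffer a C-string routine would see.  strlen() stops at the
 * first \0, so anything stored after an embedded NUL is lost to such code.
 */
size_t
visible_length(const char *buf, size_t buflen)
{
	assert(buf != NULL);
	/* The buffer must end in a terminator, or strlen() could overrun. */
	assert(buflen > 0 && buf[buflen - 1] == '\0');
	return strlen(buf);
}
```

 With the bytes \1 \0 \1 from the test case stored in a terminated
 buffer, this reports a length of 1 even though three data bytes were
 stored.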

 Since bash does not actually allow \0 in sh input (as I recall, it discards
 NUL chars, just as we do), and yet your script works with bash, I assume
 that the actual issue is something different.   If I can work out what
 that is, and a fix is reasonable to implement, I will see what I can do.

 kre

From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: bin/55979: sh single quotes removes nul characters
Date: Sun, 07 Feb 2021 02:20:18 +0700

 OK, I see the issue now, and as I suspected, it has nothing whatever
 to do with NUL characters in single quoted strings.

 One of the issues that shells need to deal with is what to do when
 they see an executable file (ie: 'x' permission set, and not a directory)
 that the system cannot actually execute (execl() fails with ENOEXEC).

 The traditional behaviour was simply to assume that the file is a
 shell script, and attempt to parse and execute it.   That's what
 the Thompson shell did, and early versions of the Bourne shell,
 and is what allowed shell scripts to "pretend" to be commands in
 the days before #! support was added.   For the rest of this we
 will forget about #!, as while it makes it simpler to make scripts
 (sh, awk, perl, ...) work, such that they can be executed by any other
 program using execl() (or one of its variants), the existence of
 this facility didn't change the way that shells work at all, it just
 made it less likely that the execl() would fail.

 Any random file with 'x' permission, which wasn't an actual executable
 binary, was run as a script - which is fine when it was a sh script,
 but irritated users with (typically many) error messages when it was
 not.   So, shells grew some "smarts" and attempted to detect which files
 were scripts, and which were not, using heuristics to tell the difference.

 A common method, the one dealt with in the austin group POSIX defect reports
 you cited, was simply to look for a \0 in the initial part of the file
 (the first buffer read, of whatever size the shell reads chunks of files).

 That works for detecting binary files, usually, but doesn't allow the
 leading script, followed by other data, that we actually want to allow,
 so the heuristic was changed to look for a \0 in the first line of the
 file (that is, a \0 before a \n).

 That's what the FreeBSD change you mention does - though it is actually
 more restrictive than that, if there is a \0 anywhere in the first
 block of the file, it requires there be (at least one) lower case
 ascii alpha, or a '$' or a '`', and a subsequent \n, before the first \0
 is found.   Previously (before that change) they treated any file
 containing a \0 in the first block of the file as binary.
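
 The rule as described in the paragraph above can be sketched as follows.
 This is written from that description only, not copied from the FreeBSD
 commit; the name looks_like_script and the choice to pass the first
 block in as a buffer are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch of the heuristic described above.  buf/len hold the
 * first block read from the file.  If the block contains no \0 it is
 * accepted.  Otherwise, before the first \0 there must be at least one
 * lower case ASCII letter, '$' or '`', followed later by a \n.
 */
bool
looks_like_script(const unsigned char *buf, size_t len)
{
	const unsigned char *nul;
	bool seen_script_char = false;
	size_t i, limit;

	assert(buf != NULL || len == 0);
	nul = len ? memchr(buf, '\0', len) : NULL;
	if (nul == NULL)
		return true;		/* no NUL in the first block */
	limit = (size_t)(nul - buf);	/* only look before the first NUL */
	for (i = 0; i < limit; i++) {
		unsigned char c = buf[i];

		if ((c >= 'a' && c <= 'z') || c == '$' || c == '`')
			seen_script_char = true;
		else if (c == '\n' && seen_script_char)
			return true;	/* script-like line ends before NUL */
	}
	return false;
}
```

 Under this sketch a block starting "exit 0\n" followed by binary data is
 accepted, while one starting "\177ELF" followed by NULs is rejected,
 since no lower case letter, '$' or '`' precedes the first \0.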

 The NetBSD /bin/sh (I haven't looked at what /bin/ksh does) is actually
 far more permissive in this area than most other shells.   It forbids
 just one thing from being treated as a shell script, which is an ELF
 binary file, as that, we have found, is the most common kind of file
 to have 'x' permission, not be a script, and not actually be executable.
 That is, usually, ELF binaries for some other OS or architecture, that
 the kernel cannot simply run.

 Your hello.com is an ELF binary (it starts "\177ELF") which is exactly
 what we look for when deciding to reject the file.   We don't look for \0
 characters at all for this purpose, they're irrelevant.

 The code in sh is:

                         if (memcmp(magic, "\177ELF", 4) == 0) {
                                 (void)close(fd);
                                 error("Cannot execute ELF binary %s", fname);
                         }

 which is exactly what happens when sh is used to run your file.  That's
 our only check.
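
 The fragment above depends on names from the surrounding sh source
 (magic, fd, fname, error()).  A standalone sketch of just the
 comparison, using a hypothetical helper name (has_elf_magic is an
 assumption for illustration, not a function in bin/sh), might look
 like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: true if the first bytes of a file, already read
 * into buf, begin with the four-byte ELF magic "\177ELF".  Buffers that
 * are too short to hold the magic are never treated as ELF.
 */
bool
has_elf_magic(const void *buf, size_t len)
{
	assert(buf != NULL || len == 0);
	return len >= 4 && memcmp(buf, "\177ELF", 4) == 0;
}
```

 Given the first few bytes read from a file, this returns true for
 anything beginning \177ELF and false for a #! script or for a buffer
 shorter than four bytes.  NUL bytes play no part in the decision.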

 Unless there turns out to be considerable support for altering that test
 (it is, I believe, the current heuristic after several previous attempts
 were less successful) I do not plan on doing so.

 kre


State-Changed-From-To: open->analyzed
State-Changed-By: kre@NetBSD.org
State-Changed-When: Sat, 06 Feb 2021 19:31:45 +0000
State-Changed-Why:
Issue understood, no changes to sh currently planned.


From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: bin/55979: sh single quotes removes nul characters
Date: Sun, 07 Feb 2021 02:45:25 +0700

 I just realized that when I gave the examples of other shells
 processing the input with \0 chars in the script, I omitted
 the bash test (it was the first I tested, and I just didn't
 cut far enough back in my terminal buffer).

 bash5 $ printf "x='\1\0\1'\nprintf '%%s'"' "$x"\n' | bash | hexdump -C
 00000000  01 01                                             |..|
 00000002

 The "bash5 $ " prompt for all of those examples is just because I
 used my bash5 test window to run these tests (which is irrelevant).

 kre

From: Justine Tunney <jtunney@gmail.com>
To: gnats-bugs@netbsd.org
Cc: kre@netbsd.org, netbsd-bugs@netbsd.org, gnats-admin@netbsd.org
Subject: Re: bin/55979 (sh single quotes removes nul characters)
Date: Sat, 6 Feb 2021 17:55:51 -0800

 > Issue understood, no changes to sh currently planned.

 What would it take to change that?

 On Sat, Feb 6, 2021 at 11:31 AM <kre@netbsd.org> wrote:

 > Synopsis: sh single quotes removes nul characters
 >
 > State-Changed-From-To: open->analyzed
 > State-Changed-By: kre@NetBSD.org
 > State-Changed-When: Sat, 06 Feb 2021 19:31:45 +0000
 > State-Changed-Why:
 > Issue understood, no changes to sh currently planned.

From: Christos Zoulas <christos@zoulas.com>
To: gnats-bugs@netbsd.org
Cc: kre@netbsd.org,
 gnats-admin@netbsd.org,
 netbsd-bugs@netbsd.org,
 jtunney@gmail.com
Subject: Re: bin/55979 (sh single quotes removes nul characters)
Date: Sat, 6 Feb 2021 21:13:47 -0500

 Weird, it seems to be working for me.

 $ ./hello.com
 hello world
 $ echo $NETBSD_SHELL
 20181212 BUILD:20210109211525Z
 $ uname -a
 NetBSD quasar.astron.com 9.99.78 NetBSD 9.99.78 (QUASAR) #203: Sun Jan 24 23:21:07 EST 2021  christos@quasar.astron.com:/usr/src/sys/arch/amd64/compile/QUASAR amd64

From: Robert Elz <kre@munnari.OZ.AU>
To: Christos Zoulas <christos@zoulas.com>
Cc: gnats-bugs@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org,
        jtunney@gmail.com
Subject: Re: bin/55979 (sh single quotes removes nul characters)
Date: Sun, 07 Feb 2021 18:15:03 +0700

     Date:        Sat, 6 Feb 2021 21:13:47 -0500
     From:        Christos Zoulas <christos@zoulas.com>
     Message-ID:  <07BE2947-3B90-463F-B8D9-532A15FA0FED@zoulas.com>

   | Weird, it seems to be working for me.
   |
   | $ ./hello.com
   | hello world

 That works, Justine said it was working on NetBSD, what doesn't
 work is "sh hello.com" which I think is what is wanted.

 Justine, to change the shell we'd need a different heuristic that
 works as well, or at least close to it, to avoid executing files
 that should not be executed.   And support from the users.

 kre

From: Justine Tunney <jtunney@gmail.com>
To: gnats-bugs@netbsd.org
Cc: kre@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: bin/55979 (sh single quotes removes nul characters)
Date: Tue, 9 Feb 2021 20:08:50 -0800

 This could be a memory corruption issue. /bin/sh behaves unpredictably when
 it encounters nul characters inside single quotes. Sometimes scripts that
 do this will work and sometimes they don't. When they don't work it'll
 usually print garbled data:

     -bash-5.0# sh
     netbsd# ./hello.com
     ./hello.com: r���PQ������1۰��YXr�ƃ�: not found
     ./hello.com: xec: not found
     ./hello.com: 6: Syntax error: "else" unexpected

 ktrace reveals that $PATH search uses clobbered memory after parsing a
 single quoted string with NUL characters:

     ktrace sh -c ./hello.com
     kdump -f ktrace.out
      11172      1 sh       CALL  read(0xc,0x11f62e180,0x3f8)
      11172      1 sh       GIO   fd 12 read 1016 bytes
            "MZqFpD='\n\0\0\^P\0\M-x\0\0\0... etc.
             \M-L\M-{\^N\^_\M-h\0\0^\M^A\M... etc.
             \0U\M-*'\n#'\"\no=\"$(command -v \"... etc.
      11172      1 sh       RET   read 1016/0x3f8
      11172      1 sh       CALL  mmap(0,0x1000,PROT_READ|PROT_WRITE,0x1002<PRIVATE,ANONYMOUS,ALIGN=NONE>,0xffffffff,0,0)
      11172      1 sh       RET   mmap 126131311058944/0x72b73bfda000
      11172      1 sh       CALL  __stat50(0x11f62e7f0,0x7f7fffbbe840)
      11172      1 sh       NAMI  "/root/bin/r���PQ<86>�����1۰^A�^B�^SYXr^]<8C>�<83>�"
      11172      1 sh       RET   __stat50 -1 errno 2 No such file or directory
      11172      1 sh       CALL  __stat50(0x11f62e7f0,0x7f7fffbbe840)
      11172      1 sh       NAMI  "/sbin/r���PQ<86>�����1۰^A�^B�^SYXr^]<8C>�<83>�"
      11172      1 sh       RET   __stat50 -1 errno 2 No such file or directory
      11172      1 sh       CALL  __stat50(0x11f62e7f0,0x7f7fffbbe840)
      11172      1 sh       NAMI  "/usr/sbin/r���PQ<86>�����1۰^A�^B�^SYXr^]<8C>�<83>�"
      11172      1 sh       RET   __stat50 -1 errno 2 No such file or directory

 Can we fix this?

 I misdiagnosed the issue earlier. Please disregard what I said about
 needing NULs in strings. I don't care if NUL is filtered out. What I need
 is for the shell to safely ignore binary data inside single quotes. For
 more background on this executable format, see the following screenshot
 https://justine.lol/apeheader.png and the design doc
 https://justine.lol/ape.html

 As for execve() + ENOEXEC safety restrictions, I have no opinion or need
 for those. If NetBSD wants to implement them, then I'd recommend doing
 what FreeBSD did: check that a line exists before the first NUL character
 containing a lowercase letter. APE binaries always start with "MZqFpD=\n"
 so it won't impact this use case. See:
 https://github.com/freebsd/freebsd-src/commit/e0f5c1387df23c8c4811f5b24a7ef6ecac51a71a
 https://github.com/jart/zsh/commit/94a4bc14bb2e415ec3d10cf716512bd3e0d99f48

 On Sun, Feb 7, 2021 at 3:20 AM Robert Elz <kre@munnari.oz.au> wrote:

 > The following reply was made to PR bin/55979; it has been noted by GNATS.
 >
 > From: Robert Elz <kre@munnari.OZ.AU>
 > To: Christos Zoulas <christos@zoulas.com>
 > Cc: gnats-bugs@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org,
 >         jtunney@gmail.com
 > Subject: Re: bin/55979 (sh single quotes removes nul characters)
 > Date: Sun, 07 Feb 2021 18:15:03 +0700
 >
 >      Date:        Sat, 6 Feb 2021 21:13:47 -0500
 >      From:        Christos Zoulas <christos@zoulas.com>
 >      Message-ID:  <07BE2947-3B90-463F-B8D9-532A15FA0FED@zoulas.com>
 >
 >    | Weird, it seems to be working for me.
 >    |
 >    | $ ./hello.com
 >    | hello world
 >
 >  That works, Justine said it was working on NetBSD, what doesn't
 >  work is "sh hello.com" which I think is what is wanted.
 >
 >  Justine, to change the shell we'd need a different heuristic that
 >  works as well, or at least close to it, to avoid executing files
 >  that should not be executed.   And support from the users.
 >
 >  kre

From: Kamil Rytarowski <kamil@netbsd.org>
To: gnats-bugs@netbsd.org, kre@netbsd.org, gnats-admin@netbsd.org,
 netbsd-bugs@netbsd.org, jtunney@gmail.com
Cc: 
Subject: Re: bin/55979 (sh single quotes removes nul characters)
Date: Wed, 10 Feb 2021 09:39:23 +0100

 MSan gives this:

 LC_ALL=C  /usr/src/bin/sh/sh ./hello.com
 cmdname=0x8 path=0x7f7fffffedac argv[0]=0x299f790
 cmdname=0x7140000000c0
 cmdname='/home/kamil/.local/bin/r���PQ������1۰��YXr�ƃ�'
 ==13613==WARNING: MemorySanitizer: use-of-uninitialized-value
     #0 0x46fcf3 in shellexec /usr/src/bin/sh/exec.c:138:18
     #1 0x464ddc in evalcommand /usr/src/bin/sh/eval.c:1392:3
     #2 0x44e198 in evaltree /usr/src/bin/sh/eval.c:375:4
     #3 0x5191cf in cmdloop /usr/src/bin/sh/main.c:320:4
     #4 0x5175fe in main /usr/src/bin/sh/main.c:262:3
     #5 0x41fa8b in ___start (/usr/src/bin/sh/sh+0x41fa8b)

   Uninitialized value was stored to memory at
     #0 0x46f31d in shellexec /usr/src/bin/sh/exec.c:126
     #1 0x464ddc in evalcommand /usr/src/bin/sh/eval.c:1392:3
     #2 0x44e198 in evaltree /usr/src/bin/sh/eval.c:375:4
     #3 0x5191cf in cmdloop /usr/src/bin/sh/main.c:320:4
     #4 0x5175fe in main /usr/src/bin/sh/main.c:262:3
     #5 0x41fa8b in ___start (/usr/src/bin/sh/sh+0x41fa8b)

   Uninitialized value was created by an allocation of 'cmdentry' in the
 stack frame of function 'evalcommand'
     #0 0x458420 in evalcommand /usr/src/bin/sh/eval.c:870

 SUMMARY: MemorySanitizer: use-of-uninitialized-value
 /usr/src/bin/sh/exec.c:138:18 in shellexec
 Exiting
 ./hello.com: 6: Syntax error: ")" unexpected

 There is something wrong with or around padvance(). ':' gets stripped
 from PATH and there is an uninitialized memory read.

From: Robert Elz <kre@munnari.OZ.AU>
To: Justine Tunney <jtunney@gmail.com>
Cc: gnats-bugs@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: bin/55979 (sh single quotes removes nul characters)
Date: Wed, 10 Feb 2021 20:08:53 +0700

     Date:        Tue, 9 Feb 2021 20:08:50 -0800
     From:        Justine Tunney <jtunney@gmail.com>
     Message-ID:  <CANtdasQJxExVw_fpBHGX=qPWMs56PC-6RH6nufTYM-X25CCORQ@mail.gmail.com>

   | This could be a memory corruption issue. /bin/sh behaves unpredictably when
   | it encounters nul characters inside single quotes.

 That's unlikely, as it simply ignores nul chars when it reads them,
 what you're seeing is probably something different.

   | When they don't work it'll usually prints garbled data:

 Can you find a simple (short) test case (doesn't matter if it
 does, or should do, anything meaningful) that you believe behaves
 incorrectly, and send it to me?   Then I can take a look.

 Actually, by inventing my own test case, I see that while we have
 ancient code that deletes nul chars when it sees them, the way that's
 done is (and has been for decades) broken, so we only ignore some of
 them, not all.   Since \0 chars anywhere in shell scripts make a
 non-conforming script, actually seeing a \0 char in a script is very rare,
 so no-one has ever noticed.   I will fix the way we do that (make nul
 chars be truly ignored, so that they're just not there), but I doubt that
 it will fix your problem, as the effect seems to be different than you
 described (but without seeing an actual failing test case I cannot be
 certain).
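
 To make concrete what "truly ignored" means here, the following is a
 minimal sketch only and not the code in bin/sh: strip_nuls is a
 hypothetical helper that compacts a freshly read buffer in place, so
 that whatever parses the buffer afterwards never sees a \0 at all.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: remove every \0 byte from buf[0..len) in place,
 * keeping the remaining bytes in their original order.  Returns the
 * number of bytes left at the start of buf.
 */
size_t
strip_nuls(char *buf, size_t len)
{
	size_t in, out = 0;

	assert(buf != NULL || len == 0);
	for (in = 0; in < len; in++)
		if (buf[in] != '\0')
			buf[out++] = buf[in];	/* copy down over dropped NULs */
	return out;
}
```

 The caller would then treat only the returned number of bytes as input.
 Every \0 is dropped, including consecutive ones and one in the last
 position of the buffer.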

   | I misdiagnosed the issue earlier.

 Yes, I had worked that out.

   | What I need
   | is for the shell to safely ignore binary data inside single quotes.

 Assuming that you don't try and use it (which is what I believe is
 your intent) that should work, just provided, of course, the binary
 data doesn't happen to contain a ' character.   Aside from \0, the
 shell doesn't (shouldn't) really care what binary values form any of
 the parts of the script which don't have syntax constraints.

 Of course, bugs can always exist (and have in the past).

   | APE binaries always start with "MZqFpD=\n" so it won't impact this use
   | case.

 If that was true, you wouldn't have a problem, but at least the hello.com you
 provided a link to earlier started \177ELF which is where the issue arises.

 kre

From: Robert Elz <kre@munnari.OZ.AU>
To: Justine Tunney <jtunney@gmail.com>
Cc: gnats-bugs@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: bin/55979 (sh single quotes removes nul characters)
Date: Wed, 10 Feb 2021 20:24:28 +0700

     Date:        Tue, 9 Feb 2021 20:08:50 -0800
     From:        Justine Tunney <jtunney@gmail.com>
     Message-ID:  <CANtdasQJxExVw_fpBHGX=qPWMs56PC-6RH6nufTYM-X25CCORQ@mail.gmail.com>

 One more thing:

   |      11172      1 sh       CALL mmap(0,0x1000,PROT_READ|PROT_WRITE,0x1002<PRIVATE,ANONYMOUS,ALIGN=NONE>,0xffffffff,0,0)
   |      11172      1 sh       RET   mmap 126131311058944/0x72b73bfda000

 That's very odd, sh doesn't call mmap() anywhere (and doesn't use stdio
 for input/output either) and doesn't dynamically load anything either
 (libc and libedit should have been loaded at startup).   Are you sure
 that's the NetBSD /bin/sh doing that?

 In that sh, do "echo $NETBSD_SHELL" and show what it says please.

 kre


From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org, kamil@netbsd.org
Cc: 
Subject: Re: bin/55979 (sh single quotes removes nul characters)
Date: Wed, 10 Feb 2021 23:48:13 +0700

     Date:        Wed, 10 Feb 2021 08:45:02 +0000 (UTC)
     From:        Kamil Rytarowski <kamil@netbsd.org>
     Message-ID:  <20210210084502.5509F1A9241@mollari.NetBSD.org>

   |  SUMMARY: MemorySanitizer: use-of-uninitialized-value
   |  /usr/src/bin/sh/exec.c:138:18 in shellexec
   |  Exiting
   |  ./hello.com: 6: Syntax error: ")" unexpected
   |
   |  There is something wrong with or around padvance(). ':' gets stripped
   |  from PATH and there is an uninitialized memory read.

 I will take a look at that, but first, Kamil can you tell me which sh
 source version you used for that, HEAD, or netbsd-9, or ?   (Perhaps the
 cvs version numbers of exec.c and eval.c).

 Also, which hello.com are you using?  When I try it (with the original
 from the original PR, fetched soon after it was filed) I see:

 	jinx$ ./sh /tmp/hello.com
 	./sh: Cannot execute ELF binary /tmp/hello.com

 which is exactly what I'd expect to see (but the version I have does not
 start with that "MZqFpD=\n" string).

 The analysis from msan is a little odd, as cmdentry is a local var (struct)
 used in evalcommand(), and while fields from it are passed to other functions,
 the struct itself is not (nor is its address ever evaluated), so it is a
 bit hard to imagine how shellexec() is reading uninit'd values from it.

 On the other hand, when evalcommand() calls shellexec() it does pass
 cmdentry.u.index (a field in a union in the struct) as one of the parameters,
 but if that is the uninit'd value, I'd have expected it to be detected
 where it is fetched (in evalcommand()) rather than where the value is used.

 Apart from that, about all I can imagine is that something has a wild
 pointer which is accessing random stack memory -- that or perhaps the
 random(ish) binary data which is being used here is somehow fooling msan
 into believing something different than what is actually happening.

 kre

 ps: everyone ignore the question/comment about mmap() ... martin@
 reminded me that that's just malloc() doing its thing.   I date from
 the days when malloc() used sbrk()...


From: Kamil Rytarowski <kamil@netbsd.org>
To: Robert Elz <kre@munnari.OZ.AU>, gnats-bugs@netbsd.org, kamil@netbsd.org
Cc: 
Subject: Re: bin/55979 (sh single quotes removes nul characters)
Date: Wed, 10 Feb 2021 18:37:43 +0100

 On 10.02.2021 17:48, Robert Elz wrote:
 >     Date:        Wed, 10 Feb 2021 08:45:02 +0000 (UTC)
 >     From:        Kamil Rytarowski <kamil@netbsd.org>
 >     Message-ID:  <20210210084502.5509F1A9241@mollari.NetBSD.org>
 > 
 >   |  SUMMARY: MemorySanitizer: use-of-uninitialized-value
 >   |  /usr/src/bin/sh/exec.c:138:18 in shellexec
 >   |  Exiting
 >   |  ./hello.com: 6: Syntax error: ")" unexpected
 >   |  
 >   |  There is something wrong with or around padvance(). ':' gets stripped
 >   |  from PATH and there is an uninitialized memory read.
 > 
 > I will take a look at that, but first, Kamil can you tell me which sh
 > source version you used for that, HEAD, or netbsd-9, or ?   (Perhaps the
 > cvs version numbers of exec.c and eval.c).
 > 
 > Also, which hello.com are you using?  When I try it (with the original
 > from the original PR, fetched soon after it was filed) I see:
 > 
 > 	jinx$ ./sh /tmp/hello.com
 > 	./sh: Cannot execute ELF binary /tmp/hello.com
 > 
 > which is exactly what I'd expect to see (but version I have does not
 > start with that "MZqFpD=\n" string).
 > 
 > The analysis from msan is a little odd, as cmdentry local var (struct)
 > used in evalcommand(), and while fields from it are passed to other functions,
 > the struct itself is not (nor is its address ever evaluated), so it is a
 > bit hard to imagine how shellexec() is reading uninit'd values from it.
 > 
 > On the other hand, when evalcommand() calls shellexec() it does pass
 > cmdentry.u.index (a field in a union in the struct) as one of the parameters,
 > but if that is the uninit'd value, I'd have expected it to be detected
 > where it is fetched (in evalcommand()) rather than where the value is used.
 > 
 > Apart from that, about all I can imagine is that something has a wild
 > pointer which is accessing random stack memory -- that or perhaps the
 > random(ish) binary data which is being used here is somehow fooling msan
 > into believing something different than what is actually happening.
 > 
 > kre
 > 
 > ps: everyone ignore the questing/comment about mmap() ... martin@
 > reminded me that that's just malloc() doing its thing.   I date from
 > the days when malloc() used sbrk()...
 > 
 > 

 I've uploaded a tarball with src/bin/sh/ with hello.com and the prebuilt
 sh executable to:

 http://netbsd.org/~kamil/sh-pr55979.tar.bz2 (2.5 MB)

 $ uname -a
 NetBSD chieftec 9.99.79 NetBSD 9.99.79 (GENERIC) #3: Tue Jan 26 13:24:54 CET 2021 root@chieftec:/public/netbsd-root/sys/arch/amd64/compile/GENERIC amd64

 The sh version is from HEAD of CVS and I have removed local patches
 printf-ing some debug code as seen in the original report.

 It shows me:

 $ ./sh hello.com

 ==14798==WARNING: MemorySanitizer: use-of-uninitialized-value
     #0 0x46fa69 in shellexec /usr/src/bin/sh/exec.c:136:18
     #1 0x464ddc in evalcommand /usr/src/bin/sh/eval.c:1392:3
     #2 0x44e198 in evaltree /usr/src/bin/sh/eval.c:375:4
     #3 0x518f3f in cmdloop /usr/src/bin/sh/main.c:320:4
     #4 0x51736e in main /usr/src/bin/sh/main.c:262:3
     #5 0x41fa8b in ___start (/usr/src/bin/sh/./sh+0x41fa8b)

   Uninitialized value was stored to memory at
     #0 0x46f31d in shellexec /usr/src/bin/sh/exec.c:126
     #1 0x464ddc in evalcommand /usr/src/bin/sh/eval.c:1392:3
     #2 0x44e198 in evaltree /usr/src/bin/sh/eval.c:375:4
     #3 0x518f3f in cmdloop /usr/src/bin/sh/main.c:320:4
     #4 0x51736e in main /usr/src/bin/sh/main.c:262:3
     #5 0x41fa8b in ___start (/usr/src/bin/sh/./sh+0x41fa8b)

   Uninitialized value was created by an allocation of 'cmdentry' in the
 stack frame of function 'evalcommand'
     #0 0x458420 in evalcommand /usr/src/bin/sh/eval.c:870

 SUMMARY: MemorySanitizer: use-of-uninitialized-value
 /usr/src/bin/sh/exec.c:136:18 in shellexec
 Exiting
 hello.com: 6: Syntax error: ")" unexpected

From: Kamil Rytarowski <kamil@netbsd.org>
To: Kamil Rytarowski <kamil@netbsd.org>, Robert Elz <kre@munnari.OZ.AU>,
 gnats-bugs@netbsd.org
Cc: 
Subject: Re: bin/55979 (sh single quotes removes nul characters)
Date: Wed, 10 Feb 2021 18:39:51 +0100

 On 10.02.2021 18:37, Kamil Rytarowski wrote:
 > On the other hand, when evalcommand() calls shellexec() it does pass
 > cmdentry.u.index (a field in a union in the struct) as one of the parameters,
 > but if that is the uninit'd value, I'd have expected it to be detected
 > where it is fetched (in evalcommand()) rather than where the value is used.

 MSan detects uninitialized memory read once it is used, not once it is
 fetched.

From: Justine Tunney <jtunney@gmail.com>
To: gnats-bugs@netbsd.org
Cc: kre@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: bin/55979 (sh single quotes removes nul characters)
Date: Wed, 10 Feb 2021 10:39:13 -0800

 --0000000000004da53505baffba91
 Content-Type: text/plain; charset="UTF-8"

 Have we tried using ASAN to troubleshoot this?

 > the hello.com you provided a link to earlier started \177ELF

 That's because the binary modified itself. The code following MZqFpD='' is
 a printf ELF>$0 so the first 64 bytes have a conventional ELF header for
 subsequent invocations. Try downloading https://justine.lol/hello.com
 again. That file can be your test case. I can create a more minimal one too
 if you need it.
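
As a rough illustration of why the two copies of hello.com behave differently, the sketch below classifies a file by its leading bytes. It is only a sketch: the enum, the function name classify_header(), and the choice of "MZqFpD=" as the prefix to compare are assumptions for this example, and this is not how sh or the kernel actually decides.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical classification of a file by its first bytes. */
enum header_kind { HDR_ELF, HDR_APE, HDR_OTHER };

enum header_kind
classify_header(const unsigned char *buf, size_t len)
{
	/* "\177ELF": the conventional ELF magic number. */
	static const unsigned char elf_magic[4] = { 0x7f, 'E', 'L', 'F' };
	/* Prefix mentioned in this thread for an unmodified APE file. */
	static const char ape_magic[] = "MZqFpD=";
	const size_t ape_len = sizeof ape_magic - 1;

	assert(buf != NULL || len == 0);

	if (len >= sizeof elf_magic &&
	    memcmp(buf, elf_magic, sizeof elf_magic) == 0)
		return HDR_ELF;
	if (len >= ape_len && memcmp(buf, ape_magic, ape_len) == 0)
		return HDR_APE;
	return HDR_OTHER;
}
```

A freshly downloaded file would fall in the HDR_APE case and be read as a script; after it has rewritten its own first bytes it falls in the HDR_ELF case.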


 --0000000000004da53505baffba91--

From: Robert Elz <kre@munnari.OZ.AU>
To: Justine Tunney <jtunney@gmail.com>
Cc: gnats-bugs@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: bin/55979 (sh single quotes removes nul characters)
Date: Thu, 11 Feb 2021 09:33:38 +0700

     Date:        Wed, 10 Feb 2021 10:39:13 -0800
     From:        Justine Tunney <jtunney@gmail.com>
     Message-ID:  <CANtdasRbJQCznmCStFQFsYuOrb8BjQdoQiv8XLfAxOEYgD2RAw@mail.gmail.com>

   | That's because the binary modified itself.

 Oh.   OK.

   | Try downloading https://justine.lol/hello.com again.

 Did that, I'll see what I can find.

 kre

From: Robert Elz <kre@munnari.OZ.AU>
To: Kamil Rytarowski <kamil@netbsd.org>
Cc: gnats-bugs@netbsd.org
Subject: Re: bin/55979 (sh single quotes removes nul characters)
Date: Thu, 11 Feb 2021 16:55:39 +0700

     Date:        Wed, 10 Feb 2021 18:39:51 +0100
     From:        Kamil Rytarowski <kamil@netbsd.org>
     Message-ID:  <adab2a54-892d-36fd-895e-d056bebe8bf2@netbsd.org>

   | MSan detects uninitialized memory read
   | once it is used, not once it is fetched.


 oh.  ok.  that would explain things.  thanks
 for the tarball.

 kre

From: Robert Elz <kre@munnari.OZ.AU>
To: Kamil Rytarowski <kamil@netbsd.org>
Cc: gnats-bugs@netbsd.org
Subject: Re: bin/55979 (sh single quotes removes nul characters)
Date: Fri, 12 Feb 2021 00:48:31 +0700

     Date:        Wed, 10 Feb 2021 18:37:43 +0100
     From:        Kamil Rytarowski <kamil@netbsd.org>
     Message-ID:  <d926a6ef-5877-f1de-bf77-d7b2b57027c2@netbsd.org>

   | It shows me:
   |
   | $ ./sh hello.com
   |
   | ==14798==WARNING: MemorySanitizer: use-of-uninitialized-value
   |     #0 0x46fa69 in shellexec /usr/src/bin/sh/exec.c:136:18

 I know what this is now, it was introduced (inadvertently) as part of
 the fix for PRs 42184 and 52687 (two PRs, same issue) - the problem there
 was short-circuiting command not found processing, when a command name
 without a '/' in it was not found in a PATH search.

 The PATH search needs to be done in the parent shell (the one which read
 the command and is processing it) so the results can be cached, but all
 other processing (including redirects, which are still supposed to happen -
 and in particular which can redirect stderr, to which the command not found
 error message should be sent) is supposed to happen in a sub-shell.   The
 way it was before the fix, when the parent shell fails to locate the command
 in the path search, it simply aborted command processing, issued the error
 message and went on with whatever should happen next (issuing next prompt,
 or whatever) - the redirects never happened (so the not found error message
 went to the shell's stderr, rather than to where stderr had been redirected).
 The fix for that was simply to drop the short-circuit test, and treat not-found
 commands the same as found commands, fork a sub-shell, process all the
 redirects, and then go find the command again (which will fail again, except
 in the very weird case of a race condition where the command suddenly appears
 in a directory that is in PATH, but we don't care about that, it's a race,
 someone wins, someone loses.)
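
The ordering described above (fork, apply the redirects in the child, only then attempt the exec and report failure) can be sketched as follows. This is a simplified illustration, not the code in eval.c or exec.c: run_redirected() is a hypothetical name, it handles only a stderr redirect, and it uses execlp() rather than the shell's own PATH walk.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical sketch: run cmd with stderr redirected to errfile.
 * Returns the child's exit status, or -1 on failure in the parent.
 */
int
run_redirected(const char *cmd, const char *errfile)
{
	pid_t pid;
	int status;

	assert(cmd != NULL && errfile != NULL);

	pid = fork();
	if (pid < 0)
		return -1;

	if (pid == 0) {
		char msg[256];
		ssize_t n;
		int fd;

		/* Child: perform the redirect before looking for the command. */
		fd = open(errfile, O_WRONLY | O_CREAT | O_TRUNC, 0600);
		if (fd < 0 || dup2(fd, STDERR_FILENO) < 0)
			_exit(126);
		close(fd);

		execlp(cmd, cmd, (char *)NULL);

		/* exec failed: the message now goes to the redirected stderr. */
		snprintf(msg, sizeof msg, "%s: not found\n", cmd);
		n = write(STDERR_FILENO, msg, strlen(msg));
		(void)n;
		_exit(127);
	}

	/* Parent: just wait; it never printed the error itself. */
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_redirected() with a command name that does not exist leaves the "not found" text in errfile instead of on the caller's stderr, which is the behaviour the fix for those PRs was after.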

 The relevant var (field in the cmdentry struct), the index, wasn't being
 set in the not-found case.   It marks which entry in PATH located the
 command (but of course, there is none for a not-found command).  With
 the old short circuit in place, that was fine, as after the initial not
 found lookup, everything stopped, and nothing ever looked at that value.

 Now, that has changed, and the second time we search for the command, we
 do look at it (PATH isn't an array, so the uninit'd var isn't being used
 as an index, just to compare against the counted elements in PATH as they
 are processed one by one - it avoids meaningless attempts at execl() when
 we have already determined that the command won't be found there).   Since
 the command doesn't exist anywhere, it really makes no difference what the
 index value we're looking at says, anywhere it points us to (including
 nowhere) isn't going to locate the command.  Except in the (non-existing
 I believe) case of hardware which actually detects the use of the uninitialised
 value and does something strange (stranger than simply using some random value)
 (MSAN is the one example of such "hardware") I believe this problem makes
 no difference to anything that matters.
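
A small sketch of how such an index is used may help. It assumes the index means "skip the PATH entries before this position", which is my reading of the description above rather than a copy of shellexec(); entries_tried() is a hypothetical name and it only counts the entries that would be attempted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model: walk a colon-separated PATH and count the
 * entries at position >= idx, i.e. the ones an exec would be tried in.
 */
size_t
entries_tried(const char *path, int idx)
{
	const char *p = path;
	size_t tried = 0;
	int pos = 0;

	assert(path != NULL);

	for (;;) {
		/* Length of the current entry, up to ':' or end of string. */
		size_t len = strcspn(p, ":");

		if (pos >= idx)
			tried++;	/* a real shell would attempt exec here */
		pos++;

		if (p[len] == '\0')
			break;
		p += len + 1;
	}
	return tried;
}
```

For a command that exists nowhere, every value of idx leads to the same end result (nothing runs); the value only changes how many pointless attempts are made, which is why an unset idx is harmless in practice yet still worth giving a defined value.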

 So, while it will get fixed, I don't think this will actually affect the
 actual issue in this PR (as amended) and I most likely won't bother
 requesting pullups to -8 or -9 (the change that caused it happened in
 Aug 2018, before -9 was released, and was marked to be pulled up to -8,
 which I assume happened).   That is unless some other change related to
 this PR is needed, and should be pulled up, in which case this trivial fix
 can hitch a ride.

 A commit to fix this in HEAD will come sometime in the not too distant
 future.

 The issue in this PR triggered the MSAN detection, as it ends up attempting
 to run a bogus (and hence, not found) command name.   Why that is happening
 is the next thing to discover, but it cannot be because of this (MSAN) issue,
 this comes later.

 kre

From: Robert Elz <kre@munnari.OZ.AU>
To: Justine Tunney <jtunney@gmail.com>
Cc: gnats-bugs@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: bin/55979 (sh single quotes removes nul characters)
Date: Sat, 13 Feb 2021 00:42:42 +0700

     Date:        Wed, 10 Feb 2021 10:39:13 -0800
     From:        Justine Tunney <jtunney@gmail.com>
     Message-ID:  <CANtdasRbJQCznmCStFQFsYuOrb8BjQdoQiv8XLfAxOEYgD2RAw@mail.gmail.com>

   | Try downloading https://justine.lol/hello.com again.
   | That file can be your test case.

 I did that, and used it, though:

   | I can create a more minimal one too if you need it.

 I did that for myself as well, once I worked out just what was
 going on, and what was really needed to perform the test (small
 tests that test nothing more than the bug are ideal ... of course
 to make one of those you need to first know exactly what the
 bug really is).

 Turns out that back in 1995 a fix for another problem broke the
 way that \0's were being dropped, and caused corrupted input (more
 corrupted than just dropping the nul chars) to be handed to the
 shell parser.

 That we have had this bug for more than 25 years says something
 about just how rare it is for shell scripts to actually contain
 nul characters.   Of course, any that do are non-conforming, so
 we could simply claim our current behaviour as correct (though
 it clearly is an obvious bug ... now it has been found).

 I am currently building an updated system (HEAD) with the fix
 (and the MSAN detected bug fix as well, not that that one really
 matters), after which (tomorrow, or later today depending upon
 timezones) I'll run the ATF tests to check that nothing new appears
 to have broken (as this fix should change nothing for a script without
 nul characters in it, and the MSAN fix change is truly trivial,
 that should not be a problem).

 Expect to see the fix(es) in HEAD early next week.

 After it has caused no problems there for a while, I'll request
 a pullup to -9.   But as these bugs bother almost no one, ever,
 and because -8's remaining lifetime isn't expected
 to be all that long, I don't think I'll bother with pullups to -8.
 (Of course, any other developer who wants to do the work to
 incorporate the fixes there, and test them properly, could do that.)

 If you want to test the updated sh on a -9 system, just copy the
 sources from HEAD (after the fix appears) and build it on a -9
 system (cd src/bin/sh; <update sources to HEAD>; make) (or as part
 of a build of the -9 system on any host).  There should be no problems
 doing that (but going to -8, while not all that hard, is slightly more
 complex than just copy and build).   Note that building sh requires
 access to the sources of its built-in commands (test, printf, kill, ...)
 so the rest of the (userland at least) sources should be present.

 kre

 ps: no changes are needed to the way we detect binary files so as not to
 interpret them as scripts - that turned out to all be misdirection, the
 actual problem is closer to what was originally reported, though single
 quoted strings have nothing specific to do with it, and nuls certainly
 do not survive sh processing.


From: Justine Tunney <jtunney@gmail.com>
To: gnats-bugs@netbsd.org
Cc: kre@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: bin/55979 (sh single quotes removes nul characters)
Date: Mon, 15 Feb 2021 13:18:53 -0800

 --00000000000079a54205bb668aaf
 Content-Type: text/plain; charset="UTF-8"

 I think the fact that people are still discovering novel use cases for the
 design of the Thompson Shell after all these years, just goes to show how
 gracefully the technology has aged, and how empowering it continues to be
 as a technology. Thanks for being flexible. I'm looking forward to seeing
 the fixes!


 --00000000000079a54205bb668aaf--

From: Robert Elz <kre@munnari.OZ.AU>
To: Justine Tunney <jtunney@gmail.com>
Cc: gnats-bugs@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: bin/55979 (sh single quotes removes nul characters)
Date: Tue, 16 Feb 2021 05:24:22 +0700

     Date:        Mon, 15 Feb 2021 13:18:53 -0800
     From:        Justine Tunney <jtunney@gmail.com>
     Message-ID:  <CANtdasQBkrubwkgnR6oa+ruFJUc7g8P2YJWA-YEP7=9qy9Xu7g@mail.gmail.com>

   | I think the fact that people are still discovering novel use cases for the
   | design of the Thompson Shell after all these years,

 While the basic design of the Thompson shell persists (kind of), that one
 really is dead now (long long dead).   It had way too many limitations.

   | I'm looking forward to seeing the fixes!

 Soon.  I decided we need an ATF test for this first (committed before
 the fix), so I'm writing that initially (part of what it will eventually
 become), and soon after it is committed the fix will appear (I have your
 hello.com script working).

 kre


From: "Robert Elz" <kre@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/55979 CVS commit: src
Date: Tue, 16 Feb 2021 09:46:25 +0000

 Module Name:	src
 Committed By:	kre
 Date:		Tue Feb 16 09:46:24 UTC 2021

 Modified Files:
 	src/distrib/sets/lists/tests: mi
 	src/tests/bin/sh: Makefile
 Added Files:
 	src/tests/bin/sh: t_input.sh

 Log Message:
 PR bin/55979

 Add a sh ATF test to demonstrate a bug in the way that \0 characters
 are dropped from scripts.   This test will eventually be extended to
 test other potential sh script input related issues.

 When initially committed, this test should fail.  It should succeed
 when the fix for the PR is committed (soon).

 Nb: this tests only the \0 related issues from the PR, the MSAN
 detected uninitialised variable (struct field) can only be detected
 by MSAN, as it has no visible impact on the operation of the shell
 when running on any real (or even emulated) hardware.
 (It will, however, also be fixed).


 To generate a diff of this commit:
 cvs rdiff -u -r1.1018 -r1.1019 src/distrib/sets/lists/tests/mi
 cvs rdiff -u -r1.14 -r1.15 src/tests/bin/sh/Makefile
 cvs rdiff -u -r0 -r1.1 src/tests/bin/sh/t_input.sh

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Robert Elz" <kre@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/55979 CVS commit: src/bin/sh
Date: Tue, 16 Feb 2021 15:30:12 +0000

 Module Name:	src
 Committed By:	kre
 Date:		Tue Feb 16 15:30:12 UTC 2021

 Modified Files:
 	src/bin/sh: exec.c

 Log Message:
 PR bin/55979

 This fixes the MSAN detected reference to an uninitialised variable
 (an uninitialised field in a struct) which happens when a command is
 not found after a PATH search.

 Aside from skipping some known to be going to fail exec*() calls
 in some cases, the setting of the relevant field is irrelevant,
 so this problem makes no practical difference to the shell, or any
 shell script.

 XXX (maybe) pullup -9


 To generate a diff of this commit:
 cvs rdiff -u -r1.54 -r1.55 src/bin/sh/exec.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Robert Elz" <kre@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/55979 CVS commit: src/bin/sh
Date: Tue, 16 Feb 2021 15:30:26 +0000

 Module Name:	src
 Committed By:	kre
 Date:		Tue Feb 16 15:30:26 UTC 2021

 Modified Files:
 	src/bin/sh: input.c

 Log Message:
 PR bin/55979

 Correctly handle (ie: ignore completely) \0 chars (nuls) in the
 shell command input stream (script, dot file, or stdin).

 Previously nul chars were ignored correctly in the line in which
 they occurred, but would cause trailing chars of that line to reappear
 as the start of the following line.   If there was just one \0 skipped,
 this would generally result in an extra \n in the sh input, which in
 most cases has no effect.   With multiple \0's in a single line, more
 of the end of that line was duplicated into the following one.  This
 usually manifested as a weird "command not found" error.

 Note that any \0 chars in the sh input make the script non-conforming,
 so fixing this is not crucial (no \0's should really ever be seen) but
 it was an obvious bug in the code, which was attempting to ignore nul
 chars (as do many other shells), so let it be fixed.

 XXX pullup -9


 To generate a diff of this commit:
 cvs rdiff -u -r1.71 -r1.72 src/bin/sh/input.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.
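
A minimal sketch of the failure mode the input.c log message above
describes. It is not the actual input.c code. It assumes the line reader
drops \0 bytes by compacting each line in place, and that the bug was
resuming at the end of the compacted line rather than at the point where
reading stopped. The name read_lines and its resume_at_read_cursor flag
are hypothetical, introduced only for illustration.

```python
NUL = 0x00
NEWLINE = 0x0A


def read_lines(data: bytes, resume_at_read_cursor: bool) -> list:
    """Split data into lines, dropping NUL bytes by in-place compaction.

    resume_at_read_cursor=False models the assumed bug: the next line
    starts at the write cursor, so stale bytes left behind by the
    compaction are read a second time.
    resume_at_read_cursor=True models the fixed behaviour.
    """
    buf = bytearray(data)
    end = len(buf)
    start = 0
    lines = []
    while start < end:
        p = q = start              # p: read cursor, q: write cursor
        while p < end:
            c = buf[p]
            p += 1
            if c == NUL:
                continue           # skip the NUL, q now lags behind p
            buf[q] = c             # copy the byte down over the gap
            q += 1
            if c == NEWLINE:
                break              # end of this line
        if q == start:
            break                  # only NULs were left in the buffer
        lines.append(bytes(buf[start:q]))
        start = p if resume_at_read_cursor else q
    return lines


if __name__ == "__main__":
    one_nul = b"echo a\x00b\n"
    three_nuls = b"\x00\x00\x00echo hi\necho ok\n"
    for name, script in (("one nul", one_nul), ("three nuls", three_nuls)):
        print(name, "buggy:", read_lines(script, False))
        print(name, "fixed:", read_lines(script, True))
```

Under these assumptions, one skipped \0 leaves a stray newline behind
(an extra empty line in the input), while three skipped \0s leave "hi\n"
behind, which a shell would then try to run as a command named "hi".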

State-Changed-From-To: analyzed->feedback
State-Changed-By: kre@NetBSD.org
State-Changed-When: Tue, 16 Feb 2021 15:43:56 +0000
State-Changed-Why:
Can you verify that the problem is now solved?


State-Changed-From-To: feedback->pending-pullups
State-Changed-By: kre@NetBSD.org
State-Changed-When: Mon, 05 Apr 2021 05:09:55 +0000
State-Changed-Why:
[pullup-9 #1242] requested.


From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/55979 CVS commit: [netbsd-9] src/bin/sh
Date: Tue, 6 Apr 2021 17:52:03 +0000

 Module Name:	src
 Committed By:	martin
 Date:		Tue Apr  6 17:52:03 UTC 2021

 Modified Files:
 	src/bin/sh [netbsd-9]: exec.c input.c

 Log Message:
 Pull up following revision(s) (requested by kre in ticket #1242):

 	bin/sh/input.c: revision 1.72
 	bin/sh/exec.c: revision 1.55

 PR bin/55979

 This fixes the MSAN detected reference to an uninitialised variable
 (an uninitialised field in a struct) which happens when a command is
 not found after a PATH search.
 Aside from skipping some known to be going to fail exec*() calls
 in some cases, the setting of the relevant field is irrelevant,
 so this problem makes no practical difference to the shell, or any
 shell script.

 XXX (maybe) pullup -9

 PR bin/55979

 Correctly handle (ie: ignore completely) \0 chars (nuls) in the
 shell command input stream (script, dot file, or stdin).
 Previously nul chars were ignored correctly in the line in which
 they occurred, but would cause trailing chars of that line to reappear
 as the start of the following line.   If there was just one \0 skipped,
 this would generally result in an extra \n in the sh input, which in
 most cases has no effect.   With multiple \0's in a single line, more
 of the end of that line was duplicated into the following one.  This
 usually manifested as a weird "command not found" error.

 Note that any \0 chars in the sh input make the script non-conforming,
 so fixing this is not crucial (no \0's should really ever be seen) but
 it was an obvious bug in the code, which was attempting to ignore nul
 chars (as do many other shells), so let it be fixed.

 XXX pullup -9


 To generate a diff of this commit:
 cvs rdiff -u -r1.53.2.1 -r1.53.2.2 src/bin/sh/exec.c
 cvs rdiff -u -r1.71 -r1.71.2.1 src/bin/sh/input.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: pending-pullups->closed
State-Changed-By: kre@NetBSD.org
State-Changed-When: Thu, 08 Apr 2021 14:53:29 +0000
State-Changed-Why:
Pullup done.  Feedback timeout.


>Unformatted:

Copyright © 1994-2020 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.