NetBSD Problem Report #59896
From jarle@ulf.intern.norid.no Wed Jan 7 08:14:31 2026
Return-Path: <jarle@ulf.intern.norid.no>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
key-exchange X25519 server-signature RSA-PSS (2048 bits)
client-signature RSA-PSS (2048 bits))
(Client CN "mail.netbsd.org", Issuer "R13" (verified OK))
by mollari.NetBSD.org (Postfix) with ESMTPS id B72D31A923C
for <gnats-bugs@gnats.NetBSD.org>; Wed, 7 Jan 2026 08:14:31 +0000 (UTC)
Message-Id: <20260107080553.94E1AFEAE5@ulf.intern.norid.no>
Date: Wed, 7 Jan 2026 09:05:53 +0100 (CET)
From: jarle.greipsland@norid.no
Reply-To: jarle.greipsland@norid.no
To: gnats-bugs@NetBSD.org
Subject: tr extends string2 incorrectly when [#*0] is specified
X-Send-Pr-Version: 3.95
X-From4GNATS: "jarle.greipsland@norid.no via gnats" <gnats-admin@NetBSD.org>
>Number: 59896
>Category: bin
>Synopsis: tr extends string2 incorrectly when [#*0] is specified
>Confidential: no
>Severity: non-critical
>Priority: low
>Responsible: bin-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Jan 07 08:15:00 +0000 2026
>Last-Modified: Sun Jan 11 15:10:01 +0000 2026
>Originator: jarle.greipsland@norid.no
>Release: NetBSD 10.1_STABLE
>Organization:
>Environment:
System: NetBSD ulf.intern.norid.no 10.1_STABLE NetBSD 10.1_STABLE (GENERIC) #3: Mon Dec 1 13:50:12 CET 2025 jarle@ulf.intern.norid.no:/usr/obj/sys/arch/amd64/compile/GENERIC amd64
Architecture: x86_64
Machine: amd64
>Description:
NetBSD tr:
$ echo A N Y | tr A-Z 'a-c[q*20]x-z'
a q y
which is the expected output.
$ echo A N Y | tr A-Z 'a-c[q*0]x-z'
a q q
should have been equivalent to the first expression, but it isn't.
The man page states:
... If n is omitted or is zero, it is interpreted as
large enough to extend the string2 sequence to the length of
string1. If n has a leading zero, it is interpreted as an
octal value; otherwise, it is interpreted as a decimal
value.
It seems like NetBSD's tr does not take into account any characters
in string2 found after the [#*n] expression.
GNU tr from coreutils behaves as I would expect:
$ echo A N Y | gtr A-Z 'a-c[q*20]x-z'
a q y
$ echo A N Y | gtr A-Z 'a-c[q*0]x-z'
a q y
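The two results can be reproduced with a small model of how string2 is expanded. This is only an illustrative Python sketch, not code from NetBSD or GNU tr: the names `expand_ranges`, `expand_string2`, `translate` and the `count_tail` flag are hypothetical, only plain `x-y` ranges and a single `[c*n]` construct are handled, and the padding of a short string2 is not modelled.

```python
import re

REPEAT = re.compile(r"\[(.)\*(\d*)\]")   # matches a [c*n] construct


def expand_ranges(spec):
    # Expand 'x-y' ranges only; escapes and character classes are ignored.
    return re.sub(r"(.)-(.)",
                  lambda m: "".join(map(chr, range(ord(m[1]), ord(m[2]) + 1))),
                  spec)


def expand_string2(string2, len1, count_tail):
    # Assumption: at most one [c*n] construct appears in string2.
    m = REPEAT.search(string2)
    if m is None:
        return expand_ranges(string2)
    head = expand_ranges(string2[:m.start()])
    tail = expand_ranges(string2[m.end():])
    digits = m[2]
    # A leading zero means octal, otherwise decimal; empty means zero.
    n = int(digits, 8 if digits.startswith("0") else 10) if digits else 0
    if n == 0:
        # count_tail=False: fill up to len1 and ignore what follows.
        # count_tail=True: leave room for the characters that follow.
        reserved = len(tail) if count_tail else 0
        n = max(len1 - len(head) - reserved, 0)
    return head + m[1] * n + tail


def translate(text, string1, string2, count_tail):
    s1 = expand_ranges(string1)
    s2 = expand_string2(string2, len(s1), count_tail)
    # zip() stops at the shorter string, so any part of string2 beyond
    # the length of string1 is never used.
    return text.translate({ord(a): b for a, b in zip(s1, s2)})


print(translate("A N Y", "A-Z", "a-c[q*0]x-z", count_tail=False))  # a q q
print(translate("A N Y", "A-Z", "a-c[q*0]x-z", count_tail=True))   # a q y
```

With `count_tail=False` the repeat yields 23 copies of `q`, so `x-z` lands past position 26 and `Y` maps to `q`; with `count_tail=True` it yields 20 copies, which is the same as writing `[q*20]` explicitly.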
>How-To-Repeat:
see above
>Fix:
>Audit-Trail:
From: RVP <rvp@SDF.ORG>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: bin/59896: tr extends string2 incorrectly when [#*0] is specified
Date: Sat, 10 Jan 2026 07:32:51 +0000 (UTC)
On Wed, 7 Jan 2026, jarle.greipsland@norid.no via gnats wrote:
> NetBSD tr:
> $ echo A N Y | tr A-Z 'a-c[q*20]x-z'
> a q y
> which is the expected output.
>
> $ echo A N Y | tr A-Z 'a-c[q*0]x-z'
> a q q
> should have been equivalent to the first expression, but it isn't.
>
> The man page states:
> ... If n is omitted or is zero, it is interpreted as
> large enough to extend the string2 sequence to the length of
> string1. If n has a leading zero, it is interpreted as an
> octal value; otherwise, it is interpreted as a decimal
> value.
>
It seems to me that this says `[q*]' will extend `string2' to the remaining
length of `string1' first, then `x-z' will get tacked on, exceeding len. of
`string1', and therefore will not be used.
> It seems like NetBSD's tr does not take into account any characters
> in string2 found after the [#*n] expression.
>
But, should it?
> GNU tr from coreutils behaves as I would expect:
> $ echo A N Y | gtr A-Z 'a-c[q*20]x-z'
> a q y
> $ echo A N Y | gtr A-Z 'a-c[q*0]x-z'
> a q y
>
GNU tr(1) seems to be the only one which does this. Both OpenBSD and FreeBSD
tr behave like NetBSD's; as does the tr in OpenIndiana 2025.10.
-RVP
PS. I like what GNU tr(1) does, but, I'd like clarification on this first,
vis-à-vis POSIX and historical behaviour. :)
From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: bin/59896: tr extends string2 incorrectly when [#*0] is specified
Date: Sat, 10 Jan 2026 22:24:34 +0700
Date: Sat, 10 Jan 2026 08:50:04 +0000 (UTC)
From: "RVP via gnats" <gnats-admin@NetBSD.org>
Message-ID: <20260110085004.4B0121A9244@mollari.NetBSD.org>
| > It seems like NetBSD's tr does not take into account any characters
| > in string2 found after the [#*n] expression.
| But, should it?
The GNU version as reported here (assuming there's no additional
info, which is actually what I expect from many of their applications)
is crazy.
Consider
tr A-Z 'abc[q*0]rst[x*0]yz'
and attempt to work out how many q's and how many x's should
be in string 2.
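The ambiguity can be made concrete with a rough count. This sketch assumes the reading in which both repeats together should extend string2 to exactly the length of string1; the variable names are illustrative only.

```python
LEN1 = 26                                     # length of string1, A-Z
FIXED = len("abc") + len("rst") + len("yz")   # literal characters in string2

free = LEN1 - FIXED                           # positions left for q's and x's

# Every (number of q's, number of x's) pair that fills string2 exactly.
splits = [(nq, free - nq) for nq in range(free + 1)]

print(free)          # 18
print(len(splits))   # 19
print(splits[0], splits[-1])   # (0, 18) (18, 0)
```

There are 18 positions to share and 19 equally valid ways to share them, so nothing in the text of the man page picks one.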
Just close this PR and forget it.
| PS. I like what GNU tr(1) does, but, I'd like clarification on this first,
| vis-à-vis POSIX and historical behaviour. :)
POSIX's definition is essentially the same as our man page (you'd
almost think that one copied the other, or both copied the same original :-)
There's nothing in it about what happens if [x*0] appears twice, or
how anything which follows one of those should be treated. All examples
using it use only that form (usually without the explicit 0). That is,
there is nothing else in string2 at all, the idea is to replace all of
some set of chars (or all not in some set of chars) with 1 char.
I think it makes sense to make it unspecified, or even undefined, if
anything follows one of those; the [q*] should end string2.
The (non-normative) rationale notes that one use of this notation is
to allow the traditional BSD tr behaviour where a short string2 had its
final char duplicated as many times as needed to make string2 the
appropriate length, which would support this interpretation.
kre
From: Jarle Greipsland <jarle.greipsland@norid.no>
To: gnats-bugs@netbsd.org, gnats-admin@NetBSD.org
Cc: netbsd-bugs@netbsd.org
Subject: Re: bin/59896: tr extends string2 incorrectly when [#*0] is specified
Date: Sun, 11 Jan 2026 15:57:57 +0100 (CET)
"RVP via gnats" <gnats-admin@NetBSD.org> writes:
> On Wed, 7 Jan 2026, jarle.greipsland@norid.no via gnats wrote:
> > The man page states:
> > ... If n is omitted or is zero, it is interpreted as
> > large enough to extend the string2 sequence to the length of
> > string1. If n has a leading zero, it is interpreted as an
> > octal value; otherwise, it is interpreted as a decimal
> > value.
> >
> It seems to me that this says `[q*]' will extend `string2' to the remaining
> length of `string1' first, then `x-z' will get tacked on, exceeding len. of
> `string1', and therefore will not be used.
I guess that is a valid interpretation of the text. Although not
the first one that came to mind, for a non-native speaker of the
English language.
> > It seems like NetBSD's tr does not take into account any characters
> > in string2 found after the [#*n] expression.
> >
> But, should it?
Good question. It adds a capability to the utility (but possibly
a fringe capability or one that is not wanted or is confusing).
The behavior following from your interpretation can otherwise be
achieved by specifying a short string2 and relying on the
documented behavior where "the last character found in string2 is
duplicated until string1 is exhausted."
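The equivalence described here can be checked with a short sketch. It assumes the documented padding rule quoted above and the reading in which `[q*]` fills string2 up to the length of string1; `pad_last` is a hypothetical helper, not part of any tr implementation.

```python
def pad_last(string2, len1):
    """Duplicate the last character of string2 until it is len1 long."""
    if not string2 or len(string2) >= len1:
        return string2
    return string2 + string2[-1] * (len1 - len(string2))


s1 = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

short = pad_last("abcq", len(s1))        # short string2 'a-cq', padded
explicit = "abc" + "q" * (len(s1) - 3)   # 'a-c[q*]' filling up to len(s1)

print(short == explicit)   # True
```

Both forms yield `abc` followed by 23 copies of `q`, so under that reading the `[q*]` notation adds nothing a short string2 could not already express.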
> > GNU tr from coreutils behaves as I would expect:
> > $ echo A N Y | gtr A-Z 'a-c[q*20]x-z'
> > a q y
> > $ echo A N Y | gtr A-Z 'a-c[q*0]x-z'
> > a q y
> >
>
> GNU tr(1) seems to be the only one which does this. Both OpenBSD and FreeBSD
> tr behave like NetBSD's; as does the tr in OpenIndiana 2025.10.
Score one for conformity and predictability, I guess.
-jarle
--
"My poor knowledge of Greek mythology has always been my Archimedes heel."
From: Jarle Greipsland <jarle.greipsland@norid.no>
To: gnats-bugs@netbsd.org, gnats-admin@NetBSD.org
Cc: netbsd-bugs@netbsd.org
Subject: Re: bin/59896: tr extends string2 incorrectly when [#*0] is specified
Date: Sun, 11 Jan 2026 16:02:44 +0100 (CET)
"Robert Elz via gnats" <gnats-admin@NetBSD.org> writes:
> The GNU version as reported here (assuming there's no additional
> info, which is actually what I expect from many of their applications)
> is crazy.
>
> Consider
>
> tr A-Z 'abc[q*0]rst[x*0]yz'
>
> and attempt to work out how many q's and how many x's should
> be in string 2.
$ gtr A-Z 'abc[q*0]rst[x*0]yz'
gtr: only one [c*] repeat construct may appear in string2
> Just close this PR and forget it.
Fair enough. I'll adjust my expectations.
> The (non-normative) rationale notes that one use of this notation is
> to allow the traditional BSD tr behaviour where a short string2 had its
> final char duplicated as many times as needed to make string2 the
> appropriate length, which would support this interpretation.
Makes sense.
-jarle
--
"Computers are not intelligent. They only think they are."
>Unformatted: