NetBSD Problem Report #59896

From jarle@ulf.intern.norid.no  Wed Jan  7 08:14:31 2026
Return-Path: <jarle@ulf.intern.norid.no>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (2048 bits)
	 client-signature RSA-PSS (2048 bits))
	(Client CN "mail.netbsd.org", Issuer "R13" (verified OK))
	by mollari.NetBSD.org (Postfix) with ESMTPS id B72D31A923C
	for <gnats-bugs@gnats.NetBSD.org>; Wed,  7 Jan 2026 08:14:31 +0000 (UTC)
Message-Id: <20260107080553.94E1AFEAE5@ulf.intern.norid.no>
Date: Wed,  7 Jan 2026 09:05:53 +0100 (CET)
From: jarle.greipsland@norid.no
Reply-To: jarle.greipsland@norid.no
To: gnats-bugs@NetBSD.org
Subject: tr extends string2 incorrectly when [#*0] is specified
X-Send-Pr-Version: 3.95
X-From4GNATS: "jarle.greipsland@norid.no via gnats" <gnats-admin@NetBSD.org>

>Number:         59896
>Category:       bin
>Synopsis:       tr extends string2 incorrectly when [#*0] is specified
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    bin-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Jan 07 08:15:00 +0000 2026
>Last-Modified:  Sun Jan 11 15:10:01 +0000 2026
>Originator:     jarle.greipsland@norid.no
>Release:        NetBSD 10.1_STABLE
>Organization:

>Environment:


System: NetBSD ulf.intern.norid.no 10.1_STABLE NetBSD 10.1_STABLE (GENERIC) #3: Mon Dec 1 13:50:12 CET 2025 jarle@ulf.intern.norid.no:/usr/obj/sys/arch/amd64/compile/GENERIC amd64
Architecture: x86_64
Machine: amd64
>Description:

NetBSD tr:
$ echo A N Y | tr A-Z 'a-c[q*20]x-z'
a q y
which is the expected output.

$ echo A N Y | tr A-Z 'a-c[q*0]x-z'
a q q
should have been equivalent to the first expression, but it isn't.

The man page states:
... If n is omitted or is zero, it is interpreted as
    large enough to extend the string2 sequence to the length of
    string1.  If n has a leading zero, it is interpreted as an
    octal value; otherwise, it is interpreted as a decimal
    value.
It seems like NetBSD's tr does not take into account any characters
in string2 found after the [#*n] expression.

GNU tr from coreutils behaves as I would expect:
$ echo A N Y | gtr A-Z 'a-c[q*20]x-z'
a q y
$ echo A N Y | gtr A-Z 'a-c[q*0]x-z'
a q y
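
The two behaviours above can be reproduced with a small model. This is only a sketch, not the code of either implementation: parse(), expand(), tr() and the trailing_counts flag are hypothetical names, only literals, simple ranges and [c*n] are handled, and at most one zero-count construct is assumed. With trailing_counts=False the zero-count repeat fills string2 up to the length of string1 and anything after it falls off the end; with trailing_counts=True room is left for the characters that follow.

```python
import re


def parse(spec):
    """Parse a tiny subset of tr's string syntax: literals, x-y ranges, [c*n]."""
    items = []  # ("chars", text) or ("repeat", char, count)
    i = 0
    while i < len(spec):
        m = re.match(r"\[(.)\*(\d*)\]", spec[i:])
        if m:
            digits = m.group(2)
            # A leading zero means octal, as in the man page text quoted above.
            count = int(digits, 8) if digits.startswith("0") else int(digits or "0")
            items.append(("repeat", m.group(1), count))
            i += m.end()
        elif i + 2 < len(spec) and spec[i + 1] == "-":
            lo, hi = ord(spec[i]), ord(spec[i + 2])
            items.append(("chars", "".join(chr(c) for c in range(lo, hi + 1))))
            i += 3
        else:
            items.append(("chars", spec[i]))
            i += 1
    return items


def expand(spec, target_len, trailing_counts):
    """Expand spec; a zero-count repeat is sized against target_len."""
    items = parse(spec)
    out = ""
    for idx, item in enumerate(items):
        if item[0] == "chars":
            out += item[1]
            continue
        _, ch, count = item
        if count == 0:
            reserved = 0
            if trailing_counts:
                # Leave room for the literal characters after the construct.
                reserved = sum(len(it[1]) for it in items[idx + 1:] if it[0] == "chars")
            count = max(target_len - len(out) - reserved, 0)
        out += ch * count
    return out


def tr(text, string1, string2, trailing_counts):
    """Translate text; extra characters in string2 beyond string1 are ignored."""
    s1 = expand(string1, 0, False)
    s2 = expand(string2, len(s1), trailing_counts)
    if s2 and len(s2) < len(s1):
        # Short string2: repeat its last character.
        s2 += s2[-1] * (len(s1) - len(s2))
    return text.translate({ord(a): b for a, b in zip(s1, s2)})


print(tr("A N Y", "A-Z", "a-c[q*0]x-z", False))  # a q q
print(tr("A N Y", "A-Z", "a-c[q*0]x-z", True))   # a q y
```

With an explicit count, as in 'a-c[q*20]x-z', both settings give "a q y", since string2 then already has 26 characters.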

>How-To-Repeat:
see above

>Fix:


>Audit-Trail:
From: RVP <rvp@SDF.ORG>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: bin/59896: tr extends string2 incorrectly when [#*0] is
 specified
Date: Sat, 10 Jan 2026 07:32:51 +0000 (UTC)

 On Wed, 7 Jan 2026, jarle.greipsland@norid.no via gnats wrote:

 > NetBSD tr:
 > $ echo A N Y | tr A-Z 'a-c[q*20]x-z'
 > a q y
 > which is the expected output.
 >
 > $ echo A N Y | tr A-Z 'a-c[q*0]x-z'
 > a q q
 > should have been equivalent to the first expression, but it isn't.
 >
 > The man page states:
 > ... If n is omitted or is zero, it is interpreted as
 >    large enough to extend the string2 sequence to the length of
 >    string1.  If n has a leading zero, it is interpreted as an
 >    octal value; otherwise, it is interpreted as a decimal
 >    value.
 >

 It seems to me that this says `[q*]' will extend `string2' to the remaining
 length of `string1' first, then `x-z' will get tacked on, exceeding len. of
 `string1', and therefore will not be used.

 > It seems like NetBSD's tr does not take into account any characters
 > in string2 found after the [#*n] expression.
 >

 But, should it?

 > GNU tr from coreutils behaves as I would expect:
 > $ echo A N Y | gtr A-Z 'a-c[q*20]x-z'
 > a q y
 > $ echo A N Y | gtr A-Z 'a-c[q*0]x-z'
 > a q y
 >

 GNU tr(1) seems to be the only one which does this. Both OpenBSD and FreeBSD
 tr behave like NetBSD's; as does the tr in OpenIndiana 2025.10.

 -RVP

 PS. I like what GNU tr(1) does, but, I'd like clarification on this first,
 vis-à-vis POSIX and historical behaviour. :)

From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: bin/59896: tr extends string2 incorrectly when [#*0] is specified
Date: Sat, 10 Jan 2026 22:24:34 +0700

     Date:        Sat, 10 Jan 2026 08:50:04 +0000 (UTC)
     From:        "RVP via gnats" <gnats-admin@NetBSD.org>
     Message-ID:  <20260110085004.4B0121A9244@mollari.NetBSD.org>

   |  > It seems like NetBSD's tr does not take into account any characters
   |  > in string2 found after the [#*n] expression.

   |  But, should it?

 The GNU version as reported here (assuming there's no additional
 info, which is actually what I expect from many of their applications)
 is crazy.

 Consider

 	tr A-Z 'abc[q*0]rst[x*0]yz'

 and attempt to work out how many q's and how many x's should
 be in string 2.

 Just close this PR and forget it.

   |  PS. I like what GNU tr(1) does, but, I'd like clarification on this first,
   |  vis-à-vis POSIX and historical behaviour. :)

 POSIX's definition is essentially the same as our man page (you'd
 almost think that one copied the other, or both copied the same original :-)

 There's nothing in it about what happens if [x*0] appears twice, or
 how anything which follows one of those should be treated.  All examples
 using it use only that form (usually without the explicit 0).  That is,
 there is nothing else in string2 at all, the idea is to replace all of
 some set of chars (or all not in some set of chars) with 1 char.

 I think it makes sense to make it unspecified, or even undefined, if
 anything follows one of those, the [q*] should end string2.

 The (non-normative) rationale notes that one use of this notation is
 to allow the traditional BSD tr behaviour where a short string2 had its
 final char duplicated as many times as needed to make string2 the
 appropriate length, which would support this interpretation.

 kre

From: Jarle Greipsland <jarle.greipsland@norid.no>
To: gnats-bugs@netbsd.org, gnats-admin@NetBSD.org
Cc: netbsd-bugs@netbsd.org
Subject: Re: bin/59896: tr extends string2 incorrectly when [#*0] is
 specified
Date: Sun, 11 Jan 2026 15:57:57 +0100 (CET)

 "RVP via gnats" <gnats-admin@NetBSD.org> writes:
 >  On Wed, 7 Jan 2026, jarle.greipsland@norid.no via gnats wrote:
 >  > The man page states:
 >  > ... If n is omitted or is zero, it is interpreted as
 >  >    large enough to extend the string2 sequence to the length of
 >  >    string1.  If n has a leading zero, it is interpreted as an
 >  >    octal value; otherwise, it is interpreted as a decimal
 >  >    value.
 >  >
 >  It seems to me that this says `[q*]' will extend `string2' to the remaining
 >  length of `string1' first, then `x-z' will get tacked on, exceeding len. of
 >  `string1', and therefore will not be used.
 I guess that is a valid interpretation of the text, although not
 the first one that came to mind for a non-native speaker of
 English.

 >  > It seems like NetBSD's tr does not take into account any characters
 >  > in string2 found after the [#*n] expression.
 >  >
 >  But, should it?
 Good question.  It adds a capability to the utility (but possibly
 a fringe capability or one that is not wanted or is confusing).
 The behavior following from your interpretation can otherwise be
 achieved by specifying a short string2, and relying on the
 documented behavior where "the last character found in string2 is
 duplicated until string1 is exhausted."
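
 The equivalence described here can be sketched as follows. It is a model only: pad_string2() and translate() are hypothetical helpers, and the upper-case alphabet stands in for the already expanded A-Z. A short string2 such as 'a-cq' (written out as "abcq" below) gives the same result that NetBSD's tr was reported to give for 'a-c[q*0]x-z'.

```python
import string


def pad_string2(string1, string2):
    """Repeat the last character of string2 until it is as long as string1."""
    if string2 and len(string2) < len(string1):
        string2 += string2[-1] * (len(string1) - len(string2))
    return string2


def translate(text, string1, string2):
    """Map each character of string1 to the character at the same index in string2."""
    string2 = pad_string2(string1, string2)
    return text.translate({ord(a): b for a, b in zip(string1, string2)})


upper = string.ascii_uppercase  # stands in for the expanded A-Z
print(translate("A N Y", upper, "abcq"))  # a q q
```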

 >  > GNU tr from coreutils behaves as I would expect:
 >  > $ echo A N Y | gtr A-Z 'a-c[q*20]x-z'
 >  > a q y
 >  > $ echo A N Y | gtr A-Z 'a-c[q*0]x-z'
 >  > a q y
 >  >
 >  
 >  GNU tr(1) seems to be the only one which does this. Both OpenBSD and FreeBSD
 >  tr behave like NetBSD's; as does the tr in OpenIndiana 2025.10.
 Score one for conformity and predictability, I guess.

 					-jarle
 -- 
 "My poor knowledge of Greek mythology has always been my Archimedes heel."

From: Jarle Greipsland <jarle.greipsland@norid.no>
To: gnats-bugs@netbsd.org, gnats-admin@NetBSD.org
Cc: netbsd-bugs@netbsd.org
Subject: Re: bin/59896: tr extends string2 incorrectly when [#*0] is
 specified
Date: Sun, 11 Jan 2026 16:02:44 +0100 (CET)

 "Robert Elz via gnats" <gnats-admin@NetBSD.org> writes:
 >  The GNU version as reported here (assuming there's no additional
 >  info, which is actually what I expect from many of their applications)
 >  is crazy.
 >  
 >  Consider
 >  
 >  	tr A-Z 'abc[q*0]rst[x*0]yz'
 >  
 >  and attempt to work out how many q's and how many x's should
 >  be in string 2.
 $ gtr A-Z 'abc[q*0]rst[x*0]yz'
 gtr: only one [c*] repeat construct may appear in string2
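
 Why a second zero-count construct has no single answer can be shown with a little arithmetic. This is a sketch under the assumption that string2 must come out exactly as long as string1 and that the trailing characters are kept; possible_splits() is a hypothetical helper. For the example above, string1 has 26 characters and the fixed part of string2 ("abc", "rst", "yz") has 8, which leaves 18 positions to divide between the q's and the x's.

```python
def possible_splits(target_len, fixed_len):
    """All (first_count, second_count) pairs that fill string2 to target_len."""
    free = target_len - fixed_len
    return [(n, free - n) for n in range(free + 1)]


splits = possible_splits(target_len=26, fixed_len=len("abc" + "rst" + "yz"))
print(len(splits))            # 19
print(splits[0], splits[-1])  # (0, 18) (18, 0)
```

 Every one of these splits fills string2 to the right length, so nothing in the text selects one of them.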

 >  Just close this PR and forget it.
 Fair enough.  I'll adjust my expectations.

 >  The (non-normative) rationale notes that one use of this notation is
 >  to allow the traditional BSD tr behaviour where a short string2 had its
 >  final char duplicated as many times as needed to make string2 the
 >  appropriate length, which would support this interpretation.
 Makes sense.
 				-jarle
 -- 
 "Computers are not intelligent.  They only think they are."

>Unformatted:

Copyright © 1994-2026 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.