NetBSD Problem Report #24014
Received: (qmail 29521 invoked by uid 605); 7 Jan 2004 17:26:16 -0000
Message-Id: <200401071726.i07HQ3WD002661@phaeton.entropie.net>
Date: Wed, 7 Jan 2004 18:26:04 +0100 (CET)
From: Martin Weber <Ephaeton@gmx.net>
Sender: gnats-bugs-owner@NetBSD.org
Reply-To: Ephaeton@gmx.net
To: gnats-bugs@gnats.NetBSD.org
Subject: fmt's ispref() is borked.
X-Send-Pr-Version: 3.95
>Number: 24014
>Category: bin
>Synopsis: fmt's ispref() steps only its first variable, leading to wrong results.
>Confidential: no
>Severity: non-critical
>Priority: low
>Responsible: bin-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Jan 07 17:27:00 +0000 2004
>Closed-Date:
>Last-Modified: Mon May 25 20:35:09 +0000 2009
>Originator: Martin Weber
>Release: NetBSD 1.6ZG
>Organization:
>Environment:
fmt.c:
$NetBSD: fmt.c,v 1.17 2003/08/07 11:13:47 agc Exp $
$NetBSD: fmt.c,v 1.17 2003/08/07 11:13:47 agc Exp $
System: NetBSD phaeton.entropie.net 1.6ZG NetBSD 1.6ZG (FAETON) #0: Mon Dec 15 01:29:47 CET 2003 root@phaeton.entropie.net:/space/obj/usr/src/sys/arch/i386/compile/FAETON i386
Architecture: i386
Machine: i386
>Description:
The following is ispref() from fmt.c, usr.bin/fmt/fmt.c:
static int ispref(const char*s1, const char*s2) {
while (*s1++ == *s2) ;
return (*s1 == '\0');
}
This only steps its first variable, i.e. "ab" is, due to ispref(),
NOT a prefix to "abcd". "aaaaaaaa", though, IS a prefix of "ab" ..
(due to ispref()). When adding the stepping of the second var (s2),
we gotta check *s1/*s2, too -- if s1 == s2 it'll coredump else
(ispref("a", "a") -> *s1++ == *s2++ ('a'), *s1++ == *s2++ ('\0'), *s1++ -> BOOM)
>How-To-Repeat:
Er, code viewing.
>Fix:
Index: fmt.c
===================================================================
RCS file: /pub/NetBSD-CVS/src/usr.bin/fmt/fmt.c,v
retrieving revision 1.17
diff -u -r1.17 fmt.c
--- fmt.c 2003/08/07 11:13:47 1.17
+++ fmt.c 2004/01/07 17:17:25
@@ -526,7 +526,7 @@
ispref(const char *s1, const char *s2)
{
- while (*s1++ == *s2)
+ while (*s1 && *s2 && *s1++ == *s2++)
;
return (*s1 == '\0');
}
>Release-Note:
>Audit-Trail:
State-Changed-From-To: open->feedback
State-Changed-By: wiz
State-Changed-When: Sat Jan 10 15:11:14 UTC 2004
State-Changed-Why:
From how ispref is used, it seems correct to me.
cp is initialized to the start of the line and then increased
as long as it contains a space; so the whole second argument
to ispref just contains spaces.
Do you have a reproducible case where your modified
version behaves "better" than the current version?
Cheers,
Thomas
From: Martin Weber <Ephaeton@gmx.net>
To: gnats-bugs@gnats.netbsd.org
Cc: Ephaeton@gmx.net, wiz@netbsd.org, gnats-admin@netbsd.org
Subject: Re: bin/24014
Date: Sat, 10 Jan 2004 17:07:01 +0100
On Sat, Jan 10, 2004 at 03:13:01PM -0000, wiz@netbsd.org wrote:
> Synopsis: fmt's ispref() steps only its first variable, leading to wrong results.
>
> State-Changed-From-To: open->feedback
> State-Changed-By: wiz
> State-Changed-When: Sat Jan 10 15:11:14 UTC 2004
> State-Changed-Why:
> >From how ispref is used, it seems correct to me.
> cp is initialized to the start of the line and then increased
> as long as it contains a space; so the whole second argument
> to ispref just contains spaces.
Er... huh ?
in prefix(), first step over all spaces, so cp points to the
start of non-space. Then we check To, Cc etc. with ispref.
Look at this:
(gdb) cont
Continuing.
Breakpoint 1, ispref (s1=0x8049c2e "Cc",
s2=0xbfbfed70 "Cc: bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla") at fmt.c:529
529 while (*s1++ == *s2)
(gdb) print *s1
$5 = 67 'C'
(gdb) print *s2
$6 = 67 'C'
(gdb) step 1
Breakpoint 1, ispref (s1=0x8049c2f "c",
s2=0xbfbfed70 "Cc: bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla") at fmt.c:529
529 while (*s1++ == *s2)
(gdb) print *s1
$7 = 99 'c'
(gdb) print *s2
$8 = 67 'C'
(gdb) step 1
531 return (*s1 == '\0');
(gdb) print *s1
$9 = 0 '\0'
(gdb) call ispref("Cc", "Cbla")
Breakpoint 1, ispref (s1=0x4805a010 "Cc", s2=0x48057120 "Cbla") at fmt.c:529
529 while (*s1++ == *s2)
The program being debugged stopped while in a function called from GDB.
When the function (ispref) is done executing, GDB will silently
stop (instead of continuing to evaluate the expression containing
the function call).
(gdb) print *s1
$10 = 67 'C'
(gdb) print *s2
$11 = 67 'C'
(gdb) step 1
Breakpoint 1, ispref (s1=0x4805a011 "c", s2=0x48057120 "Cbla") at fmt.c:529
529 while (*s1++ == *s2)
(gdb) print *s1
$12 = 99 'c'
(gdb) print *s2
$13 = 67 'C'
(gdb) step 1
531 return (*s1 == '\0');
(gdb) print *s1
$14 = 0 '\0'
(gdb)
>
> Do you have a reproducible case where your modified
> version behaves "better" than the current version?
a) if anyone ever decides to use the source of an open source utility
to get quality source they'll get the wrong impression if using ispref
from fmt
ispref("aaa", "abcde") IS true in the old implementation!
b) here you are (look at 3rd line!):
fmt-mine = fmt with my patch,
fmt-theirs = fmt without.
/usr/src/usr.bin/fmt//fmt-theirs < eingabe2
From wiz@netbsd.org Sat Jan 10 16:13:36 2004
To: bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla
Subject: bla Cc: bla bla bla bla bla bla bla bla bla bla bla bla
bla bla bla bla bla bla bla bla bla bla bla bla
...
/usr/src/usr.bin/fmt//fmt-mine < eingabe2
From wiz@netbsd.org Sat Jan 10 16:13:36 2004
To: bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla
Subject: bla
Cc: bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla
bla bla bla bla bla bla bla bla bla
Check the second line. The subject checking is always false due to the
borked ispref():
Continuing.
Breakpoint 1, ispref (s1=0x8049c26 "Subject", s2=0xbfbfed70 "Subject: bla")
at fmt.c:529
529 while (*s1++ == *s2)
(gdb) display *s1
2: *s1 = 83 'S'
(gdb) display *s2
3: *s2 = 83 'S'
(gdb) step 1
Breakpoint 1, ispref (s1=0x8049c27 "ubject", s2=0xbfbfed70 "Subject: bla")
at fmt.c:529
529 while (*s1++ == *s2)
3: *s2 = 83 'S'
2: *s1 = 117 'u'
(gdb) step 1
531 return (*s1 == '\0');
3: *s2 = 83 'S'
2: *s1 = 98 'b'
Oh, while at it, ishead() from mail.c is broken for WRITING mails (the From
header reads: From: bla, not From bla <date> ! See below [**])
The mark-lineno > 3 is wrong, too(*). Mails nowadays have more than two lines of
headers. I think it should always assume it has more headers to read if it
found a mail-header and read no blank line (that's the format isn't it?)
Compare (gdb) run of these two inputs:
From wiz@netbsd.org Sat Jan 10 16:13:36 2004
To: bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla
Subject: bla
Cc: bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla
ispref(..., "Cc: bla bla ...") not called.
vs.
From wiz@netbsd.org Sat Jan 10 16:13:36 2004
To: bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla
Cc: bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla
Subject: bla
ispref(..., "Subject: bla bla ...") not called.
[**]: my headerlines here:
From: Martin Weber <phaeton@phaeton.entropie.net>
To: gnats-bugs@gnats.netbsd.org
Cc: Ephaeton@gmx.net, wiz@netbsd.org, gnats-admin@netbsd.org
Bcc:
Subject: Re: bin/24014
fmt turns this into :
From: Martin Weber <phaeton@phaeton.entropie.net> To:
gnats-bugs@gnats.netbsd.org Cc: Ephaeton@gmx.net, wiz@netbsd.org,
gnats-admin@netbsd.org Bcc: Subject: Re: bin/24014
because it doesn't realize what From: means.
Concluding, we might say fmt is so brutally broken for formatting mail
messages (its original purpose, wasn't it ?) that we should add a passage
to the manpage to NOT ruin your mails with it until this is fixed.
(Mail header parsing in general, ispref, ishead, fixed headerline count
in special)
*phew* I dunno, probably there's more problems to this than just the
obvious ones I just could think about :)
Regards,
-Martin
From: Thomas Klausner <wiz@netbsd.org>
To: Martin Weber <Ephaeton@gmx.net>
Cc: gnats-bugs@gnats.netbsd.org, gnats-admin@netbsd.org
Subject: Re: bin/24014
Date: Sat, 10 Jan 2004 17:24:16 +0100
On Sat, Jan 10, 2004 at 05:07:01PM +0100, Martin Weber wrote:
> in prefix(), first step over all spaces, so cp points to the
> start of non-space. Then we check To, Cc etc. with ispref.
Oops. Sorry, you are right.
I expect your patches for all the problems you found ;)
Thomas
From: Martin Weber <Ephaeton@gmx.net>
To: Thomas Klausner <wiz@netbsd.org>
Cc: gnats-bugs@gnats.netbsd.org, gnats-admin@netbsd.org
Subject: Re: bin/24014
Date: Sat, 10 Jan 2004 17:28:29 +0100
On Sat, Jan 10, 2004 at 05:24:16PM +0100, Thomas Klausner wrote:
> On Sat, Jan 10, 2004 at 05:07:01PM +0100, Martin Weber wrote:
> > in prefix(), first step over all spaces, so cp points to the
> > start of non-space. Then we check To, Cc etc. with ispref.
>
> Oops. Sorry, you are right.
>
> I expect your patches for all the problems you found ;)
> Thomas
Hmpf :P Okay. (note, I found all these problems while thinking
about whether you're right, so I first have to write it up)
-Martin
From: Martin Weber <Ephaeton@gmx.net>
To: Thomas Klausner <wiz@netbsd.org>
Cc: gnats-bugs@gnats.netbsd.org, gnats-admin@netbsd.org
Subject: Re: bin/24014
Date: Sat, 10 Jan 2004 18:32:57 +0100
On Sat, Jan 10, 2004 at 05:24:16PM +0100, Thomas Klausner wrote:
> (...)
> I expect your patches for all the problems you found ;)
Okidokay.
I've nuked ispref & usage of mail.c's stuff. Instead I've implemented
an own ishead() for fmt.c, (Can't we write a nbsd util lib for searching
through mboxes, outside of mail, mailx or whatever ?:) as follows:
If we're on the first line, and the string starts with either "From "
or "From:", then we assume this one is a header, and assume anything
following is a header until we find no header anymore. Things on 2nd+
line are considered headers if string regmatches /^[^ \t]+:/.
Patch:
================================================================================
Index: Makefile
===================================================================
RCS file: /pub/NetBSD-CVS/src/usr.bin/fmt/Makefile,v
retrieving revision 1.7
diff -u -r1.7 Makefile
--- Makefile 2002/09/19 11:31:21 1.7
+++ Makefile 2004/01/10 17:31:18
@@ -4,9 +4,7 @@
.include <bsd.own.mk>
PROG= fmt
-SRCS= fmt.c head.c
+SRCS= fmt.c
WARNS= 2
-
-.PATH: ${NETBSDSRCDIR}/usr.bin/mail
.include <bsd.prog.mk>
Index: fmt.c
===================================================================
RCS file: /pub/NetBSD-CVS/src/usr.bin/fmt/fmt.c,v
retrieving revision 1.17
diff -u -r1.17 fmt.c
--- fmt.c 2003/08/07 11:13:47 1.17
+++ fmt.c 2004/01/10 17:31:18
@@ -69,13 +69,10 @@
int max_length; /* Max line length in output */
int pfx; /* Current leading blank count */
int lineno; /* Current input line */
-int mark; /* Last place we saw a head line */
+int headermark; /* Still in (email-)headers ? */
int center;
-char *headnames[] = {"To", "Subject", "Cc", 0};
-
static void fmt(FILE *);
-static int ispref(const char *, const char *);
static void leadin(void);
static void oflush(void);
static void pack(const char *, int);
@@ -83,8 +80,8 @@
static void setout(void);
static void split(const char *, int);
static void tabulate(char *);
+static int ishead(const char *, int);
-int ishead(const char *);
int main(int, char **);
/*
@@ -104,7 +101,7 @@
max_length = MAX_LENGTH;
setout();
lineno = 1;
- mark = -10;
+ headermark = 1;
setlocale(LC_ALL, "");
@@ -290,7 +287,6 @@
prefix(const char line[], int add_space)
{
const char *cp;
- char **hp;
int np, h;
if (strlen(line) == 0) {
@@ -308,15 +304,10 @@
*/
if (np != pfx && (np > pfx || abs(pfx-np) > 8))
oflush();
- if ((h = ishead(cp)) != 0)
- oflush(), mark = lineno;
- if (lineno - mark < 3 && lineno - mark > 0)
- for (hp = &headnames[0]; *hp != (char *) 0; hp++)
- if (ispref(*hp, cp)) {
- h = 1;
- oflush();
- break;
- }
+ if (headermark && (h = ishead(cp, lineno)))
+ oflush();
+ else if (!h)
+ headermark=0;
if (!h && (h = (*cp == '.')))
oflush();
pfx = np;
@@ -520,13 +511,23 @@
}
/*
- * Is s1 a prefix of s2??
+ * Is s a mailheader line ?
+ * On line == 1 --> if first four chars are "From " or "From :" -> yes
+ * on line > 1 --> if =~ /^[^ \t]+:/ -> yes
*/
static int
-ispref(const char *s1, const char *s2)
+ishead(const char *s, int line)
{
+ const char *cp;
- while (*s1++ == *s2)
- ;
- return (*s1 == '\0');
+ if (s == NULL)
+ return 0;
+
+ if (line == 1)
+ return (strncmp(s, "From", 4) == 0 && (*(s+4) == ':' || *(s+4) == ' '));
+ else
+ for (cp = s; *cp && !isspace((int) *cp); cp++)
+ if (cp != s && *cp == ':')
+ return 1;
+ return 0;
}
================================================================================
Regards,
-Martin
From: Martin Weber <Ephaeton@gmx.net>
To: Thomas Klausner <wiz@netbsd.org>
Cc: gnats-bugs@gnats.netbsd.org, gnats-admin@netbsd.org
Subject: Re: bin/24014
Date: Sat, 10 Jan 2004 19:48:36 +0100
Hoi,
warning: previous patch had an oversight (uninit var). Now it works
(again) as it should. On a side note, -C ("center") processing
is totally broken :p
% fmt 10 15
bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla
bla bla bla
bla bla bla
bla bla bla
bla bla bla
bla bla bla
bla bla bla
bla bla bla
bla bla bla
% fmt -C 10 15
bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla
bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla
No patch for that (yet - and surely not today). Other (updated) patch is:
================================================================================
Index: Makefile
===================================================================
RCS file: /pub/NetBSD-CVS/src/usr.bin/fmt/Makefile,v
retrieving revision 1.7
diff -u -r1.7 Makefile
--- Makefile 2002/09/19 11:31:21 1.7
+++ Makefile 2004/01/10 18:47:04
@@ -4,9 +4,7 @@
.include <bsd.own.mk>
PROG= fmt
-SRCS= fmt.c head.c
+SRCS= fmt.c
WARNS= 2
-
-.PATH: ${NETBSDSRCDIR}/usr.bin/mail
.include <bsd.prog.mk>
Index: fmt.c
===================================================================
RCS file: /pub/NetBSD-CVS/src/usr.bin/fmt/fmt.c,v
retrieving revision 1.17
diff -u -r1.17 fmt.c
--- fmt.c 2003/08/07 11:13:47 1.17
+++ fmt.c 2004/01/10 18:47:04
@@ -69,13 +69,10 @@
int max_length; /* Max line length in output */
int pfx; /* Current leading blank count */
int lineno; /* Current input line */
-int mark; /* Last place we saw a head line */
+int headermark; /* Still in (email-)headers ? */
int center;
-char *headnames[] = {"To", "Subject", "Cc", 0};
-
static void fmt(FILE *);
-static int ispref(const char *, const char *);
static void leadin(void);
static void oflush(void);
static void pack(const char *, int);
@@ -83,8 +80,8 @@
static void setout(void);
static void split(const char *, int);
static void tabulate(char *);
+static int ishead(const char *, int);
-int ishead(const char *);
int main(int, char **);
/*
@@ -104,7 +101,7 @@
max_length = MAX_LENGTH;
setout();
lineno = 1;
- mark = -10;
+ headermark = 1;
setlocale(LC_ALL, "");
@@ -290,12 +287,12 @@
prefix(const char line[], int add_space)
{
const char *cp;
- char **hp;
- int np, h;
+ int np, h=0;
if (strlen(line) == 0) {
oflush();
putchar('\n');
+ if (headermark) headermark=0;
return;
}
for (cp = line; *cp == ' '; cp++)
@@ -308,15 +305,10 @@
*/
if (np != pfx && (np > pfx || abs(pfx-np) > 8))
oflush();
- if ((h = ishead(cp)) != 0)
- oflush(), mark = lineno;
- if (lineno - mark < 3 && lineno - mark > 0)
- for (hp = &headnames[0]; *hp != (char *) 0; hp++)
- if (ispref(*hp, cp)) {
- h = 1;
- oflush();
- break;
- }
+ if (headermark && (h = ishead(cp, lineno)))
+ oflush();
+ else if (headermark)
+ headermark=0;
if (!h && (h = (*cp == '.')))
oflush();
pfx = np;
@@ -520,13 +512,23 @@
}
/*
- * Is s1 a prefix of s2??
+ * Is s a mailheader line ?
+ * On line == 1 --> if first four chars are "From " or "From :" -> yes
+ * on line > 1 --> if =~ /^[^ \t]+:/ -> yes
*/
static int
-ispref(const char *s1, const char *s2)
+ishead(const char *s, int line)
{
+ const char *cp;
- while (*s1++ == *s2)
- ;
- return (*s1 == '\0');
+ if (s == NULL)
+ return 0;
+
+ if (line == 1)
+ return (strncmp(s, "From", 4) == 0 && (*(s+4) == ':' || *(s+4) == ' '));
+ else
+ for (cp = s; *cp && !isspace((int) *cp); cp++)
+ if (cp != s && *cp == ':')
+ return 1;
+ return 0;
}
================================================================================
Sorry for the previously broken one (uninitialized vars suck).
-martin
State-Changed-From-To: feedback->suspended
State-Changed-By: wiz
State-Changed-When: Tue Jan 13 21:54:23 UTC 2004
State-Changed-Why:
Martin wants to make some major changes -- suspend the
PR for now.
Responsible-Changed-From-To: bin-bug-people->wiz
Responsible-Changed-By: wiz
Responsible-Changed-When: Tue Jan 13 21:54:23 UTC 2004
Responsible-Changed-Why:
I'll take care of it.
Responsible-Changed-From-To: wiz->bin-bug-people
Responsible-Changed-By: wiz@netbsd.org
Responsible-Changed-When: Mon, 21 Mar 2005 11:29:38 +0000
Responsible-Changed-Why:
No progress, and I lost interest for now.
State-Changed-From-To: suspended->feedback
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Mon, 27 Apr 2009 06:47:33 +0000
State-Changed-Why:
What's the state of this nowadays?
From: "Martin S. Weber" <Ephaeton@gmx.net>
To: gnats-bugs@NetBSD.org
Cc: gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, dholland@NetBSD.org
Subject: Re: bin/24014 (fmt's ispref() steps only its first variable, leading to wrong results.)
Date: Mon, 25 May 2009 16:13:13 -0400
On Mon, Apr 27, 2009 at 06:47:33AM +0000, dholland@NetBSD.org wrote:
> Synopsis: fmt's ispref() steps only its first variable, leading to wrong results.
>
> State-Changed-From-To: suspended->feedback
> State-Changed-By: dholland@NetBSD.org
> State-Changed-When: Mon, 27 Apr 2009 06:47:33 +0000
> State-Changed-Why:
> What's the state of this nowadays?
Everything claimed broken in the PR still is. Probably part of my old patches, too,
although they seemed to improve the situation back then, obviously. I probably should
do something about this, finally, after all the years (sigh), like, polish the patches
once more, offer regression tests etc.
For now, something copy-paste'able that might become a regression test somewhen:
# I'm using ksh here
100blas() { for i in $(jot 100 1) ; do printf "bla " ; done ; echo ;}
# 1. fmt -C is broken
100blas > fmt-input1
fmt 10 15 < fmt-input1
fmt -C 10 15 < fmt-input1
# 2. fmt for formatting mails is broken
touch fmt-input2
echo "From myself" >> fmt-input2
echo "To: $(100blas)" >> fmt-input2
echo "Subject: $(100blas)" >> fmt-input2
echo "Cc: $(100blas)" >> fmt-input2
echo "Reply-To: $(100blas)" >> fmt-input2
fmt < fmt-input2
Regards,
-Martin
State-Changed-From-To: feedback->open
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Mon, 25 May 2009 20:35:09 +0000
State-Changed-Why:
Sigh, all broke.
>Unformatted:
(Contact us)
$NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2007
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.