NetBSD Problem Report #1880

From gnats  Sun Dec 31 20:16:06 1995
Received: from slate.Mines.EDU by pain.lcs.mit.edu (8.6.12/8.6.9) with SMTP id UAA00941 for <gnats-bugs@gnats.netbsd.org>; Sun, 31 Dec 1995 20:09:16 -0500
Message-Id: <9601010107.AA17380@geek.Mines.EDU>
Date: Sun, 31 Dec 95 18:07:37 -0700
From: "James E. Bernard" <jbernard@geek.Mines.EDU>
To: gnats-bugs@gnats.netbsd.org
Cc: jbernard@slate.Mines.EDU
Subject: mail has a too-limited implementation of string quoting

>Number:         1880
>Category:       bin
>Synopsis:       mail has a too-limited implementation of string quoting
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    bin-bug-people
>State:          closed
>Class:          change-request
>Submitter-Id:   net
>Arrival-Date:   Sun Dec 31 20:20:03 +0000 1995
>Closed-Date:    Fri May 21 20:33:57 +0000 2021
>Last-Modified:  Fri May 21 20:33:57 +0000 2021
>Originator:     Jim Bernard
>Release:        1.1
>Organization:
	Speaking for myself
>Environment:
System: NetBSD zoo 1.1 NetBSD 1.1 (ZOO) #0: Sun Dec 3 12:56:42 MST 1995 local@zoo:/home/local/netbsd-1.1/usr/src/sys/arch/i386/compile/ZOO i386


>Description:
	Strings in ~/.mailrc cannot be protected from metacharacter interpretation.
	For example, I find it useful to pass egrep-style regular expressions to
	the PAGER, such as:
		set PAGER="less -i -c -p'^Message |^To:|^From:|^Subject:'"
	but mail insists on interpreting ^M, ^T, ^F, and ^S as control characters
	unless excessive amounts of escaping are used, e.g.:
		set PAGER="less -i -c -p'\\\^Message |\\\^To:|\\\^From:|\\\^Subject:'"
	(which achieves the desired effect, but is horrible and prevents sharing
	of ~/.mailrc with SunOS mail).
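	The control-character mapping can be illustrated with a short C sketch.
	It is not the code in mail: expand_carets is a hypothetical helper, and
	the c & 037 mapping is the conventional caret notation, assumed here to
	be what mail applies.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not the code in mail: expand ^X sequences into
 * control characters.  The c & 037 mapping is an assumption.
 */
void
expand_carets(const char *src, char *dst, size_t dstlen)
{
	size_t n = 0;

	assert(src != NULL && dst != NULL && dstlen > 0);
	while (*src != '\0' && n + 1 < dstlen) {
		char c = *src++;

		if (c == '^' && ((*src >= 'A' && *src <= '_') ||
		    (*src >= 'a' && *src <= 'z')))
			dst[n++] = *src++ & 037;	/* ^M -> 0x0d, ^T -> 0x14 */
		else
			dst[n++] = c;			/* everything else is copied */
	}
	dst[n] = '\0';
}
```

	Under this mapping, "^Message" turns into a carriage return followed by
	"essage", so the start-of-line anchor is gone before the pattern is
	passed on.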
>How-To-Repeat:
	Install less-290, patched to accept extended regular expressions:

--- search.c-dist	Thu Mar  9 23:04:24 1995
+++ search.c	Wed Dec 27 13:30:42 1995
@@ -248,7 +248,7 @@
 {
 #if HAVE_POSIX_REGCOMP
 	regex_t *s = (regex_t *) ecalloc(1, sizeof(regex_t));
-	if (regcomp(s, pattern, 0))
+	if (regcomp(s, pattern, REG_EXTENDED))
 	{
 		free(s);
 		error("Invalid pattern", NULL_PARG);

	Add to your ~/.mailrc the following lines:
		set crt=22
		set PAGER="less -i -c -p'^Message |^To:|^From:|^Subject:'"
	and try to read a mail message longer than 22 lines.  Exciting things
	will happen, including creation of a log file whose name begins "sage ".

	Alternatively, more can be used, also patched to support extended
	regular expressions (a PR on this was submitted previously):

--- prim.c-dist	Fri Oct 13 21:19:06 1995
+++ prim.c	Thu Dec 28 18:28:02 1995
@@ -640,7 +640,7 @@
 		}
 		else
 			regfree(cpattern);
-		if (regcomp(cpattern, pattern, 0))
+		if (regcomp(cpattern, pattern, REG_EXTENDED))
 		{
 			error("Invalid pattern");
 			return(0);
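
	The effect of the REG_EXTENDED flag in both patches can be checked with
	a small POSIX regex(3) sketch.  pattern_matches is a hypothetical helper,
	not part of less or more; regcomp, regexec, and regfree are the standard
	POSIX calls.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: return 1 if text matches pattern compiled with
 * the given regcomp() flags, 0 if it does not, -1 if the pattern is bad.
 */
int
pattern_matches(const char *pattern, const char *text, int cflags)
{
	regex_t re;
	int rc;

	assert(pattern != NULL && text != NULL);
	if (regcomp(&re, pattern, cflags | REG_NOSUB) != 0)
		return -1;
	rc = regexec(&re, text, 0, NULL, 0);
	regfree(&re);
	return rc == 0;
}
```

	Compiled with flags 0 (a basic regular expression), the | in
	"^To:|^From:" is an ordinary character, so "From: jim" does not match.
	With REG_EXTENDED it is alternation and the same line matches.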
>Fix:
	The problem is that getrawlist (used both to split strings from
	the ~/.mailrc file and to split args to exec external programs)
	always processes metacharacters regardless of whether single or
	double quotes are used.  The patch below adds support for literal
	processing of text between single quotes, whether they appear as
	outer quotes, or embedded within text between double quotes.  If
	the single quotes are embedded in text between double quotes, they
	are preserved in the output; that is, exactly one layer of quoting
	is stripped when getrawlist processes a string.  (In the above
	example, the double quotes are stripped off when ~/.mailrc is read,
	and the single quotes are stripped off just prior to invoking PAGER.)

--- list.c-dist	Fri Oct 13 21:16:01 1995
+++ list.c	Thu Dec 28 16:28:19 1995
@@ -391,6 +391,7 @@
 	int  argc;
 {
 	register char c, *cp, *cp2, quotec;
+	int quotel;
 	int argn;
 	char linebuf[BUFSIZ];

@@ -408,12 +409,22 @@
 		}
 		cp2 = linebuf;
 		quotec = '\0';
+		quotel = -1;
 		while ((c = *cp) != '\0') {
 			cp++;
-			if (quotec != '\0') {
-				if (c == quotec)
-					quotec = '\0';
-				else if (c == '\\')
+			if (quotel != -1) {
+				if (c == quotec) {
+					if (quotel-- != 0) {
+						*cp2++ = c;
+						quotec = '"';
+					} else
+						quotec = '\0';
+				} else if (quotec == '\'')
+					*cp2++ = c;
+				else if (c == '\'') {
+					*cp2++ = quotec = c;
+					quotel = 1;
+				} else if (c == '\\')
 					switch (c = *cp++) {
 					case '\0':
 						*cp2++ = '\\';
@@ -463,9 +474,10 @@
 					}
 				} else
 					*cp2++ = c;
-			} else if (c == '"' || c == '\'')
+			} else if (c == '"' || c == '\'') {
 				quotec = c;
-			else if (c == ' ' || c == '\t')
+				quotel = 0;
+			} else if (c == ' ' || c == '\t')
 				break;
 			else
 				*cp2++ = c;
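
	The state machine in the patch can be hard to follow in diff form.  The
	following is a simplified sketch of the intended behavior for a single
	argument, not the patched getrawlist itself: strip_one_layer is a
	hypothetical name, backslash escapes and word splitting are omitted, and
	the caret mapping (c & 037) is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Simplified sketch of the quoting rules the patch aims for, applied to
 * one argument.  Hypothetical helper; not the patched getrawlist.
 */
void
strip_one_layer(const char *src, char *dst, size_t dstlen)
{
	size_t n = 0;
	char outer = '\0';	/* quote that opened the argument, if any */
	int inner = 0;		/* inside '...' nested within "..." */

	assert(src != NULL && dst != NULL && dstlen > 0);
	while (*src != '\0' && n + 1 < dstlen) {
		char c = *src++;

		if (outer == '\0') {
			if (c == '"' || c == '\'')
				outer = c;		/* opening quote is stripped */
			else
				dst[n++] = c;
		} else if (outer == '\'') {
			if (c == '\'')
				outer = '\0';		/* closing quote is stripped */
			else
				dst[n++] = c;		/* literal text */
		} else if (inner) {
			dst[n++] = c;			/* literal; closing ' is kept */
			if (c == '\'')
				inner = 0;
		} else if (c == '"') {
			outer = '\0';			/* closing quote is stripped */
		} else if (c == '\'') {
			dst[n++] = c;			/* embedded ' is preserved */
			inner = 1;
		} else if (c == '^' && *src >= 'A' && *src <= '_') {
			dst[n++] = *src++ & 037;	/* assumed caret mapping */
		} else
			dst[n++] = c;
	}
	dst[n] = '\0';
}
```

	Given the double-quoted PAGER value "less -p'^To:'", the sketch yields
	less -p'^To:'.  Given -p'^To:' on a second pass, it yields -p^To:, which
	mirrors the two passes described above.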

>Release-Note:
>Audit-Trail:

State-Changed-From-To: open->analyzed
State-Changed-By: jtc
State-Changed-When: Tue Jan  9 21:34:47 1996
State-Changed-Why:
Submitter was correct in pointing at getrawlist, but I think his
implementation may be lacking too.

POSIX.2 (Section 4.40.7.2) has the following passages about mailx quoting:

- An argument can be enclosed between paired double quotes (" ") or
  single quotes (' '); any white space, shell word expansion, or
  backslash characters within the quotes shall be treated literally as
  part of the argument.  A double quote shall be treated literally
  within single quotes and vice versa.  These special properties of
  quote marks shall occur only when they are paired at the beginning and
  end of the argument.

- A backslash outside the enclosing quotes shall be discarded and the
  following character treated literally as part of the argument.

- An unquoted backslash at the end of a command line shall be discarded
  and the next line shall continue the command.
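
For comparison, the first two rules can be sketched in C roughly as
follows.  This is an illustration of the quoted text only, not code from
mail or from the standard: posix_next_arg is a hypothetical helper, and
the line-continuation rule is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy the next argument of line into dst using
 * the first two rules quoted above, and return a pointer just past it.
 */
const char *
posix_next_arg(const char *line, char *dst, size_t dstlen)
{
	const char *p = line;
	size_t n = 0;

	assert(line != NULL && dst != NULL && dstlen > 0);
	while (*p == ' ' || *p == '\t')
		p++;
	if (*p == '"' || *p == '\'') {
		const char *end = strchr(p + 1, *p);

		/* Quotes are special only when they enclose the whole argument. */
		if (end != NULL &&
		    (end[1] == '\0' || end[1] == ' ' || end[1] == '\t')) {
			for (p++; p < end && n + 1 < dstlen; p++)
				dst[n++] = *p;	/* literal, including \ and ^ */
			dst[n] = '\0';
			return end + 1;
		}
	}
	while (*p != '\0' && *p != ' ' && *p != '\t' && n + 1 < dstlen) {
		if (*p == '\\' && p[1] != '\0')
			p++;		/* drop the backslash, keep the next char */
		dst[n++] = *p++;
	}
	dst[n] = '\0';
	return p;
}
```

Under these rules a single-quoted pattern such as '^To:|^From:' would
reach the pager unchanged, with no caret or backslash processing at all.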





From: Jim Bernard <jbernard@mines.edu>
To: gnats-bugs@gnats.netbsd.org
Cc:  
Subject: Re: bin/1880: mail has a too-limited implementation of string quoting
Date: Sat, 27 Mar 2004 13:26:39 -0700

 Here's the current version of the patch that I have been using, verified
 to apply to today's source file.  I don't think I have made any functional
 changes, just adaptations to more recent versions of the file.  I haven't
 addressed jtc's POSIX-compatibility concerns, since I don't normally use
 mail now, so there's not much motivation for me to improve it further.

 --- list.c-dist	2003-08-08 03:32:34.000000000 -0600
 +++ list.c	2003-09-17 19:49:41.000000000 -0600
 @@ -383,10 +383,11 @@
   */
  int
  getrawlist(char line[], char **argv, int argc)
  {
  	char c, *cp, *cp2, quotec;
 +	int quotel;
  	int argn;
  	char linebuf[BUFSIZ];

  	argn = 0;
  	cp = line;
 @@ -400,16 +401,26 @@
  			"Too many elements in the list; excess discarded.\n");
  			break;
  		}
  		cp2 = linebuf;
  		quotec = '\0';
 +		quotel = -1;
  		while ((c = *cp) != '\0') {
  			cp++;
 -			if (quotec != '\0') {
 -				if (c == quotec)
 -					quotec = '\0';
 -				else if (c == '\\')
 +			if (quotel != -1) {
 +				if (c == quotec) {
 +					if (quotel-- != 0) {
 +						*cp2++ = c;
 +						quotec = '"';
 +					} else
 +						quotec = '\0';
 +				} else if (quotec == '\'')
 +					*cp2++ = c;
 +				else if (c == '\'') {
 +					*cp2++ = quotec = c;
 +					quotel = 1;
 +				} else if (c == '\\')
  					switch (c = *cp++) {
  					case '\0':
  						*cp2++ = '\\';
  						cp--;
  						break;
 @@ -455,13 +466,14 @@
  						*cp2++ = '^';
  						cp--;
  					}
  				} else
  					*cp2++ = c;
 -			} else if (c == '"' || c == '\'')
 +			} else if (c == '"' || c == '\'') {
  				quotec = c;
 -			else if (c == ' ' || c == '\t')
 +				quotel = 0;
 +			} else if (c == ' ' || c == '\t')
  				break;
  			else
  				*cp2++ = c;
  		}
  		*cp2 = '\0';

From: Christos Zoulas <christos@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/1880 CVS commit: src/usr.bin/mail
Date: Sun,  7 Dec 2008 19:17:09 +0000 (UTC)

 Module Name:	src
 Committed By:	christos
 Date:		Sun Dec  7 19:17:09 UTC 2008

 Modified Files:
 	src/usr.bin/mail: mail.1

 Log Message:
 PR/1880: Jim Bernard: Pass backslash-escaped characters uninterpreted inside
 single-quoted strings. Document new behavior, and its relationship with POSIX.


 To generate a diff of this commit:
 cvs rdiff -r1.52 -r1.53 src/usr.bin/mail/mail.1

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Christos Zoulas <christos@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/1880 CVS commit: src/usr.bin/mail
Date: Sun,  7 Dec 2008 19:21:00 +0000 (UTC)

 Module Name:	src
 Committed By:	christos
 Date:		Sun Dec  7 19:21:00 UTC 2008

 Modified Files:
 	src/usr.bin/mail: list.c

 Log Message:
 PR/1880: Jim Bernard: Don't parse backslash-escaped characters inside
 single-quoted strings.


 To generate a diff of this commit:
 cvs rdiff -r1.23 -r1.24 src/usr.bin/mail/list.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: analyzed->closed
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Fri, 21 May 2021 20:33:57 +0000
State-Changed-Why:
Apparently Christos committed it in 2008.


>Unformatted:
