NetBSD Problem Report #55436
From jklowden@schemamania.org Tue Jun 30 19:43:29 2020
Return-Path: <jklowden@schemamania.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id E3DB51A9217
for <gnats-bugs@gnats.NetBSD.org>; Tue, 30 Jun 2020 19:43:29 +0000 (UTC)
Message-Id: <20200630182714.94092256FB74@mail.schemamania.org>
Date: Tue, 30 Jun 2020 14:27:14 -0400 (EDT)
From: jklowden@schemamania.org
Reply-To: jklowden@schemamania.org
To: gnats-bugs@NetBSD.org
Subject: strptime does not process %G or %V
X-Send-Pr-Version: 3.95
>Number: 55436
>Category: lib
>Synopsis: strptime does not process %G or %V
>Confidential: no
>Severity: serious
>Priority: low
>Responsible: lib-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Tue Jun 30 19:45:00 +0000 2020
>Last-Modified: Wed Jun 09 04:45:03 +0000 2021
>Originator: jklowden
>Release: NetBSD 7.0
>Organization:
self
>Environment:
System: NetBSD oak.schemamania.org 7.0 NetBSD 7.0 (GENERIC.201509250726Z) amd64
Architecture: x86_64
Machine: amd64
>Description:
strptime(3) claims to honor %G and %V for ISO dates. In fact, both metacharacters are processed but ignored. strptime returns a pointer indicating the input was accepted, but the tm structure is unchanged.
>How-To-Repeat:
The attached program accepts three arguments: strptime format, strftime format, and a datestring. It prints the contents of the strptime struct tm output, and the strftime output from that tm. One example use:
$ t/strpftime W%V %D W26
tm = {
tm_sec = 0,
tm_min = 0,
tm_hour = 0,
tm_mday = 0,
tm_mon = 0,
tm_year = 0,
tm_wday = 0,
tm_yday = 0,
tm_isdst = 0 }
01/00/00 ('' unparsed)
It is perhaps interesting that the GNU strptime exhibits similar behavior.
>Fix:
No fix. Demonstration program:
[snip]
#include <ctype.h>
#include <err.h>
#include <getopt.h>
#include <libgen.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>
#include <sys/types.h>
extern char *optarg;
int
main(int argc, char *argv[]) {
	char output[80];
	struct tm tm = {};

	if( argc < 4 ) {
		errx(EXIT_FAILURE, "syntax %s: strptime_fmt strftime_fmt datetime", argv[0]);
	}

	const char *ifmt = argv[1], *pend;
	const char *ofmt = argv[2];
	const char *input = argv[3];

	if( (pend = strptime(input, ifmt, &tm)) == NULL ) {
		errx(EXIT_FAILURE, "could not parse '%s' with '%s'", input, ifmt);
	}

	const char fmt[] =
		" tm = {\n"
		" tm_sec = %d, \n"
		" tm_min = %d, \n"
		" tm_hour = %d, \n"
		" tm_mday = %d, \n"
		" tm_mon = %d, \n"
		" tm_year = %d, \n"
		" tm_wday = %d, \n"
		" tm_yday = %d, \n"
		" tm_isdst = %d }\n";

	printf(fmt,
	    tm.tm_sec,
	    tm.tm_min,
	    tm.tm_hour,
	    tm.tm_mday,
	    tm.tm_mon,
	    tm.tm_year,
	    tm.tm_wday,
	    tm.tm_yday,
	    tm.tm_isdst);

	strftime(output, sizeof(output), ofmt, &tm);
	printf("%s ('%s' unparsed)\n", output, pend);

	return EXIT_SUCCESS;
}
[pins]
>Audit-Trail:
From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: lib/55436: strptime does not process %G or %V
Date: Wed, 01 Jul 2020 12:37:23 +0700
Date: Tue, 30 Jun 2020 19:45:00 +0000 (UTC)
From: jklowden@schemamania.org
Message-ID: <20200630194500.7AF2E1A9228@mollari.NetBSD.org>
| strptime(3) claims to honor %G and %V for ISO dates.
Note that even though strptime(3) says that both %G and %V are NetBSD
extensions, they are going to be in the next POSIX standard.
About %G POSIX will say ...
G The week-based year (see below) as a decimal number (for example, 1977).
Leading zeros shall be permitted but shall not be required. A leading
'+' or '-' character shall be permitted before any leading zeros but
shall not be required. The effect of this year, if any, on the tm
structure pointed to by tm is unspecified.
Note the final sentence.
About %V POSIX will say ....
V The week number of the week-based year (see below) as a decimal number
[01,53].
Leading zeros shall be permitted but shall not be required. The effect
of this week number, if any, on the tm structure pointed to by tm is
unspecified.
Same (or similar) final sentence.
The same caveat applies to %g and %z (both also new) and has been added to
the specifications of %U and %W. It also appears in a modified form in the
newly added %Z (different form, as that is required to set tm_isdst in a
defined way, but any other effects are unspecified).
The point of all of these is not so much that they can be used to parse
a time string and extract info from them, but so the output from strftime
can be read, without error, by a suitably constructed strptime specification.
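The round trip described above can be sketched as follows. This is only
an illustration, not code from the PR: iso_roundtrip_accepts() is a
hypothetical helper name, the "%G-W%V-%u" format is an arbitrary choice,
and the sketch assumes a libc whose strptime(3) accepts %G, %V and %u.
It checks only that the string is consumed without error; per the POSIX
text quoted above, what ends up in the struct tm is unspecified, so the
parsed structure is deliberately not inspected.

```c
#define _XOPEN_SOURCE 700 /* asks glibc to declare strptime(); must precede all includes */
#include <assert.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper: format *in as an ISO week date with strftime(),
 * then check that strptime() accepts that text with the same format.
 * Returns 1 if the whole string was consumed, 0 otherwise.
 * buf receives the formatted text so the caller can inspect it.
 */
int
iso_roundtrip_accepts(const struct tm *in, char *buf, size_t buflen)
{
	struct tm scratch;
	const char *end;

	assert(in != NULL && buf != NULL);

	if (strftime(buf, buflen, "%G-W%V-%u", in) == 0)
		return 0;

	/* The contents of scratch after parsing are unspecified for %G/%V. */
	memset(&scratch, 0, sizeof(scratch));
	end = strptime(buf, "%G-W%V-%u", &scratch);

	return end != NULL && *end == '\0';
}
```

A caller would fill a struct tm (for example via mktime(3)), pass it in
with a buffer of about 32 bytes, and treat a return of 1 as "the output
of strftime was readable", nothing more.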
| It is perhaps interesting that the GNU strptime exhibits similar behavior.
Not so much really.
I don't think there is any bug here.
kre
From: "James K. Lowden" <jklowden@schemamania.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: lib/55436: strptime does not process %G or %V
Date: Wed, 1 Jul 2020 18:57:41 -0400
On Wed, 1 Jul 2020 05:40:02 +0000 (UTC)
Robert Elz <kre@munnari.OZ.AU> wrote:
> The point of all of these is not so much that they can be used to
> parse a time string and extract info from them, but so the output
> from strftime can be read, without error, by a suitably constructed
> strptime specification.
Thank you very much for the explanation. I'm astonished by POSIX's
rationale. I don't see how skipping over metacharacter sequences
without parsing them into the output helps anyone. It's a recipe for
error.
I note that on my system, no mention is made on the man page for
strptime that %G and %V are accepted but inoperative.
I just spent two days working around this ... feature. Would NetBSD be
interested in a patch that makes the unspecified result useful?
--jkl
From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: lib/55436: strptime does not process %G or %V
Date: Thu, 02 Jul 2020 22:20:15 +0700
Date: Wed, 1 Jul 2020 23:00:02 +0000 (UTC)
From: "James K. Lowden" <jklowden@schemamania.org>
Message-ID: <20200701230002.13BEB1A9218@mollari.NetBSD.org>
POSIX doesn't invent rules and force people to follow them; it documents
what actually works, so that you know what you can expect to happen.
It isn't that anyone doesn't want the data to be useful, just that the
implementations (or enough of them) don't use it, so pretending that
applications can make use of it would not help anyone.
| Would NetBSD be interested in a patch that makes the unspecified
| result useful?
"unspecified" means that we can do something meaningful if something
meaningful can be found.
Interested might depend upon how complex the patch turns out to be, and
how much it complicates later updates.
kre
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: lib/55436: strptime does not process %G or %V
Date: Wed, 9 Jun 2021 04:43:50 +0000
not sent to gnats
------
From: Brian Ginsbach <ginsbach@netbsd.org>
To: "James K. Lowden" <jklowden@schemamania.org>
Cc: lib-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: lib/55436: strptime does not process %G or %V
Date: Thu, 2 Jul 2020 18:23:43 +0000
On Wed, Jul 01, 2020 at 11:00:02PM +0000, James K. Lowden wrote:
> The following reply was made to PR lib/55436; it has been noted by GNATS.
>
> From: "James K. Lowden" <jklowden@schemamania.org>
> To: gnats-bugs@netbsd.org
> Cc:
> Subject: Re: lib/55436: strptime does not process %G or %V
> Date: Wed, 1 Jul 2020 18:57:41 -0400
>
> On Wed, 1 Jul 2020 05:40:02 +0000 (UTC)
> Robert Elz <kre@munnari.OZ.AU> wrote:
>
> > The point of all of these is not so much that they can be used to
> > parse a time string and extract info from them, but so the output
> > from strftime can be read, without error, by a suitably constructed
> > strptime specification.
>
> Thank you very much for the explanation. I'm astonished by POSIX's
> rationale. I don't see how skipping over metacharacter sequences
> without parsing them into the output helps anyone. It's a recipe for
> error.
This is a very good explanation by kre. Again, originally strptime(3)
was not a 1-to-1 match for strftime(3) regarding format specifiers.
True, not parsing can be a recipe for error, but so can parsing
without enough data.
>
> I note that on my system, no mention is made on the man page for
> strptime that %G and %V are accepted but inoperative.
>
> I just spent two days working around this ... feature. Would NetBSD be
> interested in a patch that makes the unspecified result useful?
I have worked on NetBSD's strptime(3) off and on over the years.
The last time I surveyed OSS implementations of strptime(3) there
were no implementations that did anything useful with either %G or
%V. Hence, in part, why NetBSD's and the GNU C library's strptime(3)
behave similarly for these two.
However, I do have local changes that will help make %G and %V
useful in some cases. Just haven't had the time to commit the
code. NetBSD 9.0 has a lot of changes to strptime(3), including a
framework that makes it easier to, under the right circumstances,
handle %G and %V, rather than simply skipping them.
Unfortunately, as kre pointed out in the POSIX text, there are
cases where it is not possible to fill in a struct tm. Your test
case is a perfect example. What year should be used when the only
data parsed is W26 into W%V? Current year? Last year? Here is a
comment from my uncommitted code that illustrates some of these
issues:
/*
* N.B. mixing ISO and non-ISO conversion
* specifiers is undefined. We convert:
* %U with %[Gg] same as %U with %[Yy]
* %V with %[Yy] same as %V with %[Gg]
* %W with %[Gg] same as %V with %[Gg] (week > 0)
* %W with %[Gg] same as %W with %[Yy] (week == 0)
*/
If you have changes against the latest NetBSD version of strptime(3)
please send them along. I will see how they compare with my
uncommitted changes.
Brian
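
For the case where enough data is available, the conversion from an ISO
week date to a calendar date can be sketched in application code. This
is not the uncommitted NetBSD change mentioned above, and
iso_week_to_tm() is a hypothetical helper, not a libc function. It
assumes the caller supplies the week-based year explicitly, which is
exactly the piece that an input such as "W26" parsed with "W%V" lacks.
It relies on the ISO 8601 rule that January 4th always falls in week 1,
and on mktime(3) normalizing an out-of-range tm_mday.

```c
#include <assert.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper: convert an ISO 8601 week date to a normalized
 * struct tm (local time, noon).
 *   gyear  week-based year, e.g. 2020
 *   week   ISO week number, 1..53
 *   wday   ISO weekday, 1 = Monday .. 7 = Sunday
 * Returns 0 on success, -1 on bad input or mktime() failure.
 * Week 53 of a year that has only 52 weeks is not rejected; it simply
 * normalizes to a date in the following week-based year.
 */
int
iso_week_to_tm(int gyear, int week, int wday, struct tm *out)
{
	struct tm jan4;
	int jan4_wday;

	assert(out != NULL);

	if (week < 1 || week > 53 || wday < 1 || wday > 7)
		return -1;

	/* January 4th always lies in ISO week 1; find its weekday. */
	memset(&jan4, 0, sizeof(jan4));
	jan4.tm_year = gyear - 1900;
	jan4.tm_mon = 0;
	jan4.tm_mday = 4;
	jan4.tm_hour = 12;	/* noon keeps DST shifts from changing the date */
	jan4.tm_isdst = -1;
	if (mktime(&jan4) == (time_t)-1)
		return -1;
	jan4_wday = (jan4.tm_wday == 0) ? 7 : jan4.tm_wday;

	/*
	 * Monday of week 1 is January (4 - (jan4_wday - 1)); add whole
	 * weeks and the weekday offset, then let mktime() normalize the
	 * possibly out-of-range day of January into a real date.
	 */
	memset(out, 0, sizeof(*out));
	out->tm_year = gyear - 1900;
	out->tm_mon = 0;
	out->tm_mday = 4 - (jan4_wday - 1) + (week - 1) * 7 + (wday - 1);
	out->tm_hour = 12;
	out->tm_isdst = -1;

	return (mktime(out) == (time_t)-1) ? -1 : 0;
}
```

With the year supplied, iso_week_to_tm(2020, 26, 1, &tm) yields Monday,
22 June 2020. Without a year there is no single right answer, which is
the ambiguity raised above.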
Copyright © 1994-2020
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.