NetBSD Problem Report #55436

From jklowden@schemamania.org  Tue Jun 30 19:43:29 2020
Return-Path: <jklowden@schemamania.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id E3DB51A9217
	for <gnats-bugs@gnats.NetBSD.org>; Tue, 30 Jun 2020 19:43:29 +0000 (UTC)
Message-Id: <20200630182714.94092256FB74@mail.schemamania.org>
Date: Tue, 30 Jun 2020 14:27:14 -0400 (EDT)
From: jklowden@schemamania.org
Reply-To: jklowden@schemamania.org
To: gnats-bugs@NetBSD.org
Subject: strptime does not process %G or %V
X-Send-Pr-Version: 3.95

>Number:         55436
>Category:       lib
>Synopsis:       strptime does not process %G or %V
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    lib-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Jun 30 19:45:00 +0000 2020
>Last-Modified:  Wed Jun 09 04:45:03 +0000 2021
>Originator:     jklowden
>Release:        NetBSD 7.0
>Organization:
self
>Environment:
System: NetBSD oak.schemamania.org 7.0 NetBSD 7.0 (GENERIC.201509250726Z) amd64
Architecture: x86_64
Machine: amd64
>Description:
	strptime(3) claims to honor %G and %V for ISO dates.  In fact, both conversion specifiers are consumed but ignored: strptime returns a pointer indicating the input was accepted, but the tm structure is left unchanged.
>How-To-Repeat:
	The attached program accepts three arguments: strptime format, strftime format, and a datestring.  It prints the contents of the strptime struct tm output, and the strftime output from that tm. One example use:

  $ t/strpftime W%V %D  W26
    tm = {
      tm_sec = 0, 
      tm_min = 0, 
      tm_hour = 0, 
      tm_mday = 0, 
      tm_mon = 0, 
      tm_year = 0, 
      tm_wday = 0, 
      tm_yday = 0, 
      tm_isdst = 0 }
  01/00/00 ('' unparsed)

It is perhaps interesting that the GNU strptime exhibits similar behavior. 

>Fix:
No fix.  Demonstration program:
[snip]
#include <ctype.h>
#include <err.h>
#include <getopt.h>
#include <libgen.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

#include <sys/types.h>

extern char *optarg;

int
main(int argc, char *argv[]) {
  char output[80];
  struct tm tm = {0};  /* zero-initialize; empty braces are not valid C before C23 */

  if( argc < 4 ) {
    errx(EXIT_FAILURE, "syntax %s: strptime_fmt strftime_fmt datetime", argv[0]);
  }

  const char *ifmt = argv[1], *pend;
  const char *ofmt = argv[2];
  const char *input = argv[3];

  if( (pend = strptime(input, ifmt, &tm)) == NULL ) {
    errx(EXIT_FAILURE, "could not parse '%s' with '%s'", input, ifmt);
  }

  const char fmt[] =
    "  tm = {\n"
    "    tm_sec = %d, \n"
    "    tm_min = %d, \n"
    "    tm_hour = %d, \n"
    "    tm_mday = %d, \n"
    "    tm_mon = %d, \n"
    "    tm_year = %d, \n"
    "    tm_wday = %d, \n"
    "    tm_yday = %d, \n"
    "    tm_isdst = %d }\n";

  printf(fmt,    
	 tm.tm_sec, 
	 tm.tm_min, 
	 tm.tm_hour, 
	 tm.tm_mday, 
	 tm.tm_mon, 
	 tm.tm_year, 
	 tm.tm_wday, 
	 tm.tm_yday, 
	 tm.tm_isdst);

  if( strftime(output, sizeof(output), ofmt, &tm) == 0 ) {
    output[0] = '\0';  /* buffer contents are unspecified when strftime returns 0 */
  }
  printf("%s ('%s' unparsed)\n", output, pend);

  return EXIT_SUCCESS;
}
[pins]
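
Until strptime(3) stores anything for %G and %V, one possible workaround is to pull the ISO fields out of the string by other means and convert them by hand.  The following is a minimal sketch, not a proposed patch: iso_week_to_tm is a hypothetical helper name, and it assumes the caller already has all three of week-based year, week number, and ISO weekday.  It relies on the fact that January 4th always falls in ISO week 1, and on mktime(3) normalizing an out-of-range tm_mday.

```c
#include <assert.h>
#include <time.h>

/*
 * Hypothetical helper: fill *out with the calendar date for ISO 8601
 * week-based year `gyear`, week `week` (1..53) and weekday `wday`
 * (1 = Monday .. 7 = Sunday).  The time of day is left at local noon.
 * Returns 0 on success, -1 on out-of-range input or mktime(3) failure.
 */
int
iso_week_to_tm(int gyear, int week, int wday, struct tm *out)
{
	struct tm jan4 = {0};
	struct tm t = {0};

	assert(out != NULL);
	if (week < 1 || week > 53 || wday < 1 || wday > 7)
		return -1;

	/* January 4th is always in ISO week 1; find its weekday. */
	jan4.tm_year = gyear - 1900;
	jan4.tm_mon = 0;
	jan4.tm_mday = 4;
	jan4.tm_hour = 12;	/* noon stays clear of DST transitions */
	jan4.tm_isdst = -1;
	if (mktime(&jan4) == (time_t)-1)
		return -1;

	/* Days from the Monday of week 1 up to January 4th (Monday == 0). */
	int since_monday = (jan4.tm_wday + 6) % 7;

	/* Express the target as "January N" and let mktime normalize it. */
	t.tm_year = gyear - 1900;
	t.tm_mon = 0;
	t.tm_mday = 4 - since_monday + (week - 1) * 7 + (wday - 1);
	t.tm_hour = 12;
	t.tm_isdst = -1;
	if (mktime(&t) == (time_t)-1)
		return -1;

	*out = t;
	return 0;
}
```

With this sketch, iso_week_to_tm(2020, 26, 1, &tm) yields 2020-06-22, the Monday of week 26.  Note that the report's own example input (W26 alone) supplies only the week, so the year and weekday would still have to come from somewhere else.  Week 53 is not checked against the actual length of the year; in a 52-week year it rolls over into week 1 of the following year.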

>Audit-Trail:
From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: lib/55436: strptime does not process %G or %V
Date: Wed, 01 Jul 2020 12:37:23 +0700

     Date:        Tue, 30 Jun 2020 19:45:00 +0000 (UTC)
     From:        jklowden@schemamania.org
     Message-ID:  <20200630194500.7AF2E1A9228@mollari.NetBSD.org>

   | 	strptime(3) claims to honor %G and %V for ISO dates.

 Note that even though strptime(3) says that both %G and %V are NetBSD
 extensions, they are going to be in the next POSIX standard.

 About %G POSIX will say ...

   G    The week-based year (see below) as a decimal number (for example, 1977).
        Leading zeros shall be permitted but shall not be required. A leading
        '+' or '-' character shall be permitted before any leading zeros but
        shall not be required. The effect of this year, if any, on the tm
        structure pointed to by tm is unspecified.

 Note the final sentence.

 About %V POSIX will say ...

   V    The week number of the week-based year (see below) as a decimal number
        [01,53].
        Leading zeros shall be permitted but shall not be required. The effect
        of this week number, if any, on the tm structure pointed to by tm is
        unspecified.

 Same (or similar) final sentence.

 The same caveat applies to %g and %z (both also new) and has been added to
 the specifications of %U and %W.   It also appears in a modified form in the
 newly added %Z (different form, as that is required to set tm_isdst in a
 defined way, but any other effects are unspecified).

 The point of all of these is not so much that they can be used to parse
 a time string and extract info from them, but so the output from strftime
 can be read, without error, by a suitably constructed strptime specification.
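
The round-trip use described in the paragraph above can be checked with a short sketch along these lines.  roundtrip_accepts is a hypothetical helper name, and the _XOPEN_SOURCE definition is an assumption needed to expose strptime(3) on glibc.  The function only reports whether strptime consumed the whole string that strftime produced; it makes no claim about what ends up in the parsed struct tm.

```c
#define _XOPEN_SOURCE 700	/* assumption: exposes strptime(3) on glibc */
#include <assert.h>
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical helper: format *src with strftime(3) using fmt, then
 * feed the result back through strptime(3) with the same fmt.
 * Returns 1 if strptime consumed the entire string, 0 if it rejected
 * the input or stopped early, -1 if strftime produced no output.
 * If parsed is non-NULL it receives whatever strptime stored, starting
 * from an all-zero struct tm.
 */
int
roundtrip_accepts(const char *fmt, const struct tm *src, struct tm *parsed)
{
	char buf[128];
	struct tm scratch = {0};
	const char *end;

	assert(fmt != NULL && src != NULL);

	if (strftime(buf, sizeof(buf), fmt, src) == 0)
		return -1;

	end = strptime(buf, fmt, &scratch);
	if (parsed != NULL)
		*parsed = scratch;

	return (end != NULL && *end == '\0') ? 1 : 0;
}
```

Called with a format such as "%G-W%V-%u", a return value of 1 says only that the string was accepted.  Per the text quoted above, any effect of %G and %V on the struct tm is unspecified, so the parsed fields should not be relied on.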


   | It is perhaps interesting that the GNU strptime exhibits similar behavior. 

 Not so much really.

 I don't think there is any bug here.

 kre

From: "James K. Lowden" <jklowden@schemamania.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: lib/55436: strptime does not process %G or %V
Date: Wed, 1 Jul 2020 18:57:41 -0400

 On Wed,  1 Jul 2020 05:40:02 +0000 (UTC)
 Robert Elz <kre@munnari.OZ.AU> wrote:

 >  The point of all of these is not so much that they can be used to
 > parse a time string and extract info from them, but so the output
 > from strftime can be read, without error, by a suitably constructed
 > strptime specification. 

 Thank you very much for the explanation.  I'm astonished by POSIX's
 rationale.  I don't see how skipping over metacharacter sequences
 without parsing them into the output helps anyone.  It's a recipe for
 error.  

 I note that on my system, no mention is made on the man page for
 strptime that %G and %V are accepted but inoperative.  

 I just spent two days working around this ... feature.  Would NetBSD be
 interested in a patch that makes the unspecified result useful?  

 --jkl

From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: lib/55436: strptime does not process %G or %V
Date: Thu, 02 Jul 2020 22:20:15 +0700

     Date:        Wed,  1 Jul 2020 23:00:02 +0000 (UTC)
     From:        "James K. Lowden" <jklowden@schemamania.org>
     Message-ID:  <20200701230002.13BEB1A9218@mollari.NetBSD.org>

 POSIX doesn't invent rules and force people to follow them; it documents
 what actually works, so that you know what you can expect to happen.

 It isn't that anyone doesn't want the data to be useful, just that the
 implementations (or enough of them) don't use it, so pretending that
 applications can make use of it would not help anyone.

   | Would NetBSD be interested in a patch that makes the unspecified
   | result useful?  

 "unspecified" means that we can do something meaningful if something
 meaningful can be found.

 Interest might depend upon how complex the patch turns out to be, and
 how much it complicates later updates.

 kre

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: lib/55436: strptime does not process %G or %V
Date: Wed, 9 Jun 2021 04:43:50 +0000

 not sent to gnats

    ------

 From: Brian Ginsbach <ginsbach@netbsd.org>
 To: "James K. Lowden" <jklowden@schemamania.org>
 Cc: lib-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
 Subject: Re: lib/55436: strptime does not process %G or %V
 Date: Thu, 2 Jul 2020 18:23:43 +0000

 On Wed, Jul 01, 2020 at 11:00:02PM +0000, James K. Lowden wrote:
 > The following reply was made to PR lib/55436; it has been noted by GNATS.
 > 
 > From: "James K. Lowden" <jklowden@schemamania.org>
 > To: gnats-bugs@netbsd.org
 > Cc: 
 > Subject: Re: lib/55436: strptime does not process %G or %V
 > Date: Wed, 1 Jul 2020 18:57:41 -0400
 > 
 >  On Wed,  1 Jul 2020 05:40:02 +0000 (UTC)
 >  Robert Elz <kre@munnari.OZ.AU> wrote:
 >  
 >  >  The point of all of these is not so much that they can be used to
 >  > parse a time string and extract info from them, but so the output
 >  > from strftime can be read, without error, by a suitably constructed
 >  > strptime specification. 
 >  
 >  Thank you very much for the explanation.  I'm astonished by POSIX's
 >  rationale.  I don't see how skipping over metacharacter sequences
 >  without parsing them into the output helps anyone.  It's a recipe for
 >  error.  

 This is a very good explanation by kre. Again, originally strptime(3)
 was not a 1-to-1 match for strftime(3) regarding format specifiers.
 True, not parsing can be a recipe for error, but so can parsing
 without enough data.

 >  
 >  I note that on my system, no mention is made on the man page for
 >  strptime that %G and %V are accepted but inoperative.  
 >  
 >  I just spent two days working around this ... feature.  Would NetBSD be
 >  interested in a patch that makes the unspecified result useful?  

 I have worked on NetBSD's strptime(3) off and on over the years.
 The last time I surveyed OSS implementations of strptime(3) there
 were no implementations that did anything useful with either %G or
 %V. Hence, in part, why NetBSD's and the GNU C library's strptime(3)
 behave similarly for these two.

 However, I do have local changes that will help make %G and %V
 useful in some cases.  I just haven't had the time to commit the
 code. NetBSD 9.0 has a lot of changes to strptime(3), including a
 framework that makes it easier to, under the right circumstances,
 handle %G and %V, rather than simply skipping them.

 Unfortunately, as kre pointed out from the POSIX text, there are
 cases where it is not possible to fill in a struct tm. Your test
 case is a perfect example. What year should be used when the only
 data parsed is W26 into W%V?  Current year? Last year? Here is a
 comment from my uncommitted code that illustrates some of these
 issues:


 	/*
 	 * N.B. mixing ISO and non-ISO conversion
 	 * specifiers is undefined.  We convert:
 	 *  %U with %[Gg] same as %U with %[Yy]
 	 *  %V with %[Yy] same as %V with %[Gg]
 	 *  %W with %[Gg] same as %V with %[Gg] (week > 0)
 	 *  %W with %[Gg] same as %W with %[Yy] (week == 0)
 	 */

 If you have changes against the latest NetBSD version of strptime(3)
 please send them along. I will see how they compare with my
 uncommitted changes.

 Brian
