NetBSD Problem Report #49033

From gson@gson.org  Fri Jul 25 10:53:29 2014
Return-Path: <gson@gson.org>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
	by mollari.NetBSD.org (Postfix) with ESMTPS id C5DEEA7581
	for <gnats-bugs@gnats.NetBSD.org>; Fri, 25 Jul 2014 10:53:28 +0000 (UTC)
Message-Id: <20140725105316.C7C1475E31@guava.gson.org>
Date: Fri, 25 Jul 2014 13:53:16 +0300 (EEST)
From: gson@gson.org (Andreas Gustafsson)
Reply-To: gson@gson.org (Andreas Gustafsson)
To: gnats-bugs@gnats.NetBSD.org
Subject: fs/vfs/t_vnops/udf_lookup_complex test case randomly fails
X-Send-Pr-Version: 3.95

>Number:         49033
>Category:       kern
>Synopsis:       fs/vfs/t_vnops/udf_lookup_complex test case randomly fails
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    kern-bug-people
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Jul 25 10:55:00 +0000 2014
>Closed-Date:    Wed Aug 31 03:40:40 +0000 2016
>Last-Modified:  Wed Aug 31 03:40:40 +0000 2016
>Originator:     Andreas Gustafsson
>Release:        NetBSD-current
>Organization:

>Environment:
System: NetBSD
Architecture: i386
Machine: i386
>Description:

On the TNF testbed, the fs/vfs/t_vnops/udf_lookup_complex test case
has failed in about half the runs ever since it was added, with the
error

  t_vnops.c:100: memcmp(&sb1, &sb2, sizeof(sb1)) == 0 not met

This affects all three architectures being tested (i386, amd64, sparc).
Logs from a recent occurrence are at:

  http://releng.netbsd.org/b5reports/i386/build/2014.07.24.13.22.49/test.html#fs_vfs_t_vnops_udf_lookup_complex

>How-To-Repeat:

Run the ATF tests.

>Fix:

>Release-Note:

>Audit-Trail:
From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/49033 CVS commit: src/tests/fs/vfs
Date: Fri, 25 Jul 2014 12:16:22 +0000

 Module Name:	src
 Committed By:	martin
 Date:		Fri Jul 25 12:16:22 UTC 2014

 Modified Files:
 	src/tests/fs/vfs: t_vnops.c

 Log Message:
 PR kern/49033: try to provide better diagnostics if the udf_lookup_complex
 test randomly fails.


 To generate a diff of this commit:
 cvs rdiff -u -r1.39 -r1.40 src/tests/fs/vfs/t_vnops.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Andreas Gustafsson <gson@gson.org>
To: <martin@netbsd.org>
Cc: gnats-bugs@NetBSD.org
Subject: Re: PR/49033 CVS commit: src/tests/fs/vfs
Date: Fri, 25 Jul 2014 21:37:31 +0300

 >  PR kern/49033: try to provide better diagnostics if the udf_lookup_complex
 >  test randomly fails.

 Thanks :)

 Output from http://www.gson.org/netbsd/bugs/build/i386/2014/2014.07.25.12.18.13/test.html#fs_vfs_t_vnops_udf_lookup_complex :

 what   sb1		sb2
 st_dev 50432		50432
 st_mode			16877	16877
 st_ino			6	6
 st_nlink		2	2
 st_uid			0	0
 st_gid			0	0
 st_rdev			-1	-1
 st_atim			1406308232.27096000	1406308232.47096000
 st_mtim			1406308232.27096000	1406308232.27096000
 st_ctim			1406308232.27096000	1406308232.27096000
 st_birthtim		1406308232.27096000	1406308232.27096000
 st_size			40			40
 st_blocks		0			0
 st_flags		0			0
 st_gen			0			0

 Looks like the mismatch is in st_atim.  Does this mean UDF updates the
 st_atime on inode accesses and not just on data accesses, and if so,
 should it be considered a bug or is it a deliberate departure from
 standard POSIX semantics due to some filesystem limitation?
 -- 
 Andreas Gustafsson, gson@gson.org

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: PR/49033 CVS commit: src/tests/fs/vfs
Date: Sat, 26 Jul 2014 23:05:38 +0000

 On Fri, Jul 25, 2014 at 06:40:00PM +0000, Andreas Gustafsson wrote:
  >  Looks like the mismatch is in st_atim.  Does this mean UDF updates the
  >  st_atime on inode accesses and not just on data accesses, and if so,
  >  should it be considered a bug or is it a deliberate departure from
  >  standard POSIX semantics due to some filesystem limitation?

 I would guess it's just a bug.

 -- 
 David A. Holland
 dholland@netbsd.org

From: "Reinoud Zandijk" <reinoud@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/49033 CVS commit: src/sys/fs/udf
Date: Tue, 29 Jul 2014 11:10:12 +0000

 Module Name:	src
 Committed By:	reinoud
 Date:		Tue Jul 29 11:10:12 UTC 2014

 Modified Files:
 	src/sys/fs/udf: udf_vnops.c

 Log Message:
 Access time should only be set on accessing the data or messing with the
 contents, not on inode access rights and ownership changes. Should address
 PR kern/49033 for UDF.

 Test results now come clean for bugs related to this issue in the ATF.


 To generate a diff of this commit:
 cvs rdiff -u -r1.92 -r1.93 src/sys/fs/udf/udf_vnops.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@NetBSD.org
Cc: "Reinoud Zandijk" <reinoud@netbsd.org>
Subject: Re: kern/49033: fs/vfs/t_vnops/udf_lookup_complex test case randomly fails
Date: Wed, 30 Jul 2014 15:19:01 +0300

 There has been another test failure since the commit of udf_vnops.c 1.94:

   http://releng.netbsd.org/b5reports/i386/build/2014.07.29.23.35.00/test.html#fs_vfs_t_vnops_udf_lookup_complex

 -- 
 Andreas Gustafsson, gson@gson.org

From: Reinoud Zandijk <reinoud@NetBSD.org>
To: Andreas Gustafsson <gson@gson.org>
Cc: gnats-bugs@NetBSD.org, Reinoud Zandijk <reinoud@netbsd.org>
Subject: Re: kern/49033: fs/vfs/t_vnops/udf_lookup_complex test case randomly
 fails
Date: Wed, 30 Jul 2014 14:39:47 +0200

 Hi Andreas,

 On Wed, Jul 30, 2014 at 03:19:01PM +0300, Andreas Gustafsson wrote:
 > There has been another test failure since the commit of udf_vnops.c 1.94:
 > 
 >   http://releng.netbsd.org/b5reports/i386/build/2014.07.29.23.35.00/test.html#fs_vfs_t_vnops_udf_lookup_complex

 I think i can explain this one, though i have not seen WHAT lookup fails; it
 would be handy if the test would give hints, if only a line number. Looking at
 the changes i made, i think its a symbolic link lookup that needs to *read*
 i.e. access the data of the file to do the translation. The lookup is also
 needed when the symbolic link is only stat'ed.

 Is this acceptable?

 Reinoud

From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/49033: fs/vfs/t_vnops/udf_lookup_complex test case randomly fails
Date: Wed, 30 Jul 2014 14:55:07 +0200

 On Wed, Jul 30, 2014 at 12:45:01PM +0000, Reinoud Zandijk wrote:
 >  I think i can explain this one, though i have not seen WHAT lookup fails; it
 >  would be handy if the test would give hints, if only a line number.

 Well, just look at the source of the test case:

         sprintf(pb, "%s/dir", mountpath);
         if (rump_sys_mkdir(pb, 0777) == -1)
                 atf_tc_fail_errno("mkdir");
         if (rump_sys_stat(pb, &sb1) == -1)  
                 atf_tc_fail_errno("stat 1");

         sprintf(pb, "%s/./dir/../././dir/.", mountpath);
         if (rump_sys_stat(pb, &sb2) == -1)  
                 atf_tc_fail_errno("stat 2");

         if (memcmp(&sb1, &sb2, sizeof(sb1)) != 0) {
                 printf("what\tsb1\t\tsb2\n");
 [..]
 ... atime differs

 Martin

From: Andreas Gustafsson <gson@gson.org>
To: Reinoud Zandijk <reinoud@NetBSD.org>
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/49033: fs/vfs/t_vnops/udf_lookup_complex test case randomly fails
Date: Wed, 30 Jul 2014 15:53:56 +0300

 Reinoud,

 You wrote:
 > >   http://releng.netbsd.org/b5reports/i386/build/2014.07.29.23.35.00/test.html#fs_vfs_t_vnops_udf_lookup_complex
 > 
 > I think i can explain this one, though i have not seen WHAT lookup fails; it
 > would be handy if the test would give hints, if only a line number.

 I'm not sure I understand your request.  The reported error is that
 the stat calls on t_vnops.c lines 93 and 97 returned different
 results; are there some other line numbers you need?

 > Looking at the changes i made, i think its a symbolic link lookup
 > that needs to *read* i.e. access the data of the file to do the
 > translation. The lookup is also needed when the symbolic link is
 > only stat'ed.

 As far as I can see, there are no symlinks involved in this test.
 -- 
 Andreas Gustafsson, gson@gson.org

From: "Andreas Gustafsson" <gson@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/49033 CVS commit: src/tests/fs/vfs
Date: Tue, 12 Aug 2014 12:13:09 +0000

 Module Name:	src
 Committed By:	gson
 Date:		Tue Aug 12 12:13:09 UTC 2014

 Modified Files:
 	src/tests/fs/vfs: t_vnops.c

 Log Message:
 Mark udf_lookup_complex as an expected failure, referencing PR kern/49033.
 Since it does not fail reliably, force it to fail to avoid spurious failure
 reports due to unexpected success.


 To generate a diff of this commit:
 cvs rdiff -u -r1.40 -r1.41 src/tests/fs/vfs/t_vnops.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/49033 CVS commit: [netbsd-7] src/tests/fs/vfs
Date: Sat, 17 Jan 2015 11:51:33 +0000

 Module Name:	src
 Committed By:	martin
 Date:		Sat Jan 17 11:51:33 UTC 2015

 Modified Files:
 	src/tests/fs/vfs [netbsd-7]: t_vnops.c

 Log Message:
 Pull up following revision(s) (requested by gson in ticket #424):
 	tests/fs/vfs/t_vnops.c: revision 1.41
 Mark udf_lookup_complex as an expected failure, referencing PR kern/49033.
 Since it does not fail reliably, force it to fail to avoid spurious failure
 reports due to unexpected success.


 To generate a diff of this commit:
 cvs rdiff -u -r1.40 -r1.40.2.1 src/tests/fs/vfs/t_vnops.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/49033: fs/vfs/t_vnops/udf_lookup_complex test case randomly fails
Date: Sat, 20 Aug 2016 22:20:47 +0700

 I believe the test is bogus.

 In this test case ...

          if (rump_sys_mkdir(pb, 0777) == -1)
                  atf_tc_fail_errno("mkdir");
          if (rump_sys_stat(pb, &sb1) == -1)  
                  atf_tc_fail_errno("stat 1");

          sprintf(pb, "%s/./dir/../././dir/.", mountpath);
          if (rump_sys_stat(pb, &sb2) == -1)  
                  atf_tc_fail_errno("stat 2");

          if (memcmp(&sb1, &sb2, sizeof(sb1)) != 0) {
                  printf("what\tsb1\t\tsb2\n");

 The second stat() (sb2) has to read "dir" (the ultimate subject of the
 stat) in order to resolve ".." and "." within it.

 Implememnations are permitted to update the access time of the directory
 due to that read of it.

 What posix actually says (section 4.8 "File Times Update") is ...

 [ 
 http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html#tag_04_08 
 ]

 As best I can tell, TC2 (due to be published sometime soon) has not
 altered any of this.

 The relevant paragraph is:

     Each function or utility in POSIX.1-2008 that reads or writes data
     (even if the data does not change) or performs an operation to change
     file status (even if the file status does not change) indicates which
     of the appropriate timestamps shall be marked for update.

 That part is boring (for our purpose here), and just says that sometimes
 one (or more) of the 3 required file times must be updated, and the std
 specifies which when that is required.

     If an implementation of such a function or utility marks for update
     one of these timestamps in a place or time not specified by POSIX.1-2008,
     this shall be documented,

 That says that the implementation can update the times when not actually
 required to do so, provided it documents that it does, but ...

     except that any changes caused by pathname resolution need not be
     documented.

 that is, it is explicitly allowed that pathname resolution itself might cause
 time updates - for that, the only rational time that would be updated is the
 access time, as nothing is ever being changed during pathname resolution - but
 both directories and symlinks do (when appropriate) get read as part of that
 process.   Nothing requires that the access times be updated, but nothing
 prohibits it either,

 That is, a test that assumes that the access time will not be altered is
 simply broken.

 That is what this particular test is doing.   It ought not be just marked
 as an expected failure, but either deleted, or at least fixed to ignore the
 access time (or if it is to test it at all, just verify it has not moved
 backwards in time.)

 I expect that the intermittent failures simply depend upon whether the
 mkdir() (which sets the access time originally - the first stat() does not
 need to read the directory, and so probably does not alter it - though
 technically it is permitted to) and the second stat() both happen to occur
 withing the same qemu clock tick - sometimes they obviously do, other times
 they clearly don't - inserting a sleep(1) would (I think) make the test
 fail reliably without that explicit "failed to fail" hack...

 If there is no objection, I will make appropriate changes to the test
 (in a few days.)

 kre

From: "Robert Elz" <kre@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/49033 CVS commit: src/tests/fs/vfs
Date: Mon, 29 Aug 2016 02:31:46 +0000

 Module Name:	src
 Committed By:	kre
 Date:		Mon Aug 29 02:31:46 UTC 2016

 Modified Files:
 	src/tests/fs/vfs: t_vnops.c

 Log Message:
 PR kern/49033

 POSIX allows for the atime (or technically, any of the times) to be
 updated as a side effect of searching a directory (allows, not requires).
 The NetBSD UDF implementation apparently works that way, treating a
 directory search as a read of the directory, and hence updating the
 access time.   Compensate for that in the test (rather than just
 expecting failure) by verifying that the atime after the directory
 search is within a small margin of the atime before the search
 (currently, "small" is 1 second).   We could fetch the time before
 the mkdir and both stat() calls, do all of that, fetch the time after,
 subtract, and require the after stat() atime to be bounded by the atime
 set by the original mkdir and returned in the first stat() and that time
 + the difference in elapsed time - that would be more accurate, but is
 a lot more work for little real benefit.

 Should anyone be interested in doing that extra work, remember to use
 monotonic time (clock_gettime(CLOCK_MOMNOTONIC, ...)) not the time of day
 clock for measuring the elapsed time.

 Along with this, remove the "if (udf) failure expected" and the
 if (udf && we haven't failed yet) fail("random failure failed to happen")
 stuff...  (the "random" would have been that sometimes the mkdir and
 two lookups (stat() calls) would all occur within the same clock tick,
 meaning that the atimes would all be the same.  Other times the clock
 would tick somewhere between the mkdir() and the 2nd stat().)


 To generate a diff of this commit:
 cvs rdiff -u -r1.57 -r1.58 src/tests/fs/vfs/t_vnops.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->feedback
State-Changed-By: kre@NetBSD.org
State-Changed-When: Mon, 29 Aug 2016 04:12:01 +0000
State-Changed-Why:
Is there anyone related to this PR who is not happy to have it
closed after the change made to src/tests/fs/vfs/t_vnops.c in rev 1.58
of that file ?

Without objection, I (or someone else) will close this PR soon.


From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org,
    gnats-admin@netbsd.org,
    kre@NetBSD.org,
    gson@gson.org (Andreas Gustafsson)
Subject: Re: kern/49033 (fs/vfs/t_vnops/udf_lookup_complex test case randomly fails)
Date: Mon, 29 Aug 2016 16:40:44 +0300

 kre@NetBSD.org wrote:
 > Is there anyone related to this PR who is not happy to have it
 > closed after the change made to src/tests/fs/vfs/t_vnops.c in rev 1.58
 > of that file ?
 > 
 > Without objection, I (or someone else) will close this PR soon.

 I will be happy to see it closed.  Thank you for the analysis and fix.
 -- 
 Andreas Gustafsson, gson@gson.org

State-Changed-From-To: feedback->closed
State-Changed-By: kre@NetBSD.org
State-Changed-When: Wed, 31 Aug 2016 03:40:40 +0000
State-Changed-Why:
Test seems stable now.


>Unformatted:

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2014 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.