NetBSD Problem Report #38265

From martin@duskware.de  Thu Mar 20 09:29:44 2008
Return-Path: <martin@duskware.de>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by narn.NetBSD.org (Postfix) with ESMTP id 5A7E063B89A
	for <gnats-bugs@gnats.netbsd.org>; Thu, 20 Mar 2008 09:29:44 +0000 (UTC)
Message-Id: <20080320064058.A7F1263B89A@narn.NetBSD.org>
Date: Thu, 20 Mar 2008 06:40:58 +0000 (UTC)
From: dlagno@rambler.ru
Reply-To: dlagno@rambler.ru
To: netbsd-bugs-owner@NetBSD.org
Subject: sometimes /kern directory can not be read
X-Send-Pr-Version: www-1.0

>Number:         38265
>Category:       kern
>Synopsis:       sometimes /kern directory can not be read
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    kern-bug-people
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Mar 20 09:30:00 +0000 2008
>Closed-Date:    Thu Jan 15 23:03:14 +0000 2015
>Last-Modified:  Wed Feb 12 20:00:05 +0000 2020
>Originator:     Denis Lagno
>Release:        4.99.55
>Organization:
>Environment:
NetBSD flam.gado 4.99.55 NetBSD 4.99.55 (FLAM) #0: Wed Mar 19 19:10:28 MSK 2008  dina@flam.gado:/volatile/worksrc/netbsd-current/obj/sys/arch/i386/compile/FLAM i386
>Description:
kernfs is mounted:

$ mount | grep kern
kernfs on /kern type kernfs (read-only, local)

$ cat /etc/fstab | grep kern
none    /kern   kernfs  ro      0       0

individual files can be read:

$ cat /kern/hz          
100

However, the /kern directory sometimes cannot be read;
getdents fails with EBADF:

$ ktruss ls /kern
  3906      1 ktruss   emul(netbsd)
  3906      1 ktruss   fktrace                     = 0
  3906      1 ktruss   fcntl(0x4, 0x3, 0)          = 1
  3906      1 ktruss   fcntl(0x4, 0x4, 0x1)        = 0
  3906      1 ktruss   execve("/sbin/ls", 0xbfbfebe0, 0xbfbfebec) Err#2 ENOENT
  3906      1 ktruss   execve("/usr/sbin/ls", 0xbfbfebe0, 0xbfbfebec) Err#2 ENOENT
  3906      1 ls       emul(netbsd)
  3906      1 ls       execve("/bin/ls", 0xbfbfebe0, 0xbfbfebec) JUSTRETURN
  3906      1 ls       mmap(0, 0x8000, 0x3, 0x1002, 0xffffffff, 0, 0, 0) = 0xbbbea000
  3906      1 ls       open("/libexec/ld.elf_so", 0, 0) = 3
  3906      1 ls       __fstat30(0x3, 0xbfbfeadc)  = 0
  3906      1 ls       mmap(0, 0x5c, 0x1, 0x1, 0x3, 0, 0, 0) = 0xbbbe9000
  3906      1 ls       close(0x3)                  = 0
  3906      1 ls       munmap(0xbbbe9000, 0x5c)    = 0
  3906      1 ls       open("/etc/ld.so.conf", 0, 0) = 3
  3906      1 ls       __fstat30(0x3, 0xbfbfe3f0)  = 0
  3906      1 ls       mmap(0, 0x1000, 0x1, 0x1, 0x3, 0, 0, 0) = 0xbbbe9000
  3906      1 ls       munmap(0xbbbe9000, 0x1000)  = 0
  3906      1 ls       mmap(0, 0x13000, 0x5, 0x2, 0x3, 0, 0, 0) = 0xbbbd7000
  3906      1 ls       mmap(0xbbbe7000, 0x2000, 0x3, 0x12, 0x3, 0, 0x10000, 0) = 0xbbbe7000
  3906      1 ls       mmap(0xbbbe9000, 0x1000, 0x3, 0x1012, 0xffffffff, 0, 0, 0) = 0xbbbe9000
  3906      1 ls       close(0x3)                  = 0
  3906      1 ls       open("/usr/lib/libutil.so.7", 0, 0xbfbfe3f0) = 3
  3906      1 ls       __fstat30(0x3, 0xbfbfe3f0)  = 0
  3906      1 ls       mmap(0, 0x1000, 0x1, 0x1, 0x3, 0, 0, 0) = 0xbbbd6000
  3906      1 ls       munmap(0xbbbd6000, 0x1000)  = 0
  3906      1 ls       mmap(0, 0xf9000, 0x5, 0x2, 0x3, 0, 0, 0) = 0xbbade000
  3906      1 ls       mmap(0xbbbc0000, 0x7000, 0x3, 0x12, 0x3, 0, 0xe2000, 0) = 0xbbbc0000
  3906      1 ls       mmap(0xbbbc7000, 0x10000, 0x3, 0x1012, 0xffffffff, 0, 0, 0) = 0xbbbc7000
  3906      1 ls       close(0x3)                  = 0
  3906      1 ls       __sysctl(0xbfbfeb04, 0x2, 0x804db40, 0xbfbfeb0c, 0, 0) = 0
  3906      1 ls       issetugid()                 = 0
  3906      1 ls       ioctl(0x1, TIOCGETA, 0xbfbfeb00) = 0
       "\^B+\0\0\a\0\0\0\0K\0\0\M-K\^E\0 \^D\M^?\M^?\^?\^W\^U\^R\M^?\^C\^\\^Z\^Y\^Q\^S\^V\^O\^A\0\^T\M^?\0\M^V\0\0\0\M^V\0\0"
  3906      1 ls       ioctl(0x1, TIOCGWINSZ, 0xbfbfeb64) = 0
       ",\0\M^Q\0\M-{\^CM\^B"
  3906      1 ls       getuid()                    = 0
  3906      1 ls       __sysctl(0xbfbfe664, 0x2, 0xbbbcea9c, 0xbfbfe670, 0, 0) = 0
  3906      1 ls       __sysctl(0xbfbfe578, 0x2, 0xbbbd5780, 0xbfbfe580, 0, 0) = 0
  3906      1 ls       readlink("/usr/lib/libc.so.12", 0xbfbfe675, 0x400) Err#2 ENOENT
  3906      1 ls       issetugid()                 = 0
  3906      1 ls       break(0x804dd18)            = 0
  3906      1 ls       break(0x804dd18)            = 0
  3906      1 ls       break(0x8100000)            = 0
  3906      1 ls       mmap(0, 0x100000, 0x3, 0x14001002, 0xffffffff, 0, 0, 0) = 0xbb900000
  3906      1 ls       __stat30("/etc/malloc.conf", 0xbfbfea54) = 0
  3906      1 ls       open("/kern", 0, 0)         = 3
  3906      1 ls       fcntl(0x3, 0x2, 0x1)        = 0
  3906      1 ls       fchdir(0x3)                 = 0
  3906      1 ls       open(".", 0x4, 0)           = 5
  3906      1 ls       fcntl(0x5, 0x2, 0x1)        = 0
  3906      1 ls       __fstat30(0x5, 0xbfbfe140)  = 0
  3906      1 ls       fstatvfs1(0x5, 0xbfbfe1a4, 0x2) = 0
  3906      1 ls       lseek(0x5, 0, 0, 0, 0x1)    = 0
  3906      1 ls       __getdents30(0x5, 0xbb90d000, 0x1000) Err#9 EBADF
  3906      1 ls       close(0x5)                  = 0
  3906      1 ls       open("/kern", 0x4, 0xbfbfea68) = 5
  3906      1 ls       fcntl(0x5, 0x2, 0x1)        = 0
  3906      1 ls       __fstat30(0x5, 0xbfbfe130)  = 0
  3906      1 ls       fstatvfs1(0x5, 0xbfbfe194, 0x2) = 0
  3906      1 ls       __fstat30(0x5, 0xbfbfe9e8)  = 0
  3906      1 ls       fchdir(0x5)                 = 0
  3906      1 ls       lseek(0x5, 0, 0, 0, 0x1)    = 0
  3906      1 ls       __getdents30(0x5, 0xbb90d000, 0x1000) Err#9 EBADF
  3906      1 ls       close(0x5)                  = 0
  3906      1 ls       fchdir(0x3)                 = 0
  3906      1 ls       fchdir(0x3)                 = 0
  3906      1 ls       close(0x3)                  = 0
  3906      1 ls       exit(0)   

However, the error is not persistent.
After a few tries it goes away:

$ ls /kern  
boottime  copyright hostname  hz        ipsecsa   ipsecsp   loadavg   msgbuf    pagesize  physmem   rootdev   rrootdev  time      version

>How-To-Repeat:
Mount kernfs and try to read the /kern directory.
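
A minimal sketch of a check that makes the failure visible, assuming a
POSIX opendir(3)/readdir(3) environment. The function name
count_dir_entries is hypothetical and not part of the original report.
It clears errno before each readdir() call so that a read error (such as
the EBADF from getdents in the trace above) can be told apart from a
directory that is really empty:

```c
#include <dirent.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from the report): count the entries in `path`.
 * Returns the number of entries (including "." and ".." where the file
 * system reports them), or -1 with errno set if the directory cannot be
 * opened or read.
 */
int
count_dir_entries(const char *path)
{
	DIR *dir;
	int count = 0;
	int saved_errno;

	dir = opendir(path);
	if (dir == NULL)
		return -1;		/* errno was set by opendir() */

	for (;;) {
		errno = 0;		/* NULL means EOF or error; errno tells which */
		if (readdir(dir) == NULL)
			break;
		count++;
	}

	saved_errno = errno;		/* closedir() may overwrite errno */
	(void)closedir(dir);
	if (saved_errno != 0) {
		errno = saved_errno;
		return -1;
	}
	return count;
}
```

A caller could print strerror(errno) when count_dir_entries("/kern")
returns -1, instead of showing an empty listing.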
>Fix:
N/A

>Release-Note:

>Audit-Trail:
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org, netbsd-bugs@NetBSD.org
Subject: Re: kern/38265: sometimes /kern directory can not be read
Date: Thu, 20 Mar 2008 21:16:59 +0100

 On Thu, Mar 20, 2008 at 09:30:00AM +0000, dlagno@rambler.ru wrote:
 > >Number:         38265
 > >Category:       kern
 > >Synopsis:       sometimes /kern directory can not be read
 > >Confidential:   no
 > >Severity:       non-critical
 > >Priority:       medium
 > >Responsible:    kern-bug-people
 > >State:          open
 > >Class:          sw-bug
 > >Submitter-Id:   net
 > >Arrival-Date:   Thu Mar 20 09:30:00 +0000 2008
 > >Originator:     Denis Lagno
 > >Release:        4.99.55
 > >Organization:
 > >Environment:
 > NetBSD flam.gado 4.99.55 NetBSD 4.99.55 (FLAM) #0: Wed Mar 19 19:10:28 MSK 2008  dina@flam.gado:/volatile/worksrc/netbsd-current/obj/sys/arch/i386/compile/FLAM i386
 > >Description:
 > kernfs is mounted:
 > 
 > $ mount | grep kern
 > kernfs on /kern type kernfs (read-only, local)
 > 
 > $ cat /etc/fstab | grep kern
 > none    /kern   kernfs  ro      0       0
 > 
 > individual files can be read:
 > 
 > $ cat /kern/hz          
 > 100
 > 
 > However directory /kern sometimes fails to be read:

 FWIW I also see this on some systems:
 rap:/usr/home/bouyer>ls /kern/
 rap:/usr/home/bouyer>ls /kern/xen
 privcmd   xenbus    xsd_port
 rap:/usr/home/bouyer>

  29742      1 ls       CALL  open(0x7f7ffd808c00,4,0)
  29742      1 ls       NAMI  "/kern"
  29742      1 ls       RET   open 4
  29742      1 ls       CALL  fcntl(4,2,1)
  29742      1 ls       RET   fcntl 0
  29742      1 ls       CALL  __fstat30(4,0x7f7fffffd030)
  29742      1 ls       RET   __fstat30 0
  29742      1 ls       CALL  __sysctl(0x7f7fffffcfc0,2,0x7f7ffdb0e080,0x7f7fffffcfb8,0,0)
  29742      1 ls       RET   __sysctl 0
  29742      1 ls       CALL  fstatvfs1(4,0x7f7fffffd0c0,2)
  29742      1 ls       RET   fstatvfs1 0
  29742      1 ls       CALL  lseek(4,0,0,1)
  29742      1 ls       RET   lseek 0
  29742      1 ls       CALL  __getdents30(4,0x7f7ffd81e000,0x1000)
  29742      1 ls       RET   __getdents30 -1 errno 9 Bad file descriptor
  29742      1 ls       CALL  close(4)
  29742      1 ls       RET   close 0

 -- 
 Manuel Bouyer <bouyer@antioche.eu.org>
      NetBSD: 26 years of experience will always make the difference
 --

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: makoto@ki.nu
Subject: Re: kern/38265: was: Re: kern/38778: /kern files do not show up
	anymore after quite some time]
Date: Sun, 18 Jan 2009 00:09:22 +0000

 The following didn't make it to gnats. (Things need to be sent to
 gnats-bugs as well as or instead of netbsd-bugs.)

 Also, 38778 was closed as a duplicate of 38265.

    ------

 From: Makoto Fujiwara <makoto@ki.nu>
 To: netbsd-bugs@netbsd.org
 Cc: ad@netbsd.org, kern-bug-people@netbsd.org
 Subject: Re: kern/38778: /kern files do not show up anymore after quite some
 	time
 Date: Wed, 14 Jan 2009 09:59:54 +0900

 | On a freshly booted system there are no problems seeing files under /kern;
 | after quite some time/use (cvs up src + kernel build), they no longer show
 | up but remain available.

   This symptom may be triggered by intensive file access; in
 my case, by rsync -aH /m/NFS-HOST/ /local-dir/ where
 /m/NFS-HOST is large (about 180GB here).
 It takes only two minutes to reproduce the problem.
  But just find src -newer tarball.tgz or find src tarball.tgz may
 be the key.

  I have tracked down this problem, and it seems to me
 that the changes in the range of
   2008/01/24 17:30 UTC - 2008/01/24 18:00 UTC
 are the key to this behavior.

 Also, when I rebooted with the 2008-01-24 18:00 kernel, I got the
 following problem:
    unmounting file systems...panic: unmount: dangling vnode

 In this range, the following commit seems very suspicious:
  http://mail-index.netbsd.org/source-changes/2008/01/24/msg001255.html

 Related mail:
  http://mail-index.netbsd.org/current-users/2008/03/25/msg001514.html

 Thanks:
 ---
 Makoto Fujiwara,

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org
Subject: Re: kern/38265: sometimes /kern directory can not be read
Date: Sun, 15 Feb 2009 08:53:22 +0000

 On Thu, Mar 20, 2008 at 09:30:00AM +0000, dlagno@rambler.ru wrote:
  > individual files can be read:
  > 
  > $ cat /kern/hz          
  > 100
  > 
  > However directory /kern sometimes fails to be read:

 mrg says that the problem is that /kern/rootdev and /kern/rrootdev
 don't work unless the real root device vnodes from /dev are in the
 vnode cache, and thus ls -l in /dev will cause /kern to reappear.

 This worked just now for me on a 5.99.7 box but not on a 4.99.72 box,
 so it may not be the only problem but it's probably at least part of
 the problem.

 -- 
 David A. Holland
 dholland@netbsd.org

From: Makoto Fujiwara <makoto@ki.nu>
To: David Holland <dholland-bugs@NetBSD.org>
Cc: kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org, netbsd-bugs@NetBSD.org,
        dlagno@rambler.ru
Subject: Re: kern/38265: sometimes /kern directory can not be read
Date: Mon, 16 Feb 2009 11:58:18 +0900

 Hi, I have a machine running 4.99.72, and when I looked at that machine,
 it did NOT have anything in /kern.

 Then I did
    ls -l /dev,
 and everything seemed to be back. Then I ran rsync:

    rsync -aH /mounf/NFSserver/home/ /export/home/

 After about 7 minutes, /kern became empty again. Then I did
    ls -l /dev/wd0a,
 which is the root device, and /kern seemed to be back,
 but rrootdev was missing. So I did
    ls -l /dev/rwd0a,
 and now I get everything.

 ttyp2:makoto@mini 11:45:07/090216(/dev)> ls -l /kern
 total 15
 -r--r--r--  1 root  wheel        11 Feb 14 18:06 boottime
 -r--r--r--  1 root  wheel       264 Feb 16 11:55 copyright
 -rw-r--r--  1 root  wheel         5 Feb 16 11:55 hostname
 -r--r--r--  1 root  wheel         4 Feb 16 11:55 hz
 -r--r--r--  1 root  wheel        16 Feb 16 11:55 loadavg
 -r--r--r--  1 root  wheel      8176 Feb 16 11:55 msgbuf
 -r--r--r--  1 root  wheel         5 Feb 16 11:55 pagesize
 -r--r--r--  1 root  wheel         7 Feb 16 11:55 physmem
 brw-r-----  1 root  operator  10, 0 Sep 30  2006 rootdev
 crw-r-----  1 root  operator  30, 0 Sep 30  2006 rrootdev
 -r--r--r--  1 root  wheel        18 Feb 16 11:55 time
 -r--r--r--  1 root  wheel       125 Feb 16 11:55 version

 NetBSD mini 4.99.72 NetBSD 4.99.72 (GENERIC) #1: 
 Tue Nov 11 23:05:54 JST 2008  
 root@bologna:/export/20080908/src/sys/arch/macppc/compile/GENERIC macppc

 Thanks,
  Makoto Fujiwara
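
The workaround described in the message above (examining the root device
nodes under /dev before listing /kern again) can be sketched in C. This
is only an illustration: the function name lstat_paths is hypothetical,
and the idea that an lstat() of the device node is enough to bring the
vnode back into the cache is an assumption based on the reported effect
of "ls -l", not something the thread confirms.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from the thread): lstat() each path, roughly
 * what "ls -l" does per entry, and report failures on stderr.
 * Returns the number of paths that could be examined.
 */
size_t
lstat_paths(const char *const *paths, size_t npaths)
{
	struct stat st;
	size_t ok = 0;
	size_t i;

	for (i = 0; i < npaths; i++) {
		if (lstat(paths[i], &st) == -1) {
			fprintf(stderr, "%s: %s\n", paths[i], strerror(errno));
			continue;
		}
		ok++;
	}
	return ok;
}
```

With the device names from the message, the paths would be "/dev/wd0a"
and "/dev/rwd0a"; other systems will have a different root device.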

State-Changed-From-To: open->closed
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Thu, 15 Jan 2015 23:03:14 +0000
State-Changed-Why:
The problem described herein (the entire listing of /kern disappearing if
the root device vnodes get cleared from the vnode cache) was fixed some
time ago.

I seem to recall a second different problem that manifested with the same
symptoms, but I can't find any references to it; so I'm going to close this
PR and if anyone remembers/finds the other problem (or sees it again), let me
know or open a new PR.

There's also a contributing issue, which is that fts(3) is not robust in the
presence of errors; this is why the kernfs listing disappears instead of
reporting the error associated with the root device vnodes. I'm going to
open a new PR on that if there isn't one already.

(update: it is 49577)
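
As an illustration of the point about errors during a traversal, here is
a minimal sketch of an fts(3) caller that reports per-entry errors rather
than dropping them silently. It does not reproduce or fix the problem
tracked in PR 49577; it only shows where a caller can look for errors
(the FTS_DNR, FTS_ERR and FTS_NS entry types, and errno after fts_read()
returns NULL). The function name walk_count_errors is hypothetical.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <errno.h>
#include <fts.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: walk the tree at `root` and print every entry
 * that fts(3) flags as unreadable or un-stat-able.  Returns the number
 * of such entries, or -1 with errno set if the walk itself fails.
 */
int
walk_count_errors(const char *root)
{
	char *roots[2];
	FTS *fts;
	FTSENT *ent;
	int nerrors = 0;
	int saved_errno;

	roots[0] = (char *)root;	/* fts_open() does not modify the string */
	roots[1] = NULL;

	fts = fts_open(roots, FTS_PHYSICAL | FTS_NOCHDIR, NULL);
	if (fts == NULL)
		return -1;

	for (;;) {
		errno = 0;		/* NULL means done or error; errno tells which */
		ent = fts_read(fts);
		if (ent == NULL)
			break;
		switch (ent->fts_info) {
		case FTS_DNR:		/* directory could not be read */
		case FTS_ERR:		/* other error on this entry */
		case FTS_NS:		/* stat information unavailable */
			fprintf(stderr, "%s: %s\n", ent->fts_path,
			    strerror(ent->fts_errno));
			nerrors++;
			break;
		default:
			break;
		}
	}

	saved_errno = errno;		/* fts_close() may overwrite errno */
	(void)fts_close(fts);
	if (saved_errno != 0) {
		errno = saved_errno;
		return -1;
	}
	return nerrors;
}
```

A nonzero return from walk_count_errors("/kern") would indicate that the
directory listing was incomplete because of an error, not because the
directory was empty.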

From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/38265 CVS commit: src/sys/miscfs/kernfs
Date: Tue, 4 Feb 2020 04:19:25 +0000

 Module Name:	src
 Committed By:	riastradh
 Date:		Tue Feb  4 04:19:24 UTC 2020

 Modified Files:
 	src/sys/miscfs/kernfs: kernfs.h kernfs_vfsops.c kernfs_vnops.c

 Log Message:
 Use specfs vnops for specnodes in kernfs.

 While here, don't filter out rootdev and rrootdev merely because
 they're not cached.

 Fixes the elusive /kern/rootdev and /kern/rrootdev nodes, which only
 appeared sometimes when they felt like it, and fixes operations on
 /kern/rootdev and /kern/rrootdev always returning EOPNOTSUPP.

 We didn't seem to have a single PR for these issues but the following
 PRs are all relevant:

 PR bin/13564
 PR kern/38265
 PR kern/38778
 PR kern/45974

 XXX pullup-9, pullup-8, pullup-7, pullup-6, pullup-5, pullup-4, pullup-3, pullup-2, pullup-1.4T...


 To generate a diff of this commit:
 cvs rdiff -u -r1.42 -r1.43 src/sys/miscfs/kernfs/kernfs.h
 cvs rdiff -u -r1.97 -r1.98 src/sys/miscfs/kernfs/kernfs_vfsops.c
 cvs rdiff -u -r1.162 -r1.163 src/sys/miscfs/kernfs/kernfs_vnops.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/38265 CVS commit: [netbsd-9] src/sys/miscfs/kernfs
Date: Wed, 12 Feb 2020 19:59:23 +0000

 Module Name:	src
 Committed By:	martin
 Date:		Wed Feb 12 19:59:22 UTC 2020

 Modified Files:
 	src/sys/miscfs/kernfs [netbsd-9]: kernfs.h kernfs_vfsops.c
 	    kernfs_vnops.c

 Log Message:
 Pull up following revision(s) (requested by riastradh in ticket #702):

 	sys/miscfs/kernfs/kernfs_vfsops.c: revision 1.98
 	sys/miscfs/kernfs/kernfs_vnops.c: revision 1.163
 	sys/miscfs/kernfs/kernfs.h: revision 1.43

 Use specfs vnops for specnodes in kernfs.

 While here, don't filter out rootdev and rrootdev merely because
 they're not cached.

 Fixes the elusive /kern/rootdev and /kern/rrootdev nodes, which only
 appeared sometimes when they felt like it, and fixes operations on
 /kern/rootdev and /kern/rrootdev always returning EOPNOTSUPP.

 We didn't seem to have a single PR for these issues but the following
 PRs are all relevant:

 PR bin/13564
 PR kern/38265
 PR kern/38778
 PR kern/45974

 XXX pullup-9, pullup-8, pullup-7, pullup-6, pullup-5, pullup-4, pullup-3, pullup-2, pullup-1.4T...


 To generate a diff of this commit:
 cvs rdiff -u -r1.40 -r1.40.32.1 src/sys/miscfs/kernfs/kernfs.h
 cvs rdiff -u -r1.96 -r1.96.18.1 src/sys/miscfs/kernfs/kernfs_vfsops.c
 cvs rdiff -u -r1.160.4.1 -r1.160.4.2 src/sys/miscfs/kernfs/kernfs_vnops.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

>Unformatted:

$NetBSD: query-full-pr,v 1.46 2020/01/03 16:35:01 leot Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2020 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.