NetBSD Problem Report #38265

From martin@duskware.de  Thu Mar 20 09:29:44 2008
Return-Path: <martin@duskware.de>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by narn.NetBSD.org (Postfix) with ESMTP id 5A7E063B89A
	for <gnats-bugs@gnats.netbsd.org>; Thu, 20 Mar 2008 09:29:44 +0000 (UTC)
Message-Id: <20080320064058.A7F1263B89A@narn.NetBSD.org>
Date: Thu, 20 Mar 2008 06:40:58 +0000 (UTC)
From: dlagno@rambler.ru
Reply-To: dlagno@rambler.ru
To: netbsd-bugs-owner@NetBSD.org
Subject: sometimes /kern directory can not be read
X-Send-Pr-Version: www-1.0

>Number:         38265
>Category:       kern
>Synopsis:       sometimes /kern directory can not be read
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Mar 20 09:30:00 +0000 2008
>Last-Modified:  Mon Feb 16 03:10:02 +0000 2009
>Originator:     Denis Lagno
>Release:        4.99.55
>Organization:
>Environment:
NetBSD flam.gado 4.99.55 NetBSD 4.99.55 (FLAM) #0: Wed Mar 19 19:10:28 MSK 2008  dina@flam.gado:/volatile/worksrc/netbsd-current/obj/sys/arch/i386/compile/FLAM i386
>Description:
kernfs is mounted:

$ mount | grep kern
kernfs on /kern type kernfs (read-only, local)

$ cat /etc/fstab | grep kern
none  /kern  kernfs  ro  0  0

individual files can be read:

$ cat /kern/hz          
100

However, reading the /kern directory itself sometimes fails;
getdents returns EBADF:

$ ktruss ls /kern
  3906      1 ktruss   emul(netbsd)
  3906      1 ktruss   fktrace                     = 0
  3906      1 ktruss   fcntl(0x4, 0x3, 0)          = 1
  3906      1 ktruss   fcntl(0x4, 0x4, 0x1)        = 0
  3906      1 ktruss   execve("/sbin/ls", 0xbfbfebe0, 0xbfbfebec) Err#2 ENOENT
  3906      1 ktruss   execve("/usr/sbin/ls", 0xbfbfebe0, 0xbfbfebec) Err#2 ENOENT
  3906      1 ls       emul(netbsd)
  3906      1 ls       execve("/bin/ls", 0xbfbfebe0, 0xbfbfebec) JUSTRETURN
  3906      1 ls       mmap(0, 0x8000, 0x3, 0x1002, 0xffffffff, 0, 0, 0) = 0xbbbea000
  3906      1 ls       open("/libexec/ld.elf_so", 0, 0) = 3
  3906      1 ls       __fstat30(0x3, 0xbfbfeadc)  = 0
  3906      1 ls       mmap(0, 0x5c, 0x1, 0x1, 0x3, 0, 0, 0) = 0xbbbe9000
  3906      1 ls       close(0x3)                  = 0
  3906      1 ls       munmap(0xbbbe9000, 0x5c)    = 0
  3906      1 ls       open("/etc/ld.so.conf", 0, 0) = 3
  3906      1 ls       __fstat30(0x3, 0xbfbfe3f0)  = 0
  3906      1 ls       mmap(0, 0x1000, 0x1, 0x1, 0x3, 0, 0, 0) = 0xbbbe9000
  3906      1 ls       munmap(0xbbbe9000, 0x1000)  = 0
  3906      1 ls       mmap(0, 0x13000, 0x5, 0x2, 0x3, 0, 0, 0) = 0xbbbd7000
  3906      1 ls       mmap(0xbbbe7000, 0x2000, 0x3, 0x12, 0x3, 0, 0x10000, 0) = 0xbbbe7000
  3906      1 ls       mmap(0xbbbe9000, 0x1000, 0x3, 0x1012, 0xffffffff, 0, 0, 0) = 0xbbbe9000
  3906      1 ls       close(0x3)                  = 0
  3906      1 ls       open("/usr/lib/libutil.so.7", 0, 0xbfbfe3f0) = 3
  3906      1 ls       __fstat30(0x3, 0xbfbfe3f0)  = 0
  3906      1 ls       mmap(0, 0x1000, 0x1, 0x1, 0x3, 0, 0, 0) = 0xbbbd6000
  3906      1 ls       munmap(0xbbbd6000, 0x1000)  = 0
  3906      1 ls       mmap(0, 0xf9000, 0x5, 0x2, 0x3, 0, 0, 0) = 0xbbade000
  3906      1 ls       mmap(0xbbbc0000, 0x7000, 0x3, 0x12, 0x3, 0, 0xe2000, 0) = 0xbbbc0000
  3906      1 ls       mmap(0xbbbc7000, 0x10000, 0x3, 0x1012, 0xffffffff, 0, 0, 0) = 0xbbbc7000
  3906      1 ls       close(0x3)                  = 0
  3906      1 ls       __sysctl(0xbfbfeb04, 0x2, 0x804db40, 0xbfbfeb0c, 0, 0) = 0
  3906      1 ls       issetugid()                 = 0
  3906      1 ls       ioctl(0x1, TIOCGETA, 0xbfbfeb00) = 0
       "\^B+\0\0\a\0\0\0\0K\0\0\M-K\^E\0 \^D\M^?\M^?\^?\^W\^U\^R\M^?\^C\^\\^Z\^Y\^Q\^S\^V\^O\^A\0\^T\M^?\0\M^V\0\0\0\M^V\0\0"
  3906      1 ls       ioctl(0x1, TIOCGWINSZ, 0xbfbfeb64) = 0
       ",\0\M^Q\0\M-{\^CM\^B"
  3906      1 ls       getuid()                    = 0
  3906      1 ls       __sysctl(0xbfbfe664, 0x2, 0xbbbcea9c, 0xbfbfe670, 0, 0) = 0
  3906      1 ls       __sysctl(0xbfbfe578, 0x2, 0xbbbd5780, 0xbfbfe580, 0, 0) = 0
  3906      1 ls       readlink("/usr/lib/libc.so.12", 0xbfbfe675, 0x400) Err#2 ENOENT
  3906      1 ls       issetugid()                 = 0
  3906      1 ls       break(0x804dd18)            = 0
  3906      1 ls       break(0x804dd18)            = 0
  3906      1 ls       break(0x8100000)            = 0
  3906      1 ls       mmap(0, 0x100000, 0x3, 0x14001002, 0xffffffff, 0, 0, 0) = 0xbb900000
  3906      1 ls       __stat30("/etc/malloc.conf", 0xbfbfea54) = 0
  3906      1 ls       open("/kern", 0, 0)         = 3
  3906      1 ls       fcntl(0x3, 0x2, 0x1)        = 0
  3906      1 ls       fchdir(0x3)                 = 0
  3906      1 ls       open(".", 0x4, 0)           = 5
  3906      1 ls       fcntl(0x5, 0x2, 0x1)        = 0
  3906      1 ls       __fstat30(0x5, 0xbfbfe140)  = 0
  3906      1 ls       fstatvfs1(0x5, 0xbfbfe1a4, 0x2) = 0
  3906      1 ls       lseek(0x5, 0, 0, 0, 0x1)    = 0
  3906      1 ls       __getdents30(0x5, 0xbb90d000, 0x1000) Err#9 EBADF
  3906      1 ls       close(0x5)                  = 0
  3906      1 ls       open("/kern", 0x4, 0xbfbfea68) = 5
  3906      1 ls       fcntl(0x5, 0x2, 0x1)        = 0
  3906      1 ls       __fstat30(0x5, 0xbfbfe130)  = 0
  3906      1 ls       fstatvfs1(0x5, 0xbfbfe194, 0x2) = 0
  3906      1 ls       __fstat30(0x5, 0xbfbfe9e8)  = 0
  3906      1 ls       fchdir(0x5)                 = 0
  3906      1 ls       lseek(0x5, 0, 0, 0, 0x1)    = 0
  3906      1 ls       __getdents30(0x5, 0xbb90d000, 0x1000) Err#9 EBADF
  3906      1 ls       close(0x5)                  = 0
  3906      1 ls       fchdir(0x3)                 = 0
  3906      1 ls       fchdir(0x3)                 = 0
  3906      1 ls       close(0x3)                  = 0
  3906      1 ls       exit(0)   

However, the error is not persistent.
After a few tries it goes away:

$ ls /kern  
boottime  copyright hostname  hz        ipsecsa   ipsecsp   loadavg   msgbuf    pagesize  physmem   rootdev   rrootdev  time      version
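
Because the failure is intermittent, a small loop makes it easier to see how often the listing comes back empty. The following is only a sketch: count_entries and empty_listings are hypothetical helper names, not part of NetBSD, and it assumes a POSIX sh with ls, wc and tr available.

```sh
#!/bin/sh
# Hypothetical helpers for illustration; not part of NetBSD.

# count_entries DIR: print the number of names that ls reports for DIR.
# In the trace above ls still calls exit(0) after getdents fails with
# EBADF, so an empty listing is the only visible symptom.
# A nonexistent DIR also counts as 0 entries.
count_entries() {
    ls "$1" 2>/dev/null | wc -l | tr -d ' '
}

# empty_listings DIR N: list DIR N times and print how many of those
# attempts returned no entries at all.
empty_listings() {
    dir=$1
    tries=$2
    empty=0
    i=0
    while [ "$i" -lt "$tries" ]; do
        if [ "$(count_entries "$dir")" -eq 0 ]; then
            empty=$((empty + 1))
        fi
        i=$((i + 1))
    done
    echo "$empty"
}
```

For example, `empty_listings /kern 20` prints how many of 20 attempts listed nothing; on an unaffected system the result should be 0.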

>How-To-Repeat:
mount kernfs and try to read /kern directory
>Fix:
N/A

>Audit-Trail:
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org, netbsd-bugs@NetBSD.org
Subject: Re: kern/38265: sometimes /kern directory can not be read
Date: Thu, 20 Mar 2008 21:16:59 +0100

 On Thu, Mar 20, 2008 at 09:30:00AM +0000, dlagno@rambler.ru wrote:
 > However directory /kern sometimes fails to be read:

 FWIW I also see this on some systems:
 rap:/usr/home/bouyer>ls /kern/
 rap:/usr/home/bouyer>ls /kern/xen
 privcmd   xenbus    xsd_port
 rap:/usr/home/bouyer>

  29742      1 ls       CALL  open(0x7f7ffd808c00,4,0)
  29742      1 ls       NAMI  "/kern"
  29742      1 ls       RET   open 4
  29742      1 ls       CALL  fcntl(4,2,1)
  29742      1 ls       RET   fcntl 0
  29742      1 ls       CALL  __fstat30(4,0x7f7fffffd030)
  29742      1 ls       RET   __fstat30 0
  29742      1 ls       CALL  __sysctl(0x7f7fffffcfc0,2,0x7f7ffdb0e080,0x7f7fffffcfb8,0,0)
  29742      1 ls       RET   __sysctl 0
  29742      1 ls       CALL  fstatvfs1(4,0x7f7fffffd0c0,2)
  29742      1 ls       RET   fstatvfs1 0
  29742      1 ls       CALL  lseek(4,0,0,1)
  29742      1 ls       RET   lseek 0
  29742      1 ls       CALL  __getdents30(4,0x7f7ffd81e000,0x1000)
  29742      1 ls       RET   __getdents30 -1 errno 9 Bad file descriptor
  29742      1 ls       CALL  close(4)
  29742      1 ls       RET   close 0

 -- 
 Manuel Bouyer <bouyer@antioche.eu.org>
      NetBSD: 26 ans d'experience feront toujours la difference
 --

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: makoto@ki.nu
Subject: Re: kern/38265: was: Re: kern/38778: /kern files do not show up
	anymore after quite some time]
Date: Sun, 18 Jan 2009 00:09:22 +0000

 The following didn't make it to gnats. (Things need to be sent to
 gnats-bugs as well as or instead of netbsd-bugs.)

 Also, 38778 was closed as a duplicate of 38265.

    ------

 From: Makoto Fujiwara <makoto@ki.nu>
 To: netbsd-bugs@netbsd.org
 Cc: ad@netbsd.org, kern-bug-people@netbsd.org
 Subject: Re: kern/38778: /kern files do not show up anymore after quite some
 	time
 Date: Wed, 14 Jan 2009 09:59:54 +0900

 | On a freshly booted system there is no problems to see files under /kern;
 | after quite some time/use (cvs up src + kernel build), they do not show
 | up anymore but remain available.

   This symptom may be triggered by intensive file access. In my
 case that is rsync -aH /m/NFS-HOST/ /local-dir/ with a large
 /m/NFS-HOST (180GB here); it takes only two minutes to reproduce
 the problem. But just find src -newer tarball.tgz or
 find src tarball.tgz may be the key.
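
One way to test the "intensive file access" theory is to compare the /kern listing before and after a tree walk. This is a sketch under assumptions: listing_after_walk is a hypothetical helper name, the find invocation mirrors the one quoted above, and whether a single walk is enough to trigger the problem is not established by this report.

```sh
#!/bin/sh
# Hypothetical helper for illustration; not part of NetBSD.

# listing_after_walk DIR TREE REF: count the entries ls reports for DIR,
# walk TREE with "find -newer REF" (which has to stat every file), then
# count again and print both numbers.
listing_after_walk() {
    before=$(ls "$1" 2>/dev/null | wc -l | tr -d ' ')
    find "$2" -newer "$3" >/dev/null 2>&1
    after=$(ls "$1" 2>/dev/null | wc -l | tr -d ' ')
    echo "before=$before after=$after"
}
```

A call such as `listing_after_walk /kern src tarball.tgz` would show a nonzero "before" and a zero "after" if the walk alone is enough to make the entries disappear.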

  I have tracked down this problem, and it seems to me
 that the changes in the range
   2008/01/24 17:30 UTC - 2008/01/24 18:00 UTC
 are the key to this behavior.

 Also, when I rebooted with the 2008-01-24 18:00 kernel, I got
 the following panic:
    unmounting file systems...panic: unmount: dangling vnode

 In this range, the following commit seems very suspicious:
  http://mail-index.netbsd.org/source-changes/2008/01/24/msg001255.html

 Related mail:
  http://mail-index.netbsd.org/current-users/2008/03/25/msg001514.html

 Thanks:
 ---
 Makoto Fujiwara,

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org
Subject: Re: kern/38265: sometimes /kern directory can not be read
Date: Sun, 15 Feb 2009 08:53:22 +0000

 On Thu, Mar 20, 2008 at 09:30:00AM +0000, dlagno@rambler.ru wrote:
  > individual files can be read:
  > 
  > $ cat /kern/hz          
  > 100
  > 
  > However directory /kern sometimes fails to be read:

 mrg says that the problem is that /kern/rootdev and /kern/rrootdev
 don't work unless the real root device vnodes from /dev are in the
 vnode cache, and thus ls -l in /dev will cause /kern to reappear.

 This worked just now for me on a 5.99.7 box but not on a 4.99.72 box,
 so it may not be the only problem but it's probably at least part of
 the problem.
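
To see which entries are individually reachable while the directory listing is empty, each expected name can be looked up by path. This is a sketch only: probe_names is a hypothetical helper, and the names to pass are taken from the working listing earlier in this report.

```sh
#!/bin/sh
# Hypothetical helper for illustration; not part of NetBSD.

# probe_names DIR NAME...: for each NAME print "ok NAME" if DIR/NAME can
# be looked up by path with ls -ld, or "missing NAME" if it cannot.
probe_names() {
    dir=$1
    shift
    for name in "$@"; do
        if ls -ld "$dir/$name" >/dev/null 2>&1; then
            echo "ok $name"
        else
            echo "missing $name"
        fi
    done
}
```

For example, `probe_names /kern boottime hz rootdev rrootdev`. If only rootdev and rrootdev are reported missing, that would be consistent with the explanation above, though it would not prove it.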

 -- 
 David A. Holland
 dholland@netbsd.org

From: Makoto Fujiwara <makoto@ki.nu>
To: David Holland <dholland-bugs@NetBSD.org>
Cc: kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org, netbsd-bugs@NetBSD.org,
        dlagno@rambler.ru
Subject: Re: kern/38265: sometimes /kern directory can not be read
Date: Mon, 16 Feb 2009 11:58:18 +0900

 Hi, I have a machine running 4.99.72, and when I looked at that
 machine, it did NOT have anything in /kern.

 Then I did
    ls -l /dev
 and everything seemed to be back. Then I ran rsync:

    rsync -aH /mounf/NFSserver/home/ /export/home/

 After about 7 minutes, /kern was empty again. Then I did
    ls -l /dev/wd0a
 (wd0a is the root device), and /kern seemed to be back,
 but rrootdev was missing. So I did
    ls -l /dev/rwd0a
 and now I get everything.
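
The two ls steps above can be combined into one workaround command. This is a sketch, not a fix: refresh_root_nodes is a hypothetical helper, wd0a is the root partition on this particular machine, and the "r" prefix for the raw node follows the wd0a/rwd0a pair used above.

```sh
#!/bin/sh
# Hypothetical helper for illustration; not part of NetBSD.

# refresh_root_nodes DEVDIR PART: stat both the block node (PART) and the
# raw node (rPART) under DEVDIR. Returns nonzero if either node is absent.
refresh_root_nodes() {
    ls -l "$1/$2" "$1/r$2" >/dev/null 2>&1
}
```

On the machine described here this would be run as `refresh_root_nodes /dev wd0a && ls /kern`; other systems need their own root partition name.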

 ttyp2:makoto@mini 11:45:07/090216(/dev)> ls -l /kern
 total 15
 -r--r--r--  1 root  wheel        11 Feb 14 18:06 boottime
 -r--r--r--  1 root  wheel       264 Feb 16 11:55 copyright
 -rw-r--r--  1 root  wheel         5 Feb 16 11:55 hostname
 -r--r--r--  1 root  wheel         4 Feb 16 11:55 hz
 -r--r--r--  1 root  wheel        16 Feb 16 11:55 loadavg
 -r--r--r--  1 root  wheel      8176 Feb 16 11:55 msgbuf
 -r--r--r--  1 root  wheel         5 Feb 16 11:55 pagesize
 -r--r--r--  1 root  wheel         7 Feb 16 11:55 physmem
 brw-r-----  1 root  operator  10, 0 Sep 30  2006 rootdev
 crw-r-----  1 root  operator  30, 0 Sep 30  2006 rrootdev
 -r--r--r--  1 root  wheel        18 Feb 16 11:55 time
 -r--r--r--  1 root  wheel       125 Feb 16 11:55 version

 NetBSD mini 4.99.72 NetBSD 4.99.72 (GENERIC) #1: 
 Tue Nov 11 23:05:54 JST 2008  
 root@bologna:/export/20080908/src/sys/arch/macppc/compile/GENERIC macppc

 Thanks,
  Makoto Fujiwara
