NetBSD Problem Report #45305

From www@NetBSD.org  Mon Aug 29 08:25:34 2011
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by www.NetBSD.org (Postfix) with ESMTP id 2644C63C789
	for <gnats-bugs@gnats.NetBSD.org>; Mon, 29 Aug 2011 08:25:34 +0000 (UTC)
Message-Id: <20110829082533.384B763C0E2@www.NetBSD.org>
Date: Mon, 29 Aug 2011 08:25:33 +0000 (UTC)
From: kretschm@cs.uni-bonn.de
Reply-To: kretschm@cs.uni-bonn.de
To: gnats-bugs@NetBSD.org
Subject: umount says device busy without any process having current directory in the mount or file open
X-Send-Pr-Version: www-1.0

>Number:         45305
>Category:       kern
>Synopsis:       umount says device busy without any process having current directory in the mount or file open
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Aug 29 08:30:00 +0000 2011
>Last-Modified:  Tue Oct 04 08:45:01 +0000 2011
>Originator:     Matthias Kretschmer
>Release:        NetBSD 5.1_STABLE (July, 27 2011)
>Organization:
>Environment:
NetBSD blavet.barock.local 5.1_STABLE NetBSD 5.1_STABLE (BLAVET) #0: Sun Aug 28 09:06:28 CEST 2011 root@telemann.barock.local:/usr/src/sys/arch/macppc/compile/obj/BLAVET macppc
>Description:
I'm having trouble unmounting file systems.  So far this has occurred with null and NFS mounts.  To ensure that no file is open on the mount and that no process has it as its current working directory, I killed every process except init and login, performed a clean login as root, and then tried to unmount it, without success ("Device busy").  Sending SIGHUP to init didn't help either.  So far I have only been able to unmount the file system by rebooting.

I have not found out so far why this happens.  It seems to happen only after a lot of file access, creation, or modification on the mounts.  It happened to the mounts of a file system I chroot to for building packages.  I use three mounts: /netboot, an NFS mount from a NetBSD 5.1_STABLE machine; /netboot/tmp, which is a null mount of /tmp; and /netboot/wrkobjdir, which is a null mount of a local ffs.  So far I have had the problem with the NFS mount and with /netboot/wrkobjdir after compiling multiple packages.  Small packages don't seem to cause the problem.  As mentioned above, I chroot to /netboot and then start compiling packages from pkgsrc.  /netboot/wrkobjdir is both written to and read from.  /netboot is primarily read from, as WRKOBJDIR is /netboot/wrkobjdir.  As it has already happened to both mount points, I do not think that heavy write access is required for this to happen.

The kernel "BLAVET" I use on this box is just a GENERIC kernel that includes the FFS_EI option and is stripped of all devices not available on the iBook it runs on.
>How-To-Repeat:
A lot of file system access seems to cause this problem.  It seems that heavy writing is not required, as it already happened at a point where I had only written a new pkgchk.conf on the not-unmountable file system and did not need any tarball.
>Fix:

>Audit-Trail:
From: matthew green <mrg@eterna.com.au>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
    netbsd-bugs@netbsd.org
Subject: re: kern/45305: umount says device busy without any process having current directory in the mount or file open
Date: Mon, 29 Aug 2011 20:22:20 +1000

 what does "fstat" say in this case?  it should report any thing that
 is showing an open filesystem.  also, for the killed everything case,
 what processes are running?

 also, what about "mount -a"?  fstat doesn't report nullfs mounts that
 keep a filesystem busy.

 this should help track down what is holding the fs open.


 .mrg.

From: Matthias Kretschmer <kretschm@cs.uni-bonn.de>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org
Subject: Re: kern/45305: umount says device busy without any process having current directory in the mount or file open
Date: Mon, 29 Aug 2011 13:35:46 +0200

 Hello,


 On Mon, Aug 29, 2011 at 10:25:02AM +0000, matthew green wrote:
 >  what does "fstat" say in this case?  it should report any thing that
 >  is showing an open filesystem.  also, for the killed everything case,
 >  what processes are running?

 fstat: the only listed stuff is related to / and no other mount point

 Running processes: init, one login, one /bin/sh (fresh root login).
 If I only have those processes running, I would assume that I'm able
 to unmount everything except /?


 >  also, what about "mount -a"?  fstat doesn't report nullfs mounts that
 >  keep a filesystem busy.

 As I needed to use the computer in question, I first have to get it into
 this state again.  I'll try mount -a and then report, but I currently
 don't understand why that should give any insight.  The problem mounts
 are listed as noauto mounts in fstab, and thus I assume mount -a should
 not do anything about them.

 The last time, the nfs-mount got stuck.  The two null-mounts were
 unmounted successfully.


 --
 Matthias Kretschmer

From: Matthias Kretschmer <kretschm@cs.uni-bonn.de>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org
Subject: Re: kern/45305: umount says device busy without any process having current directory in the mount or file open
Date: Tue, 6 Sep 2011 07:21:46 +0200

 Hello,

 it got it again ...

 On Mon, Aug 29, 2011 at 10:25:02AM +0000, matthew green wrote:
 >  what does "fstat" say in this case?  it should report any thing that
 >  is showing an open filesystem.  also, for the killed everything case,
 >  what processes are running?

 # fstat
 USER     CMD          PID   FD MOUNT       INUM MODE         SZ|DV R/W
 root     fstat      18085   wd /          31885 drwxr-xr-x     512 r 
 root     fstat      18085    0 /          53128 crw-------   ttyE0 rw
 root     fstat      18085    1 /          31999 -rw-r--r--       0 w 
 root     fstat      18085    2 /          53128 crw-------   ttyE0 rw
 root     fstat      18085    3 /          53828 crw-r-----     mem r 
 root     fstat      18085    4 /          53827 crw-r-----    kmem r 
 root     fstat      18085    5 /          53826 crw-r-----    drum r 
 root     fstat      18085    6 /            377 -rw-------   40960 r 
 root     sh          4953   wd /          31885 drwxr-xr-x     512 r 
 root     sh          4953    0 /          53128 crw-------   ttyE0 rw
 root     sh          4953    1 /          53128 crw-------   ttyE0 rw
 root     sh          4953    2 /          53128 crw-------   ttyE0 rw
 root     sh          4953  127 /          53833 crw-rw-rw-     tty rw
 root     login      23308   wd /          31885 drwxr-xr-x     512 r 
 root     login      23308    0 /          53128 crw-------   ttyE0 rw
 root     login      23308    1 /          53128 crw-------   ttyE0 rw
 root     login      23308    2 /          53128 crw-------   ttyE0 rw
 root     init           1   wd /              2 drwxr-xr-x     512 r 
 root     system         0   wd /              2 drwxr-xr-x     512 r 

 # ps ax
   PID TTY   STAT     TIME COMMAND
     0 ?     DKl  10:46.21 [system]
     1 ?     Is    0:00.71 init 
  4953 ttyE0 S     0:02.82 -sh 
 23308 ttyE0 Is    0:00.54 login 
 25680 ttyE0 O+    0:00.01 ps -ax 

 # umount /netboot
 umount: /netboot: Device busy


 So as you can see, there is no process running which has any access
 to /netboot, and yet it cannot be unmounted.
 The corresponding /netboot entry from fstab:
 <host>:/home/macppc/root  /netboot  nfs  rw,noauto


 --
 Matthias Kretschmer

From: christos@zoulas.com (Christos Zoulas)
To: Matthias Kretschmer <kretschm@cs.uni-bonn.de>, gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, 
	netbsd-bugs@netbsd.org
Subject: Re: kern/45305: umount says device busy without any process having current directory in the mount or file open
Date: Tue, 6 Sep 2011 07:01:56 -0400

 On Sep 6,  7:21am, kretschm@cs.uni-bonn.de (Matthias Kretschmer) wrote:
 -- Subject: Re: kern/45305: umount says device busy without any process havin

 | Hello,
 | 
 | it got it again ...

 1. fstat will not show unix sockets open in the filesystem. I have updated
 fstat on head to mention this, as well as to print the pathnames for open
 sockets when no arguments are given.
 2. Can you compile and run lsof from pkgsrc? what does that say?

 christos

From: Matthias Kretschmer <kretschm@cs.uni-bonn.de>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org
Subject: Re: kern/45305: umount says device busy without any process having current directory in the mount or file open
Date: Wed, 7 Sep 2011 08:43:03 +0200

 On Tue, Sep 06, 2011 at 11:05:06AM +0000, Christos Zoulas wrote:
 >  1. fstat will not show unix sockets open in the filesystem. I have updated
 >  fstat on head to mention this, as well as to print the pathnames for open
 >  sockets when no arguments are given.

 Is the information of the more modern fstat important?  Then I could try
 to update, but due to space limitations on my iBook where I have the
 problem I would prefer to avoid that.

 >  2. Can you compile and run lsof from pkgsrc? what does that say?

 See the output below.


 I'm quite sure I can reproduce this now for the NFS mount point.  The
 steps are as follows:
 - mount /netboot (nfs), /netboot/tmp (nullfs from tmpfs),
   /netboot/wrkobjdir (nullfs from ffs)
 - chroot to /netboot (/netboot contains a complete macppc tree which is
   used to netboot from)
 - perform in the chroot
     # pkg_delete -r xulrunner
     # packageall (some script which ensures that only the binary
                   packages of installed packages are available)
     # pkg_chk -a -s
       (building devel/xulrunner & www/firefox from HEAD)
     # packageall (again ...)
 - outside the chroot I run two scripts pkg_sync & pkg_markauto which
   ensure that all binary packages from the chroot and only those are
   installed in / and copy the automatic installed information from the
   chroot.

 So the interaction with the /netboot-NFS-mount-point is basically
 removing xulrunner and firefox packages, installing them again, and
 creating a few (5) intermediate files with lists of installed packages
 and listings of tar-balls available at /usr/pkgsrc/packages/All.


 I had the same problem with /netboot/wrkobjdir, but it happened only after
 compiling 100+ packages, and I prefer not to do that at the moment.


 The requested outputs of lsof and mount/umount (I am not reposting the
 fstat and ps ax output, as neither has changed):

 # lsof | grep netboot
 # lsof > lsof.txt
 # cat lsof.txt
 COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
 system      0 root  cwd   VDIR   10,0      512      2 /
 init        1 root  cwd   VDIR   10,0      512      2 /
 init        1 root  txt   VREG   10,0    28026  31919 /sbin/init
 init        1 root  txt   VREG   10,0  1305621  21252 /lib/libc.so.12.164
 init        1 root  txt   VREG   10,0    29811  21255 /lib/libcrypt.so.0.2
 init        1 root  txt   VREG   10,0    93732  21288 /lib/libutil.so.7.15
 init        1 root  txt   VREG   10,0    79436  31882 /libexec/ld.elf_so
 login   11125 root  cwd   VDIR   10,0      512  31885 /root
 login   11125 root  txt   VREG   10,4    22540 569731 /usr/bin/login
 login   11125 root  txt   VREG   10,4    11032 114285 /usr/lib/security/pam_lastlog.so.1
 login   11125 root  txt   VREG   10,4    10212 114286 /usr/lib/security/pam_login_access.so.1
 login   11125 root  txt   VREG   10,4     7504 114292 /usr/lib/security/pam_securetty.so.1
 login   11125 root  txt   VREG   10,4    67132 114165 /usr/lib/librpcsvc.so.0.0
 login   11125 root  txt   VREG   10,4    15548 114296 /usr/lib/security/pam_unix.so.1
 login   11125 root  txt   VREG   10,4    23653 114090 /usr/lib/libkafs.so.9.0
 login   11125 root  txt   VREG   10,4     7742 114275 /usr/lib/security/pam_afslog.so.1
 login   11125 root  txt   VREG   10,4   295015 114063 /usr/lib/libhx509.so.2.0
 login   11125 root  txt   VREG   10,0  1982775  21258 /lib/libcrypto.so.4.2
 login   11125 root  txt   VREG   10,4     8405 114018 /usr/lib/libcom_err.so.5.0
 login   11125 root  txt   VREG   10,4    75627 114162 /usr/lib/libroken.so.13.0
 login   11125 root  txt   VREG   10,4   550706 113998 /usr/lib/libasn1.so.7.0
 login   11125 root  txt   VREG   10,4   479989 114093 /usr/lib/libkrb5.so.22.0
 login   11125 root  txt   VREG   10,4    22511 114283 /usr/lib/security/pam_krb5.so.1
 login   11125 root  txt   VREG   10,4    30129 114237 /usr/lib/libskey.so.1.0
 login   11125 root  txt   VREG   10,4     6941 114294 /usr/lib/security/pam_skey.so.1
 login   11125 root  txt   VREG   10,4     7716 114287 /usr/lib/security/pam_nologin.so.1
 login   11125 root  txt   VREG   10,4     6134 114293 /usr/lib/security/pam_self.so.1
 login   11125 root  txt   VREG   10,0  1305621  21252 /lib/libc.so.12.164
 login   11125 root  txt   VREG   10,4    37846 114129 /usr/lib/libpam.so.1.0
 login   11125 root  txt   VREG   10,0    29811  21255 /lib/libcrypt.so.0.2
 login   11125 root  txt   VREG   10,0    93732  21288 /lib/libutil.so.7.15
 login   11125 root  txt   VREG   10,0    79436  31882 /libexec/ld.elf_so
 login   11125 root    0u  VCHR   35,0  0t17908  53128 /dev/ttyE0
 login   11125 root    1u  VCHR   35,0  0t17908  53128 /dev/ttyE0
 login   11125 root    2u  VCHR   35,0  0t17908  53128 /dev/ttyE0
 lsof    18029 root  cwd   VDIR   10,0      512  32000 /root/pr-infos
 lsof    18029 root  txt   VREG   10,4   129888 231925 /usr/pkg/sbin/lsof
 lsof    18029 root  txt   VREG   10,0  1305621  21252 /lib/libc.so.12.164
 lsof    18029 root  txt   VREG   10,0    34607  21270 /lib/libkvm.so.5.3
 lsof    18029 root  txt   VREG   10,0    79436  31882 /libexec/ld.elf_so
 lsof    18029 root    0u  VCHR   35,0  0t17908  53128 /dev/ttyE0
 lsof    18029 root    1w  VREG   10,0        0  32002 / (/dev/wd0a)
 lsof    18029 root    2u  VCHR   35,0  0t17908  53128 /dev/ttyE0
 lsof    18029 root    3r  VCHR    2,0      0t0  53828 /dev/mem
 lsof    18029 root    4r  VCHR    2,1      0t0  53827 /dev/kmem
 lsof    18029 root    5r  VCHR    6,0      0t0  53826 /dev/drum
 lsof    18029 root    6r  VCHR   74,0      0t0  53832 /dev/ksyms
 sh      18323 root  cwd   VDIR   10,0      512  32000 /root/pr-infos
 sh      18323 root  txt   VREG   10,0   168953  42526 /bin/sh
 sh      18323 root  txt   VREG   10,0  1305621  21252 /lib/libc.so.12.164
 sh      18323 root  txt   VREG   10,0    15179  21282 /lib/libtermcap.so.0.6
 sh      18323 root  txt   VREG   10,0   147621  21261 /lib/libedit.so.2.11
 sh      18323 root  txt   VREG   10,0    79436  31882 /libexec/ld.elf_so
 sh      18323 root    0u  VCHR   35,0  0t17908  53128 /dev/ttyE0
 sh      18323 root    1u  VCHR   35,0  0t17908  53128 /dev/ttyE0
 sh      18323 root    2u  VCHR   35,0  0t17908  53128 /dev/ttyE0
 sh      18323 root  127u  VCHR    1,0      0t0  53833 /dev/tty

 # cat /proc/mounts
 /dev/wd0a / ffs rw 0 0
 /dev/wd0g /var ffs rw 0 0
 /dev/wd0e /usr ffs rw 0 0
 /dev/cgd0c /home ffs rw 0 0
 tmpfs /tmp tmpfs rw 0 0
 kernfs /kern kernfs rw 0 0
 ptyfs /dev/pts ptyfs rw 0 0
 procfs /proc proc rw 0 0
 <host>:/home/macppc/root /netboot nfs rw 0 0

 # umount -F /netboot
 /home/macppc/root: unmount from /netboot
 umount: /netboot: Device busy.
 # umount -f /netboot
 #


 --
 Matthias Kretschmer

From: christos@zoulas.com (Christos Zoulas)
To: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org, 
	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, kretschm@cs.uni-bonn.de
Cc: 
Subject: Re: kern/45305: umount says device busy without any process having current directory in the mount or file open
Date: Wed, 7 Sep 2011 10:23:58 -0400

 On Sep 7,  6:45am, kretschm@cs.uni-bonn.de (Matthias Kretschmer) wrote:
 -- Subject: Re: kern/45305: umount says device busy without any process havin

 | The following reply was made to PR kern/45305; it has been noted by GNATS.
 | 
 | From: Matthias Kretschmer <kretschm@cs.uni-bonn.de>
 | To: gnats-bugs@NetBSD.org
 | Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
 | 	netbsd-bugs@netbsd.org
 | Subject: Re: kern/45305: umount says device busy without any process having current directory in the mount or file open
 | Date: Wed, 7 Sep 2011 08:43:03 +0200
 | 
 |  On Tue, Sep 06, 2011 at 11:05:06AM +0000, Christos Zoulas wrote:
 |  >  1. fstat will not show unix sockets open in the filesystem. I have updated
 |  >  fstat on head to mention this, as well as to print the pathnames for open
 |  >  sockets when no arguments are given.
 |  
 |  Is the information of the more modern fstat important?  Then I could try
 |  to update, but due to space limitations on my iBook where I have the
 |  problem I would prefer to avoid that.

 No, it is not important. The output of lsof should be including all fd's and
 I don't see anything. I will try to replicate the problem.

 christos

From: Matthias Kretschmer <kretschm@cs.uni-bonn.de>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org
Subject: Re: kern/45305: umount says device busy without any process having current directory in the mount or file open
Date: Tue, 27 Sep 2011 07:35:42 +0200

 Hi,

 I just want to add, that now I have data to the problem when even
 the nullfs is not unmountable (or even both nullfs and nfs) ...

 # lsof | grep netboot
 # lsof /netboot/wrkobjdir
 # umount /netboot/wrkobjdir
 umount: /netboot/wrkobjdir: Device busy
 # mount
 /dev/wd0a on / type ffs (local)
 /dev/wd0g on /var type ffs (local)
 /dev/wd0e on /usr type ffs (local)
 /dev/cgd0c on /home type ffs (local)
 tmpfs on /tmp type tmpfs (local)
 kernfs on /kern type kernfs (local)
 ptyfs on /dev/pts type ptyfs (local)
 procfs on /proc type procfs (local)
 pid231@blavet:/net on /net type nfs (hidden)
 pid231@blavet:/media on /media type nfs (hidden)
 nfshost:/home/macppc/root on /netboot type nfs
 /home/wrkobjdir on /netboot/wrkobjdir type null (local)
 # umount -f /netboot/wrkobjdir
 # umount /netboot
 umount: /netboot: Device busy
 # umount -f /netboot
 # 

 --
 Matthias Kretschmer

From: christos@zoulas.com (Christos Zoulas)
To: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org, 
	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, kretschm@cs.uni-bonn.de
Cc: 
Subject: Re: kern/45305: umount says device busy without any process having current directory in the mount or file open
Date: Tue, 27 Sep 2011 09:47:15 -0400

 On Sep 27,  5:40am, kretschm@cs.uni-bonn.de (Matthias Kretschmer) wrote:
 -- Subject: Re: kern/45305: umount says device busy without any process havin

 | The following reply was made to PR kern/45305; it has been noted by GNATS.
 | 
 | From: Matthias Kretschmer <kretschm@cs.uni-bonn.de>
 | To: gnats-bugs@NetBSD.org
 | Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
 | 	netbsd-bugs@netbsd.org
 | Subject: Re: kern/45305: umount says device busy without any process having current directory in the mount or file open
 | Date: Tue, 27 Sep 2011 07:35:42 +0200
 | 
 |  Hi,
 |  
 |  I just want to add, that now I have data to the problem when even
 |  the nullfs is not unmountable (or even both nullfs and nfs) ...
 |  
 |  # lsof | grep netboot
 |  # lsof /netboot/wrkobjdir
 |  # umount /netboot/wrkobjdir
 |  umount: /netboot/wrkobjdir: Device busy
 |  # mount
 |  /dev/wd0a on / type ffs (local)
 |  /dev/wd0g on /var type ffs (local)
 |  /dev/wd0e on /usr type ffs (local)
 |  /dev/cgd0c on /home type ffs (local)
 |  tmpfs on /tmp type tmpfs (local)
 |  kernfs on /kern type kernfs (local)
 |  ptyfs on /dev/pts type ptyfs (local)
 |  procfs on /proc type procfs (local)
 |  pid231@blavet:/net on /net type nfs (hidden)
 |  pid231@blavet:/media on /media type nfs (hidden)
 |  nfshost:/home/macppc/root on /netboot type nfs
 |  /home/wrkobjdir on /netboot/wrkobjdir type null (local)
 |  # umount -f /netboot/wrkobjdir
 |  # umount /netboot
 |  umount: /netboot: Device busy
 |  # umount -f /netboot
 |  # 

 Do you have a recipe for recreating the problem?

 christos

From: Matthias Kretschmer <kretschm@cs.uni-bonn.de>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org
Subject: Re: kern/45305: umount says device busy without any process having current directory in the mount or file open
Date: Tue, 27 Sep 2011 16:07:38 +0200

 Hi

 On Tue, Sep 27, 2011 at 01:50:04PM +0000, Christos Zoulas wrote:
 >  Do you have a recipe for recreating the problem?

 no recipe which is quite worth anything.  I'll stumble upon the problem,
 when I update large packages.  I perform compiling of packages inside a
 chroot to not have to touch the current installation.  As this iBook
 with G3 600MHz (where I have this problem) is quite slow and has not
 much disk-space, I perform compiling of packages via a NetBSD/macppc
 installation (same userland) hosted on another box which is mounted to
 /netboot (basically it is the same installation which is netboot-able).
 To speed up compilations I mount_null a /netboot/tmp and
 /netboot/wrkobjdir and set PKGSRC_COMPILER to ccache distcc gcc, where
 the only DISTCC_HOST is the nfs-host.  So only preprocessing is done
 locally.

 The problem seems to always occur when I compile www/firefox (including
 devel/xulrunner) from pkgsrc HEAD (so last time version 6.0.2).  The
 last time, when even the nullfs didn't want to unmount without -f, I was
 additionally building chat/ircII.  At the moment I do not have a clue if
 creating, opening, writing, reading, or chdir'ing causes this problem
 (or a probably parallel combination of all of them).  Even though
 writing a test program wouldn't be much work, I have not found the time
 to write it...

 Are there any counters of open/etc. files for a mount-point?  Am I able
 to output these somehow?  As there is no record of anything accessing
 the corresponding mount-points I'm wondering why the systems thinks it
 is busy.

 --
 Matthias Kretschmer
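
The test program mentioned in the message above could look roughly like the following. This is only a minimal sketch, not something posted in this PR: the names stress_worker and stress_dir, the worker count, and the iteration count are all hypothetical, and nothing here is known to trigger the bug. It forks several workers that each chdir into a private subdirectory and then create, write, read back, and unlink files in parallel.

```c
#define _POSIX_C_SOURCE 200809L

#include <sys/types.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * One worker (hypothetical helper): chdir into a private subdirectory of
 * base, then repeatedly create, write, read back, and unlink a file there.
 * Returns 0 on success, 1 on the first failing call.
 */
static int
stress_worker(const char *base, int id, int iterations)
{
	char sub[32], name[32], out[256], in[256];
	int i, fd, ok;

	snprintf(sub, sizeof(sub), "w%d", id);
	if (chdir(base) == -1 || mkdir(sub, 0700) == -1 || chdir(sub) == -1)
		return 1;
	memset(out, 'x', sizeof(out));
	for (i = 0; i < iterations; i++) {
		snprintf(name, sizeof(name), "f%d", i);
		fd = open(name, O_CREAT | O_TRUNC | O_RDWR, 0600);
		if (fd == -1)
			return 1;
		ok = write(fd, out, sizeof(out)) == (ssize_t)sizeof(out) &&
		    lseek(fd, 0, SEEK_SET) == 0 &&
		    read(fd, in, sizeof(in)) == (ssize_t)sizeof(in) &&
		    memcmp(in, out, sizeof(in)) == 0;
		close(fd);
		if (!ok || unlink(name) == -1)
			return 1;
	}
	/* Step out of the subdirectory before removing it. */
	if (chdir("..") == -1 || rmdir(sub) == -1)
		return 1;
	return 0;
}

/*
 * Fork nworkers children that exercise base in parallel and wait for all
 * of them.  Returns 0 if every worker succeeded, -1 otherwise.
 */
int
stress_dir(const char *base, int nworkers, int iterations)
{
	int i, status, forked = 0, rc = 0;
	pid_t pid;

	for (i = 0; i < nworkers; i++) {
		pid = fork();
		if (pid == -1) {
			rc = -1;
			break;
		}
		if (pid == 0)
			_exit(stress_worker(base, i, iterations));
		forked++;
	}
	while (forked-- > 0) {
		if (wait(&status) == -1 || !WIFEXITED(status) ||
		    WEXITSTATUS(status) != 0)
			rc = -1;
	}
	return rc;
}
```

A small main() would call, for example, stress_dir() on an existing scratch directory inside the affected mount, and umount would be attempted once it returns. All workers have exited by then, so any remaining "Device busy" could not come from a process still sitting in the directory.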

From: christos@zoulas.com (Christos Zoulas)
To: Matthias Kretschmer <kretschm@cs.uni-bonn.de>, gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, 
	netbsd-bugs@netbsd.org
Subject: Re: kern/45305: umount says device busy without any process having current directory in the mount or file open
Date: Tue, 27 Sep 2011 10:15:51 -0400

 On Sep 27,  4:07pm, kretschm@cs.uni-bonn.de (Matthias Kretschmer) wrote:
 -- Subject: Re: kern/45305: umount says device busy without any process havin

 | Hi
 | 
 | On Tue, Sep 27, 2011 at 01:50:04PM +0000, Christos Zoulas wrote:
 | >  Do you have a recipe for recreating the problem?
 | 
 | no recipe which is quite worth anything.  I'll stumble upon the problem,
 | when I update large packages.  I perform compiling of packages inside a
 | chroot to not have to touch the current installation.  As this iBook
 | with G3 600MHz (where I have this problem) is quite slow and has not
 | much disk-space, I perform compiling of packages via a NetBSD/macppc
 | installation (same userland) hosted on another box which is mounted to
 | /netboot (basically it is the same installation which is netboot-able).
 | To speed up compilations I mount_null a /netboot/tmp and
 | /netboot/wrkobjdir and set PKGSRC_COMPILER to ccache distcc gcc, where
 | the only DISTCC_HOST is the nfs-host.  So only preprocessing is done
 | locally.
 | 
 | The problem seems to always occur when I compile www/firefox (including
 | devel/xulrunner) from pkgsrc HEAD (so last time version 6.0.2).  The
 | last time, when even the nullfs didn't want to unmount without -f, I was
 | additionally building chat/ircII.  At the moment I do not have a clue if
 | creating, opening, writing, reading, or chdir'ing causes this problem
 | (or a probably parallel combination of all of them).  Even though
 | writing a test program wouldn't be much work, I have not found the time
 | to write it...
 | 
 | Are there any counters of open/etc. files for a mount-point?  Am I able
 | to output these somehow?  As there is no record of anything accessing
 | the corresponding mount-points I'm wondering why the systems thinks it
 | is busy.

 I would start adding printfs in the unmount path in vfs_mount.c following
 them down to the one that returns EBUSY...

 christos
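
The printf approach suggested above can be illustrated with a small user-space model. This is a hedged sketch only: TRACE_RETURN, trace_return, check_refs, and model_unmount are hypothetical names, and none of this is taken from vfs_mount.c. The idea is that every point that may return an error reports its function and line, so the first place producing EBUSY becomes visible in the log. In the kernel, the kernel's own printf would be used in place of fprintf.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical helper: log where a non-zero error is being returned. */
#define TRACE_RETURN(err) trace_return((err), __func__, __LINE__)

static int
trace_return(int err, const char *func, int line)
{
	if (err != 0)
		fprintf(stderr, "%s:%d: returning error %d\n", func, line, err);
	return err;
}

/* Stand-in for one check on the unmount path: busy while refs remain. */
static int
check_refs(int refs)
{
	if (refs > 0)
		return TRACE_RETURN(EBUSY);
	return 0;
}

/*
 * Model of an unmount routine: a forced unmount skips the busy check,
 * an ordinary one propagates (and traces) the error from the check.
 */
int
model_unmount(int refs, int forced)
{
	int error;

	if (!forced) {
		error = check_refs(refs);
		if (error != 0)
			return TRACE_RETURN(error);
	}
	return 0;
}
```

With one leftover reference, model_unmount(1, 0) prints two trace lines, the innermost one first, and returns EBUSY, while model_unmount(1, 1) succeeds. This matches the behaviour seen in this PR, where umount fails and umount -f works.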

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/45305: umount says device busy without any process having
 current directory in the mount or file open
Date: Thu, 29 Sep 2011 00:29:43 +0000

 On Tue, Sep 27, 2011 at 02:10:06PM +0000, Matthias Kretschmer wrote:
  >  Are there any counters of open/etc. files for a mount-point?  Am I able
  >  to output these somehow?  As there is no record of anything accessing
  >  the corresponding mount-points I'm wondering why the systems thinks it
  >  is busy.

 The usual cause of this problem is refcount leaks on vnodes, which are
 a royal pain to track down. Figuring out a sequence of actions that
 causes it makes it much easier, as it allows restricting the search to
 a particular code path.

 I've been thinking about writing vnode refdebug code (similar in
 concept to LOCKDEBUG) but it's not entirely trivial...

 -- 
 David A. Holland
 dholland@netbsd.org
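
To illustrate the kind of reference tracking described above, here is a toy user-space model. It is a sketch under stated assumptions, not the proposed vnode refdebug code: struct tracked, tracked_ref, tracked_rele, and tracked_unmount are hypothetical names, and a real vnode is far more involved. Each reference records who took it, so that a leaked reference can be attributed to a holder when the unmount check fails.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define MAX_HOLDERS 16

/* Toy stand-in for a reference-counted object such as a vnode. */
struct tracked {
	int refcount;
	const char *holders[MAX_HOLDERS];	/* who took each live reference */
};

/* Take a reference and remember who took it; ENOMEM if the table is full. */
int
tracked_ref(struct tracked *t, const char *who)
{
	int i;

	for (i = 0; i < MAX_HOLDERS; i++) {
		if (t->holders[i] == NULL) {
			t->holders[i] = who;
			t->refcount++;
			return 0;
		}
	}
	return ENOMEM;
}

/* Drop one reference taken by who; ENOENT if who holds none. */
int
tracked_rele(struct tracked *t, const char *who)
{
	int i;

	for (i = 0; i < MAX_HOLDERS; i++) {
		if (t->holders[i] != NULL && strcmp(t->holders[i], who) == 0) {
			t->holders[i] = NULL;
			t->refcount--;
			return 0;
		}
	}
	return ENOENT;
}

/* Model of the unmount check: refuse while references remain, name holders. */
int
tracked_unmount(const struct tracked *t)
{
	int i;

	if (t->refcount == 0)
		return 0;
	for (i = 0; i < MAX_HOLDERS; i++) {
		if (t->holders[i] != NULL)
			fprintf(stderr, "reference still held by %s\n",
			    t->holders[i]);
	}
	return EBUSY;
}
```

If a code path calls tracked_ref() without a matching tracked_rele(), tracked_unmount() keeps returning EBUSY even though no process is using the object, which is the symptom reported here. The recorded holder name is what narrows the search to one code path.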

From: Bernd Ernesti <pr200904@veego.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/45305: umount says device busy without any process having current directory in the mount or file open
Date: Tue, 4 Oct 2011 10:43:32 +0200

 On Tue, Sep 27, 2011 at 04:07:38PM +0200, Matthias Kretschmer wrote:
 [..]

 > The problem seems to always occur when I compile www/firefox (including
 > devel/xulrunner) from pkgsrc HEAD (so last time version 6.0.2).  The
 > last time, when even the nullfs didn't want to unmount without -f, I was
 > additionally building chat/ircII.  At the moment I do not have a clue if
 > creating, opening, writing, reading, or chdir'ing causes this problem
 > (or a probably parallel combination of all of them).  Even though
 > writing a test program wouldn't be much work, I have not found the time
 > to write it...

 I think this could be the same issue as I have with pr kern/40472.

 Bernd
