NetBSD Problem Report #44722

From www@NetBSD.org  Mon Mar 14 15:05:24 2011
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by www.NetBSD.org (Postfix) with ESMTP id 2335C63B92A
	for <gnats-bugs@gnats.NetBSD.org>; Mon, 14 Mar 2011 15:05:24 +0000 (UTC)
Message-Id: <20110314150523.498BF63B874@www.NetBSD.org>
Date: Mon, 14 Mar 2011 15:05:23 +0000 (UTC)
From: pooka@iki.fi
Reply-To: pooka@iki.fi
To: gnats-bugs@NetBSD.org
Subject: ls(1) behaves incorrectly with a low descriptor limit
X-Send-Pr-Version: www-1.0

>Number:         44722
>Category:       bin
>Synopsis:       ls(1) behaves incorrectly with a low descriptor limit
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    bin-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Mar 14 15:10:00 +0000 2011
>Last-Modified:  Sun Sep 18 11:00:04 +0000 2011
>Originator:     Antti Kantee
>Release:        NetBSD 5.0
>Organization:
>Environment:
>Description:
With a low file descriptor limit ls lists the contents of subdirectories
instead of the current directory.

(I marked the problem as bin, although it might be in fts.  Feel free
to reclassify.)
>How-To-Repeat:
limit descriptors 5
ls
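
The same condition can be set up from C instead of from the shell.
What follows is a minimal sketch, assuming a POSIX system with
getrlimit(2) and setrlimit(2); the helper name
opendir_errno_under_limit is hypothetical and is not part of ls or
libc.  It lowers the soft descriptor limit, attempts one opendir(3),
and restores the limit.

```c
#include <sys/resource.h>
#include <dirent.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical helper: temporarily set the soft RLIMIT_NOFILE to
 * `limit`, try opendir(path), then restore the previous limit.
 * Returns 0 if opendir() succeeded, the errno left by opendir() if it
 * failed, or -1 if the limit could not be read or changed.
 */
int
opendir_errno_under_limit(rlim_t limit, const char *path)
{
	struct rlimit saved, lowered;
	DIR *dirp;
	int result;

	if (getrlimit(RLIMIT_NOFILE, &saved) == -1)
		return -1;
	lowered = saved;
	lowered.rlim_cur = limit;	/* only the soft limit is touched */
	if (setrlimit(RLIMIT_NOFILE, &lowered) == -1)
		return -1;

	errno = 0;
	dirp = opendir(path);
	result = (dirp == NULL) ? errno : 0;	/* capture before closedir() */
	if (dirp != NULL)
		closedir(dirp);

	/* The soft limit may be raised again up to the unchanged hard limit. */
	(void)setrlimit(RLIMIT_NOFILE, &saved);
	return result;
}
```

With a limit of 0 no new descriptor can be allocated, so the call is
expected to report EMFILE ("Too many open files").  What a limit of 5
yields depends on how many descriptors the process already holds.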
>Fix:

>Audit-Trail:
From: Abhinav Upadhyay <er.abhinav.upadhyay@gmail.com>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: bin/44722
Date: Sat, 17 Sep 2011 23:47:26 +0530

 Not only ls(1) but probably a number of other programs are not behaving
 correctly under low file descriptor limit.
 For example:

 $ man ls
 .: 4: Invalid argument

 The error messages suggest that probably something is going wrong with
 the shell. Probably the shell tries to execute the shell built-in '.'
 (as observed by the above error message).

 It was even more weird to see that ls(1) and man(1) exited with proper
 error messages if I tried to do something like this:

 $ . ls #supply name of any executable file
 .: Cannot execute ELF binary /bin/ls

 $ ls
 ls: .: Too many open files

 $ man sh
 .: Can't open /etc/shrc

 So the bug is probably in the shell?  The above worked with sh(1); I
 also tried the same with bash(1), but it didn't work on bash.

 --
 Abhinav

From: Alan Barrett <apb@cequrux.com>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: bin/44722
Date: Sun, 18 Sep 2011 11:14:48 +0200

 On Sat, 17 Sep 2011, Abhinav Upadhyay wrote:
 > Not only ls(1) but probably a number of other programs are not behaving
 > correctly under low file descriptor limit.
 > For example:
 >
 > $ man ls
 > .: 4: Invalid argument

 I can't replicate that.  I get the following results, with everything
 either working or giving a reasonable error:

 for ENV in '' /etc/shrc ; do
      for PAGER in '' cat more ; do
 	for n in 5 4 3 2 1 ; do
 	    cmd="env - PATH=\\\"\\\$PATH\\\""
 	    cmd="${cmd}${ENV:+ ENV=\"\$ENV\"}"
 	    cmd="${cmd}${PAGER:+ PAGER=\"\$PAGER\"}"
 	    cmd="${cmd} sh -c \\'ulimit -n ${n}\; man ls\\'"
 	    [ -z "$PAGER" ] && cmd="${cmd} \| cat"
 	    eval echo "\$ ${cmd}"
 	    eval eval "${cmd}"
 	done
 	echo
      done
 done

 $ env - PATH="$PATH" sh -c 'ulimit -n 5; man ls' | cat
 [works]
 $ env - PATH="$PATH" sh -c 'ulimit -n 4; man ls' | cat
 [works]
 $ env - PATH="$PATH" sh -c 'ulimit -n 3; man ls' | cat
 Shared object "libutil.so.7" not found
 $ env - PATH="$PATH" sh -c 'ulimit -n 2; man ls' | cat
 Shared object "libutil.so.7" not found
 $ env - PATH="$PATH" sh -c 'ulimit -n 1; man ls' | cat
 Shared object "libutil.so.7" not found

 $ env - PATH="$PATH" PAGER=cat sh -c 'ulimit -n 5; man ls'
 [works]
 $ env - PATH="$PATH" PAGER=cat sh -c 'ulimit -n 4; man ls'
 [works]
 $ env - PATH="$PATH" PAGER=cat sh -c 'ulimit -n 3; man ls'
 Shared object "libutil.so.7" not found
 $ env - PATH="$PATH" PAGER=cat sh -c 'ulimit -n 2; man ls'
 Shared object "libutil.so.7" not found
 $ env - PATH="$PATH" PAGER=cat sh -c 'ulimit -n 1; man ls'
 Shared object "libutil.so.7" not found

 $ env - PATH="$PATH" PAGER=more sh -c 'ulimit -n 5; man ls'
 [works]
 $ env - PATH="$PATH" PAGER=more sh -c 'ulimit -n 4; man ls'
 /usr/share/man/cat1/ls.0: Too many open files
 [error message is from more(1)]
 $ env - PATH="$PATH" PAGER=more sh -c 'ulimit -n 3; man ls'
 Shared object "libutil.so.7" not found
 $ env - PATH="$PATH" PAGER=more sh -c 'ulimit -n 2; man ls'
 Shared object "libutil.so.7" not found
 $ env - PATH="$PATH" PAGER=more sh -c 'ulimit -n 1; man ls'
 Shared object "libutil.so.7" not found

 $ env - PATH="$PATH" ENV=/etc/shrc sh -c 'ulimit -n 5; man ls' | cat
 [works]
 $ env - PATH="$PATH" ENV=/etc/shrc sh -c 'ulimit -n 4; man ls' | cat
 [works]
 $ env - PATH="$PATH" ENV=/etc/shrc sh -c 'ulimit -n 3; man ls' | cat
 Shared object "libutil.so.7" not found
 $ env - PATH="$PATH" ENV=/etc/shrc sh -c 'ulimit -n 2; man ls' | cat
 Shared object "libutil.so.7" not found
 $ env - PATH="$PATH" ENV=/etc/shrc sh -c 'ulimit -n 1; man ls' | cat
 Shared object "libutil.so.7" not found

 $ env - PATH="$PATH" ENV=/etc/shrc PAGER=cat sh -c 'ulimit -n 5; man ls'
 [works]
 $ env - PATH="$PATH" ENV=/etc/shrc PAGER=cat sh -c 'ulimit -n 4; man ls'
 [works]
 $ env - PATH="$PATH" ENV=/etc/shrc PAGER=cat sh -c 'ulimit -n 3; man ls'
 Shared object "libutil.so.7" not found
 $ env - PATH="$PATH" ENV=/etc/shrc PAGER=cat sh -c 'ulimit -n 2; man ls'
 Shared object "libutil.so.7" not found
 $ env - PATH="$PATH" ENV=/etc/shrc PAGER=cat sh -c 'ulimit -n 1; man ls'
 Shared object "libutil.so.7" not found

 $ env - PATH="$PATH" ENV=/etc/shrc PAGER=more sh -c 'ulimit -n 5; man ls'
 [works]
 $ env - PATH="$PATH" ENV=/etc/shrc PAGER=more sh -c 'ulimit -n 4; man ls'
 /usr/share/man/cat1/ls.0: Too many open files
 [error message is from more(1)]
 $ env - PATH="$PATH" ENV=/etc/shrc PAGER=more sh -c 'ulimit -n 3; man ls'
 Shared object "libutil.so.7" not found
 $ env - PATH="$PATH" ENV=/etc/shrc PAGER=more sh -c 'ulimit -n 2; man ls'
 Shared object "libutil.so.7" not found
 $ env - PATH="$PATH" ENV=/etc/shrc PAGER=more sh -c 'ulimit -n 1; man ls'
 Shared object "libutil.so.7" not found

 > The error messages suggest that probably something is going wrong with
 > the shell. Probably the shell tries to execute the shell built-in '.'
 > (as observed by the above error message).

 I can get error messages similar to that from the shell, but not using
 the "man ls" command.  For example:

 $ env - PATH="$PATH" sh -c 'ulimit -n 4 ; . /etc/shrc'
 .: 3: Invalid argument

 The error message could be better, but otherwise the behaviour
 seems reasonable.

 > It was even more weird to see that ls(1) and man(1) exited with proper
 > error messages if I tried to do something like this:
 >
 > $ . ls #supply name of any executable file
 > .: Cannot execute ELF binary /bin/ls
 >
 > $ ls
 > ls: .: Too many open files
 >
 > $ man sh
 > .: Can't open /etc/shrc

 Those error messages seem reasonable to me.

 --apb (Alan Barrett)

From: Alan Barrett <apb@cequrux.com>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: bin/44722
Date: Sun, 18 Sep 2011 11:55:31 +0200

 > With a low file descriptor limit ls lists the contents of
 > subdirectories instead of the current directory.

 I can replicate this.  It's difficult to debug, because if you make
 the file descriptor limit low enough to exhibit the problem, then gdb
 doesn't work properly.

 You can simulate the problem by making the __opendir2 call in line
 654 of fts.c fail on the first call and succeed on the second call, as
 follows:

 cd src/bin/ls ;
 cp ../../lib/libc/gen/fts.c ./fts.c ;
 patch ./fts.c <<'ENDPATCH'
 --- ../../lib/libc/gen/fts.c
 +++ ./fts.c
 @@ -42,7 +42,6 @@
   #endif
   #endif /* LIBC_SCCS and not lint */

 -#include "namespace.h"
   #include <sys/param.h>
   #include <sys/stat.h>

 @@ -651,7 +650,17 @@
   #else
   #define	__opendir2(path, flag) opendir(path)
   #endif
 -	if ((dirp = __opendir2(cur->fts_accpath, oflag)) == NULL) {
 +	{
 +		static int ncalls = 0;
 +
 +		if (++ncalls == 1) {
 +			errno = EMFILE;
 +			dirp = NULL;
 +		} else {
 +			dirp = __opendir2(cur->fts_accpath, oflag);
 +		}
 +	}
 +	if (dirp == NULL) {
   		if (type == BREAD) {
   			cur->fts_info = FTS_DNR;
   			cur->fts_errno = errno;
 ENDPATCH
 patch ./Makefile <<'ENDPATCH'
 --- Makefile
 +++ Makefile
 @@ -2,7 +2,9 @@
   #      @(#)Makefile    8.1 (Berkeley) 6/2/93

   PROG=  ls
 -SRCS=  cmp.c ls.c main.c print.c util.c
 +SRCS=  cmp.c ls.c main.c print.c util.c fts.c
 +
 +COPTS= -O0 -g

   LDADD+=        -lutil
   DPADD+=        ${LIBUTIL}
 ENDPATCH

 Now you can build a modified version of ls, and use gdb to debug it.

 The call to fts_children() in line 417 of the traverse() function in
 ls.c returns NULL.

 The comment in lines 496 to 502 of ls.c says "We ignore the error case
 since it will be replicated on the next call to fts_read()", but in
 fact the error case is not replicated later.

 The test in line 503 of the display() function in ls.c just returns.

 The loop beginning in line 429 of the traverse() function in ls.c
 ends up printing the wrong information.

 It's certainly wrong to assume that the same error will occur twice,
 but it's not yet clear to me whether the bug is in fts_read() or in
 traverse() or in display().

 Perhaps fts_read() should have noticed the "cur->fts_info = FTS_DNR"
 status that was stored by the earlier fts_build() (line 665 of the
 patched fts.c).

 Perhaps traverse() should have noticed the error return from
 fts_children() in line 417 of ls.c, and should have reported the error
 without performing the fts_read() loop in line 429 of ls.c.

 Perhaps display() should have reported the error before returning in
 line 504 of ls.c.

 --apb (Alan Barrett)
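
Whichever of the three places ends up carrying the fix, the caller of
fts_children() has to be able to tell "nothing to list" apart from
"the read failed".  A minimal sketch, not taken from the NetBSD tree:
fts(3) documents that fts_children() returns NULL with errno set to
zero when there are no entries to return, and NULL with errno set
appropriately on error.  The names children_status and
classify_children are hypothetical.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <errno.h>
#include <fts.h>
#include <stddef.h>

enum children_status {
	CHILDREN_OK,	/* *listp points at a non-empty list */
	CHILDREN_EMPTY,	/* nothing to list, no error */
	CHILDREN_ERROR	/* the read failed; errno says why */
};

/*
 * Hypothetical wrapper around fts_children() that makes the two NULL
 * cases explicit instead of treating every NULL as "nothing to do".
 */
enum children_status
classify_children(FTS *ftsp, FTSENT **listp)
{
	errno = 0;	/* defensive: do not trust a stale errno */
	*listp = fts_children(ftsp, 0);
	if (*listp != NULL)
		return CHILDREN_OK;
	return (errno == 0) ? CHILDREN_EMPTY : CHILDREN_ERROR;
}
```

A caller in the style of traverse() could then report the error
(for example with warn(3)) on CHILDREN_ERROR rather than going on to
the fts_read() loop.  Whether that is the right place for the fix is
the open question in this PR, so treat this as one possible shape
only.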

From: Abhinav Upadhyay <er.abhinav.upadhyay@gmail.com>
To: gnats-bugs@netbsd.org
Cc: gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, pooka@iki.fi
Subject: Re: bin/44722
Date: Sun, 18 Sep 2011 16:24:59 +0530

 On Sun, Sep 18, 2011 at 2:45 PM, Alan Barrett <apb@cequrux.com> wrote:

 >  On Sat, 17 Sep 2011, Abhinav Upadhyay wrote:
 >  > Not only ls(1) but probably a number of other programs are not behaving
 >  > correctly under low file descriptor limit.
 >  > For example:
 >  >
 >  > $ man ls
 >  > .: 4: Invalid argument
 >
 >  I can't replicate that.  I get the following results, with everything
 >  either working or giving a reasonable error:
 I also see man(1) working properly if I use the above script. But I
 did something like this:

 $ ulimit -n 5
 $ man ls
 .: 4: Invalid argument
 Fair enough. I didn't perform as thorough a check as you did; most
 probably my environment had something to do with the weird results
 I got.

 >  > It was even more weird to see that ls(1) and man(1) exited with proper
 >  > error messages if I tried to do something like this:
 >  >
 >  > $ . ls #supply name of any executable file
 >  > .: Cannot execute ELF binary /bin/ls
 >  >
 >  > $ ls
 >  > ls: .: Too many open files
 >  >
 >  Those error messages seem reasonable to me.

 My point with the above example was that, if you have a low file
 descriptor limit, then ls(1) lists all the sub-directories, but if you
 execute the above set of commands sequentially, ls(1) does give a sane
 message and exits. Probably man(1) is irrelevant here. I mixed it up.

 --
 Abhinav

Copyright © 1994-2007 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.