NetBSD Problem Report #44722
From www@NetBSD.org Mon Mar 14 15:05:24 2011
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
by www.NetBSD.org (Postfix) with ESMTP id 2335C63B92A
for <gnats-bugs@gnats.NetBSD.org>; Mon, 14 Mar 2011 15:05:24 +0000 (UTC)
Message-Id: <20110314150523.498BF63B874@www.NetBSD.org>
Date: Mon, 14 Mar 2011 15:05:23 +0000 (UTC)
From: pooka@iki.fi
Reply-To: pooka@iki.fi
To: gnats-bugs@NetBSD.org
Subject: ls(1) behaves incorrectly with a low descriptor limit
X-Send-Pr-Version: www-1.0
>Number: 44722
>Category: bin
>Synopsis: ls(1) behaves incorrectly with a low descriptor limit
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: bin-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Mon Mar 14 15:10:00 +0000 2011
>Last-Modified: Sun Sep 18 11:00:04 +0000 2011
>Originator: Antti Kantee
>Release: NetBSD 5.0
>Organization:
>Environment:
>Description:
With a low file descriptor limit ls lists the contents of subdirectories
instead of the current directory.
(I marked the problem as bin, although it might be in fts. Feel free
to reclassify.)
>How-To-Repeat:
limit descriptors 5
ls
>Fix:
>Audit-Trail:
From: Abhinav Upadhyay <er.abhinav.upadhyay@gmail.com>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: bin/44722
Date: Sat, 17 Sep 2011 23:47:26 +0530
Not only ls(1) but probably a number of other programs are not behaving
correctly under a low file descriptor limit.
For example:
$ man ls
.: 4: Invalid argument
The error messages suggest that probably something is going wrong with
the shell. Probably the shell tries to execute the shell built-in '.'
(as observed by the above error message).
It was even more weird to see that ls(1) and man(1) exited with proper
error messages if I tried to do something like this:
$ . ls #supply name of any executable file
.: Cannot execute ELF binary /bin/ls
$ ls
ls: .: Too many open files
$ man sh
.: Can't open /etc/shrc
So the bug is probably in the shell? The above worked with sh(1). I
also tried the same with bash(1), but it didn't work there.
--
Abhinav
From: Alan Barrett <apb@cequrux.com>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: bin/44722
Date: Sun, 18 Sep 2011 11:14:48 +0200
On Sat, 17 Sep 2011, Abhinav Upadhyay wrote:
> Not only ls(1) but probably a number of other programs are not behaving
> correctly under a low file descriptor limit.
> For example:
>
> $ man ls
> .: 4: Invalid argument
I can't replicate that. I get the following results, with everything
either working or giving a reasonable error:
for ENV in '' /etc/shrc ; do
    for PAGER in '' cat more ; do
        for n in 5 4 3 2 1 ; do
            cmd="env - PATH=\\\"\\\$PATH\\\""
            cmd="${cmd}${ENV:+ ENV=\"\$ENV\"}"
            cmd="${cmd}${PAGER:+ PAGER=\"\$PAGER\"}"
            cmd="${cmd} sh -c \\'ulimit -n ${n}\; man ls\\'"
            [ -z "$PAGER" ] && cmd="${cmd} \| cat"
            eval echo "\$ ${cmd}"
            eval eval "${cmd}"
        done
        echo
    done
done
$ env - PATH="$PATH" sh -c 'ulimit -n 5; man ls' | cat
[works]
$ env - PATH="$PATH" sh -c 'ulimit -n 4; man ls' | cat
[works]
$ env - PATH="$PATH" sh -c 'ulimit -n 3; man ls' | cat
Shared object "libutil.so.7" not found
$ env - PATH="$PATH" sh -c 'ulimit -n 2; man ls' | cat
Shared object "libutil.so.7" not found
$ env - PATH="$PATH" sh -c 'ulimit -n 1; man ls' | cat
Shared object "libutil.so.7" not found
$ env - PATH="$PATH" PAGER=cat sh -c 'ulimit -n 5; man ls'
[works]
$ env - PATH="$PATH" PAGER=cat sh -c 'ulimit -n 4; man ls'
[works]
$ env - PATH="$PATH" PAGER=cat sh -c 'ulimit -n 3; man ls'
Shared object "libutil.so.7" not found
$ env - PATH="$PATH" PAGER=cat sh -c 'ulimit -n 2; man ls'
Shared object "libutil.so.7" not found
$ env - PATH="$PATH" PAGER=cat sh -c 'ulimit -n 1; man ls'
Shared object "libutil.so.7" not found
$ env - PATH="$PATH" PAGER=more sh -c 'ulimit -n 5; man ls'
[works]
$ env - PATH="$PATH" PAGER=more sh -c 'ulimit -n 4; man ls'
/usr/share/man/cat1/ls.0: Too many open files
[error message is from more(1)]
$ env - PATH="$PATH" PAGER=more sh -c 'ulimit -n 3; man ls'
Shared object "libutil.so.7" not found
$ env - PATH="$PATH" PAGER=more sh -c 'ulimit -n 2; man ls'
Shared object "libutil.so.7" not found
$ env - PATH="$PATH" PAGER=more sh -c 'ulimit -n 1; man ls'
Shared object "libutil.so.7" not found
$ env - PATH="$PATH" ENV=/etc/shrc sh -c 'ulimit -n 5; man ls' | cat
[works]
$ env - PATH="$PATH" ENV=/etc/shrc sh -c 'ulimit -n 4; man ls' | cat
[works]
$ env - PATH="$PATH" ENV=/etc/shrc sh -c 'ulimit -n 3; man ls' | cat
Shared object "libutil.so.7" not found
$ env - PATH="$PATH" ENV=/etc/shrc sh -c 'ulimit -n 2; man ls' | cat
Shared object "libutil.so.7" not found
$ env - PATH="$PATH" ENV=/etc/shrc sh -c 'ulimit -n 1; man ls' | cat
Shared object "libutil.so.7" not found
$ env - PATH="$PATH" ENV=/etc/shrc PAGER=cat sh -c 'ulimit -n 5; man ls'
[works]
$ env - PATH="$PATH" ENV=/etc/shrc PAGER=cat sh -c 'ulimit -n 4; man ls'
[works]
$ env - PATH="$PATH" ENV=/etc/shrc PAGER=cat sh -c 'ulimit -n 3; man ls'
Shared object "libutil.so.7" not found
$ env - PATH="$PATH" ENV=/etc/shrc PAGER=cat sh -c 'ulimit -n 2; man ls'
Shared object "libutil.so.7" not found
$ env - PATH="$PATH" ENV=/etc/shrc PAGER=cat sh -c 'ulimit -n 1; man ls'
Shared object "libutil.so.7" not found
$ env - PATH="$PATH" ENV=/etc/shrc PAGER=more sh -c 'ulimit -n 5; man ls'
[works]
$ env - PATH="$PATH" ENV=/etc/shrc PAGER=more sh -c 'ulimit -n 4; man ls'
/usr/share/man/cat1/ls.0: Too many open files
[error message is from more(1)]
$ env - PATH="$PATH" ENV=/etc/shrc PAGER=more sh -c 'ulimit -n 3; man ls'
Shared object "libutil.so.7" not found
$ env - PATH="$PATH" ENV=/etc/shrc PAGER=more sh -c 'ulimit -n 2; man ls'
Shared object "libutil.so.7" not found
$ env - PATH="$PATH" ENV=/etc/shrc PAGER=more sh -c 'ulimit -n 1; man ls'
Shared object "libutil.so.7" not found
> The error messages suggest that probably something is going wrong with
> the shell. Probably the shell tries to execute the shell built-in '.'
> (as observed by the above error message).
I can get error messages similar to that from the shell, but not using
the "man ls" command. For example:
$ env - PATH="$PATH" sh -c 'ulimit -n 4 ; . /etc/shrc'
.: 3: Invalid argument
The error message could be better, but otherwise the behaviour
seems reasonable.
> It was even more weird to see that ls(1) and man(1) exited with proper
> error messages if I tried to do something like this:
>
> $ . ls #supply name of any executable file
> .: Cannot execute ELF binary /bin/ls
>
> $ ls
> ls: .: Too many open files
>
> $ man sh
> .: Can't open /etc/shrc
Those error messages seem reasonable to me.
--apb (Alan Barrett)
From: Alan Barrett <apb@cequrux.com>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: bin/44722
Date: Sun, 18 Sep 2011 11:55:31 +0200
> With a low file descriptor limit ls lists the contents of
> subdirectories instead of the current directory.
I can replicate this. It's difficult to debug, because if you make
the file descriptor limit low enough to exhibit the problem, then gdb
doesn't work properly.
You can simulate the problem by making the __opendir2 call in line
654 of fts.c fail on the first call and succeed on the second call, as
follows:
cd src/bin/ls ;
cp ../../lib/libc/gen/fts.c ./fts.c ;
patch ./fts.c <<'ENDPATCH'
--- ../../lib/libc/gen/fts.c
+++ ./fts.c
@@ -42,7 +42,6 @@
#endif
#endif /* LIBC_SCCS and not lint */
-#include "namespace.h"
#include <sys/param.h>
#include <sys/stat.h>
@@ -651,7 +650,17 @@
#else
#define __opendir2(path, flag) opendir(path)
#endif
- if ((dirp = __opendir2(cur->fts_accpath, oflag)) == NULL) {
+ {
+ static int ncalls = 0;
+
+ if (++ncalls == 1) {
+ errno = EMFILE;
+ dirp = NULL;
+ } else {
+ dirp = __opendir2(cur->fts_accpath, oflag);
+ }
+ }
+ if (dirp == NULL) {
if (type == BREAD) {
cur->fts_info = FTS_DNR;
cur->fts_errno = errno;
ENDPATCH
patch ./Makefile <<'ENDPATCH'
--- Makefile
+++ Makefile
@@ -2,7 +2,9 @@
# @(#)Makefile 8.1 (Berkeley) 6/2/93
PROG= ls
-SRCS= cmp.c ls.c main.c print.c util.c
+SRCS= cmp.c ls.c main.c print.c util.c fts.c
+
+COPTS= -O0 -g
LDADD+= -lutil
DPADD+= ${LIBUTIL}
ENDPATCH
Now you can build a modified version of ls, and use gdb to debug it.
The call to fts_children() in line 417 of the traverse() function in
ls.c returns NULL.
The comment in lines 496 to 502 of ls.c says "We ignore the error case
since it will be replicated on the next call to fts_read()", but in
fact the error case is not replicated later.
The test in line 503 of the display() function in ls.c just returns.
The loop beginning in line 429 of the traverse() function in ls.c
ends up printing the wrong information.
It's certainly wrong to assume that the same error will occur twice,
but it's not yet clear to me whether the bug is in fts_read() or in
traverse() or in display().
Perhaps fts_read() should have noticed the "cur->fts_info = FTS_DNR"
status that was stored by the earlier fts_build() (line 665 of the
patched fts.c).
Perhaps traverse() should have noticed the error return from
fts_children() in line 417 of ls.c, and should have reported the error
without performing the fts_read() loop in line 429 of ls.c.
Perhaps display() should have reported the error before returning in
line 504 of ls.c.
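As an illustration of the first option, a traversal loop can check fts_info on every entry returned by fts_read() and report FTS_DNR, FTS_ERR and FTS_NS entries using fts_errno. This is only a sketch of the general pattern under the assumption that fts has recorded the failure on the entry; report_unreadable() is a hypothetical name, and the code is not a proposed patch for ls.c or fts.c.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <errno.h>
#include <fts.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: walk the top level of `dir` and report every
 * entry that fts could not read or stat.  Returns the number of such
 * entries, or -1 if fts_open() or fts_read() itself failed.
 */
int
report_unreadable(const char *dir)
{
	char *paths[2];
	FTS *ftsp;
	FTSENT *p;
	int failures = 0;

	paths[0] = (char *)dir;
	paths[1] = NULL;

	ftsp = fts_open(paths, FTS_PHYSICAL | FTS_NOCHDIR, NULL);
	if (ftsp == NULL)
		return -1;

	for (;;) {
		errno = 0;
		p = fts_read(ftsp);
		if (p == NULL)
			break;		/* end of walk, or error if errno != 0 */

		switch (p->fts_info) {
		case FTS_DNR:		/* directory could not be read */
		case FTS_ERR:		/* other error */
		case FTS_NS:		/* no stat(2) information */
			fprintf(stderr, "%s: %s\n", p->fts_path,
			    strerror(p->fts_errno));
			failures++;
			break;
		case FTS_D:
			/* Like plain ls, do not descend into subdirectories. */
			if (p->fts_level > 0)
				(void)fts_set(ftsp, p, FTS_SKIP);
			break;
		default:
			break;
		}
	}
	if (errno != 0)
		failures = -1;

	(void)fts_close(ftsp);
	return failures;
}
```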
--apb (Alan Barrett)
From: Abhinav Upadhyay <er.abhinav.upadhyay@gmail.com>
To: gnats-bugs@netbsd.org
Cc: gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, pooka@iki.fi
Subject: Re: bin/44722
Date: Sun, 18 Sep 2011 16:24:59 +0530
On Sun, Sep 18, 2011 at 2:45 PM, Alan Barrett <apb@cequrux.com> wrote:
> On Sat, 17 Sep 2011, Abhinav Upadhyay wrote:
> > Not only ls(1) but probably a number of other programs are not behaving
> > correctly under a low file descriptor limit.
> > For example:
> >
> > $ man ls
> > .: 4: Invalid argument
>
> I can't replicate that. I get the following results, with everything
> either working or giving a reasonable error:
I also see man(1) working properly if I use the above script. But I
did something like this:
$ ulimit -n 5
$ man ls
.: 4: Invalid argument
Fair enough. I didn't check as thoroughly as you did; most probably
my environment had something to do with the weird results I got.
> > It was even more weird to see that ls(1) and man(1) exited with proper
> > error messages if I tried to do something like this:
> >
> > $ . ls #supply name of any executable file
> > .: Cannot execute ELF binary /bin/ls
> >
> > $ ls
> > ls: .: Too many open files
> >
> Those error messages seem reasonable to me.
My point with the above example was that, if you have a low file
descriptor limit, then ls(1) lists all the sub-directories, but if you
execute the above set of commands sequentially, ls(1) does give a sane
message and exits. Probably man(1) is irrelevant here. I mixed it up.
--
Abhinav
Copyright © 1994-2007
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.