NetBSD Problem Report #50627
From paul@pokey.whooppee.com Wed Jan 6 01:40:13 2016
Return-Path: <paul@pokey.whooppee.com>
Received: from mail.netbsd.org (mail.NetBSD.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
by mollari.NetBSD.org (Postfix) with ESMTPS id 6BC947ACBE
for <gnats-bugs@gnats.NetBSD.org>; Wed, 6 Jan 2016 01:40:13 +0000 (UTC)
Message-Id: <20160106013940.503F516E92@pokey.whooppee.com>
Date: Wed, 6 Jan 2016 09:39:40 +0800 (PHT)
From: paul@whooppee.com
Reply-To: paul@whooppee.com
To: gnats-bugs@NetBSD.org
Subject: filemon can hang the process
X-Send-Pr-Version: 3.95
>Number: 50627
>Category: kern
>Synopsis: filemon can hang a process
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: pgoyette
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Jan 06 01:45:00 +0000 2016
>Closed-Date: Mon Jan 11 02:08:28 +0000 2016
>Last-Modified: Mon Jan 11 02:08:28 +0000 2016
>Originator: Paul Goyette
>Release: NetBSD 7.99.25
>Organization:
+------------------+--------------------------+------------------------+
| Paul Goyette | PGP Key fingerprint: | E-mail addresses: |
| (Retired) | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+------------------+--------------------------+------------------------+
>Environment:
System: NetBSD pokey.whooppee.com 7.99.25 NetBSD 7.99.25 (POKEY 2015-12-23 05:05:48) #9: Wed Dec 23 15:10:44 PHT 2015 paul@pokey.whooppee.com:/build/netbsd-local/obj/amd64/sys/arch/amd64/compile/POKEY amd64
Architecture: x86_64
Machine: amd64
>Description:
If the file descriptor on which a process opens /dev/filemon is numerically
larger than the descriptor to which the activity entries are being logged
(as set with ioctl(filemon_fd, FILEMON_SET_FD) call), the monitoring process
will hang when it tries to exit. Process exit code attempts to close all
open file descriptors, in sequence, but closing the fd to which the activity
entries are being written hangs waiting for its reference count to reduce to
zero. An example sequence of events is:
1. Process opens /dev/filemon and gets fd #3
2. Process tells filemon to log activity to fd #1 (stdout)
3. Process calls sys_exit(), which starts process cleanup
4. Clean-up code tries to fd_close all open descriptors, in
order, so handles fd #0 and then fd #1
5. fd #1 has another reference, so we wait on the condvar,
which never gets broadcast since there's no other thread
to run. We hang here forever.
When the activity log file's descriptor is numerically greater than the
fd on which /dev/filemon is open, the filemon descriptor gets closed
first, which removes the additional reference to the log file:
1. Process opens /dev/filemon and gets fd #3
2. Process opens up a temp file (or simply calls dup(stdout))
and gets fd #4; the process tells filemon to log activity
to fd #4
3. Process calls sys_exit(), which starts process cleanup
4. Clean-up code tries to fd_close all open descriptors, in
order, so handles fd #0 and then fd #1
5. In this scenario, fd#1 has no extra references, so it can
close normally.
6. Cleanup proceeds with fd #2, and then gets to fd#3, where
/dev/filemon is open
7. We call filemon_close() which calls fd_putfile() on fd #4.
This removes the additional reference on fd #4
8. Cleanup moves on to fd #4 which now has only a single
reference, so it, too, can be successfully closed!
This bug will be added to the filemon(4) man page.
>How-To-Repeat:
See above.
>Fix:
Fix not yet known. However, you can work-around the problem by using an
atexit() handler to close the filemon device's fd first (or set the log's
fd to an invalid value, which disassociates from the log without setting
up a new one).
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: kern-bug-people->pgoyette
Responsible-Changed-By: pgoyette@NetBSD.org
Responsible-Changed-When: Wed, 06 Jan 2016 01:58:16 +0000
Responsible-Changed-Why:
I'm working on it.
From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/50627: filemon can hang the process
Date: Wed, 06 Jan 2016 13:35:40 +0700
Date: Wed, 6 Jan 2016 01:45:00 +0000 (UTC)
From: paul@whooppee.com
Message-ID: <20160106014500.65AA47A21B@mollari.NetBSD.org>
| However, you can work-around the problem by using an
| atexit() handler to close the filemon device's fd first
In-kernel atexit() could solve this problem, and could also potentially
have uses for other problems, but when things get complex (I don't know
if it is possible, but consider a process using filemon twice, where one
of them is logging changes to a file monitored by the other) the atexit()
order processing is just going to be the same problem all over again -
something will need to make sure those get done in the correct order,
which is not necessarily either establishment order, or the reverse of that.
Since what you have here is a "do it in this order" problem, it seems as
if one of the classic solutions to that problem might be more in order.
One would be to use explicit dependencies - that is what I thought of
first (but...) In your case this would work like
1. Process opens /dev/filemon and gets fd #3
2. Process tells filemon to log activity to fd #1 (stdout)
2a. fd #1 now depends upon fd#3
3. Process calls sys_exit(), which starts process cleanup
4. Clean-up code tries to fd_close all open descriptors, in
order, so handles fd #0 and then fd #1
4a. fd#1 depends upon fd#3, so close fd#3 first, which releases
the ref on fd#1, then return to:
5. fd #1 has another reference, so we wait on the condvar,
which never gets broadcast since there's no other thread
to run. We hang here forever.
which is no longer true, as that other reference has gone away, and the
close of fd#1 succeeds normally. After going past #2 we reach #3
but now that is just an (old) closed descriptor for which there is
nothing to do (just like 4 5 ... presumably)
The problem with this is that it (probably) needs a double linked list
between fds to work properly (whether those are pointers, or just fd
numbers doesn't matter) and then the locking to maintain it all properly
(at close, setup is easy) would get difficult at best I think.
An alternative is just to (separately) maintain a correctly ordered
close list, and always close in that order. For most (non filemon
using, nor any other similar requirement) processes, this list would be
null, and the closes would just happen in numeric order like now.
But when filemon (or anything similar) is in use, the process would be ...
1. Process opens /dev/filemon and gets fd #3
2. Process tells filemon to log activity to fd #1 (stdout)
2a. processes notes that this is its first activity which requires
ordered closes, and so allocates a close order list, and links that to
the process. (This would be an array of size max_fd, of element type
some int big enough to hold a fd - char, short, whatever it takes).
This is init'd with values in ascending integer order (so of itself,
would change nothing).
2b. logging to fd1 makes fd1 depend upon fd3, so fd3 must be closed
before 1, adjust the order of the list so 3 is inserted ahead of 1
(this is a slice and reassemble operation on the array.)
Alt: move 1 after 3.
Which to do depends on whether anything previously depended upon 1 or 3
or whether either of those already depends upon anything else - setting
a flag (on both fds) to indicate an ordering dependency exists, would work,
then error out of this (ie: filemon ioctl fails) if both relevant fds
already have the flag set, otherwise if one does, move the position of
the other (if neither do, toss a coin .. or follow a rule in the code
more likely) - set the flag on both fds when done.
The file flag bits force the (relative) order of those fds to remain
unchanged, so once a dependency is set, it cannot be altered later
(unless a fd is closed of course, that would reset the flag for a later open.)
3. Process calls sys_exit(), which starts process cleanup
3a. process notices that it has a close order list, so it calls
fd_close() on all open descriptors in the order that they appear
on that list, which in this case would be 0 3 1 2 ...
(or perhaps 0 2 3 1 ... it should make no difference as long as
3 gets closed before 1).
3b. skip to step 6 (4 is done already, and 5 never happens)
4. Clean-up code tries to fd_close all open descriptors, in
order, so handles fd #0 and then fd #1
5. fd #1 has another reference, so we wait on the condvar,
which never gets broadcast since there's no other thread
to run. We hang here forever.
There are no locking issues of note with this one (locks are needed, but
they're nothing unusual, and probably already covered by existing locking)
step 2a could be optimised (for space) slightly by making the array only
as big as the number of current open fds (then it would need to be able to
be extended should that be needed later - unlikely but possible) in which
case after 3a we would just continue to do step 4 - the initial group
of fds (those open at the time of 2a) would already be closed, and so just
skipped at step 4, which would then close any fds opened later.
There are most likely other solutions worth investigating before even
contemplating the mechanism of adding some kind of atexit() (for processes,
rather than the kernel itself, of course) to the kernel.
kre
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/50627: filemon can hang the process
Date: Wed, 6 Jan 2016 11:26:38 +0100
On Wed, Jan 06, 2016 at 06:40:01AM +0000, Robert Elz wrote:
> In-kernel atexit() could solve this problem, and could also potentially
> have uses for other problems, but when things get complex (I don't know
> if it is possible, but consider a process using filemon twice, where one
> of them is logging changes to a file monitored by the other) the atexit()
> order processing is just going to be the same problem all over again -
> something will need to make sure those get done in the correct order,
> which is not necessarily either establishment order, or the reverse of that.
I haven't followed the details, so this may be a stupid suggestion: isn't
it enough to do the closing loop several times, using cv_timedwait(.., 1)
and ignoring the fds where it returns EWOULDBLOCK, untill we did not close
any fds in one pass and then switch to cv_wait() in the next pass?
Martin
From: Paul Goyette <paul@whooppee.com>
To: gnats-bugs@NetBSD.org
Cc: gnats-admin@netbsd.org
Subject: Re: kern/50627: filemon can hang the process
Date: Wed, 6 Jan 2016 19:12:12 +0800 (PHT)
This message is in MIME format. The first part should be readable text,
while the remaining parts are likely unreadable without MIME-aware tools.
--0-476036076-1452078732=:19497
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
On Wed, 6 Jan 2016, Martin Husemann wrote:
> The following reply was made to PR kern/50627; it has been noted by GNATS.
>
> From: Martin Husemann <martin@duskware.de>
> To: gnats-bugs@NetBSD.org
> Cc:
> Subject: Re: kern/50627: filemon can hang the process
> Date: Wed, 6 Jan 2016 11:26:38 +0100
>
> On Wed, Jan 06, 2016 at 06:40:01AM +0000, Robert Elz wrote:
> > In-kernel atexit() could solve this problem, and could also potentially
> > have uses for other problems, but when things get complex (I don't know
> > if it is possible, but consider a process using filemon twice, where one
> > of them is logging changes to a file monitored by the other) the atexit()
> > order processing is just going to be the same problem all over again -
> > something will need to make sure those get done in the correct order,
> > which is not necessarily either establishment order, or the reverse of that.
>
> I haven't followed the details, so this may be a stupid suggestion: isn't
> it enough to do the closing loop several times, using cv_timedwait(.., 1)
> and ignoring the fds where it returns EWOULDBLOCK, untill we did not close
> any fds in one pass and then switch to cv_wait() in the next pass?
You might be able to get that to work, but not easily. One of the first
thing that happens in fd_close() is to remove the entry from the array
so that no new references can be made. If you use cv_timedwait() and
retry on EWOULDBLOCK, you'd have to put the entry back in the table (so
it would be seen on the next pass through the loop), and that would
allow other code to establish new references.
It could get pretty messy.
filemon(4) really has a fairly limited use-case, and it already abuses
system interfaces by hijacking/wrapping system calls (which already
introduces some ordering dependencies if there were ever any other
hijacker). I'm not sure we want to invent complicated mechanisms to
handle this situation.
I'm much more comfortable with making the attached patch (already posted
to tech-kern, I think!) to ensure that the fd numbers are in the proper
sequence. The caller can easily use dup(2) or fcntl(2) to get a more
acceptable fd number for the log file.
>
> Martin
>
>
> !DSPAM:568cecaf221711927294789!
>
>
+------------------+--------------------------+------------------------+
| Paul Goyette | PGP Key fingerprint: | E-mail addresses: |
| (Retired) | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+------------------+--------------------------+------------------------+
--0-476036076-1452078732=:19497
Content-Type: TEXT/PLAIN; charset=US-ASCII; name=filemon.c.diff
Content-Transfer-Encoding: BASE64
Content-ID: <Pine.NEB.4.64.1601061912120.19497@pokey.whooppee.com>
Content-Description:
Content-Disposition: attachment; filename=filemon.c.diff
SW5kZXg6IGZpbGVtb24uYw0KPT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PQ0KUkNT
IGZpbGU6IC9jdnNyb290L3NyYy9zeXMvZGV2L2ZpbGVtb24vZmlsZW1vbi5j
LHYNCnJldHJpZXZpbmcgcmV2aXNpb24gMS4yNA0KZGlmZiAtdSAtcCAtcjEu
MjQgZmlsZW1vbi5jDQotLS0gZmlsZW1vbi5jCTUgSmFuIDIwMTYgMjI6MDg6
NTQgLTAwMDAJMS4yNA0KKysrIGZpbGVtb24uYwk2IEphbiAyMDE2IDA1OjA1
OjA0IC0wMDAwDQpAQCAtMjc5LDggKzI3OSwxMiBAQCBzdGF0aWMgaW50DQog
ZmlsZW1vbl9pb2N0bChzdHJ1Y3QgZmlsZSAqIGZwLCB1X2xvbmcgY21kLCB2
b2lkICpkYXRhKQ0KIHsNCiAJaW50IGVycm9yID0gMDsNCisJaW50IGksIG5m
LCBmZDsNCiAJc3RydWN0IGZpbGVtb24gKmZpbGVtb247DQogCXN0cnVjdCBw
cm9jICp0cDsNCisJZmRmaWxlX3QgKmZmOw0KKwlmaWxlZGVzY190ICpmZHA7
DQorCWZkdGFiX3QgKmR0Ow0KIA0KICNpZmRlZiBERUJVRw0KIAlsb2cobG9n
TGV2ZWwsICJmaWxlbW9uX2lvY3RsKCVsdSkiLCBjbWQpOzsNCkBAIC0zMDMs
OCArMzA3LDQxIEBAIGZpbGVtb25faW9jdGwoc3RydWN0IGZpbGUgKiBmcCwg
dV9sb25nIGMNCiAJCWlmIChmaWxlbW9uLT5mbV9mcCkNCiAJCQlmZF9wdXRm
aWxlKGZpbGVtb24tPmZtX2ZkKTsNCiANCisJCS8qDQorCQkgKiBYWFggSEFD
SyBYWFgNCisJCSAqDQorCQkgKiBEdWUgdG8gb3VyIHRha2luZyBhbiBleHRy
YSByZWZlcmVuY2Ugb24gdGhlDQorCQkgKiBkZXNjcmlwdG9yJ3Mgc3RydWN0
IGZpbGUsIHdlIG5lZWQgdG8gZW5zdXJlIHRoYXQNCisJCSAqIHRoZSBkZXNj
cmlwdG9yIG51bWJlciBpcyBhdCBsZWFzdCBhcyBsYXJnZSBhcw0KKwkJICog
dGhlIG9uZSB1c2VkIHRvIGFjY2VzcyAvZGV2L2ZpbGVtb24uICBPdGhlcndp
c2UsDQorCQkgKiB3ZSBnZXQgYSBkZWFkbG9jayBkdXJpbmcgcHJvY2VzcyBl
eGl0LCB3YWl0aW5nDQorCQkgKiBmb3IgdGhlIG91dHB1dCBmaWxlJ3MgcmVm
ZXJlbmNlIGNvdW50Lg0KKwkJICovDQorCQlmZCA9ICooKGludCAqKSBkYXRh
KTsNCisJCWZkcCA9IGN1cnByb2MtPnBfZmQ7DQorCQlkdCA9IGN1cmx3cC0+
bF9mZC0+ZmRfZHQ7DQorCQluZiA9IGR0LT5kdF9uZmlsZXM7DQorCQllcnJv
ciA9IEVJTlZBTDsNCisJCWZvciAoaSA9IDA7IGkgPCBuZjsgaSsrKSB7DQor
CQkJaWYgKGkgPj0gZmQpDQorCQkJCWJyZWFrOw0KKwkJCWZmID0gZHQtPmR0
X2ZmW2ldOw0KKwkJCUtBU1NFUlQoZmQgPj0gTkRGREZJTEUgfHwNCisJCQkJ
ZmYgPT0gKGZkZmlsZV90ICopZmRwLT5mZF9kZmRmaWxlW2ldKTsNCisJCQlp
ZiAoZmYgPT0gTlVMTCkNCisJCQkJY29udGludWU7DQorDQorCQkJaWYgKGZm
LT5mZl9maWxlLT5mX3R5cGUgPT0gRFRZUEVfTUlTQyAmJg0KKwkJCSAgICBm
Zi0+ZmZfZmlsZS0+Zl9vcHMgPT0gJmZpbGVtb25fZmlsZW9wcykgew0KKwkJ
CQllcnJvciA9IDA7DQorCQkJCWJyZWFrOw0KKwkJCX0NCisJCX0NCisJCWlm
IChlcnJvcikNCisJCQlicmVhazsNCisNCiAJCS8qIE5vdyBzZXQgdXAgdGhl
IG5ldyBvbmUgKi8NCi0JCWZpbGVtb24tPmZtX2ZkID0gKigoaW50ICopIGRh
dGEpOw0KKwkJZmlsZW1vbi0+Zm1fZmQgPSBmZDsNCiAJCWlmICgoZmlsZW1v
bi0+Zm1fZnAgPSBmZF9nZXRmaWxlKGZpbGVtb24tPmZtX2ZkKSkgPT0gTlVM
TCkgew0KIAkJCWVycm9yID0gRUJBREY7DQogCQkJYnJlYWs7DQo=
--0-476036076-1452078732=:19497--
From: "Paul Goyette" <pgoyette@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/50627 CVS commit: src/sys/dev/filemon
Date: Mon, 11 Jan 2016 01:37:36 +0000
Module Name: src
Committed By: pgoyette
Date: Mon Jan 11 01:37:36 UTC 2016
Modified Files:
src/sys/dev/filemon: filemon.c filemon.h
Log Message:
Take a reference on the (activity log) file itself, and not on the
descriptor.
Should fix PR kern/50627
To generate a diff of this commit:
cvs rdiff -u -r1.27 -r1.28 src/sys/dev/filemon/filemon.c
cvs rdiff -u -r1.8 -r1.9 src/sys/dev/filemon/filemon.h
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Paul Goyette" <pgoyette@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/50627 CVS commit: src/share/man/man4
Date: Mon, 11 Jan 2016 01:45:27 +0000
Module Name: src
Committed By: pgoyette
Date: Mon Jan 11 01:45:27 UTC 2016
Modified Files:
src/share/man/man4: filemon.4
Log Message:
Remove current bug, now that PR kern/50627 is fixed. Add another security
consideration concerning re-used of the target/monitored process's pid.
To generate a diff of this commit:
cvs rdiff -u -r1.17 -r1.18 src/share/man/man4/filemon.4
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: open->closed
State-Changed-By: pgoyette@NetBSD.org
State-Changed-When: Mon, 11 Jan 2016 02:08:28 +0000
State-Changed-Why:
Fixed.
XXX Note that there are still several serious issues with filemon(4)
>Unformatted:
(Contact us)
$NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2014
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.