NetBSD Problem Report #50627

From paul@pokey.whooppee.com  Wed Jan  6 01:40:13 2016
Return-Path: <paul@pokey.whooppee.com>
Received: from mail.netbsd.org (mail.NetBSD.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 6BC947ACBE
	for <gnats-bugs@gnats.NetBSD.org>; Wed,  6 Jan 2016 01:40:13 +0000 (UTC)
Message-Id: <20160106013940.503F516E92@pokey.whooppee.com>
Date: Wed,  6 Jan 2016 09:39:40 +0800 (PHT)
From: paul@whooppee.com
Reply-To: paul@whooppee.com
To: gnats-bugs@NetBSD.org
Subject: filemon can hang the process
X-Send-Pr-Version: 3.95

>Number:         50627
>Category:       kern
>Synopsis:       filemon can hang a process
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    pgoyette
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Jan 06 01:45:00 +0000 2016
>Closed-Date:    Mon Jan 11 02:08:28 +0000 2016
>Last-Modified:  Mon Jan 11 02:08:28 +0000 2016
>Originator:     Paul Goyette
>Release:        NetBSD 7.99.25
>Organization:
+------------------+--------------------------+------------------------+
| Paul Goyette     | PGP Key fingerprint:     | E-mail addresses:      |
| (Retired)        | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+------------------+--------------------------+------------------------+
>Environment:


System: NetBSD pokey.whooppee.com 7.99.25 NetBSD 7.99.25 (POKEY 2015-12-23 05:05:48) #9: Wed Dec 23 15:10:44 PHT 2015 paul@pokey.whooppee.com:/build/netbsd-local/obj/amd64/sys/arch/amd64/compile/POKEY amd64
Architecture: x86_64
Machine: amd64
>Description:
If the file descriptor on which a process opens /dev/filemon is numerically
larger than the descriptor to which the activity entries are being logged
(as set with ioctl(filemon_fd, FILEMON_SET_FD) call), the monitoring process
will hang when it tries to exit.  Process exit code attempts to close all
open file descriptors, in sequence, but closing the fd to which the activity
entries are being written hangs waiting for its reference count to reduce to
zero.  An example sequence of events is:

        1. Process opens /dev/filemon and gets fd #3
        2. Process tells filemon to log activity to fd #1 (stdout)
        3. Process calls sys_exit(), which starts process cleanup
        4. Clean-up code tries to fd_close all open descriptors, in
           order, so handles fd #0 and then fd #1
        5. fd #1 has another reference, so we wait on the condvar,
           which never gets broadcast since there's no other thread
           to run.  We hang here forever.

When the activity log file's descriptor is numerically greater than the
fd on which /dev/filemon is open, the filemon descriptor gets closed
first, which removes the additional reference to the log file:

        1. Process opens /dev/filemon and gets fd #3
        2. Process opens up a temp file (or simply calls dup(stdout))
           and gets fd #4;  the process tells filemon to log activity
           to fd #4
        3. Process calls sys_exit(), which starts process cleanup
        4. Clean-up code tries to fd_close all open descriptors, in
           order, so handles fd #0 and then fd #1
        5. In this scenario, fd#1 has no extra references, so it can
           close normally.
        6. Cleanup proceeds with fd #2, and then gets to fd#3, where
           /dev/filemon is open
        7. We call filemon_close() which calls fd_putfile() on fd #4.
           This removes the additional reference on fd #4
        8. Cleanup moves on to fd #4 which now has only a single
           reference, so it, too, can be successfully closed!

This bug will be added to the filemon(4) man page.

>How-To-Repeat:

See above.
>Fix:
Fix not yet known.  However, you can work-around the problem by using an
atexit() handler to close the filemon device's fd first (or set the log's
fd to an invalid value, which disassociates from the log without setting
up a new one).


>Release-Note:

>Audit-Trail:

Responsible-Changed-From-To: kern-bug-people->pgoyette
Responsible-Changed-By: pgoyette@NetBSD.org
Responsible-Changed-When: Wed, 06 Jan 2016 01:58:16 +0000
Responsible-Changed-Why:
I'm working on it.


From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/50627: filemon can hang the process
Date: Wed, 06 Jan 2016 13:35:40 +0700

     Date:        Wed,  6 Jan 2016 01:45:00 +0000 (UTC)
     From:        paul@whooppee.com
     Message-ID:  <20160106014500.65AA47A21B@mollari.NetBSD.org>

   | However, you can work-around the problem by using an
   | atexit() handler to close the filemon device's fd first

 In-kernel atexit() could solve this problem, and could also potentially
 have uses for other problems, but when things get complex (I don't know
 if it is possible, but consider a process using filemon twice, where one
 of them is logging changes to a file monitored by the other) the atexit()
 order processing is just going to be the same problem all over again -
 something will need to make sure those get done in the correct order,
 which is not necessarily either establishment order, or the reverse of that.

 Since what you have here is a "do it in this order" problem, it seems as
 if one of the classic solutions to that problem might be more in order.

 One would be to use explicit dependencies - that is what I thought of
 first (but...)   In your case this would work like

         1. Process opens /dev/filemon and gets fd #3
         2. Process tells filemon to log activity to fd #1 (stdout)
 2a. fd #1 now depends upon fd#3
         3. Process calls sys_exit(), which starts process cleanup
         4. Clean-up code tries to fd_close all open descriptors, in
            order, so handles fd #0 and then fd #1
 4a. fd#1 depends upon fd#3, so close fd#3 first, which releases
 the ref on fd#1, then return to:
         5. fd #1 has another reference, so we wait on the condvar,
            which never gets broadcast since there's no other thread
            to run.  We hang here forever.
 which is no longer true, as that other reference has gone away, and the
 close of fd#1 succeeds normally.   After going past #2 we reach #3
 but now that is just an (old) closed descriptor for which there is
 nothing to do (just like 4 5 ... presumably)

 The problem with this is that it (probably) needs a double linked list
 between fds to work properly (whether those are pointers, or just fd
 numbers doesn't matter) and then the locking to maintain it all properly
 (at close, setup is easy) would get difficult at best I think.



 An alternative is just to (separately) maintain a correctly ordered
 close list, and always close in that order.   For most (non filemon
 using, nor any other similar requirement) processes, this list would be
 null, and the closes would just happen in numeric order like now.

 But when filemon (or anything similar) is in use, the process would be ...

         1. Process opens /dev/filemon and gets fd #3
         2. Process tells filemon to log activity to fd #1 (stdout)

 2a. processes notes that this is its first activity which requires
 ordered closes, and so allocates a close order list, and links that to
 the process.  (This would be an array of size max_fd, of element type
 some int big enough to hold a fd - char, short, whatever it takes).
 This is init'd with values in ascending integer order (so of itself,
 would change nothing).

 2b. logging to fd1 makes fd1 depend upon fd3, so fd3 must be closed
 before 1, adjust the order of the list so 3 is inserted ahead of 1
 (this is a slice and reassemble operation on the array.)
 Alt: move 1 after 3.
 Which to do depends on whether anything previously depended upon 1 or 3
 or whether either of those already depends upon anything else - setting
 a flag (on both fds) to indicate an ordering dependency exists, would work,
 then error out of this (ie: filemon ioctl fails) if both relevant fds
 already have the flag set, otherwise if one does, move the position of
 the other (if neither do, toss a coin .. or follow a rule in the code
 more likely) - set the flag on both fds when done.
 The file flag bits force the (relative) order of those fds to remain
 unchanged, so once a dependency is set, it cannot be altered later
 (unless a fd is closed of course, that would reset the flag for a later open.)

         3. Process calls sys_exit(), which starts process cleanup

 3a. process notices that it has a close order list, so it calls
 fd_close() on all open descriptors in the order that they appear
 on that list, which in this case would be 0 3 1 2 ...
 (or perhaps 0 2 3 1 ...  it should make no difference as long as
 3 gets closed before 1).

 3b. skip to step 6 (4 is done already, and 5 never happens)

         4. Clean-up code tries to fd_close all open descriptors, in
            order, so handles fd #0 and then fd #1
         5. fd #1 has another reference, so we wait on the condvar,
            which never gets broadcast since there's no other thread
            to run.  We hang here forever.

 There are no locking issues of note with this one (locks are needed, but
 they're nothing unusual, and probably already covered by existing locking)

 step 2a could be optimised (for space) slightly by making the array only
 as big as the number of current open fds (then it would need to be able to
 be extended should that be needed later - unlikely but possible) in which
 case after 3a we would just continue to do step 4 - the initial group
 of fds (those open at the time of 2a) would already be closed, and so just
 skipped at step 4, which would then close any fds opened later.

 There are most likely other solutions worth investigating before even
 contemplating the mechanism of adding some kind of atexit() (for processes,
 rather than the kernel itself, of course) to the kernel.

 kre

From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/50627: filemon can hang the process
Date: Wed, 6 Jan 2016 11:26:38 +0100

 On Wed, Jan 06, 2016 at 06:40:01AM +0000, Robert Elz wrote:
 >  In-kernel atexit() could solve this problem, and could also potentially
 >  have uses for other problems, but when things get complex (I don't know
 >  if it is possible, but consider a process using filemon twice, where one
 >  of them is logging changes to a file monitored by the other) the atexit()
 >  order processing is just going to be the same problem all over again -
 >  something will need to make sure those get done in the correct order,
 >  which is not necessarily either establishment order, or the reverse of that.

 I haven't followed the details, so this may be a stupid suggestion: isn't
 it enough to do the closing loop several times, using cv_timedwait(.., 1)
 and ignoring the fds where it returns EWOULDBLOCK, untill we did not close
 any fds in one pass and then switch to cv_wait() in the next pass?

 Martin

From: Paul Goyette <paul@whooppee.com>
To: gnats-bugs@NetBSD.org
Cc: gnats-admin@netbsd.org
Subject: Re: kern/50627: filemon can hang the process
Date: Wed, 6 Jan 2016 19:12:12 +0800 (PHT)

   This message is in MIME format.  The first part should be readable text,
   while the remaining parts are likely unreadable without MIME-aware tools.

 --0-476036076-1452078732=:19497
 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed

 On Wed, 6 Jan 2016, Martin Husemann wrote:

 > The following reply was made to PR kern/50627; it has been noted by GNATS.
 >
 > From: Martin Husemann <martin@duskware.de>
 > To: gnats-bugs@NetBSD.org
 > Cc:
 > Subject: Re: kern/50627: filemon can hang the process
 > Date: Wed, 6 Jan 2016 11:26:38 +0100
 >
 > On Wed, Jan 06, 2016 at 06:40:01AM +0000, Robert Elz wrote:
 > >  In-kernel atexit() could solve this problem, and could also potentially
 > >  have uses for other problems, but when things get complex (I don't know
 > >  if it is possible, but consider a process using filemon twice, where one
 > >  of them is logging changes to a file monitored by the other) the atexit()
 > >  order processing is just going to be the same problem all over again -
 > >  something will need to make sure those get done in the correct order,
 > >  which is not necessarily either establishment order, or the reverse of that.
 >
 > I haven't followed the details, so this may be a stupid suggestion: isn't
 > it enough to do the closing loop several times, using cv_timedwait(.., 1)
 > and ignoring the fds where it returns EWOULDBLOCK, untill we did not close
 > any fds in one pass and then switch to cv_wait() in the next pass?

 You might be able to get that to work, but not easily.  One of the first 
 thing that happens in fd_close() is to remove the entry from the array 
 so that no new references can be made.  If you use cv_timedwait() and 
 retry on EWOULDBLOCK, you'd have to put the entry back in the table (so 
 it would be seen on the next pass through the loop), and that would 
 allow other code to establish new references.

 It could get pretty messy.


 filemon(4) really has a fairly limited use-case, and it already abuses 
 system interfaces by hijacking/wrapping system calls (which already 
 introduces some ordering dependencies if there were ever any other 
 hijacker).  I'm not sure we want to invent complicated mechanisms to 
 handle this situation.

 I'm much more comfortable with making the attached patch (already posted 
 to tech-kern, I think!) to ensure that the fd numbers are in the proper 
 sequence.  The caller can easily use dup(2) or fcntl(2) to get a more 
 acceptable fd number for the log file.



 >
 > Martin
 >
 >
 > !DSPAM:568cecaf221711927294789!
 >
 >

 +------------------+--------------------------+------------------------+
 | Paul Goyette     | PGP Key fingerprint:     | E-mail addresses:      |
 | (Retired)        | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
 | Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
 +------------------+--------------------------+------------------------+
 --0-476036076-1452078732=:19497
 Content-Type: TEXT/PLAIN; charset=US-ASCII; name=filemon.c.diff
 Content-Transfer-Encoding: BASE64
 Content-ID: <Pine.NEB.4.64.1601061912120.19497@pokey.whooppee.com>
 Content-Description: 
 Content-Disposition: attachment; filename=filemon.c.diff

 SW5kZXg6IGZpbGVtb24uYw0KPT09PT09PT09PT09PT09PT09PT09PT09PT09
 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PQ0KUkNT
 IGZpbGU6IC9jdnNyb290L3NyYy9zeXMvZGV2L2ZpbGVtb24vZmlsZW1vbi5j
 LHYNCnJldHJpZXZpbmcgcmV2aXNpb24gMS4yNA0KZGlmZiAtdSAtcCAtcjEu
 MjQgZmlsZW1vbi5jDQotLS0gZmlsZW1vbi5jCTUgSmFuIDIwMTYgMjI6MDg6
 NTQgLTAwMDAJMS4yNA0KKysrIGZpbGVtb24uYwk2IEphbiAyMDE2IDA1OjA1
 OjA0IC0wMDAwDQpAQCAtMjc5LDggKzI3OSwxMiBAQCBzdGF0aWMgaW50DQog
 ZmlsZW1vbl9pb2N0bChzdHJ1Y3QgZmlsZSAqIGZwLCB1X2xvbmcgY21kLCB2
 b2lkICpkYXRhKQ0KIHsNCiAJaW50IGVycm9yID0gMDsNCisJaW50IGksIG5m
 LCBmZDsNCiAJc3RydWN0IGZpbGVtb24gKmZpbGVtb247DQogCXN0cnVjdCBw
 cm9jICp0cDsNCisJZmRmaWxlX3QgKmZmOw0KKwlmaWxlZGVzY190ICpmZHA7
 DQorCWZkdGFiX3QgKmR0Ow0KIA0KICNpZmRlZiBERUJVRw0KIAlsb2cobG9n
 TGV2ZWwsICJmaWxlbW9uX2lvY3RsKCVsdSkiLCBjbWQpOzsNCkBAIC0zMDMs
 OCArMzA3LDQxIEBAIGZpbGVtb25faW9jdGwoc3RydWN0IGZpbGUgKiBmcCwg
 dV9sb25nIGMNCiAJCWlmIChmaWxlbW9uLT5mbV9mcCkNCiAJCQlmZF9wdXRm
 aWxlKGZpbGVtb24tPmZtX2ZkKTsNCiANCisJCS8qDQorCQkgKiBYWFggSEFD
 SyBYWFgNCisJCSAqDQorCQkgKiBEdWUgdG8gb3VyIHRha2luZyBhbiBleHRy
 YSByZWZlcmVuY2Ugb24gdGhlDQorCQkgKiBkZXNjcmlwdG9yJ3Mgc3RydWN0
 IGZpbGUsIHdlIG5lZWQgdG8gZW5zdXJlIHRoYXQNCisJCSAqIHRoZSBkZXNj
 cmlwdG9yIG51bWJlciBpcyBhdCBsZWFzdCBhcyBsYXJnZSBhcw0KKwkJICog
 dGhlIG9uZSB1c2VkIHRvIGFjY2VzcyAvZGV2L2ZpbGVtb24uICBPdGhlcndp
 c2UsDQorCQkgKiB3ZSBnZXQgYSBkZWFkbG9jayBkdXJpbmcgcHJvY2VzcyBl
 eGl0LCB3YWl0aW5nDQorCQkgKiBmb3IgdGhlIG91dHB1dCBmaWxlJ3MgcmVm
 ZXJlbmNlIGNvdW50Lg0KKwkJICovDQorCQlmZCA9ICooKGludCAqKSBkYXRh
 KTsNCisJCWZkcCA9IGN1cnByb2MtPnBfZmQ7DQorCQlkdCA9IGN1cmx3cC0+
 bF9mZC0+ZmRfZHQ7DQorCQluZiA9IGR0LT5kdF9uZmlsZXM7DQorCQllcnJv
 ciA9IEVJTlZBTDsNCisJCWZvciAoaSA9IDA7IGkgPCBuZjsgaSsrKSB7DQor
 CQkJaWYgKGkgPj0gZmQpDQorCQkJCWJyZWFrOw0KKwkJCWZmID0gZHQtPmR0
 X2ZmW2ldOw0KKwkJCUtBU1NFUlQoZmQgPj0gTkRGREZJTEUgfHwNCisJCQkJ
 ZmYgPT0gKGZkZmlsZV90ICopZmRwLT5mZF9kZmRmaWxlW2ldKTsNCisJCQlp
 ZiAoZmYgPT0gTlVMTCkNCisJCQkJY29udGludWU7DQorDQorCQkJaWYgKGZm
 LT5mZl9maWxlLT5mX3R5cGUgPT0gRFRZUEVfTUlTQyAmJg0KKwkJCSAgICBm
 Zi0+ZmZfZmlsZS0+Zl9vcHMgPT0gJmZpbGVtb25fZmlsZW9wcykgew0KKwkJ
 CQllcnJvciA9IDA7DQorCQkJCWJyZWFrOw0KKwkJCX0NCisJCX0NCisJCWlm
 IChlcnJvcikNCisJCQlicmVhazsNCisNCiAJCS8qIE5vdyBzZXQgdXAgdGhl
 IG5ldyBvbmUgKi8NCi0JCWZpbGVtb24tPmZtX2ZkID0gKigoaW50ICopIGRh
 dGEpOw0KKwkJZmlsZW1vbi0+Zm1fZmQgPSBmZDsNCiAJCWlmICgoZmlsZW1v
 bi0+Zm1fZnAgPSBmZF9nZXRmaWxlKGZpbGVtb24tPmZtX2ZkKSkgPT0gTlVM
 TCkgew0KIAkJCWVycm9yID0gRUJBREY7DQogCQkJYnJlYWs7DQo=

 --0-476036076-1452078732=:19497--

From: "Paul Goyette" <pgoyette@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/50627 CVS commit: src/sys/dev/filemon
Date: Mon, 11 Jan 2016 01:37:36 +0000

 Module Name:	src
 Committed By:	pgoyette
 Date:		Mon Jan 11 01:37:36 UTC 2016

 Modified Files:
 	src/sys/dev/filemon: filemon.c filemon.h

 Log Message:
 Take a reference on the (activity log) file itself, and not on the
 descriptor.

 Should fix PR kern/50627


 To generate a diff of this commit:
 cvs rdiff -u -r1.27 -r1.28 src/sys/dev/filemon/filemon.c
 cvs rdiff -u -r1.8 -r1.9 src/sys/dev/filemon/filemon.h

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Paul Goyette" <pgoyette@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/50627 CVS commit: src/share/man/man4
Date: Mon, 11 Jan 2016 01:45:27 +0000

 Module Name:	src
 Committed By:	pgoyette
 Date:		Mon Jan 11 01:45:27 UTC 2016

 Modified Files:
 	src/share/man/man4: filemon.4

 Log Message:
 Remove current bug, now that PR kern/50627 is fixed.  Add another security
 consideration concerning re-used of the target/monitored process's pid.


 To generate a diff of this commit:
 cvs rdiff -u -r1.17 -r1.18 src/share/man/man4/filemon.4

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->closed
State-Changed-By: pgoyette@NetBSD.org
State-Changed-When: Mon, 11 Jan 2016 02:08:28 +0000
State-Changed-Why:
Fixed.

XXX Note that there are still several serious issues with filemon(4)


>Unformatted:

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2014 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.