NetBSD Problem Report #41225
From www@NetBSD.org Wed Apr 15 22:06:05 2009
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
by www.NetBSD.org (Postfix) with ESMTP id C0A0863C1F9
for <gnats-bugs@gnats.netbsd.org>; Wed, 15 Apr 2009 22:06:04 +0000 (UTC)
Message-Id: <20090415220600.6C64863B8A5@www.NetBSD.org>
Date: Wed, 15 Apr 2009 22:06:00 +0000 (UTC)
From: ad@netbsd.org
Reply-To: ad@netbsd.org
To: gnats-bugs@NetBSD.org
Subject: sys_mqueue.c mq->mq_notify_proc can disappear
X-Send-Pr-Version: www-1.0
>Number: 41225
>Category: kern
>Synopsis: sys_mqueue.c mq->mq_notify_proc can disappear
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Apr 15 22:10:01 +0000 2009
>Last-Modified: Sun Oct 06 16:11:06 +0000 2019
>Originator: Andrew Doran
>Release: 5.0
>Organization:
The NetBSD Project
>Environment:
n/a
>Description:
Message queues are tied to file descriptors and so can live across
fork().
mq_notify() does: mq->mq_notify_proc = l->l_proc;
It looks like this process can exit while the mqueue still persists.
We can later try to send a signal to this process.
>How-To-Repeat:
Code inspection.
>Fix:
This problem occurs quite a few times in the kernel, esp. in device
drivers doing async I/O. :-(
Quick and dirty fix:
Change mq_notify() to store a pid_t and use p_find() when sending
signals. mqueue can send signals to unrelated processes after
the proc exits.
More reliable but ugly and slow:
Register an exithook for every mqueue that has mq_notify_proc set.
This is mostly ugly because at the moment it requires taking exec_lock
which is pretty heavyweight.
Maybe a better solution:
In addition to a pid, store a generation number per struct proc.
Every time a pid is allocated, increment the generation number.
Add a p_find_gen() or similar that looks for a pid and also
compares the generation numbers.
This number could be a 64-bit system global (under proc_lock) or
we could have one for each pid table slot. I don't personally see
a big disadvantage to having a global. ??
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: kern-bug-people->rmind
Responsible-Changed-By: rmind@NetBSD.org
Responsible-Changed-When: Wed, 15 Apr 2009 23:22:05 +0000
Responsible-Changed-Why:
Mine.
From: David Laight <david@l8s.co.uk>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/41225: sys_mqueue.c mq->mq_notify_proc can disappear
Date: Thu, 16 Apr 2009 07:38:23 +0100
On Wed, Apr 15, 2009 at 10:10:01PM +0000, ad@netbsd.org wrote:
> >Synopsis: sys_mqueue.c mq->mq_notify_proc can disappear
>
> Quick and dirty fix:
>
> Change mq_notify() to store a pid_t and use p_find() when sending
> signals. mqueue can send signals to unrelated processes after
> the proc exits.
Can the message queue hold a reference count againt the proc table slot ?
This would stop the slot being reused.
I guess it should also be possibly to determine that the process had died
when it tries to send a new signal - removing the need for the exithook.
Just don't do the fubar recently added to linux where kernel code
can no longer send a signal to a pid!
David
--
David Laight: david@l8s.co.uk
From: Mindaugas Rasiukevicius <rmind@netbsd.org>
To: ad@netbsd.org
Cc: gnats-bugs@NetBSD.org, netbsd-bugs@netbsd.org
Subject: Re: kern/41225: sys_mqueue.c mq->mq_notify_proc can disappear
Date: Sat, 25 Apr 2009 17:59:20 +0100
> >Number: 41225
> >Category: kern
> >Synopsis: sys_mqueue.c mq->mq_notify_proc can disappear
>
> ...
>
> >Description:
> Message queues are tied to file descriptors and so can live across
> fork().
>
> mq_notify() does: mq->mq_notify_proc = l->l_proc;
>
> It looks like this process can exit while the mqueue still persists.
When process is closing the descriptor (by exit or directly), the
mq_notify_proc is checked by mq_close_fop():
http://nxr.netbsd.org/source/xref/sys/kern/sys_mqueue.c#304
Therefore it seems to be a small race condition, with window here:
http://nxr.netbsd.org/source/xref/sys/kern/sys_mqueue.c#739
Holding proc_lock before release of mq->mq_mtx should fix the problem:
http://www.netbsd.org/~rmind/mq_kpsignal.diff
Looks good?
> Maybe a better solution:
>
> In addition to a pid, store a generation number per struct proc.
> Every time a pid is allocated, increment the generation number.
> Add a p_find_gen() or similar that looks for a pid and also
> compares the generation numbers.
> This number could be a 64-bit system global (under proc_lock) or
> we could have one for each pid table slot. I don't personally see
> a big disadvantage to having a global. ??
That would be a general way to ensure uniqueness. Are there any other
users (or other usefulness) of such interface?
--
Best regards,
Mindaugas
Responsible-Changed-From-To: rmind->kern-bug-people
Responsible-Changed-By: rmind@NetBSD.org
Responsible-Changed-When: Sun, 06 Oct 2019 16:11:06 +0000
Responsible-Changed-Why:
>Unformatted:
(Contact us)
$NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2007
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.