NetBSD Problem Report #37915
From dholland@eecs.harvard.edu Wed Jan 30 01:27:30 2008
Return-Path: <dholland@eecs.harvard.edu>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
by narn.NetBSD.org (Postfix) with ESMTP id 3A7C363B101
for <gnats-bugs@gnats.NetBSD.org>; Wed, 30 Jan 2008 01:27:30 +0000 (UTC)
Message-Id: <20080130012855.BAC44F905@tanaqui.eecs.harvard.edu>
Date: Tue, 29 Jan 2008 20:28:55 -0500 (EST)
From: dholland@eecs.harvard.edu
Reply-To: dholland@eecs.harvard.edu
To: gnats-bugs@gnats.NetBSD.org
Subject: vt100 + wscons + tty vmlocking changes = panic
X-Send-Pr-Version: 3.95
>Number: 37915
>Category: kern
>Synopsis: vt100 + wscons + tty vmlocking changes = panic
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Jan 30 01:30:00 +0000 2008
>Closed-Date:
>Last-Modified: Mon Apr 09 05:58:46 +0000 2012
>Originator: David A. Holland <dholland@eecs.harvard.edu>
>Release: NetBSD 4.99.50 (20080127)
>Organization:
>Environment:
System: NetBSD valkyrie 4.99.50 NetBSD 4.99.50 (VALKYRIE) #3: Sun Jan 27 14:54:46 EST 2008 root@valkyrie:/usr/src/sys/arch/amd64/compile/VALKYRIE amd64
Architecture: x86_64
Machine: amd64
>Description:
Sending some VT100 control sequences to wscons will panic the kernel.
The problem is that ttwrite() holds tty_lock while calling ttstart();
this makes its way down to wsemul_vt100_handle_csi(), which for
several sequences that report stuff back calls wsdisplay_emulinput(),
which calls ttyinput(), which tries to get tty_lock again.
>How-To-Repeat:
Run some program that sends such a sequence (e.g. tset) or echo one
manually, while logged in on the console.
>Fix:
One could make the lock recursive, but that's gross. Probably the best
bet would be to make another entry in linesw (l_rintlocked?) for when
one already holds the tty lock, and use that in wsdisplay_emulinput().
Or just ("just") hunt down all (other) callers of l_rint and make them
get the lock first. The comment above tty_input suggests this is the
eventual plan anyway...
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: kern-bug-people->ad
Responsible-Changed-By: ad@narn.netbsd.org
Responsible-Changed-When: Wed, 30 Jan 2008 08:40:32 +0000
Responsible-Changed-Why:
take
State-Changed-From-To: open->analyzed
State-Changed-By: ad@NetBSD.org
State-Changed-When: Fri, 30 May 2008 16:28:28 +0000
State-Changed-Why:
'resize' is a good test case.
I'm fixing all the tty locking so this is no longer a problem. All the linesw
methods will need to be called with tty_lock held.
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org
Subject: Re: kern/37915: vt100 + wscons + tty vmlocking changes = panic
Date: Mon, 8 Dec 2008 17:36:47 +0000
On Wed, Jan 30, 2008 at 01:30:00AM +0000, dholland@eecs.harvard.edu wrote:
> >Synopsis: vt100 + wscons + tty vmlocking changes = panic
>
> Sending some VT100 control sequences to wscons will panic the kernel.
>
> The problem is that ttwrite() holds tty_lock while calling ttstart();
> this makes its way down to wsemul_vt100_handle_csi(), which for
> several sequences that report stuff back calls wsdisplay_emulinput(),
> which calls ttyinput(), which tries to get tty_lock again.
This still happens with up-to-the-minute current.
--
David A. Holland
dholland@netbsd.org
From: David Holland <dholland-bugs@netbsd.org>
To: dholland@eecs.harvard.edu
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org, gnats-bugs@netbsd.org
Subject: Re: kern/37915: vt100 + wscons + tty vmlocking changes = panic
Date: Fri, 2 Jan 2009 09:07:32 +0000
On Wed, Jan 30, 2008 at 01:30:00AM +0000, dholland@eecs.harvard.edu wrote:
> The problem is that ttwrite() holds tty_lock while calling ttstart();
> this makes its way down to wsemul_vt100_handle_csi(), which for
> several sequences that report stuff back calls wsdisplay_emulinput(),
> which calls ttyinput(), which tries to get tty_lock again.
Making a small and noninvasive patch for this (as ad@ requested) turns
out to be not so trivial, since it turns out that spin mutexes don't
track who's holding them. (And for a while I thought tty_lock was
being released and reacquired somewhere inside ttinput, which would
have made it just about impossible, but that turns out to have been a
red herring.)
Thus, while the following patch could be considered a workaround, I
can't with a straight face call it a fix.
Index: tty.c
===================================================================
RCS file: /cvsroot/src/sys/kern/tty.c,v
retrieving revision 1.228
diff -u -p -r1.228 tty.c
--- tty.c 19 Nov 2008 18:36:07 -0000 1.228
+++ tty.c 2 Jan 2009 09:01:35 -0000
@@ -202,6 +202,10 @@ kmutex_t tty_lock;
krwlock_t ttcompat_lock;
int (*ttcompatvec)(struct tty *, u_long, void *, int, struct lwp *);
+/* XXX: names the cpu that's currently doing ttstart(). See PR 37915. */
+static struct cpu_info *in_ttstart_hack_cpu;
+static unsigned in_ttstart_hack_depth;
+
uint64_t tk_cancc;
uint64_t tk_nin;
uint64_t tk_nout;
@@ -702,11 +706,20 @@ ttyinput_wlock(int c, struct tty *tp)
*
* XXX - this is a hack, all drivers must changed to acquire the
* lock before calling linesw->l_rint()
+ *
+ * XXX - it is a bigger hack; wscons sometimes already has the lock.
+ * We record which cpu is in ttstart (and how many times) to
+ * avoid re-entering the lock when we come back here via wscons.
+ * This is a hack for a specific known case, not a general fix.
+ * Note that we're also misusing mutex_owned, but the specific
+ * usage ought to work and a real fix is planned for later.
+ * PR 37915.
*/
int
ttyinput(int c, struct tty *tp)
{
int error;
+ int dolock = 1;
/*
* Unless the receiver is enabled, drop incoming data.
@@ -714,9 +727,16 @@ ttyinput(int c, struct tty *tp)
if (!ISSET(tp->t_cflag, CREAD))
return (0);
- mutex_spin_enter(&tty_lock);
+ /* The bigger hack. */
+ if (mutex_owned(&tty_lock) && in_ttstart_hack_cpu == curcpu()) {
+ dolock = 0;
+ }
+
+ if (dolock)
+ mutex_spin_enter(&tty_lock);
error = ttyinput_wlock(c, tp);
- mutex_spin_exit(&tty_lock);
+ if (dolock)
+ mutex_spin_exit(&tty_lock);
return (error);
}
@@ -1551,9 +1571,39 @@ ttrstrt(void *tp_arg)
int
ttstart(struct tty *tp)
{
+ struct cpu_info *mycpu;
+
+ mycpu = curcpu();
+ KASSERT(mutex_owned(&tty_lock));
+
+ /*
+ * Track who's in ttstart. See comment above ttyinput(), and
+ * PR 37915. If someone else was still in ttstart, overwrite
+ * them. This appears to be necessary because something
+ * releases and reacquires tty_lock inside ->t_oproc; I'm
+ * hoping it doesn't call ttyinput() again after that.
+ */
+ if (in_ttstart_hack_cpu == mycpu) {
+ in_ttstart_hack_depth++;
+ } else if (in_ttstart_hack_cpu == NULL) {
+ in_ttstart_hack_cpu = mycpu;
+ in_ttstart_hack_depth = 1;
+ } else {
+ panic("ttstart: unexpected reentrant call "
+ "(tty_lock released inside t_oproc?)\n");
+ }
if (tp->t_oproc != NULL) /* XXX: Kludge for pty. */
(*tp->t_oproc)(tp);
+
+ KASSERT(mycpu == curcpu());
+ KASSERT(in_ttstart_hack_cpu == mycpu);
+ KASSERT(in_ttstart_hack_depth > 0);
+ in_ttstart_hack_depth--;
+ if (in_ttstart_hack_depth == 0) {
+ in_ttstart_hack_cpu = NULL;
+ }
+
return (0);
}
--
David A. Holland
dholland@netbsd.org
State-Changed-From-To: analyzed->feedback
State-Changed-By: drochner@NetBSD.org
State-Changed-When: Thu, 22 Jan 2009 20:47:56 +0000
State-Changed-Why:
applied a workaround which is as small and noninvasive I can think of,
surely it needs a fix in the tty framework, but the assumptions
this is based on are not unrealistic, so this might get a pullup candidate
State-Changed-From-To: feedback->open
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Sat, 07 Feb 2009 07:09:55 +0000
State-Changed-Why:
drochner@'s workaround is committed and pulled up to netbsd-5; however, we
should leave this open until we get some real fixes done. I'll adjust
severity.
Responsible-Changed-From-To: ad->kern-bug-people
Responsible-Changed-By: dholland@NetBSD.org
Responsible-Changed-When: Mon, 09 Apr 2012 05:58:46 +0000
Responsible-Changed-Why:
ad resigned, should not own PRs any more
>Unformatted:
(Contact us)
$NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2007
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.