NetBSD Problem Report #12534
Received: (qmail 14609 invoked from network); 3 Apr 2001 08:26:53 -0000
Message-Id: <200104030826.f338Qlm10165@zhora.cs.hut.fi>
Date: Tue, 3 Apr 2001 11:26:47 +0300 (EEST)
From: Antti Kantee <pooka@iki.fi>
Reply-To: pooka@iki.fi
To: gnats-bugs@gnats.netbsd.org
Subject: processes can hang at ttyout
X-Send-Pr-Version: 3.95
>Number: 12534
>Notify-List: gson@gson.org
>Category: kern
>Synopsis: Processes can hang at exit-time on ttyout
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: kern-bug-people
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Tue Apr 03 08:27:00 +0000 2001
>Closed-Date: Sat May 22 00:32:40 +0000 2021
>Last-Modified: Sat May 22 00:32:40 +0000 2021
>Originator: Antti Kantee
>Release: 1.5.1_ALPHA
>Organization:
>Environment:
System: NetBSD 1.5.1_ALPHA i386
>Description:
Under some conditions an exiting process can hang in the kernel:
zhora:1:~> ps axl
  UID  PID PPID CPU PRI NI VSZ RSS WCHAN  STAT TT     TIME COMMAND
26645 9966    1   0   4  0 200 532 ttyout IE   p0- 0:00.01 (uisp)
I have found nothing short of a reboot that clears the hang.
Inside the kernel the process traceback looks like this:
#0 mi_switch (p=0xf6740010) at ../../../../kern/kern_synch.c:834
#1 0xc0134cfe in ltsleep (ident=0xf637e164, priority=282,
wmesg=0xc028f263 "ttyout", timo=0, interlock=0x0)
at ../../../../kern/kern_synch.c:482
#2 0xc0144c66 in ttysleep (tp=0xf637e11c, chan=0xf637e164, pri=282,
wmesg=0xc028f263 "ttyout", timo=0) at ../../../../kern/tty.c:2121
#3 0xc0143633 in ttywait (tp=0xf637e11c) at ../../../../kern/tty.c:1085
#4 0xc014367a in ttywflush (tp=0xf637e11c) at ../../../../kern/tty.c:1102
#5 0xc01438d2 in ttylclose (tp=0xf637e11c, flag=3)
at ../../../../kern/tty.c:1223
#6 0xc0105ace in comclose (dev=2048, flag=3, mode=8192, p=0xf6740010)
at ../../../../dev/ic/com.c:905
#7 0xc01651a6 in spec_close (v=0xf71a4e4c)
at ../../../../miscfs/specfs/spec_vnops.c:653
#8 0xc0221ac1 in ufsspec_close (v=0xf71a4e4c)
at ../../../../ufs/ufs/ufs_vnops.c:1758
#9 0xc015c557 in vn_close (vp=0xf69fe948, flags=3, cred=0xc0dd7580,
p=0xf6740010) at ../../../../sys/vnode_if.h:171
#10 0xc015ce07 in vn_closefile (fp=0xf671fe74, p=0xf6740010)
at ../../../../kern/vfs_vnops.c:605
#11 0xc012b73a in closef (fp=0xf671fe74, p=0xf6740010)
at ../../../../kern/kern_descrip.c:1090
#12 0xc012b56d in fdfree (p=0xf6740010) at ../../../../kern/kern_descrip.c:965
#13 0xc012d1a2 in exit1 (p=0xf6740010, rv=2)
at ../../../../kern/kern_exit.c:185
#14 0xc01333bb in sigexit (p=0xf6740010, signum=2)
at ../../../../kern/kern_sig.c:1281
#15 0xc01331ba in postsig (signum=2) at ../../../../kern/kern_sig.c:1178
#16 0xc023a0f2 in syscall (frame={tf_es = 31, tf_ds = 31,
tf_edi = -1077946988, tf_esi = 1, tf_ebp = -1077957156,
tf_ebx = -1077957188, tf_edx = 3404, tf_ecx = 4, tf_eax = 4,
tf_trapno = 3, tf_err = 2, tf_eip = 1209026483, tf_cs = 23,
tf_eflags = 519, tf_esp = -1077957320, tf_ss = 31, tf_vm86_es = 0,
tf_vm86_ds = 0, tf_vm86_fs = 0, tf_vm86_gs = 0})
at ../../../../arch/i386/i386/trap.c:187
#17 0xc0100cab in syscall1 ()
Here's one from another process which seems to be suffering from the
same problem:
#0 mi_switch (p=0xf66a8980) at ../../../../kern/kern_synch.c:834
#1 0xc0134cfe in ltsleep (ident=0xf637e5d4, priority=282,
wmesg=0xc028f263 "ttyout", timo=0, interlock=0x0)
at ../../../../kern/kern_synch.c:482
#2 0xc0144c66 in ttysleep (tp=0xf637e58c, chan=0xf637e5d4, pri=282,
wmesg=0xc028f263 "ttyout", timo=0) at ../../../../kern/tty.c:2121
#3 0xc0143633 in ttywait (tp=0xf637e58c) at ../../../../kern/tty.c:1085
#4 0xc014367a in ttywflush (tp=0xf637e58c) at ../../../../kern/tty.c:1102
#5 0xc01438d2 in ttylclose (tp=0xf637e58c, flag=3)
at ../../../../kern/tty.c:1223
#6 0xc0144fa5 in ptsclose (dev=1282, flag=3, mode=8192, p=0xf66a8980)
at ../../../../kern/tty_pty.c:173
#7 0xc01651a6 in spec_close (v=0xf6dd7e4c)
at ../../../../miscfs/specfs/spec_vnops.c:653
#8 0xc0221ac1 in ufsspec_close (v=0xf6dd7e4c)
at ../../../../ufs/ufs/ufs_vnops.c:1758
#9 0xc015c557 in vn_close (vp=0xf6a52d0c, flags=3, cred=0xc0bb6f80,
p=0xf66a8980) at ../../../../sys/vnode_if.h:171
#10 0xc015ce07 in vn_closefile (fp=0xf671ff34, p=0xf66a8980)
at ../../../../kern/vfs_vnops.c:605
#11 0xc012b73a in closef (fp=0xf671ff34, p=0xf66a8980)
at ../../../../kern/kern_descrip.c:1090
#12 0xc012b56d in fdfree (p=0xf66a8980) at ../../../../kern/kern_descrip.c:965
#13 0xc012d1a2 in exit1 (p=0xf66a8980, rv=9)
at ../../../../kern/kern_exit.c:185
#14 0xc01333bb in sigexit (p=0xf66a8980, signum=9)
at ../../../../kern/kern_sig.c:1281
#15 0xc01331ba in postsig (signum=9) at ../../../../kern/kern_sig.c:1178
#16 0xc023a0f2 in syscall (frame={tf_es = 43, tf_ds = 43, tf_edi = 134736144,
tf_esi = 10, tf_ebp = -1077963604, tf_ebx = 10, tf_edx = 17,
tf_ecx = 135339784, tf_eax = 4, tf_trapno = 3, tf_err = 2,
tf_eip = 1209121465, tf_cs = 35, tf_eflags = 514, tf_esp = -1077963620,
tf_ss = 43, tf_vm86_es = 0, tf_vm86_ds = 0, tf_vm86_fs = 0,
tf_vm86_gs = 0}) at ../../../../arch/i386/i386/trap.c:187
#17 0xc0100cab in syscall1 ()
Especially notice frame #6!
>How-To-Repeat:
Not sure. Run processes that use e.g. a com port until the problem shows up.
>Fix:
don't know
>Release-Note:
>Audit-Trail:
From: Antti Kantee <pooka@iki.fi>
To: gnats-bugs@gnats.netbsd.org
Cc:
Subject: Re: kern/12534: processes can hang at ttyout
Date: Tue, 3 Apr 2001 14:10:00 +0300
Okay, some more information as a result of me trying to play around with
this:
(gdb) frame 3
#3 0xc0143633 in ttywait (tp=0xf637e11c) at ../../../../kern/tty.c:1085
1085 error = ttysleep(tp, &tp->t_outq, TTOPRI | PCATCH, ttyout, 0);
(gdb) l
1080 s = spltty();
1081 while ((tp->t_outq.c_cc || ISSET(tp->t_state, TS_BUSY)) &&
1082 CONNECTED(tp) && tp->t_oproc) {
1083 (*tp->t_oproc)(tp);
1084 SET(tp->t_state, TS_ASLEEP);
1085 error = ttysleep(tp, &tp->t_outq, TTOPRI | PCATCH, ttyout, 0);
1086 if (error)
1087 break;
1088 }
1089 splx(s);
(gdb) print *tp->t_oproc
$7 = {void ()} 0xc01067c8 <comstart>
dev/ic/com.c:comstart() is the only place where wakeup is called with
&t_outq. I'm still not quite sure what's going on here, but something is
obviously going wrong.
And yes, the problem is with a serial device.
--
Antti Kantee <pooka@iki.fi> v Of course he runs NetBSD
http://www.iki.fi/pooka/ i http://www.NetBSD.org/
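For readers following along, the loop in the gdb listing above can be modelled in userland. This is a minimal sketch, not kernel code: struct fake_tty, fake_start() and fake_ttywait() are hypothetical names, the stalled flag is an assumed stand-in for whatever keeps the driver from sending (the report does not establish the cause), and a sleep is modelled as a counter capped by max_sleeps rather than a real ltsleep().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few struct tty fields the loop reads. */
struct fake_tty {
	int  outq_cc;		/* queued output, models tp->t_outq.c_cc */
	bool busy;		/* models TS_BUSY */
	bool connected;		/* models CONNECTED(tp) */
	bool stalled;		/* assumption: the driver cannot send */
	void (*oproc)(struct fake_tty *);	/* models tp->t_oproc */
};

/* Stand-in for comstart(): sends one character unless stalled. */
void
fake_start(struct fake_tty *tp)
{
	if (!tp->stalled && tp->outq_cc > 0)
		tp->outq_cc--;
}

/*
 * Same exit condition as the quoted ttywait() loop.  Every pass that
 * finds work pending calls oproc and then counts as one sleep.
 * Returns the number of sleeps taken, or -1 if output is still pending
 * after max_sleeps; with timo=0 the kernel would keep sleeping instead.
 */
int
fake_ttywait(struct fake_tty *tp, int max_sleeps)
{
	int sleeps = 0;

	assert(tp != NULL && max_sleeps >= 0);
	while ((tp->outq_cc || tp->busy) && tp->connected && tp->oproc) {
		tp->oproc(tp);
		if (sleeps == max_sleeps)
			return -1;
		sleeps++;
	}
	return sleeps;
}
```

With three queued characters and stalled set to false, fake_ttywait(&tp, 10) returns 3. With stalled set to true it returns -1 and the queue is untouched, which corresponds to the process sitting in ttyout because the timeout passed to ttysleep() is 0. Clearing connected makes the loop exit immediately.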
From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/12534 (Processes can hang at exit-time on ttyout)
Date: Wed, 20 Aug 2014 11:18:29 +0300
I'm suffering from what looks like kern/12534 on NetBSD 6.1.3/amd64.
When I connect to the serial console of another system using tip(1)
over a com(4) port, sometimes the tip process hangs indefinitely while
exiting:
UID   PID PPID CPU PRI NI VSZ RSS WCHAN STAT TTY       TIME COMMAND
  0 12964    1   0   0  0   0   0 -     DE   pts/4- 0:00.00 (tip)
Kernel stack trace of the tip process from ddb:
trace: pid 12964 lid 1 at 0xfffffe8000078850
sleepq_block() at netbsd:sleepq_block+0xad
cv_timedwait_sig() at netbsd:cv_timedwait_sig+0xaa
ttysleep() at netbsd:ttysleep+0x55
ttywait() at netbsd:ttywait+0x5c
ttywflush() at netbsd:ttywflush+0x11
ttylclose() at netbsd:ttylclose+0x17
comclose() at netbsd:comclose+0x62
cdev_close() at netbsd:cdev_close+0x6a
spec_close() at netbsd:spec_close+0x106
VOP_CLOSE() at netbsd:VOP_CLOSE+0x33
vn_close() at netbsd:vn_close+0x4e
closef() at netbsd:closef+0x4a
fd_free() at netbsd:fd_free+0xba
exit1() at netbsd:exit1+0xf9
sys_exit() at netbsd:sys_exit+0x3e
syscall() at netbsd:syscall+0xc4
Unplugging and replugging the serial port resolves the situation (I'm
mentioning this because it may offer some clue as to the cause of the
bug, not to suggest that it is an acceptable work-around - for an
unattended system, it clearly isn't).
--
Andreas Gustafsson, gson@gson.org
From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@NetBSD.org, Antti Kantee <pooka@iki.fi>
Cc:
Subject: Re: kern/12534 (Processes can hang at exit-time on ttyout)
Date: Fri, 12 Jun 2015 15:49:54 +0300
Hi,
My system that was suffering from symptoms similar to kern/12534
has now been running with the following patch for more than nine
months, and it seems to have fixed the problem and not caused any new
ones. OK to commit?
Index: src/sys/kern/tty.c
===================================================================
RCS file: /bracket/repo/src/sys/kern/tty.c,v
retrieving revision 1.249.8.2
diff -u -r1.249.8.2 tty.c
--- src/sys/kern/tty.c 20 Aug 2012 19:15:36 -0000 1.249.8.2
+++ src/sys/kern/tty.c 30 Aug 2014 13:51:17 -0000
@@ -1525,10 +1525,10 @@
 }
 
 /*
- * Wait for output to drain.
+ * Wait for output to drain, or if this times out, flush it.
  */
 int
-ttywait(struct tty *tp)
+ttywait_timo(struct tty *tp, int timo)
 {
 	int error;
 
@@ -1538,9 +1538,11 @@
 	while ((tp->t_outq.c_cc || ISSET(tp->t_state, TS_BUSY)) &&
 	    CONNECTED(tp) && tp->t_oproc) {
 		(*tp->t_oproc)(tp);
-		error = ttysleep(tp, &tp->t_outcv, true, 0);
-		if (error)
+		error = ttysleep(tp, &tp->t_outcv, true, timo);
+		if (error) {
+			ttyflush(tp, FWRITE);
 			break;
+		}
 	}
 	mutex_spin_exit(&tty_lock);
 
@@ -1548,6 +1550,15 @@
 }
 
 /*
+ * Wait for output to drain.
+ */
+int
+ttywait(struct tty *tp)
+{
+	return ttywait_timo(tp, 0);
+}
+
+/*
  * Flush if successfully wait.
  */
 int
@@ -1555,7 +1566,8 @@
 {
 	int error;
 
-	if ((error = ttywait(tp)) == 0) {
+	error = ttywait_timo(tp, 5 * hz);
+	if (error == 0 || error == EWOULDBLOCK) {
 		mutex_spin_enter(&tty_lock);
 		ttyflush(tp, FREAD);
 		mutex_spin_exit(&tty_lock);
Index: src/sys/sys/tty.h
===================================================================
RCS file: /bracket/repo/src/sys/sys/tty.h,v
retrieving revision 1.90
diff -u -r1.90 tty.h
--- src/sys/sys/tty.h 24 Sep 2011 00:05:38 -0000 1.90
+++ src/sys/sys/tty.h 30 Aug 2014 13:51:17 -0000
@@ -286,6 +286,7 @@
 int ttysleep(struct tty *, kcondvar_t *, bool, int);
 int ttypause(struct tty *, int);
 int ttywait(struct tty *);
+int ttywait_timo(struct tty *, int timo);
 int ttywflush(struct tty *);
 void ttysig(struct tty *, enum ttysigtype, int);
 void tty_attach(struct tty *);
--
Andreas Gustafsson, gson@gson.org
From: christos@zoulas.com (Christos Zoulas)
To: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org,
gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, pooka@iki.fi
Cc: gson@gson.org
Subject: Re: kern/12534 (Processes can hang at exit-time on ttyout)
Date: Fri, 12 Jun 2015 10:35:19 -0400
On Jun 12, 12:55pm, gson@gson.org (Andreas Gustafsson) wrote:
-- Subject: Re: kern/12534 (Processes can hang at exit-time on ttyout)
| Hi,
|
| My system that was suffering from symptoms similar to kern/12534
| has now been running with the following patch for more than nine
| months, and it seems to have fixed the problem and not caused any new
| ones. OK to commit?
Well, no need to expose ttywait_timo() since it is only used internally
(it should be static). Your fix changes the behavior of ttywflush(), which
looks fine for now, since only if_sl.c and if_strip.c use it directly, and
in that case it does not matter since they call it on close. I would add
a comment to ttywflush(), though, since it no longer waits forever
as it used to, and if one really wants to wait for it to complete, one
needs to check for EWOULDBLOCK and put it in a loop. Nobody even checks
the return value of ttywflush() now, so perhaps even adding the comment is
excessive :-)
I think you should commit it (with the static)
christos
From: "Andreas Gustafsson" <gson@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/12534 CVS commit: src/sys/kern
Date: Fri, 12 Jun 2015 17:28:53 +0000
Module Name: src
Committed By: gson
Date: Fri Jun 12 17:28:53 UTC 2015
Modified Files:
src/sys/kern: tty.c
Log Message:
When closing a tty, limit the amount of time spent waiting for the
output to drain to five seconds so that exiting processes with
buffered output for a serial port blocked by flow control do not
hang indefinitely. Should fix PR kern/12534. OK christos.
To generate a diff of this commit:
cvs rdiff -u -r1.262 -r1.263 src/sys/kern/tty.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
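The behaviour described in the log message can be illustrated with a small userland simulation. This is a sketch under stated assumptions rather than the committed code: struct sim_tty, sim_ttywait_timo(), sim_ttywflush() and SIM_HZ are hypothetical names, time is counted in loop iterations instead of real clock ticks, one character drains per tick, and only the timeout case is modelled (in the patch posted earlier in this PR, any error from ttysleep() also triggers the flush).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define SIM_HZ 100	/* assumption: 100 clock ticks per second */

/* Hypothetical stand-in for the tty state the close path touches. */
struct sim_tty {
	int  outq_cc;	/* output characters still queued */
	int  inq_cc;	/* input characters still queued */
	bool stalled;	/* assumption: output held off by flow control */
};

/*
 * Wait up to timo ticks for the output queue to empty, draining one
 * character per tick.  On timeout, discard the pending output and
 * return EWOULDBLOCK, as the patched loop does with
 * ttyflush(tp, FWRITE).  In the kernel a timo of 0 means "wait
 * forever"; this sketch does not model that and requires timo > 0.
 */
int
sim_ttywait_timo(struct sim_tty *tp, int timo)
{
	int ticks;

	assert(tp != NULL && timo > 0);
	for (ticks = 0; tp->outq_cc > 0; ticks++) {
		if (ticks == timo) {
			tp->outq_cc = 0;	/* flush pending output */
			return EWOULDBLOCK;
		}
		if (!tp->stalled)
			tp->outq_cc--;
	}
	return 0;
}

/* Close-time path: bounded drain, then flush input on success or timeout. */
int
sim_ttywflush(struct sim_tty *tp)
{
	int error = sim_ttywait_timo(tp, 5 * SIM_HZ);

	if (error == 0 || error == EWOULDBLOCK)
		tp->inq_cc = 0;		/* models ttyflush(tp, FREAD) */
	return error;
}
```

A tty that drains normally returns 0 from sim_ttywflush(); a stalled one returns EWOULDBLOCK after 5 * SIM_HZ ticks with both queues emptied, so the caller (here standing in for the exiting process) always gets control back.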
State-Changed-From-To: open->closed
State-Changed-By: gson@NetBSD.org
State-Changed-When: Fri, 12 Jun 2015 17:48:00 +0000
State-Changed-Why:
Should be fixed now.
State-Changed-From-To: closed->open
State-Changed-By: gson@NetBSD.org
State-Changed-When: Sat, 13 Jun 2015 07:46:47 +0000
State-Changed-Why:
The change of tty.c 1.263 broke the lib/libc/ttyio/t_ttyio/ioctl
test case; reverting it until I have figured out what the problem is.
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: gson@gson.org
Subject: Re: kern/12534 (Processes can hang at exit-time on ttyout)
Date: Tue, 16 Jun 2015 06:32:46 +0000
On Sat, Jun 13, 2015 at 07:46:48AM +0000, gson@NetBSD.org wrote:
> Synopsis: Processes can hang at exit-time on ttyout
>
> State-Changed-From-To: closed->open
> State-Changed-By: gson@NetBSD.org
> State-Changed-When: Sat, 13 Jun 2015 07:46:47 +0000
> State-Changed-Why:
> The change of tty.c 1.263 broke the lib/libc/ttyio/t_ttyio/ioctl
> test case; reverting it until I have figured out what the problem is.
Also, while you're flogging this, have a look at 17171, which is very
closely related.
--
David A. Holland
dholland@netbsd.org
State-Changed-From-To: open->pending-pullups
State-Changed-By: gson@NetBSD.org
State-Changed-When: Wed, 14 Oct 2015 19:40:40 +0000
State-Changed-Why:
The process will exit after a five-second timeout as of tty.c 1.267.
From: coypu@sdf.org
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/12534: Processes can hang at exit-time on ttyout
Date: Tue, 6 Jun 2017 03:56:43 +0000
was a pullup filed?
From: Andreas Gustafsson <gson@gson.org>
To: coypu@sdf.org
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/12534: Processes can hang at exit-time on ttyout
Date: Tue, 6 Jun 2017 10:57:13 +0300
coypu@sdf.org wrote:
> was a pullup filed?
I'm afraid I still haven't gotten around to it. If you would
like to file it, feel free to do so.
--
Andreas Gustafsson, gson@gson.org
State-Changed-From-To: pending-pullups->feedback
State-Changed-By: jdolecek@NetBSD.org
State-Changed-When: Thu, 17 Aug 2017 19:04:00 +0000
State-Changed-Why:
This issue may be fixed in rev 1.263 of tty.c, which is only on netbsd-8.
Do you think this still needs to be pulled up to older releases?
State-Changed-From-To: feedback->closed
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Tue, 10 Apr 2018 09:30:45 +0000
State-Changed-Why:
There is no need to pull this up.
From: matthew green <mrg@eterna.com.au>
To: gnats-bugs@NetBSD.org
Cc: gson@gson.org, kern-bug-people@netbsd.org, netbsd-bugs@netbsd.org,
gnats-admin@netbsd.org, dholland@NetBSD.org, pooka@iki.fi
Subject: re: kern/12534 (Processes can hang at exit-time on ttyout)
Date: Wed, 11 Apr 2018 04:07:19 +1000
dholland@NetBSD.org writes:
> Synopsis: Processes can hang at exit-time on ttyout
>
> State-Changed-From-To: feedback->closed
> State-Changed-By: dholland@NetBSD.org
> State-Changed-When: Tue, 10 Apr 2018 09:30:45 +0000
> State-Changed-Why:
> There is no need to pull this up.
why not? it still affects released systems. i used to see this
with netbsd-7.
.mrg.
State-Changed-From-To: closed->open
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Wed, 11 Apr 2018 03:23:12 +0000
State-Changed-Why:
Because I mixed it up with the pty version, which only manifests when
the pty master is broken.
Oops :-/
State-Changed-From-To: open->closed
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Sat, 22 May 2021 00:32:40 +0000
State-Changed-Why:
There is now no need to pull it up because -7 is EOL. :-|
>Unformatted:
Copyright © 1994-2020
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.