NetBSD Problem Report #46522

From www@NetBSD.org  Sat Jun  2 15:46:02 2012
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
	by www.NetBSD.org (Postfix) with ESMTP id 069C163C785
	for <gnats-bugs@gnats.NetBSD.org>; Sat,  2 Jun 2012 15:46:02 +0000 (UTC)
Message-Id: <20120602154601.3A95B63BA27@www.NetBSD.org>
Date: Sat,  2 Jun 2012 15:46:01 +0000 (UTC)
From: nathanialsloss@yahoo.com.au
Reply-To: nathanialsloss@yahoo.com.au
To: gnats-bugs@NetBSD.org
Subject: wscons: deleting a screen causes kernel crash
X-Send-Pr-Version: www-1.0

>Number:         46522
>Category:       kern
>Synopsis:       wscons: deleting a screen causes kernel crash
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    nat
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Jun 02 15:50:00 +0000 2012
>Closed-Date:    Mon Nov 13 13:22:02 +0000 2023
>Last-Modified:  Mon Nov 13 13:22:02 +0000 2023
>Originator:     Nat Sloss
>Release:        NetBSD Current 6.99.6
>Organization:
>Environment:
NetBSD beast 6.99.6 NetBSD 6.99.6 (LOCKDEBUG) #54: Sat Jun  2 17:32:26 EST 2012  build@beast:/usr/src/sys/arch/i386/compile/obj/LOCKDEBUG i386

>Description:
In my configuration 4 virtual screens are created at startup as specified in wscons.conf.

I thought I could change the resolution of one of the screens (this is not the case, as I have since discovered), and whilst attempting to delete one of the four screens the kernel crashed.  This happens whether a getty is running on that screen or not.

The following debug output was obtained:

test# wsconscfg -dF 3
Condition variable error: lockdebug_free: is locked or in use

lock address : 0x00000000c1349828 type     :               spin
initialized  : 0x00000000c07af190 interlock: 0x00000000c0c5a7d0

panic: LOCKDEBUG
fatal breakpoint trap in supervisor mode
trap type 1 code 0 eip c025c1a4 cs 8 eflags 282 cr2 bbbd30d4 ilevel 0
Stopped in pid 531.1 (wsconscfg) at     netbsd:breakpoint+0x4:  popl    %ebp
db{0}> bt
breakpoint(c0b68789,c0c1a940,c0b59f5d,c82bb838,0,c144fc80,c14a1800,c174374c,0,c0ac323a) at netbsd:breakpoint+0x4
vpanic(c0b59f5d,c82bb838,c82bb86c,c1743740,0,c0ac323a,c82bb86c,c0759df4,c0b59f5d,c0b27db6) at netbsd:vpanic+0x1e2
printf_nolog(c0b59f5d,c0b27db6,c0ac323a,c0b59f67,c1422d40,c0b89814,1ac3148,c0b59f67,c117d4cc,c0c5a7d0) at netbsd:printf_nolog
lockdebug_more(c0b59f67,1,ffffffff,c0c5a7d0,c1349878,c1349834,c0b89814,c07af482,c14a1800,400) at netbsd:lockdebug_more
lockdebug_free(c1349828,0,6,c117d4cc,3,c121bcac,c0c513c4,c13498b0,c1349860,c13498a4) at netbsd:lockdebug_free+0xc9
tty_free(c1349800,c0c19ba0,c074eb25,2f,c117d700,c074cf53,6,c121bcac,3,c121bcac) at netbsd:tty_free+0x197
wsscreen_detach(c121bcac,8008574f,c0c19ba0,c0517ce4,c0c19ba2,c074cf53,0,c124be40,0,c1248200) at netbsd:wsscreen_detach+0x24
wsdisplay_delscreen(c124be40,3,1,c074eb5b,c0c19ba0,0,2fff,ff,c1248200,8008574f) at netbsd:wsdisplay_delscreen+0xa6
wsdisplay_cfg_ioctl(c124be40,8008574f,c82bbc24,3,c1422d40,c1422d40,c14af4c0,c14af4c0,c0ad01a0,c1422d40) at netbsd:wsdisplay_cfg_ioctl+0x7d
cdev_ioctl(2fff,0,8008574f,c82bbc24,3,c1422d40,3,2000,0,c0854f44) at netbsd:cdev_ioctl+0x9a
spec_ioctl(c82bbabc,0,c0854f44,c14af4c0,c1422d40,0,c82bbaec,c08c485c,0,c18dca00) at netbsd:spec_ioctl+0xdd
VOP_IOCTL(c14b8790,8008574f,c82bbc24,3,c1199540,c140c59c,c82bbb4c,c075a4bb,c18dca0c,c140c59c) at netbsd:VOP_IOCTL+0x3e
vn_ioctl(c140c580,8008574f,c82bbc24,c0759b52,0,c19041c0,c0acf9d8,c075f570,6,10) at netbsd:vn_ioctl+0x68
sys_ioctl(c1422d40,c82bbcf4,c82bbd1c,c12f785c,0,c12fcc6c,bbbd30d4,36,c12f785c,2) at netbsd:sys_ioctl+0x1b2
syscall(c82bbd48,bbbc00b3,ab,bfbf001f,bbbc001f,0,0,bfbfedb8,0,bfbfed94) at netbsd:syscall+0x95


Note: the addresses of the tty functions will be different as my kern/tty.c was modified in an attempt to fix the problem with no effect.
>How-To-Repeat:
wsconscfg -dF 3

Attempt to delete screen 3.
>Fix:

>Release-Note:

>Audit-Trail:
From: Nat Sloss <nathanialsloss@yahoo.com.au>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/46522
Date: Mon, 4 Jun 2012 01:18:13 +1000

 Hi.

 I've found that it only crashes when deleting a screen that has a getty 
 running on it.  Deleting a screen that has no running programmes works.

 What is causing the problem is cv_destroy(&tp->t_rawcv).  I believe that getty 
 is waiting on that condition variable, as the crash says that the cv is in 
 use.

 Destroying the other cvs found in tty_free works.

 I would like to kill the getty before destroying the cvs, but I have no idea 
 how.  I've tried sending signals with no success, and I have no idea how 
 to make it wait until the process is killed, but I think this is the source 
 of the problem.

 Any ideas?

 Regards,

 Nat.

From: Nat Sloss <nathanialsloss@yahoo.com.au>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/46522
Date: Mon, 4 Jun 2012 04:07:56 +1000

 Hi, I think I've fixed it (although I don't think my patch is very good).

 The trick was to add a delay (I didn't use delay :) ).

 The getty was closing, but before it had finished the cv was being destroyed.

 So what was needed was a wait (the only thing I thought of) so that the getty 
 or other programme would finish with the screen and then the cvs could be 
 destroyed without crashing.

 So here is my patch:

 ===================================================================
 RCS file: /cvsroot/src/sys/kern/tty.c,v
 retrieving revision 1.250
 diff -u -r1.250 tty.c
 --- sys/kern/tty.c      12 Mar 2012 18:27:08 -0000      1.250
 +++ sys/kern/tty.c      3 Jun 2012 17:58:19 -0000
 @@ -2770,6 +2770,9 @@
                 sigemptyset(&tp->t_sigs[i]);
         if (tp->t_sigcount != 0)
                 TAILQ_REMOVE(&tty_sigqueue, tp, t_sigqueue);
 +
 +       ttysleep(tp, &tp->t_rawcv, true, mstohz(200));
 +
         mutex_exit(&tty_lock);
         mutex_exit(proc_lock);



 I don't think this is the best way, because a longer sleep might be 
 needed.

 ...Anyway it stops the crashes for me.

 Note: This patch is my own work which I submit under the NetBSD license.

 Regards,

 Nat.

 PS: Would you agree that delay and DELAY should be reimplemented as 
 high-resolution sleeps?  I think that delay has no place in a 
 multitasking/multiprocess OS such as NetBSD.

From: Nat Sloss <nathanialsloss@yahoo.com.au>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/46522
Date: Mon, 4 Jun 2012 04:12:20 +1000

 Hi,

 I forgot to ask: when a solution is found, could it be pulled up to NetBSD 6?

 Regards,

 Nat.

From: jnemeth@victoria.tc.ca (John Nemeth)
To: gnats-bugs@NetBSD.org, kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org,
        netbsd-bugs@NetBSD.org, nathanialsloss@yahoo.com.au
Cc: 
Subject: Re: kern/46522
Date: Sun, 3 Jun 2012 11:33:58 -0700

 On Sep 19,  3:32am, Nat Sloss wrote:
 } 
 } From: Nat Sloss <nathanialsloss@yahoo.com.au>
 } To: gnats-bugs@netbsd.org
 } Subject: Re: kern/46522
 } Date: Mon, 4 Jun 2012 01:18:13 +1000
 } 
 }  Hi.
 }  
 }  I've found that it only crashes when deleting a screen that has a getty 
 }  running on it.  Deleting a screen that has no running programmes works.
 }  
 }  What is causing the problem is cv_destroy(&tp->t_rawcv).  I believe that getty 
 }  is waiting on that condition variable, as the crash says that the cv is in 
 }  use.
 }  
 }  Destroying the other cvs found in tty_free works.
 }  
 }  I would like to kill the getty before destroying the cvs, but I have no idea 
 }  how.  I've tried sending signals with no success, and I have no idea how 
 }  to make it wait until the process is killed, but I think this is the source 
 }  of the problem.

      You need to edit /etc/ttys and change status to "off" for the
 appropriate tty, then "kill -1 1" to tell init to reread /etc/ttys.  If
 you just kill the getty init will restart it.

      However, there is still a bug.  The system shouldn't crash.  It
 should either error out when the tty is in use, or forcibly detach all
 processes from the tty.

 }-- End of excerpt from Nat Sloss

From: Nat Sloss <nathanialsloss@yahoo.com.au>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/46522
Date: Mon, 4 Jun 2012 12:36:09 +1000

 Hi,

 I have a marginally better patch.  It was a bad idea to sleep whilst holding 
 proc_lock, as that would cause the system to hang and be unresponsive.

 So this one is better:

 Index: sys/kern/tty.c
 ===================================================================
 RCS file: /cvsroot/src/sys/kern/tty.c,v
 retrieving revision 1.250
 diff -u -r1.250 tty.c
 --- sys/kern/tty.c      12 Mar 2012 18:27:08 -0000      1.250
 +++ sys/kern/tty.c      4 Jun 2012 02:36:04 -0000
 @@ -2762,7 +2762,11 @@
  void
  tty_free(struct tty *tp)
  {
 -       int i;
 +       int i, timeout;
 +
 +       timeout = mstohz(200);
 +       if (timeout == 0)
 +               timeout = 1;

         mutex_enter(proc_lock);
         mutex_enter(&tty_lock);
 @@ -2770,9 +2774,11 @@
                 sigemptyset(&tp->t_sigs[i]);
         if (tp->t_sigcount != 0)
                 TAILQ_REMOVE(&tty_sigqueue, tp, t_sigqueue);
 -       mutex_exit(&tty_lock);
         mutex_exit(proc_lock);

 +       ttysleep(tp, &tp->t_rawcv, true, mstohz(200));
 +       mutex_exit(&tty_lock);
 +
         callout_halt(&tp->t_rstrt_ch, NULL);
         callout_destroy(&tp->t_rstrt_ch);
         ttyldisc_release(tp->t_linesw);


 Note: This patch is my own work which I submit under the NetBSD license.

 As an afterthought, it probably should panic if there are any waiters on 
 t_rawcv.  Is a panic better than a crash? :)

 Regards,

 Nat.
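
 The patch above clamps the result of mstohz(200) to at least one tick,
 presumably so that a zero value is never used as the sleep timeout.  A
 minimal user-space sketch of that clamping follows.  ms_to_ticks() is a
 hypothetical helper, not the kernel's mstohz(), and its truncating
 division is an assumption that may not match how the kernel rounds.

```c
#include <assert.h>

/*
 * Hypothetical helper: convert milliseconds to scheduler ticks at a clock
 * rate of hz ticks per second, never returning fewer than one tick.
 * Truncating division is an assumption made for this sketch.
 */
static int
ms_to_ticks(int ms, int hz)
{
	long ticks = (long)ms * hz / 1000;

	/* Clamp so that a short interval on a slow clock is not lost. */
	return ticks < 1 ? 1 : (int)ticks;
}
```

 With hz = 100, ms_to_ticks(200, 100) gives 20 ticks; with a very slow
 clock the division truncates to zero and the clamp returns 1 instead.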

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/46522
Date: Mon, 4 Jun 2012 06:27:31 +0000

 On Mon, Jun 04, 2012 at 02:40:05AM +0000, Nat Sloss wrote:
  >  Is a panic better than a crash? :)

 A panic *is* a crash.

 -- 
 David A. Holland
 dholland@netbsd.org

From: David Laight <david@l8s.co.uk>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org, nathanialsloss@yahoo.com.au
Subject: Re: kern/46522
Date: Mon, 4 Jun 2012 10:06:18 +0100

 On Mon, Jun 04, 2012 at 02:40:05AM +0000, Nat Sloss wrote:
 >  
 >  I have a marginally better patch.  It was a bad idea to sleep whilst holding 
 >  proc_lock, as that would cause the system to hang and be unresponsive.
 ...
 >  @@ -2770,9 +2774,11 @@
 >                  sigemptyset(&tp->t_sigs[i]);
 >          if (tp->t_sigcount != 0)
 >                  TAILQ_REMOVE(&tty_sigqueue, tp, t_sigqueue);
 >  -       mutex_exit(&tty_lock);
 >          mutex_exit(proc_lock);
 >  
 >  +       ttysleep(tp, &tp->t_rawcv, true, mstohz(200));
 >  +       mutex_exit(&tty_lock);
 >  +

 While adding a fixed sleep is enough to show where the problem
 lies, it isn't an appropriate solution to the problem.

 You need to properly wait for the close to release a reference
 count on the resource, and then free the relevant data areas.

 The scheduler can always decide to not run the process you
 are waiting for - so the sleep has to be indefinitely long.

 I'm not saying a real fix is easy! The tty subsystem is full
 of places where it isn't remotely MP-safe.

 	David

 -- 
 David Laight: david@l8s.co.uk
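
 For illustration, the reference-count approach described above can be
 sketched in user space with POSIX threads.  This is not NetBSD kernel
 code: struct resource and the resource_*() functions are hypothetical
 names, and a kernel version would use the kernel's own mutex and
 condition variable primitives.  The point it shows is that the destroyer
 waits with no timeout until the last user has let go, instead of sleeping
 for a fixed interval.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a tty-like resource; not the NetBSD struct tty. */
struct resource {
	pthread_mutex_t lock;
	pthread_cond_t  released;  /* broadcast when refs drops to zero */
	int             refs;      /* users that still hold the resource */
	bool            dying;     /* set once teardown has begun */
};

static void
resource_init(struct resource *r)
{
	pthread_mutex_init(&r->lock, NULL);
	pthread_cond_init(&r->released, NULL);
	r->refs = 0;
	r->dying = false;
}

/* Take a reference; refused once teardown has started. */
static bool
resource_acquire(struct resource *r)
{
	bool ok;

	pthread_mutex_lock(&r->lock);
	ok = !r->dying;
	if (ok)
		r->refs++;
	pthread_mutex_unlock(&r->lock);
	return ok;
}

/* Drop a reference and wake the destroyer when the last one goes away. */
static void
resource_release(struct resource *r)
{
	pthread_mutex_lock(&r->lock);
	assert(r->refs > 0);
	if (--r->refs == 0)
		pthread_cond_broadcast(&r->released);
	pthread_mutex_unlock(&r->lock);
}

/* Wait, with no timeout, until every reference is gone, then tear down. */
static void
resource_destroy(struct resource *r)
{
	pthread_mutex_lock(&r->lock);
	r->dying = true;
	while (r->refs > 0)
		pthread_cond_wait(&r->released, &r->lock);
	pthread_mutex_unlock(&r->lock);

	/* Nobody can be waiting on or holding these any more. */
	pthread_cond_destroy(&r->released);
	pthread_mutex_destroy(&r->lock);
}
```

 In this model a closing process would call resource_release() as its last
 step, so resource_destroy() cannot free the condition variable while
 anyone is still using it, however long the scheduler delays that process.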

From: Martin Husemann <martin@duskware.de>
To: Christos Zoulas <christos@zoulas.com>
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/46522
Date: Mon, 4 Jun 2012 15:18:04 +0200

 On Mon, Jun 04, 2012 at 09:14:56AM -0400, Christos Zoulas wrote:
 > What's the plan here? Is the getty process supposed to be killed, and if
 > so by whom? I would prefer if instead it got an EOF from the tty and
 > then it was waited until it exited before the screen was destroyed.

 Or: just fail deletion of the screen while the tty is open.

 Martin

From: christos@zoulas.com (Christos Zoulas)
To: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org, 
	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, 
	nathanialsloss@yahoo.com.au
Cc: 
Subject: Re: kern/46522
Date: Mon, 4 Jun 2012 09:14:56 -0400

 On Jun 4,  9:15am, david@l8s.co.uk (David Laight) wrote:
 -- Subject: Re: kern/46522

 |  While adding a fixed sleep is enough to show where the problem
 |  lies, it isn't an appropriate solution to the problem.
 |  
 |  You need to properly wait for the close to release a reference
 |  count on the resource, and then free the relevant data areas.
 |  
 |  The scheduler can always decide to not run the process you
 |  are waiting for - so the sleep has to be indefinitely long.
 |  
 |  I'm not saying a real fix is easy! The tty subsystem is full
 |  of places where it isn't remotely MP-safe.

 What's the plan here? Is the getty process supposed to be killed, and if
 so by whom? I would prefer if instead it got an EOF from the tty and
 then it was waited until it exited before the screen was destroyed.

 christos

From: christos@zoulas.com (Christos Zoulas)
To: Martin Husemann <martin@duskware.de>
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/46522
Date: Mon, 4 Jun 2012 09:24:55 -0400

 On Jun 4,  3:18pm, martin@duskware.de (Martin Husemann) wrote:
 -- Subject: Re: kern/46522

 | On Mon, Jun 04, 2012 at 09:14:56AM -0400, Christos Zoulas wrote:
 | > What's the plan here? Is the getty process supposed to be killed, and if
 | > so by whom? I would prefer if instead it got an EOF from the tty and
 | > then it was waited until it exited before the screen was destroyed.
 | 
 | Or: just fail deletion of the screen while the tty is open.

 Sure!

 christos
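
 The option agreed on above, failing the deletion while the tty is open,
 could look roughly like the following.  This is a user-space model only:
 struct screen, the screens[] table and screen_delete() are hypothetical
 and do not correspond to the actual wsdisplay data structures, and the
 choice of error codes is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MAX_SCREENS 4

/* Hypothetical model of a virtual screen; not the wsdisplay structures. */
struct screen {
	bool present;     /* screen currently exists */
	int  open_count;  /* number of opens of the screen's tty */
};

static struct screen screens[MAX_SCREENS];

/* Remove screen idx; returns 0 on success or an errno value on failure. */
static int
screen_delete(int idx)
{
	if (idx < 0 || idx >= MAX_SCREENS)
		return EINVAL;
	if (!screens[idx].present)
		return ENXIO;
	/* tty still open: refuse rather than free it under its users. */
	if (screens[idx].open_count > 0)
		return EBUSY;
	screens[idx].present = false;
	return 0;
}
```

 Under this scheme, deleting a screen with a getty on it would report an
 error to the caller, and the administrator would first turn the tty off
 in /etc/ttys as described earlier in this report.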

From: Nathanial Sloss <nathanialsloss@yahoo.com.au>
To: "gnats-bugs@netbsd.org" <gnats-bugs@netbsd.org>
Cc: 
Subject: Re: kern/46522
Date: Mon, 4 Jun 2012 08:58:49 -0700 (PDT)

 Hi,

 The getty doesn't have to be killed; it's dying.  It's just that the cvs are 
 being destroyed before it has finished.

 So I came up with another patch.  If t_pgrp isn't nullified in ttyclose, it 
 would be possible to wait on the processes in the list until they die and 
 then safely free the screen.  So what I did was this:

 Index: sys/kern/tty.c
 ===================================================================
 RCS file: /cvsroot/src/sys/kern/tty.c,v
 retrieving revision 1.250
 diff -u -r1.250 tty.c
 --- sys/kern/tty.c      12 Mar 2012 18:27:08 -0000      1.250
 +++ sys/kern/tty.c      4 Jun 2012 15:46:59 -0000
 @@ -422,7 +422,6 @@
         ttyflush(tp, FREAD | FWRITE);

         tp->t_gen++;
 -       tp->t_pgrp = NULL;
         tp->t_state = 0;
         sess = tp->t_session;
         tp->t_session = NULL;
 @@ -2762,7 +2761,12 @@
  void
  tty_free(struct tty *tp)
  {
 -       int i;
 +       int i, timeout;
 +       struct proc *p;
 +
 +       timeout = mstohz(200);
 +       if (timeout == 0)
 +               timeout = 1;

         mutex_enter(proc_lock);
         mutex_enter(&tty_lock);
 @@ -2770,9 +2774,17 @@
                 sigemptyset(&tp->t_sigs[i]);
         if (tp->t_sigcount != 0)
                 TAILQ_REMOVE(&tty_sigqueue, tp, t_sigqueue);
 -       mutex_exit(&tty_lock);
         mutex_exit(proc_lock);

 +       while (tp->t_pgrp != NULL) {
 +               if ((p = LIST_FIRST(&tp->t_pgrp->pg_members)) != NULL)
 +                       ttysleep(tp, &tp->t_rawcv, true, timeout);
 +               else if (LIST_EMPTY(&tp->t_pgrp->pg_members))
 +                       tp->t_pgrp = NULL;
 +       }
 +       tp->t_pgrp = NULL;
 +       mutex_exit(&tty_lock);
 +
         callout_halt(&tp->t_rstrt_ch, NULL);
         callout_destroy(&tp->t_rstrt_ch);
         ttyldisc_release(tp->t_linesw);


 Note: This patch is my own work which I submit under the NetBSD license.


 I hope this is a better patch.

 Regards,

 Nat.

From: Nat Sloss <nathanialsloss@yahoo.com.au>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/46522
Date: Tue, 5 Jun 2012 02:37:59 +1000

 Hi.  Sorry for the previous posting.  I was using Yahoo web mail and the 
 results were unexpected.  All is well again; my email is working as it should.

 Regards,

 Nat.

From: Nat Sloss <nathanialsloss@yahoo.com.au>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/46522
Date: Tue, 5 Jun 2012 03:07:06 +1000

 Hi.

 Sorry for the trouble again.

 I'm retyping my message so hopefully it'll get through un-garbled. 

 So here's my new patch:

 Index: sys/kern/tty.c
 ===================================================================
 RCS file: /cvsroot/src/sys/kern/tty.c,v
 retrieving revision 1.250
 diff -u -r1.250 tty.c
 --- sys/kern/tty.c      12 Mar 2012 18:27:08 -0000      1.250
 +++ sys/kern/tty.c      4 Jun 2012 17:06:48 -0000
 @@ -422,7 +422,6 @@
         ttyflush(tp, FREAD | FWRITE);

         tp->t_gen++;
 -       tp->t_pgrp = NULL;
         tp->t_state = 0;
         sess = tp->t_session;
         tp->t_session = NULL;
 @@ -2762,7 +2761,12 @@
  void
  tty_free(struct tty *tp)
  {
 -       int i;
 +       int i, timeout;
 +       struct proc *p;
 +
 +       timeout = mstohz(200);
 +       if (timeout == 0)
 +               timeout = 1;

         mutex_enter(proc_lock);
         mutex_enter(&tty_lock);
 @@ -2770,9 +2774,17 @@
                 sigemptyset(&tp->t_sigs[i]);
         if (tp->t_sigcount != 0)
                 TAILQ_REMOVE(&tty_sigqueue, tp, t_sigqueue);
 -       mutex_exit(&tty_lock);
         mutex_exit(proc_lock);

 +       while (tp->t_pgrp != NULL) {
 +               if ((p = LIST_FIRST(&tp->t_pgrp->pg_members)) != NULL)
 +                       ttysleep(tp, &tp->t_rawcv, true, timeout);
 +               else if (LIST_EMPTY(&tp->t_pgrp->pg_members))
 +                       tp->t_pgrp = NULL;
 +       }
 +       tp->t_pgrp = NULL;
 +       mutex_exit(&tty_lock);
 +
         callout_halt(&tp->t_rstrt_ch, NULL);
         callout_destroy(&tp->t_rstrt_ch);
         ttyldisc_release(tp->t_linesw);


 Note: This patch is my own work which I submit under the NetBSD license.

 Regards,

 Nat.

Responsible-Changed-From-To: kern-bug-people->nat
Responsible-Changed-By: nat@NetBSD.org
Responsible-Changed-When: Mon, 13 Nov 2023 13:22:02 +0000
Responsible-Changed-Why:
Take. It was opened by me some time ago.


State-Changed-From-To: open->closed
State-Changed-By: nat@NetBSD.org
State-Changed-When: Mon, 13 Nov 2023 13:22:02 +0000
State-Changed-Why:
I was unable to reproduce the crash with newer kernels.
Closed at submitter's request.


>Unformatted:
