NetBSD Problem Report #51556

From jschauma@netmeister.org  Thu Oct 13 02:05:01 2016
Return-Path: <jschauma@netmeister.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 567477A215
	for <gnats-bugs@gnats.NetBSD.org>; Thu, 13 Oct 2016 02:05:01 +0000 (UTC)
Message-Id: <20161013020459.4DC006513C@panix.netmeister.org>
Date: Wed, 12 Oct 2016 22:04:59 -0400 (EDT)
From: jschauma@netmeister.org
Reply-To: jschauma@netmeister.org
To: gnats-bugs@NetBSD.org
Subject: less(1) generates SIGTTOU if lacking a controlling terminal
X-Send-Pr-Version: 3.95

>Number:         51556
>Category:       bin
>Synopsis:       less(1) generates SIGTTOU if lacking a controlling terminal
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    bin-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Oct 13 02:10:00 +0000 2016
>Last-Modified:  Mon Oct 31 05:13:50 +0000 2016
>Originator:     Jan Schaumann
>Release:        NetBSD 7.0.1
>Organization:

>Environment:
Architecture: x86_64
Machine: amd64
>Description:

I just chased down a bug in Debian's more(1) where more(1) called
tcsetattr(3) despire not having a controlling terminal:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=840586

The same issue appears to exist in our less(1), but I haven't had the
time to pinpoint it in the code (which is notably more complex than
more(1)'s).

To summarize:

$ /bin/sh -c "timeout 30 /bin/sh -c \"ls | more\""

will hang, since the second sh(1) is suspended after receiving SIGTTOU
generated by more(1) (which, in our case, is less(1)).

This is due to timeout(1) creating a new process group, leading to the
processes it invokes not having a controlling terminal.

$ ps x -o pid,ppid,pgid,tpgid,stat,comm | egrep "(/bin/sh|timeout|more)"
 4345 25691  4345 25691 S    timeout
 6811  4345  4345 25691 T    /bin/sh
13202  6811  4345 25691 T    more
25691 14973 25691 25691 S+   /bin/sh

Note that PIDs 4345, 6811, and 13202 are not in the process group 25691,
which has the controlling terminal.

If somebody more familiar with the less(1) codebase and terminal IO
could take a look where we might be issuing writes to the controlling
terminal (via ioctl(2), I suppose), that'd be great.


>How-To-Repeat:

$ /bin/sh -c "timeout 30 /bin/sh -c \"ls | more\""

>Fix:

Only perform write actions on stdout/stderr if

tcgetpgrp(fileno(stderr)) == getpgrp()

I suspect this check needs to be added in a few places in extern/bsd/less/dist/screen.c ?

>Release-Note:

>Audit-Trail:
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: bin/51556: less(1) generates SIGTTOU if lacking a controlling
 terminal
Date: Thu, 13 Oct 2016 02:38:58 +0000

 On Thu, Oct 13, 2016 at 02:10:00AM +0000, jschauma@netmeister.org wrote:
  > This is due to timeout(1) creating a new process group, leading to the
  > processes it invokes not having a controlling terminal.

 Creating a new process group does not result in the process it invokes
 not having a controlling terminal. Unless you mean it creates a new
 session; but it doesn't:

     % tty
     /dev/pts/1
     % timeout 30 tty
     /dev/pts/1

 So I'm not sure how to interpret the rest of this report.

 -- 
 David A. Holland
 dholland@netbsd.org

From: Jan Schaumann <jschauma@netmeister.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: bin/51556: less(1) generates SIGTTOU if lacking a controlling
 terminal
Date: Wed, 12 Oct 2016 22:52:51 -0400

 David Holland <dholland-bugs@netbsd.org> wrote:

 >  Creating a new process group does not result in the process it invokes
 >  not having a controlling terminal. Unless you mean it creates a new
 >  session; but it doesn't:

 What I mean is that the command invoked by timeout(1) will end up in a
 process group that is different from the process group with the
 controlling terminal.

 For the purposes of this bug, how we end up in this situation is
 probably irrelevant, and the timeout(1) invocation is just one example
 that happened to be how I came across this issue.  You can recreate it
 without invoking timeout(1) by running this:

 --
 #include <errno.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <termios.h>
 #include <unistd.h>

 int
 main(int argc, char ** argv) {
 	struct termios otty;
 	int no_tty;

 	if (setpgid(0, 0) == -1) {
 		fprintf(stderr, "Unable to setpgid: %s\n", strerror(errno));
 		exit(1);
 	}

 	no_tty = tcgetattr(fileno(stdout), &otty);

 	if (!no_tty) {
 		if (tcsetattr(fileno(stderr), TCSANOW, &otty) == -1) {
 			fprintf(stderr, "Unable to tcsetattr: %s\n", strerror(errno));
 			exit(1);
 		}
 	}
 	return 0;
 }
 --

 You can trigger the issue by invoking it as:

 $ /bin/sh -c ./a.out

 The process details might end up like this:

 $ ps x -o pid,ppid,pgid,tpgid,stat,comm | egrep "(/bin/sh|a.out)"
 15885 18364 15885 18364 T    ./a.out 
 18364  7159 18364 18364 I+   /bin/sh 

 /bin/sh has the controlling terminal, but a.out does not.  tcsetattr(3)
 here triggers SIGTTOU and suspends the process.  This is the behaviour I
 observed in Debian's more(1), and I see similar behaviour in our
 less(1), but I haven't been able to pinpoint the code triggering it.

 -Jan

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: bin/51556: less(1) generates SIGTTOU if lacking a controlling
 terminal
Date: Thu, 13 Oct 2016 03:12:58 +0000

 On Thu, Oct 13, 2016 at 02:55:01AM +0000, Jan Schaumann wrote:
  >  >  Creating a new process group does not result in the process it invokes
  >  >  not having a controlling terminal. Unless you mean it creates a new
  >  >  session; but it doesn't:
  >  
  >  What I mean is that the command invoked by timeout(1) will end up in a
  >  process group that is different from the process group with the
  >  controlling terminal.

 Ok, but that's something entirely different :-)

 Anyway, less is not wrong. The problem is that timeout(1) is
 manipulating job control incorrectly.

 -- 
 David A. Holland
 dholland@netbsd.org

From: matthew green <mrg@eterna.com.au>
To: gnats-bugs@NetBSD.org
Cc: gnats-admin@netbsd.org, netbsd-bugs@netbsd.org,
    jschauma@netmeister.org
Subject: re: bin/51556: less(1) generates SIGTTOU if lacking a controlling terminal
Date: Thu, 13 Oct 2016 15:18:25 +1100

 >  Anyway, less is not wrong. The problem is that timeout(1) is
 >  manipulating job control incorrectly.

 i'm convinced there are two bugs here.  the timeout one, and
 that less tries to do terminal ops that trigger signals when
 not properly able.

 the original report fails if you use 'cat' instead of 'less'.

 i'd not be so worried about the less bug, but for me i can't
 ^C out of it.  ^Z does work, but having a shell i can't sent
 an interrupt to is pretty serious situation i'd like to avoid
 having happen if the tools can.


 .mrg.

From: Thomas Klausner <wiz@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: bin/51556: less(1) generates SIGTTOU if lacking a controlling
 terminal
Date: Thu, 13 Oct 2016 06:47:59 +0200

 On Thu, Oct 13, 2016 at 04:20:00AM +0000, Matthew Green wrote:
 >  i'm convinced there are two bugs here.  the timeout one, and
 >  that less tries to do terminal ops that trigger signals when
 >  not properly able.
 >  
 >  the original report fails if you use 'cat' instead of 'less'.
 >  
 >  i'd not be so worried about the less bug, but for me i can't
 >  ^C out of it.  ^Z does work, but having a shell i can't sent
 >  an interrupt to is pretty serious situation i'd like to avoid
 >  having happen if the tools can.

 I'm not sure this is related, but a bug I've been seeing for a couple
 of releases now reminds me of that.

 start mutt (mail/mutt or mail/neomutt)
 type '|less' to pipe a mail through less
 interrupt with CTRL-Z
 continue with fg

 at this point, the screen is not updated, neither by less nor mutt and it seems neither is in control of the terminal

 interrupt again with CTRL-Z
 continue again with fg

 at this point we're back in less and everything works as if nothing
 had happened.
  Thomas

From: mlelstv@serpens.de (Michael van Elst)
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: bin/51556: less(1) generates SIGTTOU if lacking a controlling terminal
Date: Thu, 13 Oct 2016 05:10:38 +0000 (UTC)

 jschauma@netmeister.org (Jan Schaumann) writes:

 > 	if (setpgid(0, 0) == -1) {
 > 		fprintf(stderr, "Unable to setpgid: %s\n", strerror(errno));
 > 		exit(1);
 > 	}
 > 	no_tty = tcgetattr(fileno(stdout), &otty);
 > 	if (!no_tty) {
 > 		if (tcsetattr(fileno(stderr), TCSANOW, &otty) == -1) {
 > 			fprintf(stderr, "Unable to tcsetattr: %s\n", strerror(errno));
 > 			exit(1);
 > 		}
 > 	}

 > /bin/sh has the controlling terminal, but a.out does not.  tcsetattr(3)
 > here triggers SIGTTOU and suspends the process.  This is the behaviour I
 > observed in Debian's more(1), and I see similar behaviour in our
 > less(1), but I haven't been able to pinpoint the code triggering it.


 That's how UNIX does job control, it has little to do with more or less.

 The feature exists to prevent a _background process_ from interfering with
 the terminal. While output from a background process itself may or may not pass,
 changing the terminal settings is always stopped.

 Try to run less in the background to get the same behaviour.

 The only issue here is that UNIX job control is pretty limited and that timeout
 abuses the job control.

 -- 
 -- 
                                 Michael van Elst
 Internet: mlelstv@serpens.de
                                 "A potential Snark may lurk in every tree."

From: mlelstv@serpens.de (Michael van Elst)
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: bin/51556: less(1) generates SIGTTOU if lacking a controlling terminal
Date: Thu, 13 Oct 2016 05:12:54 +0000 (UTC)

 dholland-bugs@netbsd.org (David Holland) writes:

 > Anyway, less is not wrong. The problem is that timeout(1) is
 > manipulating job control incorrectly.

 I fear there is no way to manipulate it 'correctly' for a tool like timeout(1).

 -- 
 -- 
                                 Michael van Elst
 Internet: mlelstv@serpens.de
                                 "A potential Snark may lurk in every tree."

From: mlelstv@serpens.de (Michael van Elst)
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: bin/51556: less(1) generates SIGTTOU if lacking a controlling terminal
Date: Thu, 13 Oct 2016 05:23:09 +0000 (UTC)

 mrg@eterna.com.au (matthew green) writes:

 >i'd not be so worried about the less bug, but for me i can't
 >^C out of it.  ^Z does work, but having a shell i can't sent
 >an interrupt to is pretty serious situation i'd like to avoid
 >having happen if the tools can.

 As the background process is already stopped by SIGTTOU, a SIGTERM won't
 do much until the process gets a SIGCONT. A shell with job control support
 does this, but timeout does not.

 E.g. csh:

             if (kill(-pp->p_jobid, signum) < 0) {
                 (void)fprintf(csherr, "%s: %s\n", vis_str(cp),
                                strerror(errno));
                 err1++;
             }
             if (signum == SIGTERM || signum == SIGHUP)
                 (void)kill(-pp->p_jobid, SIGCONT);

 but timeout:
                         if (!foreground)
                                 killpg(pgid, killsig);  
                         else
                                 kill(pid, (int)sig_term);

                         if (do_second_kill) {
                                 set_interval(second_kill);
                                 second_kill = 0;
                                 sig_ign = killsig;
                                 killsig = SIGKILL;
                         } else  
                                 break;

 -- 
 -- 
                                 Michael van Elst
 Internet: mlelstv@serpens.de
                                 "A potential Snark may lurk in every tree."

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: bin/51556: less(1) generates SIGTTOU if lacking a controlling
 terminal
Date: Thu, 13 Oct 2016 06:10:01 +0000

 On Thu, Oct 13, 2016 at 05:15:01AM +0000, Michael van Elst wrote:
  >> Anyway, less is not wrong. The problem is that timeout(1) is
  >> manipulating job control incorrectly.
  >  
  > I fear there is no way to manipulate it 'correctly' for a tool like
  > timeout(1).

 Why not? I'm not convinced. The problem is that timeout is effectively
 backgrounding itself and leaving nothing in the foreground. If it
 wants to make its own process group, it has to take charge of the tty
 to match. There's no reason it can't do that. If you stick another
 shell in there instead of timeout it all works fine.

 -- 
 David A. Holland
 dholland@netbsd.org

From: Jan Schaumann <jschauma@netmeister.org>
To: gnats-bugs@NetBSD.org
Cc: gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: bin/51556: less(1) generates SIGTTOU if lacking a controlling
 terminal
Date: Thu, 13 Oct 2016 09:06:11 -0400

 Michael van Elst <mlelstv@serpens.de> wrote:

 >  The feature exists to prevent a _background process_ from interfering with
 >  the terminal. While output from a background process itself may or may not pass,
 >  changing the terminal settings is always stopped.

 Yes, I understand that.  My concern is that less(1) should not attempt
 to manipulate the terminal if it doesn't have a controlling terminal.

 I agree that timeout(1) should be better about setting things up, but
 that's a separate issue.  As Matthew noted, there appear to be two
 distinct bugs here.  Should we split this?

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: bin/51556: less(1) generates SIGTTOU if lacking a controlling
 terminal
Date: Fri, 14 Oct 2016 00:41:12 +0000

 On Thu, Oct 13, 2016 at 01:10:01PM +0000, Jan Schaumann wrote:
  >  Yes, I understand that.  My concern is that less(1) should not attempt
  >  to manipulate the terminal if it doesn't have a controlling terminal.

 I think you mean "if it isn't in the foreground", but it doesn't make
 sense to run less other than in the foreground. Certainly you wouldn't
 be able to provide command input to it.

  >  I agree that timeout(1) should be better about setting things up, but
  >  that's a separate issue.  As Matthew noted, there appear to be two
  >  distinct bugs here.  Should we split this?

 It's not. What less is doing is fine.

 -- 
 David A. Holland
 dholland@netbsd.org

From: mlelstv@serpens.de (Michael van Elst)
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: bin/51556: less(1) generates SIGTTOU if lacking a controlling terminal
Date: Fri, 14 Oct 2016 06:17:47 +0000 (UTC)

 jschauma@netmeister.org (Jan Schaumann) writes:

 >Yes, I understand that.  My concern is that less(1) should not attempt
 >to manipulate the terminal if it doesn't have a controlling terminal.

 What do you expect from 'less foo.txt &' ? Making it stop is actually a
 useful feature. The same is true for 'cat foo.txt &' _if_ you decide to
 enable the tostop feature for the terminal.

 In any case, the distinction is not between having or not having a
 controlling terminal but between running in a foreground or background
 process group and timeout can be improved to handle this situation
 much better.


 -- 
 -- 
                                 Michael van Elst
 Internet: mlelstv@serpens.de
                                 "A potential Snark may lurk in every tree."

>Unformatted:

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2014 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.