NetBSD Problem Report #51711

From kamada@nanohz.org  Mon Dec 12 14:38:23 2016
Return-Path: <kamada@nanohz.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 4F8FF7A2CF
	for <gnats-bugs@gnats.NetBSD.org>; Mon, 12 Dec 2016 14:38:23 +0000 (UTC)
Message-Id: <20161212233819LI%kamada@nanohz.org>
Date: Mon, 12 Dec 2016 23:38:19 +0900
From: KAMADA Ken'ichi <kamada@nanohz.org>
To: gnats-bugs@NetBSD.org
Cc: kamada@nanohz.org
Subject: ftp(1) SIGSEGV by a race

>Number:         51711
>Category:       bin
>Synopsis:       ftp(1) SIGSEGV by a race
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    lukem
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Dec 12 14:40:00 +0000 2016
>Closed-Date:    Tue Jan 31 23:37:30 +0000 2023
>Last-Modified:  Tue Jan 31 23:37:30 +0000 2023
>Originator:     KAMADA Ken'ichi
>Release:        NetBSD 7.0.1_PATCH
>Organization:
>Environment:
System: NetBSD hisa.nanohz.org 7.0.1_PATCH NetBSD 7.0.1_PATCH (HISA) #27: Sun May 29 19:05:29 JST 2016 ken@hisa.nanohz.org:/usr/src/sys/arch/amd64/compile/HISA amd64
Architecture: x86_64
Machine: amd64
>Description:
I found ftp.core in /usr/pkgsrc/distfiles.
In the backtrace, address 0 was called from address 0x40e5f4 in
getreply().  There is a call to (*oldsigalrm) at that address, and
it seems that oldsigalrm was SIG_DFL.

(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x000000000040e5f4 in getreply ()
#2  0x00000000004120d7 in hookup ()
#3  0x0000000000416c37 in setpeer ()
#4  0x000000000040a293 in fetch_ftp ()
#5  0x000000000040d678 in auto_fetch ()
#6  0x0000000000418af2 in main ()

>How-To-Repeat:
The timing is very severe, so it is not easy to reproduce.
If you were extremely (un)lucky, the following commands would trigger
this crash.

% ruby -e 'require "socket"; sv=TCPServer.open("localhost", 9999); s=sv.accept; s.write("500"); sleep(59.9); s.write("\n"); sleep(5)' &
% ftp ftp://localhost:9999/test.txt

By looking at the code in ftp.c, this will occur in the following scenario.
- the timer is set at [*1],
- the peer is slow and returns '\n' at the last microsecond
  before the timer expires,
- the timer fires between [*2] and [*3] and timeoutflag is incremented
  in the signal handler, and
- (*oldsigalrm) is called at [*4] even if it is SIG_DFL.

int
getreply(int expecteof)
{
...
	for (lineno = 0 ;; lineno++) {
		dig = n = code = 0;
		cp = current_line;
		while (alarmtimer(quit_time ? quit_time : 60),   // [*1]
		       ((c = getc(cin)) != '\n')) {
...
		}
// [*2]
		if (verbose > 0 || ((verbose > -1 && n == '5') &&
		    (n < '5' || !retry_connect))) {
			(void)putc(c, ttyout);
			(void)fflush(ttyout);
		}
		if (cp[-1] == '\r')
			cp[-1] = '\0';
		*cp = '\0';
		if (lineno == 0)
			(void)strlcpy(reply_string, current_line,
			    sizeof(reply_string));
		if (lineno > 0 && code == 0 && reply_callback != NULL)
			(*reply_callback)(current_line);
		if (continuation && code != originalcode) {
			if (originalcode == 0)
				originalcode = code;
			continue;
		}
		if (n != '1')
			cpend = 0;
// [*3]
		alarmtimer(0);
		(void)xsignal(SIGINT, oldsigint);
		(void)xsignal(SIGALRM, oldsigalrm);
		if (code == 421 || originalcode == 421)
			lostpeer(0);
		if (abrtflag && oldsigint != cmdabort && oldsigint != SIG_IGN)
			(*oldsigint)(SIGINT);
		if (timeoutflag && oldsigalrm != cmdtimeout &&
		    oldsigalrm != SIG_IGN)
			(*oldsigalrm)(SIGALRM);   // [*4]
		return (n - '0');
	}
}


>Fix:

Index: ftp.c
===================================================================
RCS file: /cvsroot/src/usr.bin/ftp/ftp.c,v
retrieving revision 1.164
diff -u -r1.164 ftp.c
--- ftp.c	4 Jul 2012 06:09:37 -0000	1.164
+++ ftp.c	12 Dec 2016 14:32:03 -0000
@@ -114,6 +114,7 @@
 #include <errno.h>
 #include <fcntl.h>
 #include <netdb.h>
+#include <signal.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
@@ -500,11 +501,23 @@
 		(void)xsignal(SIGALRM, oldsigalrm);
 		if (code == 421 || originalcode == 421)
 			lostpeer(0);
-		if (abrtflag && oldsigint != cmdabort && oldsigint != SIG_IGN)
-			(*oldsigint)(SIGINT);
-		if (timeoutflag && oldsigalrm != cmdtimeout &&
-		    oldsigalrm != SIG_IGN)
-			(*oldsigalrm)(SIGINT);
+		/*
+		 * We may receive SIGINT in the very short time range after
+		 * the getc()-loop ends and before oldsigint is restored.
+		 * In that case, re-raise SIGINT because it is sent from an
+		 * external entity (e.g., a user) and the action is expected.
+		 */
+		if (abrtflag && oldsigint != cmdabort && oldsigint != SIG_IGN) {
+			if (oldsigint == SIG_DFL)
+				raise(SIGINT);
+			else
+				(*oldsigint)(SIGINT);
+		}
+		/*
+		 * Do not re-raise SIGALRM even if timeoutflag is set.
+		 * It was from the timer to wait a '\n' from the peer,
+		 * and we won the race.
+		 */
 		return (n - '0');
 	}
 }

>Release-Note:

>Audit-Trail:

Responsible-Changed-From-To: bin-bug-people->lukem
Responsible-Changed-By: lukem@NetBSD.org
Responsible-Changed-When: Sun, 31 Jan 2021 08:29:57 +0000
Responsible-Changed-Why:


State-Changed-From-To: open->feedback
State-Changed-By: lukem@NetBSD.org
State-Changed-When: Sun, 31 Jan 2021 08:29:57 +0000
State-Changed-Why:
This should have been fixed by ftp.c rev 1.170 on 2020-07-11,
and part of ftp version 20200711.


State-Changed-From-To: feedback->closed
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Tue, 31 Jan 2023 23:37:30 +0000
State-Changed-Why:
one-year feedback timeout, and "should have been fixed"


>Unformatted:

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.47 2022/09/11 19:34:41 kim Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2023 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.