NetBSD Problem Report #50989

From ryo_on@yk.rim.or.jp  Mon Mar 21 07:53:37 2016
Return-Path: <ryo_on@yk.rim.or.jp>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 385E97A0EB
	for <gnats-bugs@gnats.NetBSD.org>; Mon, 21 Mar 2016 07:53:37 +0000 (UTC)
Message-Id: <20160321075332.5FC70200016F9@mail.SiriusCloud.jp>
Date: Mon, 21 Mar 2016 16:53:59 +0900
From: ryoon@NetBSD.org
Reply-To: ryoon@NetBSD.org
To: gnats-bugs@gnats.NetBSD.org
Subject: Some programs in base dump cores with SIGILL
X-Send-Pr-Version: 3.95

>Number:         50989
>Category:       port-amd64
>Synopsis:       Some programs in base dump cores with SIGILL
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    port-amd64-maintainer
>State:          feedback
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Mar 21 07:55:00 +0000 2016
>Closed-Date:    
>Last-Modified:  Sat Aug 27 05:01:37 +0000 2016
>Originator:     Ryo ONODERA
>Release:        NetBSD 7.99.26
>Organization:

>Environment:


System: NetBSD brownie 7.99.26 NetBSD 7.99.26 (DTRACE7) #0: Mon Mar 21 15:14:54 JST 2016 ryo_on@brownie:/usr/world/7.99/amd64/obj/sys/arch/amd64/compile/DTRACE7 amd64
Architecture: x86_64
Machine: amd64
>Description:
https://mail-index.netbsd.org/source-changes/2016/03/20/msg073540.html
https://mail-index.netbsd.org/source-changes/2016/03/20/msg073539.html
https://mail-index.netbsd.org/source-changes/2016/03/20/msg073538.html
https://mail-index.netbsd.org/source-changes/2016/03/20/msg073537.html
https://mail-index.netbsd.org/source-changes/2016/03/20/msg073536.html

After these commits, ssh, sshd, syslogd, ssh-agent, and some other programs
dump core with SIGILL illegal instruction.

My GCC is 4.8.5.

My laptop's CPU is
Intel(R) Core(TM) i7-5500U CPU @ 2.40GHz, id 0x306d4

And I have no settings in my /etc/mk.conf for NetBSD base.

>How-To-Repeat:

Try the latest NetBSD/amd64-current.
>Fix:

I have no idea.

>Release-Note:

>Audit-Trail:
From: christos@zoulas.com (Christos Zoulas)
To: gnats-bugs@NetBSD.org, port-amd64-maintainer@netbsd.org, 
	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc: 
Subject: Re: port-amd64/50989: Some programs in base dump cores with SIGILL
Date: Mon, 21 Mar 2016 08:53:05 -0400

 On Mar 21,  7:55am, ryoon@NetBSD.org (ryoon@NetBSD.org) wrote:
 -- Subject: port-amd64/50989: Some programs in base dump cores with SIGILL

 Something is incorrect with CPU detection now. I am planning to upgrade
 openssl to 1.1.0-pre4 so all the assembly code is consistent.

 christos

From: matthew green <mrg@eterna.com.au>
To: gnats-bugs@NetBSD.org, christos@netbsd.org
Cc: port-amd64-maintainer@netbsd.org, gnats-admin@netbsd.org,
    netbsd-bugs@netbsd.org
Subject: re: port-amd64/50989: Some programs in base dump cores with SIGILL
Date: Tue, 22 Mar 2016 04:19:57 +1100

 Christos, why do you think that your change to openssl isn't the real
 problem here?  now GCC 4.8 is getting its stack unaligned.

 please revert your commits as listed in this PR:

 > https://mail-index.netbsd.org/source-changes/2016/03/20/msg073540.html
 > https://mail-index.netbsd.org/source-changes/2016/03/20/msg073539.html
 > https://mail-index.netbsd.org/source-changes/2016/03/20/msg073538.html
 > https://mail-index.netbsd.org/source-changes/2016/03/20/msg073537.html
 > https://mail-index.netbsd.org/source-changes/2016/03/20/msg073536.html
 > 
 > After these commits, ssh, sshd, syslogd, ssh-agent, and some other programs
 > dump core with SIGILL illegal instruction.
 > 
 > My GCC is 4.8.5.


 i have not seen any problem with GCC 5.3.  i'm happily running X11 with
 a bunch of GL and ssh on my amd64 box with a fully GCC 5.3 compiled by
 a GCC 5.3 world (though my pkgsrc build failed at around 690 packages
 i haven't looked at why yet -- though those were a 5.3 world that was
 compiled by 4.8.)


 .mrg.

From: christos@zoulas.com (Christos Zoulas)
To: matthew green <mrg@eterna.com.au>, gnats-bugs@NetBSD.org
Cc: port-amd64-maintainer@netbsd.org, gnats-admin@netbsd.org, 
	netbsd-bugs@netbsd.org
Subject: re: port-amd64/50989: Some programs in base dump cores with SIGILL
Date: Mon, 21 Mar 2016 13:30:19 -0400

 On Mar 22,  4:19am, mrg@eterna.com.au (matthew green) wrote:
 -- Subject: re: port-amd64/50989: Some programs in base dump cores with SIGIL

 | Christos, why do you think that your change to openssl isn't the real
 | problem here?  now GCC 4.8 is getting its stack unaligned.
 | 
 | please revert your commits as listed in this PR:
 | 
 | > https://mail-index.netbsd.org/source-changes/2016/03/20/msg073540.html
 | > https://mail-index.netbsd.org/source-changes/2016/03/20/msg073539.html
 | > https://mail-index.netbsd.org/source-changes/2016/03/20/msg073538.html
 | > https://mail-index.netbsd.org/source-changes/2016/03/20/msg073537.html
 | > https://mail-index.netbsd.org/source-changes/2016/03/20/msg073536.html
 | > 
 | > After these commits, ssh, sshd, syslogd, ssh-agent, and some other programs
 | > dump core with SIGILL illegal instruction.
 | > 
 | > My GCC is 4.8.5.
 | 
 | 
 | i have not seen any problem with GCC 5.3.  i'm happily running X11 with
 | a bunch of GL and ssh on my amd64 box with a fully GCC 5.3 compiled by
 | a GCC 5.3 world (though my pkgsrc build failed at around 690 packages
 | i haven't looked at why yet -- though those were a 5.3 world that was
 | compiled by 4.8.)
 | 

 This is SIGILL, not the SIGSEGV you get with an unaligned stack, which
 means that the processor detection code is now incompatible with the
 old assembly stubs. That in turn means I will either have to replace
 all of them or import a newer openssl. I'd rather do the latter, since
 it is a waste of time to fix bugs that have already been found and fixed.

 christos

From: christos@zoulas.com (Christos Zoulas)
To: gnats-bugs@NetBSD.org, port-amd64-maintainer@netbsd.org, 
	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, ryoon@NetBSD.org
Cc: 
Subject: re: port-amd64/50989: Some programs in base dump cores with SIGILL
Date: Mon, 21 Mar 2016 13:32:01 -0400

 On Mar 21,  5:25pm, mrg@eterna.com.au (matthew green) wrote:
 -- Subject: re: port-amd64/50989: Some programs in base dump cores with SIGIL

 |  i have not seen any problem with GCC 5.3.  i'm happily running X11 with
 |  a bunch of GL and ssh on my amd64 box with a fully GCC 5.3 compiled by
 |  a GCC 5.3 world (though my pkgsrc build failed at around 690 packages
 |  i haven't looked at why yet -- though those were a 5.3 world that was
 |  compiled by 4.8.)

 There is something wrong with the old gcc and the processor detection
 code. It probably runs through the unoptimized mmx/avx code or it gets
 lucky with stack alignment. In my 5.3 tests sshd worked and ssh didn't.
 When I changed the stack offset by adding +8, ssh worked and sshd broke.
 With the new assembly code, both work (sha) but it appears that the other
 assembly stubs are broken.

 christos

From: matthew green <mrg@eterna.com.au>
To: christos@zoulas.com (Christos Zoulas)
Cc: gnats-bugs@NetBSD.org, port-amd64-maintainer@netbsd.org,
    gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, ryoon@NetBSD.org
Subject: re: port-amd64/50989: Some programs in base dump cores with SIGILL
Date: Tue, 22 Mar 2016 05:09:36 +1100

 Christos Zoulas writes:
 > On Mar 21,  5:25pm, mrg@eterna.com.au (matthew green) wrote:
 > -- Subject: re: port-amd64/50989: Some programs in base dump cores with SIGIL
 >
 > |  i have not seen any problem with GCC 5.3.  i'm happily running X11 with
 > |  a bunch of GL and ssh on my amd64 box with a fully GCC 5.3 compiled by
 > |  a GCC 5.3 world (though my pkgsrc build failed at around 690 packages
 > |  i haven't looked at why yet -- though those were a 5.3 world that was
 > |  compiled by 4.8.)
 >
 > There is something wrong with the old gcc and the processor detection
 > code. It probably runs through the unoptimized mmx/avx code or it gets
 > lucky with stack alignment. In my 5.3 tests sshd worked and ssh didn't.
 > When I changed the stack offset by adding +8, ssh worked and sshd broke.
 > With the new assembly code, both work (sha) but it appears that the other
 > assembly stubs are broken.

 when people running 4.8 update to your new libcrypto all their apps
 die in libcrypto with an unaligned stack.

 when you changed the stack offset (you actually subtracted 40) you
 only fixed the problem for the broken cases, and you broke the
 working cases.  sshd and ssh have some difference in their
 setup or environment somehow, and one of them has the stack
 misaligned.  i don't see how aslr stack would do it cuz as far as
 i can tell it leaves the bottom 12 bits of the stack alone (ie, it
 only moves the page number.)  i don't know what is wrong, but i'm
 not seeing the problems you are.


 .mrg.

State-Changed-From-To: open->feedback
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Sat, 27 Aug 2016 05:01:37 +0000
State-Changed-Why:
Did this get sorted out?


>Unformatted:
