NetBSD Problem Report #54052

From www@NetBSD.org  Mon Mar 11 05:49:47 2019
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id EF4877A1F5
	for <gnats-bugs@gnats.NetBSD.org>; Mon, 11 Mar 2019 05:49:46 +0000 (UTC)
Message-Id: <20190311054946.0C99B7A214@mollari.NetBSD.org>
Date: Mon, 11 Mar 2019 05:49:46 +0000 (UTC)
From: rokuyama.rk@gmail.com
Reply-To: rokuyama.rk@gmail.com
To: gnats-bugs@NetBSD.org
Subject: bump STACK_ALIGNBYTES for COMPAT_LINUX
X-Send-Pr-Version: www-1.0

>Number:         54052
>Category:       port-amd64
>Synopsis:       bump STACK_ALIGNBYTES for COMPAT_LINUX
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    port-amd64-maintainer
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Mar 11 05:50:00 +0000 2019
>Closed-Date:    Sun Mar 31 21:58:36 +0000 2019
>Last-Modified:  Sun Mar 31 21:58:36 +0000 2019
>Originator:     Rin Okuyama
>Release:        HEAD
>Organization:
Department of Physics, Meiji University
>Environment:
NetBSD kobrpd02 8.99.35 NetBSD 8.99.35 (AMD64) #0: Sun Mar 10 22:31:53 JST 2019  rin@latipes:/build/src/sys/arch/amd64/compile/AMD64 amd64
>Description:
Linux binaries using glibc >= 2.23 randomly crash in the dynamic linker.
Bisecting glibc shows that the cause is this commit:

https://github.molgen.mpg.de/git-mirror/glibc/commit/38d22f9f48a84b441c5777aff103f5b980243b5f

where the -mno-sse flag is removed for ld.so. Because ld.so now contains
SSE instructions, Linux binaries require the stack to be aligned to a
16-byte boundary.

Bumping STACK_ALIGNBYTES for amd64 to (16 - 1) fixes the problem.

Note that there's no similar problem for i386.
>How-To-Repeat:
Mount a Linux userland with glibc >= 2.23 on /emul/linux
(e.g., Fedora 24 and 29 use glibc 2.23 and 2.28, respectively).

Then, dynamically linked binaries randomly receive SIGSEGV in
/emul/linux/lib64/ld-linux-x86-64.so.2.

The following is objdump output near the address that causes the SIGSEGV:

    c224:       66 0f ef c0             pxor   %xmm0,%xmm0
--> c228:       0f 29 45 90             movaps %xmm0,-0x70(%rbp)
    c22c:       0f 29 45 a0             movaps %xmm0,-0x60(%rbp)
    c230:       0f 29 45 b0             movaps %xmm0,-0x50(%rbp)
    c234:       0f 29 45 c0             movaps %xmm0,-0x40(%rbp)

movaps is an SSE instruction whose memory operand must be aligned to a
16-byte boundary. This code is generated from a code fragment
something like:

---
	struct {
		long l[2];
		int i[2];
	} s[2] = {{0, 0, 0, 0}, {0, 0, 0, 0}};
	...
---

Linux binaries now require the stack to be aligned to a 16-byte boundary.
Bumping STACK_ALIGNBYTES from __ALIGNBYTES = (8 - 1) to (16 - 1) fixes the
problem.

For i386, GCC does not generate SSE instructions for that code
even if -msse is specified.
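
As a rough illustration of why the fault depends on the frame pointer,
the following C sketch checks whether a movaps operand at disp(%rbp) is
16-byte aligned. The helper names and the example addresses are
hypothetical and are not taken from glibc or the kernel. Since the
displacements in the listing above (-0x70 through -0x40) are all
multiples of 16, the operand is aligned exactly when %rbp is.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if addr is a multiple of align (align must be a power of two). */
static int
is_aligned(uintptr_t addr, uintptr_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr & (align - 1)) == 0;
}

/*
 * Return 1 if a movaps access to disp(%rbp) would be properly aligned.
 * movaps faults unless its memory operand is 16-byte aligned.
 */
static int
movaps_operand_ok(uintptr_t rbp, intptr_t disp)
{
	/* Unsigned wrap-around makes the addition correct for negative disp. */
	return is_aligned(rbp + (uintptr_t)disp, 16);
}
```

With a stack that is only guaranteed to be 8-byte aligned, %rbp can end
up on either kind of address, which would be consistent with the crashes
appearing random.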
>Fix:
Index: sys/arch/amd64/include/param.h
===================================================================
RCS file: /home/netbsd/src/sys/arch/amd64/include/param.h,v
retrieving revision 1.29
diff -p -u -r1.29 param.h
--- sys/arch/amd64/include/param.h	11 Feb 2019 14:59:32 -0000	1.29
+++ sys/arch/amd64/include/param.h	10 Mar 2019 13:12:11 -0000
@@ -23,6 +23,8 @@

 #define ALIGNED_POINTER(p,t)	1

+#define STACK_ALIGNBYTES	(16 - 1)	/* COMPAT_LINUX */
+
 #define ALIGNBYTES32		(sizeof(int) - 1)
 #define ALIGN32(p)		(((u_long)(p) + ALIGNBYTES32) &~ALIGNBYTES32)


>Release-Note:

>Audit-Trail:
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-amd64/54052: bump STACK_ALIGNBYTES for COMPAT_LINUX
Date: Mon, 11 Mar 2019 07:24:41 +0100

 On Mon, Mar 11, 2019 at 05:50:00AM +0000, rokuyama.rk@gmail.com wrote:
 >  
 > +#define STACK_ALIGNBYTES	(16 - 1)	/* COMPAT_LINUX */

 Can the exec emulation deal with it instead?

 Martin

From: Rin Okuyama <rokuyama.rk@gmail.com>
To: gnats-bugs@NetBSD.org, port-amd64-maintainer@netbsd.org,
 gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc: 
Subject: Re: port-amd64/54052: bump STACK_ALIGNBYTES for COMPAT_LINUX
Date: Mon, 11 Mar 2019 16:05:34 +0900

 >   On Mon, Mar 11, 2019 at 05:50:00AM +0000, rokuyama.rk@gmail.com wrote:
 >   >
 >   > +#define STACK_ALIGNBYTES	(16 - 1)	/* COMPAT_LINUX */
 >   
 >   Can the exec emulation deal with it instead?

 Yes, if we add, e.g., .e_stack_alignbytes to struct emul. However,
 I'm not sure that it is necessary.

 Bumping STACK_ALIGNBYTES does not break anything. It merely costs
 at most 8 extra bytes of stack per process.

 Thanks,
 rin
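
The "at most 8 bytes" estimate in the reply above can be checked with a
small C sketch. It assumes the usual round-up-with-mask idiom for
aligning a length. round_up() and extra_bytes() are hypothetical helpers
written for this illustration, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

/* Round len up to the next multiple of (mask + 1); mask is e.g. 7 or 15. */
static size_t
round_up(size_t len, size_t mask)
{
	assert((mask & (mask + 1)) == 0);	/* mask + 1 must be a power of two */
	return (len + mask) & ~mask;
}

/* Extra bytes consumed when aligning with mask 15 instead of mask 7. */
static size_t
extra_bytes(size_t len)
{
	return round_up(len, 15) - round_up(len, 7);
}
```

A length rounded up to a multiple of 8 is either already a multiple of
16 or exactly 8 bytes short of one, so extra_bytes() can only return 0
or 8.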

From: Rin Okuyama <rokuyama.rk@gmail.com>
To: gnats-bugs@NetBSD.org, port-amd64-maintainer@netbsd.org,
 gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc: 
Subject: Re: port-amd64/54052: bump STACK_ALIGNBYTES for COMPAT_LINUX
Date: Mon, 11 Mar 2019 16:14:27 +0900

 Also, we cannot pull up the fix if we modify struct emul.

 rin

From: Joerg Sonnenberger <joerg@bec.de>
To: gnats-bugs@NetBSD.org
Cc: port-amd64-maintainer@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org
Subject: Re: port-amd64/54052: bump STACK_ALIGNBYTES for COMPAT_LINUX
Date: Mon, 11 Mar 2019 21:15:07 +0100

 On Mon, Mar 11, 2019 at 05:50:00AM +0000, rokuyama.rk@gmail.com wrote:
 > >Fix:
 > Index: sys/arch/amd64/include/param.h
 > ===================================================================
 > RCS file: /home/netbsd/src/sys/arch/amd64/include/param.h,v
 > retrieving revision 1.29
 > diff -p -u -r1.29 param.h
 > --- sys/arch/amd64/include/param.h	11 Feb 2019 14:59:32 -0000	1.29
 > +++ sys/arch/amd64/include/param.h	10 Mar 2019 13:12:11 -0000
 > @@ -23,6 +23,8 @@
 >  
 >  #define ALIGNED_POINTER(p,t)	1
 >  
 > +#define STACK_ALIGNBYTES	(16 - 1)	/* COMPAT_LINUX */
 > +
 >  #define ALIGNBYTES32		(sizeof(int) - 1)
 >  #define ALIGN32(p)		(((u_long)(p) + ALIGNBYTES32) &~ALIGNBYTES32)
 >  
 > 

 I'm puzzled by this patch. Stack alignment should already be 16 Bytes on
 AMD64.

 Joerg

From: Rin Okuyama <rokuyama.rk@gmail.com>
To: Joerg Sonnenberger <joerg@bec.de>, gnats-bugs@NetBSD.org
Cc: port-amd64-maintainer@netbsd.org, gnats-admin@netbsd.org,
 netbsd-bugs@netbsd.org
Subject: Re: port-amd64/54052: bump STACK_ALIGNBYTES for COMPAT_LINUX
Date: Tue, 12 Mar 2019 07:20:28 +0900

 On 2019/03/12 5:15, Joerg Sonnenberger wrote:
 > On Mon, Mar 11, 2019 at 05:50:00AM +0000, rokuyama.rk@gmail.com wrote:
 >>> Fix:
 >> Index: sys/arch/amd64/include/param.h
 >> ===================================================================
 >> RCS file: /home/netbsd/src/sys/arch/amd64/include/param.h,v
 >> retrieving revision 1.29
 >> diff -p -u -r1.29 param.h
 >> --- sys/arch/amd64/include/param.h	11 Feb 2019 14:59:32 -0000	1.29
 >> +++ sys/arch/amd64/include/param.h	10 Mar 2019 13:12:11 -0000
 >> @@ -23,6 +23,8 @@
 >>   
 >>   #define ALIGNED_POINTER(p,t)	1
 >>   
 >> +#define STACK_ALIGNBYTES	(16 - 1)	/* COMPAT_LINUX */
 >> +
 >>   #define ALIGNBYTES32		(sizeof(int) - 1)
 >>   #define ALIGN32(p)		(((u_long)(p) + ALIGNBYTES32) &~ALIGNBYTES32)
 >>   
 >>
 > 
 > I'm puzzled by this patch. Stack alignment should already be 16 Bytes on
 > AMD64.

 Since we don't currently define STACK_ALIGNBYTES for amd64,
 it falls back to __ALIGNBYTES = (8 - 1):

 src/sys/sys/param.h
 https://nxr.netbsd.org/xref/src/sys/sys/param.h#227
 ...
     227  #ifndef STACK_ALIGNBYTES
     228  #define STACK_ALIGNBYTES        __ALIGNBYTES
     229  #endif
 ...

 src/sys/arch/amd64/include/cdefs.h
 https://nxr.netbsd.org/xref/src/sys/arch/amd64/include/cdefs.h#6
 ...
       6  #define __ALIGNBYTES            (sizeof(long) - 1)
 ...

 Do you mean this violates x86_64 System V ABI?

 Thanks,
 rin

From: Joerg Sonnenberger <joerg@bec.de>
To: gnats-bugs@NetBSD.org
Cc: port-amd64-maintainer@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org
Subject: Re: port-amd64/54052: bump STACK_ALIGNBYTES for COMPAT_LINUX
Date: Mon, 11 Mar 2019 23:54:35 +0100

 On Mon, Mar 11, 2019 at 05:50:00AM +0000, rokuyama.rk@gmail.com wrote:
 > >Description:
 > Linux binaries with glibc >= 2.23 randomly crashes in dynamic linker.
 > By bisectioning, the cause turns out to be this commit
 > 
 > https://github.molgen.mpg.de/git-mirror/glibc/commit/38d22f9f48a84b441c5777aff103f5b980243b5f

 So the real problem is that ld.so doesn't do what any normal startup
 code does by aligning the stack explicitly. *sigh*

 Joerg

From: Rin Okuyama <rokuyama.rk@gmail.com>
To: Joerg Sonnenberger <joerg@bec.de>, gnats-bugs@NetBSD.org
Cc: port-amd64-maintainer@netbsd.org, gnats-admin@netbsd.org,
 netbsd-bugs@netbsd.org
Subject: Re: port-amd64/54052: bump STACK_ALIGNBYTES for COMPAT_LINUX
Date: Tue, 12 Mar 2019 17:55:47 +0900

 On 2019/03/12 7:54, Joerg Sonnenberger wrote:
 > On Mon, Mar 11, 2019 at 05:50:00AM +0000, rokuyama.rk@gmail.com wrote:
 >>> Description:
 >> Linux binaries with glibc >= 2.23 randomly crashes in dynamic linker.
 >> By bisectioning, the cause turns out to be this commit
 >>
 >> https://github.molgen.mpg.de/git-mirror/glibc/commit/38d22f9f48a84b441c5777aff103f5b980243b5f
 > 
 > So the real problem is that ld.so doesn't do what any normal startup
 > code does by aligning the stack explicitly. *sigh*

 I don't understand what you mean...

 (1) The bottom of the stack (i.e., %rsp = &argc) is required to be aligned
 to a 16-byte boundary by the "System V ABI - AMD64 Architecture Processor
 Supplement"

 https://www.uclibc.org/docs/psABI-x86_64.pdf

 (see pp. 29-30).

 (2) However, we align it only to an 8-byte boundary; we don't define
 STACK_ALIGNBYTES for amd64, so __ALIGNBYTES = (8 - 1) is used instead:

 src/sys/kern/kern_exec.c
 https://nxr.netbsd.org/xref/src/sys/kern/kern_exec.c#1394

    1394  static size_t
    1395  calcstack(struct execve_data * restrict data, const size_t gaplen)
    1396  {
    ....
    1415          /* make the stack "safely" aligned */
    1416          return STACK_LEN_ALIGN(stacklen, STACK_ALIGNBYTES);
    1417  }

 (3) If the bottom of the stack is aligned to a 16-byte boundary, ld.so for
 Linux works fine.

 Therefore, I think that ld.so is valid under the System V ABI. Isn't it?

 Thanks,
 rin
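
Points (1) to (3) above can be illustrated with a simplified C model. It
assumes that the initial stack pointer is obtained by subtracting the
aligned length from a 16-byte-aligned stack top. The real layout built by
kern_exec.c is more involved, and the function names and addresses here
are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round len up using an "align bytes" mask such as (8 - 1) or (16 - 1). */
static size_t
len_align(size_t len, size_t mask)
{
	assert((mask & (mask + 1)) == 0);	/* mask + 1 must be a power of two */
	return (len + mask) & ~mask;
}

/* Initial stack pointer if the aligned block is carved below stack_top. */
static uintptr_t
initial_sp(uintptr_t stack_top, size_t len, size_t mask)
{
	return stack_top - len_align(len, mask);
}

/* Return 1 if sp meets the 16-byte requirement of the AMD64 psABI. */
static int
sp_is_abi_aligned(uintptr_t sp)
{
	return (sp & 15) == 0;
}
```

In this model, a mask of (8 - 1) yields a conforming stack pointer only
when the aligned length happens to be a multiple of 16, whereas a mask of
(16 - 1) always does.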

From: Kamil Rytarowski <n54@gmx.com>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-amd64/54052: bump STACK_ALIGNBYTES for COMPAT_LINUX
Date: Tue, 12 Mar 2019 10:53:30 +0100

 On 12.03.2019 10:00, Rin Okuyama wrote:
 > The following reply was made to PR port-amd64/54052; it has been noted by GNATS.
 > 
 > From: Rin Okuyama <rokuyama.rk@gmail.com>
 > To: Joerg Sonnenberger <joerg@bec.de>, gnats-bugs@NetBSD.org
 > Cc: port-amd64-maintainer@netbsd.org, gnats-admin@netbsd.org,
 >  netbsd-bugs@netbsd.org
 > Subject: Re: port-amd64/54052: bump STACK_ALIGNBYTES for COMPAT_LINUX
 > Date: Tue, 12 Mar 2019 17:55:47 +0900
 > 
 >  On 2019/03/12 7:54, Joerg Sonnenberger wrote:
 >  > On Mon, Mar 11, 2019 at 05:50:00AM +0000, rokuyama.rk@gmail.com wrote:
 >  >>> Description:
 >  >> Linux binaries with glibc >= 2.23 randomly crashes in dynamic linker.
 >  >> By bisectioning, the cause turns out to be this commit
 >  >>
 >  >> https://github.molgen.mpg.de/git-mirror/glibc/commit/38d22f9f48a84b441c5777aff103f5b980243b5f
 >  > 
 >  > So the real problem is that ld.so doesn't do what any normal startup
 >  > code does by aligning the stack explicitly. *sigh*
 >  
 >  I don't get what you means...
 >  
 >  (1) The bottom of stack (i.e., %rsp = &argc) is required to be aligned
 >  to 16-byte boundary by "System V ABI - AMD64 Architecture Processor
 >  Supplement"
 >  
 >  https://www.uclibc.org/docs/psABI-x86_64.pdf
 >  
 >  (see pp. 29-30).
 >  
 >  (2) However, we align it to only 8-byte boundary; we don't define
 >  STACK_ALIGNBYTES for amd64, and __ALIGNBYTES = (8 - 1) is used instead:
 >  
 >  src/sys/kern/kern_exec.c
 >  https://nxr.netbsd.org/xref/src/sys/kern/kern_exec.c#1394
 >  
 >     1394  static size_t
 >     1395  calcstack(struct execve_data * restrict data, const size_t gaplen)
 >     1396  {
 >     ....
 >     1415          /* make the stack "safely" aligned */
 >     1416          return STACK_LEN_ALIGN(stacklen, STACK_ALIGNBYTES);
 >     1417  }
 >  
 >  (3) If the bottom of stack is aligned to 16-byte boundary, ld.so for
 >  Linux works fine.
 >  
 >  Therefore, I think that ld.so is legal within System V ABI. Isn't it?
 >  
 >  Thanks,
 >  rin
 >  
 > 

 We have got a real-world use-case where we want to bypass libc/csu in
 LLDB and not rely on _start in assembly, because it does not scale to
 more than 1 CPU. We are generating core(5) files in LLDB test-suite and
 we could just write _start in C like in Linux:

 https://github.com/llvm-mirror/lldb/blob/master/packages/Python/lldbsuite/test/functionalities/postmortem/elf-core/main.c

 Manual stack alignment is the only reason for us to write _start in .S
 rather than in C. We have decided to use libc for our core(5) files in
 LLDB's test-suite, but that has side effects (most importantly, 20x larger
 core dumps).

 I'm for this change to get the stack alignment right by default.

 Linux emulation is now the 2nd use-case.

From: Rin Okuyama <rokuyama.rk@gmail.com>
To: gnats-bugs@NetBSD.org, port-amd64-maintainer@netbsd.org,
 gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc: 
Subject: Re: port-amd64/54052: bump STACK_ALIGNBYTES for COMPAT_LINUX
Date: Thu, 14 Mar 2019 09:51:52 +0900

 Thank you Kamil for your comment.

 I will commit the patch this weekend, if there's no objection.

 rin

From: "Rin Okuyama" <rin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/54052 CVS commit: src/sys/arch/amd64/include
Date: Sat, 16 Mar 2019 11:50:49 +0000

 Module Name:	src
 Committed By:	rin
 Date:		Sat Mar 16 11:50:48 UTC 2019

 Modified Files:
 	src/sys/arch/amd64/include: param.h

 Log Message:
 Bump STACK_ALIGNBYTES to (16 - 1) to satisfy requirement by AMD64
 System V ABI in kernel level. This is because

 (1) for LLDB, we want to bypass libc/csu (and therefore manual stack
     alignment in _start), and

 (2) rtld in glibc >= 2.23 for Linux/x86_64 requires it.

 Fix SEGV for Linux/x86_64 binaries with glibc >= 2.23, reported as
 PR port-amd64/54052.


 To generate a diff of this commit:
 cvs rdiff -u -r1.29 -r1.30 src/sys/arch/amd64/include/param.h

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->feedback
State-Changed-By: maxv@NetBSD.org
State-Changed-When: Sun, 24 Mar 2019 11:47:43 +0000
State-Changed-Why:
Can this PR be closed?


State-Changed-From-To: feedback->pending-pullups
State-Changed-By: rin@NetBSD.org
State-Changed-When: Mon, 25 Mar 2019 04:18:14 +0000
State-Changed-Why:
Hi Max, I sent [pullup-8 #1220] and [pullup-7 #1687].


From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/54052 CVS commit: [netbsd-8] src/sys/arch/amd64/include
Date: Fri, 29 Mar 2019 19:39:06 +0000

 Module Name:	src
 Committed By:	martin
 Date:		Fri Mar 29 19:39:06 UTC 2019

 Modified Files:
 	src/sys/arch/amd64/include [netbsd-8]: param.h

 Log Message:
 Pull up following revision(s) (requested by rin in ticket #1220):

 	sys/arch/amd64/include/param.h: revision 1.30

 Bump STACK_ALIGNBYTES to (16 - 1) to satisfy requirement by AMD64
 System V ABI in kernel level. This is because

 (1) for LLDB, we want to bypass libc/csu (and therefore manual stack
      alignment in _start), and
 (2) rtld in glibc >= 2.23 for Linux/x86_64 requires it.

 Fix SEGV for Linux/x86_64 binaries with glibc >= 2.23, reported as
 PR port-amd64/54052.


 To generate a diff of this commit:
 cvs rdiff -u -r1.21.6.3 -r1.21.6.4 src/sys/arch/amd64/include/param.h

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Manuel Bouyer" <bouyer@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/54052 CVS commit: [netbsd-7] src/sys/arch/amd64/include
Date: Sat, 30 Mar 2019 18:47:16 +0000

 Module Name:	src
 Committed By:	bouyer
 Date:		Sat Mar 30 18:47:15 UTC 2019

 Modified Files:
 	src/sys/arch/amd64/include [netbsd-7]: param.h

 Log Message:
 Pull up following revision(s) (requested by rin in ticket #1687):
 	sys/arch/amd64/include/param.h: revision 1.30
 Bump STACK_ALIGNBYTES to (16 - 1) to satisfy requirement by AMD64
 System V ABI in kernel level. This is because
 (1) for LLDB, we want to bypass libc/csu (and therefore manual stack
      alignment in _start), and
 (2) rtld in glibc >= 2.23 for Linux/x86_64 requires it.
 Fix SEGV for Linux/x86_64 binaries with glibc >= 2.23, reported as
 PR port-amd64/54052.


 To generate a diff of this commit:
 cvs rdiff -u -r1.18.14.1 -r1.18.14.2 src/sys/arch/amd64/include/param.h

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: pending-pullups->closed
State-Changed-By: rin@NetBSD.org
State-Changed-When: Sun, 31 Mar 2019 21:58:36 +0000
State-Changed-Why:
Pullup done.

Linux binaries with newer glibc work fine both on netbsd-8 and -7.


>Unformatted:

Copyright © 1994-2017 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.