NetBSD Problem Report #58678

From martin@duskware.de  Thu Sep 19 10:51:49 2024
Return-Path: <martin@duskware.de>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256
	 client-signature RSA-PSS (2048 bits) client-digest SHA256)
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id C5CF51A923D
	for <gnats-bugs@gnats.NetBSD.org>; Thu, 19 Sep 2024 10:51:49 +0000 (UTC)
From: martin@NetBSD.org
Reply-To: martin@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: ntpd crashes on sparc64
X-Send-Pr-Version: 3.95

>Number:         58678
>Category:       bin
>Synopsis:       ntpd crashes on sparc64
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    bin-bug-people
>State:          feedback
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Sep 19 10:55:00 +0000 2024
>Closed-Date:    
>Last-Modified:  Tue Oct 08 22:10:02 +0000 2024
>Originator:     Martin Husemann
>Release:        NetBSD 10.99.12
>Organization:
The NetBSD Foundation, Inc.
>Environment:
System: NetBSD thirdstage.duskware.de 10.99.12 NetBSD 10.99.12 (MODULAR) #747: Thu Sep 19 10:16:21 CEST 2024 martin@thirdstage.duskware.de:/home/martin/current/src/sys/arch/sparc64/compile/MODULAR sparc64
Architecture: sparc64
Machine: sparc64
>Description:

After updating to -current as of a few hours ago, ntpd(8) won't start up
on sparc64:

Thread 2 "" received signal SIGBUS, Bus error.
alloc_res4 ()
    at /home/martin/current/src/external/bsd/ntp/dist/ntpd/ntp_restrict.c:243
243                     LINK_SLIST(resfree4, res, link);
(gdb) bt
#0  alloc_res4 ()
    at /home/martin/current/src/external/bsd/ntp/dist/ntpd/ntp_restrict.c:243
#1  hack_restrict (op=<optimized out>, resaddr=0x405ca824, 
    resmask=0xffffffffffffc918, ippeerlimit=<optimized out>, 
    mflags=<optimized out>, rflags=<optimized out>, expire=0)
    at /home/martin/current/src/external/bsd/ntp/dist/ntpd/ntp_restrict.c:705
#2  0x000000000014e2ac in create_interface (port=123, protot=0x405ca200)
    at /home/martin/current/src/external/bsd/ntp/dist/ntpd/ntp_io.c:2101
#3  update_interfaces (receiver=0x0, data=<optimized out>, port=123)
    at /home/martin/current/src/external/bsd/ntp/dist/ntpd/ntp_io.c:1912
#4  0x000000000014f140 in create_sockets (port=123)
    at /home/martin/current/src/external/bsd/ntp/dist/ntpd/ntp_io.c:2039
#5  io_open_sockets ()
    at /home/martin/current/src/external/bsd/ntp/dist/ntpd/ntp_io.c:514
#6  0x000000000013725c in config_ntpd (input_from_files=<optimized out>, 
    ptree=0x40592000)
(gdb) list    
238             }
239             rl = eallocarray(count, cb);
240             /* link all but the first onto free list */
241             res = (void *)((char *)rl + (count - 1) * cb);
242             for (i = count - 1; i > 0; i--) {
243                     LINK_SLIST(resfree4, res, link);
244                     res = (void *)((char *)res - cb);
245             }
246             DEBUG_INSIST(rl == res);
247             /* allocate the first */
(gdb) p resfree4
$1 = (restrict_u *) 0x0
(gdb) p res
$2 = (restrict_u *) 0x405e2384
(gdb) p link
$3 = {<text variable, no debug info>} 0x41467140 <link>

I guess "restrict_u" needs more than 4 byte alignment, the compiler uses
stx to store 64 bits there:

(gdb) x/16i $pc
=> 0x16da50 <hack_restrict+1680>:       stx  %g2, [ %g1 ]
(gdb) p/x $g1
$6 = 0x405e2384
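
The address in %g1 can be checked by hand. The following is a minimal sketch, not part of ntpd; the helper name is_aligned is made up for illustration. It shows that 0x405e2384 is a multiple of 4 but not of 8, which is what a 64-bit stx needs on a strict-alignment machine.

```c
#include <stdint.h>

/*
 * Hypothetical helper: return nonzero if addr is a multiple of align.
 * align must be a power of two.
 */
static int
is_aligned(uint64_t addr, uint64_t align)
{
	return (addr & (align - 1)) == 0;
}
```

Calling is_aligned(0x405e2384, 8) with the value gdb printed for res returns 0, while is_aligned(0x405e2384, 4) returns 1.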


>How-To-Repeat:
s/a

>Fix:
n/a

>Release-Note:

>Audit-Trail:
From: Christos Zoulas <christos@zoulas.com>
To: gnats-bugs@netbsd.org
Cc: gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: bin/58678: ntpd crashes on sparc64
Date: Thu, 19 Sep 2024 07:25:26 -0400

 what is cb?

 christos


From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: bin/58678: ntpd crashes on sparc64
Date: Thu, 19 Sep 2024 13:30:49 +0200

 I guess this should fix it:

 Index: ntp.h
 ===================================================================
 RCS file: /cvsroot/src/external/bsd/ntp/dist/include/ntp.h,v
 retrieving revision 1.13
 diff -u -r1.13 ntp.h
 --- ntp.h       18 Aug 2024 20:46:50 -0000      1.13
 +++ ntp.h       19 Sep 2024 11:28:27 -0000
 @@ -859,8 +859,8 @@
         restrict_u *    link;           /* link to next entry */
         u_int32         count;          /* number of packets matched */
         u_int32         expire;         /* valid until current_time */
 -       u_short         rflags;         /* restrict (accesslist) flags */
         u_int32         mflags;         /* match flags */
 +       u_short         rflags;         /* restrict (accesslist) flags */
         short           ippeerlimit;    /* limit of associations matching */
         union {                         /* variant starting here */
                 res_addr4 v4;


 (but have not yet tested it).

 They rearranged this struct recently and now union {} u is no longer properly
 aligned.

 Martin
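
The effect of the field swap can be sketched with stand-in types. This is an illustration under assumptions, not the real ntp.h: addr4_demo and addr6_demo are hypothetical replacements for res_addr4 and res_addr6, assumed to need only 4-byte alignment, and the struct names are invented. On an LP64 target the old order should put the union at offset 28 and the new order at offset 24.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for res_addr4 / res_addr6 (32-bit words only). */
struct addr4_demo { uint32_t addr, mask; };
struct addr6_demo { uint32_t addr[4], mask[4]; };

/* Field order before the patch: rflags ahead of mflags. */
struct old_layout {
	struct old_layout *link;
	uint32_t count;
	uint32_t expire;
	uint16_t rflags;	/* 2 bytes of padding follow */
	uint32_t mflags;
	int16_t  ippeerlimit;	/* 2 bytes of padding follow */
	union { struct addr4_demo v4; struct addr6_demo v6; } u;
};

/* Field order after the patch: the two 16-bit fields share one word. */
struct new_layout {
	struct new_layout *link;
	uint32_t count;
	uint32_t expire;
	uint32_t mflags;
	uint16_t rflags;
	int16_t  ippeerlimit;
	union { struct addr4_demo v4; struct addr6_demo v6; } u;
};

/* Allocation size for an IPv4-only entry, in the style of V4_SIZEOF_RESTRICT_U. */
static size_t
v4_size_old(void)
{
	return offsetof(struct old_layout, u) + sizeof(struct addr4_demo);
}

static size_t
v4_size_new(void)
{
	return offsetof(struct new_layout, u) + sizeof(struct addr4_demo);
}
```

With 8-byte pointers, v4_size_old() is 36 and v4_size_new() is 32, so under these assumptions the reordering alone already gives a multiple of 8.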

From: Martin Husemann <martin@duskware.de>
To: Christos Zoulas <christos@zoulas.com>
Cc: gnats-bugs@netbsd.org
Subject: Re: bin/58678: ntpd crashes on sparc64
Date: Thu, 19 Sep 2024 13:36:17 +0200

 On Thu, Sep 19, 2024 at 07:25:26AM -0400, Christos Zoulas wrote:
 > what is cb?

 (from a new run)

 (gdb) p cb
 $1 = 36

 Martin

From: Martin Husemann <martin@duskware.de>
To: Christos Zoulas <christos@zoulas.com>
Cc: gnats-bugs@netbsd.org
Subject: Re: bin/58678: ntpd crashes on sparc64
Date: Thu, 19 Sep 2024 13:48:37 +0200

 The problem is (manual) over-optimization:

 (gdb) p sizeof(res_addr6)
 $1 = 32
 (gdb) p sizeof(res_addr4)
 $2 = 8
 (gdb) p sizeof(restrict_u)
 $3 = 64

 So shaving 24 bytes off for each IPv4-only use of restrict_u makes the
 following entries mis-aligned.

 An easy fix (suggested by mlelstv) is:

 #define V6_SIZEOF_RESTRICT_U sizeof(struct restrict_u_tag)
 #define V4_SIZEOF_RESTRICT_U V6_SIZEOF_RESTRICT_U

 that is: always use the full allocation. This could be restricted to
 alignment-critical architectures. Plus the rearranging to avoid internal
 padding as I suggested before, to save 4 bytes per struct on those
 architectures.

 Martin
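
Why the truncated size breaks the free list can be shown with plain arithmetic. The sketch below is only an illustration; the function name misaligned_elements is made up. It counts how many entries of an array start at an offset that is not a multiple of the required alignment when the entries are laid out stride bytes apart from an aligned base, which is what the alloc_res4 loop does with cb.

```c
#include <stddef.h>

/*
 * Hypothetical helper: of `count` entries placed `stride` bytes apart,
 * starting at an `align`-aligned base, return how many begin at an
 * offset that is not a multiple of `align`.
 */
static size_t
misaligned_elements(size_t count, size_t stride, size_t align)
{
	size_t i, bad = 0;

	for (i = 0; i < count; i++) {
		if ((i * stride) % align != 0)
			bad++;
	}
	return bad;
}
```

With cb = 36 and 8-byte alignment every odd-numbered entry is off by 4 bytes, so misaligned_elements(4, 36, 8) is 2. With a stride of 64, the full sizeof(restrict_u), the result is 0.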

From: Martin Husemann <martin@duskware.de>
To: Christos Zoulas <christos@zoulas.com>
Cc: gnats-bugs@netbsd.org
Subject: Re: bin/58678: ntpd crashes on sparc64
Date: Thu, 19 Sep 2024 22:21:28 +0200

 This patch seems to work for me.

 Martin

 Index: external/bsd/ntp/dist/include/ntp.h
 ===================================================================
 RCS file: /cvsroot/src/external/bsd/ntp/dist/include/ntp.h,v
 retrieving revision 1.13
 diff -u -r1.13 ntp.h
 --- external/bsd/ntp/dist/include/ntp.h	18 Aug 2024 20:46:50 -0000	1.13
 +++ external/bsd/ntp/dist/include/ntp.h	19 Sep 2024 20:03:53 -0000
 @@ -859,18 +859,25 @@
  	restrict_u *	link;		/* link to next entry */
  	u_int32		count;		/* number of packets matched */
  	u_int32		expire;		/* valid until current_time */
 -	u_short		rflags;		/* restrict (accesslist) flags */
  	u_int32		mflags;		/* match flags */
 +	u_short		rflags;		/* restrict (accesslist) flags */
  	short		ippeerlimit;	/* limit of associations matching */
  	union {				/* variant starting here */
  		res_addr4 v4;
  		res_addr6 v6;
  	} u;
  };
 +
 +#if defined(_LP64_) && (defined(__sparc__) || defined(__mips__) \
 +	|| defined(__powerpc__))
 +#define	V4_SIZEOF_RESTRICT_U	sizeof(restrict_u)
 +#define	V6_SIZEOF_RESTRICT_U	sizeof(restrict_u)
 +#else
  #define	V4_SIZEOF_RESTRICT_U	(offsetof(restrict_u, u)	\
  				 + sizeof(res_addr4))
  #define	V6_SIZEOF_RESTRICT_U	(offsetof(restrict_u, u)	\
  				 + sizeof(res_addr6))
 +#endif

  /* restrictions for (4) a given address */
  typedef struct r4addr_tag	r4addr;


From: Jason Thorpe <thorpej@me.com>
To: gnats-bugs@netbsd.org
Cc: gnats-admin@netbsd.org,
 netbsd-bugs@netbsd.org,
 "martin@netbsd.org" <martin@NetBSD.org>
Subject: Re: bin/58678: ntpd crashes on sparc64
Date: Thu, 19 Sep 2024 13:40:37 -0700

 > On Sep 19, 2024, at 1:25 PM, Martin Husemann via gnats <gnats-admin@NetBSD.org> wrote:
 >
 > +#if defined(_LP64_) && (defined(__sparc__) || defined(__mips__) \
 > + || defined(__powerpc__))
 > +#define V4_SIZEOF_RESTRICT_U sizeof(restrict_u)
 > +#define V6_SIZEOF_RESTRICT_U sizeof(restrict_u)
 > +#else
 >  #define V4_SIZEOF_RESTRICT_U (offsetof(restrict_u, u) \
 >    + sizeof(res_addr4))
 >  #define V6_SIZEOF_RESTRICT_U (offsetof(restrict_u, u) \
 >    + sizeof(res_addr6))
 > +#endif

 Rather than listing architectures, what about keying off of __NO_STRICT_ALIGNMENT?

 -- thorpej

From: Martin Husemann <martin@duskware.de>
To: Jason Thorpe <thorpej@me.com>
Cc: gnats-bugs@netbsd.org
Subject: Re: bin/58678: ntpd crashes on sparc64
Date: Fri, 20 Sep 2024 08:06:29 +0200

 On Thu, Sep 19, 2024 at 01:40:37PM -0700, Jason Thorpe wrote:
 > Rather than listing architectures, what about keying off of __NO_STRICT_ALIGNMENT?

 That is NetBSD specific, isn't it?
 I had hoped there is a chance to upstream this change (the bug is not NetBSD
 specific).

 Martin

From: Christos Zoulas <christos@zoulas.com>
To: gnats-bugs@netbsd.org
Cc: gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, martin@netbsd.org
Subject: Re: bin/58678: ntpd crashes on sparc64
Date: Fri, 20 Sep 2024 07:37:19 -0400

 > On Sep 20, 2024, at 2:10 AM, Martin Husemann via gnats <gnats-admin@netbsd.org> wrote:
 >
 >> On Thu, Sep 19, 2024 at 01:40:37PM -0700, Jason Thorpe wrote:
 >> Rather than listing architectures, what about keying off of __NO_STRICT_ALIGNMENT?
 >
 > That is NetBSD specific, isn't it?
 > I had hoped there is a chance to upstream this change (the bug is not NetBSD
 > specific).

 Yes I would send it to harlan, but I would prefer that it was done more portably, i.e. arrange for the struct to be always aligned properly so that there are no machine specific ifdefs.

 christos

From: Martin Husemann <martin@duskware.de>
To: Christos Zoulas <christos@zoulas.com>
Cc: gnats-bugs@netbsd.org
Subject: Re: bin/58678: ntpd crashes on sparc64
Date: Fri, 20 Sep 2024 13:50:36 +0200

 On Fri, Sep 20, 2024 at 07:37:19AM -0400, Christos Zoulas wrote:
 > Yes I would send it to harlan, but I would prefer that it was done more portably, i.e. arrange for the struct to be always aligned properly so that there are no machine specific ifdefs.

 The struct is properly non-padded and aligned with the patch I
 suggested, but the tricks they play to shave off a few bytes when storing
 only an IPv4 address make things complicated.

 The easiest way is to avoid the tricks altogether (basically what the patch
 does now for the alignment-critical machines, but the code could be simplified
 if that were done for all).

 However, on tiny-RAM machines that are not alignment-critical the tricks save
 a bit of memory, so I left that in place and used the ifdefs.

 I would leave that to upstream's choice - this is the minimally intrusive fix.

 Martin

From: Martin Husemann <martin@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: bin/58678: ntpd crashes on alignment critical architectures
Date: Wed, 2 Oct 2024 05:02:07 +0000

 ----- Forwarded message from Christos Zoulas <christos@netbsd.org> -----

 Date: Tue, 1 Oct 2024 16:59:51 -0400
 From: Christos Zoulas <christos@netbsd.org>
 To: source-changes@NetBSD.org
 Subject: CVS commit: src/external/bsd/ntp/dist
 X-Mailer: log_accum

 Module Name:	src
 Committed By:	christos
 Date:		Tue Oct  1 20:59:51 UTC 2024

 Modified Files:
 	src/external/bsd/ntp/dist/include: ntp.h ntp_lists.h ntpd.h
 	src/external/bsd/ntp/dist/ntpd: ntp_control.c ntp_request.c
 	    ntp_restrict.c

 Log Message:
 Don't play pointer tricks to save memory, just declare a struct for v4 and
 one for v6... Fixes alignment issues on machines that have strict alignment
 requirements (eg. sparc64)


 To generate a diff of this commit:
 cvs rdiff -u -r1.13 -r1.14 src/external/bsd/ntp/dist/include/ntp.h \
     src/external/bsd/ntp/dist/include/ntpd.h
 cvs rdiff -u -r1.7 -r1.8 src/external/bsd/ntp/dist/include/ntp_lists.h
 cvs rdiff -u -r1.24 -r1.25 src/external/bsd/ntp/dist/ntpd/ntp_control.c
 cvs rdiff -u -r1.19 -r1.20 src/external/bsd/ntp/dist/ntpd/ntp_request.c
 cvs rdiff -u -r1.12 -r1.13 src/external/bsd/ntp/dist/ntpd/ntp_restrict.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.


 ----- End forwarded message -----
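
The idea in the log message can be sketched as follows. This is not the committed code: the struct names, the stand-in address types and the helper functions are all hypothetical. The point it illustrates is that the size of a complete C struct is always a multiple of its alignment, so arrays allocated with the full sizeof of a per-family struct keep every element aligned without machine-specific ifdefs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the IPv4 and IPv6 address/mask pairs. */
struct addr4_sketch { uint32_t addr, mask; };
struct addr6_sketch { uint8_t addr[16], mask[16]; };

/* One complete struct per address family instead of a truncated union. */
struct restrict4_sketch {
	struct restrict4_sketch *link;
	uint32_t count, expire, mflags;
	uint16_t rflags;
	int16_t  ippeerlimit;
	struct addr4_sketch v4;
};

struct restrict6_sketch {
	struct restrict6_sketch *link;
	uint32_t count, expire, mflags;
	uint16_t rflags;
	int16_t  ippeerlimit;
	struct addr6_sketch v6;
};

/* Array stride for each family; _Alignof requires C11. */
static size_t
stride4(void)
{
	return sizeof(struct restrict4_sketch);
}

static size_t
stride6(void)
{
	return sizeof(struct restrict6_sketch);
}

_Static_assert(sizeof(struct restrict4_sketch) %
    _Alignof(struct restrict4_sketch) == 0, "v4 stride keeps alignment");
_Static_assert(sizeof(struct restrict6_sketch) %
    _Alignof(struct restrict6_sketch) == 0, "v6 stride keeps alignment");
```

On an LP64 target these sketches come out at 32 and 56 bytes, so an IPv4 entry still takes less memory than an IPv6 one.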

State-Changed-From-To: open->feedback
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Tue, 08 Oct 2024 22:05:41 +0000
State-Changed-Why:
Fixed in HEAD?  Does this need pullups?


From: Christos Zoulas <christos@zoulas.com>
To: gnats-bugs@netbsd.org
Cc: gnats-admin@netbsd.org,
 netbsd-bugs@netbsd.org,
 "riastradh@netbsd.org" <riastradh@NetBSD.org>,
 "martin@netbsd.org" <martin@NetBSD.org>
Subject: Re: bin/58678 (ntpd crashes on sparc64)
Date: Tue, 8 Oct 2024 18:06:56 -0400

 This broke with the latest version of ntp, so my guess is no.

 christos


>Unformatted:
