NetBSD Problem Report #56533

From gson@gson.org  Fri Dec  3 09:58:12 2021
Return-Path: <gson@gson.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 432B11A9239
	for <gnats-bugs@gnats.NetBSD.org>; Fri,  3 Dec 2021 09:58:12 +0000 (UTC)
Message-Id: <20211203095801.3D13B254286@guava.gson.org>
Date: Fri,  3 Dec 2021 11:58:01 +0200 (EET)
From: gson@gson.org (Andreas Gustafsson)
Reply-To: gson@gson.org (Andreas Gustafsson)
To: gnats-bugs@NetBSD.org
Subject: fs/ffs/t_miscquota:default_deny_user_big fails randomly
X-Send-Pr-Version: 3.95

>Number:         56533
>Category:       kern
>Synopsis:       fs/ffs/t_miscquota:default_deny_user_big fails randomly
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Dec 03 10:00:01 +0000 2021
>Closed-Date:    Wed Dec 08 10:18:43 +0000 2021
>Last-Modified:  Wed Dec 08 10:18:43 +0000 2021
>Originator:     Andreas Gustafsson
>Release:        NetBSD-current, source date >= 2021.09.30.03.23.48
>Organization:

>Environment:
System: NetBSD
Architecture: i386
Machine: i386
>Description:

On multiple i386 testbeds, the fs/ffs/t_miscquota:default_deny_user_big
test case fails in about half the runs.  For example, in this recent
run on the TNF testbed, it was the sole unexpected failure:

  http://releng.netbsd.org/b5reports/i386/2021/2021.12.03.05.28.32/test.html#failed-tcs-summary

It's failing not only under qemu, but also on real i386 hardware:

  https://www.gson.org/netbsd/bugs/build/i386-laptop/2021/2021.11.28.11.49.10/test.html#fs_ffs_t_miscquota_default_deny_user_big

Bisection indicates that the problem started with this commit (which
is strange, as there is no obvious connection between the change and
the failing test case):

  2021.09.30.03.23.48 yamaguchi src/sys/net/if.c 1.492
  2021.09.30.03.23.48 yamaguchi src/sys/net/if.h 1.294

>How-To-Repeat:

Install emulators/qemu and misc/py-anita from pkgsrc and run:

$ anita --memory-size 128M interact http://nycdn.netbsd.org/pub/NetBSD-daily/HEAD/latest/i386/
login: root
# cd /usr/tests/fs/ffs
# while atf-run t_miscquota:default_deny_user_big; do true; done

>Fix:

>Release-Note:

>Audit-Trail:
From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/56533: fs/ffs/t_miscquota:default_deny_user_big fails randomly
Date: Sat, 4 Dec 2021 23:34:12 +0200

 I tracked this down to a bug in librumpuser.

 When rumpuser_sp_init() sets up a unix domain socket for the rump
 service, it calls bind() using a socket address that is malloced in
 unix_parse().  The size of the allocation is sufficient for the actual
 pathname in the sun_path field and its terminating NUL, but does not
 include the unused remainder of sun_path.  Nonetheless, bind() is
 called with a namelen argument of sizeof(struct sockaddr_un), which
 does include the unused part of sun_path.  In rare circumstances, the
 allocation may happen to be immediately followed by an unmapped page,
 and then bind() will return EFAULT causing the test to fail.

 The commit of src/sys/net/if.c 1.492 probably just changed the pattern
 of allocations such that those rare circumstances happen in this
 particular test case on this particular port.

 I intend to fix this by making unix_parse() always allocate a
 full-sized struct sockaddr_un on all host systems rather than just
 some of them, which not only fixes the bug but also reduces the amount
 of system dependent code:

 Index: sp_common.c
 ===================================================================
 RCS file: /cvsroot/src/lib/librumpuser/sp_common.c,v
 retrieving revision 1.42
 diff -u -r1.42 sp_common.c
 --- sp_common.c	13 Jun 2020 16:51:59 -0000	1.42
 +++ sp_common.c	4 Dec 2021 21:20:16 -0000
 @@ -670,12 +670,10 @@
  		}
  	}
  	strcat(s_un.sun_path, addr);
 -#if defined(__linux__) || defined(__sun__) || defined(__CYGWIN__)
 -	slen = sizeof(s_un);
 -#else
 +#if !(defined(__linux__) || defined(__sun__) || defined(__CYGWIN__))
  	s_un.sun_len = SUN_LEN(&s_un);
 -	slen = s_un.sun_len+1; /* get the 0 too */
  #endif
 +	slen = sizeof(s_un);

  	if (savepath && *parsedurl == '\0') {
  		snprintf(parsedurl, sizeof(parsedurl),
 -- 
 Andreas Gustafsson, gson@gson.org

From: "Andreas Gustafsson" <gson@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/56533 CVS commit: src/lib/librumpuser
Date: Tue, 7 Dec 2021 10:39:33 +0000

 Module Name:	src
 Committed By:	gson
 Date:		Tue Dec  7 10:39:33 UTC 2021

 Modified Files:
 	src/lib/librumpuser: sp_common.c

 Log Message:
 In unix_parse(), always allocate memory for the entire struct sockaddr_un
 and not just the part used by the present pathname, because the entire
 struct will be passed to bind() and an EFAULT can result if not all of
 it is a valid allocation.  Fixes PR kern/56533.


 To generate a diff of this commit:
 cvs rdiff -u -r1.42 -r1.43 src/lib/librumpuser/sp_common.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->closed
State-Changed-By: gson@NetBSD.org
State-Changed-When: Wed, 08 Dec 2021 10:18:43 +0000
State-Changed-Why:
Fixed.


>Unformatted:

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.46 2020/01/03 16:35:01 leot Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2020 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.