NetBSD Problem Report #55504

From gson@gson.org  Mon Jul 20 18:33:45 2020
Return-Path: <gson@gson.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 300081A9213
	for <gnats-bugs@gnats.NetBSD.org>; Mon, 20 Jul 2020 18:33:45 +0000 (UTC)
Message-Id: <20200720183339.69868253EDD@guava.gson.org>
Date: Mon, 20 Jul 2020 21:33:39 +0300 (EEST)
From: gson@gson.org (Andreas Gustafsson)
Reply-To: gson@gson.org (Andreas Gustafsson)
To: gnats-bugs@NetBSD.org
Subject: evbarm-earmv7hf testbed hangs during sbin/ifconfig/t_repeated_updown test
X-Send-Pr-Version: 3.95

>Number:         55504
>Category:       port-evbarm
>Synopsis:       evbarm-earmv7hf testbed hangs during sbin/ifconfig/t_repeated_updown test
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    port-evbarm-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Jul 20 18:35:00 +0000 2020
>Last-Modified:  Sat Jul 25 20:20:01 +0000 2020
>Originator:     Andreas Gustafsson
>Release:        NetBSD-current, source date >= c. 2020.06.26.10.14.32
>Organization:

>Environment:
System: NetBSD
Architecture: earmv7hf
Machine: evbarm
>Description:

The TNF evbarm-earmv7hf testbed is failing to complete the ATF tests
because the system under test hangs at the repeated_updown test case
of the sbin/ifconfig/t_repeated_updown test.  This has presumably been
the case since the test was added on source date 2020.06.25.15.41.40,
but because the test was added while the build was broken, the first
test run demonstrating the problem was with sources from 2020.06.26.10.14.32:

  http://releng.netbsd.org/b5reports/evbarm-earmv7hf/commits-2020.06.html#2020.06.26.10.14.32

>How-To-Repeat:
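
The audit trail below reproduces the hang with "atf-run t_repeated_updown"
on a qemu system that has an smsh0 interface.  As a rough aid for
reproducing it by hand without ATF, the following is a minimal sketch of
a down/up loop.  It assumes that the test essentially toggles each
interface down and up repeatedly, as the "Test 35: smsh0 down" /
"Test 35: smsh0 up" output suggests.  It is not the actual test script,
and it is not confirmed to trigger the hang.  The function name updown,
the IFCONFIG variable, and the iteration numbering are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, NOT the t_repeated_updown test itself: toggle one
# interface down and up repeatedly, printing each step in a
# "Test N: <if> <state>" form similar to the atf-run output.
#
# IFCONFIG defaults to a dry run ("echo ifconfig") so that nothing is
# changed by accident; set IFCONFIG=ifconfig to really toggle the
# interface (needs root, and may hang an affected system).
IFCONFIG=${IFCONFIG:-"echo ifconfig"}

updown() {
	ifname=$1		# interface to toggle, e.g. smsh0
	count=${2:-100}		# number of down/up rounds (arbitrary default)
	n=1
	while [ "$n" -le "$count" ]; do
		echo "Test $n: $ifname down"
		$IFCONFIG "$ifname" down || return 1
		echo "Test $n: $ifname up"
		$IFCONFIG "$ifname" up || return 1
		n=$((n + 1))
	done
}
```

With the default dry run, "updown smsh0 2" only prints the commands it
would run.  Setting IFCONFIG=ifconfig before calling updown runs them for
real, which should only be done on a disposable system.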

>Fix:

>Audit-Trail:
From: Jukka Ruohonen <jruohonen@iki.fi>
To: gnats-bugs@netbsd.org
Cc: port-evbarm-maintainer@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org
Subject: Re: port-evbarm/55504: evbarm-earmv7hf testbed hangs during
 sbin/ifconfig/t_repeated_updown test
Date: Tue, 21 Jul 2020 08:41:26 +0300

 On Mon, Jul 20, 2020 at 06:35:01PM +0000, Andreas Gustafsson wrote:
 > The TNF evbarm-earmv7hf testbed is failing to complete the ATF tests
 > because the system under test hangs at the repeated_updown test case
 > of the sbin/ifconfig/t_repeated_updown test.  This has presumably been
 > the case since the test was added on source date 2020.06.25.15.41.40,
 > but because the test was added while the build was broken, the first
 > test run demonstrating the problem was with sources from 2020.06.26.10.14.32:

 The reason is likely the same as in PR kern/55466, i.e. not related to 
 this particular, simple test per se.

 - Jukka

From: Rin Okuyama <rokuyama.rk@gmail.com>
To: gnats-bugs@netbsd.org, port-evbarm-maintainer@netbsd.org,
 gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, jruohonen@iki.fi,
 Andreas Gustafsson <gson@gson.org>
Cc: 
Subject: Re: port-evbarm/55504: evbarm-earmv7hf testbed hangs during
 sbin/ifconfig/t_repeated_updown test
Date: Tue, 21 Jul 2020 15:17:55 +0900

 This problem is discussed on tech-net:

 http://mail-index.netbsd.org/tech-net/2020/07/18/msg007797.html
 ...
 http://mail-index.netbsd.org/tech-net/2020/07/19/msg007810.html

From: Andreas Gustafsson <gson@gson.org>
To: Jukka Ruohonen <jruohonen@iki.fi>, gnats-bugs@netbsd.org
Cc: martin@NetBSD.org
Subject: Re: port-evbarm/55504: evbarm-earmv7hf testbed hangs during
 sbin/ifconfig/t_repeated_updown test
Date: Fri, 24 Jul 2020 12:17:47 +0300

 Jukka Ruohonen wrote:
 >  The reason is likely the same as in PR kern/55466, i.e. not related to 
 >  this particular, simple test per se.

 I don't think the present bug is the same as PR kern/55466, as this
 one can in fact be triggered by running "this particular, simple test"
 only; no hung rump_server processes are involved or required.

 Using a system built from 2020.07.22.01.24.40 sources (which has
 t_repeated_updown 1.3) under qemu 5.0.0, running

   sysctl -w hw.cnmagic=+
   cd /usr/tests/sbin/ifconfig
   atf-run t_repeated_updown

 hangs after printing "smsh0 up":

   tc-start: 1595581029.664487, repeated_updown
   tc-so:Test 35: smsh0 down
   tc-so:Test 35: smsh0 up

 The system hangs hard such that typing the cnmagic character on the
 emulated serial console has no effect.

 Note that t_repeated_updown.sh 1.4 will no longer trigger the hang.
 Also, if the system is started with "anita --machine virt", it will
 have no smsh interface and therefore will not hang.
 -- 
 Andreas Gustafsson, gson@gson.org

From: Martin Husemann <martin@duskware.de>
To: Andreas Gustafsson <gson@gson.org>
Cc: Jukka Ruohonen <jruohonen@iki.fi>, gnats-bugs@netbsd.org,
	martin@NetBSD.org
Subject: Re: port-evbarm/55504: evbarm-earmv7hf testbed hangs during
 sbin/ifconfig/t_repeated_updown test
Date: Fri, 24 Jul 2020 11:33:55 +0200

 On Fri, Jul 24, 2020 at 12:17:47PM +0300, Andreas Gustafsson wrote:
 > hangs after printing "smsh0 up":
 > 
 >   tc-start: 1595581029.664487, repeated_updown
 >   tc-so:Test 35: smsh0 down
 >   tc-so:Test 35: smsh0 up
 > 
 > The system hangs hard such that typing the cnmagic character on the
 > emulated serial console has no effect.

 This sounds like a generic driver bug (maybe since the media locking
 changes or per-CPU interface stats) - any chance you could manually
 try this with a DIAGNOSTIC/DEBUG/LOCKDEBUG kernel? ... or a qemu bug.
 Anyone with a real gumstix board?

 > Note that t_repeated_updown.sh 1.4 will no longer trigger the hang.
 > Also, if the system is started with "anita --machine virt", it will
 > have no smsh interface and therefore will not hang.

 Is that because the interface is configured UP in your test setup or
 does the test skip it erroneously?

 Martin

From: Andreas Gustafsson <gson@gson.org>
To: Martin Husemann <martin@duskware.de>
Cc: gnats-bugs@netbsd.org,
    Jukka Ruohonen <jruohonen@iki.fi>
Subject: Re: port-evbarm/55504: evbarm-earmv7hf testbed hangs during sbin/ifconfig/t_repeated_updown test
Date: Fri, 24 Jul 2020 12:54:31 +0300

 Martin Husemann wrote:
 > This sounds like a generic driver bug (maybe since the media locking
 > changes or per-CPU interface stats) - any chance you could manually
 > try this with a DIAGNOSTIC/DEBUG/LOCKDEBUG kernel?

 DIAGNOSTIC should be enabled already as it is enabled by default in
 -current (but I can't check using config -x because it appears to be
 broken).  I'll see what I can do about the others.

 > > Note that t_repeated_updown.sh 1.4 will no longer trigger the hang.
 > > Also, if the system is started with "anita --machine virt", it will
 > > have no smsh interface and therefore will not hang.
 > 
 > Is that because the interface is configured UP in your test setup or
 > does the test skip it erroneously?

 I believe it is configured UP due to dhcpcd being enabled by default
 on the evbarm-earmv7hf images.
 -- 
 Andreas Gustafsson, gson@gson.org

From: Andreas Gustafsson <gson@gson.org>
To: Martin Husemann <martin@duskware.de>
Cc: Jukka Ruohonen <jruohonen@iki.fi>,
    gnats-bugs@netbsd.org
Subject: Re: port-evbarm/55504: evbarm-earmv7hf testbed hangs during
 sbin/ifconfig/t_repeated_updown test
Date: Sat, 25 Jul 2020 20:48:52 +0300

 Martin Husemann wrote:
 > any chance you could manually try this with a
 > DIAGNOSTIC/DEBUG/LOCKDEBUG kernel?

 Done.  It made no difference; the system locked up as before
 without providing any additional diagnostics.
 -- 
 Andreas Gustafsson, gson@gson.org
