NetBSD Problem Report #58073
From www@netbsd.org Sun Mar 24 11:50:51 2024
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 3F0401A9239
for <gnats-bugs@gnats.NetBSD.org>; Sun, 24 Mar 2024 11:50:51 +0000 (UTC)
Message-Id: <20240324115049.1C22F1A923A@mollari.NetBSD.org>
Date: Sun, 24 Mar 2024 11:50:49 +0000 (UTC)
From: jspath55@gmail.com
Reply-To: jspath55@gmail.com
To: gnats-bugs@NetBSD.org
Subject: panic: Trap: Data Abort (EL1): Translation Fault L0 on Pi3 during automated tests
X-Send-Pr-Version: www-1.0
>Number: 58073
>Category: port-evbarm
>Synopsis: panic: Trap: Data Abort (EL1): Translation Fault L0 on Pi3 during automated tests
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: port-evbarm-maintainer
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Mar 24 11:55:00 +0000 2024
>Last-Modified: Fri Mar 29 14:20:02 +0000 2024
>Originator: Jim Spath
>Release: 10.0 RC6
>Organization:
>Environment:
NetBSD nb3b.home 10.0_RC6 NetBSD 10.0_RC6 (GENERIC64) #0: Tue Mar 12 10:19:02 UTC 2024 mkrepro@mkrepro.NetBSD.org:/usr/src/sys/arch/evbarm/compile/GENERIC64 evbarm
>Description:
After several uneventful /usr/tests runs on a Pi3B, the system panicked and rebooted.
The test suite was nearly complete (test program 757 of 935); log portions follow below.
It looks like the last commands run were in the snapshotstress case of fs/ffs/t_snapshot.
Automated Test Framework run command:
/usr/bin/atf-run | /usr/bin/tee /log/tests-${nm}-${dt}.log | /usr/bin/atf-report >/log/tests-${nm}-${dt}.txt 2>/log/tests-${nm}-${dt}.err
Log files:
-rw-r--r-- 1 root wheel 413629 Mar 22 08:03 tests-nb3b.home-202403220442.txt
-rw-r--r-- 1 root wheel 10327858 Mar 22 08:03 tests-nb3b.home-202403220442.log
$ tail /log/tests-nb3b.home-202403220442.txt
extattr_simple: [0.297141s] Passed.
[0.600340s]
fs/ffs/t_fifos (756/935): 1 test cases
fifos: [0.281438s] Passed.
[0.292858s]
fs/ffs/t_snapshot (757/935): 2 test cases
snapshot: [1.241567s] Passed.
snapshotstress:
$ tail /log/tests-nb3b.home-202403220442.log
tc-so:super-block backups (for fsck_ffs -b #) at:
tc-so:32, 2536, 5040, 7544,
tc-so:[ 1.0000000] entropy: ready
tc-so:[ 2.0100050] /dev/fss0: file system not clean (fs_clean=0); please fsck(8)
tc-end: 1711094632.792737, snapshot, passed
tc-start: 1711094632.804160, snapshotstress
tc-so:ffs.img: 4.9MB (10000 sectors) block size 4096, fragment size 512
tc-so: using 4 cylinder groups of 1.22MB, 313 blks, 608 inodes.
tc-so:super-block backups (for fsck_ffs -b #) at:
tc-so:32, 2536, 5040, 7544,
End of /var/log/messages:
Mar 22 07:40:29 nb3b ntpd[1739]: error resolving pool 1.netbsd.pool.ntp.org: Temporary failure in name resolution (2)
Mar 22 07:40:55 nb3b root: /usr/tests/sys/rc/h_simple: ERROR: the restart command does not take any parameters
Mar 22 07:40:55 nb3b root: /usr/tests/sys/rc/h_simple: ERROR: the start command does not take any parameters
Mar 22 07:40:56 nb3b root: /usr/tests/sys/rc/h_simple: ERROR: the stop command does not take any parameters
Mar 22 07:40:57 nb3b dhcpcd[731]: ps_root_recvmsg: Host is down
Mar 22 07:41:34 nb3b ntpd[1739]: error resolving pool 1.netbsd.pool.ntp.org: Temporary failure in name resolution (2)
Mar 22 07:42:02 nb3b dhcpcd[731]: ps_root_recvmsg: Host is down
Mar 22 08:03:31 nb3b inetd[10767]: 5439/tcp: max spawn rate (0 in 60 seconds) already met; closing for 600 seconds
Mar 22 08:05:04 syslogd[715]: restart
Mar 22 08:05:04 /netbsd: [ 465954.2272048] panic: Trap: Data Abort (EL1): Translation Fault L0 with read access for 000000000000005a: pc ffffc000003d0e34: ldrb w3, [x22,#90]
Mar 22 08:05:04 /netbsd:
Mar 22 08:05:04 /netbsd: [ 465954.2272048] cpu0: Begin traceback...
Mar 22 08:05:04 /netbsd: [ 465954.2272048] trace fp ffffc0009a497550
Mar 22 08:05:04 /netbsd: [ 465954.2272048] fp ffffc0009a497580 vpanic() at ffffc000004ef218 netbsd:vpanic+0x178
Mar 22 08:05:04 /netbsd: [ 465954.2272048] fp ffffc0009a4975e0 panic() at ffffc000004ef324 netbsd:panic+0x44
Mar 22 08:05:04 /netbsd: [ 465954.2272048] fp ffffc0009a497670 data_abort_handler() at ffffc000000a962c netbsd:data_abort_handler+0x1ec
Mar 22 08:05:04 /netbsd: [ 465954.2272048] tf ffffc0009a4976e0 el1_trap() at ffffc000000aaf84 netbsd:el1_vectors+0x784
Mar 22 08:05:04 /netbsd: [ 465954.2272048] ---- Data Abort (EL1): trapframe 0xffffc0009a4976e0 (304 bytes) ----
Mar 22 08:05:04 /netbsd: [ 465954.2272048] pc=ffffc000003d0e34, spsr=0000000020000005
Mar 22 08:05:04 /netbsd: [ 465954.2272048] esr=0000000096000004, far=000000000000005a
Mar 22 08:05:04 /netbsd: [ 465954.2272048] x0=ffff00000b1ca9f8, x1=ffff00003a9cc0d0
Mar 22 08:05:04 /netbsd: [ 465954.2272048] x2=0000000000000000, x3=0000000000000001
Mar 22 08:05:04 /netbsd: [ 465954.2272048] x4=ffff00003a9fe3e8, x5=ffff00003abf7000
Mar 22 08:05:04 /netbsd: [ 465954.2272048] x6=ffff00002858c4d0, x7=ffff00003a9c9b80
Mar 22 08:05:04 /netbsd: [ 465954.2272048] x8=0000000000000418, x9=ffff00003b0d9bc0
Mar 22 08:05:04 /netbsd: [ 465954.2272048] x10=ffffc000000a07d4, x11=000000000000003f
Mar 22 08:05:04 /netbsd: [ 465954.2272048] x12=000003fffffff738, x13=000003fffffff746
Mar 22 08:05:04 /netbsd: [ 465954.2272048] x14=0000000000000005, x15=0000fffffffdd1b0
Mar 22 08:05:04 /netbsd: [ 465954.2272048] x16=ffffc0000009e384, x17=0000f5864516c4f4
Mar 22 08:05:04 /netbsd: [ 465954.2272048] x18=00000000ffffffff, x19=ffff00003a9cc200
Mar 22 08:05:04 /netbsd: [ 465954.2272048] x20=ffff00000b1ca9a0, x21=ffff00003a9cc250
Mar 22 08:05:04 /netbsd: [ 465954.2272048] x22=0000000000000000, x23=ffff00003a9fe000
Mar 22 08:05:04 /netbsd: [ 465954.2272048] x24=ffff00003a8a2640, x25=ffff00003a9fe400
Mar 22 08:05:04 /netbsd: [ 465954.2272048] x26=ffff00003a9fe3e8, x27=000000000000005a
Mar 22 08:05:04 /netbsd: [ 465954.2272048] x28=ffff00003a8a2670, fp=x29=ffffc0009a497a10
Mar 22 08:05:04 /netbsd: [ 465954.2272048] lr=x30=ffffc000003d1e00, sp=ffffc0009a497a10
Mar 22 08:05:04 /netbsd: [ 465954.2272048] ------------------------------------------------
Mar 22 08:05:04 /netbsd: [ 465954.2272048] fp ffffc0009a497a10 dwc2_assign_and_init_hc() at ffffc000003d0e34 netbsd:dwc2_assign_and_init_hc+0x84
Mar 22 08:05:04 /netbsd: [ 465954.2272048] fp ffffc0009a497a90 dwc2_hcd_select_transactions() at ffffc000003d1dfc netbsd:dwc2_hcd_select_transactions+0x15c
Mar 22 08:05:04 /netbsd: [ 465954.2272048] fp ffffc0009a497b00 dwc2_release_channel() at ffffc000003d48e0 netbsd:dwc2_release_channel+0xe0
Mar 22 08:05:04 /netbsd: [ 465954.2272048] fp ffffc0009a497b30 dwc2_hc_xfercomp_intr() at ffffc000003d58ec netbsd:dwc2_hc_xfercomp_intr+0x32c
Mar 22 08:05:04 /netbsd: [ 465954.2272048] fp ffffc0009a497b80 dwc2_handle_hcd_intr() at ffffc000003d68ac netbsd:dwc2_handle_hcd_intr+0x568
Mar 22 08:05:04 /netbsd: [ 465954.2272048] fp ffffc0009a497c00 dwc2_intr() at ffffc000003cc4b8 netbsd:dwc2_intr+0xe4
Mar 22 08:05:04 /netbsd: [ 465954.2272048] fp ffffc0009a497c30 bcm2835_icu_intr() at ffffc0000001d77c netbsd:bcm2835_icu_intr+0x1c
Mar 22 08:05:04 /netbsd: [ 465954.2272048] fp ffffc0009a497c50 pic_dispatch() at ffffc000000023a8 netbsd:pic_dispatch+0x44
Mar 22 08:05:04 /netbsd: [ 465954.2272048] fp ffffc0009a497c90 pic_do_pending_ints() at ffffc00000002858 netbsd:pic_do_pending_ints+0x358
Mar 22 08:05:04 /netbsd: [ 465954.2272048] fp ffffc0009a497e28 cpu_idle() at ffffc000000a56f0 netbsd:cpu_idle+0x4c
Mar 22 08:05:04 /netbsd: [ 465954.2272048] fp ffffc0009a497e70 idle_loop() at ffffc000004a2844 netbsd:idle_loop+0xb4
Mar 22 08:05:04 /netbsd: [ 465954.2272048] tf ffffc0009a497ed0 el0_trap() at ffffc000000aaff0 netbsd:el1_trap_exit+0x68
Mar 22 08:05:04 /netbsd: [ 465954.2272048] cpu0: End traceback...
Mar 22 08:05:04 /netbsd: [ 465954.2272048] rebooting...
Mar 22 08:05:04 /netbsd: [ 1.0000000] Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003,
[...]
Mar 22 08:05:04 /netbsd: [ 1.4858384] sdmmc0: direct I/O error 5, r=6 p=0xffffc000aa9c6e4c write
Mar 22 08:05:04 /netbsd: [ 1.5358413] sdmmc0: SD card status: 4-bit, C10, U3, V30, A2
[...]
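The trapframe values above can be cross-checked by hand. The following Python sketch is not part of the original report; it decodes the esr value and recomputes the address touched by the faulting instruction (ldrb w3, [x22,#90]). The bit positions follow the Arm architecture definition of ESR_EL1 for data aborts. The function and table names are hypothetical, and only the fault codes relevant here are listed.

```python
# Sketch: decode the AArch64 ESR_EL1 value printed in the trapframe.
# Layout (Arm architecture): EC = bits [31:26], IL = bit 25, ISS = bits [24:0].
# For data aborts, WnR = ISS bit 6 and DFSC = ISS bits [5:0].

EC_NAMES = {
    0x24: "Data Abort from a lower Exception level",
    0x25: "Data Abort taken without a change in Exception level",
}

# Only the translation fault codes are listed; other DFSC values exist.
DFSC_NAMES = {
    0x04: "Translation fault, level 0",
    0x05: "Translation fault, level 1",
    0x06: "Translation fault, level 2",
    0x07: "Translation fault, level 3",
}


def decode_esr(esr):
    """Split an ESR_EL1 value into the fields relevant to a data abort."""
    ec = (esr >> 26) & 0x3F
    iss = esr & 0x1FFFFFF
    dfsc = iss & 0x3F
    return {
        "ec": ec,
        "ec_name": EC_NAMES.get(ec, "other"),
        "il": (esr >> 25) & 1,
        "wnr": (iss >> 6) & 1,  # 0 = read access, 1 = write access
        "dfsc": dfsc,
        "dfsc_name": DFSC_NAMES.get(dfsc, "other"),
    }


def effective_address(base, offset):
    """Address accessed by a load of the form [base, #offset]."""
    return (base + offset) & 0xFFFFFFFFFFFFFFFF


# Values copied from the trapframe in this report.
fields = decode_esr(0x96000004)
print(fields["ec_name"])
print(fields["dfsc_name"], "write" if fields["wnr"] else "read")
print(hex(effective_address(0x0, 90)))  # x22 = 0, immediate = 90
```

With the values from this report, the decoded fields agree with the panic message: a read access that took a level 0 translation fault, at an address equal to x22 (zero) plus the immediate offset 90, which matches far=000000000000005a. This is consistent with a NULL base pointer being dereferenced in dwc2_assign_and_init_hc(), although the log alone does not identify which pointer was NULL or why.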
>How-To-Repeat:
Unknown.
I will isolate this case and run it more frequently to see if/when the fault repeats.
At least 10 automated test runs finished without this problem occurring.
>Fix:
Unknown.
>Audit-Trail:
From: Jim Spath <jspath55@gmail.com>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: port-evbarm/58073: panic: Trap: Data Abort (EL1): Translation
Fault L0 on Pi3 during automated tests
Date: Fri, 29 Mar 2024 10:19:01 -0400
On Sun, Mar 24, 2024 at 7:55 AM <gnats-admin@netbsd.org> wrote:
> >Synopsis: panic: Trap: Data Abort (EL1): Translation Fault L0 on Pi3 during automated tests
> >Arrival-Date: Sun Mar 24 11:55:00 +0000 2024
I have run the last test case logged before this panic by itself
(>2400 runs), once a minute, with no ill effect.
I do not fully understand the stack trace details, but I think something
other than the fs/ffs/t_snapshot test triggered the panic.
I will let this run a bit longer, then go back to full automated test
framework runs.