NetBSD Problem Report #45601

From martin@duskware.de  Fri Nov 11 15:36:03 2011
Return-Path: <martin@duskware.de>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by www.NetBSD.org (Postfix) with ESMTP id E90F963BEB2
	for <gnats-bugs@gnats.NetBSD.org>; Fri, 11 Nov 2011 15:36:02 +0000 (UTC)
Message-Id: <20111111153602.E90F963BEB2@www.NetBSD.org>
Date: Fri, 11 Nov 2011 15:36:02 +0000 (UTC)
From: martin@NetBSD.org
Reply-To: martin@NetBSD.org
To: gnats-bugs@gnats.NetBSD.org
Subject: shark panics during atf run
X-Send-Pr-Version: 3.95

>Number:         45601
>Category:       port-arm
>Synopsis:       shark panics during atf run
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    port-arm-maintainer
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Nov 11 15:40:00 +0000 2011
>Closed-Date:    Fri Nov 18 19:11:10 +0000 2011
>Last-Modified:  Fri Nov 18 19:11:10 +0000 2011
>Originator:     Martin Husemann
>Release:        NetBSD 5.99.56
>Organization:
The NetBSD Foundation, Inc.
>Environment:
System: NetBSD night-rest.duskware.de 5.99.56 NetBSD 5.99.56 (GENERIC) #2: Thu Nov 10 22:13:00 CET 2011 martin@night-porter.duskware.de:/usr/src/sys/arch/shark/compile/GENERIC shark
Architecture: arm
Machine: shark
>Description:

I see a reproducable kernel panic during atf runs on shark:

fs/psshfs/t_psshfs (94/482): 3 test cases
    inode_nos: Failed: Test case timed out after 300 seconds
    ls: panic: kernel diagnostic assertion "curcpu()->ci_biglock_count == 0" failed: file "../../../../sys/userret.h", line 86 
Mcpu0: Begin traceback...
Mnetbsd:vpanic+0xc
      scp=0xf0191260 rlv=0xf026f5d8 (netbsd:kern_assert+0x40)
      rsp=0xf2c7ce84 rfp=0xf2c7ce98
      r7=0x00000016
netbsd:kern_assert+0x10
      scp=0xf026f5a8 rlv=0xf0001658 (netbsd:userret+0xbc)
      rsp=0xf2c7ceac rfp=0xf2c7cec4
netbsd:userret+0xc
      scp=0xf00015a8 rlv=0xf01a63a8 (netbsd:syscall+0xe4)
      rsp=0xf2c7cec8 rfp=0xf2c7cf54
      r5=0xf2c1b540 r4=0xf2c7cfb4
netbsd:syscall+0xc
      scp=0xf01a62d0 rlv=0xf01a6544 (netbsd:swi_handler+0xb4)
      rsp=0xf2c7cf58 rfp=0xf2c7cfb0
      r10=0x00000000 r9=0x00000000
      r8=0x00000000 r7=0x00000001 r6=0xf02bd8d4 r5=0xf2c7cfb4
      r4=0xf2c1b540
netbsd:swi_handler+0xc
      scp=0xf01a649c rlv=0xf003cf10 (netbsd:swi_entry+0x5c)
      rsp=0xf2c7cfb4 rfp=0xefffece0
      r8=0xefffe808 r7=0x00011c10
      r6=0x00000000 r5=0xefffe808 r4=0xf02bd8d4
cpu0: End traceback...


>How-To-Repeat:

cd /usr/tests
atf-run | atf-report

>Fix:
n/a

>Release-Note:

>Audit-Trail:
From: Martin Husemann <martin@duskware.de>
To: netbsd-bugs@NetBSD.org
Cc: 
Subject: Re: port-arm/45601: shark panics during atf run
Date: Sat, 12 Nov 2011 22:57:03 +0100

 This is strictly reproducable for me, a quick way is:

  cd /usr/tests/fs/psshfs
  atf-run | atf-report

 But: a simple "atf-run" in that directory will not trigger tha panic,
 neither will the above run under ktruss -i.

 Martin

From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-arm/45601: shark panics during atf run
Date: Sun, 13 Nov 2011 14:21:33 +0100

 I could reproduce it from a ssh session running the ktruss -i, but I don't know
 how much (if any) is missing at the end of the log.

 Martin

 --8<--
 Full log is pretty big, can be provided if needed. Last part is:

   8123      1 sh       open("/usr/tests/fs/psshfs/t_psshfs", 0, 0x30d1c) = 3
   8123      1 sh       lseek(0x3, 0, 0, 0, 0)      = 0
   8123      1 sh       read(0x3, 0xefffe79c, 0x4)  = 4
        "#! /"
   8123      1 sh       lseek(0x3, 0, 0, 0, 0)      = 0
   8123      1 sh       fcntl(0x3, 0, 0xa)          = 10
   8123      1 sh       close(0x3)                  = 0
   8123      1 sh       fcntl(0xa, 0x2, 0x1)        = 0
   8123      1 sh       read(0xa, 0x2020f400, 0x3f8) = 1016
        "#! /usr/bin/atf-sh\n# $NetBSD: t_psshfs.sh,v 1.6 2011/08/12 04:14:00 "
   8123      1 sh       read(0xa, 0x2020f400, 0x3f8) = 1016
        "NCIDENTAL, SPECIAL, EXEMPLARY, OR\n# CONSEQUENTIAL DAMAGES (INCLUDING"
   8123      1 sh       read(0xa, 0x2020f400, 0x3f8) = 1016
        "error trying to access /dev/puffs"\n\t\t\t;;\n\t\tyes)\n\t\t\t;;\n\t"
   8123      1 sh       read(0xa, 0x2020f400, 0x3f8) = 1016
        "sh_host_key .\n\tatf_check -s eq:0 -o empty -e empty \\\n\t    cp $(a"
   8123      1 sh       read(0xa, 0x2020f400, 0x3f8) = 1016
        "rver spawned by start_ssh and prints diagnosis data.\n#\nstop_ssh() {"
   8123      1 sh       read(0xa, 0x2020f400, 0x3f8) = 1016
        "s.sh <<EOF\n#! $(atf-config -t atf_shell)\n#\n# Compares the inodes o"
   8123      1 sh       read(0xa, 0x2020f400, 0x3f8) = 1016
        "-e empty \\\n\t    -x 'cd mnt/dir && ls .. >/dev/null && /bin/pwd'\n}"
   8123      1 sh       read(0xa, 0x2020f400, 0x3f8) = 639
        "nt/loser\n\t# Oops -- this doesn't work.  We need to stop the child o"
   8123      1 sh       read(0xa, 0x2020f400, 0x3f8) = 0
        ""
   8123      1 sh       close(0xa)                  = 0
   8123      1 sh       pipe()                      = 3, 5
   8123      1 sh       fork()                      = 7560
   8123      1 sh       close(0x5)                  = 0
   7560      1 sh       fork                        = 0
   7560      1 sh       emul(netbsd)
   7560      1 sh       close(0x3)                  = 0
   7560      1 sh       close(0x1)                  = 0
   7560      1 sh       dup2(0x5, 0x1)              = 1
   7560      1 sh       close(0x5)                  = 0
   7560      1 sh       __stat50("/sbin/expr", 0xefffe160) Err#2 ENOENT
   7560      1 sh       __stat50("/usr/sbin/expr", 0xefffe160) Err#2 ENOENT
   7560      1 sh       __stat50("/bin/expr", 0xefffe160) = 0
   7560      1 expr     execve("/bin/expr", 0x20206394, 0x20206404) JUSTRETURN
   7560      1 expr     emul(netbsd)
   7560      1 expr     mmap(0, 0x8000, 0x3, 0x1002, 0xffffffff, 0, 0, 0) = 0x20032000
   7560      1 expr     open("/etc/ld.so.conf", 0, 0x200315e0) Err#2 ENOENT
   7560      1 expr     open("/lib/libc.so.12", 0, 0) = 3
   7560      1 expr     __fstat50(0x3, 0xefffe458)  = 0
   7560      1 expr     mmap(0, 0x1000, 0x1, 0x1, 0x3, 0, 0, 0) = 0x20014000
   7560      1 expr     munmap(0x20014000, 0x1000)  = 0
   7560      1 expr     mmap(0, 0x11c000, 0x5, 0xf000002, 0x3, 0, 0, 0) = 0x20040000
   7560      1 expr     mmap(0x20145000, 0x8000, 0x3, 0x12, 0x3, 0, 0xfd000, 0) = 0x20145000
   7560      1 expr     mmap(0x2014d000, 0xf000, 0x3, 0x1012, 0xffffffff, 0, 0, 0) = 0x2014d000
   7560      1 expr     mprotect(0x2013e000, 0x7000, 0) = 0
   7560      1 expr     close(0x3)                  = 0
   7560      1 expr     _lwp_setprivate(0x20036000) Err#78 ENOSYS
   7560      1 expr     _lwp_self()                 = 1
   7560      1 expr     __sigprocmask14(0x1, 0xefffebbc, 0xefffec44) = 0
   7560      1 expr     __sigprocmask14(0x3, 0xefffec44, 0) = 0
   7560      1 expr     __sysctl(0xefffeb64, 0x2, 0x2015a1e8, 0xefffeb6c, 0, 0) = 0
   7560      1 expr     __sysctl(0xefffeb5c, 0x2, 0xefffeb68, 0xefffeb64, 0, 0) = 0
   7560      1 expr     rasctl(0x200cb9f0, 0xc, 0)  = 0
   7560      1 expr     _lwp_self()                 = 1
   7560      1 expr     __sigprocmask14(0x1, 0xefffeb80, 0xefffec44) = 0
   7560      1 expr     __sigprocmask14(0x3, 0xefffec44, 0) = 0
   7560      1 expr     issetugid()                 = 0
   7560      1 expr     __sysctl(0xefffe5fc, 0x2, 0x20154870, 0xefffe604, 0, 0) = 0
   7560      1 expr     __sysctl(0xefffe524, 0x2, 0x2015b0b0, 0xefffe52c, 0, 0) = 0
   7560      1 expr     readlink("/etc/malloc.conf", 0xefffe608, 0x400) Err#2 ENOENT
   7560      1 expr     break(0x1359c)              = 0
   7560      1 expr     break(0x1359c)              = 0
   7560      1 expr     break(0x100000)             = 0
   7560      1 expr     mmap(0, 0x100000, 0x3, 0x14001002, 0xffffffff, 0, 0, 0) = 0x20200000
   7560      1 expr     __fstat50(0x1, 0xefffe97c)  = 0
   7560      1 expr     _lwp_self()                 = 1
   7560      1 expr     __sigprocmask14(0x1, 0xefffeb98, 0xefffebd4) = 0
   7560      1 expr     __sigprocmask14(0x3, 0xefffebd4, 0) = 0
   7560      1 expr     _lwp_self()                 = 1
   7560      1 expr     __sigprocmask14(0x1, 0xefffeb50, 0xefffebd4) = 0
   7560      1 expr     __sigprocmask14(0x3, 0xefffebd4, 0) = 0
   7560      1 expr     write(0x1, 0x20214000, 0x2) = 2
        "2\n"
   8123      1 sh       read                        = 2
        "2\n"
   7560      1 expr     exit(0)
   8123      1 sh       read(0x3, 0xefffe448, 0x80) = 0
        ""
   8123      1 sh       close(0x3)                  = 0
   8123      1 sh       getpgrp()                   = 8123
   8123      1 sh       __wait450(0xffffffff, 0xefffe3d8, 0, 0) = 7560
   8123      1 sh       __stat50("/usr/tests/fs/psshfs/t_psshfs", 0xefffe4c4) = 0
   8123      1 sh       open(".", 0x400004, 0x30644) = 3
   8123      1 sh       __fstat50(0x3, 0xefffd984)  = 0
   8123      1 sh       fstatvfs1(0x3, 0xefffda00, 0x2) = 0
   8123      1 sh       lseek(0x3, 0, 0, 0, 0x1)    = 0
   8123      1 sh       __getdents30(0x3, 0x20247000, 0x1000) = 408
        "\^Cs\M-"\0\0\0\0\0\^P\0\^A\0\^D.\0\0\0s\M-"\0\0\0\0\0\^P\0\^B\0\^D.."
   8123      1 sh       lseek(0x3, 0, 0, 0, 0x1)    = 512
   8123      1 sh       __getdents30(0x3, 0x20247000, 0x1000) = 0
        ""
   8123      1 sh       close(0x3)                  = 0
   8123      1 sh       pipe()                      = 3, 5
   8123      1 sh       fork()                      = 2119
   8123      1 sh       close(0x5)                  = 0
   2119      1 sh       fork                        = 0
   2119      1 sh       emul(netbsd)
   2119      1 sh       close(0x3)                  = 0
   7667      1 atf-run  poll                        = 0
   2119      1 sh       close(0x1)                  = 0
   2119      1 sh       dup2(0x5, 0x1)              = 1
   2119      1 sh       close(0x5)                  = 0
   2119      1 sh       exit(0x1)
   8123      1 sh       read                        = 0
        ""
   8123      1 sh       close(0x3)                  = 0
   8123      1 sh       getpgrp()                   = 8123
   8123      1 sh       __wait450(0xffffffff, 0xefffe458, 0, 0) = 2119
   8123      1 sh       pipe()                      = 3, 5
   8123      1 sh       fork()                      = 8062
   8123      1 sh       close(0x5)                  = 0
   8062      1 sh       fork                        = 0
   8062      1 sh       emul(netbsd)
   8062      1 sh       close(0x3)                  = 0
   8062      1 sh       close(0x1)                  = 0


From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-arm/45601: shark panics during atf run
Date: Fri, 18 Nov 2011 20:07:00 +0100

 Ok, I can't seem to be able to reproduce it anymore.

 For the record, in case it is a race and we are just lucky now (and it would
 come back sometime):

  - the syscall code looked completely garbage (os_mask != SWI_OS_NETBSD, pretty
    random high code > NSYSENT)

 so it was probably just a broken object file or ld.elf_so breakage.

 I'll close this PR for now, let's hope it does not come back.

 Martin

State-Changed-From-To: open->closed
State-Changed-By: martin@NetBSD.org
State-Changed-When: Fri, 18 Nov 2011 19:11:10 +0000
State-Changed-Why:
can't reproduce it any more


>Unformatted:

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2007 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.