NetBSD Problem Report #45708

From www@NetBSD.org  Tue Dec 13 09:07:12 2011
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by www.NetBSD.org (Postfix) with ESMTP id 39EDB63DB5E
	for <gnats-bugs@gnats.NetBSD.org>; Tue, 13 Dec 2011 09:07:12 +0000 (UTC)
Message-Id: <20111213090710.7A26663D993@www.NetBSD.org>
Date: Tue, 13 Dec 2011 09:07:10 +0000 (UTC)
From: bartosz.kuzma@gmail.com
Reply-To: bartosz.kuzma@gmail.com
To: gnats-bugs@NetBSD.org
Subject: Unable to read big files from large FFSv2 (12TB), ls out of swap
X-Send-Pr-Version: www-1.0

>Number:         45708
>Category:       kern
>Synopsis:       Unable to read big files from large FFSv2 (12TB), ls out of swap
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Dec 13 09:10:00 +0000 2011
>Last-Modified:  Sun Apr 22 18:20:02 +0000 2012
>Originator:     Bartosz Kuzma
>Release:        NetBSD 5.1_STABLE
>Organization:
>Environment:
NetBSD backup-host.marol.com.pl 5.1_STABLE NetBSD 5.1_STABLE (GENERIC) #0: Tue Dec  6 04:51:51 UTC 2011  builds@b6.netbsd.org:/home/builds/ab/netbsd-5/amd64/201112060320Z-obj/home/builds/ab/netbsd-5/src/sys/arch/amd64/compile/GENERIC amd64
>Description:
On a large (12TB) filesystem, I am unable to ls the directory while creating big files.

When I run:

# ls -1 /mnt

the kernel panics with the following message:

UVM: pid 977 (ls), uid 0 killed: out of swap
ubc_uiomove: error=12
dev = 0xa800, block = 1305922608, fs = /mnt
panic: blkfree: freeing free block
fatal breakpoint trap in supervisor mode
trap type 1 code 0 rip ffffffff8052ace5 cs 8 rflags 246 cr2  7f7f9f14b000 cpl 0 rsp ffff80005263b500
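
The dev = 0xa800 value in the panic message can be split into major and minor numbers. The sketch below assumes the bit layout used by the major() and minor() macros in NetBSD's <sys/types.h>; the helper names dev_major and dev_minor are made up for illustration. Major 168 should correspond to the dk(4) wedge driver, which would be consistent with /dev/dk0, but that mapping is stated from memory and is worth checking against sys/conf/majors.

```shell
#!/bin/sh
# Split a NetBSD dev_t into major and minor numbers.
# Assumption: layout mirrors major()/minor() in NetBSD's <sys/types.h>:
#   major = bits 8..19
#   minor = (bits 20..31 shifted down by 12) | bits 0..7

dev_major() {
	echo $(( ($1 & 0x000fff00) >> 8 ))
}

dev_minor() {
	echo $(( (($1 & 0xfff00000) >> 12) | ($1 & 0x000000ff) ))
}

# Value copied from the panic message above.
dev=0xa800
echo "major=$(dev_major "$dev") minor=$(dev_minor "$dev")"
```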


After a reboot, with dk0 mounted read-only:

backup-host# mount -r /dev/dk0 /mnt
backup-host# dd if=/mnt/zero-1tb.dat of=/dev/null

/mnt: bad dir ino 2 at offset 0: mangled entry
panic: bad dir
fatal breakpoint trap in supervisor mode
trap type 1 code 0 rip ffffffff8052ace5 cs 8 rflags 246 cr2  7f7ffdb16c60 cpl 0 rsp ffff8000520d5670
Stopped in pid 376.1 (dd) at    netbsd:breakpoint+0x5:  leave

db{3}> trace/t 0x178
trace: pid 376 lid 1 at 0xffff8000520d5670
breakpoint() at netbsd:breakpoint+0x5
panic() at netbsd:panic+0x24d
ufs_dirbad() at netbsd:ufs_dirbad+0x54
ufs_lookup() at netbsd:ufs_lookup+0x3ce
VOP_LOOKUP() at netbsd:VOP_LOOKUP+0x63
lookup() at netbsd:lookup+0x33a
namei() at netbsd:namei+0x170
vn_open() at netbsd:vn_open+0x95
sys_open() at netbsd:sys_open+0xeb
syscall() at netbsd:syscall+0xb6


backup-host# gpt show sd0
        start         size  index  contents
            0            1         PMBR
            1            1         Pri GPT header
            2           32         Pri GPT table
           34  27334279101      1  GPT part - NetBSD UFS/UFS2
  27334279135           32         Sec GPT table
  27334279167            1         Sec GPT header
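
As a sanity check, the gpt layout above can be cross-checked against the sector count that dmesg reports for sd0 (27334279168 sectors of 512 bytes). This is only an arithmetic sketch using figures copied from this report; the variable names are arbitrary.

```shell
#!/bin/sh
# Cross-check the gpt(8) layout against the sector count reported for sd0.
# All figures are copied from this report; sectors are 512 bytes.

part_start=34
part_size=27334279101
sec_table_start=27334279135
sec_header_start=27334279167
disk_sectors=27334279168

# The partition should end exactly where the secondary GPT table begins.
part_end=$(( part_start + part_size ))
echo "partition ends at sector $part_end (secondary table at $sec_table_start)"

# The secondary GPT header should occupy the last sector of the disk.
disk_end=$(( sec_header_start + 1 ))
echo "layout covers $disk_end of $disk_sectors sectors"

# Whole-disk size in GiB (integer division).
disk_gib=$(( disk_sectors * 512 / 1024 / 1024 / 1024 ))
echo "disk size: $disk_gib GiB"
```

If the layout is consistent, the first two lines each print a matching pair of numbers, which suggests the GPT itself is not the source of the corruption.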

Full dmesg:
Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
    2006, 2007, 2008, 2009, 2010
    The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
    The Regents of the University of California.  All rights reserved.

NetBSD 5.1_STABLE (GENERIC) #0: Tue Dec  6 04:51:51 UTC 2011
        builds@b6.netbsd.org:/home/builds/ab/netbsd-5/amd64/201112060320Z-obj/home/builds/ab/netbsd-5/src/sys/arch/amd64/compile/GENERIC
total memory = 4095 MB
avail memory = 3954 MB
timecounter: Timecounters tick every 10.000 msec
timecounter: Timecounter "i8254" frequency 1193182 Hz quality 100
SMBIOS rev. 2.3 @ 0xf9920 (87 entries)
Dell Computer Corporation PowerEdge 2850
mainbus0 (root)
cpu0 at mainbus0 apid 0: Intel 686-class, 3192MHz, id 0xf43
cpu1 at mainbus0 apid 6: Intel 686-class, 3192MHz, id 0xf43
cpu2 at mainbus0 apid 1: Intel 686-class, 3192MHz, id 0xf43
cpu3 at mainbus0 apid 7: Intel 686-class, 3192MHz, id 0xf43
ioapic0 at mainbus0 apid 8: pa 0xfec00000, version 20, 24 pins
ioapic1 at mainbus0 apid 9: pa 0xfec80000, version 20, 24 pins
ioapic2 at mainbus0 apid 10: pa 0xfec83000, version 20, 24 pins
acpi0 at mainbus0: Intel ACPICA 20080321
acpi0: X/RSDT: OemId <DELL  ,PE BKC  ,00000001>, AslId <MSFT,0100000a>
acpi0: SCI interrupting at int 9
acpi0: fixed-feature power button present
timecounter: Timecounter "ACPI-Fast" frequency 3579545 Hz quality 1000
ACPI-Fast 24-bit timer
pcppi1 at acpi0 (SPK, PNP0800): io 0x61
midi0 at pcppi1: PC speaker (CPU-intensive output)
sysbeep0 at pcppi1
attimer1 at acpi0 (TMR, PNP0100): io 0x40-0x5f irq 0
FDC (PNP0700) at acpi0 not configured
COMA (PNP0501) at acpi0 not configured
hpet0 at acpi0 (HPET, PNP0103-0): mem 0xfed00000-0xfed003ff
timecounter: Timecounter "hpet0" frequency 14318179 Hz quality 2000
attimer1: attached to pcppi1
ipmi0 at mainbus0
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
pchb0 at pci0 dev 0 function 0
pchb0: vendor 0x8086 product 0x3590 (rev. 0x09)
ppb0 at pci0 dev 2 function 0: vendor 0x8086 product 0x3595 (rev. 0x09)
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled, rd/line, wr/inv ok
ppb1 at pci1 dev 0 function 0: vendor 0x8086 product 0x0330 (rev. 0x06)
ppb1: disabling notification events
pci2 at ppb1 bus 2
pci2: i/o space, memory space enabled, rd/line, wr/inv ok
amr0 at pci2 dev 14 function 0: AMI RAID <PERC 4e/Di>
amr0: interrupting at ioapic0 pin 18
amr0: firmware 521X, BIOS H430, 256MB RAM
ld0 at amr0 unit 0: RAID 1, optimal
ld0: 34680 MB, 8807 cyl, 128 head, 63 sec, 512 bytes/sect x 71024640 sectors
ld1 at amr0 unit 1: RAID 1, optimal
ld1: 69360 MB, 8842 cyl, 255 head, 63 sec, 512 bytes/sect x 142049280 sectors
ppb2 at pci1 dev 0 function 2: vendor 0x8086 product 0x0332 (rev. 0x06)
ppb2: disabling notification events
pci3 at ppb2 bus 3
pci3: i/o space, memory space enabled, rd/line, wr/inv ok
ppb3 at pci0 dev 4 function 0: vendor 0x8086 product 0x3597 (rev. 0x09)
pci4 at ppb3 bus 4
pci4: i/o space, memory space enabled, rd/line, wr/inv ok
ppb4 at pci0 dev 5 function 0: vendor 0x8086 product 0x3598 (rev. 0x09)
pci5 at ppb4 bus 5
pci5: i/o space, memory space enabled, rd/line, wr/inv ok
ppb5 at pci5 dev 0 function 0: vendor 0x8086 product 0x0329 (rev. 0x09)
ppb5: disabling notification events
pci6 at ppb5 bus 6
pci6: i/o space, memory space enabled, rd/line, wr/inv ok
wm0 at pci6 dev 7 function 0: Intel i82541GI 1000BASE-T Ethernet, rev. 5
wm0: interrupting at ioapic2 pin 0
wm0: 32-bit 66MHz PCI bus
wm0: 65536 word (16 address bits) SPI EEPROM
wm0: Ethernet address 00:13:72:4d:aa:00
igphy0 at wm0 phy 1: Intel IGP01E1000 Gigabit PHY, rev. 0
igphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
ppb6 at pci5 dev 0 function 2: vendor 0x8086 product 0x032a (rev. 0x09)
ppb6: disabling notification events
pci7 at ppb6 bus 7
pci7: i/o space, memory space enabled, rd/line, wr/inv ok
wm1 at pci7 dev 8 function 0: Intel i82541GI 1000BASE-T Ethernet, rev. 5
wm1: interrupting at ioapic2 pin 1
wm1: 32-bit 66MHz PCI bus
wm1: 65536 word (16 address bits) SPI EEPROM
wm1: Ethernet address 00:13:72:4d:aa:01
igphy1 at wm1 phy 1: Intel IGP01E1000 Gigabit PHY, rev. 0
igphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
ppb7 at pci0 dev 6 function 0: vendor 0x8086 product 0x3599 (rev. 0x09)
pci8 at ppb7 bus 8
pci8: i/o space, memory space enabled, rd/line, wr/inv ok
mfi0 at pci8 dev 0 function 0: Dell PERC 6/e
mfi0: interrupting at ioapic0 pin 16
mfi0: logical drives 1, version 6.3.0-0001, 512MB RAM
scsibus0 at mfi0: 64 targets, 8 luns per target
uhci0 at pci0 dev 29 function 0: vendor 0x8086 product 0x24d2 (rev. 0x02)
uhci0: interrupting at ioapic0 pin 16
usb0 at uhci0: USB revision 1.0
uhci1 at pci0 dev 29 function 1: vendor 0x8086 product 0x24d4 (rev. 0x02)
uhci1: interrupting at ioapic0 pin 19
usb1 at uhci1: USB revision 1.0
uhci2 at pci0 dev 29 function 2: vendor 0x8086 product 0x24d7 (rev. 0x02)
uhci2: interrupting at ioapic0 pin 18
usb2 at uhci2: USB revision 1.0
ehci0 at pci0 dev 29 function 7: vendor 0x8086 product 0x24dd (rev. 0x02)
ehci0: interrupting at ioapic0 pin 23
ehci0: EHCI version 1.0
ehci0: companion controllers, 2 ports each: uhci0 uhci1 uhci2
usb3 at ehci0: USB revision 2.0
ppb8 at pci0 dev 30 function 0: vendor 0x8086 product 0x244e (rev. 0xc2)
pci9 at ppb8 bus 9
pci9: i/o space, memory space enabled
vga0 at pci9 dev 13 function 0: vendor 0x1002 product 0x5159 (rev. 0x00)
wsdisplay0 at vga0 kbdmux 1
wsmux1: connecting to wsdisplay0
drm at vga0 not configured
ichlpcib0 at pci0 dev 31 function 0
ichlpcib0: vendor 0x8086 product 0x24d0 (rev. 0x02)
timecounter: Timecounter "ichlpcib0" frequency 3579545 Hz quality 1000
ichlpcib0: 24-bit timer
ichlpcib0: TCO (watchdog) timer configured.
piixide0 at pci0 dev 31 function 1
piixide0: Intel 82801EB IDE Controller (ICH5) (rev. 0x02)
piixide0: bus-master DMA support present
piixide0: primary channel configured to compatibility mode
piixide0: primary channel interrupting at ioapic0 pin 14
atabus0 at piixide0 channel 0
piixide0: secondary channel configured to compatibility mode
piixide0: secondary channel interrupting at ioapic0 pin 15
atabus1 at piixide0 channel 1
isa0 at ichlpcib0
com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
com0: console
pckbc0 at isa0 port 0x60-0x64
fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
timecounter: Timecounter "clockinterrupt" frequency 100 Hz quality 0
timecounter: Timecounter "TSC" frequency 3192302240 Hz quality 3000
scsibus0: waiting 2 seconds for devices to settle...
fd0 at fdc0 drive 0: 1.44MB, 80 cyl, 2 head, 18 sec
uhub0 at usb0: vendor 0x8086 UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhub1 at usb3: vendor 0x8086 EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub1: 6 ports with 6 removable, self powered
uhub2 at usb1: vendor 0x8086 UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
atapibus0 at atabus0: 2 targets
uhub3 at usb2: vendor 0x8086 UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub3: 2 ports with 2 removable, self powered
cd0 at atapibus0 drive 0: <TEAC CD-ROM CD-224E-N, , 3.AB> cdrom removable
cd0: 32-bit data port
cd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 2 (Ultra/33)
cd0(piixide0:0:0): using PIO mode 4, Ultra-DMA mode 2 (Ultra/33) (using DMA)
uhub4 at uhub1 port 3: vendor 0x413c product 0xa001, class 9/0, rev 2.00/0.00, addr 2
uhub4: multiple transaction translators
uhub4: 2 ports with 2 removable, self powered
sd0 at scsibus0 target 0 lun 0: <DELL, PERC 6/E Adapter, 1.22> disk fixed
sd0: fabricating a geometry
sd0: 13034 GB, 13346816 cyl, 64 head, 32 sec, 512 bytes/sect x 27334279168 sectors
sd0: fabricating a geometry
sd0: mbr partition exceeds disk size
sd0: GPT GUID: 5ae30ad8-1e74-11e1-b6ea-0013724daa00
dk0 at sd0: backup
dk0: 27334279101 blocks at 34, type: ffs
uhidev0 at uhub4 port 2 configuration 1 interface 0
uhidev0: Logitech USB Keyboard, rev 1.10/64.00, addr 3, iclass 3/1
ukbd0 at uhidev0
wskbd0 at ukbd0 mux 1
wskbd0: connecting to wsdisplay0
uhidev1 at uhub4 port 2 configuration 1 interface 1
uhidev1: Logitech USB Keyboard, rev 1.10/64.00, addr 3, iclass 3/0
uhidev1: 3 report ids
uhid0 at uhidev1 reportid 1: input=1, output=0, feature=0
uhid1 at uhidev1 reportid 2: input=1, output=0, feature=0
uhid2 at uhidev1 reportid 3: input=3, output=0, feature=0
ipmi0: version 1.5 interface KCS iobase 0xca8/8 spacing 4
Kernelized RAIDframe activated
pad0: outputs: 44100Hz, 16-bit, stereo
audio0 at pad0: half duplex, playback, capture
boot device: ld0
root on ld0a dumps on ld0b
/: replaying log to memory
root file system type: ffs
/: replaying log to disk
/var: replaying log to disk
/usr: replaying log to disk
wsdisplay0: screen 1 added (80x25, vt100 emulation)
wsdisplay0: screen 2 added (80x25, vt100 emulation)
wsdisplay0: screen 3 added (80x25, vt100 emulation)
wsdisplay0: screen 4 added (80x25, vt100 emulation)
/tmp: replaying log to disk
/home: replaying log to disk
mfi0: normal state on 'mfi0:0' (online)


>How-To-Repeat:
backup-host# newfs -O 2 dk0
/dev/rdk0: 13346816.0MB (27334279100 sectors) block size 16384, fragment size 2048
        using 72170 cylinder groups of 184.94MB, 11836 blks, 22976 inodes.
super-block backups (for fsck_ffs -b #) at:
160, 378912, 757664, 1136416, 1515168, 1893920, 2272672, 2651424, 3030176,
...............................................................................
backup-host# mount -o log /dev/dk0 /mnt
backup-host# dd if=/dev/zero of=/mnt/zero-1tb.dat bs=1024m count=1024
backup-host# dd if=/dev/zero of=/mnt/zero-2tb.dat bs=1024m count=2048

Log in on another terminal and run:

backup-host# ls -1 /mnt


>Fix:

>Audit-Trail:
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/45708: Unable to read big files from large FFSv2 (12TB), ls
 out of swap
Date: Wed, 14 Dec 2011 12:35:56 +0000

 On Tue, Dec 13, 2011 at 09:10:01AM +0000, bartosz.kuzma@gmail.com wrote:
  > On large filesystem (12TB) when I try to create big files I'm
  > unable to ls directory.
  > 
  > When I try to do:
  > 
  > # ls -1 /mnt
  > 
  > Kernel panic with the following message:
  > 
  > UVM: pid 977 (ls), uid 0 killed: out of swap
  > ubc_uiomove: error=12
  > dev = 0xa800, block = 1305922608, fs = /mnt

 That is weird...

  > panic: blkfree: freeing free block

 ...but this makes me think the real problem is that the filesystem is
 corrupted. Have you run fsck on it recently? Does this really happen
 on a freshly newfs'd volume as described?

 -- 
 David A. Holland
 dholland@netbsd.org

From: Bartosz Kuźma <bartosz.kuzma@gmail.com>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/45708: Unable to read big files from large FFSv2 (12TB), ls
 out of swap
Date: Wed, 14 Dec 2011 13:44:32 +0100

 On Wed, Dec 14, 2011 at 13:40, David Holland <dholland-bugs@netbsd.org> wrote:
 > The following reply was made to PR kern/45708; it has been noted by GNATS.
 >
 > From: David Holland <dholland-bugs@netbsd.org>
 > To: gnats-bugs@NetBSD.org
 > Cc:
 > Subject: Re: kern/45708: Unable to read big files from large FFSv2 (12TB), ls
 >  out of swap
 > Date: Wed, 14 Dec 2011 12:35:56 +0000
 >
 >  On Tue, Dec 13, 2011 at 09:10:01AM +0000, bartosz.kuzma@gmail.com wrote:
 >  > On large filesystem (12TB) when I try to create big files I'm
 >  > unable to ls directory.
 >  >
 >  > When I try to do:
 >  >
 >  > # ls -1 /mnt
 >  >
 >  > Kernel panic with the following message:
 >  >
 >  > UVM: pid 977 (ls), uid 0 killed: out of swap
 >  > ubc_uiomove: error=12
 >  > dev = 0xa800, block = 1305922608, fs = /mnt
 >
 >  That is weird...
 >
 >  > panic: blkfree: freeing free block
 >
 >  ...but this makes me think the real problem is that the filesystem is
 >  corrupted. Have you run fsck on it recently? Does this really happen
 >  on a freshly newfs'd volume as described?
 >
 >  --
 >  David A. Holland
 >  dholland@netbsd.org
 >

 Yes, it is easily reproducible on a freshly newfs'd volume.

 When I tested by creating several large files (about 256GB each),
 then ran sync and did an unclean reboot (e.g. power-off), the
 filesystem could not be mounted again: the mount hangs at "replaying
 log to disk". However, it can be mounted read-only; it simply prints
 "replaying log to memory" and works.

 If you need more information, or even access to this machine, just ask.

 -- 
 Regards, Bartosz Kuźma.

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/45708: Unable to read big files from large FFSv2 (12TB), ls
 out of swap
Date: Sun, 22 Apr 2012 17:52:43 +0000

 (not filed in gnats; this tends to happen if you reply to your own
 gnats mail)

    ------

 From: Bartosz Kuźma <bartosz.kuzma@gmail.com>
 To: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org,
 	bartosz.kuzma@gmail.com
 Subject: Re: kern/45708: Unable to read big files from large FFSv2 (12TB), ls
 	out of swap
 Date: Wed, 14 Dec 2011 14:35:02 +0100

 There is a simpler way to reproduce the error:

  # newfs -O 2 /dev/dk0
  # mount -o log /dev/dk0 /mnt

  And run the following script:

  #!/bin/sh

  for i in `jot 256 1 256`
  do
         echo mkdir /mnt/dir-${i}
         mkdir /mnt/dir-${i}

         for j in `jot 256 1 256`
         do
                 echo touch /mnt/dir-${i}/file-${j}
                 touch /mnt/dir-${i}/file-${j}
         done
  done


  Around the line "touch /mnt/dir-28/file-122" the kernel panics:

  dev = 0xa800, block = 625305256, fs = /mnt
  panic: blkfree: freeing free frag
  fatal breakpoint trap in supervisor mode
  trap type 1 code 0 rip ffffffff8052ace5 cs 8 rflags 246 cr2  0 cpl 0
  rsp ffff80005175f850
  Stopped in pid 0.58 (system) at netbsd:breakpoint+0x5:  leave
  db{1}> trace
  breakpoint() at netbsd:breakpoint+0x5
  panic() at netbsd:panic+0x24d
  ffs_blkfree() at netbsd:ffs_blkfree+0x6d7
  ffs_wapbl_sync_metadata() at netbsd:ffs_wapbl_sync_metadata+0x66
  wapbl_flush() at netbsd:wapbl_flush+0x7c
  ffs_sync() at netbsd:ffs_sync+0x36c
  VFS_SYNC() at netbsd:VFS_SYNC+0x33
  sync_fsync() at netbsd:sync_fsync+0x85
  VOP_FSYNC() at netbsd:VOP_FSYNC+0x71
  sched_sync() at netbsd:sched_sync+0x15d

  db{1}> ps
  PID    LID S CPU     FLAGS       STRUCT LWP *               NAME WAIT
  16157>   1 7   3         4   ffff8000520b5020              touch
  1049     1 3   3        84   ffff8000524b2000                 sh wait
  911      1 3   0        84   ffff800052682800                ksh ttyraw
  403      1 3   3        84   ffff8000524b23e0                ksh pause
  375      1 3   3        84   ffff8000524be7e0                 su wait
  300      1 3   3        84   ffff800052682be0                ksh pause
  405      1 3   0        84   ffff8000524be020               sshd select
  398      1 3   0        84   ffff80004ca9b000               sshd netio
  393      1 3   0        84   ffff80004ca9b3e0              login wait
  383      1 3   0        84   ffff8000524bebc0               cron nanoslp
  380      1 3   3        84   ffff8000524be400              inetd kqueue
  379      1 3   2        84   ffff8000524b27c0               qmgr kqueue
  388      1 3   0        84   ffff8000520e7800             pickup kqueue
  365      1 3   0        84   ffff8000520b57e0             master kqueue
  263      1 3   0        84   ffff8000520e7420               sshd select
  126      1 3   0        84   ffff8000520b5bc0            syslogd kqueue
  1        1 3   0        84   ffff80004ca8a420               init wait
  0       60 3   0       204   ffff8000520b5400            physiod physiod
               59 3   1       204   ffff80004ca9b7c0           aiodoned aiodoned
            >  58 7   1       204   ffff80004ca9bba0            ioflush
               57 3   1       204   ffff80004ca857c0           pgdaemon pgdaemon
               56 3   3       204   ffff80004ca84800          cryptoret crypto_wa

  db{1}> trace/t 0x3f1d
  trace: pid 16157 lid 1 at 0xffff8000520d2b50
  0:
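
The reproduction script earlier in this message depends on jot(1) and hard-codes /mnt and the 256 x 256 counts. A portable variant, sketched here with a hypothetical populate helper, takes the target directory and counts as parameters so that the loop can first be tried on a scratch directory. It only mirrors the mkdir/touch pattern; whether smaller counts still trigger the panic is not known from this report.

```shell
#!/bin/sh
# populate ROOT NDIRS NFILES: create NDIRS directories under ROOT, each
# holding NFILES empty files, using the same dir-N/file-M naming as the
# original script. The function name and parameters are illustrative.
populate() {
	root=$1 ndirs=$2 nfiles=$3
	i=1
	while [ "$i" -le "$ndirs" ]; do
		mkdir "$root/dir-$i" || return 1
		j=1
		while [ "$j" -le "$nfiles" ]; do
			touch "$root/dir-$i/file-$j" || return 1
			j=$((j + 1))
		done
		i=$((i + 1))
	done
}

# Defaults exercise a throwaway directory, not the affected filesystem.
target=${1:-$(mktemp -d)}
populate "$target" "${2:-4}" "${3:-8}"
echo "created $(find "$target" -type f | wc -l | tr -d ' ') files under $target"
```

Passing /mnt 256 256 as arguments would approximate the original run.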

 -- 
 Regards, Bartosz Kuźma.

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/45708: Unable to read big files from large FFSv2 (12TB), ls
 out of swap
Date: Sun, 22 Apr 2012 18:17:22 +0000

 (not filed in gnats)

    ------

 From: Bartosz Kuźma <bartosz.kuzma@gmail.com>
 To: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org,
 	bartosz.kuzma@gmail.com
 Subject: Re: kern/45708: Unable to read big files from large FFSv2 (12TB), ls
 	out of swap
 Date: Fri, 16 Dec 2011 10:27:08 +0100

 > dev = 0xa800, block = 625305256, fs = /mnt
 > panic: blkfree: freeing free frag
 > fatal breakpoint trap in supervisor mode
 > trap type 1 code 0 rip ffffffff8052ace5 cs 8 rflags 246 cr2  0 cpl 0
 > rsp ffff80005175f850
 > Stopped in pid 0.58 (system) at netbsd:breakpoint+0x5:  leave
 > db{1}> trace
 > breakpoint() at netbsd:breakpoint+0x5
 > panic() at netbsd:panic+0x24d
 > ffs_blkfree() at netbsd:ffs_blkfree+0x6d7
 > ffs_wapbl_sync_metadata() at netbsd:ffs_wapbl_sync_metadata+0x66
 > wapbl_flush() at netbsd:wapbl_flush+0x7c
 > ffs_sync() at netbsd:ffs_sync+0x36c
 > VFS_SYNC() at netbsd:VFS_SYNC+0x33
 > sync_fsync() at netbsd:sync_fsync+0x85
 > VOP_FSYNC() at netbsd:VOP_FSYNC+0x71
 > sched_sync() at netbsd:sched_sync+0x15d
 >

 It looks like this problem does not occur with block sizes larger than 16384.

 I've tested the following options:

 backup-host# newfs -O 2 -b 32768 dk0
 /dev/rdk0: 13346816.0MB (27334279096 sectors) block size 32768,
 fragment size 4096
         using 17981 cylinder groups of 742.28MB, 23753 blks, 46848 inodes.
 super-block backups (for fsck_ffs -b #) at:
 192, 1520384, 3040576, 4560768, 6080960, 7601152, 9121344, 10641536, 12161728,
 ...............................................................................


 and:

 backup-host# newfs -O 2 -b 65536 dk0
 /dev/rdk0: 13346816.0MB (27334279088 sectors) block size 65536,
 fragment size 8192
         using 4079 cylinder groups of 3272.12MB, 52354 blks, 103936 inodes.
 super-block backups (for fsck_ffs -b #) at:
 256, 6701568, 13402880, 20104192, 26805504, 33506816, 40208128, 46909440,
 ...............................................................................


 and it works correctly in all of the previously problematic cases.
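
To repeat this comparison systematically, the newfs and mount steps can be wrapped in a loop over block sizes. The sketch below is a dry run by default, because newfs destroys whatever is on the device; DEV, MNT, RUN and the step and sweep helpers are placeholders invented for illustration. It assumes the filesystem is mounted with -o log, as in the original How-To-Repeat steps.

```shell
#!/bin/sh
# Print (or, with RUN=1, execute) the newfs/mount steps for each block size.
# Assumptions: DEV and MNT are placeholders for the affected wedge and
# mount point. newfs is destructive, so the default only echoes commands.
DEV=${DEV:-/dev/dk0}
MNT=${MNT:-/mnt}
RUN=${RUN:-0}

# Echo a command, and run it only when RUN=1.
step() {
	echo "$*"
	if [ "$RUN" = 1 ]; then "$@"; fi
}

sweep() {
	for bsize in 16384 32768 65536; do
		step newfs -O 2 -b "$bsize" "$DEV" || return 1
		step mount -o log "$DEV" "$MNT" || return 1
		# Run the reproduction workload against $MNT at this point.
		step umount "$MNT" || return 1
	done
}

sweep
```

Per the results above, only the 16384 case would be expected to panic.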

 -- 
 Regards, Bartosz Kuźma.
