NetBSD Problem Report #46325

From www@NetBSD.org  Thu Apr 12 09:35:50 2012
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
	by www.NetBSD.org (Postfix) with ESMTP id 05A0363BBEC
	for <gnats-bugs@gnats.NetBSD.org>; Thu, 12 Apr 2012 09:35:50 +0000 (UTC)
Message-Id: <20120412093549.386CC63B9FE@www.NetBSD.org>
Date: Thu, 12 Apr 2012 09:35:49 +0000 (UTC)
From: Christoph_Egger@gmx.de
Reply-To: Christoph_Egger@gmx.de
To: gnats-bugs@NetBSD.org
Subject: wapbl + disk io = temporary system freeze
X-Send-Pr-Version: www-1.0

>Number:         46325
>Category:       kern
>Synopsis:       wapbl + disk io = temporary system freeze
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    chs
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Apr 12 09:40:00 +0000 2012
>Closed-Date:    Mon Oct 07 06:16:27 +0000 2013
>Last-Modified:  Mon Oct 07 06:16:27 +0000 2013
>Originator:     Christoph Egger
>Release:        netbsd-6 and -current
>Organization:
>Environment:
netbsd/amd64
>Description:

Use NetBSD/amd64 as a desktop system with firefox, thunderbird, openoffice, etc.
Use wapbl.

On heavy disk io the ioflush kernel thread eats up to 60% cpu load
and is in the syncer state. During that time all interrupts are blocked:

- The mouse doesn't move and then jumps.
- Keyboard seems to not work and then suddenly all keys typed appear
  at once.

Here are two outputs of 'top -u' with threads shown during
heavy disk io:

load averages:  0.95,  0.37,  0.16;               up 1+22:59:30                   11:31:05
151 threads: 18 idle, 127 sleeping, 2 zombie, 4 on CPU
CPU0 states:  0.0% user,  0.0% nice,  0.0% system, 92.0% interrupt,  8.0% idle
CPU1 states:  0.0% user,  0.0% nice,  4.8% system,  0.0% interrupt, 95.2% idle
CPU2 states:  0.0% user,  0.0% nice,  2.0% system,  0.0% interrupt, 98.0% idle
CPU3 states:  1.2% user,  0.0% nice, 81.2% system,  0.0% interrupt, 17.6% idle
Memory: 3784M Act, 38M Inact, 20K Wired, 188M Exec, 2743M File, 3219M Free
Swap: 8171M Total, 8171M Free

  PID   LID     UID  PRI STATE      TIME   WCPU    CPU NAME      COMMAND
21107     1       0   39 CPU/3      0:54 67.56% 65.53% -         cvs
    0    59       0  124 syncer/3 158:48 48.34% 48.34% ioflush   [system]
    0     4       0  221 IDLE/0     9:03 33.35% 33.35% softbio/0 [system]
[...]


load averages:  1.06,  0.55,  0.24;               up 1+23:00:46                   11:32:21
151 threads: 17 idle, 128 sleeping, 1 zombie, 5 on CPU
CPU0 states:  0.0% user,  0.0% nice,  0.0% system,  100% interrupt,  0.0% idle
CPU1 states:  0.0% user,  0.0% nice, 92.0% system,  0.0% interrupt,  8.0% idle
CPU2 states:  0.0% user,  0.0% nice,  1.2% system,  0.0% interrupt, 98.8% idle
CPU3 states:  0.2% user,  0.0% nice,  0.6% system,  0.0% interrupt, 99.2% idle
Memory: 3819M Act, 38M Inact, 20K Wired, 188M Exec, 2770M File, 3184M Free
Swap: 8171M Total, 8171M Free

  PID   LID     UID  PRI STATE      TIME   WCPU    CPU NAME      COMMAND
    0    59       0  124 CPU/1    159:27 61.57% 61.57% ioflush   [system]
21107     1       0  112 tstile/0   1:22 28.54% 28.52% -         cvs
[...]

>How-To-Repeat:

Use wapbl on a desktop system and create heavy disk io.
I create the disk io by updating the pkgsrc tree from cvs.

>Fix:

>Release-Note:

>Audit-Trail:
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/46325: wapbl + disk io = temporary system freeze
Date: Thu, 12 Apr 2012 11:47:51 +0200

 Stupid questions, just to make sure we are not barking up the wrong tree:

  - your underlying disk io speed is OK? (dmesg excerpts or bonnie++ runs
    would give hints)
  - your time keeping works? vmstat -i shows "cpuN time" with aproximately
    HZ frequency?

 Martin

From: Christoph Egger <Christoph_Egger@gmx.de>
To: gnats-bugs@NetBSD.org
Cc: Martin Husemann <martin@duskware.de>, kern-bug-people@netbsd.org, 
 gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/46325: wapbl + disk io = temporary system freeze
Date: Thu, 12 Apr 2012 12:40:50 +0200

 On 04/12/12 11:50, Martin Husemann wrote:
 >    - your underlying disk io speed is OK? (dmesg excerpts or bonnie++ runs
 >      would give hints)

 wd0 at atabus1 drive 0
 wd0: <ST3500413AS>
 wd0: drive supports 16-sector PIO transfers, LBA48 addressing
 wd0: 465 GB, 969021 cyl, 16 head, 63 sec, 512 bytes/sect x 976773168 sectors
 wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133)

 that's ok.

 pciide0 at pci0 dev 17 function 0: AMD Hudson SATA Controller (rev. 0x40)
 pciide0: bus-master DMA support present, but unused (no driver support)
 pciide0: primary channel configured to native-PCI mode
 pciide0: using ioapic0 pin 19 for native-PCI interrupt
 atabus0 at pciide0 channel 0
 pciide0: secondary channel configured to native-PCI mode
 atabus1 at pciide0 channel 1

 eek, DMA support is missing in pciide0.


 >    - your time keeping works? vmstat -i shows "cpuN time" with aproximately
 >      HZ frequency?

 interrupt                                     total     rate
 cpu0 timer                                 17264946       99

 Yes, that's fine.

 Christoph

From: Christoph Egger <Christoph_Egger@gmx.de>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, 
 netbsd-bugs@netbsd.org
Subject: Re: kern/46325: wapbl + disk io = temporary system freeze
Date: Thu, 12 Apr 2012 13:01:11 +0200

 Problem seems to be solved: I changed the BIOS SATA mode from IDE to AHCI.


 load averages:  0.95,  0.47,  0.19;               up 0+00:04:52 
    12:58:40
 135 threads: 18 idle, 111 sleeping, 2 zombie, 4 on CPU
 CPU0 states:  0.0% user,  0.0% nice,  9.4% system,  5.6% interrupt, 85.0% idle
 CPU1 states:  1.6% user,  0.0% nice,  4.4% system,  0.0% interrupt, 94.0% idle
 CPU2 states:  0.2% user,  0.0% nice,  0.0% system,  0.0% interrupt, 99.8% idle
 CPU3 states:  6.6% user,  0.0% nice, 68.1% system,  0.0% interrupt, 25.3% idle
 Memory: 783M Act, 16K Wired, 88M Exec, 177M File, 6852M Free
 Swap: 8171M Total, 8171M Free

    PID   LID     UID  PRI STATE      TIME   WCPU    CPU NAME      COMMAND
    483     1       0   26 CPU/3      0:36 74.88% 69.34% -         cvs
      0    63       0  124 syncer/1   0:02  6.15%  6.15% ioflush   [system]


 I am still puzzled why so much 'system' time is spent.

 Christoph

From: Christoph Egger <Christoph_Egger@gmx.de>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, 
 netbsd-bugs@netbsd.org
Subject: Re: kern/46325: wapbl + disk io = temporary system freeze
Date: Thu, 12 Apr 2012 14:10:48 +0200

 I forgot the dmesg snippets:

 ahcisata0 at pci0 dev 17 function 0: AMD Hudson AHCI SATA Controller 
 (rev. 0x40)
 ahcisata0: interrupting at ioapic0 pin 19
 ahcisata0: 64-bit DMA
 ahcisata0: AHCI revision 1.30, 6 ports, 32 slots, CAP 0xf732ff05<PSC,SSC,PMD,SPM,ISS=0x3=Gen3,SCLO,SAL,SALP,SMPS,SSNTF,SNCQ,S64A>
 atabus0 at ahcisata0 channel 0
 atabus1 at ahcisata0 channel 1
 atabus2 at ahcisata0 channel 2
 atabus3 at ahcisata0 channel 3
 atabus4 at ahcisata0 channel 4
 atabus5 at ahcisata0 channel 5

 ahcisata0 port 1: device present, speed: 6.0Gb/s

 wd0 at atabus1 drive 0
 wd0: <ST3500413AS>
 wd0: drive supports 16-sector PIO transfers, LBA48 addressing
 wd0: 465 GB, 969021 cyl, 16 head, 63 sec, 512 bytes/sect x 976773168 sectors
 wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133)
 wd0(ahcisata0:1:0): using PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133) (using DMA)


 Christoph

From: Christoph Egger <Christoph_Egger@gmx.de>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, 
 netbsd-bugs@netbsd.org
Subject: Re: kern/46325: wapbl + disk io = temporary system freeze
Date: Thu, 12 Apr 2012 14:28:44 +0200

 The problem still persists but is a lot harder to reproduce now.
 I can no longer reproduce with the keyboard.

 Moving the mouse during disk io the mouse pointer suddenly "hangs"
 for a very short moment. It happens when top shows
 the ioflush kernel thread causing cpu load and more than 5% in interrupts.

 I think this might be even more noticable with applications that
 rely on having a responsive system like watching a movie.

 Christoph

From: "Chuck Silvers" <chs@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/46325 CVS commit: src/sys
Date: Sun, 29 Apr 2012 22:54:01 +0000

 Module Name:	src
 Committed By:	chs
 Date:		Sun Apr 29 22:54:01 UTC 2012

 Modified Files:
 	src/sys/fs/msdosfs: msdosfs_vnops.c
 	src/sys/fs/ntfs: ntfs_vnops.c
 	src/sys/fs/sysvbfs: sysvbfs_vnops.c
 	src/sys/fs/udf: udf_subr.c udf_vnops.c
 	src/sys/fs/v7fs: v7fs_vnops.c
 	src/sys/kern: vfs_subr.c
 	src/sys/miscfs/genfs: genfs_io.c
 	src/sys/miscfs/specfs: spec_vnops.c
 	src/sys/ufs/chfs: chfs_vnops.c
 	src/sys/ufs/ext2fs: ext2fs_readwrite.c ext2fs_vnops.c
 	src/sys/ufs/ffs: ffs_vfsops.c ffs_vnops.c
 	src/sys/ufs/ufs: ufs_readwrite.c
 	src/sys/uvm: uvm_pager.h

 Log Message:
 change vflushbuf() to take the full FSYNC_* flags.
 translate FSYNC_LAZY into PGO_LAZY for VOP_PUTPAGES() so that
 genfs_do_io() can set the appropriate io priority for the I/O.
 this is the first part of addressing PR 46325.


 To generate a diff of this commit:
 cvs rdiff -u -r1.82 -r1.83 src/sys/fs/msdosfs/msdosfs_vnops.c
 cvs rdiff -u -r1.50 -r1.51 src/sys/fs/ntfs/ntfs_vnops.c
 cvs rdiff -u -r1.43 -r1.44 src/sys/fs/sysvbfs/sysvbfs_vnops.c
 cvs rdiff -u -r1.118 -r1.119 src/sys/fs/udf/udf_subr.c
 cvs rdiff -u -r1.70 -r1.71 src/sys/fs/udf/udf_vnops.c
 cvs rdiff -u -r1.10 -r1.11 src/sys/fs/v7fs/v7fs_vnops.c
 cvs rdiff -u -r1.433 -r1.434 src/sys/kern/vfs_subr.c
 cvs rdiff -u -r1.53 -r1.54 src/sys/miscfs/genfs/genfs_io.c
 cvs rdiff -u -r1.134 -r1.135 src/sys/miscfs/specfs/spec_vnops.c
 cvs rdiff -u -r1.6 -r1.7 src/sys/ufs/chfs/chfs_vnops.c
 cvs rdiff -u -r1.60 -r1.61 src/sys/ufs/ext2fs/ext2fs_readwrite.c
 cvs rdiff -u -r1.102 -r1.103 src/sys/ufs/ext2fs/ext2fs_vnops.c
 cvs rdiff -u -r1.276 -r1.277 src/sys/ufs/ffs/ffs_vfsops.c
 cvs rdiff -u -r1.120 -r1.121 src/sys/ufs/ffs/ffs_vnops.c
 cvs rdiff -u -r1.103 -r1.104 src/sys/ufs/ufs/ufs_readwrite.c
 cvs rdiff -u -r1.42 -r1.43 src/sys/uvm/uvm_pager.h

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Chuck Silvers" <chs@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/46325 CVS commit: src/sys/kern
Date: Sun, 29 Apr 2012 22:55:12 +0000

 Module Name:	src
 Committed By:	chs
 Date:		Sun Apr 29 22:55:12 UTC 2012

 Modified Files:
 	src/sys/kern: vfs_wapbl.c

 Log Message:
 mark all wapbl I/O as BPRIO_TIMECRITICAL.
 this is the second part of addressing PR 46325.


 To generate a diff of this commit:
 cvs rdiff -u -r1.51 -r1.52 src/sys/kern/vfs_wapbl.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

Responsible-Changed-From-To: kern-bug-people->chs
Responsible-Changed-By: chs@NetBSD.org
Responsible-Changed-When: Mon, 30 Apr 2012 01:09:54 +0000
Responsible-Changed-Why:
I worked on this


From: "Jeff Rizzo" <riz@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/46325 CVS commit: [netbsd-6] src/sys
Date: Mon, 7 May 2012 03:01:15 +0000

 Module Name:	src
 Committed By:	riz
 Date:		Mon May  7 03:01:14 UTC 2012

 Modified Files:
 	src/sys/fs/msdosfs [netbsd-6]: msdosfs_vnops.c
 	src/sys/fs/ntfs [netbsd-6]: ntfs_vnops.c
 	src/sys/fs/sysvbfs [netbsd-6]: sysvbfs_vnops.c
 	src/sys/fs/udf [netbsd-6]: udf_subr.c udf_vnops.c
 	src/sys/fs/v7fs [netbsd-6]: v7fs_vnops.c
 	src/sys/kern [netbsd-6]: vfs_subr.c vfs_wapbl.c
 	src/sys/miscfs/genfs [netbsd-6]: genfs_io.c
 	src/sys/miscfs/specfs [netbsd-6]: spec_vnops.c
 	src/sys/ufs/chfs [netbsd-6]: chfs_vnops.c
 	src/sys/ufs/ext2fs [netbsd-6]: ext2fs_readwrite.c ext2fs_vnops.c
 	src/sys/ufs/ffs [netbsd-6]: ffs_vfsops.c ffs_vnops.c
 	src/sys/ufs/ufs [netbsd-6]: ufs_readwrite.c
 	src/sys/uvm [netbsd-6]: uvm_pager.h

 Log Message:
 Pull up following revision(s) (requested by chs in ticket #204):
 	sys/fs/sysvbfs/sysvbfs_vnops.c: revision 1.44
 	sys/ufs/ffs/ffs_vfsops.c: revision 1.277
 	sys/fs/v7fs/v7fs_vnops.c: revision 1.11
 	sys/ufs/chfs/chfs_vnops.c: revision 1.7
 	sys/ufs/ext2fs/ext2fs_readwrite.c: revision 1.61
 	sys/miscfs/genfs/genfs_io.c: revision 1.54
 	sys/kern/vfs_wapbl.c: revision 1.52
 	sys/uvm/uvm_pager.h: revision 1.43
 	sys/ufs/ffs/ffs_vnops.c: revision 1.121
 	sys/kern/vfs_subr.c: revision 1.434
 	sys/fs/msdosfs/msdosfs_vnops.c: revision 1.83
 	sys/fs/ntfs/ntfs_vnops.c: revision 1.51
 	sys/fs/udf/udf_subr.c: revision 1.119
 	sys/miscfs/specfs/spec_vnops.c: revision 1.135
 	sys/ufs/ext2fs/ext2fs_vnops.c: revision 1.103
 	sys/fs/udf/udf_vnops.c: revision 1.71
 	sys/ufs/ufs/ufs_readwrite.c: revision 1.104
 change vflushbuf() to take the full FSYNC_* flags.
 translate FSYNC_LAZY into PGO_LAZY for VOP_PUTPAGES() so that
 genfs_do_io() can set the appropriate io priority for the I/O.
 this is the first part of addressing PR 46325.
 mark all wapbl I/O as BPRIO_TIMECRITICAL.
 this is the second part of addressing PR 46325.


 To generate a diff of this commit:
 cvs rdiff -u -r1.79.4.1 -r1.79.4.2 src/sys/fs/msdosfs/msdosfs_vnops.c
 cvs rdiff -u -r1.49 -r1.49.10.1 src/sys/fs/ntfs/ntfs_vnops.c
 cvs rdiff -u -r1.40 -r1.40.2.1 src/sys/fs/sysvbfs/sysvbfs_vnops.c
 cvs rdiff -u -r1.118 -r1.118.6.1 src/sys/fs/udf/udf_subr.c
 cvs rdiff -u -r1.69 -r1.69.6.1 src/sys/fs/udf/udf_vnops.c
 cvs rdiff -u -r1.7 -r1.7.2.1 src/sys/fs/v7fs/v7fs_vnops.c
 cvs rdiff -u -r1.432 -r1.432.2.1 src/sys/kern/vfs_subr.c
 cvs rdiff -u -r1.51 -r1.51.2.1 src/sys/kern/vfs_wapbl.c
 cvs rdiff -u -r1.53 -r1.53.8.1 src/sys/miscfs/genfs/genfs_io.c
 cvs rdiff -u -r1.134 -r1.134.8.1 src/sys/miscfs/specfs/spec_vnops.c
 cvs rdiff -u -r1.2 -r1.2.4.1 src/sys/ufs/chfs/chfs_vnops.c
 cvs rdiff -u -r1.58 -r1.58.6.1 src/sys/ufs/ext2fs/ext2fs_readwrite.c
 cvs rdiff -u -r1.101 -r1.101.6.1 src/sys/ufs/ext2fs/ext2fs_vnops.c
 cvs rdiff -u -r1.275 -r1.275.2.1 src/sys/ufs/ffs/ffs_vfsops.c
 cvs rdiff -u -r1.120 -r1.120.8.1 src/sys/ufs/ffs/ffs_vnops.c
 cvs rdiff -u -r1.101 -r1.101.2.1 src/sys/ufs/ufs/ufs_readwrite.c
 cvs rdiff -u -r1.42 -r1.42.8.1 src/sys/uvm/uvm_pager.h

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->closed
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Mon, 07 Oct 2013 06:16:27 +0000
State-Changed-Why:
fixed (right?)


>Unformatted:

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2007 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.