NetBSD Problem Report #55402

From kardel@Kardel.name  Sat Jun 20 10:51:19 2020
Return-Path: <kardel@Kardel.name>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id ED3621A9217
	for <gnats-bugs@gnats.NetBSD.org>; Sat, 20 Jun 2020 10:51:18 +0000 (UTC)
Message-Id: <20200620105044.E97F444B47@Andromeda.Kardel.name>
Date: Sat, 20 Jun 2020 12:50:44 +0200 (CEST)
From: kardel@netbsd.org
Reply-To: kardel@netbsd.org
To: gnats-bugs@NetBSD.org
Subject: amd64/9.99.68/9.99.68: xen/zfs - kernel: double fault trap, code=0
X-Send-Pr-Version: 3.95

>Number:         55402
>Category:       kern
>Synopsis:       amd64/9.99.68/GENERIC: xen/zfs - kernel: double fault trap, code=0
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    jdolecek
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Jun 20 10:55:00 +0000 2020
>Closed-Date:    Mon Jun 29 10:03:53 +0000 2020
>Last-Modified:  Mon Jun 29 10:10:01 +0000 2020
>Originator:     Frank Kardel
>Release:        NetBSD 9.99.68
>Organization:

>Environment:


System: NetBSD abstest2 9.99.68 NetBSD 9.99.68 (GENERIC) #2: Sat Jun 20 06:48:01 UTC 2020 kardel@dolomiti.hw.abs.acrys.com:/src/NetBSD/cur/src/obj.amd64/sys/arch/amd64/compile/GENERIC amd64
Architecture: x86_64
Machine: amd64
>Description:
	While testing ZFS on a PVH instance of the GENERIC kernel,
	"zpool scrub" runs into a double fault with a conspicuously
	deep stack trace. Reboots run into a similar stack trace at
	file system check time.
>How-To-Repeat:
abstest2# zpool create data0 mirror xbd1 xbd2
abstest2# zpool list
NAME    SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
data0   824G    96K   824G         -     0%     0%  1.00x  ONLINE  -
abstest2# zfs list
NAME    USED  AVAIL  REFER  MOUNTPOINT
data0  72.5K   798G    23K  /data0
abstest2# zfs set compression=lz4 data0
abstest2# zfs set dedup=on data0                                                                                                                                             
abstest2# ll /fs/nvme1/data0/
total 24
drwxr-xr-x  3 root   wheel  512 Feb 22  2019 BACKUP
drwxr-xr-x  7 root   wheel  512 Sep 20  2018 CA
drwxr-xr-x  4 abs    abs    512 Jan  4 09:15 abs
drwxr-xr-x  8 root   abs    512 Sep  6  2018 poolarranger
drwxr-xr-x  8 root   abs    512 Sep 11  2018 poolarranger-test
drwxr-xr-x  3 pgsql  wheel  512 Nov 21  2019 postgres
abstest2# zfs create data0/BACKUP
abstest2# zfs set compression=off data0/BACKUP                                                                                                                               
abstest2# zfs create data0/CA
abstest2# zfs set copies=2 data0/CA                                                                                                                                          
abstest2# zfs create data0/abs
abstest2# zfs create data0/poolarranger
abstest2# zfs create data0/poolarranger-test
abstest2# zfs create data0/postgres
abstest2# zfs list
NAME                      USED  AVAIL  REFER  MOUNTPOINT
data0                     397K   798G    25K  /data0
data0/BACKUP               23K   798G    23K  /data0/BACKUP
data0/CA                   23K   798G    23K  /data0/CA
data0/abs                  23K   798G    23K  /data0/abs
data0/poolarranger         23K   798G    23K  /data0/poolarranger
data0/poolarranger-test    23K   798G    23K  /data0/poolarranger-test
data0/postgres             23K   798G    23K  /data0/postgres
abstest2# zpool scrub data0
abstest2# zpool status data0
  pool: data0
 state: ONLINE
  scan: scrub repaired 0 in 0h0m with 0 errors on Sat Jun 20 07:34:29 2020
config:

        NAME        STATE     READ WRITE CKSUM
        data0       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            xbd1    ONLINE       0     0     0
            xbd2    ONLINE       0     0     0

errors: No known data errors
abstest2# rsync -av /fs/nvme1/data0/ /data0/
sending incremental file list
./
[...]
sent 619,861,176,009 bytes  received 274,561 bytes  79,586,756.19 bytes/sec
total size is 619,708,937,442  speedup is 1.00
abstest2# zpool list
NAME    SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
data0   824G   536G   288G         -    11%    65%  1.07x  ONLINE  -
abstest2# zfs list
NAME                      USED  AVAIL  REFER  MOUNTPOINT
data0                     574G   261G    25K  /data0
data0/BACKUP              338G   261G   338G  /data0/BACKUP
data0/CA                 26.5M   261G  26.5M  /data0/CA
data0/abs                79.0G   261G  79.0G  /data0/abs
data0/poolarranger        134G   261G   134G  /data0/poolarranger
data0/poolarranger-test  22.1G   261G  22.1G  /data0/poolarranger-test
data0/postgres           39.5K   261G  39.5K  /data0/postgres
abstest2# zpool status
  pool: data0
 state: ONLINE
  scan: scrub repaired 0 in 0h0m with 0 errors on Sat Jun 20 07:34:29 2020
config:

        NAME        STATE     READ WRITE CKSUM
        data0       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            xbd1    ONLINE       0     0     0
            xbd2    ONLINE       0     0     0

errors: No known data errors
abstest2# zpool scrub data0
abstest2# zpool status
  pool: data0
 state: ONLINE
  scan: scrub in progress since Sat Jun 20 09:49:33 2020
        352M scanned out of 536G at 117M/s, 1h17m to go
        0 repaired, 0.06% done
config:

        NAME        STATE     READ WRITE CKSUM
        data0       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            xbd1    ONLINE       0     0     0
            xbd2    ONLINE       0     0     0

errors: No known data errors
abstest2#
[ 9089.3930078] fatal double fault in supervisor mode
[ 9089.3930078] trap type 13 code 0 rip 0xffffffff80234157 cs 0x8 rflags 0x10096 cr2 0xffffc8116e792fd8 ilevel 0 rsp 0xffffc8116e792fe8
[ 9089.3930078] curlwp 0xfffff94d42842940 pid 0.2010 lowest kstack 0xffffc8116e7912c0
kernel: double fault trap, code=0
Stopped in pid 0.2010 (system) at       netbsd:do_hypervisor_callback+0x1c:
movq    %rax,ffffffffffffffb0(%rbp)
do_hypervisor_callback() at netbsd:do_hypervisor_callback+0x1c
Xhandle_hypervisor_callback() at netbsd:Xhandle_hypervisor_callback+0x19
--- interrupt ---
vdev_queue_offset_compare() at zfs:vdev_queue_offset_compare+0x7
vdev_queue_io_to_issue() at zfs:vdev_queue_io_to_issue+0x714
vdev_queue_io() at zfs:vdev_queue_io+0xec
zio_vdev_io_start() at zfs:zio_vdev_io_start+0x151
zio_execute() at zfs:zio_execute+0xe3
zio_nowait() at zfs:zio_nowait+0x5c
vdev_mirror_io_start() at zfs:vdev_mirror_io_start+0x32f
zio_vdev_io_start() at zfs:zio_vdev_io_start+0x192
zio_execute() at zfs:zio_execute+0xe3
zio_nowait() at zfs:zio_nowait+0x5c
vdev_mirror_io_start() at zfs:vdev_mirror_io_start+0x157
zio_vdev_io_start() at zfs:zio_vdev_io_start+0x33f
zio_execute() at zfs:zio_execute+0xe3
zio_nowait() at zfs:zio_nowait+0x5c
zio_ddt_read_start() at zfs:zio_ddt_read_start+0x1a6
zio_execute() at zfs:zio_execute+0xe3
zio_nowait() at zfs:zio_nowait+0x5c
dsl_scan_scrub_cb() at zfs:dsl_scan_scrub_cb+0x4e9
dsl_scan_visitbp() at zfs:dsl_scan_visitbp+0x2f1
dsl_scan_visitbp() at zfs:dsl_scan_visitbp+0x46c
dsl_scan_visitbp() at zfs:dsl_scan_visitbp+0x46c
dsl_scan_visitdnode() at zfs:dsl_scan_visitdnode+0x75
dsl_scan_visitbp() at zfs:dsl_scan_visitbp+0x61e
dsl_scan_visitbp() at zfs:dsl_scan_visitbp+0x46c
dsl_scan_visitbp() at zfs:dsl_scan_visitbp+0x46c
dsl_scan_visitbp() at zfs:dsl_scan_visitbp+0x46c
dsl_scan_visitbp() at zfs:dsl_scan_visitbp+0x46c
dsl_scan_visitbp() at zfs:dsl_scan_visitbp+0x46c
dsl_scan_visitdnode() at zfs:dsl_scan_visitdnode+0x75
dsl_scan_visitbp() at zfs:dsl_scan_visitbp+0x7cb
dsl_scan_visitds() at zfs:dsl_scan_visitds+0xe3
dsl_scan_visit() at zfs:dsl_scan_visit+0x1ad
dsl_scan_sync() at zfs:dsl_scan_sync+0x276
spa_sync() at zfs:spa_sync+0x41c
txg_sync_thread() at zfs:txg_sync_thread+0x2d8
ds          23
es          23
fs          0
gs          0
rdi         ffffc8116e793058
rsi         fffff9739f3a9bf8
rbp         ffffc8116e793048
rbx         fffff94d427729f0
rdx         0
rcx         fffff975917e1e78
rax         ffffffff81a1d000
r8          260
r9          20000
r10         fffff94372cdd2d8
r11         fffff95c3761e2b0
r12         fffff975917e1c18
r13         248
r14         fffff9739f3a9e40
r15         fffff9739f3a9e40
rip         ffffffff80234157    do_hypervisor_callback+0x1c
cs          8
rflags      10096
rsp         ffffc8116e792fe8
ss          10
netbsd:do_hypervisor_callback+0x1c:     movq    %rax,ffffffffffffffb0(%rbp)
db{0}>
------------- REBOOT ------------
[...]
[   1.0900620] boot device: dk0
[   1.0900620] root on dk0
[   1.1000647] root file system type: ffs
[   1.1000647] kern.module.path=/stand/amd64/9.99.68/modules
Sat Jun 20 10:09:53 UTC 2020
Starting root file system check:
/dev/rdk0: 65945 files, 1167160 used, 3913894 free (17854 frags, 487005 blocks, 0.4% fragmentation)
/dev/rdk0: MARKING FILE SYSTEM CLEAN
[  10.4500850] WARNING: ZFS on NetBSD is under development
[  10.4600697] pool redzone disabled for 'zio_buf_4096'
[  10.4600697] pool redzone disabled for 'zio_data_buf_4096'
[  10.4600697] pool redzone disabled for 'zio_buf_8192'
[  10.4600697] pool redzone disabled for 'zio_data_buf_8192'
[  10.4600697] pool redzone disabled for 'zio_buf_16384'
[  10.4600697] pool redzone disabled for 'zio_data_buf_16384'
[  10.4600697] pool redzone disabled for 'zio_buf_32768'
[  10.4600697] pool redzone disabled for 'zio_data_buf_32768'
[  10.4600697] pool redzone disabled for 'zio_buf_65536'
[  10.4600697] pool redzone disabled for 'zio_data_buf_65536'
[  10.4600697] pool redzone disabled for 'zio_buf_131072'
[  10.4600697] pool redzone disabled for 'zio_data_buf_131072'
[  10.4600697] pool redzone disabled for 'zio_buf_262144'
[  10.4600697] pool redzone disabled for 'zio_data_buf_262144'
[  10.4600697] pool redzone disabled for 'zio_buf_524288'
[  10.4600697] pool redzone disabled for 'zio_data_buf_524288'
[  10.4600697] pool redzone disabled for 'zio_buf_1048576'
[  10.4600697] pool redzone disabled for 'zio_data_buf_1048576'
[  10.4600697] pool redzone disabled for 'zio_buf_2097152'
[  10.4600697] pool redzone disabled for 'zio_data_buf_2097152'
[  10.4600697] pool redzone disabled for 'zio_buf_4194304'
[  10.4600697] pool redzone disabled for 'zio_data_buf_4194304'
[  10.4600697] pool redzone disabled for 'zio_buf_8388608'
[  10.4600697] pool redzone disabled for 'zio_data_buf_8388608'
[  10.4600697] pool redzone disabled for 'zio_buf_16777216'
[  10.4600697] pool redzone disabled for 'zio_data_buf_16777216'
[  10.7200569] ZFS filesystem version: 5
Starting file system checks:
[  15.9900617] fatal double fault in supervisor mode
[  15.9900617] trap type 13 code 0 rip 0xffffffff80c932ae cs 0x8 rflags 0x10297 cr2 0xffffa3916dd12ff8 ilevel 0 rsp 0xffffa3916dd13000
[  15.9900617] curlwp 0xffffcae5720061c0 pid 0.351 lowest kstack 0xffffa3916dd112c0
kernel: double fault trap, code=0
Stopped in pid 0.351 (system) at        netbsd:mutex_vector_enter+0x8:  pushq
%r13
mutex_vector_enter() at netbsd:mutex_vector_enter+0x8
pool_get() at netbsd:pool_get+0x69
pool_cache_get_slow() at netbsd:pool_cache_get_slow+0x12b
pool_cache_get_paddr() at netbsd:pool_cache_get_paddr+0x23a
kmem_intr_alloc() at netbsd:kmem_intr_alloc+0x5b
kmem_intr_zalloc() at netbsd:kmem_intr_zalloc+0x11
kmem_zalloc() at netbsd:kmem_zalloc+0x4a
vdev_mirror_io_start() at zfs:vdev_mirror_io_start+0x64
zio_vdev_io_start() at zfs:zio_vdev_io_start+0x192
zio_execute() at zfs:zio_execute+0xe3
zio_nowait() at zfs:zio_nowait+0x5c
vdev_mirror_io_start() at zfs:vdev_mirror_io_start+0x157
zio_vdev_io_start() at zfs:zio_vdev_io_start+0x33f
zio_execute() at zfs:zio_execute+0xe3
zio_nowait() at zfs:zio_nowait+0x5c
arc_read() at zfs:arc_read+0x4ed
dbuf_read() at zfs:dbuf_read+0x1c3
dbuf_hold_impl() at zfs:dbuf_hold_impl+0x332
dbuf_hold_impl() at zfs:dbuf_hold_impl+0x2a4
dbuf_hold() at zfs:dbuf_hold+0x22
dmu_buf_hold_noread_by_dnode() at zfs:dmu_buf_hold_noread_by_dnode+0x39
dmu_buf_hold_by_dnode() at zfs:dmu_buf_hold_by_dnode+0x2a
zap_idx_to_blk() at zfs:zap_idx_to_blk+0x97
zap_deref_leaf() at zfs:zap_deref_leaf+0x6e
fzap_length() at zfs:fzap_length+0x2a
zap_length_uint64() at zfs:zap_length_uint64+0x7b
ddt_zap_lookup() at zfs:ddt_zap_lookup+0x33
ddt_class_contains() at zfs:ddt_class_contains+0x7c
dsl_scan_visitbp() at zfs:dsl_scan_visitbp+0x2b4
dsl_scan_visitbp() at zfs:dsl_scan_visitbp+0x46c
dsl_scan_visitbp() at zfs:dsl_scan_visitbp+0x46c
dsl_scan_visitdnode() at zfs:dsl_scan_visitdnode+0x75
dsl_scan_visitbp() at zfs:dsl_scan_visitbp+0x61e
dsl_scan_visitbp() at zfs:dsl_scan_visitbp+0x46c
dsl_scan_visitbp() at zfs:dsl_scan_visitbp+0x46c
dsl_scan_visitbp() at zfs:dsl_scan_visitbp+0x46c
dsl_scan_visitbp() at zfs:dsl_scan_visitbp+0x46c
dsl_scan_visitbp() at zfs:dsl_scan_visitbp+0x46c
dsl_scan_visitdnode() at zfs:dsl_scan_visitdnode+0x75
dsl_scan_visitbp() at zfs:dsl_scan_visitbp+0x7cb
dsl_scan_visitds() at zfs:dsl_scan_visitds+0xe3
dsl_scan_visit() at zfs:dsl_scan_visit+0x1ad
dsl_scan_sync() at zfs:dsl_scan_sync+0x276
spa_sync() at zfs:spa_sync+0x41c
txg_sync_thread() at zfs:txg_sync_thread+0x2d8
ds          23
es          23
fs          0
gs          0
rdi         ffffcb10dd8a34f0
rsi         1
rbp         ffffa3916dd13010
rbx         ffffcb10dd8a3440
rdx         b
rcx         ffffcae5720061c0
rax         601
r8          ffffcb10dd8a3440
r9          1
r10         ffffcae5818bd900
r11         ffffcae5719ff078
r12         1
r13         ffffcb10dd8a34f0
r14         ffffa3916dd13118
r15         ffffcb10dd8a3e80
rip         ffffffff80c932ae    mutex_vector_enter+0x8
cs          8
rflags      10297
rsp         ffffa3916dd13000
ss          10
netbsd:mutex_vector_enter+0x8:  pushq   %r13
db{0}>

>Fix:
	?
	Could it be a stack size issue? The stack trace seems to resemble
	the pattern of a tree walk.

>Release-Note:

>Audit-Trail:
From: Jaromír Doleček <jaromir.dolecek@gmail.com>
To: "gnats-bugs@NetBSD.org" <gnats-bugs@netbsd.org>
Cc: 
Subject: Re: kern/55402: amd64/9.99.68/9.99.68: xen/zfs - kernel: double fault
 trap, code=0
Date: Sat, 20 Jun 2020 15:08:39 +0200

 In order to verify whether it's a stack issue, can you please test
 with the following patch?

 http://www.netbsd.org/~jdolecek/zfs_reduce_stack.diff

From: Frank Kardel <kardel@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/55402: amd64/9.99.68/9.99.68: xen/zfs - kernel: double fault
 trap, code=0
Date: Sat, 20 Jun 2020 15:40:49 +0200

 Well, it goes a bit further but hits the double fault again, now
 with 57 frames (two > 1 KB) at about the same accumulated stack
 size. Some space was saved by the patch, but presumably not enough.

 0: vdev_queue_io_to_issue framesize 1152
 1: vdev_queue_io_remove framesize 32
 2: vdev_queue_io_to_issue framesize 1152
 3: vdev_queue_io framesize 64
 4: vdev_queue_io_done framesize 64
 5: zio_vdev_io_start framesize 96
 6: zio_execute framesize 80
 7: zio_nowait framesize 48
 8: vdev_mirror_io_start framesize 128
 9: zio_vdev_io_start framesize 96
 10: zio_execute framesize 80
 11: zio_nowait framesize 48
 12: vdev_mirror_io_start framesize 128
 13: zio_vdev_io_start framesize 96
 14: zio_execute framesize 80
 15: zio_nowait framesize 48
 16: arc_read_done framesize 80
 17: l2arc_read_done framesize 112
 18: arc_read framesize 192
 19: dbuf_read framesize 176
 20: dbuf_read_done framesize 48
 21: dmu_buf_hold_by_dnode framesize 64
 22: zap_get_leaf_byblk framesize 112
 23: zap_deref_leaf framesize 64
 24: fzap_length framesize 112
 25: zap_length_uint64 framesize 112
 26: ddt_zap_lookup framesize 368
 27: ddt_class_contains framesize 448
 28: dsl_scan_visitbp framesize 224
 29: dsl_scan_visitbp framesize 224
 30: dsl_scan_visitbp framesize 224
 31: dsl_scan_visitdnode framesize 144
 32: dsl_scan_visitbp framesize 224
 33: dsl_scan_visitbp framesize 224
 34: dsl_scan_visitbp framesize 224
 35: dsl_scan_visitbp framesize 224
 36: dsl_scan_visitbp framesize 224
 37: dsl_scan_visitbp framesize 224
 38: dsl_scan_visitdnode framesize 144
 39: dsl_scan_visitbp framesize 224
 40: dsl_scan_visitds framesize 144
 41: dsl_scan_visitdnode framesize 144
 42: dsl_scan_visitbp framesize 224
 43: dsl_scan_visitds framesize 144
 44: dsl_scan_visit framesize 560
 45: dsl_scan_sync_state framesize 32
 46: dsl_scan_sync framesize 112
 47: spa_sync_version framesize 48
 48: spa_sync_nvlist framesize 96
 49: spa_sync_config_object framesize 432
 50: spa_sync_aux_dev.part.6 framesize 112
 51: spa_sync_props framesize 128
 52: spa_sync framesize 208
 53: spa_sync_allpools framesize 32
 54: spa_sync_pass framesize 8
 55: spa_syncing_txg framesize 8
 56: txg_sync_thread framesize 240

 frame size sum = 10480, # frames 57
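
 (For scale: 12288 - 10480 = 1808, i.e. these ZFS frames alone come
 within about 1.8 KB of the 12288 bytes of usable kernel stack
 established later in this PR, leaving little headroom for interrupt
 frames such as the hypervisor callback at the top of the first trace.)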

From: Jaromír Doleček <jaromir.dolecek@gmail.com>
To: "gnats-bugs@NetBSD.org" <gnats-bugs@netbsd.org>
Cc: 
Subject: Re: kern/55402: amd64/9.99.68/9.99.68: xen/zfs - kernel: double fault
 trap, code=0
Date: Sat, 20 Jun 2020 18:55:28 +0200

 On Sat, 20 Jun 2020 at 15:45, Frank Kardel <kardel@netbsd.org> wrote:
 >  Well, it goes a bit further but hits the double fault again, now
 >  with 57 frames (two > 1 KB) at about the same accumulated stack
 >  size. Some space was saved by the patch, but presumably not enough.

 OK, I've updated the patch. My previous change in
 vdev_queue_io_to_issue() did not work; gcc releases the stack space
 at the end of the function, not when leaving the enclosing block.
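
 A minimal userland sketch of that behaviour (illustrative only, not
 part of the patch): the buffer scoped to the inner block still
 occupies f()'s frame when the recursive call is made, so merely
 block-scoping a large local does not reduce the per-recursion cost.

 #include <stdio.h>

 void
 f(int depth)
 {
 	if (depth > 0) {
 		char big[1024];		/* reserved in f()'s prologue */
 		big[0] = (char)depth;
 		printf("%d: big at %p\n", depth, (void *)big);
 	}
 	if (depth > 0)
 		f(depth - 1);		/* frame is still ~1 KB deep here */
 }

 int
 main(void)
 {
 	f(3);		/* three nested calls, each with the full frame */
 	return 0;
 }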

 The new version reduces vdev_queue_io_to_issue() to only 160 bytes
 of stack instead of 1152, and the dsl_scan_visitbp() +
 dsl_scan_visitdnode() pair now takes 40 bytes less.

 Can you check if this is enough to get it through?

 http://www.netbsd.org/~jdolecek/zfs_reduce_stack.diff

 Jaromir

From: Frank Kardel <kardel@netbsd.org>
To: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org,
 gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc: 
Subject: Re: kern/55402: amd64/9.99.68/9.99.68: xen/zfs - kernel: double fault
 trap, code=0
Date: Sat, 20 Jun 2020 20:28:18 +0200

 That did it - the scrub run now completed successfully.

 Frank


From: Jaromír Doleček <jaromir.dolecek@gmail.com>
To: "gnats-bugs@NetBSD.org" <gnats-bugs@netbsd.org>
Cc: 
Subject: Re: kern/55402: amd64/9.99.68/9.99.68: xen/zfs - kernel: double fault
 trap, code=0
Date: Sat, 20 Jun 2020 20:39:52 +0200

 Can you confirm whether it's enough to apply just the change for
 vdev_queue.c, i.e. can you try the scrub with dsl_scan.c the same as
 in the repository (without the patch)?

 I'd prefer to keep dsl_scan.c closer to upstream unless it's
 absolutely necessary to change it.

 Jaromir

 On Sat, 20 Jun 2020 at 20:28, Frank Kardel <kardel@netbsd.org> wrote:
 >
 > That did it - the scrub run now completed successfully.
 >
 > Frank

From: Frank Kardel <kardel@netbsd.org>
To: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org,
 gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc: 
Subject: Re: kern/55402: amd64/9.99.68/9.99.68: xen/zfs - kernel: double fault
 trap, code=0
Date: Sat, 20 Jun 2020 21:00:14 +0200

 I can do that.

 I do have the feeling that the zfs code is very generous with stack.
 Also, the scan has a recursive structure where I don't know the upper
 bound of the recursion; this mini pool was just around 800 GB, and
 every three recursions we eat about 1 KB of stack. We seem to be
 running pretty close to our kernel stack limit when using zfs, so
 maybe enlarging our kernel stack could also be an option. Other
 systems seem to be able to handle normal zfs operations.

 Will check dsl_scan in its original form anyway - that takes some
 time, though.

 Frank

From: Frank Kardel <kardel@netbsd.org>
To: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org,
 gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc: 
Subject: Re: kern/55402: amd64/9.99.68/9.99.68: xen/zfs - kernel: double fault
 trap, code=0
Date: Sat, 20 Jun 2020 21:26:03 +0200

 It double faulted again. We are too close to the kernel stack limit;
 I fear we need to save more stack space and/or extend the kernel
 stack (if possible).

 The issue here is that the stack usage is probably data dependent,
 and we must not fall over just because the pool scan needs more
 recursion stack space.

 Frank


Responsible-Changed-From-To: kern-bug-people->jdolecek
Responsible-Changed-By: jdolecek@NetBSD.org
Responsible-Changed-When: Sat, 20 Jun 2020 19:53:52 +0000
Responsible-Changed-Why:
I'm looking into this.


State-Changed-From-To: open->analyzed
State-Changed-By: jdolecek@NetBSD.org
State-Changed-When: Sat, 20 Jun 2020 19:53:52 +0000
State-Changed-Why:
The provided patch is working; I'll commit it once I re-check the changes
against upstream.
It's somewhat strange that we get the trap around the 12 KB mark, as the
kernel stack should be 16 KB on amd64. I want to check that as well
before committing.


From: Jaromír Doleček <jaromir.dolecek@gmail.com>
To: Frank Kardel <kardel@netbsd.org>
Cc: "gnats-bugs@NetBSD.org" <gnats-bugs@netbsd.org>
Subject: Re: kern/55402: amd64/9.99.68/9.99.68: xen/zfs - kernel: double fault
 trap, code=0
Date: Sat, 20 Jun 2020 21:57:44 +0200

 On Sat, 20 Jun 2020 at 21:00, Frank Kardel <kardel@netbsd.org> wrote:
 > I do have the feeling that the zfs code is very generous with stack.
 > Also, the scan has a recursive structure where I don't know the upper bound

 Both Linux and FreeBSD have the same kernel stack size as we do on
 amd64 - 16KB, i.e. 4 pages.

 It seems, however, that _something_ blows up already when the stack
 grows past the 3rd page, i.e. around the 12288-byte mark. I'll
 investigate this separately.

 I checked the FreeBSD code; they have some changes in dsl_scan.c
 which, as a side effect, reduce the stack usage of the dsl_scan_*()
 recursion. Maybe that is why it works there.

 Anyway, it seems both changes are necessary, so I'll eventually commit
 them. Thanks for testing.

 Jaromir

From: "Jaromir Dolecek" <jdolecek@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/55402 CVS commit: src/external/cddl/osnet/dist/uts/common/fs/zfs
Date: Wed, 24 Jun 2020 16:16:01 +0000

 Module Name:	src
 Committed By:	jdolecek
 Date:		Wed Jun 24 16:16:01 UTC 2020

 Modified Files:
 	src/external/cddl/osnet/dist/uts/common/fs/zfs: vdev_queue.c

 Log Message:
 reduce stack usage in vdev_queue_io_to_issue() - zio_t is about 1KB, and
 the function potentially recurses into itself

 part of fix for PR kern/55402 by Frank Kardel


 To generate a diff of this commit:
 cvs rdiff -u -r1.1.1.3 -r1.2 \
     src/external/cddl/osnet/dist/uts/common/fs/zfs/vdev_queue.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.
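
 A sketch of the shape of this change (illustrative only, assuming
 the large on-stack object is a zio_t used as an AVL search key; this
 is not the committed diff): the ~1 KB local moves to kmem_alloc(9),
 so each level of the potential recursion keeps only a few pointers
 on the stack.

 #include <sys/kmem.h>

 static zio_t *
 vdev_queue_io_to_issue(vdev_queue_t *vq)
 {
 	zio_t *search;	/* was: zio_t search; (~1 KB per frame) */
 	zio_t *zio;

 	search = kmem_alloc(sizeof(*search), KM_SLEEP);
 	/* ... set up 'search' and perform the avl_find()/avl_nearest()
 	 * lookup in vq's queues; lookup logic elided ... */
 	zio = NULL;
 	kmem_free(search, sizeof(*search));
 	return zio;
 }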

From: "Jaromir Dolecek" <jdolecek@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/55402 CVS commit: src/external/cddl/osnet/dist/uts/common/fs/zfs
Date: Wed, 24 Jun 2020 16:23:16 +0000

 Module Name:	src
 Committed By:	jdolecek
 Date:		Wed Jun 24 16:23:16 UTC 2020

 Modified Files:
 	src/external/cddl/osnet/dist/uts/common/fs/zfs: dsl_scan.c

 Log Message:
 change dsl_scan_visitbp() to allocate blkptr_t dynamically rather than
 on-stack - this function is called recursively, and the 120 bytes per call
 add up; also remove unused variable

 part of fix for PR kern/55402 by Frank Kardel


 To generate a diff of this commit:
 cvs rdiff -u -r1.1.1.1 -r1.2 \
     src/external/cddl/osnet/dist/uts/common/fs/zfs/dsl_scan.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Jaromir Dolecek" <jdolecek@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/55402 CVS commit: src/external/cddl/osnet/dist/uts/common/fs/zfs
Date: Wed, 24 Jun 2020 16:29:34 +0000

 Module Name:	src
 Committed By:	jdolecek
 Date:		Wed Jun 24 16:29:34 UTC 2020

 Modified Files:
 	src/external/cddl/osnet/dist/uts/common/fs/zfs: dsl_scan.c

 Log Message:
 reduce stack usage in dsl_scan_recurse() - allocate memory for
 temporary zbookmark_phys_t using kmem_alloc() rather than stack;
 this usually recurses several times, and this saves 2x
 sizeof(zbookmark_phys_t) == 64 bytes per recursion

 part of fix for PR kern/55402 by Frank Kardel


 To generate a diff of this commit:
 cvs rdiff -u -r1.2 -r1.3 \
     src/external/cddl/osnet/dist/uts/common/fs/zfs/dsl_scan.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.
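
 A sketch of the pattern shared by the two dsl_scan.c commits above
 (illustrative shape only, not the committed diffs): per-recursion
 temporaries move from the stack to kmem_alloc(9).

 static void
 dsl_scan_visitbp(const blkptr_t *bp /* , other args elided */)
 {
 	blkptr_t *bp_toread;	/* was: blkptr_t bp_toread; on stack */

 	bp_toread = kmem_alloc(sizeof(*bp_toread), KM_SLEEP);
 	*bp_toread = *bp;
 	/* ... dsl_scan_recurse() descends using bp_toread, itself
 	 * allocating its temporary pair of zbookmark_phys_t the same
 	 * way rather than on the stack ... */
 	kmem_free(bp_toread, sizeof(*bp_toread));
 }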

State-Changed-From-To: analyzed->closed
State-Changed-By: jdolecek@NetBSD.org
State-Changed-When: Mon, 29 Jun 2020 10:03:53 +0000
State-Changed-Why:
The actually usable kernel stack on amd64 is indeed 12 KiB: one whole page
is reserved for the pcb, and one for the red page. Eventually we might want
to bump it if needed, but with the zfs stack fixes it seems fine for now.
Thanks for the report and testing.


From: Jaromír Doleček <jaromir.dolecek@gmail.com>
To: "gnats-bugs@NetBSD.org" <gnats-bugs@netbsd.org>
Cc: 
Subject: Re: kern/55402 (amd64/9.99.68/GENERIC: xen/zfs - kernel: double fault
 trap, code=0)
Date: Mon, 29 Jun 2020 12:07:00 +0200

 For reference, there was this commit for SVS; it was not linked here
 automatically due to a typo in the PR number.

 Module Name:    src
 Committed By:   jdolecek
 Date:           Mon Jun 29 09:56:51 UTC 2020

 Modified Files:
         src/sys/arch/amd64/include: param.h

 Log Message:
 increase UPAGES (used for lwp kernel stack) for SVS so that the
 amount of actually usable kernel stack is the same for SVS and
 non-SVS kernels (currently 12 KiB)

 discussed with maxv@, part of investigation for PR kern/S55402


 To generate a diff of this commit:
 cvs rdiff -u -r1.37 -r1.38 src/sys/arch/amd64/include/param.h

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

>Unformatted:
