NetBSD Problem Report #55697

From root@legendre.systella.fr  Tue Oct  6 11:57:30 2020
Return-Path: <root@legendre.systella.fr>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id AB9F61A923A
	for <gnats-bugs@gnats.NetBSD.org>; Tue,  6 Oct 2020 11:57:30 +0000 (UTC)
Message-Id: <20201006115101.8FD242C673F@legendre.systella.fr>
Date: Tue,  6 Oct 2020 13:51:01 +0200 (CEST)
From: joel.bertrand@systella.fr
Reply-To: joel.bertrand@systella.fr
To: gnats-bugs@NetBSD.org
Subject: ccd, gpt and, I suppose, kernel panic.
X-Send-Pr-Version: 3.95

>Number:         55697
>Category:       kern
>Synopsis:       ccd, gpt and, I suppose, kernel panic.
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    mlelstv
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Oct 06 12:00:01 +0000 2020
>Last-Modified:  Tue Oct 06 13:05:01 +0000 2020
>Originator:     Charlie Root
>Release:        NetBSD 9.0_STABLE
>Organization:
>Environment:
System: NetBSD legendre.systella.fr 9.0_STABLE NetBSD 9.0_STABLE (CUSTOM) #12: Thu Oct 1 08:59:33 CEST 2020 root@legendre.systella.fr:/usr/src/netbsd-9/obj/sys/arch/amd64/compile/CUSTOM amd64
Architecture: x86_64
Machine: amd64
>Description:

	I have long used a NetBSD system as a boot (tftpd), NFS and iSCSI
server for diskless workstations. This server runs 9.0_STABLE (built
from the source tree).

	This server contains six disks and many partitions (both wedges
and regular partitions):
legendre:[~] > df -h
Filesystem         Size       Used      Avail %Cap Mounted on
/dev/raid0a         31G       1,1G        28G   3% /
/dev/raid0e         62G        24G        35G  41% /usr
/dev/raid0f         31G        11G        19G  36% /var
/dev/raid0g        252G        33G       206G  13% /usr/src
/dev/raid0h        523G       228G       269G  45% /srv
/dev/dk0           3,6T       2,4T       1,0T  70% /home
kernfs             1,0K       1,0K         0B 100% /kern
ptyfs              1,0K       1,0K         0B 100% /dev/pts
procfs             4,0K       4,0K         0B 100% /proc
tmpfs              4,0G        20K       4,0G   0% /var/shm
/dev/dk1            11T        64M        10T   0% /opt

	iSCSI is used only for the swap of the diskless workstations, which
lived on an old disk. When I installed this iSCSI server, I created wedges
on a GPT label without difficulty (as far as I remember):

# dkctl wd0 listwedges
/dev/rwd0d: 3 wedges:
dk0: swap_hilbert, 67108864 blocks at 34, type: swap
dk1: swap_abel, 2097152 blocks at 67108898, type: swap
dk2: swap_schwarz, 33554432 blocks at 69206050, type: swap

	I have now replaced this old disk (SATA 3 Gbps, 160 GB, 4500 rpm)
with two other disks (SATA 6 Gbps, 1 TB, 7200 rpm). First, I configured
ccd:

legendre# cat /etc/ccd.conf
ccd0    32      none    /dev/wd0a /dev/wd1a
legendre# dmesg
...
[   511.661981] ccd0: Interleaving 2 components (32 block interleave)
[   511.661981] ccd0: /dev/wd0a (1953524160 blocks)
[   511.661981] ccd0: /dev/wd1a (1953524160 blocks)
[   511.661981] ccd0: total 3907048320 blocks
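
	The total in the dmesg output above is consistent with simple
arithmetic. The sketch below is an illustration only, not the ccd(4) driver
code: it assumes that each component is rounded down to a whole number of
interleave units and that the rounded sizes are then summed. The helper
names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: usable size of one component, rounded down to a
 * multiple of the interleave (assumption, see the text above).
 */
uint64_t
component_usable(uint64_t blocks, uint64_t ileave)
{
	return ileave > 1 ? blocks - blocks % ileave : blocks;
}

/* Hypothetical helper: total size of the concatenated device. */
uint64_t
ccd_total(const uint64_t *comp, int n, uint64_t ileave)
{
	uint64_t total = 0;

	for (int i = 0; i < n; i++)
		total += component_usable(comp[i], ileave);
	return total;
}

/* Self-check against the values printed by the kernel in this report. */
void
check_dmesg_values(void)
{
	const uint64_t comp[2] = { 1953524160ULL, 1953524160ULL };

	assert(ccd_total(comp, 2, 32) == 3907048320ULL);
}
```

	With the two 1953524160-block components and an interleave of 32,
this gives the 3907048320 blocks reported for ccd0.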

	I tried to create a new wedge on ccd0 with the following command:
legendre# gpt add -a 4k -l swap_hilbert -s 96g -t swap ccd0

	The server immediately reboots. I suppose the kernel panics, but I
am not sure, as there is no crash dump and the kernel does not enter the
debugger.

	After the reboot:
legendre# gpt show ccd0
       start        size  index  contents
           0           1         PMBR
           1           1         Pri GPT header
           2          32         Pri GPT table
          34           6         Unused
          40   201326592      1  GPT part - NetBSD swap
   201326632  3705721655         Unused
  3907048287          32         Sec GPT table
  3907048319           1         Sec GPT header

but no wedge is created and dk2 (in my case) remains unconfigured.
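
	The numbers in the gpt show output above can be checked by hand,
which suggests that the label itself was written as requested. The sketch
below assumes 512-byte sectors (inferred from 96 GiB corresponding to
201326592 sectors) and 32 sectors per GPT table, as in the listing. The
helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE	512u	/* assumed sector size, see the text above */
#define GPT_TABLE_SECS	32u	/* table size taken from the listing */

/* Round x up to the next multiple of align (align must be non-zero). */
uint64_t
round_up(uint64_t x, uint64_t align)
{
	return (x + align - 1) / align * align;
}

/* Convert a size in GiB to a sector count. */
uint64_t
gib_to_sectors(uint64_t gib)
{
	return gib * 1024 * 1024 * 1024 / SECTOR_SIZE;
}

/* Re-derive the layout for "gpt add -a 4k -s 96g" on a 3907048320-sector disk. */
void
check_layout(void)
{
	const uint64_t total = 3907048320ULL;
	const uint64_t first_usable = 1 + 1 + GPT_TABLE_SECS;	/* PMBR, header, table */
	const uint64_t align = 4096 / SECTOR_SIZE;		/* -a 4k, in sectors */

	uint64_t start = round_up(first_usable, align);
	uint64_t size = gib_to_sectors(96);
	uint64_t sec_table = total - 1 - GPT_TABLE_SECS;

	assert(first_usable == 34);
	assert(start == 40);				/* 6 unused sectors before it */
	assert(size == 201326592ULL);
	assert(start + size == 201326632ULL);		/* start of the unused area */
	assert(sec_table - (start + size) == 3705721655ULL);
	assert(sec_table == 3907048287ULL);
}
```

	Every value matches the listing: the partition starts at sector 40
because sector 34 is not 4 KiB aligned, and the unused area runs up to the
secondary GPT table.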

	Of course, if I replace ccd0 with another device (for example a
RAIDframe device), the system runs as expected.

>How-To-Repeat:

	Create an interleaved ccd device.
	Create a GPT partition on this device.

>Fix:

	No idea.

>Release-Note:

>Audit-Trail:

Responsible-Changed-From-To: kern-bug-people->mlelstv
Responsible-Changed-By: mlelstv@NetBSD.org
Responsible-Changed-When: Tue, 06 Oct 2020 12:21:47 +0000
Responsible-Changed-Why:
Mine. I can re-create the problem.


From: mlelstv@serpens.de (Michael van Elst)
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/55697: ccd, gpt and, I suppose, kernel panic.
Date: Tue, 6 Oct 2020 13:01:08 -0000 (UTC)

 joel.bertrand@systella.fr writes:

 >	I tried to create a new wedge on ccd0 with the following command:
 >legendre# gpt add -a 4k -l swap_hilbert -s 96g -t swap ccd0

 >	The server immediately reboots. I suppose the kernel panics, but I
 >am not sure, as there is no crash dump and the kernel does not enter the
 >debugger.

 The kernel fails over a recursive lock when gpt issues the DIOCMWEDGES ioctl.

 lockdebug_abort() at lockdebug_abort+0xee
 mutex_vector_enter() at mutex_vector_enter+0x364
 ccdopen() at ccdopen+0x52
 spec_open() at spec_open+0x175
 VOP_OPEN() at VOP_OPEN+0x4c
 dkwedge_discover() at dkwedge_discover+0xb4
 disk_ioctl() at disk_ioctl+0xd3
 ccdioctl() at ccdioctl+0x1ea
 VOP_IOCTL() at VOP_IOCTL+0x54
 vn_ioctl() at vn_ioctl+0xa5
 sys_ioctl() at sys_ioctl+0x5ab
 syscall() at syscall+0x157

 The error is to call disk_ioctl() with the dvlock held.

 -- 
                                 Michael van Elst
 Internet: mlelstv@serpens.de
                                 "A potential Snark may lurk in every tree."

>Unformatted:
