NetBSD Problem Report #55697
From root@legendre.systella.fr Tue Oct 6 11:57:30 2020
Return-Path: <root@legendre.systella.fr>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id AB9F61A923A
for <gnats-bugs@gnats.NetBSD.org>; Tue, 6 Oct 2020 11:57:30 +0000 (UTC)
Message-Id: <20201006115101.8FD242C673F@legendre.systella.fr>
Date: Tue, 6 Oct 2020 13:51:01 +0200 (CEST)
From: joel.bertrand@systella.fr
Reply-To: joel.bertrand@systella.fr
To: gnats-bugs@NetBSD.org
Subject: ccd, gpt and, I suppose, kernel panic.
X-Send-Pr-Version: 3.95
>Number: 55697
>Category: kern
>Synopsis: ccd, gpt and, I suppose, kernel panic.
>Confidential: no
>Severity: critical
>Priority: medium
>Responsible: mlelstv
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Tue Oct 06 12:00:01 +0000 2020
>Closed-Date: Fri Sep 24 02:56:08 +0000 2021
>Last-Modified: Fri Sep 24 06:30:01 +0000 2021
>Originator: Charlie Root
>Release: NetBSD 9.0_STABLE
>Organization:
>Environment:
System: NetBSD legendre.systella.fr 9.0_STABLE NetBSD 9.0_STABLE (CUSTOM) #12: Thu Oct 1 08:59:33 CEST 2020 root@legendre.systella.fr:/usr/src/netbsd-9/obj/sys/arch/amd64/compile/CUSTOM amd64
Architecture: x86_64
Machine: amd64
>Description:
For a long time I have used a NetBSD system that acts as boot (tftpd), NFS
and iSCSI server for diskless workstations. This server runs 9.0_STABLE
(built from the source tree).
This server contains six disks and many partitions
(both wedges and regular partitions):
legendre:[~] > df -h
Filesystem     Size  Used  Avail  %Cap  Mounted on
/dev/raid0a     31G  1,1G    28G    3%  /
/dev/raid0e     62G   24G    35G   41%  /usr
/dev/raid0f     31G   11G    19G   36%  /var
/dev/raid0g    252G   33G   206G   13%  /usr/src
/dev/raid0h    523G  228G   269G   45%  /srv
/dev/dk0       3,6T  2,4T   1,0T   70%  /home
kernfs         1,0K  1,0K     0B  100%  /kern
ptyfs          1,0K  1,0K     0B  100%  /dev/pts
procfs         4,0K  4,0K     0B  100%  /proc
tmpfs          4,0G   20K   4,0G    0%  /var/shm
/dev/dk1        11T   64M    10T    0%  /opt
iSCSI is only used for the diskless workstations' swap, on an old disk.
When I installed this iSCSI server, I created wedges on GPT
labels without difficulty (as far as I remember):
# dkctl wd0 listwedges
/dev/rwd0d: 3 wedges:
dk0: swap_hilbert, 67108864 blocks at 34, type: swap
dk1: swap_abel, 2097152 blocks at 67108898, type: swap
dk2: swap_schwarz, 33554432 blocks at 69206050, type: swap
Now I have replaced this old disk (SATA 3 Gbps, 160 GB, 4500 rpm) with
two other disks (SATA 6 Gbps, 1 TB, 7200 rpm). First, I
configured ccd:
legendre# cat /etc/ccd.conf
ccd0 32 none /dev/wd0a /dev/wd1a
legendre# dmesg
...
[ 511.661981] ccd0: Interleaving 2 components (32 block interleave)
[ 511.661981] ccd0: /dev/wd0a (1953524160 blocks)
[ 511.661981] ccd0: /dev/wd1a (1953524160 blocks)
[ 511.661981] ccd0: total 3907048320 blocks
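The total reported above is consistent with plain striping over two equal components. As a rough sketch (a simplified model, not the ccd(4) driver code; the struct and function names are hypothetical), the usable size and the mapping from a logical block to a component can be written as follows, assuming each component is truncated to a whole number of interleave units:

```c
#include <stdint.h>

/* Hypothetical model of an interleaved ccd; not the driver's own structures. */
struct ccd_model {
	uint64_t interleave;   /* blocks per stripe unit (32 in the report) */
	uint64_t ncomp;        /* number of equal-sized components (2 here) */
	uint64_t comp_blocks;  /* blocks per component */
};

/* Assumption: each component is truncated to a multiple of the interleave. */
static uint64_t
ccd_total_blocks(const struct ccd_model *m)
{
	uint64_t usable = (m->comp_blocks / m->interleave) * m->interleave;

	return usable * m->ncomp;
}

/* Map a logical block to (component index, block within that component). */
static void
ccd_map_block(const struct ccd_model *m, uint64_t lblk,
    uint64_t *comp, uint64_t *cblk)
{
	uint64_t unit = lblk / m->interleave;  /* which stripe unit */
	uint64_t off = lblk % m->interleave;   /* offset inside the unit */

	*comp = unit % m->ncomp;
	*cblk = (unit / m->ncomp) * m->interleave + off;
}
```

With interleave 32 and two components of 1953524160 blocks each, `ccd_total_blocks()` gives 3907048320, matching the dmesg line. Logical blocks 0..31 land on the first component and 32..63 on the second.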
I then tried to create a new wedge on ccd0 with the following command:
legendre# gpt add -a 4k -l swap_hilbert -s 96g -t swap ccd0
The server immediately reboots. I suppose the kernel panics, but I am not
sure, as there is no crash dump and the kernel does not enter the debugger.
After the reboot:
legendre# gpt show ccd0
       start        size  index  contents
           0           1         PMBR
           1           1         Pri GPT header
           2          32         Pri GPT table
          34           6         Unused
          40   201326592      1  GPT part - NetBSD swap
   201326632  3705721655         Unused
  3907048287          32         Sec GPT table
  3907048319           1         Sec GPT header
but no wedge is created and dk2 (in my case) remains unconfigured.
Of course, if I replace ccd0 with another device (for example a
raidframe device), the system runs as expected.
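The numbers in the gpt show output can be reproduced with a little arithmetic, which confirms that the partition itself was written correctly before the crash. The following is only a sketch under stated assumptions: 512-byte sectors, a 32-sector partition table, and hypothetical names (`gpt_layout_for`, `struct gpt_layout`) that are not part of gpt(8).

```c
#include <stdint.h>

#define SECTOR_SIZE        512u  /* assumed; 96 GiB / 512 == 201326592 */
#define GPT_TABLE_SECTORS  32u   /* as shown for "Pri GPT table" */

/* Hypothetical result type for illustration only. */
struct gpt_layout {
	uint64_t first_usable;  /* first sector after PMBR, header, table */
	uint64_t part_start;    /* first_usable rounded up to the alignment */
	uint64_t part_size;     /* partition size in sectors */
	uint64_t sec_table;     /* start of the secondary GPT table */
	uint64_t sec_header;    /* secondary GPT header (last sector) */
};

static uint64_t
round_up(uint64_t v, uint64_t align)
{
	return (v + align - 1) / align * align;
}

static struct gpt_layout
gpt_layout_for(uint64_t total_sectors, uint64_t part_bytes,
    uint64_t align_bytes)
{
	struct gpt_layout l;

	l.first_usable = 1 + 1 + GPT_TABLE_SECTORS;  /* PMBR + header + table */
	l.part_start = round_up(l.first_usable, align_bytes / SECTOR_SIZE);
	l.part_size = part_bytes / SECTOR_SIZE;
	l.sec_header = total_sectors - 1;
	l.sec_table = l.sec_header - GPT_TABLE_SECTORS;
	return l;
}
```

For 3907048320 total sectors, a 96 GiB partition and 4 KiB alignment, this yields a partition at sector 40 of 201326592 sectors, the secondary table at 3907048287 and the secondary header at 3907048319, as in the listing.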
>How-To-Repeat:
Create an interleaved ccd device.
Create a GPT partition on this device.
>Fix:
No idea.
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: kern-bug-people->mlelstv
Responsible-Changed-By: mlelstv@NetBSD.org
Responsible-Changed-When: Tue, 06 Oct 2020 12:21:47 +0000
Responsible-Changed-Why:
Mine. I can re-create the problem.
From: mlelstv@serpens.de (Michael van Elst)
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/55697: ccd, gpt and, I suppose, kernel panic.
Date: Tue, 6 Oct 2020 13:01:08 -0000 (UTC)
joel.bertrand@systella.fr writes:
> I have tried to create new wedge on ccd0 with following command :
>legendre# gpt add -a 4k -l swap_hilbert -s 96g -t swap ccd0
> Server immediatly reboots. I'm not sure kernel panics (but I suppose)
>as there is no crashdump and kernel doesn't enter in debugger.
The kernel aborts on a recursive lock when gpt issues the DIOCMWEDGES ioctl:
lockdebug_abort() at lockdebug_abort+0xee
mutex_vector_enter() at mutex_vector_enter+0x364
ccdopen() at ccdopen+0x52
spec_open() at spec_open+0x175
VOP_OPEN() at VOP_OPEN+0x4c
dkwedge_discover() at dkwedge_discover+0xb4
disk_ioctl() at disk_ioctl+0xd3
ccdioctl() at ccdioctl+0x1ea
VOP_IOCTL() at VOP_IOCTL+0x54
vn_ioctl() at vn_ioctl+0xa5
sys_ioctl() at sys_ioctl+0x5ab
syscall() at syscall+0x157
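The error is that ccdioctl() calls disk_ioctl() with the dvlock held;
dkwedge_discover() then opens the device again and ccdopen() tries to
take the lock a second time.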
--
Michael van Elst
Internet: mlelstv@serpens.de
"A potential Snark may lurk in every tree."
State-Changed-From-To: open->closed
State-Changed-By: mlelstv@NetBSD.org
State-Changed-When: Fri, 24 Sep 2021 02:56:08 +0000
State-Changed-Why:
Fixed in ccd.c 1.185 and pulled up with ticket #1110.
From: BERTRAND Joël <joel.bertrand@systella.fr>
To: gnats-bugs@netbsd.org, mlelstv@netbsd.org, netbsd-bugs@netbsd.org,
gnats-admin@netbsd.org
Cc:
Subject: Re: kern/55697 (ccd, gpt and, I suppose, kernel panic.)
Date: Fri, 24 Sep 2021 07:58:47 +0200
mlelstv@NetBSD.org wrote:
> Synopsis: ccd, gpt and, I suppose, kernel panic.
>
> State-Changed-From-To: open->closed
> State-Changed-By: mlelstv@NetBSD.org
> State-Changed-When: Fri, 24 Sep 2021 02:56:08 +0000
> State-Changed-Why:
> Fixed in ccd.c 1.185 and pulled up with ticket #1110.
Thanks a lot. Do you know when this patch will be in the netbsd-9 branch?
Best regards,
JKB
From: mlelstv@serpens.de (Michael van Elst)
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/55697 (ccd, gpt and, I suppose, kernel panic.)
Date: Fri, 24 Sep 2021 06:04:53 -0000 (UTC)
joel.bertrand@systella.fr (BERTRAND Joël) writes:
>mlelstv@NetBSD.org wrote:
>> Synopsis: ccd, gpt and, I suppose, kernel panic.
>>
>> State-Changed-From-To: open->closed
>> State-Changed-By: mlelstv@NetBSD.org
>> State-Changed-When: Fri, 24 Sep 2021 02:56:08 +0000
>> State-Changed-Why:
>> Fixed in ccd.c 1.185 and pulled up with ticket #1110.
> Thanks a lot. Do you know when this patch will be in netbsd-9 branch ?
It has been there since last October. Or do you still see an issue?
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/55697 (ccd, gpt and, I suppose, kernel panic.)
Date: Fri, 24 Sep 2021 08:05:06 +0200
On Fri, Sep 24, 2021 at 06:00:03AM +0000, BERTRAND Joël wrote:
> Thanks a lot. Do you know when this patch will be in netbsd-9 branch ?
It already is (since October 2020).
Martin
From: BERTRAND Joël <joel.bertrand@systella.fr>
To: gnats-bugs@netbsd.org, mlelstv@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org
Cc:
Subject: Re: kern/55697 (ccd, gpt and, I suppose, kernel panic.)
Date: Fri, 24 Sep 2021 08:27:47 +0200
Michael van Elst wrote:
> The following reply was made to PR kern/55697; it has been noted by GNATS.
>
> From: mlelstv@serpens.de (Michael van Elst)
> To: gnats-bugs@netbsd.org
> Cc:
> Subject: Re: kern/55697 (ccd, gpt and, I suppose, kernel panic.)
> Date: Fri, 24 Sep 2021 06:04:53 -0000 (UTC)
>
> joel.bertrand@systella.fr (BERTRAND Joël) writes:
>
> >mlelstv@NetBSD.org wrote:
> >> Synopsis: ccd, gpt and, I suppose, kernel panic.
> >>
> >> State-Changed-From-To: open->closed
> >> State-Changed-By: mlelstv@NetBSD.org
> >> State-Changed-When: Fri, 24 Sep 2021 02:56:08 +0000
> >> State-Changed-Why:
> >> Fixed in ccd.c 1.185 and pulled up with ticket #1110.
>
> > Thanks a lot. Do you know when this patch will be in netbsd-9 branch ?
>
> It's already there since last october, or do you still see an issue?
Oops, sorry. This bug is indeed fixed; I don't know why I read "ccb" (I had
sent a mail because my server regularly panics with:
[ 308324,087343] panic: trap
[ 308324,087343] cpu1: Begin traceback...
[ 308324,087343] vpanic() at netbsd:vpanic+0x160
[ 308324,087343] snprintf() at netbsd:snprintf
[ 308324,087343] startlwp() at netbsd:startlwp
[ 308324,087343] alltraps() at netbsd:alltraps+0xbb
[ 308324,087343] ccb_timeout() at iscsi:ccb_timeout+0xf0
[ 308324,087343] iscsi_cleanup_thread() at iscsi:iscsi_cleanup_thread+0x2b6
[ 308324,087343] cpu1: End traceback...
[ 308324,087343] uvm_fault(0xffffc947e69898d0, 0x0, 2) -> e
)
Please disregard my reply to this PR.
Best regards,
JKB
>Unformatted:
Copyright © 1994-2020
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.