NetBSD Problem Report #46199

From cheusov@tut.by  Thu Mar 15 11:29:55 2012
Return-Path: <cheusov@tut.by>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
	by www.NetBSD.org (Postfix) with ESMTP id C6A0263B946
	for <gnats-bugs@gnats.netbsd.org>; Thu, 15 Mar 2012 11:29:55 +0000 (UTC)
Message-Id: <s93bony6r0w.fsf@work.imb.invention.com>
Date: Thu, 15 Mar 2012 14:29:51 +0300
From: cheusov@tut.by
To: gnats-bugs@gnats.NetBSD.org
Subject: 6.0_BETA. "qemu -net tap" + bridge(4). Network problems in guests.
X-Send-Pr-Version: 3.95

>Number:         46199
>Category:       kern
>Synopsis:       6.0_BETA. "qemu -net tap" + bridge(4). Network problems in guests.
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Mar 15 11:30:00 +0000 2012
>Closed-Date:    Tue Jun 03 17:00:52 +0000 2014
>Last-Modified:  Tue Jun 03 17:00:52 +0000 2014
>Originator:     Aleksey Cheusov
>Release:        NetBSD 6.0_BETA
>Organization:
>Environment:
System: NetBSD work.imb.invention.com 6.0_BETA NetBSD 6.0_BETA (GENERIC) #0: Mon Mar 12 17:25:05 FET 2012 cheusov@work.imb.invention.com:/srv/obj-current/sys/arch/i386/compile/GENERIC i386
Architecture: i386
Machine: i386
>Description:
For testing purposes I use qemu with tap networking.

    /etc/rc.conf:
        net_interfaces='fxp0 bridge0'

    /etc/ifconfig.bridge0:
        !ifconfig tap0 create up
        !ifconfig tap1 create up
        !ifconfig tap2 create up
        !ifconfig tap3 create up
        !brconfig bridge0 add fxp0 add tap0 add tap1 add tap2 add tap3

Qemu guests are run with the following scripts:

   qemu -no-acpi \
       -drive file=/srv/qemu/openbsd.img,index=0,media=disk,cache=none \
       -m 192 \
       -net nic,model=ne2k_pci,macaddr=00:06:80:A3:85:CE \
       -net tap,fd=3 -boot c "$@" 3<>/dev/tap1

   qemu -no-acpi \
       -hda /srv/qemu/dfly.img \
       -m 192 \
       -net nic,model=ne2k_pci,macaddr=02:07:83:A8:81:FF \
       -net tap,fd=3 \
       -boot c "$@" 3<>/dev/tap3

I started them one by one. They started successfully and, for some time,
were visible on my LAN.
However, after some uptime the network in the guest(s) disappeared.

    0 cheusov>ssh vm-openbsd uptime
    12:54PM  up 19 mins, 0 users, load averages: 0.07, 0.10, 0.08
    0 cheusov>ssh vm-dfly uptime
    ssh: connect to host vm-dfly port 22: Connection timed out
    255 cheusov>ssh vm-openbsd uptime
     1:02PM  up 26 mins, 0 users, load averages: 0.15, 0.11, 0.09
    0 cheusov>

I don't know whether this is a bug in the kernel or in qemu.
Restarting the guest OS doesn't help.

However, the following solves the problem:
   ifconfig tap3 destroy
   ifconfig tap3 create up
   brconfig bridge0 add tap3
   qemu ... 3<>/dev/tap3
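
The workaround above can be wrapped in a small helper. This is a minimal sketch only: `tap_recover_cmds` is a hypothetical name, and the function just prints the same three commands from the report in the same order, so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Sketch only: print the workaround commands for one tap interface.
# tap_recover_cmds is a hypothetical helper name, not part of NetBSD.
tap_recover_cmds() {
    tap=$1      # interface to recreate, e.g. tap3
    bridge=$2   # bridge it belongs to, e.g. bridge0
    if [ -z "$tap" ] || [ -z "$bridge" ]; then
        echo "usage: tap_recover_cmds <tap> <bridge>" >&2
        return 1
    fi
    # Same three steps, in the same order, as in the report.
    printf 'ifconfig %s destroy\n' "$tap"
    printf 'ifconfig %s create up\n' "$tap"
    printf 'brconfig %s add %s\n' "$bridge" "$tap"
}
```

For example, `tap_recover_cmds tap3 bridge0 | sh -x` would execute the steps, after which the guest has to be started again with its `3<>/dev/tap3` redirection as shown above.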

>How-To-Repeat:
See above
>Fix:
No idea

>Release-Note:

>Audit-Trail:
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org, netbsd-bugs@NetBSD.org
Subject: Re: kern/46199: 6.0_BETA. "qemu -net tap" + bridge(4). Network
 problems in guests.
Date: Thu, 15 Mar 2012 13:05:38 +0100


 would a down/up on tap3 fix the issue ?
 Also, displaying the interface queues could help (maybe netstat -id)

 -- 
 Manuel Bouyer <bouyer@antioche.eu.org>
      NetBSD: 26 ans d'experience feront toujours la difference
 --

From: Aleksey Cheusov <cheusov@tut.by>
To: Manuel Bouyer <bouyer@antioche.eu.org>
Cc: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org, gnats-admin@netbsd.org, 
	netbsd-bugs@netbsd.org
Subject: Re: kern/46199: 6.0_BETA. "qemu -net tap" + bridge(4). Network
 problems in guests.
Date: Thu, 15 Mar 2012 15:23:10 +0300

 > would a down/up on tap3 fix the issue ?

 No.

 > Also, displaying the interface queues could help (maybe netstat -id)

 Name  Mtu   Network       Address              Ipkts Ierrs    Opkts Oerrs Colls Drops
 tap3  1500  <Link>        f2:0b:a4:c7:dd:00      799     0    28722     0     0 138584
 tap3  1500  fe80::/64     fe80::f00b:a4ff:f      799     0    28722     0     0 138584

 > --
 > Manuel Bouyer <bouyer@antioche.eu.org>
 >      NetBSD: 26 ans d'experience feront toujours la difference
 > --

From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Aleksey Cheusov <cheusov@tut.by>
Cc: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
        netbsd-bugs@netbsd.org
Subject: Re: kern/46199: 6.0_BETA. "qemu -net tap" + bridge(4). Network
 problems in guests.
Date: Thu, 15 Mar 2012 13:43:38 +0100

 On Thu, Mar 15, 2012 at 03:23:10PM +0300, Aleksey Cheusov wrote:
 > > would a down/up on tap3 fix the issue ?
 > 
 > No.
 > 
 > > Also, displaying the interface queues could help (maybe netstat -id)
 > 
 > Name  Mtu   Network       Address              Ipkts Ierrs    Opkts  Oerrs Colls Drops
 > tap3  1500  <Link>        f2:0b:a4:c7:dd:00      799     0    28722   0     0 138584
 > tap3  1500  fe80::/64     fe80::f00b:a4ff:f      799     0    28722   0     0 138584

 So, lots of drops. Do you see Ipkts and Opkts moving when it's in this
 state ?

 -- 
 Manuel Bouyer <bouyer@antioche.eu.org>
      NetBSD: 26 ans d'experience feront toujours la difference
 --

From: Aleksey Cheusov <cheusov@tut.by>
To: Manuel Bouyer <bouyer@antioche.eu.org>
Cc: gnats-bugs@netbsd.org
Subject: Re: kern/46199: 6.0_BETA. "qemu -net tap" + bridge(4). Network
 problems in guests.
Date: Thu, 15 Mar 2012 15:52:22 +0300

 > So, lots of drops. Do you see Ipkts and Opkts moving when it's in this
 > state ?

 I'm not sure I understood your question. But Opkts remains the same,
 the number of drops increased significantly, and Ipkts increased a little.

 tap3  1500  <Link>        f2:0b:a4:c7:dd:00     1124     0    28722 0     0 201213
 tap3  1500  fe80::/64     fe80::f00b:a4ff:f     1124     0    28722 0     0 201213


From: Manuel Bouyer <bouyer@antioche.eu.org>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org, netbsd-bugs@NetBSD.org,
        cheusov@tut.by
Subject: Re: kern/46199: 6.0_BETA. "qemu -net tap" + bridge(4). Network
 problems in guests.
Date: Thu, 15 Mar 2012 15:59:15 +0100

 On Thu, Mar 15, 2012 at 12:55:02PM +0000, Aleksey Cheusov wrote:
 >  I'm not sure I understood your question. But Opkts remains the same,
 >  a number of drops increased significantly, Ipkts increased a little bit.
 >  
 >  tap3  1500  <Link>        f2:0b:a4:c7:dd:00     1124     0    28722    0     0 201213
 >  tap3  1500  fe80::/64     fe80::f00b:a4ff:f     1124     0    28722    0     0 201213

 OK, so this means packets sent to the VM make it to the tap interface,
 but nothing consumes them. The VM also sends packets, which are handled
 properly.
 The bug could be in tap(4), or in qemu not reading from the tap(4) descriptor
 to get the available packets (maybe ktrace could give more details here?).
 The problem doesn't seem to be at the bridge level.
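
The reading above rests on comparing two `netstat -id` samples for tap3. As a rough sketch, the comparison can be done mechanically. `tap_counter_delta` is a hypothetical helper, and the column order (Ipkts in field 5, Opkts in field 7, Drops in field 10 of the `<Link>` row) is assumed from the output quoted in this thread.

```shell
#!/bin/sh
# tap_counter_delta is a hypothetical helper. It reads `netstat -id`
# rows on stdin (older sample first), keeps only the "<Link>" rows, and
# prints how much Ipkts, Opkts and Drops changed from the first to the
# last of them.
# Assumed column order, as in the output quoted in this thread:
#   Name Mtu Network Address Ipkts Ierrs Opkts Oerrs Colls Drops
tap_counter_delta() {
    awk '
        $3 == "<Link>" {
            n++
            ip[n] = $5; op[n] = $7; dr[n] = $10
        }
        END {
            # Two samples are needed to compute a difference.
            if (n < 2) exit 1
            printf "Ipkts +%d Opkts +%d Drops +%d\n",
                ip[n] - ip[1], op[n] - op[1], dr[n] - dr[1]
        }
    '
}
```

Fed the two `<Link>` rows quoted earlier in this thread, it reports Ipkts +325, Opkts +0 and Drops +62629: the guest is still sending, while everything queued towards it is being dropped.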

 -- 
 Manuel Bouyer <bouyer@antioche.eu.org>
      NetBSD: 26 ans d'experience feront toujours la difference
 --

From: Manuel Bouyer <bouyer@antioche.eu.org>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org, netbsd-bugs@NetBSD.org,
        cheusov@tut.by
Subject: Re: kern/46199: 6.0_BETA. "qemu -net tap" + bridge(4). Network
 problems in guests.
Date: Thu, 15 Mar 2012 16:00:54 +0100

 On Thu, Mar 15, 2012 at 12:55:02PM +0000, Aleksey Cheusov wrote:
 >  I'm not sure I understood your question. But Opkts remains the same,
 >  a number of drops increased significantly, Ipkts increased a little bit.
 >  
 >  tap3  1500  <Link>        f2:0b:a4:c7:dd:00     1124     0    28722    0     0 201213
 >  tap3  1500  fe80::/64     fe80::f00b:a4ff:f     1124     0    28722    0     0 201213

 BTW, could you see if the affected tap device has OACTIVE flag set when
 in this state ?
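
One way to check this, sketched here under assumptions: `has_oactive` is a hypothetical helper that reads `ifconfig` output on stdin and succeeds only if OACTIVE appears in the `flags=...<...>` list. It assumes the usual comma-separated flag list between angle brackets.

```shell
#!/bin/sh
# has_oactive is a hypothetical helper: it reads `ifconfig <if>` output
# on stdin and exits 0 only if OACTIVE is one of the listed flags.
has_oactive() {
    # Extract the text between "<" and ">" on the flags line, split it
    # on commas, and look for an exact OACTIVE entry.
    sed -n 's/.*flags=[0-9a-fA-F]*<\([^>]*\)>.*/\1/p' |
        tr ',' '\n' |
        grep -qx 'OACTIVE'
}
```

For example: `ifconfig tap3 | has_oactive && echo "tap3 has OACTIVE set"`.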

 -- 
 Manuel Bouyer <bouyer@antioche.eu.org>
      NetBSD: 26 ans d'experience feront toujours la difference
 --

From: Aleksey Cheusov <cheusov@tut.by>
To: Manuel Bouyer <bouyer@antioche.eu.org>
Cc: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org, gnats-admin@netbsd.org, 
	netbsd-bugs@netbsd.org
Subject: Re: kern/46199: 6.0_BETA. "qemu -net tap" + bridge(4). Network
 problems in guests.
Date: Fri, 23 Mar 2012 18:31:50 +0300

 On Thu, Mar 15, 2012 at 6:00 PM, Manuel Bouyer <bouyer@antioche.eu.org> wrote:
 > On Thu, Mar 15, 2012 at 12:55:02PM +0000, Aleksey Cheusov wrote:
 >>  I'm not sure I understood your question. But Opkts remains the same,
 >>  a number of drops increased significantly, Ipkts increased a little bit.
 >>
 >>  tap3  1500  <Link>        f2:0b:a4:c7:dd:00     1124     0    28722    0     0 201213
 >>  tap3  1500  fe80::/64     fe80::f00b:a4ff:f     1124     0    28722    0     0 201213
 >
 > BTW, could you see if the affected tap device has OACTIVE flag set when
 > in this state ?

 Yes, the OACTIVE flag is set on the "dead" tap3.

 Also, I suspect that my ipf rules may be causing this problem.
 When I disabled them, I could not reproduce it;
 at least, two virtual machines ran for a day without problems.
 I'm not sure this is relevant, though.

From: Quentin Garnier <cube@cubidou.net>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: re: kern/46199
Date: Sun, 4 May 2014 01:30:48 +0000

 Hi,

 I believe this is the same issue as kern/47506.

 -- 
 Quentin Garnier - cube@cubidou.net
 "See the look on my face from staying too long in one place
 [...] every time the morning breaks I know I'm closer to falling"
 KT Tunstall, Saving My Face, Drastic Fantastic, 2007.

State-Changed-From-To: open->closed
State-Changed-By: cube@NetBSD.org
State-Changed-When: Tue, 03 Jun 2014 17:00:52 +0000
State-Changed-Why:
Fixed and pulled up to -6.


>Unformatted:
