NetBSD Problem Report #17723

Received: (qmail 7888 invoked by uid 605); 25 Jul 2002 19:38:39 -0000
Message-Id: <200207251938.g6PJc7U04036@mail.duh.org>
Date: Thu, 25 Jul 2002 15:38:07 -0400 (EDT)
From: tv@pobox.com
Sender: gnats-bugs-owner@netbsd.org
Reply-To: tv@pobox.com
To: gnats-bugs@gnats.netbsd.org
Subject: kernelized PPPoE assumes a ridiculously reliable link (needs config options)
X-Send-Pr-Version: 3.95

>Number:         17723
>Category:       kern
>Synopsis:       kernelized PPPoE assumes a ridiculously reliable link (needs config options)
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    martin
>State:          open
>Class:          change-request
>Submitter-Id:   net
>Arrival-Date:   Thu Jul 25 19:39:00 +0000 2002
>Closed-Date:    
>Last-Modified:  Tue Apr 01 18:38:06 +0000 2003
>Originator:     Todd Vierling
>Release:        NetBSD 1.6_BETA4
>Organization:

>Environment:

>Description:

The following are set in sys/net/if_pppoe.c:

#define PPPOE_DISC_TIMEOUT      (hz*5)  /* base for quick timeout calculation */
#define PPPOE_SLOW_RETRY        (hz*60) /* persistent retry interval */
#define PPPOE_DISC_MAXPADI      4       /* retry PADI four times (quickly) */
#define PPPOE_DISC_MAXPADR      2       /* retry PADR twice */

So the minimum separation between PADI packets is 5 seconds, and only 4 tries are made.
If those fail, we drop into the 1-minute retry cycle, which is annoyingly long for a
"nailed" link.  Additionally, the PADR is retried only twice, at a 5-second interval.
Lose just two of those packets, and we go back to the PADI cycle.
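
As a rough illustration of that timing (a simplified model, not the driver's actual
logic), the sketch below computes when each PADI would go out.  It assumes the first
four attempts are exactly 5 seconds apart and every later attempt follows its
predecessor by 60 seconds.  The helper name padi_send_time_s is hypothetical, and the
real code may stretch the quick timeout per retry (the comment on PPPOE_DISC_TIMEOUT
calls it a "base"), so treat the numbers as a lower bound.

```c
/* Values quoted in the report, expressed in seconds rather than ticks (hz). */
#define DISC_TIMEOUT_S  5u   /* PPPOE_DISC_TIMEOUT / hz */
#define SLOW_RETRY_S    60u  /* PPPOE_SLOW_RETRY / hz */
#define DISC_MAXPADI    4u   /* PPPOE_DISC_MAXPADI */

/*
 * Hypothetical helper: seconds after the first PADI at which the n-th PADI
 * (0-based) is sent, under the fixed-interval assumption described above.
 */
unsigned padi_send_time_s(unsigned n)
{
    if (n < DISC_MAXPADI)
        return n * DISC_TIMEOUT_S;          /* quick phase */

    /* Last quick attempt, then one slow interval per further attempt. */
    return (DISC_MAXPADI - 1u) * DISC_TIMEOUT_S
        + (n - (DISC_MAXPADI - 1u)) * SLOW_RETRY_S;
}
```

Under this model the fourth PADI leaves 15 seconds in, and after that the link gets
only one more chance per minute.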

In the Real World(tm), DSL isn't nearly so nice to data streams.  For me, the retry
values above result in the link taking up to 10 minutes to reestablish after a lost
connection[!].

Plus, one thing that makes my connections crap out at least four times a day is this
from sys/net/if_spppsubr.c:

#define MAXALIVECNT                     3       /* max. alive packets */

On a link that is in the middle of sending a burst of data, LCP packets can be lost
in the noise.  These are retried at 30-second intervals, so if we don't get an LCP
reply back in 30 seconds (**even with data actively flowing**), the connection is
dropped by NetBSD.  Ugh.

>How-To-Repeat:

Use a DSL link that is either flaky or carries enough traffic to drown out some LCP
packets.  Notice that NetBSD assumes a much higher (and usually unattainable) level
of data reliability from the telco.

>Fix:

Ideally, add pppoectl options to set these variables at runtime.  The one in
if_spppsubr.c might need to be a sysctl because of the global nature of that code.
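
A minimal sketch of what "runtime" could mean here, assuming the compile-time
constants become per-interface fields.  The struct and function names
(pppoe_tunables, pppoe_tunables_init, pppoe_tunables_set) are hypothetical and do
not exist in the tree; how pppoectl or sysctl would actually reach them is left open.

```c
#include <stdbool.h>

/* Hypothetical per-interface settings replacing the #defines. */
struct pppoe_tunables {
    unsigned disc_timeout_s;  /* was PPPOE_DISC_TIMEOUT (hz*5) */
    unsigned slow_retry_s;    /* was PPPOE_SLOW_RETRY (hz*60) */
    unsigned max_padi;        /* was PPPOE_DISC_MAXPADI */
    unsigned max_padr;        /* was PPPOE_DISC_MAXPADR */
};

/* Start from the values currently hard-coded in if_pppoe.c. */
void pppoe_tunables_init(struct pppoe_tunables *t)
{
    t->disc_timeout_s = 5;
    t->slow_retry_s = 60;
    t->max_padi = 4;
    t->max_padr = 2;
}

/*
 * Apply a requested configuration.  The "no zero values" rule is only a
 * sanity check chosen for this sketch; on rejection *t is left unchanged.
 */
bool pppoe_tunables_set(struct pppoe_tunables *t,
    const struct pppoe_tunables *req)
{
    if (req->disc_timeout_s == 0 || req->slow_retry_s == 0 ||
        req->max_padi == 0 || req->max_padr == 0)
        return false;
    *t = *req;
    return true;
}
```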

Also, possibly, reset the LCP Echo-Request count to zero if any data has been
received on the link.  LCP Echo-Requests we sent may not have been seen, but we know
at least that there *is* data coming in, so we shouldn't simply drop the link based
on lack of LCP Echo-Reply.
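
The idea can be sketched with a toy model of the keepalive counter.  This is not the
sppp code: struct link_model and the link_* functions are made up for illustration,
and only the MAXALIVECNT value comes from the source quoted above.  The point is the
last function, which treats any received data as proof of life.

```c
#include <stdbool.h>

#define MAXALIVECNT 3  /* value quoted from sys/net/if_spppsubr.c */

/* Toy model of one link's keepalive state (hypothetical). */
struct link_model {
    unsigned alive_cnt;  /* Echo-Requests sent without an answer */
    bool up;
};

void link_init(struct link_model *l)
{
    l->alive_cnt = 0;
    l->up = true;
}

/* Periodic keepalive tick: give up once too many requests went unanswered. */
void link_keepalive_tick(struct link_model *l)
{
    if (!l->up)
        return;
    if (l->alive_cnt >= MAXALIVECNT) {
        l->up = false;   /* link declared dead */
        return;
    }
    l->alive_cnt++;      /* one more Echo-Request outstanding */
}

/* An Echo-Reply arrived: the peer is alive. */
void link_echo_reply(struct link_model *l)
{
    l->alive_cnt = 0;
}

/* Proposed change: any inbound data also counts as a sign of life. */
void link_data_received(struct link_model *l)
{
    l->alive_cnt = 0;
}
```

With link_data_received() called from the receive path, a busy link that loses a few
echo packets never reaches the limit, while a truly silent link still times out.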
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: kern-bug-people->martin 
Responsible-Changed-By: perry 
Responsible-Changed-When: Tue Apr 1 10:37:24 PST 2003 
Responsible-Changed-Why:  
Martin handles ISDN and PPPoE 
>Unformatted:
