NetBSD Problem Report #47424

From kre@munnari.OZ.AU  Wed Jan  9 06:04:03 2013
Return-Path: <kre@munnari.OZ.AU>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
	by www.NetBSD.org (Postfix) with ESMTP id 26F9163EBD5
	for <gnats-bugs@gnats.NetBSD.org>; Wed,  9 Jan 2013 06:04:03 +0000 (UTC)
Message-Id: <201301090602.r0962gxa013730@jade.coe.psu.ac.th>
Date: Wed, 9 Jan 2013 13:02:42 +0700 (ICT)
From: kre@munnari.OZ.AU
To: gnats-bugs@gnats.NetBSD.org
Subject: pkgsrc "make fetch" fails to fetch fotoxx-13.01.tar.gz for graphics/fotoxx
X-Send-Pr-Version: 3.95

>Number:         47424
>Category:       bin
>Synopsis:       pkgsrc "make fetch" fails to fetch fotoxx-13.01.tar.gz for graphics/fotoxx
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    bin-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Jan 09 06:05:00 +0000 2013
>Closed-Date:    
>Last-Modified:  Sat Jan 12 03:10:00 +0000 2013
>Originator:     Robert Elz
>Release:        NetBSD 5.1_STABLE   (pkgsrc current 2013-01-09)
>Organization:
	Prince of Songkla University
>Environment:
System: NetBSD jade.coe.psu.ac.th 5.1_STABLE NetBSD 5.1_STABLE (JADE-1.12-20120130) #27: Tue Jan 31 05:20:31 ICT 2012 kre@jade.coe.psu.ac.th:/usr/obj/5/kernels/i386/JADE i386
Architecture: i386
Machine: i386
>Description:
	For some undetermined (so far) reason, a "make fetch" (or "make
	checksum") in graphics/fotoxx stalls after fetching 2129920 of the
	expected 2131822 bytes that are in the file.

	wget fetches the file correctly (after fetching that way the size
	and checksum are as expected).   So does ftp, if it is left long
	enough (the pkgsrc fetch times out after stalling for 121 seconds,
	which is apparently not long enough for those last 1902 bytes to
	arrive).
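
The byte counts above can be cross-checked with a few lines of arithmetic. This is only a sanity check on the numbers quoted in this report. The 32 KiB block size tried below is an assumption for illustration, not a confirmed property of ftp(1).

```python
# Byte counts quoted in the report.
EXPECTED_SIZE = 2131822   # full size of fotoxx-13.01.tar.gz
STALLED_AT = 2129920      # bytes on disk when the transfer stalls

missing = EXPECTED_SIZE - STALLED_AT
kib_written = STALLED_AT / 1024
# Assumption: try a 32 KiB block size to see whether the stall point is block-aligned.
whole_32k_blocks, remainder = divmod(STALLED_AT, 32 * 1024)

print(missing)                      # 1902
print(kib_written)                  # 2080.0
print(whole_32k_blocks, remainder)  # 65 0
```

The stall point is an exact multiple of 32 KiB, which would be consistent with only whole buffers having reached the file, though nothing here proves that is the cause.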

	f.n.b currently has a fotoxx-13.01.tar.gz that is (exactly) 32KB
	in its distfiles directory (probably caused by a transfer that
	failed in a similar way).   That needs to be removed and the transfer
	redone.

	I have no love (to put it mildly) for using the http protocol to
	fetch files (a http:// url), but there must be something broken
	in the ftp client (in NetBSD 5, and current) that causes the fetch
	to stall (very repeatably) at that point.

	I tried the same thing using NetBSD current (amd64) (well, 6.99.15
	from early December, so not quite current) - it also stalled at
	the same point, and also eventually recovered.

>How-To-Repeat:
	Using NetBSD 5, attempt ...

   ftp http://www.kornelix.com/uploads/1/3/0/3/13035936/fotoxx-13.01.tar.gz

	Watch it get to 2080KiB (2129920 bytes) and stall - then just wait,
	a fairly long time, and it will complete.  Try again
	using wget, and observe it complete correctly, and quickly.

	Try using "make fetch" and observe pkgsrc detect the stalled ftp
	and kill it before it has a chance to finish.

>Fix:
	No idea at the minute (obvious workaround would be to add a
	"FETCH_USING" or whatever it is so wget is always used, or do
	something to alter the timeout) - but whatever is making ftp
	behave differently than wget here really needs to be fixed.

>Release-Note:

>Audit-Trail:
From: Thomas Klausner <wiz@NetBSD.org>
To: NetBSD bugtracking <gnats-bugs@NetBSD.org>
Cc: 
Subject: Re: pkg/47424: pkgsrc "make fetch" fails to fetch
 fotoxx-13.01.tar.gz for graphics/fotoxx
Date: Wed, 9 Jan 2013 09:17:55 +0100

 On Wed, Jan 09, 2013 at 06:05:01AM +0000, kre@munnari.OZ.AU wrote:
 > 	For some undetermined (so far) reason, a "make fetch" (or "make
 > 	checksum") in graphics/fotoxx stalls after fetching 2129920 of the
 > 	expected 2131822 bytes that are in the file.
 > 
 > 	wget fetches the file correctly (after fetching that way the size
 > 	and checksum are as expected).   So does ftp, if it is left long
 > 	enough (the pkgsrc fetch times out after stalling for 121 seconds,
 > 	which is apparently not long enough for those last 1902 bytes to
 > 	arrive).

 I see the same behaviour on 6.99.16/amd64. Perhaps a bug report for
 ftp(1) is in order?

 > 	f.n.b currently has a fotoxx-13.01.tar.gz that is (exactly) 32KB
 > 	in its distfiles directory (probably caused by a transfer that
 > 	failed in a similar way).   That needs to be removed and the transfer
 > 	redone.

 I've replaced the file on nbftp.
  Thomas

From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: pkg/47424: pkgsrc "make fetch" fails to fetch fotoxx-13.01.tar.gz for graphics/fotoxx
Date: Wed, 09 Jan 2013 18:58:18 +0700

     Date:        Wed,  9 Jan 2013 08:20:06 +0000 (UTC)
     From:        Thomas Klausner <wiz@NetBSD.org>
     Message-ID:  <20130109082006.A425663EBD5@www.NetBSD.org>

   |  I see the same behaviour on 6.99.16/amd64. Perhaps a bug report for
   |  ftp(1) is in order?

 Perhaps, though ftp does (eventually) work ... I'll try to analyse
 what is actually happening before calling it an ftp bug.

 pkgsrc did need some assistance though (which is why the PR there).

   |  I've replaced the file on nbftp.

 Great, thanks, in that case you can close the PR, as pkgsrc will
 fail over to that one (if the ftp stalls and times out) and the
 fetch from f.n.o should work fine as a backup.

 kre

State-Changed-From-To: open->closed
State-Changed-By: wiz@NetBSD.org
State-Changed-When: Wed, 09 Jan 2013 18:17:26 +0000
State-Changed-Why:
Distfile on nbftp is enough for submitter; he'll investigate ftp(1) further.
Thanks for the PR!


From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: pkg/47424 (pkgsrc "make fetch" fails to fetch fotoxx-13.01.tar.gz for graphics/fotoxx)
Date: Fri, 11 Jan 2013 19:15:47 +0700

     Date:        Wed,  9 Jan 2013 18:17:28 +0000 (UTC)
     From:        wiz@NetBSD.org
     Message-ID:  <20130109181728.B809F63ED1A@www.NetBSD.org>

   | Distfile on nbftp is enough for submitter; he'll investigate ftp(1) further.

 I have done, and I don't think it is a bug in ftp(1), I think it is a bug
 in the HTTP server that's serving the file (though I am certainly no
 expert on the intricacies  of HTTP file transfers, nor do I particularly
 want to become one...)

 The difference between using ftp and wget (or so it appears to me) is that
 the http client in ftp(1) does

 	Connection: close

 in the header of the GET request, and wget does

 	Connection: Keep-Alive

 instead.   The server (at www.kornelix.com) (which appears to be Apache, though
 I have no idea which version) responds to the wget request with a header that
 includes

 	Content-Length: 2131822
 	Connection: Keep-Alive

 whereas with the ftp request, the first of those is present in the reply
 header, but there is no "Connection:" field at all.

 Either way, in both cases, the server seems to send the data in the file,
 and then stop (implementing keep-alive type connections).  NetBSD's ftp
 client is assuming it will get a FIN to conclude the transfer, so after
 all the data has arrived, it just sits and waits for that FIN.  The server
 on the other hand appears to be waiting for the next request.

 Deadlock...

 Eventually (after almost 3 minutes) I assume their server gets tired
 of waiting for a new request, and closes the connection.  That (finally)
 provides the FIN that ftp(1) has been waiting for, it is happy, and the
 connection completes properly.
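
The stall described above can be reproduced locally without the real server. The following is a minimal sketch, assuming a `socket.socketpair()` stands in for the client/server TCP connection. `read_until_eof` is a hypothetical helper written for this illustration, not code from ftp(1). The "server" end sends a complete response and then simply keeps the connection open.

```python
import socket

def read_until_eof(sock, idle_timeout):
    """Read until the peer closes (recv returns b"") or idle_timeout
    seconds pass with no data.  Returns (data, saw_eof)."""
    sock.settimeout(idle_timeout)
    chunks = []
    while True:
        try:
            chunk = sock.recv(4096)
        except socket.timeout:
            return b"".join(chunks), False   # stalled: no FIN arrived in time
        if not chunk:
            return b"".join(chunks), True    # peer closed: clean end of transfer
        chunks.append(chunk)

body = b"x" * 1000
response = (b"HTTP/1.1 200 OK\r\nContent-Length: %d\r\n\r\n" % len(body)) + body

client, server = socket.socketpair()
server.sendall(response)      # all the data is sent, but the connection stays open
data, saw_eof = read_until_eof(client, idle_timeout=0.2)
print(len(data) == len(response), saw_eof)   # True False

server.close()                # the "server gets tired of waiting" moment
_, saw_eof_after_close = read_until_eof(client, idle_timeout=0.2)
print(saw_eof_after_close)    # True
client.close()
```

The first call has every byte in hand yet still reports a stall, because it is waiting for a close that the other end has no intention of sending until its own idle timer expires.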

 But that 3 minutes is too much for pkgsrc, which tells ftp to give up
 after 2 minutes.   When that happens, ftp never bothers to flush the last
 (partially filled) buffer, and just aborts, leaving those final 1902
 bytes missing from the file (if there's any bug in ftp(1) it would be
 that, it did receive the data, it could have written it to the file before
 quitting, and had it done so, pkgsrc would have verified the file checksum,
 I guess, if not on the same attempt as when ftp failed, then next time
 it went to look and found the file already existing) and all would have been
 OK.   But demanding that processes clean up fully when they are failing,
 is probably too much to expect.
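
As an illustration of the "write what you already have" idea, here is a minimal sketch of a buffered copy loop that flushes its partial buffer even when the read side fails. It does not reflect how ftp(1) is actually structured. `copy_with_flush`, `make_stalling_source` and the 32 KiB buffer size are all assumptions made for this example.

```python
import io

def copy_with_flush(read_chunk, out, bufsize=32 * 1024):
    """Copy from read_chunk() to out in bufsize blocks.  Whatever is still
    buffered is written out even if read_chunk() raises."""
    buf = bytearray()
    try:
        while True:
            chunk = read_chunk()
            if not chunk:
                break
            buf += chunk
            while len(buf) >= bufsize:
                out.write(buf[:bufsize])
                del buf[:bufsize]
    finally:
        if buf:
            out.write(buf)   # flush the partially filled final buffer

def make_stalling_source(payload, chunk_size):
    """Hypothetical source: hands out payload in chunks, then raises
    TimeoutError instead of signalling end of file."""
    pos = 0
    def read_chunk():
        nonlocal pos
        if pos >= len(payload):
            raise TimeoutError("no FIN from server")
        chunk = payload[pos:pos + chunk_size]
        pos += len(chunk)
        return chunk
    return read_chunk

payload = bytes(70000)       # two full 32 KiB buffers plus a 4464-byte tail
out = io.BytesIO()
try:
    copy_with_flush(make_stalling_source(payload, 1460), out)
except TimeoutError:
    pass                     # the transfer "failed", but the data was kept
print(len(out.getvalue()))   # 70000
```

With the flush in place the output holds the complete payload despite the timeout, so a later checksum pass could still accept the file.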

 With wget, the client TCP (ie: my system running wget) sends the first FIN,
 immediately after receiving the end of the file data, which is what you'd
 expect with a client doing Keep-Alive (why it bothers when it has only one
 file to fetch I have no idea, but it does, I guess just to be consistent with
 usages when it is fetching entire trees of files.)

 The real problem here appears to be the HTTP server that seems to be
 implementing keep-alive connection mode, when just the opposite was
 requested by the client.

 I have complete tcpdump (binary form) dumps of the two transactions,
 if anyone else (someone who speaks more http than I do) would like to
 take a look and confirm (or refute) my analysis.  They're each about 2.3MB
 big (and probably do not compress much, as most of that will be the file
 data, which was a .gz file, though I haven't tried to see).

 I can e-mail one or both of those, or make them available for ftp (but
 not for http...) if someone would like to take a look - or this seems to
 be consistently repeatable enough, that you could make your own trace
 (other than the 3 minute timeout, actually about 150 secs idle) all of
 this happens fairly quickly, it is not a very big file to transfer.

 kre

From: David Holland <dholland-pbugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: pkg/47424 (pkgsrc "make fetch" fails to fetch
 fotoxx-13.01.tar.gz for graphics/fotoxx)
Date: Fri, 11 Jan 2013 17:42:05 +0000

 On Fri, Jan 11, 2013 at 12:35:07PM +0000, Robert Elz wrote:
  >  But that 3 minutes is too much for pkgsrc, which tells ftp to give
  >  up after 2 minutes.  When that happens, ftp never bothers to flush
  >  the last (partially filled) buffer, and just aborts, leaving those
  >  final 1902 bytes missing from the file (if there's any bug in
  >  ftp(1) it would be that, it did receive the data, it could have
  >  written it to the file before quitting, and had it done so, pkgsrc
  >  would have verified the file checksum, I guess, if not on the same
  >  attempt as when ftp failed, then next time it went to look and
  >  found the file already existing) and all would have been OK.  But
  >  demanding that processes clean up fully when they are failing, is
  >  probably too much to expect.

 That is at least one and maybe two bugs; ftp should write out the data
 it has and not throw it away... and also, it should be capable of
 noticing that it's received the entire Content-Length and proceeding
 accordingly rather than timing out.
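
A rough sketch of the second suggestion: a client that stops reading once it has Content-Length bytes of body, and only falls back to waiting for the close when the header is absent. `fetch_body` is a hypothetical function written for this illustration, with deliberately simplistic header parsing. It is not a proposed patch to ftp(1). A `socket.socketpair()` again stands in for the real connection.

```python
import socket

def fetch_body(sock, idle_timeout=5.0):
    """Read one HTTP response from sock.  Stop after Content-Length bytes
    of body if that header is present, otherwise read until EOF."""
    sock.settimeout(idle_timeout)
    data = b""
    while b"\r\n\r\n" not in data:
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("connection closed before headers were complete")
        data += chunk
    head, _, body = data.partition(b"\r\n\r\n")

    length = None
    for line in head.split(b"\r\n")[1:]:          # skip the status line
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())

    # With a known length, stop as soon as enough bytes have arrived.
    while length is None or len(body) < length:
        chunk = sock.recv(4096)
        if not chunk:
            break                                 # peer closed the connection
        body += chunk
    return body if length is None else body[:length]

client, server = socket.socketpair()
payload = b"z" * 1902
server.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 1902\r\n\r\n" + payload)
# The server end stays open here, as in the report: no FIN is sent.
body = fetch_body(client, idle_timeout=1.0)
print(len(body))   # 1902
client.close()
server.close()
```

The call returns as soon as the announced number of bytes is in hand, without depending on the server to close the connection.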

 -- 
 David A. Holland
 dholland@netbsd.org

From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: pkg/47424 (pkgsrc "make fetch" fails to fetch fotoxx-13.01.tar.gz for graphics/fotoxx)
Date: Sat, 12 Jan 2013 07:21:14 +0700

     Date:        Fri, 11 Jan 2013 17:45:02 +0000 (UTC)
     From:        David Holland <dholland-pbugs@NetBSD.org>
     Message-ID:  <20130111174502.DD0A063C07C@www.NetBSD.org>

   |  That is at least one and maybe two bugs; ftp should write out the data
   |  it has and not throw it away...

 Yes, possibly.

   |  and also, it should be capable of
   |  noticing that it's received the entire Content-Length and proceeding
   |  accordingly rather than timing out.

 I'll leave it up to someone more familiar with the HTTP spec (like someone
 who has actually read it, rather than just reading about it) to determine
 what is correct behaviour when the client requests that end of transfer be
 signalled by closing the connection (original HTTP 1.0 behaviour) but the
 server wants to implement connection keep-alive (so the client can either
 reuse the connection, or otherwise initiate the close, so it gets TIME_WAIT
 state rather than the server).

 kre

Responsible-Changed-From-To: pkg-manager->bin-bug-people
Responsible-Changed-By: dholland@NetBSD.org
Responsible-Changed-When: Sat, 12 Jan 2013 03:10:00 +0000
Responsible-Changed-Why:
A problem exists in ftp(1).


State-Changed-From-To: closed->open
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Sat, 12 Jan 2013 03:10:00 +0000
State-Changed-Why:
.


>Unformatted:
