NetBSD Problem Report #52432

From greywolf@starwolf.com  Wed Jul 26 21:28:40 2017
Return-Path: <greywolf@starwolf.com>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (verified OK))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 29E207A1FA
	for <gnats-bugs@gnats.NetBSD.org>; Wed, 26 Jul 2017 21:28:40 +0000 (UTC)
Message-Id: <20170726212838.9DA3A1A4BD@eddie.starwolf.com>
Date: Wed, 26 Jul 2017 14:28:38 -0700 (PDT)
From: greywolf@starwolf.com
Reply-To: greywolf@starwolf.com
To: gnats-bugs@NetBSD.org
Subject: /etc/dumpdates format egregious
X-Send-Pr-Version: 3.95

>Number:         52432
>Category:       bin
>Synopsis:       /etc/dumpdates format egregious
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    bin-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Jul 26 21:30:00 +0000 2017
>Last-Modified:  Sat Aug 05 05:05:00 +0000 2017
>Originator:     Greywolf
>Release:        NetBSD 8.99.1
>Organization:

				--*greywolf;
>Environment:
System: NetBSD eddie.starwolf.com 8.99.1 NetBSD 8.99.1 (EDDIE) #11: Fri Jul 21 13:55:41 PDT 2017 greywolf@eddie.starwolf.com:/sys/arch/amd64/compile/EDDIE amd64
Architecture: x86_64
Machine: amd64
>Description:
	/etc/dumpdates: padding device out to 511 chars seems excessive
>How-To-Repeat:
	do a dump, check /etc/dumpdates
>Fix:
	Why not just use "%s" instead of "%-511s", considering that on read-
	back, "%s" would be simpler for scanf?  The padding just seems a bit
	egregiously long (it used to be 32 chars).  Are there devices with
	white space in them such that we need to be this cautious on reading
	back the device name?

	[this bug report may also be egregious, but it seemed to me worth
	 noting.  I have had occasion to edit /etc/dumpdates to recover full
	 + incremental backup sets (read: I was stupid and forgot the
	 'u' flag).]

>Audit-Trail:
From: christos@zoulas.com (Christos Zoulas)
To: gnats-bugs@NetBSD.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc: 
Subject: Re: bin/52432: /etc/dumpdates format egregious
Date: Wed, 26 Jul 2017 22:55:37 -0400

 On Jul 26,  9:30pm, greywolf@starwolf.com (greywolf@starwolf.com) wrote:
 -- Subject: bin/52432: /etc/dumpdates format egregious

 Because %s can cause a buffer overflow. The better solution is to stop
 using scanf... This started with:
 https://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=50434
 Unfortunately DUMP{IN,OUT}FMT is the "api".

 christos

From: Robert Elz <kre@munnari.OZ.AU>
To: christos@zoulas.com (Christos Zoulas)
Cc: gnats-bugs@NetBSD.org, netbsd-bugs@netbsd.org
Subject: Re: bin/52432: /etc/dumpdates format egregious
Date: Thu, 27 Jul 2017 12:14:42 +0700

 The dumpdates file (and all associated with it) needs lots of work.
 It is close to hopeless when dumping filesystems on wedges (what good is
 it knowing that /dev/dk3 was last dumped... when dk3 might be something
 entirely different when the next dump happens.)

 What's more, it dates back to the days when dumps were all done to tape,
 and there would be one set of tapes for each filesystem, and (provided
 that one can map device names to filesystems) it worked OK for keeping
 track of what had been dumped where.

 These days, I'd guess that more people do dumps like I do, onto a removable
 USB drive - each one of those can handle lots of dumps, for many filesystems,
 and for many dump cycles ... but doing just that means that there's not
 a lot of redundancy (in the tape days, if a tape was found to be bad when
 a filesystem restore was needed, you simply picked the next most recent,
 usually the day before...)

 I have a bunch of USB drives I dumps to, and to make that rational, each
 has its own dump sequence (so I do a full (level 0) dump, then some
 incrementals, then another full dump, ...) but not to the same device each
 time, I flip around which drive I am using so no one gets used twice in a
 row, so if one of those fails, there's always the other one, or another...

 To make that work with dump as it is, my script has to look and see which
 dumpdates file is in /etc and if it is the wrong one, copy the one that
 belongs to the dump sequence for the output device being used for this set
 of dumps ... it would be much simpler if I could just tell dump where the
 dumpdates file is located .. and then use the one in /etc merely to keep
 track of the last dumps for each filesystem, not for working out which date
 to go back until when doing an incremental dump (ie: provide the data for
 dump W to use, and nothing else.)

 I have been meaning to propose some radical work on dumpdates for ages.

 I do not buy the "it is the API" for a second, the only thing that uses
 dumpdates is dump (restore doesn't) change dump (keep it able to read the
 old format) and we can make the dumpdates file contain any format that
 makes sense, which the current one does not.

 kre

From: Christos Zoulas <christos@zoulas.com>
To: Robert Elz <kre@munnari.OZ.AU>
Cc: gnats-bugs@NetBSD.org, netbsd-bugs@netbsd.org
Subject: Re: bin/52432: /etc/dumpdates format egregious
Date: Thu, 27 Jul 2017 10:07:38 +0300

 That sounds like a nice and useful project. I thought that other programs in=
  pkgsrc might consult dumpdates, but I think it does not matter much since t=
 he format has outlived its usefulness.

 christos=

From: Robert Elz <kre@munnari.OZ.AU>
To: Christos Zoulas <christos@zoulas.com>
Cc: gnats-bugs@NetBSD.org, netbsd-bugs@netbsd.org
Subject: Re: bin/52432: /etc/dumpdates format egregious
Date: Fri, 28 Jul 2017 02:27:06 +0700

     Date:        Thu, 27 Jul 2017 10:07:38 +0300
     From:        Christos Zoulas <christos@zoulas.com>
     Message-ID:  <DB18CE59-8745-4418-9DB6-8058AFE75A29@zoulas.com>

   | That sounds like a nice and useful project.

 Yes, I think so - what is holding it up at the minute (aside from designing
 a new format for dump's data - which I would defer right now, just to avoid
 the "use XML", "no use json", "no use plist", "it must be ASN1", just use
 plain text... discussions - after all we can iterate that a few times, there
 is not that much data to record) is that we need some kind of permanent file
 system identification in *every* filesystem dump (and clones, like dump_lfs)
 is capable of dumping.   Neither the mount point, nor the device name
 (incl the device dev_t) is suitable.   It needs to reside inside the filesystem
 so raw copies get the same value.    Some kind of GUID is probably needed,
 but it needs to be able to be retro-fitted backwards to the older filesystems
 (obviously, without breaking them).

 What's needed to make that happen I have not looked into yet.   Once a type
 and place to store it is identified (the place need not be the same for
 different filesystems) then newfs (and equivs) and perhaps even dump itself,
 or a new short-life utility just for the purpose, can supply the value,
 for filesystems that dont already have something suitable.

   | I thought that other programs in pkgsrc might consult dumpdates,

 If there's anything that uses it for dump W type purposes, we can easily
 keep an (essentially useless) /etc/dumpdates for that purpose, and it can
 be kept as accurate (or inaccurate if you prefer) as it is now.

 If there is anything there (aside from a ported dump(8)) which tries
 modifying the format, or even using it to work out what would need to be
 dumped, then it deserves breaking (it already is, really.)

 kre

From: Greywolf <greywolf@starwolf.com>
To: 
Cc: gnats-bugs@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: bin/52432: /etc/dumpdates format egregious
Date: Thu, 3 Aug 2017 19:54:26 -0700

 Hi, kre, christos, others, pardon for injecting myself into the
 discussion, but I'm following this PR with some interest.

 I'm not sure if what I say here is completely orthogonal to the
 discussion or not.

 I'm hoping the concept of a 'dump level' doesn't go away, unless
 it were to be replaced with 'f <date>', 'i <date>', or 'd <date>'
 which is really all 0, 1-9, and (n+1,n++) translate into anyway,
 but that could result in a runaway dump log (which levels nicely
 limits).

 [I might point out that there may be scripts out there which rely
 on levels 0-9 *coughmaybeevenminecough*, so shifting away from
 that may be temporarily uncomfortable.]

 Having a GUID-ish (but optionally user-selectable) DevID attached
 to each device sounds to me like a workable idea and it would
 render /etc/dumpdates considerably less useless. My question here
 would be: Is there a provision inside FFS allowing for an identi-
 fier?

 Tangentially, kre, you mentioned "designing a new format for dump's
 data" -- could you clarify the intent there? Did you mean the
 dump timestamp metadata (dumpdates), or did you mean the entire TOC
 format (inode maps)? [this can be taken off-line from this discussion.]

 You also mentioned the place *whither* you back up your filesystems via
 dump. I'm unclear on why the output device for the dump has anything to
 do with this -- dumpdates only references the last dump of an INput
 device. Restore uses restoresymtable to determine whether or not the
 incremental restore is of the proper date.

 Thank you both for chiming in on this. I'd forgotten about the
 vulnerability of "%s" respective to *scanf().

 --*greywolf;

From: Robert Elz <kre@munnari.OZ.AU>
To: 
Cc: gnats-bugs@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: bin/52432: /etc/dumpdates format egregious
Date: Fri, 04 Aug 2017 19:06:09 +0700

     Date:        Thu, 3 Aug 2017 13:48:51 -0700
     From:        Greywolf <greywolf@starwolf.com>
     Message-ID:  <9e2d4133-fab5-bf31-d226-4fdc07ce5e11@starwolf.com>

   | Hi, kre, christos, others,

 Originally:
 	I think you only sent this to Christos and me, so there
 	are no others...

 And now the message has been sent to gnats, and netbsd-bugs, here is the
 rest of my reply (verbatim), for gnats, and netbsd-bugs.

   | I'm hoping the concept of a 'dump level' doesn't go away,

 Definitely not in anything I am planning.   It isn't levels that are the
 problem.

   | Having a GUID-ish (but optionally user-selectable) DevID attached
   | to each device sounds to me like a workable idea and it would
   | render /etc/dumpdates considerably less useless.

 That is (more or less) the hope, yes.   Of course, right now this is not
 much more than a dream, as:

   | My question here would be: Is there a provision inside FFS allowing
   | for an identifier?

 And that is the question (or part of it, FFS isn't the only filesystem,
 LFS needs to be handled too - and I suspect anything that follows the
 inode/directory model (ie: cd9660, UDF, and msdosfs are irrelevant...)

 That part needs to be investigated, and solved, before doing anything else.

   | Tangentially, kre, you mentioned "designing a new format for dump's
   | data" -- could you clarify the intent there? Did you mean the
   | dump timestamp metadata (dumpdates),

 Yes, just that.

   | or did you mean the entire TOC format (inode maps)?

 No, not that, I have lots of old dumps, one day one of them might be
 needed, I don't want to change that at all (certainly not because of
 dumpdates issues.)

   | [this can be taken off-line from this discussion.]

 You already did,  even if not intentionally.

   | You also mentioned the place *whither* you back up your filesystems via
   | dump. I'm unclear on why the output device for the dump has anything to
   | do with this -- dumpdates only references the last dump of an INput
   | device.

 In a way it doesn't - though being able to find the dump (eg: "I know from
 dumpdates or whatever replaces it that there was a level 4 dump done two
 weeks ago - but where is it?") but that was more along the lines of keeping
 the dumpdates file with (somewhere near) the dumps (so if you have found
 a dumpdates file, you have also found the relevant dumps).

 But the version that gets kept in /etc/dumpdates could perhaps contain
 some (optional) extra "where" info - and because that is so variable,
 probably just in the form of a freeform string supplied as an arg to dump - we
 don't have the technology to read the sticky label stuck on a tape, which
 is what would be needed to fully automate tracking, and since I think most
 people dump into a pipe to their favourite compression prog these days,
 dump generally has no idea, and no way to find out, where its output
 eventually ends up.

   | Restore uses restoresymtable to determine whether or not the
   | incremental restore is of the proper date.

 That file is more for deleting files that were deleted before one dump
 and the incremental that follows, if it also serves to check that the
 correct incremental is being restored, that's news ... but I am not sure
 how it really can (except perhaps to make sure that one from earlier
 than the previous restore is not attempted) - it is built from the
 contents of the level N dump when it is restored, and so can only possibly
 have info that existed when that dump was done - what the next dump level
 would be, for the next incremental, cannot possibly be known then, nor is
 it really possible to tell if a level is skipped in the restore sequence,
 as only the changes get dumped, and all the changes do, but when we come
 to a restore, we (ie: restore) cannot tell if a dir/inode is not on the
 backup because it was never changed (between previous and current) or
 whether it was changed, but backed up onto an intermediate level dump that
 happened between the one that has already been restored, and the one that
 is now about to be - except in some quite rare cases (that is, nothing that
 we can rely upon.)

   | Thank you both for chiming in on this. I'd forgotten about the
   | vulnerability of "%s" respective to *scanf().

 That actually supplies a rationale for limiting the scanf() length, which
 might then provide a reason for truncating the output file (%.511s) but
 it really provides no reason at all for the padding, which is really just
 (relatively) harmless meaningless white space (scanf simply skips it).
 (and yes, I agree, it is ugly, though I rarely, if ever, manually look in
 a dumpdates file to see it...)

 That change should probably be undone, but is not important enough if the
 whole dumpdates format is to be revisited anyway.

 But since the filesystem additions to make this even worth starting might
 take a while, perhaps we should just delete the "-511", or change it to
 (%.511s, though the idea of truncating the name is kind of ugly, I'd rather
 rewrite the reading function and get rid of the scanf parsing of the device
 name, sscanf(string, %s) is, after all, just index(string, ' ') followed by
 a strncpy()) from DUMPOUTFMT in protocols/dumprestore.h ... not even sure
 why this stuff is in that file, it is really private to dump, restore
 certainly never touches it.   scanf is convenient for the rest of each line,
 but it could start after the device name has been manually extracted, into
 memory that is malloc'd to be big enough - however big that is.   NAME_MAX
 is a crock.

 kre



From: Greywolf <greywolf@starwolf.com>
To: Robert Elz <kre@munnari.oz.au>, Christos Zoulas <christos@zoulas.com>
Cc: netbsd-bugs@netbsd.org, gnats-bugs@netbsd.org
Subject: Re: bin/52432: /etc/dumpdates format egregious
Date: Fri, 4 Aug 2017 18:43:30 -0700

 Hi, Robert,

 [I wrote]:
 | > Restore uses restoresymtable to determine whether or not the
 | > incremental restore is of the proper date.

 [You replied]:
 |  That file is more for deleting files that were deleted before one dump
 |  and the incremental that follows, if it also serves to check that the
 |  correct incremental is being restored, that's news ... but I am not sure
 |  how it really can (except perhaps to make sure that one from earlier
 |  than the previous restore is not attempted) - it is built from the
 |  contents of the level N dump when it is restored, and so can only possibly
 |  have info that existed when that dump was done - what the next dump level
 |  would be, for the next incremental, cannot possibly be known then, nor is
 |  it really possible to tell if a level is skipped in the restore sequence,
 |  as only the changes get dumped, and all the changes do, but when we come
 |  to a restore, we (ie: restore) cannot tell if a dir/inode is not on the
 |  backup because it was never changed (between previous and current) or
 |  whether it was changed, but backed up onto an intermediate level dump that
 |  happened between the one that has already been restored, and the one that
 |  is now about to be - except in some quite rare cases (that is, nothing that
 |  we can rely upon.)

 Here's my understanding:

 The restoresymtable contains, among the inode map and such:

 * the device which was dumped,
 * possibly the filesystem name associated with the device (if it can figure
    it out),
 * possibly a dump label (optional),
 * the level of the dump last restored from,
 * the date of said dump (the day the files were dumped on),
 * the date SINCE files were dumped (i.e. the date of the previous dump,
    which would have shown on the screen during the dump as
    "Date of last level <X> dump:").

 If you do a level 0 full restore (-r), with the intention of doing an
 incremental restore (handling deletions/moves), your next restore MUST
 come from a dump whose last-dumped-from date matches the date of the
 level 0, and so on, or the incremental restore will fail.

 If an incremental restore depended on /etc/dumpdates, we'd never be able
 to get an integral incremental restore of the root filesystem, nor any
 other filesystem thereafter if we needed to restore the whole thing,
 because the relevant information to the current dump wouldn't be available
 until it finished.

From: Robert Elz <kre@munnari.OZ.AU>
To: Greywolf <greywolf@starwolf.com>
Cc: Christos Zoulas <christos@zoulas.com>, netbsd-bugs@netbsd.org,
        gnats-bugs@netbsd.org
Subject: Re: bin/52432: /etc/dumpdates format egregious
Date: Sat, 05 Aug 2017 12:03:20 +0700

 I think this discussion has drifted away from the point, just rest
 assured that nothing I am considering will have any effect on anything
 related to restore (or restoresymtab) (by which I do not mean that there
 are no improvements possible, just none I can think of at the minute...)

 All I am considering (and have been for years, while doing nothing about
 it, so...) is altering the format/location of the dumpdates file to make
 it a better fit for the way dumps are done today, rather that the way they
 were done in 1979 (or whenever) - which will (almost certainly) mean getting
 rid of 500 bytes of meaningless space padding that has no useful purpose,
 and was the subject of the PR.

 kre
Home
PR Database Search
(Contact us) $NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2014 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.