NetBSD Problem Report #54222
From leot@netbsd.org Wed May 22 20:11:16 2019
Return-Path: <leot@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id C31F67A13C
for <gnats-bugs@gnats.NetBSD.org>; Wed, 22 May 2019 20:11:16 +0000 (UTC)
Message-Id: <20190522201116.64F9A84D31@mail.netbsd.org>
Date: Wed, 22 May 2019 20:11:16 +0000 (UTC)
From: leot@NetBSD.org
Reply-To: leot@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: mount_portal(8) invalid free() after src/sbin/mount_portal/puffs_portal.c,-r1.9
X-Send-Pr-Version: 3.95
>Number: 54222
>Category: bin
>Synopsis: mount_portal(8) invalid free after src/sbin/mount_portal/puffs_portal.c,-r1.9
>Confidential: no
>Severity: serious
>Priority: low
>Responsible: bin-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed May 22 20:15:00 +0000 2019
>Last-Modified: Thu May 23 11:15:01 +0000 2019
>Originator: Leonardo Taccari
>Release: NetBSD 8.99.38
>Organization:
Università Politecnica delle Marche
>Environment:
System: NetBSD abacus 8.99.38 NetBSD 8.99.38 (GENERIC) #1: Mon May 6 14:13:15 CEST 2019 leot@abacus:/usr/obj/sys/arch/amd64/compile/GENERIC amd64
Architecture: x86_64
Machine: amd64
>Description:
mount_portal(8) crashes due an invalid free (trying to free
0x5a5a5a5a5a5a5a5a).
>How-To-Repeat:
Given the following portal.conf:
# cat portal.conf
e/ rfilter e/ echo %s
...then mounting it:
# mount_portal /tmp/m/portal.conf /tmp/m/p
...and then doing:
% head -1 /tmp/m/p/e/bar
should be enough to reproduce it (invalid free() in
portal_node_reclaim()).
If we attach gdb, set a breakpoint to potral_node_reclaim() and then
doing `head -1 /tmp/m/p/e/bar':
# gdb -p `pgrep mount_portal`
GNU gdb (GDB) 8.0.1
[...]
(gdb) b portal_node_reclaim
Breakpoint 1 at 0x1aa403040: file /usr/src/sbin/mount_portal/puffs_portal.c, line 805.
(gdb) c
Continuing.
[Switching to LWP 1 of process 29602]
[... `head -1 /tmp/m/p/e/bar' is invoked and the breakpoint hitted ...]
Thread 1 "" hit Breakpoint 1, portal_node_reclaim (pu=0x7ed8ec7ab000, opc=0x7ed8ec7b7020)
at /usr/src/sbin/mount_portal/puffs_portal.c:805
805 {
(gdb) bt
#0 portal_node_reclaim (pu=0x7ed8ec7ab000, opc=0x7ed8ec7b7020) at /usr/src/sbin/mount_portal/puffs_portal.c:805
#1 0x00000001aa403528 in portal_node_getattr (pu=0x7ed8ec7ab000, opc=0x7ed8ec7b7020, va=0x7ed8ec7aa090, pcr=0x7ed8ec7aa038)
at /usr/src/sbin/mount_portal/puffs_portal.c:593
#2 0x00007ed8ec008d50 in dispatch (pcc=pcc@entry=0x7ed8ec740000) at /usr/src/lib/libpuffs/dispatcher.c:483
#3 0x00007ed8ec009495 in puffs__ml_dispatch (pu=0x7ed8ec7ab000, pb=0x7ed8ec795050) at /usr/src/lib/libpuffs/dispatcher.c:64
#4 0x00007ed8ec00b90e in puffs__framev_input (pu=pu@entry=0x7ed8ec7ab000, fctrl=0x7ed8ec7ab5f8, fio=fio@entry=0x7ed8ec79d060)
at /usr/src/lib/libpuffs/framebuf.c:699
#5 0x00007ed8ec00d80b in puffs__theloop (pcc=<optimized out>) at /usr/src/lib/libpuffs/puffs.c:909
#6 0x00007ed8eb86b3a0 in ?? () from /lib/libc.so.12
Backtrace stopped: Cannot access memory at address 0x7ed8ec780000
(gdb) c
Continuing.
Thread 1 "" hit Breakpoint 1, portal_node_reclaim (pu=0x7ed8ec7ab000, opc=0x7ed8ec7b7020)
at /usr/src/sbin/mount_portal/puffs_portal.c:805
805 {
(gdb) bt
#0 portal_node_reclaim (pu=0x7ed8ec7ab000, opc=0x7ed8ec7b7020) at /usr/src/sbin/mount_portal/puffs_portal.c:805
#1 0x00007ed8ec008e3b in dispatch (pcc=pcc@entry=0x7ed8ec740000) at /usr/src/lib/libpuffs/dispatcher.c:876
#2 0x00007ed8ec009495 in puffs__ml_dispatch (pu=0x7ed8ec7ab000, pb=0x7ed8ec795050) at /usr/src/lib/libpuffs/dispatcher.c:64
#3 0x00007ed8ec00b90e in puffs__framev_input (pu=pu@entry=0x7ed8ec7ab000, fctrl=0x7ed8ec7ab5f8, fio=fio@entry=0x7ed8ec79d060)
at /usr/src/lib/libpuffs/framebuf.c:699
#4 0x00007ed8ec00d80b in puffs__theloop (pcc=<optimized out>) at /usr/src/lib/libpuffs/puffs.c:909
#5 0x00007ed8eb86b3a0 in ?? () from /lib/libc.so.12
Backtrace stopped: Cannot access memory at address 0x7ed8ec780000
(gdb) c
Continuing.
<jemalloc>: /usr/src/external/bsd/jemalloc/lib/../dist/src/rtree.c:205: Failed assertion: "!dependent || leaf != NULL"
Thread 1 "" received signal SIGABRT, Aborted.
0x00007ed8eb9a4f9a in _lwp_kill () from /lib/libc.so.12
(gdb) bt
#0 0x00007ed8eb9a4f9a in _lwp_kill () from /lib/libc.so.12
#1 0x00007ed8eb9a4bc7 in abort () at /usr/src/lib/libc/stdlib/abort.c:74
#2 0x00007ed8eb8de8f2 in rtree_child_leaf_tryread (elm=<optimized out>, dependent=<optimized out>)
at /usr/src/external/bsd/jemalloc/lib/../dist/src/rtree.c:205
#3 0x00007ed8eb8deb9c in je_rtree_leaf_elm_lookup_hard (tsdn=<optimized out>, rtree=0x7ed8ebc1d6a0 <je_extents_rtree>,
rtree_ctx=0x7ed8ec7c6068, rtree_ctx@entry=0x7ed8ec77fe04, key=6510615555426900570, key@entry=139470145322456,
dependent=dependent@entry=true, init_missing=init_missing@entry=false)
at /usr/src/external/bsd/jemalloc/lib/../dist/src/rtree.c:292
#4 0x00007ed8eb932f5b in rtree_leaf_elm_lookup (rtree=<optimized out>, init_missing=false, dependent=true,
key=key@entry=139470145322456, rtree_ctx=rtree_ctx@entry=0x7ed8ec77fe04, tsdn=tsdn@entry=0x7ed8ec77fddc)
at /usr/src/external/bsd/jemalloc/lib/../include/jemalloc/internal/rtree.h:381
#5 rtree_read (rtree=<optimized out>, dependent=true, key=key@entry=139470145322456, rtree_ctx=rtree_ctx@entry=0x7ed8ec77fe04,
tsdn=tsdn@entry=0x7ed8ec77fddc) at /usr/src/external/bsd/jemalloc/lib/../include/jemalloc/internal/rtree.h:406
#6 rtree_szind_slab_read (tsdn=tsdn@entry=0x7ed8ec7c6040, rtree_ctx=rtree_ctx@entry=0x7ed8ec7c6068,
key=key@entry=6510615555426900570, r_szind=r_szind@entry=0x7ed8ec77fdd8, r_slab=r_slab@entry=0x7ed8ec77fddc, dependent=true,
rtree=<optimized out>) at /usr/src/external/bsd/jemalloc/lib/../include/jemalloc/internal/rtree.h:458
#7 0x00007ed8eb9368c6 in ifree (tsd=0x7ed8ec7c6040, ptr=0x5a5a5a5a5a5a5a5a, tcache=0x7ed8ec7c6200, slow_path=<optimized out>)
at /usr/src/external/bsd/jemalloc/lib/../dist/src/jemalloc.c:2239
#8 0x00007ed8eb93a255 in free (ptr=0x5a5a5a5a5a5a5a5a) at /usr/src/external/bsd/jemalloc/lib/../dist/src/jemalloc.c:2433
#9 0x00000001aa403063 in portal_node_reclaim (pu=<optimized out>, opc=0x7ed8ec7b7020) at /usr/src/sbin/mount_portal/puffs_portal.c:812
#10 0x00007ed8ec008e3b in dispatch (pcc=pcc@entry=0x7ed8ec740000) at /usr/src/lib/libpuffs/dispatcher.c:876
#11 0x00007ed8ec009495 in puffs__ml_dispatch (pu=0x7ed8ec7ab000, pb=0x7ed8ec795050) at /usr/src/lib/libpuffs/dispatcher.c:64
#12 0x00007ed8ec00b90e in puffs__framev_input (pu=pu@entry=0x7ed8ec7ab000, fctrl=0x7ed8ec7ab5f8, fio=fio@entry=0x7ed8ec79d060)
at /usr/src/lib/libpuffs/framebuf.c:699
#13 0x00007ed8ec00d80b in puffs__theloop (pcc=<optimized out>) at /usr/src/lib/libpuffs/puffs.c:909
#14 0x00007ed8eb86b3a0 in ?? () from /lib/libc.so.12
Backtrace stopped: Cannot access memory at address 0x7ed8ec780000
>Fix:
No idea ATM, sorry. As a possible workaround reverting
puffs_portal.c to -r1.8 avoids that.
>Audit-Trail:
From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: bin/54222: mount_portal(8) invalid free() after src/sbin/mount_portal/puffs_portal.c,-r1.9
Date: Thu, 23 May 2019 06:02:32 +0700
Date: Wed, 22 May 2019 20:15:00 +0000 (UTC)
From: leot=40NetBSD.org
Message-ID: <20190522201500.7C5B37A1EE=40mollari.NetBSD.org>
=7C No idea ATM, sorry. As a possible workaround reverting
=7C puffs_portal.c to -r1.8 avoids that.
It looks to me as if this code (the original, and the mod in 1.9)
are half baked...
1.9 (in a section of code prefixed by the comment /* cheat for now */
so one might anticipate that not all is perfect) a call was added to
portal_node_reclaim() - which had been in the source since version 1.1
but never called before.
The objective seems to be to correctly close the file descriptor
that was opened by an earlier added call of provide() also added in 1.9
But portal_node_reclaim() also makes two calls to free() - which don't
seem to be useful for anything added in 1.9 (nothing new was allocated)
and, as portal_node_reclaim() was never previously called, are unlikely
to have ever been needed.
Can you try simply deleting those 2 calls to free() in portal_node_reclai=
m()
(almost the last two lines in puffs_portal.c) and see if that improves
the situation.
kre
From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: bin/54222: mount_portal(8) invalid free() after src/sbin/mount_portal/puffs_portal.c,-r1.9
Date: Thu, 23 May 2019 06:27:49 +0700
No, that's the wrong fix, I missed the
PUFFSOP_SET(pops, portal, node, reclaim);
as it doesn't literally mention portal_node_reclaim ...
In that case, I'd add a new func which does the same as
portal_node_reclaim() without the two free() calls, and
call that new func where portal_node_reclaim() is currently
explicitly called.
Either that or simply inline the two relevant lines in place
of the portal_node_reclaim() call - we already know fd is valid
so no need for an extra test.
kre
From: Leonardo Taccari <leot@NetBSD.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: bin/54222: mount_portal(8) invalid free() after src/sbin/mount_portal/puffs_portal.c,-r1.9
Date: Thu, 23 May 2019 12:04:16 +0200
Hello Robert,
Robert Elz writes:
> [...]
> In that case, I'd add a new func which does the same as
> portal_node_reclaim() without the two free() calls, and
> call that new func where portal_node_reclaim() is currently
> explicitly called.
>
> Either that or simply inline the two relevant lines in place
> of the portal_node_reclaim() call - we already know fd is valid
> so no need for an extra test.
> [...]
Thank you! I can confirm that by avoiding calling portal_node_reclaim()
in portal_node_getattr() and inlining relevant lines instead fixes
the problem reported (for completeness patch attached).
However, the results can be a bit surprising. At least in the `cp'
example (that was also mentioned in the -r1.9 commit message) I
would expect the file copied containing `foo\n' but the resulting
file is just an empty file:
% cat /tmp/m/p/e/foo
foo
% head -1 /tmp/m/p/e/foo
% cp /tmp/m/p/e/foo /tmp/
% cat /tmp/foo
%
(I think that also `cp' usages described in examples/advanced.1
and examples/cvs.1 does not do what is probably expected (copying
real file fetching from ftp:// or CVS.)
Thank you again!
Index: puffs_portal.c
===================================================================
RCS file: /cvsroot/src/sbin/mount_portal/puffs_portal.c,v
retrieving revision 1.9
diff -u -p -r1.9 puffs_portal.c
--- puffs_portal.c 10 May 2017 16:35:18 -0000 1.9
+++ puffs_portal.c 23 May 2019 10:03:10 -0000
@@ -590,7 +590,8 @@ portal_node_getattr(struct puffs_usermou
va->va_ctime = st.st_ctim;
va->va_mtime = st.st_mtim;
va->va_birthtime = st.st_birthtim;
- portal_node_reclaim(pu, opc);
+ puffs_framev_removefd(pu, portn->fd, 0);
+ close(portn->fd);
}
return 0;
From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: bin/54222: mount_portal(8) invalid free() after src/sbin/mount_portal/puffs_portal.c,-r1.9
Date: Thu, 23 May 2019 18:06:06 +0700
Date: Thu, 23 May 2019 10:05:01 +0000 (UTC)
From: Leonardo Taccari <leot@NetBSD.org>
Message-ID: <20190523100501.F36277A1AB@mollari.NetBSD.org>
| Thank you! I can confirm that by avoiding calling portal_node_reclaim()
| in portal_node_getattr() and inlining relevant lines instead fixes
| the problem reported (for completeness patch attached).
Yes, I had done that myself, and ...
The patch needed is (I think - or at least, without really understanding
enough about the underlying operations, to be ultra safe) that it needs
to be a little more complicated that that simple version. Eventually
someone who understands this stuff well can work out if my extra
complications are needed, and if not, rip them out.
| However, the results can be a bit surprising.
Yes, I saw that as well (cat worked, head/sed/awk/...) did not.
I have fixes for that as well.
All is still not good however, ls (and similar, stat, ...) show
what should be directories as if they were files. In general the
file type isn't really getting passed back correctly, and so far
I have not worked out why.
I think I will commit what I have so far, and then keep trying to
see if the file type issue can get fixed.
These fixes probably ought to be pulled up to -8 (whether before
or after 8.1 isn't my decision to make, fortunately) - but I can't
really test this stuff on -8 (my kernels don't generally have
PUFFS enabled, so to test this on current I had to make a special
kernel config). So, if these changes are to get pulled up,
someone else will need to test them in -8 (shouldn't be hard to
apply, doesn't look as if any of the relevant code has been touched
in years) and make the request.
kre
From: "Robert Elz" <kre@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/54222 CVS commit: src/sbin/mount_portal
Date: Thu, 23 May 2019 11:13:17 +0000
Module Name: src
Committed By: kre
Date: Thu May 23 11:13:17 UTC 2019
Modified Files:
src/sbin/mount_portal: pt_file.c puffs_portal.c
Log Message:
PR bin/54222
Don't use portal_node_reclaim() inappropriately. It frees data we
did not allocate, but which might have been allocated by someone else.
While here, various other cleanups (avoid losing fd's if fork fails,
don't compose mangled st_mode S_IFMT values - puffs or's in what it
thinks is correct to the value we set, one case I saw was producing
0110600 for the mode, the 011 isn't any defined type at all - I'd
never seen ls print a '?' as the first char of ls -l output before!
This is still not really correct, but is I believe, better than before.
To generate a diff of this commit:
cvs rdiff -u -r1.19 -r1.20 src/sbin/mount_portal/pt_file.c
cvs rdiff -u -r1.9 -r1.10 src/sbin/mount_portal/puffs_portal.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
(Contact us)
$NetBSD: query-full-pr,v 1.43 2018/01/16 07:36:43 maya Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2017
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.