NetBSD Problem Report #55177
From www@netbsd.org Wed Apr 15 05:58:19 2020
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id DE4EE1A9219
for <gnats-bugs@gnats.NetBSD.org>; Wed, 15 Apr 2020 05:58:19 +0000 (UTC)
Message-Id: <20200415055818.8C52F1A924B@mollari.NetBSD.org>
Date: Wed, 15 Apr 2020 05:58:18 +0000 (UTC)
From: carenas@gmail.com
Reply-To: carenas@gmail.com
To: gnats-bugs@NetBSD.org
Subject: mremap(MAP_REMAPDUP) fails after fork()
X-Send-Pr-Version: www-1.0
>Number: 55177
>Category: lib
>Synopsis: mremap(MAP_REMAPDUP) fails after fork()
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: lib-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Apr 15 06:00:00 +0000 2020
>Last-Modified: Fri Mar 03 12:50:01 +0000 2023
>Originator: Carlo Arenas
>Release: NetBSD 8.2
>Organization:
>Environment:
System: NetBSD localhost 8.2 NetBSD 8.2 (GENERIC) #0: Tue Mar 31 05:08:40 UTC 2020 mkrepro@mkrepro.NetBSD.org:/usr/src/sys/arch/i386/compile/GENERIC i386
>Description:
an application that has obtained a pair of maps to the same physical
page as described in mremap(2) won't be able to use the RW map to
update the code to execute in the memory pointed by the RX map after
a fork() on neither the parent or the child.
the problem seems to be related to the way the pages are inherited
between processes by the fork(), and the fact that at least the RW
page is marked as COW and therefore uses a different physical page
when written again that is no longer linked to the RX map.
had reproduced the problem with 9 and current in both i386 and amd64
>How-To-Repeat:
to reproduce, apply following patch and run tests
Index: arch/Makefile.exec_prot
===================================================================
RCS file: /cvsroot/src/tests/lib/libc/arch/Makefile.exec_prot,v
retrieving revision 1.1
diff -u -r1.1 Makefile.exec_prot
--- arch/Makefile.exec_prot 18 Jul 2011 23:16:08 -0000 1.1
+++ arch/Makefile.exec_prot 14 Apr 2020 12:37:16 -0000
@@ -5,4 +5,4 @@
EXEC_PROT_ARCHDIR= ${.PARSEDIR}/${MACHINE_CPU}
.PATH: ${EXEC_PROT_ARCHDIR}
-SRCS_EXEC_PROT= exec_prot_support.c return_one.S
+SRCS_EXEC_PROT= exec_prot_support.c return_one.S return_two.S
Index: arch/i386/return_two.S
===================================================================
RCS file: arch/i386/return_two.S
diff -N arch/i386/return_two.S
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ arch/i386/return_two.S 14 Apr 2020 12:37:16 -0000
@@ -0,0 +1,10 @@
+/* $NetBSD: return_two.S,v 1.1 2011/07/18 23:16:09 carenas Exp $ */
+
+#include <machine/asm.h>
+
+RCSID("$NetBSD: return_two.S,v 1.1 2011/07/18 23:16:09 carenas Exp $");
+
+_ENTRY(return_two)
+ movl $0x2,%eax
+ ret
+LABEL(return_two_end)
Index: common/exec_prot.h
===================================================================
RCS file: /cvsroot/src/tests/lib/libc/common/exec_prot.h,v
retrieving revision 1.1
diff -u -r1.1 exec_prot.h
--- common/exec_prot.h 18 Jul 2011 23:16:11 -0000 1.1
+++ common/exec_prot.h 14 Apr 2020 12:37:16 -0000
@@ -38,10 +38,12 @@
*/
/*
- * Trivial MD shellcode that justs returns 1.
+ * Trivial MD shellcode that justs returns 1 or 2.
*/
int return_one(void); /* begin marker -- shellcode entry */
int return_one_end(void); /* end marker */
+int return_two(void); /* begin marker -- shellcode entry */
+int return_two_end(void); /* end marker */
/*
* MD callback to verify whether host offers executable space protection.
Index: sys/t_mprotect.c
===================================================================
RCS file: /cvsroot/src/tests/lib/libc/sys/t_mprotect.c,v
retrieving revision 1.8
diff -u -r1.8 t_mprotect.c
--- sys/t_mprotect.c 16 Jul 2019 17:29:18 -0000 1.8
+++ sys/t_mprotect.c 14 Apr 2020 12:37:16 -0000
@@ -383,6 +383,58 @@
ATF_REQUIRE(munmap(map2, page) == 0);
}
+ATF_TC(mprotect_mremap_fork_exec);
+ATF_TC_HEAD(mprotect_mremap_fork_exec, tc)
+{
+ atf_tc_set_md_var(tc, "descr",
+ "Test mremap(2)+fork(2)+mprotect(2) executable space protections");
+}
+
+ATF_TC_BODY(mprotect_mremap_fork_exec, tc)
+{
+ void *map, *map2;
+ pid_t pid;
+
+ /*
+ * Map a page read/write/exec and duplicate it.
+ * Map the copy executable.
+ * Copy a function to the writeable mapping and execute it
+ * Fork a child and wait for it
+ * Copy a different function to the writeable mapping and execute it
+ * The original function shouldn't be called
+ */
+
+ map = mmap(NULL, page, PROT_READ|PROT_WRITE|PROT_MPROTECT(PROT_EXEC),
+ MAP_ANON, -1, 0);
+ ATF_REQUIRE(map != MAP_FAILED);
+ map2 = mremap(map, page, NULL, page, MAP_REMAPDUP);
+ ATF_REQUIRE(map2 != MAP_FAILED);
+ ATF_REQUIRE(mprotect(map2, page, PROT_EXEC|PROT_READ) == 0);
+
+ memcpy(map, (void *)return_one,
+ (uintptr_t)return_one_end - (uintptr_t)return_one);
+ __builtin___clear_cache(map, (void *)((uintptr_t)map + page));
+
+ ATF_REQUIRE(((int (*)(void))map2)() == 1);
+
+ pid = fork();
+ ATF_REQUIRE(pid >= 0);
+
+ if (pid == 0)
+ _exit(0);
+
+ (void)wait(NULL);
+
+ memcpy(map, (void *)return_two,
+ (uintptr_t)return_two_end - (uintptr_t)return_two);
+ __builtin___clear_cache(map, (void *)((uintptr_t)map + page));
+
+ ATF_REQUIRE(((int (*)(void))map2)() == 2);
+
+ ATF_REQUIRE(munmap(map, page) == 0);
+ ATF_REQUIRE(munmap(map2, page) == 0);
+}
+
ATF_TP_ADD_TCS(tp)
{
page = sysconf(_SC_PAGESIZE);
>Fix:
a workaround to avoid the pages being protected is to use MAP_SHARED but that will of course just introduce race conditions between parent/child that will result in crashes
>Audit-Trail:
From: "Christos Zoulas" <christos@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/55177 CVS commit: src/tests/lib/libc/sys
Date: Sat, 18 Apr 2020 13:44:53 -0400
Module Name: src
Committed By: christos
Date: Sat Apr 18 17:44:53 UTC 2020
Modified Files:
src/tests/lib/libc/sys: Makefile t_mprotect.c
Added Files:
src/tests/lib/libc/sys: t_mprotect_helper.c
Log Message:
PR/55177: Carlo Arenas: mremap(MAP_REMAPDUP) fails after fork()
To generate a diff of this commit:
cvs rdiff -u -r1.61 -r1.62 src/tests/lib/libc/sys/Makefile
cvs rdiff -u -r1.8 -r1.9 src/tests/lib/libc/sys/t_mprotect.c
cvs rdiff -u -r0 -r1.1 src/tests/lib/libc/sys/t_mprotect_helper.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: Taylor R Campbell <riastradh@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/55177: mremap(MAP_REMAPDUP) fails after fork()
Date: Fri, 3 Mar 2023 12:48:22 +0000
What's happening is:
1. mmap creates an entry in the process's VM map at p for a new
anonymous object, call it A.
2. mremap(MAP_REMAPDUP) creates an entry in the process's VM map at q
for the same anonymous object A.
3. On the next write to p, the CPU will fault, and NetBSD will
allocate a backing page for A, call it G, and map p to G in the
process's page table so the write can complete.
4. On the next read from q, the CPU will fault, and NetBSD will find G
from A in the VM map entry for q, and map q to G in the process's
page table so the read can complete and return what was written
through p.
5. fork marks all the VM map entries copy-on-write, and updates the
page table to make the pages nonwritable so the CPU will fault on
writes to them.
6. On the next write to virtual address p, the CPU will fault, and
NetBSD will see that the entry in the VM map at p is copy-on-write,
so it will create a new anonymous object, call it A', and allocate
a new backing page, call it G'. It will update the VM map entry at
p for A' and the page table entry at p for G'.
7. On the next read from virtual address q, the CPU will read from the
old page G -- without faulting or consulting the kernel, because
nothing changed the page table entry at q -- and return the stale
value.
Workarounds:
(a) Use MAP_SHARED to create the mapping.
Using MAP_SHARED means in step (6) NetBSD will not create a new
anonymous object, so the anonymous object and backing page remain
unchanged: both p and q will be mapped to A by the VM map, and to
G by the page table. But it also means the child will inherit
them, and writes issued in the child will be reflected by reads
issued in the parent -- which may pose security or synchronization
issues.
(b) Use minherit(MAP_INHERIT_NONE) or minherit(MAP_INHERIT_ZERO) on
the writable mapping (as long as there is only one writable
mapping).
This way the writable mapping won't be cloned on fork, so it won't
ever have the copy-on-write logic that breaks the coupling between
the mappings. Of course, this means the child _won't_ inherit the
writable mapping, so writes by the child can't influence the
readable mapping.
(c) Don't fork; use posix_spawn.
It would seem sensible to me for the coupled mappings to be copied
wholesale into the child on write -- that is, the parent maintains
coupled mappings of the same backing object, and the child maintains
coupled mappings of a copy of the backing object. This way, parent
and child would get their own copy of the data without influencing the
other, but each process still has coupled mappings like MAP_REMAPDUP
originally created.
Unfortunately, this will take a bit of work to implement correctly.
As a stop-gap, we're considering making MAP_REMAPDUP have the side
effect of minherit(MAP_INHERIT_NONE) on both mappings so that at least
calling fork() in a process that uses JIT compilation with the code
sample in the mremap(2) man page won't have the side effect of
breaking the parent.
(Contact us)
$NetBSD: query-full-pr,v 1.47 2022/09/11 19:34:41 kim Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2023
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.