NetBSD Problem Report #55177

From www@netbsd.org  Wed Apr 15 05:58:19 2020
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id DE4EE1A9219
	for <gnats-bugs@gnats.NetBSD.org>; Wed, 15 Apr 2020 05:58:19 +0000 (UTC)
Message-Id: <20200415055818.8C52F1A924B@mollari.NetBSD.org>
Date: Wed, 15 Apr 2020 05:58:18 +0000 (UTC)
From: carenas@gmail.com
Reply-To: carenas@gmail.com
To: gnats-bugs@NetBSD.org
Subject: mremap(MAP_REMAPDUP) fails after fork()
X-Send-Pr-Version: www-1.0

>Number:         55177
>Category:       lib
>Synopsis:       mremap(MAP_REMAPDUP) fails after fork()
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    lib-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Apr 15 06:00:00 +0000 2020
>Last-Modified:  Fri Mar 03 12:50:01 +0000 2023
>Originator:     Carlo Arenas
>Release:        NetBSD 8.2
>Organization:
>Environment:
System: NetBSD localhost 8.2 NetBSD 8.2 (GENERIC) #0: Tue Mar 31 05:08:40 UTC 2020 mkrepro@mkrepro.NetBSD.org:/usr/src/sys/arch/i386/compile/GENERIC i386
>Description:
an application that has obtained a pair of maps to the same physical
page as described in mremap(2) won't be able to use the RW map to
update the code to execute in the memory pointed by the RX map after
a fork() on neither the parent or the child.

the problem seems to be related to the way the pages are inherited
between processes by the fork(), and the fact that at least the RW
page is marked as COW and therefore uses a different physical page
when written again that is no longer linked to the RX map.

had reproduced the problem with 9 and current in both i386 and amd64
>How-To-Repeat:
to reproduce, apply following patch and run tests

Index: arch/Makefile.exec_prot
===================================================================
RCS file: /cvsroot/src/tests/lib/libc/arch/Makefile.exec_prot,v
retrieving revision 1.1
diff -u -r1.1 Makefile.exec_prot
--- arch/Makefile.exec_prot     18 Jul 2011 23:16:08 -0000      1.1
+++ arch/Makefile.exec_prot     14 Apr 2020 12:37:16 -0000
@@ -5,4 +5,4 @@
 EXEC_PROT_ARCHDIR=     ${.PARSEDIR}/${MACHINE_CPU}
 .PATH: ${EXEC_PROT_ARCHDIR}

-SRCS_EXEC_PROT=                exec_prot_support.c return_one.S
+SRCS_EXEC_PROT=                exec_prot_support.c return_one.S return_two.S
Index: arch/i386/return_two.S
===================================================================
RCS file: arch/i386/return_two.S
diff -N arch/i386/return_two.S
--- /dev/null   1 Jan 1970 00:00:00 -0000
+++ arch/i386/return_two.S      14 Apr 2020 12:37:16 -0000
@@ -0,0 +1,10 @@
+/*     $NetBSD: return_two.S,v 1.1 2011/07/18 23:16:09 carenas Exp $ */
+
+#include <machine/asm.h>
+
+RCSID("$NetBSD: return_two.S,v 1.1 2011/07/18 23:16:09 carenas Exp $");
+
+_ENTRY(return_two)
+       movl    $0x2,%eax
+       ret
+LABEL(return_two_end)
Index: common/exec_prot.h
===================================================================
RCS file: /cvsroot/src/tests/lib/libc/common/exec_prot.h,v
retrieving revision 1.1
diff -u -r1.1 exec_prot.h
--- common/exec_prot.h  18 Jul 2011 23:16:11 -0000      1.1
+++ common/exec_prot.h  14 Apr 2020 12:37:16 -0000
@@ -38,10 +38,12 @@
  */

 /*
- * Trivial MD shellcode that justs returns 1.
+ * Trivial MD shellcode that justs returns 1 or 2.
  */
 int return_one(void);     /* begin marker -- shellcode entry */
 int return_one_end(void); /* end marker */
+int return_two(void);     /* begin marker -- shellcode entry */
+int return_two_end(void); /* end marker */

 /*
  * MD callback to verify whether host offers executable space protection.
Index: sys/t_mprotect.c
===================================================================
RCS file: /cvsroot/src/tests/lib/libc/sys/t_mprotect.c,v
retrieving revision 1.8
diff -u -r1.8 t_mprotect.c
--- sys/t_mprotect.c    16 Jul 2019 17:29:18 -0000      1.8
+++ sys/t_mprotect.c    14 Apr 2020 12:37:16 -0000
@@ -383,6 +383,58 @@
        ATF_REQUIRE(munmap(map2, page) == 0);
 }

+ATF_TC(mprotect_mremap_fork_exec);
+ATF_TC_HEAD(mprotect_mremap_fork_exec, tc)
+{
+       atf_tc_set_md_var(tc, "descr",
+           "Test mremap(2)+fork(2)+mprotect(2) executable space protections");
+}
+
+ATF_TC_BODY(mprotect_mremap_fork_exec, tc)
+{
+       void *map, *map2;
+       pid_t pid;
+
+       /*
+        * Map a page read/write/exec and duplicate it.
+        * Map the copy executable.
+        * Copy a function to the writeable mapping and execute it
+        * Fork a child and wait for it
+        * Copy a different function to the writeable mapping and execute it
+        * The original function shouldn't be called
+        */
+
+       map = mmap(NULL, page, PROT_READ|PROT_WRITE|PROT_MPROTECT(PROT_EXEC),
+           MAP_ANON, -1, 0);
+       ATF_REQUIRE(map != MAP_FAILED);
+       map2 = mremap(map, page, NULL, page, MAP_REMAPDUP);
+       ATF_REQUIRE(map2 != MAP_FAILED);
+       ATF_REQUIRE(mprotect(map2, page, PROT_EXEC|PROT_READ) == 0);
+
+       memcpy(map, (void *)return_one,
+           (uintptr_t)return_one_end - (uintptr_t)return_one);
+       __builtin___clear_cache(map, (void *)((uintptr_t)map + page));
+
+       ATF_REQUIRE(((int (*)(void))map2)() == 1);
+
+       pid = fork();
+       ATF_REQUIRE(pid >= 0);
+
+       if (pid == 0)
+               _exit(0);
+
+       (void)wait(NULL);
+
+       memcpy(map, (void *)return_two,
+           (uintptr_t)return_two_end - (uintptr_t)return_two);
+       __builtin___clear_cache(map, (void *)((uintptr_t)map + page));
+
+       ATF_REQUIRE(((int (*)(void))map2)() == 2);
+
+       ATF_REQUIRE(munmap(map, page) == 0);
+       ATF_REQUIRE(munmap(map2, page) == 0);
+}
+
 ATF_TP_ADD_TCS(tp)
 {
        page = sysconf(_SC_PAGESIZE);


>Fix:
a workaround to avoid the pages being protected is to use MAP_SHARED but that will of course just introduce race conditions between parent/child that will result in crashes

>Audit-Trail:
From: "Christos Zoulas" <christos@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/55177 CVS commit: src/tests/lib/libc/sys
Date: Sat, 18 Apr 2020 13:44:53 -0400

 Module Name:	src
 Committed By:	christos
 Date:		Sat Apr 18 17:44:53 UTC 2020

 Modified Files:
 	src/tests/lib/libc/sys: Makefile t_mprotect.c
 Added Files:
 	src/tests/lib/libc/sys: t_mprotect_helper.c

 Log Message:
 PR/55177: Carlo Arenas: mremap(MAP_REMAPDUP) fails after fork()


 To generate a diff of this commit:
 cvs rdiff -u -r1.61 -r1.62 src/tests/lib/libc/sys/Makefile
 cvs rdiff -u -r1.8 -r1.9 src/tests/lib/libc/sys/t_mprotect.c
 cvs rdiff -u -r0 -r1.1 src/tests/lib/libc/sys/t_mprotect_helper.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Taylor R Campbell <riastradh@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/55177: mremap(MAP_REMAPDUP) fails after fork()
Date: Fri, 3 Mar 2023 12:48:22 +0000

 What's happening is:

 1. mmap creates an entry in the process's VM map at p for a new
    anonymous object, call it A.

 2. mremap(MAP_REMAPDUP) creates an entry in the process's VM map at q
    for the same anonymous object A.

 3. On the next write to p, the CPU will fault, and NetBSD will
    allocate a backing page for A, call it G, and map p to G in the
    process's page table so the write can complete.

 4. On the next read from q, the CPU will fault, and NetBSD will find G
    from A in the VM map entry for q, and map q to G in the process's
    page table so the read can complete and return what was written
    through p.

 5. fork marks all the VM map entries copy-on-write, and updates the
    page table to make the pages nonwritable so the CPU will fault on
    writes to them.

 6. On the next write to virtual address p, the CPU will fault, and
    NetBSD will see that the entry in the VM map at p is copy-on-write,
    so it will create a new anonymous object, call it A', and allocate
    a new backing page, call it G'.  It will update the VM map entry at
    p for A' and the page table entry at p for G'.

 7. On the next read from virtual address q, the CPU will read from the
    old page G -- without faulting or consulting the kernel, because
    nothing changed the page table entry at q -- and return the stale
    value.

 Workarounds:

 (a) Use MAP_SHARED to create the mapping.

     Using MAP_SHARED means in step (6) NetBSD will not create a new
     anonymous object, so the anonymous object and backing page remain
     unchanged: both p and q will be mapped to A by the VM map, and to
     G by the page table.  But it also means the child will inherit
     them, and writes issued in the child will be reflected by reads
     issued in the parent -- which may pose security or synchronization
     issues.

 (b) Use minherit(MAP_INHERIT_NONE) or minherit(MAP_INHERIT_ZERO) on
     the writable mapping (as long as there is only one writable
     mapping).

     This way the writable mapping won't be cloned on fork, so it won't
     ever have the copy-on-write logic that breaks the coupling between
     the mappings.  Of course, this means the child _won't_ inherit the
     writable mapping, so writes by the child can't influence the
     readable mapping.

 (c) Don't fork; use posix_spawn.

 It would seem sensible to me for the coupled mappings to be copied
 wholesale into the child on write -- that is, the parent maintains
 coupled mappings of the same backing object, and the child maintains
 coupled mappings of a copy of the backing object.  This way, parent
 and child would get their own copy of the data without influencing the
 other, but each process still has coupled mappings like MAP_REMAPDUP
 originally created.

 Unfortunately, this will take a bit of work to implement correctly.

 As a stop-gap, we're considering making MAP_REMAPDUP have the side
 effect of minherit(MAP_INHERIT_NONE) on both mappings so that at least
 calling fork() in a process that uses JIT compilation with the code
 sample in the mremap(2) man page won't have the side effect of
 breaking the parent.

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.47 2022/09/11 19:34:41 kim Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2023 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.