NetBSD Problem Report #56828

From wiz@yt.nih.at  Fri May 13 09:52:10 2022
Return-Path: <wiz@yt.nih.at>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id E4F591A9239
	for <gnats-bugs@gnats.NetBSD.org>; Fri, 13 May 2022 09:52:10 +0000 (UTC)
Message-Id: <20220513093416.AE0001CB6ABE@yt.nih.at>
Date: Fri, 13 May 2022 11:34:16 +0200 (CEST)
From: Thomas Klausner <wiz@NetBSD.org>
Reply-To: Thomas Klausner <wiz@NetBSD.org>
To: gnats-bugs@NetBSD.org
Subject: futex calls in Linux emulation sometimes hang
X-Send-Pr-Version: 3.95

>Number:         56828
>Category:       kern
>Synopsis:       futex calls in Linux emulation sometimes hang
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          needs-pullups
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri May 13 09:55:00 +0000 2022
>Closed-Date:    
>Last-Modified:  Wed Mar 05 20:10:00 +0000 2025
>Originator:     Thomas Klausner
>Release:        NetBSD 9.99.96
>Organization:

>Environment:


Architecture: x86_64
Machine: amd64
>Description:
I'm using a Java application running under Linux emulation.
It sometimes hangs and ps says it's in futex:

ps -alxwww | grep java

1000 20124 19935 581925  83  0 37599660 110892 futex   Sl+  ttyp7    0:02.57 /usr/pkg/java/oracle-8/bin/java -cp /usr/pkg/PDF-Over/lib/* at.asit.pdfover.gui.Main file.pdf

Usually I can CTRL-C and restart it, and in about one in three to one
in five tries I can finish what I want to do with it.

>How-To-Repeat:
Install PDF-Over by downloading
https://webstart.buergerkarte.at/PDF-Over/setup_pdf-over_linux.jar

Install it using oracle-jre8-8.0.202:

oracle8-java -jar setup_pdf-over_linux.jar

Run it, using some variant of

/usr/pkg/bin/oracle8-java -cp "/installation/path/PDF-Over/lib/*" at.asit.pdfover.gui.Main "$@"

on a PDF file.
Sometimes it hangs before displaying the first screen, i.e. it pops up
a new window but that window stays gray. If you get further than that, just
ctrl-c and try again.

>Fix:
Yes, please!

>Release-Note:

>Audit-Trail:
From: Thomas Klausner <wiz@NetBSD.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/56828: futex calls in Linux emulation sometimes hang
Date: Fri, 13 May 2022 18:08:48 +0200

 I'm told that this can also be reproduced in the following way:

 Using the Metalworks demo
 (/usr/pkg/java/openjdk8/demo/jfc/Metalworks/Metalworks.jar from the
 openjdk8 package):

 Start it, select the "About" dialog from the help menu and click the OK button.

 I forgot to mention that this wasn't working better on the
 thorpej-futex2 branch last I tried.
  Thomas

From: Thomas Klausner <wiz@NetBSD.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/56828: futex calls in Linux emulation sometimes hang
Date: Fri, 16 Sep 2022 14:05:07 +0200

 Some information from debugging with riastradh, OCR'd and handfixed:

 crash> ps/w | grep futex
 7203 8445 java linux 43 futex ffffa3212cc45ed0
 7203 8701 java linux 43 futex ffffa3212cc3eed0
 ...
 many many more of these.

 (gdb) p futex_tab
 $1 = {lock = {u = {mtxa_owner = 0, s = {mtxs_dummy = 0 '\000', mtxs_ipl = {_ipl = 0 '\000'}, mtxs_lock = 0 '\000', mtxs_unused = 0 '\000'}}}, va = {rbt_root = 0x0, rbt_ops = 0xffffffff81375ba0 <futex_rb_ops>,
 rbt_minmax = {0x0, 0x0}}, oa = {rbt_root = 0xffff8039324615a0, rbt_ops = 0xffffffff81375b80 <futex_shared_rb_ops>, rbt_minmax = {0xffff8035d9ac71e0, 0xffff8035fa7f0c60}}}
 (gdb) print ((struct futex_wait *) (0xffffa3212cc45ed0
 - (size_t)&((struct futex_wait *)0)->fw_cv))
 $2 = (struct futex_wait *) 0xffffa3212cc45ec8
 (gdb) print &((struct futex_wait *)0)->fw_cv
 $3 = (kcondvar_t *) 0x8
 (gdb) print _Alignof (struct futex_wait)
 $4 = 8
 (gdb) print *((struct futex_wait *) (0xffffa3212cc45ed0 - (size_t)&((struct futex_wait *)0)->fw_cv))
 $5 = {fw_lock = {u = {mtxa_owner = 0, s = {mtxs_dummy = 0 '\000', mtxs_ipl = {_ipl = 0 '\000'}, mtxs_lock = 0 '\000', mtxs_unused = 0 '\000'}}}, fw_cv = {cv_opaque = {0xffff803897cb8100, 0xffffffff813e61a9}}, fw_futex = 0xffff803873cf9400, fw_entry = {tqe_next = 0x0, tqe_prev = 0xffff803873cf9440}, fw_abort = {le_next = 0xca, le_prev = 0xffffa3212cc45f40}, fw_bitset = -1, fw_aborting = false}
 (gdb) x/s ((struct futex_wait *)(0xffffa3212cc45ed0 - (size_t)&((struct futex_wait *)0)->fw_cv))->fw_cv->cv_opaque[1]
 0xffffffff813e61a9: "futex"
 (gdb) print *((struct futex_wait *)(0xffffa3212cc3eed0 - (size_t)&((struct futex_wait *)0)->fw_cv))
 $6 = {fw_lock = {u = {mtxa_owner = 0, s = {mtxs_dummy = 0 '\000', mtxs_ipl = {_ipl = 0 '\000'}, mtxs_lock = 0 '\000', mtxs_unused = 0 '\000'}}}, fw_cv = {cv_opaque = {0xffff80389a4d3940, 0xffffffff813e61a9}}, fw_futex = 0xffff8038720bcac0, fw_entry = {tqe_next = 0x0, tqe_prev = 0xffff8038720bcb00}, fw_abort = {le_next = 0xffff8035da5d3100, le_prev = 0xffffa3212cc3ef30}, fw_bitset = -1, fw_aborting = false}
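
The gdb expressions above recover the enclosing struct futex_wait from
the wait channel address shown by ps/w, by subtracting the offset of the
fw_cv member (0x8 according to gdb). A minimal sketch of that arithmetic
follows. FutexWaitSketch and wchan_to_futex_wait are hypothetical names,
and the field types are placeholders chosen only so that fw_cv lands at
offset 8; this is not the kernel's real structure definition.

```python
import ctypes


class FutexWaitSketch(ctypes.Structure):
    """Hypothetical stand-in for struct futex_wait (layout up to fw_cv only)."""
    _fields_ = [
        ("fw_lock", ctypes.c_uint64),    # placeholder for the 8-byte kmutex_t
        ("fw_cv", ctypes.c_uint64 * 2),  # placeholder for kcondvar_t (cv_opaque[2])
    ]


def wchan_to_futex_wait(wchan: int) -> int:
    """Subtract the offset of fw_cv to get the enclosing structure's address."""
    return wchan - FutexWaitSketch.fw_cv.offset


# Wait channel of the first java LWP in the ps/w output above.
print(hex(wchan_to_futex_wait(0xffffa3212cc45ed0)))
```

For the first LWP this gives the same address as gdb's $2.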

From: Thomas Klausner <wiz@NetBSD.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/56828: futex calls in Linux emulation sometimes hang
Date: Sat, 25 Mar 2023 06:44:46 +0100

 I compiled
 https://github.com/linux-test-project/ltp/releases/download/20230127/ltp-full-20230127.tar.bz2
 on a CentOS system (exact version unknown, sorry) and copied the contents of
 testcases/kernel/syscalls/futex (see
 https://github.com/linux-test-project/ltp/tree/master/testcases/kernel/syscalls/futex)
 to a NetBSD 10.99.2/amd64 system and ran them. The output shows quite
 a number of problems (look for 'failed' and 'broken' below):

 *** futex_cmp_requeue01 ***

 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 futex_cmp_requeue01.c:194: TINFO: Testing variant: syscall with old kernel spec
 futex_cmp_requeue01.c:103: TINFO: Test 0: waiters: 10, wakes: 3, requeues: 7
 futex_cmp_requeue01.c:126: TINFO: futex_cmp_requeue() returned 3
 futex_cmp_requeue01.c:140: TINFO: children woken, futex0: 0, futex1: 7, spurious wakeups: 0
 futex_cmp_requeue01.c:180: TFAIL: woken up -4, expected range (3, 3)
 futex_cmp_requeue01.c:103: TINFO: Test 1: waiters: 10, wakes: 0, requeues: 10
 futex_cmp_requeue01.c:126: TINFO: futex_cmp_requeue() returned 0
 futex_cmp_requeue01.c:140: TINFO: children woken, futex0: 0, futex1: 10, spurious wakeups: 0
 futex_cmp_requeue01.c:180: TFAIL: woken up -10, expected range (0, 0)
 futex_cmp_requeue01.c:103: TINFO: Test 2: waiters: 10, wakes: 2, requeues: 6
 futex_cmp_requeue01.c:126: TINFO: futex_cmp_requeue() returned 2
 futex_cmp_requeue01.c:140: TINFO: children woken, futex0: 2, futex1: 6, spurious wakeups: 0
 futex_cmp_requeue01.c:180: TFAIL: woken up -4, expected range (2, 2)
 futex_cmp_requeue01.c:103: TINFO: Test 3: waiters: 100, wakes: 50, requeues: 50
 futex_cmp_requeue01.c:126: TINFO: futex_cmp_requeue() returned 50
 futex_cmp_requeue01.c:140: TINFO: children woken, futex0: 0, futex1: 50, spurious wakeups: 0
 futex_cmp_requeue01.c:180: TFAIL: woken up 0, expected range (50, 50)
 futex_cmp_requeue01.c:103: TINFO: Test 4: waiters: 100, wakes: 0, requeues: 70
 futex_cmp_requeue01.c:126: TINFO: futex_cmp_requeue() returned 0
 futex_cmp_requeue01.c:140: TINFO: children woken, futex0: 30, futex1: 70, spurious wakeups: 0
 futex_cmp_requeue01.c:180: TFAIL: woken up -70, expected range (0, 0)
 futex_cmp_requeue01.c:95: TBROK: fork() failed: EAGAIN/EWOULDBLOCK (11)
 tst_test.c:1606: TINFO: Killed the leftover descendant processes

 Summary:
 passed   0
 failed   5
 broken   1
 skipped  0
 warnings 0

 *** futex_cmp_requeue02 ***

 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 futex_cmp_requeue02.c:71: TINFO: Testing variant: syscall with old kernel spec
 futex_cmp_requeue02.c:64: TPASS: futex_cmp_requeue() failed as expected: EINVAL (22)
 futex_cmp_requeue02.c:64: TPASS: futex_cmp_requeue() failed as expected: EINVAL (22)
 futex_cmp_requeue02.c:53: TFAIL: futex_cmp_requeue() succeeded unexpectedly

 HINT: You _MAY_ be missing kernel fixes:

 https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fbe0e839d1e2

 HINT: You _MAY_ be vulnerable to CVE(s):

 https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2018-6927

 Summary:
 passed   2
 failed   1
 broken   0
 skipped  0
 warnings 0

 *** futex_wait01 ***

 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 futex_wait01.c:69: TINFO: Testing variant: syscall with old kernel spec
 futex_wait01.c:62: TPASS: futex_wait() passed: ETIMEDOUT (110)
 futex_wait01.c:62: TPASS: futex_wait() passed: EAGAIN/EWOULDBLOCK (11)
 futex_wait01.c:62: TPASS: futex_wait() passed: ETIMEDOUT (110)
 futex_wait01.c:62: TPASS: futex_wait() passed: EAGAIN/EWOULDBLOCK (11)

 Summary:
 passed   4
 failed   0
 broken   0
 skipped  0
 warnings 0

 *** futex_wait02 ***

 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 futex_wait02.c:66: TINFO: Testing variant: syscall with old kernel spec
 futex_wait02.c:59: TPASS: futex_wait() woken up

 Summary:
 passed   1
 failed   0
 broken   0
 skipped  0
 warnings 0

 *** futex_wait03 ***

 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 futex_wait03.c:63: TINFO: Testing variant: syscall with old kernel spec
 Test timeouted, sending SIGKILL!
 tst_test.c:1612: TINFO: If you are running on slow machine, try exporting LTP_TIMEOUT_MUL > 1
 tst_test.c:1614: TBROK: Test killed! (timeout?)

 Summary:
 passed   0
 failed   0
 broken   1
 skipped  0
 warnings 0

 *** futex_wait04 ***

 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 futex_wait04.c:50: TINFO: Testing variant: syscall with old kernel spec
 futex_wait04.c:39: TPASS: futex_wait() returned -1: EAGAIN/EWOULDBLOCK (11)

 Summary:
 passed   1
 failed   0
 broken   0
 skipped  0
 warnings 0

 *** futex_wait05 ***

 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 sh: systemd-detect-virt: not found
 tst_timer_test.c:357: TINFO: CLOCK_MONOTONIC resolution 69ns
 tst_timer_test.c:365: TINFO: prctl(PR_GET_TIMERSLACK) = -1, using 50us
 tst_test.c:1566: TINFO: Updating max runtime to 0h 00m 09s
 tst_test.c:1558: TINFO: Timeout per run is 0h 00m 39s
 tst_timer_test.c:379: TINFO: Failed to set zero latency constraint: No such file or directory
 tst_timer_test.c:263: TINFO: futex_wait() sleeping for 1000us 500 iterations, threshold 450.01us
 tst_timer_test.c:285: TINFO: Found 500 outliners in [20033,10715] range
 tst_timer_test.c:305: TINFO: min 10715us, max 20033us, median 19998us, trunc mean 19811.95us (discarded 25)
 tst_timer_test.c:314: TFAIL: futex_wait() slept for too long

  Time: us | Frequency
 --------------------------------------------------------------------------------
     10715 | *
     11206 |
     11697 |
     12188 | .
     12679 |
     13170 |
     13661 |
     14152 |
     14643 |
     15134 |
     15625 |
     16116 |
     16607 |
     17098 |
     17589 |
     18080 |
     18571 | +
     19062 | -
     19553 | ********************************************************************
 --------------------------------------------------------------------------------
     491us | 1 sample = 0.14108 '*', 0.28216 '+', 0.56432 '-', non-zero '.'

 tst_timer_test.c:263: TINFO: futex_wait() sleeping for 2000us 500 iterations, threshold 450.01us
 tst_timer_test.c:285: TINFO: Found 11 outliners in [20014,20001] range
 tst_timer_test.c:305: TINFO: min 11091us, max 20014us, median 19998us, trunc mean 19963.34us (discarded 25)
 tst_timer_test.c:314: TFAIL: futex_wait() slept for too long

  Time: us | Frequency
 --------------------------------------------------------------------------------
     11091 | .
     11561 |
     12031 |
     12501 |
     12971 |
     13441 |
     13911 | .
     14381 |
     14851 |
     15321 |
     15791 |
     16261 |
     16731 |
     17201 |
     17671 |
     18141 |
     18611 | .
     19081 |
     19551 | ********************************************************************
 --------------------------------------------------------------------------------
     470us | 1 sample = 0.13682 '*', 0.27364 '+', 0.54728 '-', non-zero '.'

 tst_timer_test.c:263: TINFO: futex_wait() sleeping for 5000us 300 iterations, threshold 450.04us
 tst_timer_test.c:305: TINFO: min 19554us, max 20000us, median 19998us, trunc mean 19996.26us (discarded 15)
 tst_timer_test.c:314: TFAIL: futex_wait() slept for too long

  Time: us | Frequency
 --------------------------------------------------------------------------------
     19554 | .
     19578 |
     19602 |
     19626 |
     19650 |
     19674 |
     19698 |
     19722 |
     19746 |
     19770 |
     19794 |
     19818 |
     19842 |
     19866 |
     19890 |
     19914 |
     19938 |
     19962 |
     19986 | ********************************************************************
 --------------------------------------------------------------------------------
      24us | 1 sample = 0.22742 '*', 0.45485 '+', 0.90970 '-', non-zero '.'

 tst_timer_test.c:263: TINFO: futex_wait() sleeping for 10000us 100 iterations, threshold 450.33us
 tst_timer_test.c:305: TINFO: min 19678us, max 20001us, median 19998us, trunc mean 19994.49us (discarded 5)
 tst_timer_test.c:314: TFAIL: futex_wait() slept for too long

  Time: us | Frequency
 --------------------------------------------------------------------------------
     19678 | +
     19695 |
     19712 |
     19729 |
     19746 |
     19763 |
     19780 |
     19797 |
     19814 |
     19831 |
     19848 |
     19865 |
     19882 |
     19899 |
     19916 |
     19933 |
     19950 |
     19967 |
     19984 | ********************************************************************
     20001 | +
 --------------------------------------------------------------------------------
      17us | 1 sample = 0.69388 '*', 1.38776 '+', 2.77551 '-', non-zero '.'

 tst_timer_test.c:263: TINFO: futex_wait() sleeping for 25000us 50 iterations, threshold 451.29us
 tst_timer_test.c:305: TINFO: min 39713us, max 40000us, median 39998us, trunc mean 39992.19us (discarded 2)
 tst_timer_test.c:314: TFAIL: futex_wait() slept for too long

  Time: us | Frequency
 --------------------------------------------------------------------------------
     39713 | *-
     39729 |
     39745 |
     39761 |
     39777 |
     39793 |
     39809 |
     39825 |
     39841 |
     39857 |
     39873 |
     39889 |
     39905 |
     39921 |
     39937 |
     39953 |
     39969 |
     39985 | ********************************************************************
 --------------------------------------------------------------------------------
      16us | 1 sample = 1.38776 '*', 2.77551 '+', 5.55102 '-', non-zero '.'

 tst_timer_test.c:263: TINFO: futex_wait() sleeping for 100000us 10 iterations, threshold 537.00us
 tst_timer_test.c:305: TINFO: min 109716us, max 110001us, median 109999us, trunc mean 109967.78us (discarded 1)
 tst_timer_test.c:314: TFAIL: futex_wait() slept for too long

  Time: us | Frequency
 --------------------------------------------------------------------------------
    109716 | ********+
    109731 |
    109746 |
    109761 |
    109776 |
    109791 |
    109806 |
    109821 |
    109836 |
    109851 |
    109866 |
    109881 |
    109896 |
    109911 |
    109926 |
    109941 |
    109956 |
    109971 |
    109986 | ********************************************************************
    110001 | ********+
 --------------------------------------------------------------------------------
      15us | 1 sample = 8.50000 '*', 17.00000 '+', 34.00000 '-', non-zero '.'

 tst_timer_test.c:263: TINFO: futex_wait() sleeping for 1000000us 2 iterations, threshold 4400.00us
 tst_timer_test.c:305: TINFO: min 1009721us, max 1010012us, median 1009721us, trunc mean 1009721.00us (discarded 1)
 tst_timer_test.c:314: TFAIL: futex_wait() slept for too long

  Time: us | Frequency
 --------------------------------------------------------------------------------
   1009721 | ********************************************************************
   1009737 |
   1009753 |
   1009769 |
   1009785 |
   1009801 |
   1009817 |
   1009833 |
   1009849 |
   1009865 |
   1009881 |
   1009897 |
   1009913 |
   1009929 |
   1009945 |
   1009961 |
   1009977 |
   1009993 |
   1010009 | ********************************************************************
 --------------------------------------------------------------------------------
      16us | 1 sample = 68.00000 '*', 136.00000 '+', 272.00000 '-', non-zero '.'


 Summary:
 passed   0
 failed   7
 broken   0
 skipped  0
 warnings 0

 *** futex_wait_bitset01 ***

 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 futex_wait_bitset01.c:99: TINFO: Testing variant: syscall with old kernel spec
 futex_wait_bitset01.c:44: TINFO: testing futex_wait_bitset() timeout with CLOCK_MONOTONIC
 futex_wait_bitset01.c:86: TPASS: futex_wait_bitset() waited 114236us, expected 100010us
 futex_wait_bitset01.c:44: TINFO: testing futex_wait_bitset() timeout with CLOCK_REALTIME
 futex_wait_bitset01.c:86: TPASS: futex_wait_bitset() waited 119960us, expected 100010us

 Summary:
 passed   2
 failed   0
 broken   0
 skipped  0
 warnings 0

 *** futex_waitv01 ***

 tst_test.c:899: TCONF: The test requires kernel 5.16 or newer

 *** futex_waitv02 ***

 tst_test.c:899: TCONF: The test requires kernel 5.16 or newer

 *** futex_waitv03 ***

 tst_test.c:899: TCONF: The test requires kernel 5.16 or newer

 *** futex_wake01 ***

 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 futex_wake01.c:59: TINFO: Testing variant: syscall with old kernel spec
 futex_wake01.c:52: TPASS: futex_wake() passed
 futex_wake01.c:52: TPASS: futex_wake() passed
 futex_wake01.c:52: TPASS: futex_wake() passed
 futex_wake01.c:52: TPASS: futex_wake() passed
 futex_wake01.c:52: TPASS: futex_wake() passed
 futex_wake01.c:52: TPASS: futex_wake() passed

 Summary:
 passed   6
 failed   0
 broken   0
 skipped  0
 warnings 0

 *** futex_wake02 ***

 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 futex_wake02.c:134: TINFO: Testing variant: syscall with old kernel spec
 futex_utils.h:69: TINFO: 0 threads sleeping, expected 55
 futex_wake02.c:91: TPASS: futex_wake() woken up 1 threads
 futex_wake02.c:91: TPASS: futex_wake() woken up 2 threads
 futex_wake02.c:91: TPASS: futex_wake() woken up 3 threads
 futex_wake02.c:91: TPASS: futex_wake() woken up 4 threads
 futex_wake02.c:91: TPASS: futex_wake() woken up 5 threads
 futex_wake02.c:91: TPASS: futex_wake() woken up 6 threads
 futex_wake02.c:91: TPASS: futex_wake() woken up 7 threads
 futex_wake02.c:91: TPASS: futex_wake() woken up 8 threads
 futex_wake02.c:91: TPASS: futex_wake() woken up 9 threads
 futex_wake02.c:91: TPASS: futex_wake() woken up 10 threads
 futex_wake02.c:103: TPASS: futex_wake() woken up 0 threads

 Summary:
 passed   11
 failed   0
 broken   0
 skipped  0
 warnings 0

 *** futex_wake03 ***

 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 futex_wake03.c:97: TINFO: Testing variant: syscall with old kernel spec
 futex_wake03.c:61: TPASS: futex_wake() woken up 1 childs
 futex_wake03.c:61: TPASS: futex_wake() woken up 2 childs
 futex_wake03.c:61: TPASS: futex_wake() woken up 3 childs
 futex_wake03.c:61: TPASS: futex_wake() woken up 4 childs
 futex_wake03.c:61: TPASS: futex_wake() woken up 5 childs
 futex_wake03.c:61: TPASS: futex_wake() woken up 6 childs
 futex_wake03.c:61: TPASS: futex_wake() woken up 7 childs
 futex_wake03.c:61: TPASS: futex_wake() woken up 8 childs
 futex_wake03.c:61: TPASS: futex_wake() woken up 9 childs
 futex_wake03.c:61: TPASS: futex_wake() woken up 10 childs
 futex_wake03.c:89: TPASS: futex_wake() woken up 0 children

 Summary:
 passed   11
 failed   0
 broken   0
 skipped  0
 warnings 0

 *** futex_wake04 ***

 tst_test.c:1152: TCONF: Test needs to be run as root


 Running the last one as root just gives:

 tst_hugepage.c:34: TCONF: hugetlbfs is not supported

 Perhaps these are easier-to-debug cases?
  Thomas
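
One possible reading of the futex_wait05 numbers above, not confirmed
anywhere in this report: the medians would fit a timeout that is rounded
up to whole 10 ms ticks and then extended by one more tick. The sketch
below checks that hypothetical model against the reported medians. The
10 ms tick (TICK_US) and the function name are assumptions made for
illustration only.

```python
TICK_US = 10_000  # assumption: 10 ms tick (HZ=100); not stated in the report


def modeled_sleep_us(requested_us: int) -> int:
    """Round the timeout up to whole ticks, then add one further tick."""
    ticks = (requested_us + TICK_US - 1) // TICK_US  # integer ceiling
    return (ticks + 1) * TICK_US


# (requested sleep, reported median) in microseconds, copied from the
# futex_wait05 output above.
observed = [
    (1000, 19998), (2000, 19998), (5000, 19998), (10000, 19998),
    (25000, 39998), (100000, 109999), (1000000, 1009721),
]

for requested, median in observed:
    print(requested, median, modeled_sleep_us(requested))
```

If the model holds, every reported median should sit just under the
modeled value; a mismatch would point at some other cause.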

From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/56828 CVS commit: src/tests/lib/libc/sys
Date: Sat, 18 Jan 2025 06:22:35 +0000

 Module Name:	src
 Committed By:	riastradh
 Date:		Sat Jan 18 06:22:35 UTC 2025

 Modified Files:
 	src/tests/lib/libc/sys: t_futex_ops.c

 Log Message:
 tests/lib/libc/sys/t_futex_ops: Test FUTEX_CMP_REQUEUE edge case.

 It must always compare the futex value and fail with EAGAIN on
 mismatch, even if there are no waiters.

 PR kern/56828: futex calls in Linux emulation sometimes hang


 To generate a diff of this commit:
 cvs rdiff -u -r1.5 -r1.6 src/tests/lib/libc/sys/t_futex_ops.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/56828 CVS commit: src/tests/lib/libc/sys
Date: Sat, 18 Jan 2025 06:22:56 +0000

 Module Name:	src
 Committed By:	riastradh
 Date:		Sat Jan 18 06:22:56 UTC 2025

 Modified Files:
 	src/tests/lib/libc/sys: t_futex_ops.c

 Log Message:
 tests/lib/libc/sys/t_futex_ops: Fix FUTEX_CMP_REQUEUE return values.

 The return value is the number of waiters woken _or requeued_, not
 just the number of waiters woken:

    FUTEX_CMP_REQUEUE
           Returns the total number of waiters that were woken up or
           requeued to the futex for the futex word at uaddr2.  If
           this value is greater than val, then the difference is the
           number of waiters requeued to the futex for the futex word
           at uaddr2.

 https://man7.org/linux/man-pages/man2/futex.2.html

 PR kern/56828: futex calls in Linux emulation sometimes hang


 To generate a diff of this commit:
 cvs rdiff -u -r1.6 -r1.7 src/tests/lib/libc/sys/t_futex_ops.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/56828 CVS commit: src/tests/lib/libc/sys
Date: Sat, 18 Jan 2025 07:05:15 +0000

 Module Name:	src
 Committed By:	riastradh
 Date:		Sat Jan 18 07:05:15 UTC 2025

 Modified Files:
 	src/tests/lib/libc/sys: t_futex_ops.c

 Log Message:
 tests/lib/libc/sys/t_futex_ops: Fix another FUTEX_CMP_REQUEUE case.

 PR kern/56828: futex calls in Linux emulation sometimes hang


 To generate a diff of this commit:
 cvs rdiff -u -r1.7 -r1.8 src/tests/lib/libc/sys/t_futex_ops.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/56828 CVS commit: src
Date: Sat, 18 Jan 2025 07:26:06 +0000

 Module Name:	src
 Committed By:	riastradh
 Date:		Sat Jan 18 07:26:06 UTC 2025

 Modified Files:
 	src/sys/kern: sys_futex.c
 	src/tests/lib/libc/sys: t_futex_ops.c

 Log Message:
 futex(2): Fix FUTEX_CMP_REQUEUE to always compare even if no waiters.

 It must always compare the futex value and fail with EAGAIN on
 mismatch, even if there are no waiters.

   FUTEX_CMP_REQUEUE (since Linux 2.6.7)
          This operation first checks whether the location uaddr
          still contains the value val3.  If not, the operation
          fails with the error EAGAIN.  Otherwise, the operation [...]

 https://man7.org/linux/man-pages/man2/futex.2.html

 PR kern/56828: futex calls in Linux emulation sometimes hang


 To generate a diff of this commit:
 cvs rdiff -u -r1.20 -r1.21 src/sys/kern/sys_futex.c
 cvs rdiff -u -r1.8 -r1.9 src/tests/lib/libc/sys/t_futex_ops.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.
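
The rule in the log message above can be modeled in a few lines: the
comparison against val3 happens first and does not depend on whether
anyone is waiting. This is a userland sketch of the described semantics
only, not the code in sys_futex.c; cmp_requeue_precheck and its
parameters are hypothetical names.

```python
import errno


def cmp_requeue_precheck(futex_word: int, val3: int, nwaiters: int) -> int:
    """Model only the up-front comparison; waking and requeueing are elided.

    nwaiters is accepted but deliberately ignored: a mismatch fails with
    EAGAIN even when there are no waiters.
    """
    if futex_word != val3:
        return -errno.EAGAIN
    return 0


# Mismatch with zero waiters still fails.
print(cmp_requeue_precheck(futex_word=1, val3=0, nwaiters=0))
```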

From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/56828 CVS commit: src
Date: Sat, 18 Jan 2025 07:26:22 +0000

 Module Name:	src
 Committed By:	riastradh
 Date:		Sat Jan 18 07:26:21 UTC 2025

 Modified Files:
 	src/sys/kern: sys_futex.c
 	src/tests/lib/libc/sys: t_futex_ops.c

 Log Message:
 futex(2): Fix return value of FUTEX_CMP_REQUEUE.

 The return value is the number of waiters woken _or requeued_, not
 just the number of waiters woken:

    FUTEX_CMP_REQUEUE
           Returns the total number of waiters that were woken up or
           requeued to the futex for the futex word at uaddr2.  If
           this value is greater than val, then the difference is the
           number of waiters requeued to the futex for the futex word
           at uaddr2.

 https://man7.org/linux/man-pages/man2/futex.2.html

 While here, clarify some of the arguments with comments so it's not
 quite so cryptic with val/val2/val3 everywhere.

 PR kern/56828: futex calls in Linux emulation sometimes hang


 To generate a diff of this commit:
 cvs rdiff -u -r1.21 -r1.22 src/sys/kern/sys_futex.c
 cvs rdiff -u -r1.9 -r1.10 src/tests/lib/libc/sys/t_futex_ops.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.
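
The return value rule quoted above can be sketched as a small model:
wake up to nr_wake waiters, requeue up to nr_requeue of the rest, and
return the sum. The function and parameter names are hypothetical, and
the model ignores the val3 comparison and any races. The sample inputs
are the five cases from the LTP futex_cmp_requeue01 test in this report.

```python
def cmp_requeue_result(waiters: int, nr_wake: int, nr_requeue: int) -> int:
    """Return the number of waiters woken plus the number requeued."""
    woken = min(waiters, nr_wake)
    requeued = min(waiters - woken, nr_requeue)
    return woken + requeued


# (waiters, wakes, requeues) from LTP futex_cmp_requeue01 tests 0 to 4.
cases = [(10, 3, 7), (10, 0, 10), (10, 2, 6), (100, 50, 50), (100, 0, 70)]
for waiters, wakes, requeues in cases:
    print(waiters, wakes, requeues, cmp_requeue_result(waiters, wakes, requeues))
```

Returning only the woken count for these cases gives 3, 0, 2, 50 and 0,
which is what the March 2023 LTP run in this report shows.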

From: Taylor R Campbell <riastradh@NetBSD.org>
To: Thomas Klausner <wiz@NetBSD.org>
Cc: gnats-bugs@NetBSD.org, netbsd-bugs@NetBSD.org,
	Jason Thorpe <thorpej@NetBSD.org>
Subject: Re: kern/56828: futex calls in Linux emulation sometimes hang
Date: Sat, 18 Jan 2025 07:45:56 +0000

 Can you try again with sys_futex.c 1.22 -- both the futex tests, and
 your Java application?

State-Changed-From-To: open->feedback
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Sat, 18 Jan 2025 07:51:43 +0000
State-Changed-Why:
candidate fixes committed, feedback requested


From: Thomas Klausner <wiz@NetBSD.org>
To: Taylor R Campbell <riastradh@NetBSD.org>
Cc: gnats-bugs@NetBSD.org, Jason Thorpe <thorpej@NetBSD.org>
Subject: Re: kern/56828: futex calls in Linux emulation sometimes hang
Date: Sat, 18 Jan 2025 11:36:27 +0100

 On Sat, Jan 18, 2025 at 07:45:56AM +0000, Taylor R Campbell wrote:
 > Can you try again with sys_futex.c 1.22 -- both the futex tests, and
 > your Java application?

 Thank you!

 The futex tests look much better now, but quite a lot are still
 failing (mostly futex_wait issues):

 *** futex_cmp_requeue01 ***

 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 futex_cmp_requeue01.c:194: TINFO: Testing variant: syscall with old kernel spec
 futex_cmp_requeue01.c:103: TINFO: Test 0: waiters: 10, wakes: 3, requeues: 7
 futex_cmp_requeue01.c:126: TINFO: futex_cmp_requeue() returned 10
 futex_cmp_requeue01.c:140: TINFO: children woken, futex0: 0, futex1: 7, spurious wakeups: 0
 futex_cmp_requeue01.c:187: TPASS: futex_cmp_requeue()
 futex_cmp_requeue01.c:103: TINFO: Test 1: waiters: 10, wakes: 0, requeues: 10
 futex_cmp_requeue01.c:126: TINFO: futex_cmp_requeue() returned 10
 futex_cmp_requeue01.c:140: TINFO: children woken, futex0: 0, futex1: 10, spurious wakeups: 0
 futex_cmp_requeue01.c:187: TPASS: futex_cmp_requeue()
 futex_cmp_requeue01.c:103: TINFO: Test 2: waiters: 10, wakes: 2, requeues: 6
 futex_cmp_requeue01.c:126: TINFO: futex_cmp_requeue() returned 8
 futex_cmp_requeue01.c:140: TINFO: children woken, futex0: 2, futex1: 6, spurious wakeups: 0
 futex_cmp_requeue01.c:187: TPASS: futex_cmp_requeue()
 futex_cmp_requeue01.c:103: TINFO: Test 3: waiters: 100, wakes: 50, requeues: 50
 futex_cmp_requeue01.c:126: TINFO: futex_cmp_requeue() returned 100
 futex_cmp_requeue01.c:140: TINFO: children woken, futex0: 0, futex1: 50, spurious wakeups: 0
 futex_cmp_requeue01.c:187: TPASS: futex_cmp_requeue()
 futex_cmp_requeue01.c:103: TINFO: Test 4: waiters: 100, wakes: 0, requeues: 70
 futex_cmp_requeue01.c:126: TINFO: futex_cmp_requeue() returned 70
 futex_cmp_requeue01.c:140: TINFO: children woken, futex0: 30, futex1: 70, spurious wakeups: 0
 futex_cmp_requeue01.c:187: TPASS: futex_cmp_requeue()
 futex_cmp_requeue01.c:95: TBROK: fork() failed: EAGAIN/EWOULDBLOCK (11)
 tst_test.c:1606: TINFO: Killed the leftover descendant processes

 Summary:
 passed   5
 failed   0
 broken   1
 skipped  0
 warnings 0

 *** futex_cmp_requeue02 ***

 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 futex_cmp_requeue02.c:71: TINFO: Testing variant: syscall with old kernel spec
 futex_cmp_requeue02.c:64: TPASS: futex_cmp_requeue() failed as expected: EINVAL (22)
 futex_cmp_requeue02.c:64: TPASS: futex_cmp_requeue() failed as expected: EINVAL (22)
 futex_cmp_requeue02.c:64: TPASS: futex_cmp_requeue() failed as expected: EAGAIN/EWOULDBLOCK (11)

 Summary:
 passed   3
 failed   0
 broken   0
 skipped  0
 warnings 0

 *** futex_wait01 ***

 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 futex_wait01.c:69: TINFO: Testing variant: syscall with old kernel spec
 futex_wait01.c:62: TPASS: futex_wait() passed: ETIMEDOUT (110)
 futex_wait01.c:62: TPASS: futex_wait() passed: EAGAIN/EWOULDBLOCK (11)
 futex_wait01.c:62: TPASS: futex_wait() passed: ETIMEDOUT (110)
 futex_wait01.c:62: TPASS: futex_wait() passed: EAGAIN/EWOULDBLOCK (11)

 Summary:
 passed   4
 failed   0
 broken   0
 skipped  0
 warnings 0

 *** futex_wait02 ***

 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 futex_wait02.c:66: TINFO: Testing variant: syscall with old kernel spec
 futex_wait02.c:59: TPASS: futex_wait() woken up

 Summary:
 passed   1
 failed   0
 broken   0
 skipped  0
 warnings 0

 *** futex_wait03 ***

 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 futex_wait03.c:63: TINFO: Testing variant: syscall with old kernel spec
 Test timeouted, sending SIGKILL!
 tst_test.c:1612: TINFO: If you are running on slow machine, try exporting LTP_TIMEOUT_MUL > 1
 tst_test.c:1614: TBROK: Test killed! (timeout?)

 Summary:
 passed   0
 failed   0
 broken   1
 skipped  0
 warnings 0

 *** futex_wait04 ***

 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 futex_wait04.c:50: TINFO: Testing variant: syscall with old kernel spec
 futex_wait04.c:39: TPASS: futex_wait() returned -1: EAGAIN/EWOULDBLOCK (11)

 Summary:
 passed   1
 failed   0
 broken   0
 skipped  0
 warnings 0

 *** futex_wait05 ***

 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 sh: systemd-detect-virt: command not found
 tst_timer_test.c:357: TINFO: CLOCK_MONOTONIC resolution 1ns
 tst_timer_test.c:365: TINFO: prctl(PR_GET_TIMERSLACK) = -1, using 50us
 tst_test.c:1566: TINFO: Updating max runtime to 0h 00m 09s
 tst_test.c:1558: TINFO: Timeout per run is 0h 00m 39s
 tst_timer_test.c:379: TINFO: Failed to set zero latency constraint: No such file or directory
 tst_timer_test.c:263: TINFO: futex_wait() sleeping for 1000us 500 iterations, threshold 450.01us
 tst_timer_test.c:285: TINFO: Found 500 outliners in [20098,13688] range
 tst_timer_test.c:305: TINFO: min 13688us, max 20098us, median 20000us, trunc mean 19976.46us (discarded 25)
 tst_timer_test.c:314: TFAIL: futex_wait() slept for too long

  Time: us | Frequency
 --------------------------------------------------------------------------------
     13688 | .
     14026 | 
     14364 | 
     14702 | 
     15040 | 
     15378 | 
     15716 | 
     16054 | .
     16392 | 
     16730 | 
     17068 | 
     17406 | 
     17744 | 
     18082 | 
     18420 | 
     18758 | 
     19096 | .
     19434 | 
     19772 | ********************************************************************
 --------------------------------------------------------------------------------
     338us | 1 sample = 0.13682 '*', 0.27364 '+', 0.54728 '-', non-zero '.'

 tst_timer_test.c:263: TINFO: futex_wait() sleeping for 2000us 500 iterations, threshold 450.01us
 tst_timer_test.c:285: TINFO: Found 17 outliners in [20141,20001] range
 tst_timer_test.c:305: TINFO: min 19587us, max 20141us, median 20000us, trunc mean 19996.29us (discarded 25)
 tst_timer_test.c:314: TFAIL: futex_wait() slept for too long

  Time: us | Frequency
 --------------------------------------------------------------------------------
     19587 | .
     19617 | 
     19647 | 
     19677 | 
     19707 | 
     19737 | 
     19767 | 
     19797 | 
     19827 | 
     19857 | +
     19887 | -
     19917 | 
     19947 | 
     19977 | ********************************************************************
     20007 | 
     20037 | 
     20067 | 
     20097 | -
     20127 | +
 --------------------------------------------------------------------------------
      30us | 1 sample = 0.14196 '*', 0.28392 '+', 0.56785 '-', non-zero '.'

 tst_timer_test.c:263: TINFO: futex_wait() sleeping for 5000us 300 iterations, threshold 450.04us
 tst_timer_test.c:305: TINFO: min 19630us, max 20141us, median 20000us, trunc mean 19995.72us (discarded 15)
 tst_timer_test.c:314: TFAIL: futex_wait() slept for too long

  Time: us | Frequency
 --------------------------------------------------------------------------------
     19630 | .
     19657 | 
     19684 | 
     19711 | 
     19738 | 
     19765 | 
     19792 | 
     19819 | 
     19846 | *
     19873 | 
     19900 | .
     19927 | 
     19954 | 
     19981 | ********************************************************************
     20008 | 
     20035 | 
     20062 | 
     20089 | .
     20116 | *
 --------------------------------------------------------------------------------
      27us | 1 sample = 0.23693 '*', 0.47387 '+', 0.94774 '-', non-zero '.'

 tst_timer_test.c:263: TINFO: futex_wait() sleeping for 10000us 100 iterations, threshold 450.33us
 tst_timer_test.c:305: TINFO: min 19670us, max 20141us, median 20000us, trunc mean 19992.52us (discarded 5)
 tst_timer_test.c:314: TFAIL: futex_wait() slept for too long

  Time: us | Frequency
 --------------------------------------------------------------------------------
     19670 | +
     19695 | 
     19720 | 
     19745 | 
     19770 | 
     19795 | 
     19820 | 
     19845 | *+
     19870 | 
     19895 | 
     19920 | 
     19945 | +
     19970 | **-
     19995 | ********************************************************************
     20020 | 
     20045 | +
     20070 | 
     20095 | 
     20120 | *+
 --------------------------------------------------------------------------------
      25us | 1 sample = 0.75556 '*', 1.51111 '+', 3.02222 '-', non-zero '.'

 tst_timer_test.c:263: TINFO: futex_wait() sleeping for 25000us 50 iterations, threshold 451.29us
 tst_timer_test.c:305: TINFO: min 30356us, max 40009us, median 40000us, trunc mean 39790.12us (discarded 2)
 tst_timer_test.c:314: TFAIL: futex_wait() slept for too long

  Time: us | Frequency
 --------------------------------------------------------------------------------
     30356 | *-
     30865 | 
     31374 | 
     31883 | 
     32392 | 
     32901 | 
     33410 | 
     33919 | 
     34428 | 
     34937 | 
     35446 | 
     35955 | 
     36464 | 
     36973 | 
     37482 | 
     37991 | 
     38500 | 
     39009 | 
     39518 | ********************************************************************
 --------------------------------------------------------------------------------
     509us | 1 sample = 1.38776 '*', 2.77551 '+', 5.55102 '-', non-zero '.'

 tst_timer_test.c:263: TINFO: futex_wait() sleeping for 100000us 10 iterations, threshold 537.00us
 tst_timer_test.c:305: TINFO: min 109664us, max 110000us, median 109999us, trunc mean 109961.78us (discarded 1)
 tst_timer_test.c:314: TFAIL: futex_wait() slept for too long

  Time: us | Frequency
 --------------------------------------------------------------------------------
    109664 | *******+
    109682 | 
    109700 | 
    109718 | 
    109736 | 
    109754 | 
    109772 | 
    109790 | 
    109808 | 
    109826 | 
    109844 | 
    109862 | 
    109880 | 
    109898 | 
    109916 | 
    109934 | 
    109952 | 
    109970 | 
    109988 | ********************************************************************
 --------------------------------------------------------------------------------
      18us | 1 sample = 7.55556 '*', 15.11111 '+', 30.22222 '-', non-zero '.'

 tst_timer_test.c:263: TINFO: futex_wait() sleeping for 1000000us 2 iterations, threshold 4400.00us
 tst_timer_test.c:305: TINFO: min 1009663us, max 1009995us, median 1009663us, trunc mean 1009663.00us (discarded 1)
 tst_timer_test.c:314: TFAIL: futex_wait() slept for too long

  Time: us | Frequency
 --------------------------------------------------------------------------------
   1009663 | ********************************************************************
   1009681 | 
   1009699 | 
   1009717 | 
   1009735 | 
   1009753 | 
   1009771 | 
   1009789 | 
   1009807 | 
   1009825 | 
   1009843 | 
   1009861 | 
   1009879 | 
   1009897 | 
   1009915 | 
   1009933 | 
   1009951 | 
   1009969 | 
   1009987 | ********************************************************************
 --------------------------------------------------------------------------------
      18us | 1 sample = 68.00000 '*', 136.00000 '+', 272.00000 '-', non-zero '.'


 Summary:
 passed   0
 failed   7
 broken   0
 skipped  0
 warnings 0

 *** futex_wait_bitset01 ***

 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 futex_wait_bitset01.c:99: TINFO: Testing variant: syscall with old kernel spec
 futex_wait_bitset01.c:44: TINFO: testing futex_wait_bitset() timeout with CLOCK_MONOTONIC
 futex_wait_bitset01.c:86: TPASS: futex_wait_bitset() waited 113114us, expected 100010us
 futex_wait_bitset01.c:44: TINFO: testing futex_wait_bitset() timeout with CLOCK_REALTIME
 futex_wait_bitset01.c:86: TPASS: futex_wait_bitset() waited 119990us, expected 100010us

 Summary:
 passed   2
 failed   0
 broken   0
 skipped  0
 warnings 0

 *** futex_waitv01 ***

 tst_buffers.c:55: TINFO: Test is using guarded buffers
 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 futex2test.h:27: TCONF: syscall(449) __NR_futex_waitv not supported on your arch

 Summary:
 passed   0
 failed   0
 broken   0
 skipped  1
 warnings 0

 *** futex_waitv02 ***

 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 futex_waitv02.c:34: TINFO: Testing variant: syscall with old kernel spec
 tst_buffers.c:55: TINFO: Test is using guarded buffers
 futex2test.h:27: TCONF: syscall(449) __NR_futex_waitv not supported on your arch

 Summary:
 passed   0
 failed   0
 broken   0
 skipped  1
 warnings 0

 *** futex_waitv03 ***

 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 futex_waitv03.c:37: TINFO: Testing variant: syscall with old kernel spec
 tst_buffers.c:55: TINFO: Test is using guarded buffers
 futex2test.h:27: TCONF: syscall(449) __NR_futex_waitv not supported on your arch

 Summary:
 passed   0
 failed   0
 broken   0
 skipped  1
 warnings 0

 *** futex_wake01 ***

 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 futex_wake01.c:59: TINFO: Testing variant: syscall with old kernel spec
 futex_wake01.c:52: TPASS: futex_wake() passed
 futex_wake01.c:52: TPASS: futex_wake() passed
 futex_wake01.c:52: TPASS: futex_wake() passed
 futex_wake01.c:52: TPASS: futex_wake() passed
 futex_wake01.c:52: TPASS: futex_wake() passed
 futex_wake01.c:52: TPASS: futex_wake() passed

 Summary:
 passed   6
 failed   0
 broken   0
 skipped  0
 warnings 0

 *** futex_wake02 ***

 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 futex_wake02.c:134: TINFO: Testing variant: syscall with old kernel spec
 futex_utils.h:69: TINFO: 0 threads sleeping, expected 55
 futex_wake02.c:91: TPASS: futex_wake() woken up 1 threads
 futex_wake02.c:91: TPASS: futex_wake() woken up 2 threads
 futex_wake02.c:91: TPASS: futex_wake() woken up 3 threads
 futex_wake02.c:91: TPASS: futex_wake() woken up 4 threads
 futex_wake02.c:91: TPASS: futex_wake() woken up 5 threads
 futex_wake02.c:91: TPASS: futex_wake() woken up 6 threads
 futex_wake02.c:91: TPASS: futex_wake() woken up 7 threads
 futex_wake02.c:91: TPASS: futex_wake() woken up 8 threads
 futex_wake02.c:91: TPASS: futex_wake() woken up 9 threads
 futex_wake02.c:91: TPASS: futex_wake() woken up 10 threads
 futex_wake02.c:103: TPASS: futex_wake() woken up 0 threads

 Summary:
 passed   11
 failed   0
 broken   0
 skipped  0
 warnings 0

 *** futex_wake03 ***

 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 futex_wake03.c:97: TINFO: Testing variant: syscall with old kernel spec
 futex_wake03.c:61: TPASS: futex_wake() woken up 1 childs
 futex_wake03.c:61: TPASS: futex_wake() woken up 2 childs
 futex_wake03.c:61: TPASS: futex_wake() woken up 3 childs
 futex_wake03.c:61: TPASS: futex_wake() woken up 4 childs
 futex_wake03.c:61: TPASS: futex_wake() woken up 5 childs
 futex_wake03.c:61: TPASS: futex_wake() woken up 6 childs
 futex_wake03.c:61: TPASS: futex_wake() woken up 7 childs
 futex_wake03.c:61: TPASS: futex_wake() woken up 8 childs
 futex_wake03.c:61: TPASS: futex_wake() woken up 9 childs
 futex_wake03.c:61: TPASS: futex_wake() woken up 10 childs
 futex_wake03.c:89: TPASS: futex_wake() woken up 0 children

 Summary:
 passed   11
 failed   0
 broken   0
 skipped  0
 warnings 0

 *** futex_wake04 ***

 tst_test.c:1152: TCONF: Test needs to be run as root

 I can send you the tests if you want.

 I tried the Metalworks demo from jdk17 and it worked fine.

 Then I tried the PDF-Over application.
 I could get one successful run through the application, but
 I had about 9 other tries where it didn't complete the process.
 Mostly it did not show the PDF (step 2 of the process), or showed
 just a gray screen.

 Right now top says the process is in futex, so I suspect there are
 still more problems. Perhaps the futex_wait() problem bites us here.

 Thanks,
  Thomas

From: Taylor R Campbell <riastradh@NetBSD.org>
To: Thomas Klausner <wiz@NetBSD.org>
Cc: gnats-bugs@NetBSD.org, netbsd-bugs@NetBSD.org,
	Jason Thorpe <thorpej@NetBSD.org>
Subject: Re: kern/56828: futex calls in Linux emulation sometimes hang
Date: Sat, 18 Jan 2025 11:03:37 +0000

 > Date: Sat, 18 Jan 2025 11:36:27 +0100
 > From: Thomas Klausner <wiz@NetBSD.org>
 >
 > The futex tests look much better now, but still quite a lot are
 > failing (mostly futex_wait issues):
 >
 > futex_cmp_requeue01.c:95: TBROK: fork() failed: EAGAIN/EWOULDBLOCK (11)
 > tst_test.c:1606: TINFO: Killed the leftover descendant processes

 Looks like you hit a process rlimit.  Can you bump ulimit -p or
 kern.maxproc?
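
 For reference, a minimal sketch of checking the per-user process limit
 from a script before running a fork-heavy test. It uses Python's
 standard resource module; fits_under_limit() and its already_running
 argument are hypothetical helpers for illustration, and the comparison
 is only a rough approximation of what the kernel enforces.

```python
import resource


def nproc_limits():
    """Return (soft, hard) RLIMIT_NPROC for the current process."""
    return resource.getrlimit(resource.RLIMIT_NPROC)


def fits_under_limit(children, already_running, soft):
    """Rough check: can `children` more processes be forked under `soft`?"""
    if soft == resource.RLIM_INFINITY:
        return True
    return already_running + children <= soft


if __name__ == "__main__":
    soft, hard = nproc_limits()
    print("RLIMIT_NPROC soft:", soft, "hard:", hard)
```

 If the check fails, fork() can be expected to return EAGAIN as in the
 quoted TBROK line.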

 > *** futex_wait03 ***
 >
 > tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 > tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
 > tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 > futex_wait03.c:63: TINFO: Testing variant: syscall with old kernel spec
 > Test timeouted, sending SIGKILL!
 > tst_test.c:1612: TINFO: If you are running on slow machine, try exporting LTP_TIMEOUT_MUL > 1
 > tst_test.c:1614: TBROK: Test killed! (timeout?)
 >
 > Summary:
 > passed   0
 > failed   0
 > broken   1
 > skipped  0
 > warnings 0

 I suspect this is a bug in NetBSD's implementation of /proc/$pid/stat.

 This is the only test case that queries it from another thread, I
 think, and it looks like when that happens, /proc/$pid/stat doesn't
 correctly report the other thread as sleeping (`S') when it is waiting
 in futex(FUTEX_WAIT), so the wait-for-sleep busy loop spins forever
 (or until timeout).

 Could add a printf after TST_PROCESS_STATE_WAIT (and an fflush after
 that) to verify that the test never gets past that loop.
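
 To illustrate the failure mode, here is a sketch of a wait-for-sleep
 loop of the kind described above. It is not LTP's code: stat_state(),
 wait_for_state() and the read_stat callable are hypothetical names.
 The only assumption about the file format is the Linux one, where the
 state letter is the first field after the parenthesised command name.

```python
import time


def stat_state(stat_line):
    """Return the one-letter state field from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may contain spaces
    # or ')', so split after the last ')' instead of on whitespace.
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def wait_for_state(read_stat, wanted="S", timeout_s=1.0, poll_s=0.001):
    """Poll read_stat() until it reports `wanted`; False on timeout."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if stat_state(read_stat()) == wanted:
            return True
        time.sleep(poll_s)
    return False
```

 If the stat file never reports `S' for the waiting thread, a loop like
 this can only end through its timeout, which would match the SIGKILL
 in the log.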

 > *** futex_wait05 ***
 > [...]
 > tst_timer_test.c:263: TINFO: futex_wait() sleeping for 1000us 500 iterations, threshold 450.01us
 > tst_timer_test.c:285: TINFO: Found 500 outliners in [20098,13688] range
 > tst_timer_test.c:305: TINFO: min 13688us, max 20098us, median 20000us, trunc mean 19976.46us (discarded 25)
 > tst_timer_test.c:314: TFAIL: futex_wait() slept for too long

 These failures are all about the limited resolution of sleeps.  I'm
 guessing you're running at 100 Hz.  These times are around 1-2 ticks
 past the requested deadline, or 10-20ms = 10000-20000us (plus a tiny
 slop of a few dozen microseconds).  I would expect this to slow things
 down but not make them deadlock.
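
 As a worked check of that arithmetic, the sketch below models a sleep
 as "round the request up to whole ticks, then add one tick". This is
 only a model fitted to the medians in the futex_wait05 log, with the
 100 Hz tick rate being the guess above; it is not taken from the
 kernel's timeout code, and modeled_sleep_us() is a hypothetical name.

```python
def modeled_sleep_us(request_us, hz=100):
    """Model: round the request up to whole ticks, then add one tick."""
    tick_us = 1_000_000 // hz
    whole_ticks = -(-request_us // tick_us)  # ceiling division
    return (whole_ticks + 1) * tick_us


if __name__ == "__main__":
    for request in (1000, 2000, 5000, 10000, 25000, 100000, 1000000):
        print(request, "->", modeled_sleep_us(request))
```

 Under this model the requests of 1000 to 10000us all come out at
 20000us, 25000us at 40000us and 100000us at 110000us, which agrees
 with the reported medians to within a few microseconds.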

 > I tried the Metalworks demo from jdk17 and it worked fine.
 >
 > Then I tried the PDF-Over application.
 > I could get one successful run through the application, but
 > I had about 9 other tries where it didn't complete the process.
 > Mostly it did not show the PDF (step 2 of the process), or showed
 > just a gray screen.
 >
 > Right now top says the process is in futex, so I suspect there are
 > still more problems. Perhaps the futex_wait() problem bites us here.

 Boo.  I guess we need to kernhist it up to find what futex events had
 recently happened before the deadlock.

From: Thomas Klausner <wiz@NetBSD.org>
To: Taylor R Campbell <riastradh@NetBSD.org>
Cc: gnats-bugs@NetBSD.org, Jason Thorpe <thorpej@NetBSD.org>
Subject: Re: kern/56828: futex calls in Linux emulation sometimes hang
Date: Sat, 18 Jan 2025 12:17:02 +0100

 On Sat, Jan 18, 2025 at 11:03:37AM +0000, Taylor R Campbell wrote:
 > > Date: Sat, 18 Jan 2025 11:36:27 +0100
 > > From: Thomas Klausner <wiz@NetBSD.org>
 > > 
 > > The futex tests look much better now, but still quite a lot are
 > > failing (mostly futex_wait issues):
 > > 
 > > futex_cmp_requeue01.c:95: TBROK: fork() failed: EAGAIN/EWOULDBLOCK (11)
 > > tst_test.c:1606: TINFO: Killed the leftover descendant processes
 > 
 > Looks like you hit a process rlimit.  Can you bump ulimit -p or
 > kern.maxproc?

 I can't get 'ulimit -p' over 1044 (1045 throws an error). If I bump
 that and kern.maxproc=10000 (from 1044, what a coincidence), the test
 works:

 $ ./futex_cmp_requeue01
 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
 tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
 futex_cmp_requeue01.c:194: TINFO: Testing variant: syscall with old kernel spec
 futex_cmp_requeue01.c:103: TINFO: Test 0: waiters: 10, wakes: 3, requeues: 7
 futex_cmp_requeue01.c:126: TINFO: futex_cmp_requeue() returned 10
 futex_cmp_requeue01.c:140: TINFO: children woken, futex0: 0, futex1: 7, spurious wakeups: 0
 futex_cmp_requeue01.c:187: TPASS: futex_cmp_requeue()
 futex_cmp_requeue01.c:103: TINFO: Test 1: waiters: 10, wakes: 0, requeues: 10
 futex_cmp_requeue01.c:126: TINFO: futex_cmp_requeue() returned 10
 futex_cmp_requeue01.c:140: TINFO: children woken, futex0: 0, futex1: 10, spurious wakeups: 0
 futex_cmp_requeue01.c:187: TPASS: futex_cmp_requeue()
 futex_cmp_requeue01.c:103: TINFO: Test 2: waiters: 10, wakes: 2, requeues: 6
 futex_cmp_requeue01.c:126: TINFO: futex_cmp_requeue() returned 8
 futex_cmp_requeue01.c:140: TINFO: children woken, futex0: 2, futex1: 6, spurious wakeups: 0
 futex_cmp_requeue01.c:187: TPASS: futex_cmp_requeue()
 futex_cmp_requeue01.c:103: TINFO: Test 3: waiters: 100, wakes: 50, requeues: 50
 futex_cmp_requeue01.c:126: TINFO: futex_cmp_requeue() returned 100
 futex_cmp_requeue01.c:140: TINFO: children woken, futex0: 0, futex1: 50, spurious wakeups: 0
 futex_cmp_requeue01.c:187: TPASS: futex_cmp_requeue()
 futex_cmp_requeue01.c:103: TINFO: Test 4: waiters: 100, wakes: 0, requeues: 70
 futex_cmp_requeue01.c:126: TINFO: futex_cmp_requeue() returned 70
 futex_cmp_requeue01.c:140: TINFO: children woken, futex0: 30, futex1: 70, spurious wakeups: 0
 futex_cmp_requeue01.c:187: TPASS: futex_cmp_requeue()
 futex_cmp_requeue01.c:103: TINFO: Test 5: waiters: 1000, wakes: 100, requeues: 900
 futex_cmp_requeue01.c:126: TINFO: futex_cmp_requeue() returned 1000
 futex_cmp_requeue01.c:140: TINFO: children woken, futex0: 0, futex1: 900, spurious wakeups: 0
 futex_cmp_requeue01.c:187: TPASS: futex_cmp_requeue()
 futex_cmp_requeue01.c:103: TINFO: Test 6: waiters: 1000, wakes: 300, requeues: 500
 futex_cmp_requeue01.c:126: TINFO: futex_cmp_requeue() returned 800
 futex_cmp_requeue01.c:140: TINFO: children woken, futex0: 200, futex1: 500, spurious wakeups: 0
 futex_cmp_requeue01.c:187: TPASS: futex_cmp_requeue()

 Summary:
 passed   7
 failed   0
 broken   0
 skipped  0
 warnings 0


 > These failures are all about the limited resolution of sleeps.  I'm
 > guessing you're running at 100 Hz.  These times are around 1-2 ticks
 > past the requested deadline, or 10-20ms = 10000-20000us (plus a tiny
 > slop of a few dozen microseconds).  I would expect this to slow things
 > down but not make them deadlock.

 I'm running a GENERIC, so you're probably right with 100 Hz.

 Cheers,
  Thomas

State-Changed-From-To: feedback->needs-pullups
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Sat, 18 Jan 2025 12:42:34 +0000
State-Changed-Why:
feedback received
needs pullup-10, inapplicable <10
and then back to the drawing board for more issues


From: Hauke Fath <hauke@Espresso.Rhein-Neckar.DE>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, riastradh@NetBSD.org,
        Thomas Klausner <wiz@NetBSD.org>
Subject: Re: kern/56828 (futex calls in Linux emulation sometimes hang)
Date: Sat, 18 Jan 2025 15:21:59 +0100

 On Sat, 18 Jan 2025 12:42:35 +0000 (UTC), riastradh@NetBSD.org wrote:
 > Synopsis: futex calls in Linux emulation sometimes hang

 Do you think this would matter for kern/58677?

 I'll have to give a -current kernel a spin on Monday...

 Cheerio,
 Hauke

 -- 
 Hauke Fath                        <hauke@Espresso.Rhein-Neckar.DE>
 Linnéweg 7
 64342 Seeheim-Jugenheim
 Germany

From: Taylor R Campbell <riastradh@NetBSD.org>
To: Hauke Fath <hauke@Espresso.Rhein-Neckar.DE>
Cc: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org,
	gnats-admin@netbsd.org, Thomas Klausner <wiz@NetBSD.org>
Subject: Re: kern/56828 (futex calls in Linux emulation sometimes hang)
Date: Sat, 18 Jan 2025 16:01:59 +0000

 > Date: Sat, 18 Jan 2025 15:21:59 +0100
 > From: Hauke Fath <hauke@Espresso.Rhein-Neckar.DE>
 > 
 > On Sat, 18 Jan 2025 12:42:35 +0000 (UTC), riastradh@NetBSD.org wrote:
 > > Synopsis: futex calls in Linux emulation sometimes hang
 > 
 > Do you think this would matter for kern/58677?
 > 
 > I'll have to give a -current kernel a spin on Monday...

 Yes, there's a good chance of that.

From: Taylor R Campbell <riastradh@NetBSD.org>
To: Hauke Fath <hauke@Espresso.Rhein-Neckar.DE>
Cc: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org,
	gnats-admin@netbsd.org, Thomas Klausner <wiz@NetBSD.org>
Subject: Re: kern/56828 (futex calls in Linux emulation sometimes hang)
Date: Wed, 5 Mar 2025 14:17:05 +0000

 > Date: Sat, 18 Jan 2025 15:21:59 +0100
 > From: Hauke Fath <hauke@Espresso.Rhein-Neckar.DE>
 > 
 > On Sat, 18 Jan 2025 12:42:35 +0000 (UTC), riastradh@NetBSD.org wrote:
 > > Synopsis: futex calls in Linux emulation sometimes hang
 > 
 > Do you think this would matter for kern/58677?
 > 
 > I'll have to give a -current kernel a spin on Monday...

 I've just committed two more futex fixes, one affecting the semantics
 of FUTEX_WAKE_OP (which appeared in the ktrace for PR kern/58677
 though not with parameters that are affected) and another affecting
 the timing of FUTEX_WAIT.  PRs for these specific issues:

 PR kern/59129: futex(3): missing sign extension in FUTEX_WAKE_OP
 https://gnats.NetBSD.org/59129

 PR kern/59132: t_futex_ops:futex_wait_timeout_* sometimes fails on
 early wakeup
 https://gnats.NetBSD.org/59132
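
 For context on the first of these, here is a sketch of how the Linux
 FUTEX_WAKE_OP argument is packed, as I understand the futex(2)
 encoding: a 4-bit op, a 4-bit cmp, and two 12-bit fields, oparg and
 cmparg, which Linux treats as signed. The function names are
 hypothetical, and this thread does not say which field the NetBSD fix
 touches, so treat it as an illustration of sign extension only.

```python
def futex_op(op, cmp, oparg, cmparg):
    """Pack the four fields the way the Linux FUTEX_OP() macro does."""
    return (((op & 0xF) << 28) | ((cmp & 0xF) << 24)
            | ((oparg & 0xFFF) << 12) | (cmparg & 0xFFF))


def sign_extend12(value):
    """Interpret the low 12 bits of value as a signed number."""
    value &= 0xFFF
    return value - 0x1000 if value & 0x800 else value


def decode_args(encoded):
    """Return (oparg, cmparg) with both fields sign-extended."""
    return sign_extend12(encoded >> 12), sign_extend12(encoded)
```

 Without the sign extension, an oparg of -1 would be read back as 4095,
 so only callers passing negative arguments would notice the
 difference.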

 Can you please try again with a current kernel, and let me know how
 (a) sysutils/tsm8,
 (b) the java pdf application, and
 (c) the ltp tests
 work with the changes?

 If things are still broken, I guess I'll have to look closer at the
 full ktrace.

From: Thomas Klausner <wiz@NetBSD.org>
To: Taylor R Campbell <riastradh@NetBSD.org>
Cc: NetBSD bugtracking <gnats-bugs@NetBSD.org>
Subject: Re: kern/56828 (futex calls in Linux emulation sometimes hang)
Date: Wed, 5 Mar 2025 17:05:39 +0100

 On Wed, Mar 05, 2025 at 02:17:05PM +0000, Taylor R Campbell wrote:
 > (c) the ltp tests

 Thank you for the fixes!

 This unbroke one of the ltp tests (futex_cmp_requeue01), but two of
 them are still broken the same way (futex_wait03, futex_wait05).

 I can send you the binaries if you want.
  Thomas

From: Hauke Fath <h.fath@nt.tu-darmstadt.de>
To: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
        Thomas Klausner <wiz@NetBSD.org>
Cc: 
Subject: Re: kern/56828 (futex calls in Linux emulation sometimes hang)
Date: Wed, 5 Mar 2025 18:27:33 +0100

 On 05.03.2025 15:20, Taylor R Campbell via gnats wrote:
 >   Can you please try again with a current kernel, and let me know how
 >   (a) sysutils/tsm8,

 Thanks for your work. Unfortunately, I see no changes with today's kernel.

 Cheerio,
 Hauke


 -- 
       The ASCII Ribbon Campaign                    Hauke Fath
 ()     No HTML/RTF in email	        Institut für Nachrichtentechnik
 /\     No Word docs in email                     TU Darmstadt
       Respect for open standards              Ruf +49-6151-16-21344

>Unformatted:
