NetBSD Problem Report #56828
From wiz@yt.nih.at Fri May 13 09:52:10 2022
Return-Path: <wiz@yt.nih.at>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id E4F591A9239
for <gnats-bugs@gnats.NetBSD.org>; Fri, 13 May 2022 09:52:10 +0000 (UTC)
Message-Id: <20220513093416.AE0001CB6ABE@yt.nih.at>
Date: Fri, 13 May 2022 11:34:16 +0200 (CEST)
From: Thomas Klausner <wiz@NetBSD.org>
Reply-To: Thomas Klausner <wiz@NetBSD.org>
To: gnats-bugs@NetBSD.org
Subject: futex calls in Linux emulation sometimes hang
X-Send-Pr-Version: 3.95
>Number: 56828
>Category: kern
>Synopsis: futex calls in Linux emulation sometimes hang
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: needs-pullups
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Fri May 13 09:55:00 +0000 2022
>Closed-Date:
>Last-Modified: Wed Mar 05 20:10:00 +0000 2025
>Originator: Thomas Klausner
>Release: NetBSD 9.99.96
>Organization:
>Environment:
Architecture: x86_64
Machine: amd64
>Description:
I'm using a Java application running under Linux emulation.
It sometimes hangs and ps says it's in futex:
ps -alxwww | grep java
1000 20124 19935 581925 83 0 37599660 110892 futex Sl+ ttyp7 0:02.57 /usr/pkg/java/oracle-8/bin/java -cp /usr/pkg/PDF-Over/lib/* at.asit.pdfover.gui.Main file.pdf
Usually I can press CTRL-C and restart it; in roughly one out of every
three to five tries I can finish what I want to do with it.
>How-To-Repeat:
Install PDF-Over by downloading
https://webstart.buergerkarte.at/PDF-Over/setup_pdf-over_linux.jar
Install it using oracle-jre8-8.0.202:
oracle8-java -jar setup_pdf-over_linux.jar
Run it, using some variant of
/usr/pkg/bin/oracle8-java -cp "/installation/path/PDF-Over/lib/*" at.asit.pdfover.gui.Main "$@"
on a PDF file.
Sometimes it hangs before displaying the first screen, i.e. it pops up
a new window but that window stays gray. If you get further than that, just
ctrl-c and try again.
>Fix:
Yes, please!
>Release-Note:
>Audit-Trail:
From: Thomas Klausner <wiz@NetBSD.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/56828: futex calls in Linux emulation sometimes hang
Date: Fri, 13 May 2022 18:08:48 +0200
I'm told that this can also be reproduced in the following way:
Using the Metalworks demo
(/usr/pkg/java/openjdk8/demo/jfc/Metalworks/Metalworks.jar from the
openjdk8 package):
Start it, select the "About" dialog from the help menu and click the OK button.
I forgot to mention that this wasn't working better on the
thorpej-futex2 branch last I tried.
Thomas
From: Thomas Klausner <wiz@NetBSD.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/56828: futex calls in Linux emulation sometimes hang
Date: Fri, 16 Sep 2022 14:05:07 +0200
Some information from debugging with riastradh, OCR'd and handfixed:
crash> ps/w | grep futex
7203 8445 java linux 43 futex ffffa3212cc45ed0
7203 8701 java linux 43 futex ffffa3212cc3eed0
...
many many more of these.
(gdb) p futex_tab
$1 = {lock = {u = {mtxa_owner = 0, s = {mtxs_dummy = 0 '\000', mtxs_ipl = {_ipl = 0 '\000'}, mtxs_lock = 0 '\000', mtxs_unused = 0 '\000'}}}, va = {rbt_root = 0x0, rbt_ops = 0xffffffff81375ba0 <futex_rb_ops>,
rbt_minmax = {0x0, 0x0}}, oa = {rbt_root = 0xffff8039324615a0, rbt_ops = 0xffffffff81375b80 <futex_shared_rb_ops>, rbt_minmax = {0xffff8035d9ac71e0, 0xffff8035fa7f0c60}}}
(gdb) print ((struct futex_wait *) (0xffffa3212cc45ed0
- (size_t)&((struct futex_wait *)0)->fw_cv))
$2 = (struct futex_wait *) 0xffffa3212cc45ec8
(gdb) print &((struct futex_wait *)0)->fw_cv
$3 = (kcondvar_t *) 0x8
(gdb) print _Alignof (struct futex_wait)
$4 = 8
(gdb) print *((struct futex_wait *) (0xffffa3212cc45ed0 - (size_t)&((struct futex_wait *)0)->fw_cv))
$5 = {fw_lock = {u = {mtxa_owner = 0, s = {mtxs_dummy = 0 '\000', mtxs_ipl = {_ipl = 0 '\000'}, mtxs_lock = 0 '\000', mtxs_unused = 0 '\000'}}}, fw_cv = {cv_opaque = {0xffff803897cb8100, 0xffffffff813e61a9}}, fw_futex = 0xffff803873cf9400, fw_entry = {tqe_next = 0x0, tqe_prev = 0xffff803873cf9440}, fw_abort = {le_next = 0xca, le_prev = 0xffffa3212cc45f40}, fw_bitset = -1, fw_aborting = false}
(gdb) x/s ((struct futex_wait *)(0xffffa3212cc45ed0 - (size_t)&((struct futex_wait *)0)->fw_cv))->fw_cv->cv_opaque[1]
0xffffffff813e61a9: "futex"
(gdb) print *((struct futex_wait *)(0xffffa3212cc3eed0 - (size_t)&((struct futex_wait *)0)->fw_cv))
$6 = {fw_lock = {u = {mtxa_owner = 0, s = {mtxs_dummy = 0 '\000', mtxs_ipl = {_ipl = 0 '\000'}, mtxs_lock = 0 '\000', mtxs_unused = 0 '\000'}}}, fw_cv = {cv_opaque = {0xffff80389a4d3940, 0xffffffff813e61a9}}, fw_futex = 0xffff8038720bcac0, fw_entry = {tqe_next = 0x0, tqe_prev = 0xffff8038720bcb00}, fw_abort = {le_next = 0xffff8035da5d3100, le_prev = 0xffffa3212cc3ef30}, fw_bitset = -1, fw_aborting = false}
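For readers following the gdb session: the wait channel address shown by ps is the address of the fw_cv member, and subtracting that member's offset gives the enclosing struct futex_wait. Below is a minimal sketch of that arithmetic in C. The struct demo_futex_wait and the function names are hypothetical stand-ins, not the kernel's definitions; the only thing taken from the session is that fw_cv sits at offset 8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the kernel's struct futex_wait.  Only the
 * position of fw_cv matters here; it is laid out to sit at offset 8,
 * matching gdb's "$3 = (kcondvar_t *) 0x8".
 */
struct demo_futex_wait {
	uint64_t fw_lock;	/* placeholder for the 8-byte lock */
	uint64_t fw_cv[2];	/* placeholder for the condvar */
};

/*
 * Recover the address of the enclosing struct from the address of its
 * fw_cv member: the same subtraction done by hand in gdb.
 */
uint64_t
demo_wait_from_cv(uint64_t cv_addr)
{
	return cv_addr - offsetof(struct demo_futex_wait, fw_cv);
}

/* Worked example using the first wait channel from the ps output. */
void
demo_wait_from_cv_example(void)
{
	assert(demo_wait_from_cv(UINT64_C(0xffffa3212cc45ed0)) ==
	    UINT64_C(0xffffa3212cc45ec8));
}
```

The result agrees with the $2 value printed by gdb.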
From: Thomas Klausner <wiz@NetBSD.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/56828: futex calls in Linux emulation sometimes hang
Date: Sat, 25 Mar 2023 06:44:46 +0100
I compiled
https://github.com/linux-test-project/ltp/releases/download/20230127/ltp-full-20230127.tar.bz2
on a CentOS system (exact version unknown, sorry) and copied the contents of
testcases/kernel/syscalls/futex (see
https://github.com/linux-test-project/ltp/tree/master/testcases/kernel/syscalls/futex)
to a NetBSD 10.99.2/amd64 system and ran them. The output shows quite
a number of problems (look for 'failed' and 'broken' below):
*** futex_cmp_requeue01 ***
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
futex_cmp_requeue01.c:194: TINFO: Testing variant: syscall with old kernel spec
futex_cmp_requeue01.c:103: TINFO: Test 0: waiters: 10, wakes: 3, requeues: 7
futex_cmp_requeue01.c:126: TINFO: futex_cmp_requeue() returned 3
futex_cmp_requeue01.c:140: TINFO: children woken, futex0: 0, futex1: 7, spurious wakeups: 0
futex_cmp_requeue01.c:180: TFAIL: woken up -4, expected range (3, 3)
futex_cmp_requeue01.c:103: TINFO: Test 1: waiters: 10, wakes: 0, requeues: 10
futex_cmp_requeue01.c:126: TINFO: futex_cmp_requeue() returned 0
futex_cmp_requeue01.c:140: TINFO: children woken, futex0: 0, futex1: 10, spurious wakeups: 0
futex_cmp_requeue01.c:180: TFAIL: woken up -10, expected range (0, 0)
futex_cmp_requeue01.c:103: TINFO: Test 2: waiters: 10, wakes: 2, requeues: 6
futex_cmp_requeue01.c:126: TINFO: futex_cmp_requeue() returned 2
futex_cmp_requeue01.c:140: TINFO: children woken, futex0: 2, futex1: 6, spurious wakeups: 0
futex_cmp_requeue01.c:180: TFAIL: woken up -4, expected range (2, 2)
futex_cmp_requeue01.c:103: TINFO: Test 3: waiters: 100, wakes: 50, requeues: 50
futex_cmp_requeue01.c:126: TINFO: futex_cmp_requeue() returned 50
futex_cmp_requeue01.c:140: TINFO: children woken, futex0: 0, futex1: 50, spurious wakeups: 0
futex_cmp_requeue01.c:180: TFAIL: woken up 0, expected range (50, 50)
futex_cmp_requeue01.c:103: TINFO: Test 4: waiters: 100, wakes: 0, requeues: 70
futex_cmp_requeue01.c:126: TINFO: futex_cmp_requeue() returned 0
futex_cmp_requeue01.c:140: TINFO: children woken, futex0: 30, futex1: 70, spurious wakeups: 0
futex_cmp_requeue01.c:180: TFAIL: woken up -70, expected range (0, 0)
futex_cmp_requeue01.c:95: TBROK: fork() failed: EAGAIN/EWOULDBLOCK (11)
tst_test.c:1606: TINFO: Killed the leftover descendant processes
Summary:
passed 0
failed 5
broken 1
skipped 0
warnings 0
*** futex_cmp_requeue02 ***
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
futex_cmp_requeue02.c:71: TINFO: Testing variant: syscall with old kernel spec
futex_cmp_requeue02.c:64: TPASS: futex_cmp_requeue() failed as expected: EINVAL (22)
futex_cmp_requeue02.c:64: TPASS: futex_cmp_requeue() failed as expected: EINVAL (22)
futex_cmp_requeue02.c:53: TFAIL: futex_cmp_requeue() succeeded unexpectedly
HINT: You _MAY_ be missing kernel fixes:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fbe0e839d1e2
HINT: You _MAY_ be vulnerable to CVE(s):
https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2018-6927
Summary:
passed 2
failed 1
broken 0
skipped 0
warnings 0
*** futex_wait01 ***
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
futex_wait01.c:69: TINFO: Testing variant: syscall with old kernel spec
futex_wait01.c:62: TPASS: futex_wait() passed: ETIMEDOUT (110)
futex_wait01.c:62: TPASS: futex_wait() passed: EAGAIN/EWOULDBLOCK (11)
futex_wait01.c:62: TPASS: futex_wait() passed: ETIMEDOUT (110)
futex_wait01.c:62: TPASS: futex_wait() passed: EAGAIN/EWOULDBLOCK (11)
Summary:
passed 4
failed 0
broken 0
skipped 0
warnings 0
*** futex_wait02 ***
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
futex_wait02.c:66: TINFO: Testing variant: syscall with old kernel spec
futex_wait02.c:59: TPASS: futex_wait() woken up
Summary:
passed 1
failed 0
broken 0
skipped 0
warnings 0
*** futex_wait03 ***
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
futex_wait03.c:63: TINFO: Testing variant: syscall with old kernel spec
Test timeouted, sending SIGKILL!
tst_test.c:1612: TINFO: If you are running on slow machine, try exporting LTP_TIMEOUT_MUL > 1
tst_test.c:1614: TBROK: Test killed! (timeout?)
Summary:
passed 0
failed 0
broken 1
skipped 0
warnings 0
*** futex_wait04 ***
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
futex_wait04.c:50: TINFO: Testing variant: syscall with old kernel spec
futex_wait04.c:39: TPASS: futex_wait() returned -1: EAGAIN/EWOULDBLOCK (11)
Summary:
passed 1
failed 0
broken 0
skipped 0
warnings 0
*** futex_wait05 ***
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
sh: systemd-detect-virt: not found
tst_timer_test.c:357: TINFO: CLOCK_MONOTONIC resolution 69ns
tst_timer_test.c:365: TINFO: prctl(PR_GET_TIMERSLACK) = -1, using 50us
tst_test.c:1566: TINFO: Updating max runtime to 0h 00m 09s
tst_test.c:1558: TINFO: Timeout per run is 0h 00m 39s
tst_timer_test.c:379: TINFO: Failed to set zero latency constraint: No such file or directory
tst_timer_test.c:263: TINFO: futex_wait() sleeping for 1000us 500 iterations, threshold 450.01us
tst_timer_test.c:285: TINFO: Found 500 outliners in [20033,10715] range
tst_timer_test.c:305: TINFO: min 10715us, max 20033us, median 19998us, trunc mean 19811.95us (discarded 25)
tst_timer_test.c:314: TFAIL: futex_wait() slept for too long
Time: us | Frequency
--------------------------------------------------------------------------------
10715 | *
11206 |
11697 |
12188 | .
12679 |
13170 |
13661 |
14152 |
14643 |
15134 |
15625 |
16116 |
16607 |
17098 |
17589 |
18080 |
18571 | +
19062 | -
19553 | ********************************************************************
--------------------------------------------------------------------------------
491us | 1 sample = 0.14108 '*', 0.28216 '+', 0.56432 '-', non-zero '.'
tst_timer_test.c:263: TINFO: futex_wait() sleeping for 2000us 500 iterations, threshold 450.01us
tst_timer_test.c:285: TINFO: Found 11 outliners in [20014,20001] range
tst_timer_test.c:305: TINFO: min 11091us, max 20014us, median 19998us, trunc mean 19963.34us (discarded 25)
tst_timer_test.c:314: TFAIL: futex_wait() slept for too long
Time: us | Frequency
--------------------------------------------------------------------------------
11091 | .
11561 |
12031 |
12501 |
12971 |
13441 |
13911 | .
14381 |
14851 |
15321 |
15791 |
16261 |
16731 |
17201 |
17671 |
18141 |
18611 | .
19081 |
19551 | ********************************************************************
--------------------------------------------------------------------------------
470us | 1 sample = 0.13682 '*', 0.27364 '+', 0.54728 '-', non-zero '.'
tst_timer_test.c:263: TINFO: futex_wait() sleeping for 5000us 300 iterations, threshold 450.04us
tst_timer_test.c:305: TINFO: min 19554us, max 20000us, median 19998us, trunc mean 19996.26us (discarded 15)
tst_timer_test.c:314: TFAIL: futex_wait() slept for too long
Time: us | Frequency
--------------------------------------------------------------------------------
19554 | .
19578 |
19602 |
19626 |
19650 |
19674 |
19698 |
19722 |
19746 |
19770 |
19794 |
19818 |
19842 |
19866 |
19890 |
19914 |
19938 |
19962 |
19986 | ********************************************************************
--------------------------------------------------------------------------------
24us | 1 sample = 0.22742 '*', 0.45485 '+', 0.90970 '-', non-zero '.'
tst_timer_test.c:263: TINFO: futex_wait() sleeping for 10000us 100 iterations, threshold 450.33us
tst_timer_test.c:305: TINFO: min 19678us, max 20001us, median 19998us, trunc mean 19994.49us (discarded 5)
tst_timer_test.c:314: TFAIL: futex_wait() slept for too long
Time: us | Frequency
--------------------------------------------------------------------------------
19678 | +
19695 |
19712 |
19729 |
19746 |
19763 |
19780 |
19797 |
19814 |
19831 |
19848 |
19865 |
19882 |
19899 |
19916 |
19933 |
19950 |
19967 |
19984 | ********************************************************************
20001 | +
--------------------------------------------------------------------------------
17us | 1 sample = 0.69388 '*', 1.38776 '+', 2.77551 '-', non-zero '.'
tst_timer_test.c:263: TINFO: futex_wait() sleeping for 25000us 50 iterations, threshold 451.29us
tst_timer_test.c:305: TINFO: min 39713us, max 40000us, median 39998us, trunc mean 39992.19us (discarded 2)
tst_timer_test.c:314: TFAIL: futex_wait() slept for too long
Time: us | Frequency
--------------------------------------------------------------------------------
39713 | *-
39729 |
39745 |
39761 |
39777 |
39793 |
39809 |
39825 |
39841 |
39857 |
39873 |
39889 |
39905 |
39921 |
39937 |
39953 |
39969 |
39985 | ********************************************************************
--------------------------------------------------------------------------------
16us | 1 sample = 1.38776 '*', 2.77551 '+', 5.55102 '-', non-zero '.'
tst_timer_test.c:263: TINFO: futex_wait() sleeping for 100000us 10 iterations, threshold 537.00us
tst_timer_test.c:305: TINFO: min 109716us, max 110001us, median 109999us, trunc mean 109967.78us (discarded 1)
tst_timer_test.c:314: TFAIL: futex_wait() slept for too long
Time: us | Frequency
--------------------------------------------------------------------------------
109716 | ********+
109731 |
109746 |
109761 |
109776 |
109791 |
109806 |
109821 |
109836 |
109851 |
109866 |
109881 |
109896 |
109911 |
109926 |
109941 |
109956 |
109971 |
109986 | ********************************************************************
110001 | ********+
--------------------------------------------------------------------------------
15us | 1 sample = 8.50000 '*', 17.00000 '+', 34.00000 '-', non-zero '.'
tst_timer_test.c:263: TINFO: futex_wait() sleeping for 1000000us 2 iterations, threshold 4400.00us
tst_timer_test.c:305: TINFO: min 1009721us, max 1010012us, median 1009721us, trunc mean 1009721.00us (discarded 1)
tst_timer_test.c:314: TFAIL: futex_wait() slept for too long
Time: us | Frequency
--------------------------------------------------------------------------------
1009721 | ********************************************************************
1009737 |
1009753 |
1009769 |
1009785 |
1009801 |
1009817 |
1009833 |
1009849 |
1009865 |
1009881 |
1009897 |
1009913 |
1009929 |
1009945 |
1009961 |
1009977 |
1009993 |
1010009 | ********************************************************************
--------------------------------------------------------------------------------
16us | 1 sample = 68.00000 '*', 136.00000 '+', 272.00000 '-', non-zero '.'
Summary:
passed 0
failed 7
broken 0
skipped 0
warnings 0
*** futex_wait_bitset01 ***
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
futex_wait_bitset01.c:99: TINFO: Testing variant: syscall with old kernel spec
futex_wait_bitset01.c:44: TINFO: testing futex_wait_bitset() timeout with CLOCK_MONOTONIC
futex_wait_bitset01.c:86: TPASS: futex_wait_bitset() waited 114236us, expected 100010us
futex_wait_bitset01.c:44: TINFO: testing futex_wait_bitset() timeout with CLOCK_REALTIME
futex_wait_bitset01.c:86: TPASS: futex_wait_bitset() waited 119960us, expected 100010us
Summary:
passed 2
failed 0
broken 0
skipped 0
warnings 0
*** futex_waitv01 ***
tst_test.c:899: TCONF: The test requires kernel 5.16 or newer
*** futex_waitv02 ***
tst_test.c:899: TCONF: The test requires kernel 5.16 or newer
*** futex_waitv03 ***
tst_test.c:899: TCONF: The test requires kernel 5.16 or newer
*** futex_wake01 ***
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
futex_wake01.c:59: TINFO: Testing variant: syscall with old kernel spec
futex_wake01.c:52: TPASS: futex_wake() passed
futex_wake01.c:52: TPASS: futex_wake() passed
futex_wake01.c:52: TPASS: futex_wake() passed
futex_wake01.c:52: TPASS: futex_wake() passed
futex_wake01.c:52: TPASS: futex_wake() passed
futex_wake01.c:52: TPASS: futex_wake() passed
Summary:
passed 6
failed 0
broken 0
skipped 0
warnings 0
*** futex_wake02 ***
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
futex_wake02.c:134: TINFO: Testing variant: syscall with old kernel spec
futex_utils.h:69: TINFO: 0 threads sleeping, expected 55
futex_wake02.c:91: TPASS: futex_wake() woken up 1 threads
futex_wake02.c:91: TPASS: futex_wake() woken up 2 threads
futex_wake02.c:91: TPASS: futex_wake() woken up 3 threads
futex_wake02.c:91: TPASS: futex_wake() woken up 4 threads
futex_wake02.c:91: TPASS: futex_wake() woken up 5 threads
futex_wake02.c:91: TPASS: futex_wake() woken up 6 threads
futex_wake02.c:91: TPASS: futex_wake() woken up 7 threads
futex_wake02.c:91: TPASS: futex_wake() woken up 8 threads
futex_wake02.c:91: TPASS: futex_wake() woken up 9 threads
futex_wake02.c:91: TPASS: futex_wake() woken up 10 threads
futex_wake02.c:103: TPASS: futex_wake() woken up 0 threads
Summary:
passed 11
failed 0
broken 0
skipped 0
warnings 0
*** futex_wake03 ***
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
futex_wake03.c:97: TINFO: Testing variant: syscall with old kernel spec
futex_wake03.c:61: TPASS: futex_wake() woken up 1 childs
futex_wake03.c:61: TPASS: futex_wake() woken up 2 childs
futex_wake03.c:61: TPASS: futex_wake() woken up 3 childs
futex_wake03.c:61: TPASS: futex_wake() woken up 4 childs
futex_wake03.c:61: TPASS: futex_wake() woken up 5 childs
futex_wake03.c:61: TPASS: futex_wake() woken up 6 childs
futex_wake03.c:61: TPASS: futex_wake() woken up 7 childs
futex_wake03.c:61: TPASS: futex_wake() woken up 8 childs
futex_wake03.c:61: TPASS: futex_wake() woken up 9 childs
futex_wake03.c:61: TPASS: futex_wake() woken up 10 childs
futex_wake03.c:89: TPASS: futex_wake() woken up 0 children
Summary:
passed 11
failed 0
broken 0
skipped 0
warnings 0
*** futex_wake04 ***
tst_test.c:1152: TCONF: Test needs to be run as root
Running the last one as root just gives:
tst_hugepage.c:34: TCONF: hugetlbfs is not supported
Perhaps these are easier-to-debug cases?
Thomas
From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/56828 CVS commit: src/tests/lib/libc/sys
Date: Sat, 18 Jan 2025 06:22:35 +0000
Module Name: src
Committed By: riastradh
Date: Sat Jan 18 06:22:35 UTC 2025
Modified Files:
src/tests/lib/libc/sys: t_futex_ops.c
Log Message:
tests/lib/libc/sys/t_futex_ops: Test FUTEX_CMP_REQUEUE edge case.
It must always compare the futex value and fail with EAGAIN on
mismatch, even if there are no waiters.
PR kern/56828: futex calls in Linux emulation sometimes hang
To generate a diff of this commit:
cvs rdiff -u -r1.5 -r1.6 src/tests/lib/libc/sys/t_futex_ops.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/56828 CVS commit: src/tests/lib/libc/sys
Date: Sat, 18 Jan 2025 06:22:56 +0000
Module Name: src
Committed By: riastradh
Date: Sat Jan 18 06:22:56 UTC 2025
Modified Files:
src/tests/lib/libc/sys: t_futex_ops.c
Log Message:
tests/lib/libc/sys/t_futex_ops: Fix FUTEX_CMP_REQUEUE return values.
The return value is the number of waiters woken _or requeued_, not
just the number of waiters woken:
FUTEX_CMP_REQUEUE
Returns the total number of waiters that were woken up or
requeued to the futex for the futex word at uaddr2. If
this value is greater than val, then the difference is the
number of waiters requeued to the futex for the futex word
at uaddr2.
https://man7.org/linux/man-pages/man2/futex.2.html
PR kern/56828: futex calls in Linux emulation sometimes hang
To generate a diff of this commit:
cvs rdiff -u -r1.6 -r1.7 src/tests/lib/libc/sys/t_futex_ops.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/56828 CVS commit: src/tests/lib/libc/sys
Date: Sat, 18 Jan 2025 07:05:15 +0000
Module Name: src
Committed By: riastradh
Date: Sat Jan 18 07:05:15 UTC 2025
Modified Files:
src/tests/lib/libc/sys: t_futex_ops.c
Log Message:
tests/lib/libc/sys/t_futex_ops: Fix another FUTEX_CMP_REQUEUE case.
PR kern/56828: futex calls in Linux emulation sometimes hang
To generate a diff of this commit:
cvs rdiff -u -r1.7 -r1.8 src/tests/lib/libc/sys/t_futex_ops.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/56828 CVS commit: src
Date: Sat, 18 Jan 2025 07:26:06 +0000
Module Name: src
Committed By: riastradh
Date: Sat Jan 18 07:26:06 UTC 2025
Modified Files:
src/sys/kern: sys_futex.c
src/tests/lib/libc/sys: t_futex_ops.c
Log Message:
futex(2): Fix FUTEX_CMP_REQUEUE to always compare even if no waiters.
It must always compare the futex value and fail with EAGAIN on
mismatch, even if there are no waiters.
FUTEX_CMP_REQUEUE (since Linux 2.6.7)
This operation first checks whether the location uaddr
still contains the value val3. If not, the operation
fails with the error EAGAIN. Otherwise, the operation [...]
https://man7.org/linux/man-pages/man2/futex.2.html
PR kern/56828: futex calls in Linux emulation sometimes hang
To generate a diff of this commit:
cvs rdiff -u -r1.20 -r1.21 src/sys/kern/sys_futex.c
cvs rdiff -u -r1.8 -r1.9 src/tests/lib/libc/sys/t_futex_ops.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
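The rule in the log message above can be illustrated with a small model. This is only a sketch under stated assumptions, not the sys_futex.c code: the function name and its parameters are hypothetical, and it models nothing beyond the ordering of the check, namely that the comparison against val3 happens whether or not anyone is waiting.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical model of the FUTEX_CMP_REQUEUE precondition.
 * current_value stands for the word at uaddr, val3 for the expected
 * value, nwaiters for the number of threads sleeping on uaddr.
 * Returns 0 if the operation may proceed, -EAGAIN on mismatch.
 */
int
demo_cmp_requeue_check(int current_value, int val3, unsigned nwaiters)
{
	/* The waiter count is deliberately not consulted before comparing. */
	(void)nwaiters;

	if (current_value != val3)
		return -EAGAIN;
	return 0;
}

/* A mismatch fails with EAGAIN even when there are no waiters. */
void
demo_cmp_requeue_check_example(void)
{
	assert(demo_cmp_requeue_check(1, 0, 0) == -EAGAIN);
	assert(demo_cmp_requeue_check(0, 0, 0) == 0);
}
```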
From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/56828 CVS commit: src
Date: Sat, 18 Jan 2025 07:26:22 +0000
Module Name: src
Committed By: riastradh
Date: Sat Jan 18 07:26:21 UTC 2025
Modified Files:
src/sys/kern: sys_futex.c
src/tests/lib/libc/sys: t_futex_ops.c
Log Message:
futex(2): Fix return value of FUTEX_CMP_REQUEUE.
The return value is the number of waiters woken _or requeued_, not
just the number of waiters woken:
FUTEX_CMP_REQUEUE
Returns the total number of waiters that were woken up or
requeued to the futex for the futex word at uaddr2. If
this value is greater than val, then the difference is the
number of waiters requeued to the futex for the futex word
at uaddr2.
https://man7.org/linux/man-pages/man2/futex.2.html
While here, clarify some of the arguments with comments so it's not
quite so cryptic with val/val2/val3 everywhere.
PR kern/56828: futex calls in Linux emulation sometimes hang
To generate a diff of this commit:
cvs rdiff -u -r1.21 -r1.22 src/sys/kern/sys_futex.c
cvs rdiff -u -r1.9 -r1.10 src/tests/lib/libc/sys/t_futex_ops.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
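To make the return value rule quoted above concrete, here is a small model of the count, assuming the kernel wakes up to the requested number of waiters first and then requeues up to the requested number of the remainder. The function names are hypothetical and this is not the kernel implementation; the parameter sets are those of the LTP futex_cmp_requeue01 cases quoted earlier in this report.

```c
#include <assert.h>

static unsigned
demo_min(unsigned a, unsigned b)
{
	return a < b ? a : b;
}

/*
 * Hypothetical model of the FUTEX_CMP_REQUEUE return value:
 * "waiters" threads sleep on uaddr, at most "nwake" of them are
 * woken, and at most "nrequeue" of the rest are moved to uaddr2.
 * The result counts both groups, not just the woken ones.
 */
unsigned
demo_cmp_requeue_count(unsigned waiters, unsigned nwake, unsigned nrequeue)
{
	unsigned woken = demo_min(waiters, nwake);
	unsigned requeued = demo_min(waiters - woken, nrequeue);

	return woken + requeued;
}

/* The five futex_cmp_requeue01 parameter sets. */
void
demo_cmp_requeue_count_example(void)
{
	assert(demo_cmp_requeue_count(10, 3, 7) == 10);
	assert(demo_cmp_requeue_count(10, 0, 10) == 10);
	assert(demo_cmp_requeue_count(10, 2, 6) == 8);
	assert(demo_cmp_requeue_count(100, 50, 50) == 100);
	assert(demo_cmp_requeue_count(100, 0, 70) == 70);
}
```

Under this model, a return value of 3 for the first case would count only the woken waiters, which is what the earlier LTP run reported.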
From: Taylor R Campbell <riastradh@NetBSD.org>
To: Thomas Klausner <wiz@NetBSD.org>
Cc: gnats-bugs@NetBSD.org, netbsd-bugs@NetBSD.org,
Jason Thorpe <thorpej@NetBSD.org>
Subject: Re: kern/56828: futex calls in Linux emulation sometimes hang
Date: Sat, 18 Jan 2025 07:45:56 +0000
Can you try again with sys_futex.c 1.22 -- both the futex tests, and
your Java application?
State-Changed-From-To: open->feedback
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Sat, 18 Jan 2025 07:51:43 +0000
State-Changed-Why:
candidate fixes committed, feedback requested
From: Thomas Klausner <wiz@NetBSD.org>
To: Taylor R Campbell <riastradh@NetBSD.org>
Cc: gnats-bugs@NetBSD.org, Jason Thorpe <thorpej@NetBSD.org>
Subject: Re: kern/56828: futex calls in Linux emulation sometimes hang
Date: Sat, 18 Jan 2025 11:36:27 +0100
On Sat, Jan 18, 2025 at 07:45:56AM +0000, Taylor R Campbell wrote:
> Can you try again with sys_futex.c 1.22 -- both the futex tests, and
> your Java application?
Thank you!
The futex tests look much better now, but still quite a lot are
failing (mostly futex_wait issues):
*** futex_cmp_requeue01 ***
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
futex_cmp_requeue01.c:194: TINFO: Testing variant: syscall with old kernel spec
futex_cmp_requeue01.c:103: TINFO: Test 0: waiters: 10, wakes: 3, requeues: 7
futex_cmp_requeue01.c:126: TINFO: futex_cmp_requeue() returned 10
futex_cmp_requeue01.c:140: TINFO: children woken, futex0: 0, futex1: 7, spurious wakeups: 0
futex_cmp_requeue01.c:187: TPASS: futex_cmp_requeue()
futex_cmp_requeue01.c:103: TINFO: Test 1: waiters: 10, wakes: 0, requeues: 10
futex_cmp_requeue01.c:126: TINFO: futex_cmp_requeue() returned 10
futex_cmp_requeue01.c:140: TINFO: children woken, futex0: 0, futex1: 10, spurious wakeups: 0
futex_cmp_requeue01.c:187: TPASS: futex_cmp_requeue()
futex_cmp_requeue01.c:103: TINFO: Test 2: waiters: 10, wakes: 2, requeues: 6
futex_cmp_requeue01.c:126: TINFO: futex_cmp_requeue() returned 8
futex_cmp_requeue01.c:140: TINFO: children woken, futex0: 2, futex1: 6, spurious wakeups: 0
futex_cmp_requeue01.c:187: TPASS: futex_cmp_requeue()
futex_cmp_requeue01.c:103: TINFO: Test 3: waiters: 100, wakes: 50, requeues: 50
futex_cmp_requeue01.c:126: TINFO: futex_cmp_requeue() returned 100
futex_cmp_requeue01.c:140: TINFO: children woken, futex0: 0, futex1: 50, spurious wakeups: 0
futex_cmp_requeue01.c:187: TPASS: futex_cmp_requeue()
futex_cmp_requeue01.c:103: TINFO: Test 4: waiters: 100, wakes: 0, requeues: 70
futex_cmp_requeue01.c:126: TINFO: futex_cmp_requeue() returned 70
futex_cmp_requeue01.c:140: TINFO: children woken, futex0: 30, futex1: 70, spurious wakeups: 0
futex_cmp_requeue01.c:187: TPASS: futex_cmp_requeue()
futex_cmp_requeue01.c:95: TBROK: fork() failed: EAGAIN/EWOULDBLOCK (11)
tst_test.c:1606: TINFO: Killed the leftover descendant processes
Summary:
passed 5
failed 0
broken 1
skipped 0
warnings 0
*** futex_cmp_requeue02 ***
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
futex_cmp_requeue02.c:71: TINFO: Testing variant: syscall with old kernel spec
futex_cmp_requeue02.c:64: TPASS: futex_cmp_requeue() failed as expected: EINVAL (22)
futex_cmp_requeue02.c:64: TPASS: futex_cmp_requeue() failed as expected: EINVAL (22)
futex_cmp_requeue02.c:64: TPASS: futex_cmp_requeue() failed as expected: EAGAIN/EWOULDBLOCK (11)
Summary:
passed 3
failed 0
broken 0
skipped 0
warnings 0
*** futex_wait01 ***
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
futex_wait01.c:69: TINFO: Testing variant: syscall with old kernel spec
futex_wait01.c:62: TPASS: futex_wait() passed: ETIMEDOUT (110)
futex_wait01.c:62: TPASS: futex_wait() passed: EAGAIN/EWOULDBLOCK (11)
futex_wait01.c:62: TPASS: futex_wait() passed: ETIMEDOUT (110)
futex_wait01.c:62: TPASS: futex_wait() passed: EAGAIN/EWOULDBLOCK (11)
Summary:
passed 4
failed 0
broken 0
skipped 0
warnings 0
*** futex_wait02 ***
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
futex_wait02.c:66: TINFO: Testing variant: syscall with old kernel spec
futex_wait02.c:59: TPASS: futex_wait() woken up
Summary:
passed 1
failed 0
broken 0
skipped 0
warnings 0
*** futex_wait03 ***
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
futex_wait03.c:63: TINFO: Testing variant: syscall with old kernel spec
Test timeouted, sending SIGKILL!
tst_test.c:1612: TINFO: If you are running on slow machine, try exporting LTP_TIMEOUT_MUL > 1
tst_test.c:1614: TBROK: Test killed! (timeout?)
Summary:
passed 0
failed 0
broken 1
skipped 0
warnings 0
*** futex_wait04 ***
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
futex_wait04.c:50: TINFO: Testing variant: syscall with old kernel spec
futex_wait04.c:39: TPASS: futex_wait() returned -1: EAGAIN/EWOULDBLOCK (11)
Summary:
passed 1
failed 0
broken 0
skipped 0
warnings 0
*** futex_wait05 ***
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
sh: systemd-detect-virt: command not found
tst_timer_test.c:357: TINFO: CLOCK_MONOTONIC resolution 1ns
tst_timer_test.c:365: TINFO: prctl(PR_GET_TIMERSLACK) = -1, using 50us
tst_test.c:1566: TINFO: Updating max runtime to 0h 00m 09s
tst_test.c:1558: TINFO: Timeout per run is 0h 00m 39s
tst_timer_test.c:379: TINFO: Failed to set zero latency constraint: No such file or directory
tst_timer_test.c:263: TINFO: futex_wait() sleeping for 1000us 500 iterations, threshold 450.01us
tst_timer_test.c:285: TINFO: Found 500 outliners in [20098,13688] range
tst_timer_test.c:305: TINFO: min 13688us, max 20098us, median 20000us, trunc mean 19976.46us (discarded 25)
tst_timer_test.c:314: TFAIL: futex_wait() slept for too long
Time: us | Frequency
--------------------------------------------------------------------------------
13688 | .
14026 |
14364 |
14702 |
15040 |
15378 |
15716 |
16054 | .
16392 |
16730 |
17068 |
17406 |
17744 |
18082 |
18420 |
18758 |
19096 | .
19434 |
19772 | ********************************************************************
--------------------------------------------------------------------------------
338us | 1 sample = 0.13682 '*', 0.27364 '+', 0.54728 '-', non-zero '.'
tst_timer_test.c:263: TINFO: futex_wait() sleeping for 2000us 500 iterations, threshold 450.01us
tst_timer_test.c:285: TINFO: Found 17 outliners in [20141,20001] range
tst_timer_test.c:305: TINFO: min 19587us, max 20141us, median 20000us, trunc mean 19996.29us (discarded 25)
tst_timer_test.c:314: TFAIL: futex_wait() slept for too long
Time: us | Frequency
--------------------------------------------------------------------------------
19587 | .
19617 |
19647 |
19677 |
19707 |
19737 |
19767 |
19797 |
19827 |
19857 | +
19887 | -
19917 |
19947 |
19977 | ********************************************************************
20007 |
20037 |
20067 |
20097 | -
20127 | +
--------------------------------------------------------------------------------
30us | 1 sample = 0.14196 '*', 0.28392 '+', 0.56785 '-', non-zero '.'
tst_timer_test.c:263: TINFO: futex_wait() sleeping for 5000us 300 iterations, threshold 450.04us
tst_timer_test.c:305: TINFO: min 19630us, max 20141us, median 20000us, trunc mean 19995.72us (discarded 15)
tst_timer_test.c:314: TFAIL: futex_wait() slept for too long
Time: us | Frequency
--------------------------------------------------------------------------------
19630 | .
19657 |
19684 |
19711 |
19738 |
19765 |
19792 |
19819 |
19846 | *
19873 |
19900 | .
19927 |
19954 |
19981 | ********************************************************************
20008 |
20035 |
20062 |
20089 | .
20116 | *
--------------------------------------------------------------------------------
27us | 1 sample = 0.23693 '*', 0.47387 '+', 0.94774 '-', non-zero '.'
tst_timer_test.c:263: TINFO: futex_wait() sleeping for 10000us 100 iterations, threshold 450.33us
tst_timer_test.c:305: TINFO: min 19670us, max 20141us, median 20000us, trunc mean 19992.52us (discarded 5)
tst_timer_test.c:314: TFAIL: futex_wait() slept for too long
Time: us | Frequency
--------------------------------------------------------------------------------
19670 | +
19695 |
19720 |
19745 |
19770 |
19795 |
19820 |
19845 | *+
19870 |
19895 |
19920 |
19945 | +
19970 | **-
19995 | ********************************************************************
20020 |
20045 | +
20070 |
20095 |
20120 | *+
--------------------------------------------------------------------------------
25us | 1 sample = 0.75556 '*', 1.51111 '+', 3.02222 '-', non-zero '.'
tst_timer_test.c:263: TINFO: futex_wait() sleeping for 25000us 50 iterations, threshold 451.29us
tst_timer_test.c:305: TINFO: min 30356us, max 40009us, median 40000us, trunc mean 39790.12us (discarded 2)
tst_timer_test.c:314: TFAIL: futex_wait() slept for too long
Time: us | Frequency
--------------------------------------------------------------------------------
30356 | *-
30865 |
31374 |
31883 |
32392 |
32901 |
33410 |
33919 |
34428 |
34937 |
35446 |
35955 |
36464 |
36973 |
37482 |
37991 |
38500 |
39009 |
39518 | ********************************************************************
--------------------------------------------------------------------------------
509us | 1 sample = 1.38776 '*', 2.77551 '+', 5.55102 '-', non-zero '.'
tst_timer_test.c:263: TINFO: futex_wait() sleeping for 100000us 10 iterations, threshold 537.00us
tst_timer_test.c:305: TINFO: min 109664us, max 110000us, median 109999us, trunc mean 109961.78us (discarded 1)
tst_timer_test.c:314: TFAIL: futex_wait() slept for too long
Time: us | Frequency
--------------------------------------------------------------------------------
109664 | *******+
109682 |
109700 |
109718 |
109736 |
109754 |
109772 |
109790 |
109808 |
109826 |
109844 |
109862 |
109880 |
109898 |
109916 |
109934 |
109952 |
109970 |
109988 | ********************************************************************
--------------------------------------------------------------------------------
18us | 1 sample = 7.55556 '*', 15.11111 '+', 30.22222 '-', non-zero '.'
tst_timer_test.c:263: TINFO: futex_wait() sleeping for 1000000us 2 iterations, threshold 4400.00us
tst_timer_test.c:305: TINFO: min 1009663us, max 1009995us, median 1009663us, trunc mean 1009663.00us (discarded 1)
tst_timer_test.c:314: TFAIL: futex_wait() slept for too long
Time: us | Frequency
--------------------------------------------------------------------------------
1009663 | ********************************************************************
1009681 |
1009699 |
1009717 |
1009735 |
1009753 |
1009771 |
1009789 |
1009807 |
1009825 |
1009843 |
1009861 |
1009879 |
1009897 |
1009915 |
1009933 |
1009951 |
1009969 |
1009987 | ********************************************************************
--------------------------------------------------------------------------------
18us | 1 sample = 68.00000 '*', 136.00000 '+', 272.00000 '-', non-zero '.'
Summary:
passed 0
failed 7
broken 0
skipped 0
warnings 0
*** futex_wait_bitset01 ***
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
futex_wait_bitset01.c:99: TINFO: Testing variant: syscall with old kernel spec
futex_wait_bitset01.c:44: TINFO: testing futex_wait_bitset() timeout with CLOCK_MONOTONIC
futex_wait_bitset01.c:86: TPASS: futex_wait_bitset() waited 113114us, expected 100010us
futex_wait_bitset01.c:44: TINFO: testing futex_wait_bitset() timeout with CLOCK_REALTIME
futex_wait_bitset01.c:86: TPASS: futex_wait_bitset() waited 119990us, expected 100010us
Summary:
passed 2
failed 0
broken 0
skipped 0
warnings 0
*** futex_waitv01 ***
tst_buffers.c:55: TINFO: Test is using guarded buffers
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
futex2test.h:27: TCONF: syscall(449) __NR_futex_waitv not supported on your arch
Summary:
passed 0
failed 0
broken 0
skipped 1
warnings 0
*** futex_waitv02 ***
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
futex_waitv02.c:34: TINFO: Testing variant: syscall with old kernel spec
tst_buffers.c:55: TINFO: Test is using guarded buffers
futex2test.h:27: TCONF: syscall(449) __NR_futex_waitv not supported on your arch
Summary:
passed 0
failed 0
broken 0
skipped 1
warnings 0
*** futex_waitv03 ***
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
futex_waitv03.c:37: TINFO: Testing variant: syscall with old kernel spec
tst_buffers.c:55: TINFO: Test is using guarded buffers
futex2test.h:27: TCONF: syscall(449) __NR_futex_waitv not supported on your arch
Summary:
passed 0
failed 0
broken 0
skipped 1
warnings 0
*** futex_wake01 ***
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
futex_wake01.c:59: TINFO: Testing variant: syscall with old kernel spec
futex_wake01.c:52: TPASS: futex_wake() passed
futex_wake01.c:52: TPASS: futex_wake() passed
futex_wake01.c:52: TPASS: futex_wake() passed
futex_wake01.c:52: TPASS: futex_wake() passed
futex_wake01.c:52: TPASS: futex_wake() passed
futex_wake01.c:52: TPASS: futex_wake() passed
Summary:
passed 6
failed 0
broken 0
skipped 0
warnings 0
*** futex_wake02 ***
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
futex_wake02.c:134: TINFO: Testing variant: syscall with old kernel spec
futex_utils.h:69: TINFO: 0 threads sleeping, expected 55
futex_wake02.c:91: TPASS: futex_wake() woken up 1 threads
futex_wake02.c:91: TPASS: futex_wake() woken up 2 threads
futex_wake02.c:91: TPASS: futex_wake() woken up 3 threads
futex_wake02.c:91: TPASS: futex_wake() woken up 4 threads
futex_wake02.c:91: TPASS: futex_wake() woken up 5 threads
futex_wake02.c:91: TPASS: futex_wake() woken up 6 threads
futex_wake02.c:91: TPASS: futex_wake() woken up 7 threads
futex_wake02.c:91: TPASS: futex_wake() woken up 8 threads
futex_wake02.c:91: TPASS: futex_wake() woken up 9 threads
futex_wake02.c:91: TPASS: futex_wake() woken up 10 threads
futex_wake02.c:103: TPASS: futex_wake() woken up 0 threads
Summary:
passed 11
failed 0
broken 0
skipped 0
warnings 0
*** futex_wake03 ***
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
futex_wake03.c:97: TINFO: Testing variant: syscall with old kernel spec
futex_wake03.c:61: TPASS: futex_wake() woken up 1 childs
futex_wake03.c:61: TPASS: futex_wake() woken up 2 childs
futex_wake03.c:61: TPASS: futex_wake() woken up 3 childs
futex_wake03.c:61: TPASS: futex_wake() woken up 4 childs
futex_wake03.c:61: TPASS: futex_wake() woken up 5 childs
futex_wake03.c:61: TPASS: futex_wake() woken up 6 childs
futex_wake03.c:61: TPASS: futex_wake() woken up 7 childs
futex_wake03.c:61: TPASS: futex_wake() woken up 8 childs
futex_wake03.c:61: TPASS: futex_wake() woken up 9 childs
futex_wake03.c:61: TPASS: futex_wake() woken up 10 childs
futex_wake03.c:89: TPASS: futex_wake() woken up 0 children
Summary:
passed 11
failed 0
broken 0
skipped 0
warnings 0
*** futex_wake04 ***
tst_test.c:1152: TCONF: Test needs to be run as root
I can send you the tests if you want.
I tried the Metalworks demo from jdk17 and it worked fine.
Then I tried the PDF-Over application.
I could get one successful run through the application, but
I had about 9 other tries where it didn't complete the process.
Mostly it did not show the PDF (step 2 of the process), or showed
just a gray screen.
Right now top says the process is in futex, so I suspect there are
still more problems. Perhaps the futex_wait() problem bites us here.
Thanks,
Thomas
From: Taylor R Campbell <riastradh@NetBSD.org>
To: Thomas Klausner <wiz@NetBSD.org>
Cc: gnats-bugs@NetBSD.org, netbsd-bugs@NetBSD.org,
Jason Thorpe <thorpej@NetBSD.org>
Subject: Re: kern/56828: futex calls in Linux emulation sometimes hang
Date: Sat, 18 Jan 2025 11:03:37 +0000
> Date: Sat, 18 Jan 2025 11:36:27 +0100
> From: Thomas Klausner <wiz@NetBSD.org>
>
> The futex tests look much better now, but still quite a lot are
> failing (mostly futex_wait issues):
>
> futex_cmp_requeue01.c:95: TBROK: fork() failed: EAGAIN/EWOULDBLOCK (11)
> tst_test.c:1606: TINFO: Killed the leftover descendant processes
Looks like you hit a process rlimit. Can you bump ulimit -p or
kern.maxproc?
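A quick way to see whether the per-user process limit is the likely culprit is to compare it with the number of children the test wants to fork. The following is a minimal sketch; `nproc_headroom` and the figure 1000 are hypothetical, chosen only for illustration. The limit covers all of the user's processes, so a passing check is necessary but not sufficient.

```python
import resource


def nproc_headroom(needed):
    """Return (soft, hard, ok) for the per-user process limit.

    `needed` is a hypothetical number of processes a test wants to fork.
    `ok` only says the soft limit is at least that large; processes the
    user already has running count against the same limit.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    ok = soft == resource.RLIM_INFINITY or soft >= needed
    return soft, hard, ok


if __name__ == "__main__":
    soft, hard, ok = nproc_headroom(1000)
    print(f"soft={soft} hard={hard} enough_for_1000={ok}")
```

If the soft limit is below the hard limit, `ulimit -p` can raise it in the shell; going past the hard limit needs the system-wide setting to be raised first.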
> *** futex_wait03 ***
>
> tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
> tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
> tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
> futex_wait03.c:63: TINFO: Testing variant: syscall with old kernel spec
> Test timeouted, sending SIGKILL!
> tst_test.c:1612: TINFO: If you are running on slow machine, try exporting LTP_TIMEOUT_MUL > 1
> tst_test.c:1614: TBROK: Test killed! (timeout?)
>
> Summary:
> passed 0
> failed 0
> broken 1
> skipped 0
> warnings 0
I suspect this is a bug in NetBSD's implementation of /proc/$pid/stat.
This is the only test case that queries it from another thread, I
think, and it looks like when that happens, /proc/$pid/stat doesn't
correctly report the other thread as sleeping (`S') when it is waiting
in futex(FUTEX_WAIT), so the wait-for-sleep busy loop spins forever
(or until timeout).
You could add a printf after TST_PROCESS_STATE_WAIT (and an fflush
after that) to verify that the test never gets past that loop.
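For illustration, a wait-for-sleep loop of the kind described above can be sketched as follows. This is not the LTP implementation, which is not shown in this thread; `parse_stat_state` and `wait_for_state` are hypothetical names, and the sketch assumes the Linux /proc/<pid>/stat layout in which the state letter follows the parenthesised command name. The caller supplies the function that reads the stat line, so the same loop works for a process or a thread entry.

```python
import time


def parse_stat_state(stat_line):
    """Extract the one-letter state from a /proc/<pid>/stat line.

    The command name is wrapped in parentheses and may itself contain
    spaces or ')', so split after the last ')'.
    """
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def wait_for_state(read_stat, wanted, timeout_s=1.0, poll_s=0.001):
    """Poll read_stat() until it reports `wanted`; return True on success.

    Unlike an unbounded busy loop, this gives up after timeout_s and
    returns False, which makes a stuck state easy to spot.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if parse_stat_state(read_stat()) == wanted:
            return True
        time.sleep(poll_s)
    return False
```

If the stat file keeps reporting a state other than `S` for a thread that is blocked in futex(FUTEX_WAIT), a loop like this never succeeds, which would match the timeout seen in futex_wait03.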
> *** futex_wait05 ***
> [...]
> tst_timer_test.c:263: TINFO: futex_wait() sleeping for 1000us 500 iterations, threshold 450.01us
> tst_timer_test.c:285: TINFO: Found 500 outliners in [20098,13688] range
> tst_timer_test.c:305: TINFO: min 13688us, max 20098us, median 20000us, trunc mean 19976.46us (discarded 25)
> tst_timer_test.c:314: TFAIL: futex_wait() slept for too long
These failures are all about the limited resolution of sleeps. I'm
guessing you're running at 100 Hz. These times are around 1-2 ticks
past the requested deadline, or 10-20ms = 10000-20000us (plus a tiny
slop of a few dozen microseconds). I would expect this to slow things
down but not make them deadlock.
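The tick arithmetic can be checked with a rough model: round the requested sleep up to a whole tick and add one more tick. This is an empirical sketch fitted to the medians in the futex_wait05 log, not the kernel's actual timeout code; `modeled_sleep_us` is a hypothetical name and hz=100 is the guess made above.

```python
def modeled_sleep_us(requested_us, hz=100):
    """Round the request up to a whole tick, then add one more tick.

    Empirical model only: it reproduces the medians in the log under
    the assumption of a 100 Hz clock (10000us per tick).
    """
    tick_us = 1_000_000 // hz
    ticks = -(-requested_us // tick_us)  # ceiling division
    return (ticks + 1) * tick_us


if __name__ == "__main__":
    # Requested sleeps taken from the futex_wait05 log.
    for req in (1000, 2000, 5000, 10000, 25000, 100000, 1000000):
        print(req, modeled_sleep_us(req))
```

Under this model the 1000us to 10000us requests all come out at 20000us and the 25000us request at 40000us, in line with the reported medians of 20000us and 40000us.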
> I tried the Metalworks demo from jdk17 and it worked fine.
>
> Then I tried the PDF-Over application.
> I could get one successful run through the application, but
> I had about 9 other tries where it didn't complete the process.
> Mostly not show the PDF (step 2 of the process), or show
> just a gray screen.
>
> Right now top says the process is in futex, so I suspect there are
> still more problems. Perhaps the futex_wait() problem bites us here.
Boo. I guess we need to kernhist it up to find what futex events had
recently happened before the deadlock.
From: Thomas Klausner <wiz@NetBSD.org>
To: Taylor R Campbell <riastradh@NetBSD.org>
Cc: gnats-bugs@NetBSD.org, Jason Thorpe <thorpej@NetBSD.org>
Subject: Re: kern/56828: futex calls in Linux emulation sometimes hang
Date: Sat, 18 Jan 2025 12:17:02 +0100
On Sat, Jan 18, 2025 at 11:03:37AM +0000, Taylor R Campbell wrote:
> > Date: Sat, 18 Jan 2025 11:36:27 +0100
> > From: Thomas Klausner <wiz@NetBSD.org>
> >
> > The futex tests look much better now, but still quite a lot are
> > failing (mostly futex_wait issues):
> >
> > futex_cmp_requeue01.c:95: TBROK: fork() failed: EAGAIN/EWOULDBLOCK (11)
> > tst_test.c:1606: TINFO: Killed the leftover descendant processes
>
> Looks like you hit a process rlimit. Can you bump ulimit -p or
> kern.maxproc?
I can't get 'ulimit -p' over 1044 (1045 throws an error). If I bump
that and set kern.maxproc=10000 (from 1044, what a coincidence), the
test works:
$ ./futex_cmp_requeue01
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
tst_memutils.c:141: TINFO: oom_score_adj does not exist, skipping the adjustment
futex_cmp_requeue01.c:194: TINFO: Testing variant: syscall with old kernel spec
futex_cmp_requeue01.c:103: TINFO: Test 0: waiters: 10, wakes: 3, requeues: 7
futex_cmp_requeue01.c:126: TINFO: futex_cmp_requeue() returned 10
futex_cmp_requeue01.c:140: TINFO: children woken, futex0: 0, futex1: 7, spurious wakeups: 0
futex_cmp_requeue01.c:187: TPASS: futex_cmp_requeue()
futex_cmp_requeue01.c:103: TINFO: Test 1: waiters: 10, wakes: 0, requeues: 10
futex_cmp_requeue01.c:126: TINFO: futex_cmp_requeue() returned 10
futex_cmp_requeue01.c:140: TINFO: children woken, futex0: 0, futex1: 10, spurious wakeups: 0
futex_cmp_requeue01.c:187: TPASS: futex_cmp_requeue()
futex_cmp_requeue01.c:103: TINFO: Test 2: waiters: 10, wakes: 2, requeues: 6
futex_cmp_requeue01.c:126: TINFO: futex_cmp_requeue() returned 8
futex_cmp_requeue01.c:140: TINFO: children woken, futex0: 2, futex1: 6, spurious wakeups: 0
futex_cmp_requeue01.c:187: TPASS: futex_cmp_requeue()
futex_cmp_requeue01.c:103: TINFO: Test 3: waiters: 100, wakes: 50, requeues: 50
futex_cmp_requeue01.c:126: TINFO: futex_cmp_requeue() returned 100
futex_cmp_requeue01.c:140: TINFO: children woken, futex0: 0, futex1: 50, spurious wakeups: 0
futex_cmp_requeue01.c:187: TPASS: futex_cmp_requeue()
futex_cmp_requeue01.c:103: TINFO: Test 4: waiters: 100, wakes: 0, requeues: 70
futex_cmp_requeue01.c:126: TINFO: futex_cmp_requeue() returned 70
futex_cmp_requeue01.c:140: TINFO: children woken, futex0: 30, futex1: 70, spurious wakeups: 0
futex_cmp_requeue01.c:187: TPASS: futex_cmp_requeue()
futex_cmp_requeue01.c:103: TINFO: Test 5: waiters: 1000, wakes: 100, requeues: 900
futex_cmp_requeue01.c:126: TINFO: futex_cmp_requeue() returned 1000
futex_cmp_requeue01.c:140: TINFO: children woken, futex0: 0, futex1: 900, spurious wakeups: 0
futex_cmp_requeue01.c:187: TPASS: futex_cmp_requeue()
futex_cmp_requeue01.c:103: TINFO: Test 6: waiters: 1000, wakes: 300, requeues: 500
futex_cmp_requeue01.c:126: TINFO: futex_cmp_requeue() returned 800
futex_cmp_requeue01.c:140: TINFO: children woken, futex0: 200, futex1: 500, spurious wakeups: 0
futex_cmp_requeue01.c:187: TPASS: futex_cmp_requeue()
Summary:
passed 7
failed 0
broken 0
skipped 0
warnings 0
> These failures are all about the limited resolution of sleeps. I'm
> guessing you're running at 100 Hz. These times are around 1-2 ticks
> past the requested deadline, or 10-20ms = 10000-20000us (plus a tiny
> slop of a few dozen microseconds). I would expect this to slow things
> down but not make them deadlock.
I'm running a GENERIC kernel, so you're probably right about 100 Hz.
Cheers,
Thomas
State-Changed-From-To: feedback->needs-pullups
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Sat, 18 Jan 2025 12:42:34 +0000
State-Changed-Why:
feedback received
needs pullup-10, inapplicable <10
and then back to the drawing board for more issues
From: Hauke Fath <hauke@Espresso.Rhein-Neckar.DE>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, riastradh@NetBSD.org,
Thomas Klausner <wiz@NetBSD.org>
Subject: Re: kern/56828 (futex calls in Linux emulation sometimes hang)
Date: Sat, 18 Jan 2025 15:21:59 +0100
On Sat, 18 Jan 2025 12:42:35 +0000 (UTC), riastradh@NetBSD.org wrote:
> Synopsis: futex calls in Linux emulation sometimes hang
Do you think this would matter for kern/58677?
I'll have to give a -current kernel a spin on Monday...
Cheerio,
Hauke
-- 
Hauke Fath <hauke@Espresso.Rhein-Neckar.DE>
Linnéweg 7
64342 Seeheim-Jugenheim
Germany
From: Taylor R Campbell <riastradh@NetBSD.org>
To: Hauke Fath <hauke@Espresso.Rhein-Neckar.DE>
Cc: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org,
gnats-admin@netbsd.org, Thomas Klausner <wiz@NetBSD.org>
Subject: Re: kern/56828 (futex calls in Linux emulation sometimes hang)
Date: Sat, 18 Jan 2025 16:01:59 +0000
> Date: Sat, 18 Jan 2025 15:21:59 +0100
> From: Hauke Fath <hauke@Espresso.Rhein-Neckar.DE>
>
> On Sat, 18 Jan 2025 12:42:35 +0000 (UTC), riastradh@NetBSD.org wrote:
> > Synopsis: futex calls in Linux emulation sometimes hang
>
> Do you think this would matter for kern/58677?
>
> I'll have to give a -current kernel a spin on Monday...
Yes, there's a good chance of that.
From: Taylor R Campbell <riastradh@NetBSD.org>
To: Hauke Fath <hauke@Espresso.Rhein-Neckar.DE>
Cc: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org,
gnats-admin@netbsd.org, Thomas Klausner <wiz@NetBSD.org>
Subject: Re: kern/56828 (futex calls in Linux emulation sometimes hang)
Date: Wed, 5 Mar 2025 14:17:05 +0000
> Date: Sat, 18 Jan 2025 15:21:59 +0100
> From: Hauke Fath <hauke@Espresso.Rhein-Neckar.DE>
>
> On Sat, 18 Jan 2025 12:42:35 +0000 (UTC), riastradh@NetBSD.org wrote:
> > Synopsis: futex calls in Linux emulation sometimes hang
>
> Do you think this would matter for kern/58677?
>
> I'll have to give a -current kernel a spin on Monday...
I've just committed two more futex fixes, one affecting the semantics
of FUTEX_WAKE_OP (which appeared in the ktrace for PR kern/58677
though not with parameters that are affected) and another affecting
the timing of FUTEX_WAIT. PRs for these specific issues:
PR kern/59129: futex(3): missing sign extension in FUTEX_WAKE_OP
https://gnats.NetBSD.org/59129
PR kern/59132: t_futex_ops:futex_wait_timeout_* sometimes fails on
early wakeup
https://gnats.NetBSD.org/59132
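For context on the first of these, here is a sketch of how the FUTEX_WAKE_OP operation word is packed on Linux and why sign extension matters. It follows the FUTEX_OP() encoding from the Linux futex(2) manual page, with the two 12-bit arguments treated as signed. Whether this is exactly what the NetBSD fix changed is an assumption based on the PR title; the helper names are hypothetical.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def futex_op(op, oparg, cmp, cmparg):
    """Pack the four fields the way the Linux FUTEX_OP() macro does."""
    return (((op & 0xF) << 28) | ((cmp & 0xF) << 24)
            | ((oparg & 0xFFF) << 12) | (cmparg & 0xFFF))


def decode_futex_op(encoded):
    """Unpack the fields; oparg and cmparg are 12-bit signed values.

    `op` is returned as the raw 4-bit field, including its top flag bit.
    """
    return {
        "op": (encoded >> 28) & 0xF,
        "cmp": (encoded >> 24) & 0xF,
        "oparg": sign_extend(encoded >> 12, 12),
        "cmparg": sign_extend(encoded, 12),
    }
```

Without the sign extension, an argument of -1 would decode as 4095, so only callers that pass negative arguments would notice the difference.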
Can you please try again with a current kernel, and let me know how
(a) sysutils/tsm8,
(b) the java pdf application, and
(c) the ltp tests
work with the changes?
If things are still broken, I guess I'll have to look closer at the
full ktrace.
From: Thomas Klausner <wiz@NetBSD.org>
To: Taylor R Campbell <riastradh@NetBSD.org>
Cc: NetBSD bugtracking <gnats-bugs@NetBSD.org>
Subject: Re: kern/56828 (futex calls in Linux emulation sometimes hang)
Date: Wed, 5 Mar 2025 17:05:39 +0100
On Wed, Mar 05, 2025 at 02:17:05PM +0000, Taylor R Campbell wrote:
> (c) the ltp tests
Thank you for the fixes!
This unbroke one of the ltp tests (futex_cmp_requeue01), but two of
them are still broken the same way (futex_wait03, futex_wait05).
I can send you the binaries if you want.
Thomas
From: Hauke Fath <h.fath@nt.tu-darmstadt.de>
To: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
Thomas Klausner <wiz@NetBSD.org>
Cc:
Subject: Re: kern/56828 (futex calls in Linux emulation sometimes hang)
Date: Wed, 5 Mar 2025 18:27:33 +0100
On 05.03.2025 15:20, Taylor R Campbell via gnats wrote:
> Can you please try again with a current kernel, and let me know how
> (a) sysutils/tsm8,
Thanks for your work. Unfortunately, I see no changes with today's kernel.
Cheerio,
Hauke
--
The ASCII Ribbon Campaign Hauke Fath
() No HTML/RTF in email Institut für Nachrichtentechnik
/\ No Word docs in email TU Darmstadt
Respect for open standards Ruf +49-6151-16-21344
>Unformatted: