NetBSD Problem Report #54515
From reinoud@13thmonkey.org Fri Aug 30 19:50:57 2019
Return-Path: <reinoud@13thmonkey.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id CCE3F7A153
for <gnats-bugs@gnats.NetBSD.org>; Fri, 30 Aug 2019 19:50:57 +0000 (UTC)
Message-Id: <20190830195053.7652CC1EA85@dropje.13thmonkey.org>
Date: Fri, 30 Aug 2019 21:50:53 +0200 (CEST)
From: reinoud@13thmonkey.org
Reply-To: reinoud@13thmonkey.org
To: gnats-bugs@NetBSD.org
Subject: Atomic update failure message in i915/intel_sprite.c
X-Send-Pr-Version: 3.95
>Number: 54515
>Category: kern
>Synopsis: Atomic update failure message in i915/intel_sprite.c
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Fri Aug 30 19:55:00 +0000 2019
>Closed-Date:
>Last-Modified: Sat Apr 17 14:44:37 +0000 2021
>Originator: Reinoud Zandijk
>Release: NetBSD 9.0_BETA
>Organization:
NetBSD
>Environment:
System: NetBSD dropje 9.0_BETA NetBSD 9.0_BETA (GENERIC) #0: Wed Aug 28 10:01:57 UTC 2019 mkrepro@mkrepro.NetBSD.org:/usr/src/sys/arch/amd64/compile/GENERIC amd64
Architecture: x86_64
Machine: amd64
>Description:
This machine can crash its i965 GPU under normal 2D usage. The result is a
60-second display/mouse freeze until the driver resets the GPU and the machine
unfreezes. In the meantime the i915 driver has dumped its memory to dmesg. At
times it gives
kern error: [drm:(/usr/src/sys/external/bsd/drm2/dist/drm/i915/intel_sprite.c:132)intel_pipe_update_start]
*ERROR* Potential atomic update failure on pipe A: -35
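The error -35 is -EAGAIN in NetBSD's errno numbering, returned Linux-style as
a negated value (on Linux itself EAGAIN is 11), so this is most likely an
interaction between the Linux and NetBSD code.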
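To make the -35 easier to interpret, here is a minimal userland sketch of the Linux convention of returning a negated errno value. The helper names linux_ret_to_errno() and describe_linux_ret() are hypothetical and exist only for this illustration; they are not part of the drm2 code. Errno numbering is system-specific: 35 is EAGAIN on NetBSD, while on Linux 35 is EDEADLK and EAGAIN is 11.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Linux-derived kernel code returns 0 on success and a negated errno
 * value on failure.  Convert such a return value to a positive errno.
 * (Hypothetical helper, for illustration only.)
 */
int
linux_ret_to_errno(int ret)
{
	assert(ret <= 0);	/* positive values are not error codes here */
	return -ret;
}

/* Print how the host system interprets a Linux-style return value. */
void
describe_linux_ret(int ret)
{
	int error = linux_ret_to_errno(ret);

	printf("%d -> errno %d (%s)%s\n", ret, error, strerror(error),
	    error == EAGAIN ? " [EAGAIN on this host]" : "");
}
```

Calling describe_linux_ret(-35) on a NetBSD host should report it as EAGAIN; on a Linux host the same number decodes to a different error, which is why the numbering has to be read in the context of the system that produced the message.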
Relevant parts from Xorg.log:
[ 50.726] (II) intel(0): [DRI2] Setup complete
[ 50.726] (II) intel(0): [DRI2] DRI driver: i965
[ 50.726] (II) intel(0): [DRI2] VDPAU driver: va_gl
Already running with
Option "AccelMethod" "UXA"
in xorg.conf
>How-To-Repeat:
Boot NetBSD on an amd64 with an i965 GPU and work in X. Using gvim or pidgin
can crash the GPU easily due to its cursor/sprite update.
>Fix:
phone@NetBSD.org suggested it might have something to do with
external/bsd/common/include/linux/err.h rev 1.3
Possible diagnostic path provided by phone@ (untested):
https://www.netbsd.org/~mrg/syscall.diff
---------
Index: sys/arch/x86/x86/syscall.c
===================================================================
RCS file: /cvsroot/src/sys/arch/x86/x86/syscall.c,v
retrieving revision 1.18
diff -p -u -r1.18 syscall.c
--- sys/arch/x86/x86/syscall.c 6 Apr 2019 11:54:21 -0000 1.18
+++ sys/arch/x86/x86/syscall.c 30 Aug 2019 19:32:00 -0000
@@ -47,6 +47,10 @@ __KERNEL_RCSID(0, "$NetBSD: syscall.c,v
#include <machine/psl.h>
#include <machine/userret.h>
+// XXXMRG
+#include <machine/db_machdep.h>
+#include <ddb/db_interface.h>
+
#include "opt_dtrace.h"
#ifndef __x86_64__
@@ -143,6 +147,12 @@ syscall(struct trapframe *frame)
X86_TF_RFLAGS(frame) &= ~PSL_C; /* carry bit */
} else {
switch (error) {
+#if 1 /* COMPAT_DRM */
+ case ELAST+1: /* linux-y ERESTARTSYS */
+ uprintf("%s: got linux ERESTARTSYS\n", __func__);
+ db_stacktrace();
+#endif
+ /* FALLTHROUGH */
case ERESTART:
/*
* The offset to adjust the PC by depends on whether we
---------
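The idea behind the diff above can be sketched outside the kernel: add an extra case for a Linux-style restart code that leaked through unconverted, log it, and then fall through to the normal restart handling. Everything prefixed sk_ or SK_ below is a hypothetical stand-in chosen for this sketch, including the numeric values; only the ELAST+1 relationship and the message text come from the diff.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical values for illustration; not the kernel's definitions. */
enum {
	SK_ELAST = 96,			/* stand-in for ELAST */
	SK_ERESTART = -3,		/* stand-in for ERESTART */
	SK_ERESTARTSYS = SK_ELAST + 1	/* leaked Linux-style code */
};

enum sk_action { SK_RESTART_SYSCALL, SK_RETURN_ERROR };

/* Number of leaked Linux-style codes seen so far. */
int sk_leaks;

/* Decide what to do with a non-zero error returned by a syscall. */
enum sk_action
sk_handle_error(int error)
{
	assert(error != 0);	/* success is handled elsewhere */

	switch (error) {
	case SK_ERESTARTSYS:
		/* Diagnostic only: report the leak, then treat as restart. */
		fprintf(stderr, "%s: got linux ERESTARTSYS\n", __func__);
		sk_leaks++;
		/* FALLTHROUGH */
	case SK_ERESTART:
		return SK_RESTART_SYSCALL;
	default:
		return SK_RETURN_ERROR;
	}
}
```

In the real patch the diagnostic also prints a stack trace with db_stacktrace(), which is what would point at the code path missing the conversion; the counter here merely stands in for that.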
>Release-Note:
>Audit-Trail:
From: matthew green <mrg@eterna.com.au>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org
Subject: re: kern/54515: Atomic update failure message in i915/intel_sprite.c
Date: Sat, 31 Aug 2019 07:00:04 +1000
> phone@NetBSD.org suggested it might have something to do with
> external/bsd/common/include/linux/err.h rev 1.3
i think you misunderstood me.
i'm saying, use the ideas present in that change as a way to
diagnose this problem, which _may_ be a similar problem (but
not likely the same problem.)
the other patch is a way to find missing conversions similar
to the change above. i'm still tempted to commit it to get
better diags in this case, but i'd like it to be less #ifdefy.
(it also has a matching change in netbsd32_syscall.c, and
needs one for i386.)
thanks.
.mrg.
From: Reinoud Zandijk <reinoud@NetBSD.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/54515: Atomic update failure message in i915/intel_sprite.c
Date: Tue, 3 Sep 2019 16:37:59 +0200
It might be related to:
kern error:
[drm:(/usr/src/sys/external/bsd/drm2/dist/drm/i915/i915_irq.c:3093)i915_hangcheck_elapsed]
*ERROR* Hangcheck timer elapsed... blitter ring idle
From: "Maya Rashish" <maya@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/54515 CVS commit: src/sys/external/bsd/drm2/dist/drm/i915
Date: Sat, 31 Oct 2020 04:05:42 +0000
Module Name: src
Committed By: maya
Date: Sat Oct 31 04:05:42 UTC 2020
Modified Files:
src/sys/external/bsd/drm2/dist/drm/i915: intel_sprite.c
Log Message:
Match linux here and wait without interrupts.
From David H. Gutteridge in PR port-amd64/55555
There's a second part to the patch, but "make our code behave the way
the upstream code does" is very welcome.
Also PR kern/54515 and possibly others.
To generate a diff of this commit:
cvs rdiff -u -r1.10 -r1.11 \
src/sys/external/bsd/drm2/dist/drm/i915/intel_sprite.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
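As a rough userland analogy for "wait without interrupts" (this is not the driver change itself, which is not quoted in this report), the sketch below contrasts a timed wait that gives up when interrupted with one that resumes until the full time has elapsed. It assumes that "interrupts" here refers to the wait being cut short by a signal. The function names wait_interruptible() and wait_nointr() are hypothetical; only POSIX nanosleep() is a real API.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

/*
 * Interruptible wait: if a signal arrives, give up early and return a
 * negated errno (-EINTR), Linux-style.  Returns 0 if the full time passed.
 */
int
wait_interruptible(long ms)
{
	struct timespec req = { ms / 1000, (ms % 1000) * 1000000L };

	assert(ms >= 0);
	if (nanosleep(&req, NULL) == -1)
		return -errno;
	return 0;
}

/*
 * Uninterruptible wait: if a signal cuts the sleep short, continue
 * sleeping for the remaining time instead of reporting an error.
 */
int
wait_nointr(long ms)
{
	struct timespec req = { ms / 1000, (ms % 1000) * 1000000L };
	struct timespec rem;

	assert(ms >= 0);
	while (nanosleep(&req, &rem) == -1) {
		if (errno != EINTR)
			return -errno;	/* a real failure, not a signal */
		req = rem;		/* resume with the time left */
	}
	return 0;
}
```

The point of the second form is that the caller never sees a spurious early return, so it cannot mistake an interrupted wait for a failure of the thing it was waiting on.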
From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/54515 CVS commit: [netbsd-9] src/sys/external/bsd/drm2/dist/drm/i915
Date: Sun, 29 Nov 2020 11:34:04 +0000
Module Name: src
Committed By: martin
Date: Sun Nov 29 11:34:04 UTC 2020
Modified Files:
src/sys/external/bsd/drm2/dist/drm/i915 [netbsd-9]: intel_sprite.c
Log Message:
Pull up following revision(s) (requested by maya in ticket #1136):
sys/external/bsd/drm2/dist/drm/i915/intel_sprite.c: revision 1.11
Match linux here and wait without interrupts.
From David H. Gutteridge in PR port-amd64/55555
There's a second part to the patch, but "make our code behave the way
the upstream code does" is very welcome.
Also PR kern/54515 and possibly others.
To generate a diff of this commit:
cvs rdiff -u -r1.9 -r1.9.4.1 \
src/sys/external/bsd/drm2/dist/drm/i915/intel_sprite.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: open->feedback
State-Changed-By: maya@NetBSD.org
State-Changed-When: Sat, 17 Apr 2021 08:02:10 +0000
State-Changed-Why:
Is this still an issue?
From: Reinoud Zandijk <reinoud@NetBSD.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/54515 (Atomic update failure message in i915/intel_sprite.c)
Date: Sat, 17 Apr 2021 13:22:55 +0200
On Sat, Apr 17, 2021 at 08:02:11AM +0000, maya@NetBSD.org wrote:
> Synopsis: Atomic update failure message in i915/intel_sprite.c
>
> State-Changed-From-To: open->feedback
> State-Changed-By: maya@NetBSD.org
> State-Changed-When: Sat, 17 Apr 2021 08:02:10 +0000
> State-Changed-Why:
> Is this still an issue?
It is still showing up a lot in NetBSD 9.99.81 (GENERIC) #0: Sat Mar 27
14:24:25 CET 2021.
Running
zcat /var/log/messages.* | grep atomic | wc
gives 297 lines and I haven't been using the desktop that often either, so yeah
it's still there.
State-Changed-From-To: feedback->open
State-Changed-By: maya@NetBSD.org
State-Changed-When: Sat, 17 Apr 2021 14:44:37 +0000
State-Changed-Why:
>Unformatted: