NetBSD Problem Report #49710

From www@NetBSD.org  Mon Mar  2 00:20:42 2015
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id AA5F8A650D
	for <gnats-bugs@gnats.NetBSD.org>; Mon,  2 Mar 2015 00:20:42 +0000 (UTC)
Message-Id: <20150302002040.D5B14A6567@mollari.NetBSD.org>
Date: Mon,  2 Mar 2015 00:20:40 +0000 (UTC)
From: jdbaker@mylinuxisp.com
Reply-To: jdbaker@consolidated.net
To: gnats-bugs@NetBSD.org
Subject: i386 radeondrmkms panic when starting Xorg
X-Send-Pr-Version: www-1.0

>Number:         49710
>Notify-List:    jdbaker@consolidated.net
>Category:       kern
>Synopsis:       i386 radeondrmkms panic when starting Xorg
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Mar 02 00:25:00 +0000 2015
>Last-Modified:  Tue Jan 29 06:31:47 +0000 2019
>Originator:     John D. Baker
>Release:        NetBSD/i386-7.99.5 (and 7.0_BETA)
>Organization:
>Environment:
NetBSD slab 7.99.5 NetBSD 7.99.5 (SLAB_KMS) #7: Sun Mar  1 14:17:39 CST 2015  sysop@skuld.technoskunk.fur:/d0/build/current/obj/i386/sys/arch/i386/compile/SLAB_KMS i386

>Description:
i386 radeondrmkms kernel panics when starting Xorg.  dmesg excerpt and
panic messages from serial console:

NetBSD 7.99.5 (SLAB_KMS) #7: Sun Mar  1 14:17:39 CST 2015
        sysop@skuld.technoskunk.fur:/d0/build/current/obj/i386/sys/arch/i386/compile/SLAB_KMS
[...]
com0 at acpi0 (UART, PNP0501): io 0x3f8-0x3ff irq 4   
com: ns16550a, working fifo                        
com0: console              
[...]
acpivga0 at acpi0 (VID): ACPI Display Adapter
acpiout0 at acpivga0 (LCD0, 0x0110): ACPI Display Output Device
acpiout1 at acpivga0 (CRT0, 0x0100): ACPI Display Output Device
acpiout2 at acpivga0 (TV0, 0x0200): ACPI Display Output Device 
acpiout3 at acpivga0 (DVI0, 0x0210): ACPI Display Output Device
[...]
pchb0 at pci0 dev 0 function 0: Intel 82845 Host (rev. 0x04)
agp0 at pchb0: aperture at 0xe0000000, size 0x4000000       
ppb0 at pci0 dev 1 function 0: Intel 82845 AGP (rev. 0x04)
pci1 at ppb0 bus 1                                        
radeon0 at pci1 dev 0 function 0: ATI Technologies FireGL Mobility 7800 M7 LX (rev. 0x00)
[...]
drm: initializing kernel modesetting (RV200 0x1002:0x4C58 0x1014:0x0518).
drm: register mmio base: 0xd0100000                                      
drm: register mmio size: 65536     
radeon0: info: GTT: 64M 0xE0000000 - 0xE3FFFFFF
radeon0: info: VRAM: 128M 0x00000000E8000000 - 0x00000000EFFFFFFF (64M used)
drm: Detected VRAM RAM=80M, BAR=128M                                        
drm: RAM width 128bits DDR          
Zone  kernel: Available graphics memory: 801196 kiB
drm: radeon: 64M of VRAM memory ready              
drm: radeon: 64M of GTT memory ready.
radeon0: info: WB disabled           
radeon0: info: fence driver on ring 0 use gpu addr 0x00000000e0000000 and cpu addr 0x0xdb4f0000
drm: Supports vblank timestamp caching Rev 2 (21.10.2013).
drm: Driver supports precise vblank timestamp query.      
radeon0: interrupting at irq 9 (radeon)             
drm: radeon: irq initialized.          
drm: Loading R100 Microcode  
drm: radeon: ring at 0x00000000E0001000
drm: ring test succeeded in 0 usecs    
drm: ib test succeeded in 0 usecs  
drm: Panel ID String: 1600x1200               
drm: Panel Size 1600x1200                     
drm: No TV DAC info found in BIOS
drm: Radeon Display Connectors   
drm: Connector 0:             
drm:   VGA-1     
drm:   DDC: 0x60 0x60 0x60 0x60 0x60 0x60 0x60 0x60
drm:   Encoders:                                   
drm:     CRT1: INTERNAL_DAC1
drm: Connector 1:           
drm:   DVI-D-1   
drm:   HPD1   
drm:   DDC: 0x64 0x64 0x64 0x64 0x64 0x64 0x64 0x64
drm:   Encoders:                                   
drm:     DFP1: INTERNAL_TMDS1
drm: Connector 2:            
drm:   LVDS-1    
drm:   Encoders:
drm:     LCD1: INTERNAL_LVDS
drm: Connector 3:           
drm:   SVIDEO-1  
drm:   Encoders:
drm:     TV1: INTERNAL_DAC2
radeondrmkmsfb0 at radeon0 
radeon0: info: registered panic notifier
wsdisplay0 at radeondrmkmsfb0 kbdmux 1  
[...]
[startx]
panic: kernel diagnostic assertion "ttm->caching_state == tt_cached" failed: file "/x/current/src/sys/external/bsd/drm2/dist/drm/ttm/ttm_tt.c", line 423 
fatal breakpoint trap in supervisor mode                                 
trap type 1 code 0 eip c02516b4 cs 8 eflags 3246 cr2 ba76f000 ilevel 0 esp dbeacd08
curlwp 0xc3802800 pid 235 lid 1 lowest kstack 0xdbeaa2c0
Stopped in pid 235.1 (Xorg) at  netbsd:breakpoint+0x4:  popl    %ebp
db{0}> bt                                                           
breakpoint(c097a3fe,c0b72e00,c095f884,dbeacd24,c4040308,c40402a8,7000,dbeacd18,c0894093,c095f884) at netbsd:breakpoint+0x4
vpanic(c095f884,dbeacd24,dbeacd3c,c07530b8,c095f884,c095fa35,c09d390c,c09d3890,1a7,c4040308) at netbsd:vpanic+0x127
kern_assert(c095f884,c095fa35,c09d390c,c09d3890,1a7,c4040308,c40402a8,dbeacd68,c0751d40,c4040308) at netbsd:kern_assert+0x23
ttm_tt_swapout(c4040308,0,c4040308,c48289fc,2,dbeacd78,c4040308,c48289fc,2,dbeacd78) at netbsd:ttm_tt_swapout+0x148
ttm_bus_dma_unpopulate(c4040308,c48289fc,dbeacd8c,c074cb31,c4040308,0,c4828a40,dbeacdd0,c074de6c,0) at netbsd:ttm_bus_dma_unpopulate+0x40
ttm_tt_destroy(c4040308,0,c4828a40,dbeacdd0,c074de6c,0,0,0,1,c4828a40) at netbsd:ttm_tt_destroy+0x49
ttm_bo_cleanup_memtype_use(0,0,0,1,c4828a40,1,0,c3aecee5,c325cb0c,c0b1f980) at netbsd:ttm_bo_cleanup_memtype_use+0x41
ttm_bo_release(c4828a40,0,0,1,c4828b58,dbeacdfc,c061ace0,dbeacdf8,c48289cc,dbeace24) at netbsd:ttm_bo_release+0x28c
radeon_bo_unref(dbeacdf8,c48289cc,dbeace24,c0291df2,c4828b58,ffffffff,c029cab5,c4828ba0,c4756488,c3c58518) at netbsd:radeon_bo_unref+0x40
radeon_gem_object_free(c4828b58,ffffffff,c029cab5,c4828ba0,c4756488,c3c58518,c3c58544,c4828b58,dbeace44,c0292217) at netbsd:radeon_gem_object_free+0x20
drm_gem_object_handle_unreference_unlocked(c4828b58,c3c58518,c374f70c,c08e8614,c3c58518,c374f70c,dbeace70,c028b1dd,c3c58518,7) at netbsd:drm_gem_object_handle_unreference_unlocked+0x92
drm_gem_handle_delete(c3c58518,7,c3c58518,dbeace74,c06f3b3c,1,dbeacf68,c47ff500,0,dbeacf3c) at netbsd:drm_gem_handle_delete+0x87
drm_ioctl(c47ff500,80086409,dbeaceb0,e020806f,c4781168,c1bc200c,1,0,8,0) at netbsd:drm_ioctl+0x11d
sys_ioctl(c3802800,dbeacf68,dbeacf60,c36101a8,0,c0b126c8,dbeacf68,0,0,b) at netbsd:sys_ioctl+0x1ae
syscall() at netbsd:syscall+0x82
--- syscall (number 54) ---     
bb77af77:                  
db{0}> sh reg
ds          c06f0010    extent_insert_and_optimize.isra.0+0x70
es          dbea0010                                          
fs          30      
gs          c0950010    kmem_cache_sizes+0xd0
edi         dbeacd24                         
esi         c095f884    ostype+0x9e7
ebp         dbeacce4                
ebx         104     
edx         1  
ecx         0
eax         1
eip         c02516b4    breakpoint+0x4
cs          8                         
eflags      3246
esp         dbeacce4
ss          10      
netbsd:breakpoint+0x4:  popl    %ebp
db{0}>
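
A ddb "bt" pasted from an 80-column serial console may arrive with frames
wrapped in the middle of an argument or symbol name.  The following is a
minimal sketch, not part of the original report, that rejoins such lines and
lists the function names innermost first.  The helper name ddb_frames and the
assumption that every frame ends in "at netbsd:<symbol>+0x<offset>" are mine;
a wrap that falls inside the hex offset is not handled exactly.

```python
import re

# A complete frame ends with ") at netbsd:<symbol>+0x<hex offset>".
FRAME_END = re.compile(r"\) at netbsd:([\w.]+)\+0x[0-9a-f]+$")


def ddb_frames(text):
    """Return the function names, innermost first, from a ddb 'bt' paste."""
    frames = []
    pending = ""
    for raw in text.splitlines():
        line = raw.strip()
        # Separator lines such as "--- interrupt ---" carry no frame.
        if not line or line.startswith("---"):
            pending = ""
            continue
        # Glue continuation lines together until a full frame is seen.
        pending += line
        match = FRAME_END.search(pending)
        if match:
            frames.append(match.group(1))
            pending = ""
    return frames


# Three frames copied from the panic backtrace, wrapped as on the console.
sample = """\
ttm_tt_swapout(c4040308,0,c4040308,c48289fc,2,dbeacd78,c4040308,c48289fc,2,dbeac
d78) at netbsd:ttm_tt_swapout+0x148
ttm_bus_dma_unpopulate(c4040308,c48289fc,dbeacd8c,c074cb31,c4040308,0,c4828a40,d
beacdd0,c074de6c,0) at netbsd:ttm_bus_dma_unpopulate+0x40
ttm_tt_destroy(c4040308,0,c4828a40,dbeacdd0,c074de6c,0,0,0,1,c4828a40) at netbsd
:ttm_tt_destroy+0x49
"""

for name in ddb_frames(sample):
    print(name)
```

Reading the list top to bottom gives the path from the failed assertion back
toward the ioctl that triggered it.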

>How-To-Repeat:
With no "xorg.conf" or an "xorg.conf" without the "NoAccel" option,
attempt to start Xorg on an i386 system that uses radeondrmkms.
>Fix:
Workaround:  Create minimal "xorg.conf" with "NoAccel" option:

Section "Device"
        Option          "NoAccel"       "True"
        Identifier      "Card0"
        Driver          "radeon"
EndSection

This prevents the panic, but without acceleration multimedia applications
are essentially unusable.

>Release-Note:

>Audit-Trail:
From: "John D. Baker" <jdbaker@mylinuxisp.com>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/49710: i386 radeondrmkms panic when starting Xorg
Date: Tue, 3 Mar 2015 21:50:56 -0600 (CST)

 With the latest round of fixes, i386 radeondrmkms no longer panics when
 using acceleration in Xorg (no "xorg.conf" file at all).

 It doesn't produce a usable display, either, however.  Upon starting Xorg,
 the X cursor is displayed, but the framebuffer console cursor is still
 present in the upper left corner of the screen.  The X cursor responds
 to mouse motion as expected.  I couldn't test button presses.

 One can launch X clients and direct them to the display, but their
 windows never appear.

 Below are "dmesg" excerpts and a sample session.  The X server cannot
 be killed with SIGTERM.  Sending SIGKILL makes the system unresponsive.
 Since the console was on a serial port, I could get a backtrace, etc. and
 reboot the machine.


 NetBSD 7.99.5 (SLAB) #1: Tue Mar  3 18:26:53 CST 2015
 	sysop@skuld.technoskunk.fur:/d0/build/current/obj/i386/sys/arch/i386/compile/SLAB
 [...]
 acpivga0 at acpi0 (VID): ACPI Display Adapter
 acpiout0 at acpivga0 (LCD0, 0x0110): ACPI Display Output Device
 acpiout1 at acpivga0 (CRT0, 0x0100): ACPI Display Output Device
 acpiout2 at acpivga0 (TV0, 0x0200): ACPI Display Output Device 
 acpiout3 at acpivga0 (DVI0, 0x0210): ACPI Display Output Device
 [...]
 pchb0 at pci0 dev 0 function 0: Intel 82845 Host (rev. 0x04)
 agp0 at pchb0: aperture at 0xe0000000, size 0x4000000       
 ppb0 at pci0 dev 1 function 0: Intel 82845 AGP (rev. 0x04)
 pci1 at ppb0 bus 1                                        
 radeon0 at pci1 dev 0 function 0: ATI Technologies FireGL Mobility 7800 M7 LX (rev. 0x00)
 [...]
 drm: initializing kernel modesetting (RV200 0x1002:0x4C58 0x1014:0x0518).
 drm: register mmio base: 0xd0100000                                      
 drm: register mmio size: 65536     
 radeon0: info: GTT: 64M 0xE0000000 - 0xE3FFFFFF
 radeon0: info: VRAM: 128M 0x00000000E8000000 - 0x00000000EFFFFFFF (64M used)
 drm: Detected VRAM RAM=80M, BAR=128M                                        
 drm: RAM width 128bits DDR          
 Zone  kernel: Available graphics memory: 801208 kiB
 drm: radeon: 64M of VRAM memory ready              
 drm: radeon: 64M of GTT memory ready.
 radeon0: info: WB disabled           
 radeon0: info: fence driver on ring 0 use gpu addr 0x00000000e0000000 and cpu addr 0x0xdb4ed000
 drm: Supports vblank timestamp caching Rev 2 (21.10.2013).
 drm: Driver supports precise vblank timestamp query.      
 radeon0: interrupting at irq 9 (radeon)             
 drm: radeon: irq initialized.          
 drm: Loading R100 Microcode  
 drm: radeon: ring at 0x00000000E0001000
 drm: ring test succeeded in 0 usecs    
 drm: ib test succeeded in 0 usecs  
 drm: Panel ID String: 1600x1200               
 drm: Panel Size 1600x1200                     
 drm: No TV DAC info found in BIOS
 drm: Radeon Display Connectors   
 drm: Connector 0:             
 drm:   VGA-1     
 drm:   DDC: 0x60 0x60 0x60 0x60 0x60 0x60 0x60 0x60
 drm:   Encoders:                                   
 drm:     CRT1: INTERNAL_DAC1
 drm: Connector 1:           
 drm:   DVI-D-1   
 drm:   HPD1   
 drm:   DDC: 0x64 0x64 0x64 0x64 0x64 0x64 0x64 0x64
 drm:   Encoders:                                   
 drm:     DFP1: INTERNAL_TMDS1
 drm: Connector 2:            
 drm:   LVDS-1    
 drm:   Encoders:
 drm:     LCD1: INTERNAL_LVDS
 drm: Connector 3:           
 drm:   SVIDEO-1  
 drm:   Encoders:
 drm:     TV1: INTERNAL_DAC2
 radeondrmkmsfb0 at radeon0 
 radeon0: info: registered panic notifier
 wsdisplay0 at radeondrmkmsfb0 kbdmux 1  
 [...]
 $ X -retro & sleep 2 ; xterm -display :0 &
 [1] 73

 X.Org X Server 1.10.6
 Release Date: 2011-07-08
 X Protocol Version 11, Revision 0
 Build Operating System: NetBSD/i386  - 
 Current Operating System: NetBSD slab 7.99.5 NetBSD 7.99.5 (SLAB) #1: Tue Mar  3 18:26:53 CST 2015  sysop@skuld.technoskunk.fur:/d0/build/current/obj/i386/sys/arch/i386/compile/SLAB i386
 Build Date: 01 August 2011  01:01:00AM

 Current version of pixman: 0.32.6
         Before reporting problems, check http://wiki.X.Org
         to make sure that you have the latest version.
 Markers: (--) probed, (**) from config file, (==) default setting,
         (++) from command line, (!!) notice, (II) informational,
         (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
 (==) Log file: "/var/log/Xorg.0.log", Time: Tue Mar  3 21:12:41 2015
 (==) Using default built-in configuration (21 lines)
 (II) [KMS] Kernel modesetting enabled.
 [2] 82
 The XKEYBOARD keymap compiler (xkbcomp) reports:
 > Error:            Couldn't lookup keysym
 >                   Symbol interpretation ignored
 > Error:            Couldn't lookup keysym
 >                   Symbol interpretation ignored
 Errors from xkbcomp are not fatal to the X server

 [mi] EQ overflowing. The server is probably stuck in an infinite loop.

 1027 [sysop@slab:~]$ ps ax
  PID TTY   STAT    TIME COMMAND
    0 ?     DKl  0:02.44 [system]
    1 ?     Is   0:00.02 init 
   73 ?     S    0:00.86 X -retro (Xorg)
  412 ?     Is   0:00.00 /sbin/dhcpcd -n -qM fxp0 
  657 ?     Is   0:00.08 /usr/sbin/syslogd -P /var/run/syslogd.sockets -s 
  708 ?     Ss   0:00.53 /usr/sbin/rpcbind -l 
  856 ?     Ss   0:00.13 /usr/sbin/amd -l syslog -x error,noinfo,nostats -p -a /
 1181 ?     Ss   0:00.12 /usr/sbin/ntpd -u ntpd:ntpd -i /var/chroot/ntpd -p /var
 1381 ?     Is   0:00.01 /usr/sbin/lpd -s 
 1423 ?     Is   0:00.01 /usr/sbin/sshd 
 1486 ?     Is   0:00.00 /usr/sbin/powerd 
 1497 ?     S    0:00.04 pickup -l -t unix -u 
 1754 ?     I    0:00.04 qmgr -l -t unix -u 
 2075 ?     Ss   0:00.03 /usr/libexec/postfix/master -w 
 2156 ?     Ss   0:00.01 /usr/sbin/cron 
 2229 ?     Is   0:00.00 /usr/sbin/inetd -l 
  104 pts/0 Is+  0:00.02 ksh 
   42 tty00 S    0:00.03 -ksh 
   82 tty00 I    0:00.10 xterm 
 2477 tty00 Is   0:00.15 login 
 2694 tty00 O+   0:00.01 ps -ax 
 2454 ttyE0 Is+  0:00.01 /usr/libexec/getty Pc ttyE0 
 2247 ttyE1 Is+  0:00.01 /usr/libexec/getty Pc ttyE1 
 2576 ttyE2 Is+  0:00.01 /usr/libexec/getty Pc ttyE2 
 2302 ttyE3 Is+  0:00.01 /usr/libexec/getty Pc ttyE3 

 1028 [sysop@slab:~]$ top | cat

 load averages:  0.41,  0.24,  0.10;               up 0+00:03:25        21:14:30
 26 processes: 1 runnable, 24 sleeping, 1 on CPU
 CPU states:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
 Memory: 76M Act, 19M Wired, 19M Exec, 46M File, 1875M Free
 Swap: 4096M Total, 4096M Free

   PID USERNAME PRI NICE   SIZE   RES STATE      TIME   WCPU    CPU COMMAND
     0 root      43    0     0K   24M RUN        0:02  0.00%  0.00% [system]
  1523 sysop     43    0  3744K 1488K CPU        0:00  0.00%  0.00% top
    73 sysop     85    0    31M   11M radfence   0:00  0.00%  0.00% Xorg
  1181 ntpd      85    0  6676K 8728K pause      0:00  0.00%  0.00% ntpd
   856 root      85    0  6292K 8344K select     0:00  0.00%  0.00% amd
    82 sysop     85    0  7496K 4428K select     0:00  0.00%  0.00% xterm
  1754 postfix   85    0  8868K 3348K kqueue     0:00  0.00%  0.00% qmgr
  1497 postfix   85    0  8868K 3336K kqueue     0:00  0.00%  0.00% pickup
  2477 root      85    0  7440K 3080K wait       0:00  0.00%  0.00% login
  2075 root      85    0  8868K 2196K kqueue     0:00  0.00%  0.00% master
   657 root      85    0  6200K 1820K kqueue     0:00  0.00%  0.00% syslogd
   104 sysop     85    0  3368K 1364K ttyraw     0:00  0.00%  0.00% ksh
    42 sysop     85    0  3364K 1348K pause      0:00  0.00%  0.00% ksh
  2156 root      85    0  3464K 1280K nanoslp    0:00  0.00%  0.00% cron
  2302 root      85    0  3520K 1212K ttyraw     0:00  0.00%  0.00% getty
  2576 root      85    0  3520K 1212K ttyraw     0:00  0.00%  0.00% getty
  2247 root      85    0  3520K 1212K ttyraw     0:00  0.00%  0.00% getty
  2454 root      85    0  3520K 1208K ttyraw     0:00  0.00%  0.00% getty
   708 root      85    0  3496K 1172K select     0:00  0.00%  0.00% rpcbind
   412 root      85    0  3376K 1144K select     0:00  0.00%  0.00% dhcpcd
  1381 root      85    0  3404K 1128K select     0:00  0.00%  0.00% lpd
  1486 root      85    0  3532K  948K kqueue     0:00  0.00%  0.00% powerd
  2359 sysop     85    0  3364K  856K pipe_rd    0:00  0.00%  0.00% cat
  1423 root      84    0  8932K 2316K select     0:00  0.00%  0.00% sshd
  2229 root      84    0  3556K  972K kqueue     0:00  0.00%  0.00% inetd
     1 root      83    0  3504K 1224K wait       0:00  0.00%  0.00% init

 $ pkill -KILL X

 [unresponsive]
 [BREAK sent]
 fatal breakpoint trap in supervisor mode
 trap type 1 code 0 eip c02516b4 cs 8 eflags 202 cr2 bbae74f8 ilevel 8 esp da780f6c
 curlwp 0xc38e0d40 pid 73 lid 1 lowest kstack 0xdb4d72c0
 Stopped in pid 73.1 (Xorg) at   netbsd:breakpoint+0x4:  popl    %ebp
 db{0}> bt                                                           
 breakpoint(c0aebac0,3f8,5,c0b3e400,c0b3e400,c046b3ee,c34b0608,c34b0580,c34de164,c34df000) at netbsd:breakpoint+0x4
 comintr(c34b04c8,db4d9c58,0,0,0,0,0,0,0,0) at netbsd:comintr+0x5d5
 --- switch to interrupt stack ---
 Xintr_legacy4() at netbsd:Xintr_legacy4+0xc3
 --- interrupt ---
 mutex_spin_enter(c2fddb54,c2fddb50,32,bffb,0,0,c0000000,1003fff,c2fdd8c0,db4d9d70) at netbsd:mutex_spin_enter+0x31
 radeon_fence_wait_seq(c2fdd870,c2fdd000,5,0,5,0,0,0,0,0) at netbsd:radeon_fence_wait_seq+0x125
 radeon_fence_wait(c3b1fabc,1,1,c96210,1,c2fdd701,c3b1fabc,c3836184,0,c383610c) at netbsd:radeon_fence_wait+0x6b
 ttm_bo_wait(c383613c,1,1,0,c383613c,c2fdd000,c3836298,c383610c,db4d9e44,c061b8cb) at netbsd:ttm_bo_wait+0x8a
 radeon_bo_wait(c383610c,0,0,0,0,0,0,c0937bb0,c3c961f8,c374c70c) at netbsd:radeon_bo_wait+0xac
 radeon_gem_wait_idle_ioctl(c374c70c,db4d9eb0,c3c961f8,0,0,0,db4d9f68,c3f9e400,0,db4d9f3c) at netbsd:radeon_gem_wait_idle_ioctl+0x4b
 drm_ioctl(c3f9e400,80086464,db4d9eb0,0,0,0,0,0,8,0) at netbsd:drm_ioctl+0x11d
 sys_ioctl(c38e0d40,db4d9f68,db4d9f60,7ce5a000,c360e960,c0b126c8,db4d9f68,0,0,b) at netbsd:sys_ioctl+0x1ae
 syscall() at netbsd:syscall+0x82
 --- syscall (number 54) ---     
 bb771f77:                  
 db{0}> sh reg
 ds          c2fb0010
 es          c0440010    lwp_startup+0x1b0
 fs          da780030                     
 gs          c0ae0010    loc+0x390
 edi         800                  
 esi         100
 ebp         da780f28
 ebx         c34b04c8
 edx         3f9     
 ecx         8c6
 eax         1  
 eip         c02516b4    breakpoint+0x4
 cs          8                         
 eflags      202
 esp         da780f28
 ss          10      
 netbsd:breakpoint+0x4:  popl    %ebp
 db{0}>

 -- 
 |/"\ John D. Baker, KN5UKS               NetBSD     Darwin/MacOS X
 |\ / jdbaker[snail]mylinuxisp[flyspeck]com    OpenBSD            FreeBSD
 | X  No HTML/proprietary data in email.   BSD just sits there and works!
 |/ \ GPGkeyID:  D703 4A7E 479F 63F8 D3F4  BD99 9572 8F23 E4AD 1645

From: mlelstv@serpens.de (Michael van Elst)
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/49710: i386 radeondrmkms panic when starting Xorg
Date: Wed, 4 Mar 2015 06:37:54 +0000 (UTC)

 jdbaker@mylinuxisp.com ("John D. Baker") writes:

 > One can launch X clients and direct them to the display, but their
 > windows never appear.

 What happens when you restart X at that point (e.g. kill it with
 ctrl-alt-backspace if that is enabled and let xdm restart it) ?


 -- 
 -- 
                                 Michael van Elst
 Internet: mlelstv@serpens.de
                                 "A potential Snark may lurk in every tree."

From: "John D. Baker" <jdbaker@mylinuxisp.com>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/49710: i386 radeondrmkms panic when starting Xorg
Date: Wed, 4 Mar 2015 02:56:22 -0600 (CST)

 On Wed, 4 Mar 2015, Michael van Elst wrote:

 >  What happens when you restart X at that point (e.g. kill it with
 >  ctrl-alt-backspace if that is enabled and let xdm restart it) ?

 I normally don't start xdm on this machine when testing -current (at
 least not automatically with a known buggy situation), but I gave it a
 try.

 Although the output of 'ps -ax' indicated that all was running, the
 operations performed by "Xsetup_0" (set root window color and launch
 'xconsole') were not manifested on the display.  (The framebuffer console
 cursor did disappear.)  The xdm greeter widget never appeared.

 The keyboard seems to be unresponsive.  One cannot switch to another
 virtual terminal when acceleration is enabled.  Attempting to get 'xdm'
 to restart the server by pressing <ctrl-C> three times did not appear
 to do anything.  "TerminateServer" via "ctrl-alt-backspace" does not
 function at all--whether with or without acceleration--even with

 Section "ServerFlags"
   Option "DontZap" "False"
 EndSection

 defined in "xorg.conf"

 When acceleration is disabled, 'xdm' works and one may freely switch
 virtual terminals with Ctrl-Alt-Fn.  Ctrl-Alt-Backspace does not work.

 -- 
 |/"\ John D. Baker, KN5UKS               NetBSD     Darwin/MacOS X
 |\ / jdbaker[snail]mylinuxisp[flyspeck]com    OpenBSD            FreeBSD
 | X  No HTML/proprietary data in email.   BSD just sits there and works!
 |/ \ GPGkeyID:  D703 4A7E 479F 63F8 D3F4  BD99 9572 8F23 E4AD 1645

From: matthew green <mrg@eterna.com.au>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
    netbsd-bugs@netbsd.org
Subject: re: kern/49710: i386 radeondrmkms panic when starting Xorg
Date: Sat, 07 Mar 2015 18:35:55 +1100

 > trap type 1 code 0 eip c02516b4 cs 8 eflags 202 cr2 bbae74f8 ilevel 8 esp da780f6c
 > curlwp 0xc38e0d40 pid 73 lid 1 lowest kstack 0xdb4d72c0
 > Stopped in pid 73.1 (Xorg) at   netbsd:breakpoint+0x4:  popl    %ebp
 > db{0}> bt                                                           
 > breakpoint(c0aebac0,3f8,5,c0b3e400,c0b3e400,c046b3ee,c34b0608,c34b0580,c34de164,
 > c34df000) at netbsd:breakpoint+0x4                                             
 > comintr(c34b04c8,db4d9c58,0,0,0,0,0,0,0,0) at netbsd:comintr+0x5d5
 > --- switch to interrupt stack ---                                 
 > Xintr_legacy4() at netbsd:Xintr_legacy4+0xc3
 > --- interrupt ---                           
 > mutex_spin_enter(c2fddb54,c2fddb50,32,bffb,0,0,c0000000,1003fff,c2fdd8c0,db4d9d7
                    ^^^^^^^^
 > 0) at netbsd:mutex_spin_enter+0x31                                             
 > radeon_fence_wait_seq(c2fdd870,c2fdd000,5,0,5,0,0,0,0,0) at netbsd:radeon_fence_
 > wait_seq+0x125                                                                 
 > radeon_fence_wait(c3b1fabc,1,1,c96210,1,c2fdd701,c3b1fabc,c3836184,0,c383610c) a

 can you reproduce this?  if so, please also run "show lock c2fddb54"
 from ddb -- where the first argument to mutex_spin_enter is the
 argument to show lock.

 this seems like a deadlock, and the above will show info about the
 lock being waited on.
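
The instruction above (take the first argument of the mutex_spin_enter frame
and pass it to "show lock") can be mechanized when working from a saved
console log.  This is only an illustrative sketch; the helper name
show_lock_command is hypothetical, and it assumes the first argument is not
split by console line wrapping.

```python
import re


def show_lock_command(bt_text, func="mutex_spin_enter"):
    """Build the ddb 'show lock' command for the first argument of func."""
    # Match "func(" followed by the first hex argument of that frame.
    match = re.search(re.escape(func) + r"\(([0-9a-f]+)[,)]", bt_text)
    if match is None:
        return None  # the frame is not present in this backtrace
    return "show lock " + match.group(1)


# Frame copied from the backtrace quoted above, wrapped as on the console.
frame = (
    "mutex_spin_enter(c2fddb54,c2fddb50,32,bffb,0,0,c0000000,1003fff,c2fdd8c0,db4d9d7\n"
    "0) at netbsd:mutex_spin_enter+0x31\n"
)

print(show_lock_command(frame))
```

If the frame is absent, as in a backtrace that stops in a different
function, the helper returns None rather than guessing an address.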


 .mrg.

From: "John D. Baker" <jdbaker@mylinuxisp.com>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: re: kern/49710: i386 radeondrmkms panic when starting Xorg
Date: Mon, 9 Mar 2015 20:20:36 -0500 (CDT)

 A little more detail on what happens without "NoAccel":

 On serial console, starting X server with:

 $ X -retro &

 The retro stipple pattern is not displayed.  The framebuffer console
 cursor is still visible in the upper left corner of the display.  The
 X mouse cursor is visible and tracks mouse movements correctly.  One
 can move the X cursor over the framebuffer cursor.  The X cursor appears
 above (Z axis) the framebuffer cursor and when moved off, the framebuffer
 cursor is unchanged.

 The X server sits idle in "select" until the first client connects to
 it.  Thereafter, it alternates between "RUN" and "radfence" when observed
 with 'top'.  The client window ('xterm' in my tests) never appears.
 Moving the cursor to the area where it should be and typing 'exit'
 <return> has no effect (so it's not that the window is simply invisible).

 The 'xterm' instance spawned a new shell process.  Sending a SIGHUP to that
 shell process caused it and the 'xterm' process to exit as expected.  As it
 was the last client, the Xserver should have reset itself, but it instead
 continues to alternate between RUN state and waiting in "radfence".

 Attaching 'ktruss' to the Xorg process reveals:

 # ktruss -i -p 114
    114      1 Xorg     SIGALRM caught handler=0x8190040 mask=0x0 code=0x0
    114      1 Xorg     setcontext(0xbfbfe3ac)      JUSTRETURN
    114      1 Xorg     emul(netbsd)
    114      1 Xorg     ioctl(0xb, _IOR('d',0x64,0x8), 0xbfbfe6fc) Err#4 EINTR
        "\^C\0\0\0\0\0\0\0"
    114      1 Xorg     SIGALRM caught handler=0x8190040 mask=0x0 code=0x0
    114      1 Xorg     setcontext(0xbfbfe3ac)      JUSTRETURN
    114      1 Xorg     ioctl(0xb, _IOR('d',0x64,0x8), 0xbfbfe6fc) Err#4 EINTR
        "\^C\0\0\0\0\0\0\0"
    114      1 Xorg     SIGALRM caught handler=0x8190040 mask=0x0 code=0x0
    114      1 Xorg     setcontext(0xbfbfe3ac)      JUSTRETURN
    114      1 Xorg     ioctl(0xb, _IOR('d',0x64,0x8), 0xbfbfe6fc) Err#4 EINTR
        "\^C\0\0\0\0\0\0\0"
 [...]
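
The ioctl command word that appears in the drm_ioctl frames of the
backtraces in this report (0x80086464 for the hang, 0x80086409 for the
original panic) can be split into its fields.  The sketch below uses the BSD
ioctl bit layout as I recall it from <sys/ioccom.h>; treat the constants as
an assumption and check them against your tree.  It reports the raw
direction bits rather than a macro name.  Number 0x64 would be
DRM_COMMAND_BASE (0x40) plus 0x24, which I believe is the radeon
GEM_WAIT_IDLE request, and 0x09 should be the generic GEM close request;
both readings are consistent with the radeon_gem_wait_idle_ioctl and
drm_gem_handle_delete frames, but they are my interpretation, not something
stated in the report.

```python
# Bit layout as recalled from NetBSD's <sys/ioccom.h> (assumption).
IOC_OUT = 0x40000000    # kernel copies the argument out to userland
IOC_IN = 0x80000000     # kernel copies the argument in from userland
IOCPARM_MASK = 0x1FFF   # width of the argument-size field


def decode_ioctl(cmd):
    """Split a BSD ioctl command word into its fields."""
    return {
        "copy_in": bool(cmd & IOC_IN),
        "copy_out": bool(cmd & IOC_OUT),
        "size": (cmd >> 16) & IOCPARM_MASK,   # argument size in bytes
        "group": chr((cmd >> 8) & 0xFF),      # 'd' is the DRM group
        "number": cmd & 0xFF,                 # request number in the group
    }


for cmd in (0x80086409, 0x80086464):
    print(hex(cmd), decode_ioctl(cmd))
```

The group, number and size fields decoded for 0x80086464 agree with the
'd', 0x64 and 0x8 that ktruss prints for the interrupted ioctl.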

 The only additional information in the "Xorg.0.log" file is the line noted
 before:

 [   420.146] [mi] EQ overflowing. The server is probably stuck in an infinite loop.

 Sending SIGKILL to the X server renders the machine mostly unresponsive.
 The terminal driver still echoes/translates characters, but that's about
 all.

 On Sat, 7 Mar 2015, matthew green wrote:

 >  can you reproduce this?  if so, please also run "show lock c2fddb54"
 >  from ddb -- where the first argument to mutex_spin_enter is the
 >  argument to show lock.

 I was not able to reproduce a similar backtrace showing "mutex_spin_enter()".

 Instead, this time I got the following:

 [BREAK sent]
 fatal breakpoint trap in supervisor mode
 trap type 1 code 0 eip c02518c4 cs 8 eflags 200202 cr2 bbae74f8 ilevel 8 esp da782f6c
 curlwp 0xc38e4d40 pid 114 lid 1 lowest kstack 0xdb4da2c0
 Stopped in pid 114.1 (Xorg) at  netbsd:breakpoint+0x4:  popl    %ebp
 db{0}> bt
 breakpoint(c0aecac0,3f8,5,c0b41440,c0b41440,c046bb9e,c34b3608,c34b3580,c34e16b6,c34e2000) at netbsd:breakpoint+0x4
 comintr(c34b34c8,db4dcc08,0,0,0,0,0,0,0,0) at netbsd:comintr+0x5f5
 --- switch to interrupt stack ---
 Xintr_legacy4() at netbsd:Xintr_legacy4+0xc3
 --- interrupt ---
 sigispending(c38e4d40,0,c38e4d40,0,c0446847,1,0,c2fe0b54,c38e4dac,c2fe0b50) at netbsd:sigispending+0xc
 sleepq_block(32,1,c09bb183,c0b16030,c0446847,c2fbfb00,c2fc1d40,32,40a14,c2fe0000) at netbsd:sleepq_block+0x106
 cv_timedwait_sig(c2fe0b54,c2fe0b50,32,bffb,0,0,c0000000,1003fff,c2fe08c0,db4dcd70) at netbsd:cv_timedwait_sig+0x103
 radeon_fence_wait_seq(c2fe0870,c2fe0000,5,0,5,0,0,0,0,0) at netbsd:radeon_fence_wait_seq+0x125
 radeon_fence_wait(c3b22abc,1,1,c9c210,1,c2fe0701,c3b22abc,c38d0484,0,c38d040c) at netbsd:radeon_fence_wait+0x6b
 ttm_bo_wait(c38d043c,1,1,0,c38d043c,c2fe0000,c38d0598,c38d040c,db4dce44,c061c29b) at netbsd:ttm_bo_wait+0x8a
 radeon_bo_wait(c38d040c,0,0,0,0,0,0,c09385d0,c3c9c1f8,c374f70c) at netbsd:radeon_bo_wait+0xac
 radeon_gem_wait_idle_ioctl(c374f70c,db4dceb0,c3c9c1f8,0,0,0,db4dcf68,c3fa4400,0,db4dcf3c) at netbsd:radeon_gem_wait_idle_ioctl+0x4b
 drm_ioctl(c3fa4400,80086464,db4dceb0,0,0,0,0,0,8,0) at netbsd:drm_ioctl+0x135
 sys_ioctl(c38e4d40,db4dcf68,db4dcf60,db4dcf60,fffffffe,db4dcf68,c0b13898,0,0,b) at netbsd:sys_ioctl+0x1ae
 syscall() at netbsd:syscall+0x16f
 --- syscall (number 54) ---
 bb771f77:
 db{0}>

 >  this seems like a deadlock, and the above will show info about the
 >  lock being waited on.

 There are similarities to the previous backtrace, but not the specific
 item of interest.  What should I consider of interest in this backtrace,
 or any like it in future trials?

 -- 
 |/"\ John D. Baker, KN5UKS               NetBSD     Darwin/MacOS X
 |\ / jdbaker[snail]mylinuxisp[flyspeck]com    OpenBSD            FreeBSD
 | X  No HTML/proprietary data in email.   BSD just sits there and works!
 |/ \ GPGkeyID:  D703 4A7E 479F 63F8 D3F4  BD99 9572 8F23 E4AD 1645

From: matthew green <mrg@eterna.com.au>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
    netbsd-bugs@netbsd.org, jdbaker@mylinuxisp.com
Subject: re: kern/49710: i386 radeondrmkms panic when starting Xorg
Date: Thu, 23 Apr 2015 18:39:08 +1000

 i'm curious what the current status of this PR is.  on my similarly
 behaving systems, it fails slightly less poorly now, while still
 failing pretty badly.


 >  The 'xterm' instance spawned a new shell process.  Sending a SIGHUP to that
 >  shell process caused it and the 'xterm' process to exit as expected.  As it
 >  was the last client, the Xserver should have reset itself, but it instead
 >  continues to alternate between RUN state and waiting in "radfence".

 this sounds like the same problem i see on R200.  so far it seems
 that we're hitting a case where the hardware is hung or isn't
 doing what we expect, and never gives an indication that it is done.

 >  [   420.146] [mi] EQ overflowing. The server is probably stuck in an infinite loop.
 >  
 >  Sending SIGKILL to the X server renders the machine mostly unresponsive.
 >  The terminal driver still echoes/translates characters, but that's about
 >  all.

 in my testing, this seems to be related to the fact that some IO
 occurs during signal handling, but otherwise we're spinning in
 userland, polling the kernel to see whether an operation has completed,
 and each time we notice it hasn't, we can do stuff like update the
 mouse pointer, or handle other async IO.
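
The polling behaviour described here, together with the repeated
"Err#4 EINTR" results in the ktruss output earlier in this thread, can be
modelled with a small retry loop.  This is a sketch under stated
assumptions, not the X server's or libdrm's actual code: FakeGpu and
wait_idle_with_retry are hypothetical names, the fake stands in for the
kernel's wait-idle ioctl, and the attempt cap exists only so the example
terminates.  (As I recall, libdrm's ioctl wrapper retries on EINTR in a
similar way, but nothing here depends on that.)

```python
import errno


def wait_idle_with_retry(do_ioctl, max_attempts):
    """Call do_ioctl() until it stops failing with EINTR, up to max_attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            do_ioctl()
            return attempt        # the wait completed on this attempt
        except InterruptedError:  # EINTR: a signal interrupted the sleep
            continue              # userland does other work, then waits again
    return None                   # gave up: the fence never signalled


class FakeGpu:
    """Stand-in for the kernel side; its fence signals after N waits or never."""

    def __init__(self, signal_after=None):
        self.signal_after = signal_after
        self.calls = 0

    def wait_idle(self):
        self.calls += 1
        if self.signal_after is None or self.calls < self.signal_after:
            raise InterruptedError(errno.EINTR, "Interrupted system call")


healthy = FakeGpu(signal_after=3)
hung = FakeGpu(signal_after=None)
print(wait_idle_with_retry(healthy.wait_idle, 10))
print(wait_idle_with_retry(hung.wait_idle, 10))
```

With a fence that never signals, every attempt ends in EINTR, which matches
the pattern of a server that stays responsive to the mouse but never
finishes drawing.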

 >  I was not able to reproduce a similar backtrace showing "mutex_spin_enter()".
 >  
 >  Instead, this time I got the following:
 >  
 >  [BREAK sent]
 [ ... ]
 >  radeon_fence_wait_seq(c2fe0870,c2fe0000,5,0,5,0,0,0,0,0) at netbsd:radeon_fence_wait_seq+0x125
 >  radeon_fence_wait(c3b22abc,1,1,c9c210,1,c2fe0701,c3b22abc,c38d0484,0,c38d040c) at netbsd:radeon_fence_wait+0x6b
 >  ttm_bo_wait(c38d043c,1,1,0,c38d043c,c2fe0000,c38d0598,c38d040c,db4dce44,c061c29b) at netbsd:ttm_bo_wait+0x8a
 >  radeon_bo_wait(c38d040c,0,0,0,0,0,0,c09385d0,c3c9c1f8,c374f70c) at netbsd:radeon_bo_wait+0xac
 [ ... ]
 >  
 >  >  this seems like a deadlock, and the above will show info about the
 >  >  lock being waited on.
 >  
 >  There are similarities to the previous backtrace, but not the specific
 >  item of interest.  What should I consider of interest in this backtrace,
 >  or any like it in future trials?

 actually, i've pretty much convinced myself this problem is the
 same basic problem i see on my R200 cards/systems (it happens
 similarly on a PCI 9250 card and a laptop agp 9000-M).

 it reminds me of my failed attempts to port drm a long long time
 ago where the CP busy flag would never clear, and the system
 would basically hang.  i'd seen this looping failure in both the
 drm kernel code, and the ati X ddx driver.  the latter was less
 annoying to determine :)

 these went away with the original drm code in sys/dev/drm
 (committed by drochner@.)


 .mrg.

From: "John D. Baker" <jdbaker@mylinuxisp.com>
To: matthew green <mrg@eterna.com.au>
Cc: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org
Subject: re: kern/49710: i386 radeondrmkms panic when starting Xorg
Date: Thu, 23 Apr 2015 06:06:01 -0500 (CDT)

 On Thu, 23 Apr 2015, matthew green wrote:

 > i'm curious what the current status of this PR is.  on my similarly
 > behaving systems, it fails slightly less poorly now, while still
 > failing pretty badly.

 I tried again a few days ago with the latest -current as of then.  There
 was no observable change, except that I think the hack to compensate
 for the almost-black-on-black framebuffer console didn't make weird
 colors on my A31p (which isn't affected by a-b-o-b-f-c to begin with).
 I didn't test further than X w/acceleration enabled (which hung as
 before), but at least the framebuffer console text wasn't rainbow-colored.

 I'm travelling right now, but have a -current tree with me.  When I
 reach my destination, I'll update, build and test again (-current kernel
 on netbsd-7 userland).

 -- 
 |/"\ John D. Baker, KN5UKS               NetBSD     Darwin/MacOS X
 |\ / jdbaker[snail]mylinuxisp[flyspeck]com    OpenBSD            FreeBSD
 | X  No HTML/proprietary data in email.   BSD just sits there and works!
 |/ \ GPGkeyID:  D703 4A7E 479F 63F8 D3F4  BD99 9572 8F23 E4AD 1645

From: "John D. Baker" <jdbaker@mylinuxisp.com>
To: matthew green <mrg@eterna.com.au>
Cc: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org
Subject: re: kern/49710: i386 radeondrmkms panic when starting Xorg
Date: Sun, 26 Apr 2015 08:09:13 -0500 (CDT)

 On Thu, 23 Apr 2015, John D. Baker wrote:

 > was no observable change, except that I think the hack to compensate
 > for the almost-black-on-black framebuffer console didn't make weird
 > colors on my A31p (which isn't affected by a-b-o-b-f-c to begin with).
 > I didn't test further than X w/acceleration enabled (which hung as
 > before), but at least the framebuffer console text wasn't rainbow-colored.

 After a protracted series of updates while waiting for recent breakage to
 be mitigated, I finally booted a -current kernel as of 201504251800Z or
 so.

 The almost-black-on-black console workaround makes green kernel messages
 unreadable as if pixels are missing and the normal white text is sprinkled
 with red and blue pixels (remember displaying 80-column text on an NTSC
 composite color monitor ala Apple ][?).  (I'm using FONT_GLASS10x19.
 Maybe I should take that out, reverting to FONT_BOLD8x16 in the interim.)

 X behavior is mostly unchanged.  Running without acceleration works,
 although the colors are now borked.  Running WITH acceleration "hangs"
 as described previously although the most recent run displayed hash
 across the root window with an intact and working mouse cursor.  Starting
 an 'xterm' as a sample client never opened its window and instead the
 X virtual terminal went completely black as if nothing was running.
 Back over on the console the message "usl_attach timeout" began appearing
 about once every 30 seconds or so.

 Once the X server is running with acceleration enabled, attempts to stop
 it hang the machine and one must drop into DDB and reboot from there.

 -- 
 |/"\ John D. Baker, KN5UKS               NetBSD     Darwin/MacOS X
 |\ / jdbaker[snail]mylinuxisp[flyspeck]com    OpenBSD            FreeBSD
 | X  No HTML/proprietary data in email.   BSD just sits there and works!
 |/ \ GPGkeyID:  D703 4A7E 479F 63F8 D3F4  BD99 9572 8F23 E4AD 1645

>Unformatted:
