NetBSD Problem Report #47057
From www@NetBSD.org Thu Oct 11 17:17:06 2012
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
by www.NetBSD.org (Postfix) with ESMTP id 6D60263E407
for <gnats-bugs@gnats.NetBSD.org>; Thu, 11 Oct 2012 17:17:06 +0000 (UTC)
Message-Id: <20121011171705.85EF163E3BF@www.NetBSD.org>
Date: Thu, 11 Oct 2012 17:17:05 +0000 (UTC)
From: royger@netbsd.org
Reply-To: royger@netbsd.org
To: gnats-bugs@NetBSD.org
Subject: Xen NetBSD DomU file system trash under Linux Dom0
X-Send-Pr-Version: www-1.0
>Number: 47057
>Category: port-xen
>Synopsis: Xen NetBSD DomU file system trash under Linux Dom0
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: port-xen-maintainer
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Thu Oct 11 17:20:00 +0000 2012
>Closed-Date: Fri Nov 30 09:30:57 +0000 2012
>Last-Modified: Fri Nov 30 09:30:57 +0000 2012
>Originator: Roger Pau Monné
>Release: 6.0RC2
>Organization:
Citrix
>Environment:
NetBSD 6.0_RC2 NetBSD 6.0_RC2 (XEN3_DOMU) #6: Wed Sep 26 18:06:29 BST 2012 root@roger-xen:/root/obj/sys/arch/amd64/compile/XEN3_DOMU amd64
>Description:
This problem might be related to 'port-xen/47056', and the root cause might actually be the same, but I'm filing them as separate PRs until we can figure out whether they are related.
When doing heavy IO inside a NetBSD DomU backed by a Linux Dom0 I get random file system crashes. I've seen this with FFSv1 and FFSv2, with WAPBL both enabled and disabled. The panics were usually about freeing an already freed block, but I've also seen that on a fresh install you can sometimes end up with corrupted files (when performing the install from the netbsd-INSTALL_XEN3_DOMU kernel).
>How-To-Repeat:
As with 'port-xen/47056', the easiest way to reproduce this is to build NetBSD from sources inside a DomU backed by an MP Linux Dom0.
>Fix:
I'm not sure about this, but I think we have a problem with reentrancy of the xen event channel callback (do_hypervisor_callback in hypervisor_machdep.c), but I haven't been able to find a fix for this.
The right solution might be to bind all events to CPU#0 and use a producer/consumer approach to dispatch them to different threads. This way it will be easier to block all events while we are in the callback itself, and then it's just a matter of calling the appropriate callback from the "consumer" thread. We will also be sure that callbacks won't be nested (i.e. we will not have reentrant callbacks).
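A minimal sketch of the producer/consumer idea above, assuming a fixed-size single-producer/single-consumer queue of event channel port numbers. All names here (struct evq, evq_push, evq_pop, EVQ_SIZE) are hypothetical and do not come from the NetBSD tree; a real kernel version would also need memory barriers and a wakeup for the consumer thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EVQ_SIZE 8u	/* hypothetical capacity, must be a power of two */

static_assert((EVQ_SIZE & (EVQ_SIZE - 1)) == 0, "EVQ_SIZE must be a power of two");

struct evq {
	unsigned int ports[EVQ_SIZE];
	unsigned int prod;	/* free-running index, written by the callback only */
	unsigned int cons;	/* free-running index, written by the consumer only */
};

/* Producer side: the single event callback only records the port. */
static bool
evq_push(struct evq *q, unsigned int port)
{
	if (q->prod - q->cons == EVQ_SIZE)
		return false;		/* queue full, event must stay pending */
	q->ports[q->prod % EVQ_SIZE] = port;
	q->prod++;
	return true;
}

/* Consumer side: the dispatcher thread pops one port and runs its handler. */
static bool
evq_pop(struct evq *q, unsigned int *port)
{
	assert(port != NULL);
	if (q->prod == q->cons)
		return false;		/* nothing queued */
	*port = q->ports[q->cons % EVQ_SIZE];
	q->cons++;
	return true;
}
```

Because handlers would only ever run from the consumer loop, one handler cannot interrupt another; the cost is extra latency between the event arriving and the handler running.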
>Release-Note:
>Audit-Trail:
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: gnats-bugs@NetBSD.org
Cc: port-xen-maintainer@NetBSD.org, gnats-admin@NetBSD.org,
netbsd-bugs@NetBSD.org
Subject: Re: port-xen/47057: Xen NetBSD DomU file system trash under Linux
Dom0
Date: Thu, 11 Oct 2012 19:28:51 +0200
On Thu, Oct 11, 2012 at 05:20:00PM +0000, royger@NetBSD.org wrote:
> >Fix:
> I'm not sure about this, but I think we have a problem with reentrancy of the xen event channel callback (do_hypervisor_callback in hypervisor_machdep.c), but I haven't been able to find a fix for this.
Can you expand on this ? AFAIK this code is safe.
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--
From: Roger Pau Monné <royger@NetBSD.org>
To: gnats-bugs@netbsd.org
Cc: port-xen-maintainer@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org
Subject: Re: port-xen/47057: Xen NetBSD DomU file system trash under Linux Dom0
Date: Fri, 12 Oct 2012 17:52:22 +0200
On Thu, Oct 11, 2012 at 7:30 PM, Manuel Bouyer <bouyer@antioche.eu.org> wrote:
> On Thu, Oct 11, 2012 at 05:20:00PM +0000, royger@NetBSD.org wrote:
> > >Fix:
> > I'm not sure about this, but I think we have a problem with reentrancy of the xen event channel callback (do_hypervisor_callback in hypervisor_machdep.c), but I haven't been able to find a fix for this.
>
>
> Can you expand on this ? AFAIK this code is safe.
I'm not sure, but I think we might have a problem when we call
intr_biglock_wrapper, this function takes the kernel_lock, but just
before calling it we call sti(), which allows further hypervisor
callbacks. Isn't it possible that another hypervisor callback
interrupts the execution of the handler, leaving the kernel_lock held
and thus locking the system when this new callback tries to execute a
handler?
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Roger Pau Monné <royger@NetBSD.org>
Cc: gnats-bugs@NetBSD.org, port-xen-maintainer@NetBSD.org,
gnats-admin@NetBSD.org, netbsd-bugs@NetBSD.org
Subject: Re: port-xen/47057: Xen NetBSD DomU file system trash under Linux
Dom0
Date: Fri, 12 Oct 2012 18:10:27 +0200
On Fri, Oct 12, 2012 at 05:52:22PM +0200, Roger Pau Monné wrote:
> I'm not sure, but I think we might have a problem when we call
> intr_biglock_wrapper, this function takes the kernel_lock, but just
> before calling it we call sti(), which allows further hypervisor
> callbacks. Isn't it possible that another hypervisor callback
> interrupts the execution of the handler, leaving the kernel_lock held
> and thus locking the system when this new callback tries to execute a
> handler?
Either the new callback wants to execute a MPSAFE handler and things will
run fine, or it will also try to grab the kernel_lock, and will either
succeed or be delayed depending on who did take the kernel_lock before.
Remember that kernel_lock is not a simple mutex, it's reentrant on
the same processor. So we may have one handler interrupted and the
new callback can grab the kernel_lock, because it's already owned
by its CPU. But then handler reentrancy is protected by the traditional
spl mechanism.
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--
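The point above about kernel_lock being reentrant on the same CPU can be illustrated with a toy model. This is only a sketch, assuming a single-threaded simulation in which CPUs are plain integer ids; the names (toy_biglock, toy_lock_try, toy_unlock) are hypothetical, and this is not how NetBSD's kernel_lock is actually implemented.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_OWNER (-1)

struct toy_biglock {
	int owner;		/* CPU id holding the lock, or NO_OWNER */
	unsigned int depth;	/* how many times the owner has taken it */
};

/* Succeeds if the lock is free or already held by the same CPU. */
static bool
toy_lock_try(struct toy_biglock *l, int cpu)
{
	if (l->owner != NO_OWNER && l->owner != cpu)
		return false;	/* held by another CPU: caller has to wait */
	l->owner = cpu;
	l->depth++;
	return true;
}

/* Drops one level; the lock becomes free when the depth reaches zero. */
static void
toy_unlock(struct toy_biglock *l, int cpu)
{
	assert(l->owner == cpu && l->depth > 0);
	if (--l->depth == 0)
		l->owner = NO_OWNER;
}
```

In this model a nested callback on CPU 0 can take the lock while an interrupted handler on CPU 0 still holds it, so the nesting by itself does not deadlock; keeping the two handlers from interfering with each other is left to the spl mechanism, as described above.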
From: Roger Pau Monné <royger@NetBSD.org>
To: Manuel Bouyer <bouyer@antioche.eu.org>
Cc: gnats-bugs@netbsd.org, port-xen-maintainer@netbsd.org,
gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: port-xen/47057: Xen NetBSD DomU file system trash under Linux Dom0
Date: Fri, 12 Oct 2012 18:55:56 +0200
On Fri, Oct 12, 2012 at 6:10 PM, Manuel Bouyer <bouyer@antioche.eu.org> wrote:
> Either the new callback wants to execute a MPSAFE handler and things will
> run fine, or it will also try to grab the kernel_lock, and will either
> succeed or be delayed depending on who did take the kernel_lock before.
>
> Remember that kernel_lock is not a simple mutex, it's reentrant on
> the same processor. So we may have one handler interrupted and the
> new callback can grab the kernel_lock, because it's already owned
> by its CPU. But then handler reentrancy is protected by the traditional
> spl mechanism.
When you say it's protected by the same mechanism as normal handlers,
I assume it's due to the code in evtchn.c, the "splx" section of
evtchn_do_event, so there's no way for example that xbd_handler might
be interrupted by another xbd_handler?
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Roger Pau Monné <royger@NetBSD.org>
Cc: gnats-bugs@NetBSD.org, port-xen-maintainer@NetBSD.org,
gnats-admin@NetBSD.org, netbsd-bugs@NetBSD.org
Subject: Re: port-xen/47057: Xen NetBSD DomU file system trash under Linux
Dom0
Date: Fri, 12 Oct 2012 19:11:08 +0200
On Fri, Oct 12, 2012 at 06:55:56PM +0200, Roger Pau Monné wrote:
> When you say it's protected by the same mechanism as normal handlers,
> I assume it's due to the code in evtchn.c, the "splx" section of
> evtchn_do_event, so there's no way for example that xbd_handler might
> be interrupted by another xbd_handler?
yes.
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--
From: Roger Pau Monné <roger.pau@citrix.com>
To: "gnats-bugs@NetBSD.org" <gnats-bugs@NetBSD.org>
Cc: Manuel Bouyer <bouyer@antioche.eu.org>, "port-xen-maintainer@netbsd.org"
<port-xen-maintainer@netbsd.org>, "gnats-admin@netbsd.org"
<gnats-admin@netbsd.org>, "netbsd-bugs@netbsd.org" <netbsd-bugs@netbsd.org>,
"royger@netbsd.org" <royger@netbsd.org>
Subject: Re: port-xen/47057: Xen NetBSD DomU file system trash under Linux
Dom0
Date: Sat, 20 Oct 2012 13:31:37 +0200
More info on this subject: I was able to get into ddb after the system
froze (using +++++). Here is the output:
fatal breakpoint trap in supervisor mode
trap type 1 code 0 rip ffffffff80130bf5 cs e030 rflags 202 cr2
7f7ff780c390 cpl 8 rsp ffffa0002e4848a8
Stopped in pid 0.26 (system) at netbsd:breakpoint+0x5: leave
breakpoint() at netbsd:breakpoint+0x5
xencons_tty_input() at netbsd:xencons_tty_input+0xc9
xencons_handler() at netbsd:xencons_handler+0x79
intr_biglock_wrapper() at netbsd:intr_biglock_wrapper+0x1d
evtchn_do_event() at netbsd:evtchn_do_event+0x15a
call_evtchn_do_event() at netbsd:call_evtchn_do_event+0xd
hypervisor_callback() at netbsd:hypervisor_callback+0x9e
xenbus_thread() at netbsd:xenbus_thread+0xf5
ds ca00
es 4a58
fs b9e8
gs 6640
rdi 1
rsi ffffffff80a7b303
rbp ffffa0002e4848a8
rbx ffffffff80a7b303
rdx 2b
rcx 2b
rax 7f
r8 ffffffff80a8f800
r9 0
r10 ffffffff80a8fa00
r11 246
r12 ffffa00002365090
r13 ffffffff80a7b303
r14 ffffa00002367340
r15 1
rip ffffffff80130bf5 breakpoint+0x5
cs e030
rflags 202
rsp ffffa0002e4848a8
ss e02b
And the ps:
db{0}> ps
PID LID S CPU FLAGS STRUCT LWP * NAME WAIT
26620 1 2 0 0 ffffa00008045b00 cc1
28013 1 3 0 80 ffffa00002fe61a0 x86_64--netbsd-g wait
23074 1 3 0 80 ffffa000080456e0 nbmkdep wait
16970 1 3 0 80 ffffa00002628180 sh wait
14476 1 2 0 0 ffffa00002b6f600 nbmkdep
24751 1 2 0 0 ffffa0000404a4c0 as
23651 1 3 0 80 ffffa000034d95a0 sh wait
16446 1 2 0 40000 ffffa00002fe65c0 cc1
3398 1 3 0 80 ffffa00002fd95a0 x86_64--netbsd-g wait
22279 1 3 0 80 ffffa0000387a980 sh wait
16566 1 3 0 40080 ffffa000076672c0 nbmake select
22489 1 3 0 40080 ffffa0000236e520 sh wait
16321 1 3 0 80 ffffa00002aa7660 nbmake select
21974 1 3 0 80 ffffa00002573960 sh wait
5656 1 3 0 80 ffffa0000236b8e0 x86_64--netbsd-g wait
6264 1 3 0 80 ffffa000075881e0 sh wait
870 1 3 0 80 ffffa0000404a0a0 nbmake select
28228 1 3 0 80 ffffa00002aa7a80 sh wait
19537 1 3 0 80 ffffa000076676e0 nbmake select
19740 1 3 0 80 ffffa00007588600 sh wait
8737 1 3 0 80 ffffa00002f8e220 nbmake select
29418 1 3 0 80 ffffa00002fa5aa0 sh wait
4427 1 3 0 80 ffffa0000263a5c0 nbmake select
28916 1 3 0 80 ffffa000080452c0 sh wait
13113 1 3 0 80 ffffa00002fe69e0 nbmake select
19412 1 3 0 80 ffffa000075691c0 sh wait
583 1 3 0 80 ffffa0000387a140 nbmake select
5923 1 3 0 80 ffffa000085da860 sh wait
21434 1 3 0 80 ffffa0000404a8e0 nbmake select
22103 1 3 0 80 ffffa00002f8ea60 sh wait
22077 1 3 0 80 ffffa00002b6f1e0 nbmake select
6976 1 3 0 80 ffffa000085da020 sh wait
18463 1 3 0 80 ffffa000075695e0 nbmake select
19784 1 3 0 80 ffffa00002f8e640 sh wait
8975 1 3 0 80 ffffa00007eb0420 nbmake select
6597 1 3 0 80 ffffa000026289c0 sh wait
18499 1 3 0 80 ffffa00002aa7240 nbmake select
649 1 3 0 80 ffffa0000263a1a0 sh wait
23152 1 3 0 80 ffffa0000387a560 nbmake select
11133 1 3 0 80 ffffa00007588a20 sh wait
11482 1 2 0 0 ffffa00007569a00 getty
15288 1 3 0 80 ffffa00007667b00 sh wait
6588 1 3 0 80 ffffa000034d99c0 screen-4.0.3 select
541 1 3 0 80 ffffa000026285a0 getty nanoslp
479 1 3 0 80 ffffa00002573540 getty nanoslp
539 1 3 0 80 ffffa0000236e100 getty nanoslp
532 1 3 0 80 ffffa000025fd580 cron nanoslp
535 1 3 0 80 ffffa0000263a9e0 inetd kqueue
333 1 3 0 80 ffffa000025fd9a0 sshd select
463 1 3 0 80 ffffa00002590980 powerd kqueue
307 1 2 0 0 ffffa000025fd160 syslogd
249 1 3 0 80 ffffa00002590560 dhcpcd select
1 1 3 0 80 ffffa0000236c0c0 init wait
0 36 3 0 200 ffffa00002590140 physiod physiod
0 35 3 0 200 ffffa0000236b4c0 aiodoned aiodoned
0 34 3 0 200 ffffa0000236c900 ioflush syncer
0 33 3 0 200 ffffa0000236b0a0 pgdaemon pgdaemon
0 30 3 0 200 ffffa0000235e080 cryptoret crypto_w
0 29 3 0 200 ffffa0000236c4e0 xen_balloon xen_balloon
0 28 3 0 200 ffffa0000236d920 unpgc unpgc
0 27 3 0 200 ffffa0000236d500 vmem_rehash vmem_rehash
0 > 26 7 0 200 ffffa0000236e940 xenbus
0 25 3 0 200 ffffa0000236d0e0 xenwatch evtsq
0 15 3 0 200 ffffa0000235e4a0 pmfsuspend pmfsuspend
0 14 3 0 200 ffffa0000235e8c0 pmfevent pmfevent
0 13 3 0 200 ffffa00001ee4060 sopendfree sopendfr
0 12 3 0 200 ffffa00001ee4480 nfssilly nfssilly
0 11 3 0 200 ffffa00001ee48a0 cachegc cachegc
0 10 3 0 200 ffffa00001ee3040 vrele vrele
0 9 3 0 200 ffffa00001ee3460 vdrain vdrain
0 8 3 0 200 ffffa00001ee3880 modunload mod_unld
0 7 3 0 200 ffffa00001ed9020 xcall/0 xcall
0 6 1 0 200 ffffa00001ed9440 softser/0
0 5 1 0 200 ffffa00001ed9860 softclk/0
0 4 1 0 200 ffffa00001ed6000 softbio/0
0 3 1 0 200 ffffa00001ed6420 softnet/0
0 2 1 0 201 ffffa00001ed6840 idle/0
0 1 3 0 200 ffffffff805b4c80 swapper uvm
I will try to create a patch that shows the value of the ring indexes,
since I'm pretty sure they are screwed up, and the system was blocked in
xenbus_thread because of that before the callback came in.
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Roger Pau Monné <roger.pau@citrix.com>
Cc: "gnats-bugs@NetBSD.org" <gnats-bugs@NetBSD.org>,
"port-xen-maintainer@netbsd.org" <port-xen-maintainer@NetBSD.org>,
"gnats-admin@netbsd.org" <gnats-admin@NetBSD.org>,
"netbsd-bugs@netbsd.org" <netbsd-bugs@NetBSD.org>,
"royger@netbsd.org" <royger@NetBSD.org>
Subject: Re: port-xen/47057: Xen NetBSD DomU file system trash under Linux
Dom0
Date: Sat, 20 Oct 2012 13:47:56 +0200
On Sat, Oct 20, 2012 at 01:31:37PM +0200, Roger Pau Monné wrote:
> More info on this subject: I was able to get into ddb after the system
> froze (using +++++). Here is the output:
>
> […]
>
> I will try to create a patch that shows the value of the ring indexes,
> since I'm pretty sure they are screwed up, and the system was blocked in
> xenbus_thread because of that before the callback came in.
What would be interesting here is a
tr/a ffffa0000236e940
(the lwp pointer of the xenbus thread). And also what xenbus_thread+0xf5
points to in sources.
You can also try to type 'continue' and enter ddb again to see
if things change (especially where in xenbus_thread it is
interrupted).
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--
From: Roger Pau Monné <roger.pau@citrix.com>
To: Manuel Bouyer <bouyer@antioche.eu.org>
Cc: "gnats-bugs@NetBSD.org" <gnats-bugs@NetBSD.org>,
"port-xen-maintainer@netbsd.org" <port-xen-maintainer@NetBSD.org>,
"gnats-admin@netbsd.org" <gnats-admin@NetBSD.org>, "netbsd-bugs@netbsd.org"
<netbsd-bugs@NetBSD.org>, "royger@netbsd.org" <royger@NetBSD.org>
Subject: Re: port-xen/47057: Xen NetBSD DomU file system trash under Linux
Dom0
Date: Sat, 20 Oct 2012 17:31:11 +0200
On 20/10/12 13:47, Manuel Bouyer wrote:
> What would be interesting here is a
> tr/a ffffa0000236e940
db{0}> tr/a ffffa0000236e940
trace: pid 0 lid 26 at 0xffffa0002e4848a8
breakpoint() at netbsd:breakpoint+0x5
xencons_tty_input() at netbsd:xencons_tty_input+0xc9
xencons_handler() at netbsd:xencons_handler+0x79
intr_biglock_wrapper() at netbsd:intr_biglock_wrapper+0x1d
evtchn_do_event() at netbsd:evtchn_do_event+0x15a
call_evtchn_do_event() at netbsd:call_evtchn_do_event+0xd
hypervisor_callback() at netbsd:hypervisor_callback+0x9e
xenbus_thread() at netbsd:xenbus_thread+0xf5
> (the lwp pointer of the xenbus thread). And also what xenbus_thread+0xf5
> points to in sources.
Since the kernel was compiled without -g I guess there's no way to get
that now.
> You can also try to type 'continue' and enter ddb again to see
> if things change (especially where in xenbus_thread it is
> interrupted).
Nope, the system is completely frozen, tried several times and the trace
is exactly the same. Will compile a new kernel with -g and let's see
what I can get, but I bet xenbus_thread is blocked at:
831: printk("XENBUS error %d while reading message\n", err);
In fact I'm going to replace that with a panic.
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Roger Pau Monné <roger.pau@citrix.com>
Cc: "gnats-bugs@NetBSD.org" <gnats-bugs@NetBSD.org>,
"port-xen-maintainer@netbsd.org" <port-xen-maintainer@NetBSD.org>,
"gnats-admin@netbsd.org" <gnats-admin@NetBSD.org>,
"netbsd-bugs@netbsd.org" <netbsd-bugs@NetBSD.org>,
"royger@netbsd.org" <royger@NetBSD.org>
Subject: Re: port-xen/47057: Xen NetBSD DomU file system trash under Linux
Dom0
Date: Sat, 20 Oct 2012 17:38:53 +0200
On Sat, Oct 20, 2012 at 05:31:11PM +0200, Roger Pau Monné wrote:
> > You can also try to type 'continue' and enter ddb again to see
> > if things change (especially where in xenbus_thread it is
> > interrupted).
>
> Nope, the system is completely frozen, tried several times and the trace
> is exactly the same.
You mean, it's always at xenbus_thread+0xf5 ? You never see other offsets ?
> Will compile a new kernel with -g and let's see
> what I can get, but I bet xenbus_thread is blocked at:
>
> 831: printk("XENBUS error %d while reading message\n", err);
>
> In fact I'm going to replace that with a panic.
Maybe just a printf instead of a printk at first.
If the offset never changes, I wonder if it could be stuck in
(void)HYPERVISOR_console_io(CONSOLEIO_write, ret, buf);
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--
From: Roger Pau Monné <roger.pau@citrix.com>
To: Manuel Bouyer <bouyer@antioche.eu.org>
Cc: "gnats-bugs@NetBSD.org" <gnats-bugs@NetBSD.org>,
"port-xen-maintainer@netbsd.org" <port-xen-maintainer@NetBSD.org>,
"gnats-admin@netbsd.org" <gnats-admin@NetBSD.org>, "netbsd-bugs@netbsd.org"
<netbsd-bugs@NetBSD.org>, "royger@netbsd.org" <royger@NetBSD.org>
Subject: Re: port-xen/47057: Xen NetBSD DomU file system trash under Linux
Dom0
Date: Sat, 20 Oct 2012 17:48:08 +0200
On 20/10/12 17:38, Manuel Bouyer wrote:
> On Sat, Oct 20, 2012 at 05:31:11PM +0200, Roger Pau Monné wrote:
>>> You can also try to type 'continue' and enter ddb again to see
>>> if things change (especially where in xenbus_thread it is
>>> interrupted).
>>
>> Nope, the system is completely frozen, tried several times and the trace
>> is exactly the same.
>
> You mean, it's always at xenbus_thread+0xf5 ? You never see other offsets ?
No, no other offsets.
>
>> Will compile a new kernel with -g and let's see
>> what I can get, but I bet xenbus_thread is blocked at:
>>
>> 831: printk("XENBUS error %d while reading message\n", err);
>>
>> In fact I'm going to replace that with a panic.
>
> Maybe just a printf instead of a printk at first.
Tried that in the past (replacing the printk with a printf), and then I
just get an infinite printf loop: intf->rsp_cons and intf->rsp_prod are
corrupted, check_indexes in xb_read always returns false, and this leads
to an infinite loop in xenbus_thread because process_msg always returns an error.
I've now compiled a kernel that has the panic and prints the ring
indexes. What's the best way to check who modifies intf->rsp_cons and
intf->rsp_prod? Will ddb watch work on this kind of memory region?
> If the offset never changes, I wonder if it could be stuck in
> (void)HYPERVISOR_console_io(CONSOLEIO_write, ret, buf);
>
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Roger Pau Monné <roger.pau@citrix.com>
Cc: "gnats-bugs@NetBSD.org" <gnats-bugs@NetBSD.org>,
"port-xen-maintainer@netbsd.org" <port-xen-maintainer@NetBSD.org>,
"gnats-admin@netbsd.org" <gnats-admin@NetBSD.org>,
"netbsd-bugs@netbsd.org" <netbsd-bugs@NetBSD.org>,
"royger@netbsd.org" <royger@NetBSD.org>
Subject: Re: port-xen/47057: Xen NetBSD DomU file system trash under Linux
Dom0
Date: Sat, 20 Oct 2012 17:57:21 +0200
On Sat, Oct 20, 2012 at 05:48:08PM +0200, Roger Pau Monné wrote:
> >
> >> Will compile a new kernel with -g and let's see
> >> what I can get, but I bet xenbus_thread is blocked at:
> >>
> >> 831: printk("XENBUS error %d while reading message\n", err);
> >>
> >> In fact I'm going to replace that with a panic.
> >
> > Maybe just a printf instead of a printk at first.
>
> Tried that in the past (replacing the printk with a printf), and then I
> just get an infinite printf loop: intf->rsp_cons and intf->rsp_prod are
> corrupted, check_indexes in xb_read always returns false, and this leads
> to an infinite loop in xenbus_thread because process_msg always returns an error.
>
> I've now compiled a kernel that has the panic and prints the ring
> indexes. What's the best way to check who modifies intf->rsp_cons and
> intf->rsp_prod? Will ddb watch work on this kind of memory region?
You can try, but at first glance I'd say it won't work.
Can you determine if it's cons or prod (or both) which is corrupted,
and in which way ? What are the values when it's corrupted ?
Are they always the same ?
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--
From: Roger Pau Monné <roger.pau@citrix.com>
To: Manuel Bouyer <bouyer@antioche.eu.org>
Cc: "gnats-bugs@NetBSD.org" <gnats-bugs@NetBSD.org>,
"port-xen-maintainer@netbsd.org" <port-xen-maintainer@NetBSD.org>,
"gnats-admin@netbsd.org" <gnats-admin@NetBSD.org>, "netbsd-bugs@netbsd.org"
<netbsd-bugs@NetBSD.org>, "royger@netbsd.org" <royger@NetBSD.org>
Subject: Re: port-xen/47057: Xen NetBSD DomU file system trash under Linux
Dom0
Date: Sat, 20 Oct 2012 18:02:27 +0200
On 20/10/12 17:57, Manuel Bouyer wrote:
> On Sat, Oct 20, 2012 at 05:48:08PM +0200, Roger Pau Monné wrote:
>>>
>>>> Will compile a new kernel with -g and let's see
>>>> what I can get, but I bet xenbus_thread is blocked at:
>>>>
>>>> 831: printk("XENBUS error %d while reading message\n", err);
>>>>
>>>> In fact I'm going to replace that with a panic.
>>>
>>> Maybe just a printf instead of a printk at first.
>>
>> Tried that in the past (replacing the printk with a printf), and then I
>> just get an infinite printf loop: intf->rsp_cons and intf->rsp_prod are
>> corrupted, check_indexes in xb_read always returns false, and this leads
>> to an infinite loop in xenbus_thread because process_msg always returns an error.
>>
>> I've now compiled a kernel that has the panic and prints the ring
>> indexes. What's the best way to check who modifies intf->rsp_cons and
>> intf->rsp_prod? Will ddb watch work on this kind of memory region?
>
> You can try, but at first glance I'd say it won't work.
>
> Can you determine if it's cons or prod (or both) which is corrupted,
> and in which way ? What are the values when it's corrupted ?
> Are they always the same ?
This is a trim of what I think is relevant, the first lines correspond
to the last known values of prod and cons before the corruption, and the
rest is quite self explanatory:
xenbus_xs (process_msg:763) xb_read hdr 0.
xb_read: cons: 3470 prod: 3473
Finished read of 3 bytes (0 to go)
xenbus_xs (process_msg:776) xb_read body 0.
xenbus_xs (process_msg:811) process_msg: type 7 body OK.
xenbus_xs (read_reply:134) read_reply: type 7 body OK.
xenbus_xs (xs_talkv:224) read done.
[…]
xb_read: cons: 2403996137 prod: 3531897424
xb_read EIO
xenbus_xs (process_msg:763) xb_read hdr 5.
panic: XENBUS error 5 while reading message
cpu0: Begin traceback...
printf_nolog() at netbsd:printf_nolog
xenbus_thread() at netbsd:xenbus_thread+0x140
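For reference, an index sanity check of the kind check_indexes performs comes down to comparing the distance between the two free-running indexes against the ring size. The following is a sketch, assuming 32-bit indexes and a 1024-byte ring (the value of XENSTORE_RING_SIZE in Xen's public xs_wire.h); the function names ring_indexes_ok and ring_offset are hypothetical, and this is not claimed to be the exact code in xenbus_comms.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 1024u	/* assumed; XENSTORE_RING_SIZE in xs_wire.h */

static_assert((RING_SIZE & (RING_SIZE - 1)) == 0, "RING_SIZE must be a power of two");

/* The producer may be at most one full ring ahead of the consumer. */
static bool
ring_indexes_ok(uint32_t cons, uint32_t prod)
{
	return (uint32_t)(prod - cons) <= RING_SIZE;
}

/* Free-running indexes are reduced to a buffer offset only when used. */
static uint32_t
ring_offset(uint32_t idx)
{
	return idx & (RING_SIZE - 1);
}
```

With the values from the log above, cons 3470 and prod 3473 differ by 3 and pass the check, while cons 2403996137 and prod 3531897424 differ by 1127901287 and fail it, which is consistent with xb_read reporting EIO.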
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Roger Pau Monné <roger.pau@citrix.com>
Cc: "gnats-bugs@NetBSD.org" <gnats-bugs@NetBSD.org>,
"port-xen-maintainer@netbsd.org" <port-xen-maintainer@NetBSD.org>,
"gnats-admin@netbsd.org" <gnats-admin@NetBSD.org>,
"netbsd-bugs@netbsd.org" <netbsd-bugs@NetBSD.org>,
"royger@netbsd.org" <royger@NetBSD.org>
Subject: Re: port-xen/47057: Xen NetBSD DomU file system trash under Linux
Dom0
Date: Sat, 20 Oct 2012 18:42:18 +0200
On Sat, Oct 20, 2012 at 06:02:27PM +0200, Roger Pau Monné wrote:
> > Can you determine if it's cons or prod (or both) which is corrupted,
> > and in which way ? What are the values when it's corrupted ?
> > Are they always the same ?
>
> This is a trim of what I think is relevant, the first lines correspond
> to the last known values of prod and cons before the corruption, and the
> rest is quite self explanatory:
>
> xenbus_xs (process_msg:763) xb_read hdr 0.
> xb_read: cons: 3470 prod: 3473
> Finished read of 3 bytes (0 to go)
> xenbus_xs (process_msg:776) xb_read body 0.
> xenbus_xs (process_msg:811) process_msg: type 7 body OK.
> xenbus_xs (read_reply:134) read_reply: type 7 body OK.
> xenbus_xs (xs_talkv:224) read done.
>
> […]
is there anything happening here ?
>
> xb_read: cons: 2403996137 prod: 3531897424
So both cons and prod would be corrupted. As the domU is supposed to update
rsp_cons only, I guess we're looking for something that is writing to
random memory.
Maybe the attached patch will help; anything trying to write to the page
outside of xb_read and xb_write should get a page fault.
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--
Index: xenbus_comms.c
===================================================================
RCS file: /cvsroot/src/sys/arch/xen/xenbus/xenbus_comms.c,v
retrieving revision 1.14
diff -u -p -u -r1.14 xenbus_comms.c
--- xenbus_comms.c 20 Sep 2011 00:12:24 -0000 1.14
+++ xenbus_comms.c 20 Oct 2012 16:40:44 -0000
@@ -37,6 +37,7 @@ __KERNEL_RCSID(0, "$NetBSD: xenbus_comms
#include <sys/param.h>
#include <sys/proc.h>
#include <sys/systm.h>
+#include <uvm/uvm_extern.h>
#include <xen/xen.h> /* for xendomain_is_dom0() */
#include <xen/hypervisor.h>
@@ -142,6 +143,10 @@ xb_write(const void *data, unsigned len)
continue;
if (avail > len)
avail = len;
+ pmap_kenter_ma((vaddr_t)intf,
+ xen_start_info.store_mfn << PAGE_SHIFT,
+ VM_PROT_READ | VM_PROT_WRITE, 0);
+ pmap_update(pmap_kernel());
memcpy(dst, data, avail);
data = (const char *)data + avail;
@@ -151,6 +156,10 @@ xb_write(const void *data, unsigned len)
xen_rmb();
intf->req_prod += avail;
xen_rmb();
+ pmap_protect(pmap_kernel(), (vaddr_t)intf,
+ (vaddr_t)intf + PAGE_SIZE,
+ VM_PROT_READ);
+ pmap_update(pmap_kernel());
hypervisor_notify_via_evtchn(xen_start_info.store_evtchn);
}
@@ -198,9 +207,17 @@ xb_read(void *data, unsigned len)
len -= avail;
/* Other side must not see free space until we've copied out */
+ pmap_kenter_ma((vaddr_t)intf,
+ xen_start_info.store_mfn << PAGE_SHIFT,
+ VM_PROT_READ | VM_PROT_WRITE, 0);
+ pmap_update(pmap_kernel());
xen_rmb();
intf->rsp_cons += avail;
xen_rmb();
+ pmap_protect(pmap_kernel(), (vaddr_t)intf,
+ (vaddr_t)intf + PAGE_SIZE,
+ VM_PROT_READ);
+ pmap_update(pmap_kernel());
XENPRINTF(("Finished read of %i bytes (%i to go)\n",
avail, len));
From: Roger Pau Monné <roger.pau@citrix.com>
To: Manuel Bouyer <bouyer@antioche.eu.org>
Cc: "gnats-bugs@NetBSD.org" <gnats-bugs@NetBSD.org>,
"port-xen-maintainer@netbsd.org" <port-xen-maintainer@NetBSD.org>,
"gnats-admin@netbsd.org" <gnats-admin@NetBSD.org>, "netbsd-bugs@netbsd.org"
<netbsd-bugs@NetBSD.org>, "royger@netbsd.org" <royger@NetBSD.org>
Subject: Re: port-xen/47057: Xen NetBSD DomU file system trash under Linux
Dom0
Date: Sat, 20 Oct 2012 20:34:16 +0200
On 20/10/12 18:42, Manuel Bouyer wrote:
> So both cons and prod would be corrupted. As the domU is supposed to update
> rsp_cons only, I guess we're looking for something that is writing to
> random memory.
>
> Maybe the attached patch will help; anything trying to write to the page
> outside of xb_read and xb_write should get a page fault.
I'm sorry to say that the patch didn't seem to help; here is another
output with your patch applied. It seems like prod and cons get
overwritten with random data.
xenbus_xs (process_msg:763) xb_read hdr 0.
xb_read: cons: 3521 prod: 3523
Finished read of 2 bytes (0 to go)
xenbus_xs (process_msg:776) xb_read body 0.
xenbus_xs (process_msg:811) process_msg: type 6 body 4.
xenbus_xs (read_reply:134) read_reply: type 6 body 4.
xenbus_xs (xs_talkv:224) read done.
xenbus_xs (xs_talkv:202) write msg.
xenbus_xs (xs_talkv:204) write msg err 0.
xenbus_xs (xs_talkv:212) write iovect.
xenbus_xs (xs_talkv:214) write iovect err 0.
xenbus_xs (xs_talkv:222) read.
xb_read: cons: 3523 prod: 3546
Finished read of 16 bytes (0 to go)
xenbus_xs (process_msg:763) xb_read hdr 0.
xb_read: cons: 3539 prod: 3546
Finished read of 7 bytes (0 to go)
xenbus_xs (process_msg:776) xb_read body 0.
xenbus_xs (process_msg:811) process_msg: type 16 body ENOENT.
xenbus_xs (read_reply:134) read_reply: type 16 body ENOENT.
xenbus_xs (xs_talkv:224) read done.
xenbus_xs (xs_talkv:202) write msg.
xenbus_xs (xs_talkv:204) write msg err 0.
xenbus_xs (xs_talkv:212) write iovect.
xenbus_xs (xs_talkv:214) write iovect err 0.
xenbus_xs (xs_talkv:222) read.
xb_read: cons: 3546 prod: 3565
Finished read of 16 bytes (0 to go)
xenbus_xs (process_msg:763) xb_read hdr 0.
xb_read: cons: 3562 prod: 3565
Finished read of 3 bytes (0 to go)
xenbus_xs (process_msg:776) xb_read body 0.
xenbus_xs (process_msg:811) process_msg: type 7 body OK.
xenbus_xs (read_reply:134) read_reply: type 7 body OK.
xenbus_xs (xs_talkv:224) read done.
boot device: xbd0
root on xbd0a dumps on xbd0b
Your machine does not initialize mem_clusters; sparse_dumps disabled
/: replaying log to memory
root file system type: ffs
Sat Oct 20 17:01:27 UTC 2012
Starting root file system check:
/dev/rxbd0a: file system is journaled; not checking
/: replaying log to disk
swapctl: setting dump device to /dev/xbd0b
swapctl: adding /dev/xbd0b as swap device at priority 0
Starting file system checks:
Setting tty flags.
Setting sysctl variables:
ddb.onpanic: 1 -> 1
Starting network.
/etc/rc: WARNING: $hostname not set.
IPv6 mode: host
Configuring network interfaces: xennet0.
Adding interface aliases:.
Building databases: dev, utmp, utmpx.
wsconscfg: Cannot open `/dev/ttyEcfg': Device not configured
wsconscfg: Cannot open `/dev/ttyEcfg': Device not configured
wsconscfg: Cannot open `/dev/ttyEcfg': Device not configured
wsconscfg: Cannot open `/dev/ttyEcfg': Device not configured
Starting syslogd.
Mounting all filesystems...
Clearing temporary files.
Checking quotas: done.
swapctl: setting dump device to /dev/xbd0b
Starting virecover.
Checking for core dump...
savecore: no core dump
Starting local daemons:.
Updating motd.
Starting powerd.
Starting sshd.
postfix: rebuilding /etc/mail/aliases (missing /etc/mail/aliases.db)
newaliases: warning: valid_hostname: empty hostname
newaliases: fatal: unable to use my own hostname
/etc/rc.d/postfix exited with code 1
Oct 20 17:01:46 postfix/sendmail[494]: fatal: unable to use my own hostname
Starting inetd.
Starting cron.
The following components reported failures:
/etc/rc.d/postfix
See /var/run/rc.log for more information.
Sat Oct 20 17:01:46 UTC 2012
Oct 20 17:01:48 getty[589]: /dev/ttyE2: Device not configured
Oct 20 17:01:48 getty[569]: /dev/ttyE3: Device not configured
Oct 20 17:01:48 getty[501]: /dev/ttyE1: Device not configured
NetBSD/amd64 (Amnesiac) (console)
login: root
Password:
Oct 20 17:02:51 login: ROOT LOGIN (root) on tty console
Last login: Sat Oct 20 16:06:11 2012 on console
Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
2006, 2007, 2008, 2009, 2010, 2011, 2012
The NetBSD Foundation, Inc. All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California. All rights reserved.
NetBSD 6.99.14 (XEN3_DOMU) #2: Sat Oct 20 18:50:37 CEST 2012
Welcome to NetBSD!
Terminal type is vt100.
We recommend that you create a non-root account and use su(1) for root
access.
# cd src
# while [ 1 ]; do
> ./build.sh -j3 -m amd64 -O ../obj -T ../tools build >log
> done
xb_read: cons: 706764616 prod: 2607
xb_read EIO
xenbus_xs (process_msg:763) xb_read hdr 5.
panic: XENBUS error 5 while reading message
fatal breakpoint trap in supervisor mode
trap type 1 code 0 rip ffffffff8012e655 cs e030 rflags 246 cr2
7f7ff780c390 ilevel 0 rsp ffffa0002e4c5b30
curlwp 0xffffa00002367980 pid 0 lid 32 lowest kstack 0xffffa0002e4c2000
Stopped in pid 0.32 (system) at netbsd:breakpoint+0x5: leave
breakpoint() at netbsd:breakpoint+0x5
vpanic() at netbsd:vpanic+0x1f2
printf_nolog() at netbsd:printf_nolog
xenbus_thread() at netbsd:xenbus_thread+0x140
ds 7980
es 5b70
fs 100
gs b880
rdi 0
rsi d
rbp ffffa0002e4c5b30
rbx 104
rdx 0
rcx 8
rax 1
r8 ffffffff8063d600 cpu_info_primary
r9 1
r10 0
r11 ffffa0000238c000
r12 ffffffff804adee8 copyright+0x3bae8
r13 ffffa0002e4c5b70
r14 ffffa00002367980
r15 c2c2c2c2c2c2c2c2
rip ffffffff8012e655 breakpoint+0x5
cs e030
rflags 246
rsp ffffa0002e4c5b30
ss e02b
netbsd:breakpoint+0x5: leave
db{0}>
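The "xb_read EIO" in the log above follows from the sanity check on the ring indexes. As a minimal sketch, assuming the check has the same shape as in the Linux xenbus code (the distance between producer and consumer must not exceed the ring size) and that XENSTORE_RING_SIZE is 1024 as in Xen's public xs_wire.h, the arithmetic on the logged values works out like this. bytes_pending is a hypothetical helper added for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 1024u		/* assumed value of XENSTORE_RING_SIZE */

/*
 * The indexes are free-running 32-bit counters, so the unsigned
 * subtraction stays correct across wraparound.
 */
static bool
check_indexes(uint32_t cons, uint32_t prod)
{
	return (uint32_t)(prod - cons) <= RING_SIZE;
}

/* Bytes waiting to be read, or 0 when the indexes are inconsistent. */
static uint32_t
bytes_pending(uint32_t cons, uint32_t prod)
{
	uint32_t n;

	if (!check_indexes(cons, prod))
		return 0;
	n = prod - cons;
	assert(n <= RING_SIZE);
	return n;
}
```

With the healthy values logged earlier (cons 3521, prod 3523) the difference is 2 and the check passes. With cons 706764616 and prod 2607 the unsigned difference wraps to 3588205287, far above the ring size, so the check fails and xb_read returns EIO.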
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Roger Pau =?iso-8859-1?Q?Monn=E9?= <roger.pau@citrix.com>
Cc: "gnats-bugs@NetBSD.org" <gnats-bugs@NetBSD.org>,
"port-xen-maintainer@netbsd.org" <port-xen-maintainer@NetBSD.org>,
"gnats-admin@netbsd.org" <gnats-admin@NetBSD.org>,
"netbsd-bugs@netbsd.org" <netbsd-bugs@NetBSD.org>,
"royger@netbsd.org" <royger@NetBSD.org>
Subject: Re: port-xen/47057: Xen NetBSD DomU file system trash under Linux
Dom0
Date: Sun, 21 Oct 2012 13:29:10 +0200
--ibTvN161/egqYuK8
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
On Sat, Oct 20, 2012 at 08:34:16PM +0200, Roger Pau Monné wrote:
> I'm sorry to say that the patch didn't seem to help; here is another
> output with your patch applied. It seems like prod and cons get
> overwritten with random data.
OK, here's another patch, which also checks that the mapping doesn't
change. But I wonder if the corruption occurs on the NetBSD side.
Could you also add some debugging code on the other side?
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--
--ibTvN161/egqYuK8
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename=diff
Index: xenbus_comms.c
===================================================================
RCS file: /cvsroot/src/sys/arch/xen/xenbus/xenbus_comms.c,v
retrieving revision 1.14
diff -u -p -u -r1.14 xenbus_comms.c
--- xenbus_comms.c 20 Sep 2011 00:12:24 -0000 1.14
+++ xenbus_comms.c 21 Oct 2012 11:26:23 -0000
@@ -37,6 +37,7 @@ __KERNEL_RCSID(0, "$NetBSD: xenbus_comms
#include <sys/param.h>
#include <sys/proc.h>
#include <sys/systm.h>
+#include <uvm/uvm_extern.h>
#include <xen/xen.h> /* for xendomain_is_dom0() */
#include <xen/hypervisor.h>
@@ -121,7 +122,10 @@ xb_write(const void *data, unsigned len)
while (len != 0) {
void *dst;
unsigned int avail;
+ paddr_t pa;
+ KASSERT(pmap_extract_ma(pmap_kernel(), (vaddr_t)intf, &pa));
+ KASSERT(pa == (xen_start_info.store_mfn << PAGE_SHIFT));
while ((intf->req_prod - intf->req_cons) == XENSTORE_RING_SIZE) {
XENPRINTF(("xb_write tsleep\n"));
tsleep(&xenstore_interface, PRIBIO, "wrst", 0);
@@ -142,6 +146,10 @@ xb_write(const void *data, unsigned len)
continue;
if (avail > len)
avail = len;
+ pmap_kenter_ma((vaddr_t)intf,
+ xen_start_info.store_mfn << PAGE_SHIFT,
+ VM_PROT_READ | VM_PROT_WRITE, 0);
+ pmap_update(pmap_kernel());
memcpy(dst, data, avail);
data = (const char *)data + avail;
@@ -151,6 +159,10 @@ xb_write(const void *data, unsigned len)
xen_rmb();
intf->req_prod += avail;
xen_rmb();
+ pmap_protect(pmap_kernel(), (vaddr_t)intf,
+ (vaddr_t)intf + PAGE_SIZE,
+ VM_PROT_READ);
+ pmap_update(pmap_kernel());
hypervisor_notify_via_evtchn(xen_start_info.store_evtchn);
}
@@ -170,6 +182,10 @@ xb_read(void *data, unsigned len)
while (len != 0) {
unsigned int avail;
const char *src;
+ paddr_t pa;
+
+ KASSERT(pmap_extract_ma(pmap_kernel(), (vaddr_t)intf, &pa));
+ KASSERT(pa == (xen_start_info.store_mfn << PAGE_SHIFT));
while (intf->rsp_cons == intf->rsp_prod)
tsleep(&xenstore_interface, PRIBIO, "rdst", 0);
@@ -198,9 +214,17 @@ xb_read(void *data, unsigned len)
len -= avail;
/* Other side must not see free space until we've copied out */
+ pmap_kenter_ma((vaddr_t)intf,
+ xen_start_info.store_mfn << PAGE_SHIFT,
+ VM_PROT_READ | VM_PROT_WRITE, 0);
+ pmap_update(pmap_kernel());
xen_rmb();
intf->rsp_cons += avail;
xen_rmb();
+ pmap_protect(pmap_kernel(), (vaddr_t)intf,
+ (vaddr_t)intf + PAGE_SIZE,
+ VM_PROT_READ);
+ pmap_update(pmap_kernel());
XENPRINTF(("Finished read of %i bytes (%i to go)\n",
avail, len));
--ibTvN161/egqYuK8--
From: =?UTF-8?Q?Roger_Pau_Monn=C3=A9?= <royger@NetBSD.org>
To: Manuel Bouyer <bouyer@antioche.eu.org>
Cc: =?UTF-8?Q?Roger_Pau_Monn=C3=A9?= <roger.pau@citrix.com>,
"gnats-bugs@NetBSD.org" <gnats-bugs@netbsd.org>,
"port-xen-maintainer@netbsd.org" <port-xen-maintainer@netbsd.org>,
"gnats-admin@netbsd.org" <gnats-admin@netbsd.org>, "netbsd-bugs@netbsd.org" <netbsd-bugs@netbsd.org>
Subject: Re: port-xen/47057: Xen NetBSD DomU file system trash under Linux Dom0
Date: Sun, 21 Oct 2012 19:47:38 +0200
--bcaec54ee10a2dcbb804cc9555ff
Content-Type: text/plain; charset=UTF-8
On Sun, Oct 21, 2012 at 1:29 PM, Manuel Bouyer <bouyer@antioche.eu.org> wrote:
> OK, here's another patch, which also checks that the mapping doesn't
> change. But I wonder if the corruption occurs on the NetBSD side.
> Could you also add some debugging code on the other side?
Still no luck with the new patch. I've been looking at the Linux code,
and the attached patch (which takes its idea from Linux) mitigates the
problem, but we still have it. I've also added the trace and verbose
options to xenstored running in the Dom0, and there's no sign that
anyone is writing to xenstore when the crash happens.
Is it possible that someone writes to the machine address
xen_start_info.store_mfn, and is there any way to check that nobody is
mapping this ma to another va?
--bcaec54ee10a2dcbb804cc9555ff
Content-Type: application/octet-stream; name="patch.diff"
Content-Disposition: attachment; filename="patch.diff"
Content-Transfer-Encoding: base64
X-Attachment-Id: f_h8kg24l21
ZGlmZiAtLWdpdCBhL3N5cy9hcmNoL3hlbi94ZW5idXMveGVuYnVzX2NvbW1zLmMgYi9zeXMvYXJj
aC94ZW4veGVuYnVzL3hlbmJ1c19jb21tcy5jCmluZGV4IDA0ZTRmMDIuLjllMjUzMDcgMTAwNjQ0
Ci0tLSBhL3N5cy9hcmNoL3hlbi94ZW5idXMveGVuYnVzX2NvbW1zLmMKKysrIGIvc3lzL2FyY2gv
eGVuL3hlbmJ1cy94ZW5idXNfY29tbXMuYwpAQCAtMTMzLDYgKzEzMyw3IEBAIHhiX3dyaXRlKGNv
bnN0IHZvaWQgKmRhdGEsIHVuc2lnbmVkIGxlbikKIAkJcHJvZCA9IGludGYtPnJlcV9wcm9kOwog
CQl4ZW5fcm1iKCk7CiAJCWlmICghY2hlY2tfaW5kZXhlcyhjb25zLCBwcm9kKSkgeworCQkJaW50
Zi0+cmVxX2NvbnMgPSBpbnRmLT5yZXFfcHJvZCA9IDA7CiAJCQlzcGx4KHMpOwogCQkJcmV0dXJu
IEVJTzsKIAkJfQpAQCAtMTgwLDYgKzE4MSw3IEBAIHhiX3JlYWQodm9pZCAqZGF0YSwgdW5zaWdu
ZWQgbGVuKQogCQl4ZW5fcm1iKCk7CiAJCWlmICghY2hlY2tfaW5kZXhlcyhjb25zLCBwcm9kKSkg
ewogCQkJWEVOUFJJTlRGKCgieGJfcmVhZCBFSU9cbiIpKTsKKwkJCWludGYtPnJzcF9jb25zID0g
aW50Zi0+cnNwX3Byb2QgPSAwOwogCQkJc3BseChzKTsKIAkJCXJldHVybiBFSU87CiAJCX0K
--bcaec54ee10a2dcbb804cc9555ff--
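The base64 attachment above appears to decode to a two-hunk diff that zeroes both indexes of the affected ring when check_indexes() fails, just before returning EIO. The sketch below models that recovery in isolation; struct ring_idx and ring_validate are hypothetical names, and the ring size of 1024 is an assumption. Resetting only lets later requests proceed; whatever overwrote the indexes is still unexplained, which fits the observation that the patch mitigates the problem without removing it.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

#define RING_SIZE 1024u		/* assumed value of XENSTORE_RING_SIZE */

/* Hypothetical view of one direction of the xenstore ring. */
struct ring_idx {
	uint32_t cons;		/* consumer index */
	uint32_t prod;		/* producer index */
};

/*
 * Return 0 when the indexes are consistent. Otherwise report the
 * corruption, reset both indexes to 0 and return EIO.
 */
static int
ring_validate(struct ring_idx *r)
{
	assert(r != NULL);
	if ((uint32_t)(r->prod - r->cons) <= RING_SIZE)
		return 0;
	fprintf(stderr, "ring corrupted (cons %u prod %u), resetting\n",
	    (unsigned)r->cons, (unsigned)r->prod);
	r->cons = r->prod = 0;
	return EIO;
}
```

The caller still sees EIO for the request that hit the corrupted indexes, so it has to decide whether to retry or to give up; only subsequent requests start from a clean, empty ring.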
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Roger Pau =?iso-8859-1?Q?Monn=E9?= <royger@NetBSD.org>
Cc: Roger Pau =?iso-8859-1?Q?Monn=E9?= <roger.pau@citrix.com>,
"gnats-bugs@NetBSD.org" <gnats-bugs@NetBSD.org>,
"port-xen-maintainer@netbsd.org" <port-xen-maintainer@NetBSD.org>,
"gnats-admin@netbsd.org" <gnats-admin@NetBSD.org>,
"netbsd-bugs@netbsd.org" <netbsd-bugs@NetBSD.org>
Subject: Re: port-xen/47057: Xen NetBSD DomU file system trash under Linux
Dom0
Date: Sun, 21 Oct 2012 20:00:21 +0200
On Sun, Oct 21, 2012 at 07:47:38PM +0200, Roger Pau Monné wrote:
> On Sun, Oct 21, 2012 at 1:29 PM, Manuel Bouyer <bouyer@antioche.eu.org> wrote:
> > OK, here's another patch, which also checks that the mapping doesn't
> > change. But I wonder if the corruption occurs on the NetBSD side.
> > Could you also add some debugging code on the other side?
>
> Still no luck with the new patch. I've been looking at the Linux code,
> and the attached patch (which takes its idea from Linux) mitigates the
> problem, but we still have it.
Does Linux do this silently, or does it complain when the ring
corruption occurs?
> I've also added the trace and verbose
> options to xenstored running in the Dom0, and there's no sign that
> anyone is writing to xenstore when the crash happens.
>
> Is it possible that someone writes to the machine address
> xen_start_info.store_mfn, and is there any way to check that nobody is
> mapping this ma to another va?
I've been thinking about checking this, but it's harder to do.
Maybe it's easier to do this check in the hypervisor?
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--
From: =?UTF-8?Q?Roger_Pau_Monn=C3=A9?= <royger@NetBSD.org>
To: Manuel Bouyer <bouyer@antioche.eu.org>
Cc: =?UTF-8?Q?Roger_Pau_Monn=C3=A9?= <roger.pau@citrix.com>,
"gnats-bugs@NetBSD.org" <gnats-bugs@netbsd.org>,
"port-xen-maintainer@netbsd.org" <port-xen-maintainer@netbsd.org>,
"gnats-admin@netbsd.org" <gnats-admin@netbsd.org>, "netbsd-bugs@netbsd.org" <netbsd-bugs@netbsd.org>
Subject: Re: port-xen/47057: Xen NetBSD DomU file system trash under Linux Dom0
Date: Sun, 21 Oct 2012 20:10:36 +0200
On Sun, Oct 21, 2012 at 8:00 PM, Manuel Bouyer <bouyer@antioche.eu.org> wrote:
> Does Linux do this silently, or does it complain when the ring
> corruption occurs?
With the patch attached in the previous post, we will do the same as
Linux (reset indexes and printk). I've never seen that happen in
Linux, so I'm not sure if there's anything else.
>> Is it possible that someone writes to the machine address
>> xen_start_info.store_mfn, and is there any way to check that nobody is
>> mapping this ma to another va?
>
> I've been thinking about checking this, but it's harder to do.
> Maybe it's easier to do this check in the hypervisor?
Will check that; I'm not sure if there's an easy way to do this in the
hypervisor.
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Roger Pau =?iso-8859-1?Q?Monn=E9?= <royger@NetBSD.org>
Cc: Roger Pau =?iso-8859-1?Q?Monn=E9?= <roger.pau@citrix.com>,
"gnats-bugs@NetBSD.org" <gnats-bugs@NetBSD.org>,
"port-xen-maintainer@netbsd.org" <port-xen-maintainer@NetBSD.org>,
"gnats-admin@netbsd.org" <gnats-admin@NetBSD.org>,
"netbsd-bugs@netbsd.org" <netbsd-bugs@NetBSD.org>
Subject: Re: port-xen/47057: Xen NetBSD DomU file system trash under Linux
Dom0
Date: Sun, 21 Oct 2012 20:31:18 +0200
--sdtB3X0nJg68CQEu
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
On Sun, Oct 21, 2012 at 08:10:36PM +0200, Roger Pau Monné wrote:
> On Sun, Oct 21, 2012 at 8:00 PM, Manuel Bouyer <bouyer@antioche.eu.org> wrote:
> > Does Linux do this silently, or does it complain when the ring
> > corruption occurs?
>
> With the patch attached in the previous post, we will do the same as
> Linux (reset indexes and printk). I've never seen that happen in
> Linux, so I'm not sure if there's anything else.
>
> >> Is it possible that someone writes to the machine address
> >> xen_start_info.store_mfn, and is there any way to check that nobody is
> >> mapping this ma to another va?
> >
> > I've been thinking about checking this, but it's harder to do.
> > Maybe it's easier to do this check in the hypervisor?
>
> Will check that; I'm not sure if there's an easy way to do this in the
> hypervisor.
You can also try the attached patch, which should catch a mapping to
the store's ma made via the regular pmap functions. If it's something
more clever, we'll need more clever checks ...
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--
--sdtB3X0nJg68CQEu
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename=diff
Index: x86/x86/pmap.c
===================================================================
RCS file: /cvsroot/src/sys/arch/x86/x86/pmap.c,v
retrieving revision 1.178
diff -u -p -u -r1.178 pmap.c
--- x86/x86/pmap.c 15 Jun 2012 13:53:40 -0000 1.178
+++ x86/x86/pmap.c 21 Oct 2012 18:26:43 -0000
@@ -325,6 +325,8 @@ kmutex_t pmaps_lock;
static vaddr_t pmap_maxkvaddr;
+extern void *xenstore_interface;
+
/*
* XXX kludge: dummy locking to make KASSERTs in uvm_page.c comfortable.
* actual locking is done by pm_lock.
@@ -994,6 +996,9 @@ pmap_kenter_pa(vaddr_t va, paddr_t pa, v
} else
#endif /* DOM0OPS */
npte = pmap_pa2pte(pa);
+
+ if (xenstore_interface != NULL)
+ KASSERT(npte != (xen_start_info.store_mfn << PAGE_SHIFT));
npte |= protection_codes[prot] | PG_k | PG_V | pmap_pg_g;
npte |= pmap_pat_flags(flags);
opte = pmap_pte_testset(pte, npte); /* zap! */
@@ -1026,6 +1031,8 @@ pmap_emap_enter(vaddr_t va, paddr_t pa,
#endif
npte = pmap_pa2pte(pa);
+ if (xenstore_interface != NULL)
+ KASSERT(npte != (xen_start_info.store_mfn << PAGE_SHIFT));
npte = pmap_pa2pte(pa);
npte |= protection_codes[prot] | PG_k | PG_V;
pmap_pte_set(pte, npte);
@@ -3900,6 +3907,8 @@ pmap_enter_ma(struct pmap *pmap, vaddr_t
bool wired = (flags & PMAP_WIRED) != 0;
struct pmap *pmap2;
+ if (xenstore_interface != NULL)
+ KASSERT(ma != (xen_start_info.store_mfn << PAGE_SHIFT));
KASSERT(pmap_initialized);
KASSERT(curlwp->l_md.md_gc_pmap != pmap);
KASSERT(va < VM_MAX_KERNEL_ADDRESS);
Index: xen/x86/xen_pmap.c
===================================================================
RCS file: /cvsroot/src/sys/arch/xen/x86/xen_pmap.c,v
retrieving revision 1.22
diff -u -p -u -r1.22 xen_pmap.c
--- xen/x86/xen_pmap.c 24 Jun 2012 18:31:53 -0000 1.22
+++ xen/x86/xen_pmap.c 21 Oct 2012 18:26:43 -0000
@@ -174,12 +174,16 @@ void
pmap_kenter_ma(vaddr_t va, paddr_t ma, vm_prot_t prot, u_int flags)
{
pt_entry_t *pte, opte, npte;
+ extern void *xenstore_interface;
if (va < VM_MIN_KERNEL_ADDRESS)
pte = vtopte(va);
else
pte = kvtopte(va);
+ if (xenstore_interface != NULL)
+ KASSERT(ma != (xen_start_info.store_mfn << PAGE_SHIFT) ||
+ va == (vaddr_t)xenstore_interface);
npte = ma | ((prot & VM_PROT_WRITE) ? PG_RW : PG_RO) |
PG_V | PG_k;
if (flags & PMAP_NOCACHE)
--sdtB3X0nJg68CQEu--
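The check added to pmap_kenter_ma() boils down to one predicate: the store's machine address may only be entered at the VA of xenstore_interface. A minimal user-space model of that predicate is sketched below, with a hypothetical function name (mapping_allowed) and a PAGE_SHIFT of 12 assumed. The other pmap entry points touched by the patch use the stricter form that rejects the store's machine address outright.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12		/* assumed: 4 KiB pages */
#define PAGE_MASK  ((UINT64_C(1) << PAGE_SHIFT) - 1)

/*
 * A mapping request (va -> ma) is acceptable unless it would create a
 * second virtual alias of the xenstore page.
 */
static bool
mapping_allowed(uint64_t va, uint64_t ma, uint64_t store_va,
    uint64_t store_mfn)
{
	uint64_t store_ma = store_mfn << PAGE_SHIFT;

	/* The model only deals with page-aligned addresses. */
	assert((ma & PAGE_MASK) == 0 && (va & PAGE_MASK) == 0);
	return ma != store_ma || va == store_va;
}
```

With the values from the XENDEBUG_LOW boot log later in this thread (xenstore_interface at va 0xffffffff80b0a000, store frame 0x124afd), mapping ma 0x124afd000 at that VA is allowed, while mapping the same ma at 0xffffffff80b0b000 would trip the assertion in the kernel patch. As noted in the mail, this only catches aliases created through the regular pmap functions.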
From: =?UTF-8?Q?Roger_Pau_Monn=C3=A9?= <royger@NetBSD.org>
To: Manuel Bouyer <bouyer@antioche.eu.org>
Cc: =?UTF-8?Q?Roger_Pau_Monn=C3=A9?= <roger.pau@citrix.com>,
"gnats-bugs@NetBSD.org" <gnats-bugs@netbsd.org>,
"port-xen-maintainer@netbsd.org" <port-xen-maintainer@netbsd.org>,
"gnats-admin@netbsd.org" <gnats-admin@netbsd.org>, "netbsd-bugs@netbsd.org" <netbsd-bugs@netbsd.org>
Subject: Re: port-xen/47057: Xen NetBSD DomU file system trash under Linux Dom0
Date: Mon, 22 Oct 2012 10:03:35 +0200
On Sun, Oct 21, 2012 at 8:31 PM, Manuel Bouyer <bouyer@antioche.eu.org> wrote:
> You can also try the attached patch, which should catch a mapping to
> the store's ma made via the regular pmap functions. If it's something
> more clever, we'll need more clever checks ...
No luck with this either. I've also seen sporadic:
evtchn_do_event: handler 0xffffffff801206a7 didn't lower ipl 8 7
which I would say is not related to the problem at hand (because
sometimes I get the corruption without seeing this message). The
handler in question is xen_timer_handler, which doesn't set any spl
levels directly, although the mutex tmutex sets IPL_CLOCK.
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Roger Pau =?iso-8859-1?Q?Monn=E9?= <royger@NetBSD.org>
Cc: Roger Pau =?iso-8859-1?Q?Monn=E9?= <roger.pau@citrix.com>,
"gnats-bugs@NetBSD.org" <gnats-bugs@NetBSD.org>,
"port-xen-maintainer@netbsd.org" <port-xen-maintainer@NetBSD.org>,
"gnats-admin@netbsd.org" <gnats-admin@NetBSD.org>,
"netbsd-bugs@netbsd.org" <netbsd-bugs@NetBSD.org>
Subject: Re: port-xen/47057: Xen NetBSD DomU file system trash under Linux
Dom0
Date: Mon, 22 Oct 2012 10:08:58 +0200
On Mon, Oct 22, 2012 at 10:03:35AM +0200, Roger Pau Monné wrote:
> On Sun, Oct 21, 2012 at 8:31 PM, Manuel Bouyer <bouyer@antioche.eu.org> wrote:
> > You can also try the attached patch, which should catch a mapping to
> > the store's ma made via the regular pmap functions. If it's something
> > more clever, we'll need more clever checks ...
>
> No luck with this either. I've also seen sporadic:
>
> evtchn_do_event: handler 0xffffffff801206a7 didn't lower ipl 8 7
>
> which I would say is not related to the problem at hand (because
> sometimes I get the corruption without seeing this message). The
> handler in question is xen_timer_handler, which doesn't set any spl
> levels directly, although the mutex tmutex sets IPL_CLOCK.
I'm seeing this too. The problem is probably in something called by the
clock handler, but I failed to find what. It's not a real problem because
the Xen interrupt code will restore the IPL, but this means that
something is not restoring the IPL properly somewhere.
But it should be completely unrelated to the ring corruption issue.
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--
From: =?UTF-8?Q?Roger_Pau_Monn=C3=A9?= <royger@NetBSD.org>
To: Manuel Bouyer <bouyer@antioche.eu.org>
Cc: =?UTF-8?Q?Roger_Pau_Monn=C3=A9?= <roger.pau@citrix.com>,
"gnats-bugs@NetBSD.org" <gnats-bugs@netbsd.org>,
"port-xen-maintainer@netbsd.org" <port-xen-maintainer@netbsd.org>,
"gnats-admin@netbsd.org" <gnats-admin@netbsd.org>, "netbsd-bugs@netbsd.org" <netbsd-bugs@netbsd.org>
Subject: Re: port-xen/47057: Xen NetBSD DomU file system trash under Linux Dom0
Date: Mon, 22 Oct 2012 11:42:40 +0200
On Mon, Oct 22, 2012 at 10:08 AM, Manuel Bouyer <bouyer@antioche.eu.org> wrote:
> I'm seeing this too. The problem is probably in something called by the
> clock handler, but I failed to find what. It's not a real problem because
> the Xen interrupt code will restore the IPL, but this means that
> something is not restoring the IPL properly somewhere.
>
> But it should be completely unrelated to the ring corruption issue.
Another, possibly unrelated, problem: I tried enabling XENDEBUG_LOW
in x86_xpmap to see the ma passed by Xen at start, and got the
following fault:
xen_arch_pmap_bootstrap init_tables=0xffffffff80b0c000
xen_bootstrap_tables(0xffffffff80b0c000, 0xffffffff80b21000, 9, 17)
xen_bootstrap_tables text_end 0xffffffff8063a000 map_end 0xffffffff80b36000
console 0x124afc xenstore 0x124afd
L3 va 0xffffffff80b23000 pa 0xb23000 entry 0x124ae4007 -> L4[0x1ff]
L2 va 0xffffffff80b24000 pa 0xb24000 entry 0x124ae3007 -> L3[0x1fe]
L1 va 0xffffffff80b25000 pa 0xb25000 entry 0x124ae2007 -> L2[0]
L1 va 0xffffffff80b26000 pa 0xb26000 entry 0x124ae1007 -> L2[0x1]
L1 va 0xffffffff80b27000 pa 0xb27000 entry 0x124ae0007 -> L2[0x2]
L1 va 0xffffffff80b28000 pa 0xb28000 entry 0x124adf007 -> L2[0x3]
L1 va 0xffffffff80b29000 pa 0xb29000 entry 0x124ade007 -> L2[0x4]
xenstore_interface va 0xffffffff80b0a000 pte 0x124afd000
xencons_interface va 0xffffffff80b0b000 pte 0x124afc000
va 0xffffffff80b0c000 pa 0xb0c000 entry 0x124afb005 -> L1[0x10c]
va 0xffffffff80b0d000 pa 0xb0d000 entry 0x124afa005 -> L1[0x10d]
va 0xffffffff80b0e000 pa 0xb0e000 entry 0x124af9005 -> L1[0x10e]
va 0xffffffff80b0f000 pa 0xb0f000 entry 0x124af8005 -> L1[0x10f]
va 0xffffffff80b10000 pa 0xb10000 entry 0x124af7005 -> L1[0x110]
va 0xffffffff80b11000 pa 0xb11000 entry 0x124af6005 -> L1[0x111]
va 0xffffffff80b12000 pa 0xb12000 entry 0x124af5005 -> L1[0x112]
va 0xffffffff80b13000 pa 0xb13000 entry 0x124af4005 -> L1[0x113]
va 0xffffffff80b14000 pa 0xb14000 entry 0x124af3005 -> L1[0x114]
va 0xffffffff80b21000 pa 0xb21000 entry 0x124ae6005 -> L1[0x121]
va 0xffffffff80b22000 pa 0xb22000 entry 0x124ae5005 -> L1[0x122]
va 0xffffffff80b23000 pa 0xb23000 entry 0x124ae4005 -> L1[0x123]
va 0xffffffff80b24000 pa 0xb24000 entry 0x124ae3005 -> L1[0x124]
va 0xffffffff80b25000 pa 0xb25000 entry 0x124ae2005 -> L1[0x125]
va 0xffffffff80b26000 pa 0xb26000 entry 0x124ae1005 -> L1[0x126]
va 0xffffffff80b27000 pa 0xb27000 entry 0x124ae0005 -> L1[0x127]
va 0xffffffff80b28000 pa 0xb28000 entry 0x124adf005 -> L1[0x128]
va 0xffffffff80b29000 pa 0xb29000 entry 0x124ade005 -> L1[0x129]
va 0xffffffff80b2a000 pa 0xb2a000 entry 0x124add005 -> L1[0x12a]
va 0xffffffff80b2b000 pa 0xb2b000 entry 0x124adc005 -> L1[0x12b]
va 0xffffffff80b2c000 pa 0xb2c000 entry 0x124adb005 -> L1[0x12c]
va 0xffffffff80b2d000 pa 0xb2d000 entry 0x124ada005 -> L1[0x12d]
va 0xffffffff80b2e000 pa 0xb2e000 entry 0x124ad9005 -> L1[0x12e]
va 0xffffffff80b2f000 pa 0xb2f000 entry 0x124ad8005 -> L1[0x12f]
va 0xffffffff80b30000 pa 0xb30000 entry 0x124ad7005 -> L1[0x130]
va 0xffffffff80b31000 pa 0xb31000 entry 0x124ad6005 -> L1[0x131]
va 0xffffffff80b32000 pa 0xb32000 entry 0x124ad5005 -> L1[0x132]
va 0xffffffff80b33000 pa 0xb33000 entry 0x124ad4005 -> L1[0x133]
va 0xffffffff80b34000 pa 0xb34000 entry 0x124ad3005 -> L1[0x134]
va 0xffffffff80b35000 pa 0xb35000 entry 0x124ad2005 -> L1[0x135]
L1 va 0xffffffff80b2a000 pa 0xb2a000 entry 0x124add007 -> L2[0x5]
L1 va 0xffffffff80b2b000 pa 0xb2b000 entry 0x124adc007 -> L2[0x6]
L1 va 0xffffffff80b2c000 pa 0xb2c000 entry 0x124adb007 -> L2[0x7]
L1 va 0xffffffff80b2d000 pa 0xb2d000 entry 0x124ada007 -> L2[0x8]
L1 va 0xffffffff80b2e000 pa 0xb2e000 entry 0x124ad9007 -> L2[0x9]
L1 va 0xffffffff80b2f000 pa 0xb2f000 entry 0x124ad8007 -> L2[0xa]
L1 va 0xffffffff80b30000 pa 0xb30000 entry 0x124ad7007 -> L2[0xb]
L1 va 0xffffffff80b31000 pa 0xb31000 entry 0x124ad6007 -> L2[0xc]
L1 va 0xffffffff80b32000 pa 0xb32000 entry 0x124ad5007 -> L2[0xd]
L1 va 0xffffffff80b33000 pa 0xb33000 entry 0x124ad4007 -> L2[0xe]
L1 va 0xffffffff80b34000 pa 0xb34000 entry 0x124ad3007 -> L2[0xf]
L1 va 0xffffffff80b35000 pa 0xb35000 entry 0x124ad2007 -> L2[0x10]
bt_pgd[PDIR_SLOT_PTE] va 0xffffffff80b21000 pa 0xb21000 entry 0x124ae5005
pin PGD: b21000
switch to PGD
bt_pgd[PDIR_SLOT_PTE] now entry 0x124ae5005
unpin old PGD
*pde 0x124add027 addr 0xb2a000 pte 0xffffffff80b2a860
xen_bootstrap_tables(0xffffffff80b21000, 0xffffffff80b0c000, 21, 17)
xen_bootstrap_tables text_end 0xffffffff8063a000 map_end 0xffffffff80b28000
console 0x124afc xenstore 0x124afd
L3 va 0xffffffff80b0e000 pa 0xb0e000 entry 0x124af9007 -> L4[0x1ff]
L2 va 0xffffffff80b0f000 pa 0xb0f000 entry 0x124af8007 -> L3[0x1fe]
L1 va 0xffffffff80b10000 pa 0xb10000 entry 0x124af7007 -> L2[0]
L1 va 0xffffffff80b11000 pa 0xb11000 entry 0x124af6007 -> L2[0x1]
L1 va 0xffffffff80b12000 pa 0xb12000 entry 0x124af5007 -> L2[0x2]
L1 va 0xffffffff80b13000 pa 0xb13000 entry 0x124af4007 -> L2[0x3]
L1 va 0xffffffff80b14000 pa 0xb14000 entry 0x124af3007 -> L2[0x4]
xenstore_interface va 0xffffffff80b0a000 pte 0x124afd000
xencons_interface va 0xffffffff80b0b000 pte 0x124afc000
va 0xffffffff80b0c000 pa 0xb0c000 entry 0x124afb005 -> L1[0x10c]
va 0xffffffff80b0d000 pa 0xb0d000 entry 0x124afa005 -> L1[0x10d]
va 0xffffffff80b0e000 pa 0xb0e000 entry 0x124af9005 -> L1[0x10e]
va 0xffffffff80b0f000 pa 0xb0f000 entry 0x124af8005 -> L1[0x10f]
va 0xffffffff80b10000 pa 0xb10000 entry 0x124af7005 -> L1[0x110]
va 0xffffffff80b11000 pa 0xb11000 entry 0x124af6005 -> L1[0x111]
va 0xffffffff80b12000 pa 0xb12000 entry 0x124af5005 -> L1[0x112]
va 0xffffffff80b13000 pa 0xb13000 entry 0x124af4005 -> L1[0x113]
va 0xffffffff80b14000 pa 0xb14000 entry 0x124af3005 -> L1[0x114]
va 0xffffffff80b15000 pa 0xb15000 entry 0x124af2005 -> L1[0x115]
va 0xffffffff80b16000 pa 0xb16000 entry 0x124af1005 -> L1[0x116]
va 0xffffffff80b17000 pa 0xb17000 entry 0x124af0005 -> L1[0x117]
va 0xffffffff80b18000 pa 0xb18000 entry 0x124aef005 -> L1[0x118]
va 0xffffffff80b19000 pa 0xb19000 entry 0x124aee005 -> L1[0x119]
va 0xffffffff80b1a000 pa 0xb1a000 entry 0x124aed005 -> L1[0x11a]
va 0xffffffff80b1b000 pa 0xb1b000 entry 0x124aec005 -> L1[0x11b]
va 0xffffffff80b1c000 pa 0xb1c000 entry 0x124aeb005 -> L1[0x11c]
va 0xffffffff80b1d000 pa 0xb1d000 entry 0x124aea005 -> L1[0x11d]
va 0xffffffff80b1e000 pa 0xb1e000 entry 0x124ae9005 -> L1[0x11e]
va 0xffffffff80b1f000 pa 0xb1f000 entry 0x124ae8005 -> L1[0x11f]
va 0xffffffff80b20000 pa 0xb20000 entry 0x124ae7005 -> L1[0x120]
va 0xffffffff80b21000 pa 0xb21000 entry 0x124ae6005 -> L1[0x121]
va 0xffffffff80b22000 pa 0xb22000 entry 0x124ae5005 -> L1[0x122]
va 0xffffffff80b23000 pa 0xb23000 entry 0x124ae4005 -> L1[0x123]
va 0xffffffff80b24000 pa 0xb24000 entry 0x124ae3005 -> L1[0x124]
va 0xffffffff80b25000 pa 0xb25000 entry 0x124ae2005 -> L1[0x125]
HYPERVISOR_shared_info va 0xffffffff80b26000 pte 0x80f5000
va 0xffffffff80b26000 pa 0xb26000 entry 0x80f5005 -> L1[0x126]
va 0xffffffff80b27000 pa 0xb27000 entry 0x124ae0005 -> L1[0x127]
L1 va 0xffffffff80b15000 pa 0xb15000 entry 0x124af2007 -> L2[0x5]
L1 va 0xffffffff80b16000 pa 0xb16000 entry 0x124af1007 -> L2[0x6]
L1 va 0xffffffff80b17000 pa 0xb17000 entry 0x124af0007 -> L2[0x7]
L1 va 0xffffffff80b18000 pa 0xb18000 entry 0x124aef007 -> L2[0x8]
L1 va 0xffffffff80b19000 pa 0xb19000 entry 0x124aee007 -> L2[0x9]
L1 va 0xffffffff80b1a000 pa 0xb1a000 entry 0x124aed007 -> L2[0xa]
L1 va 0xffffffff80b1b000 pa 0xb1b000 entry 0x124aec007 -> L2[0xb]
L1 va 0xffffffff80b1c000 pa 0xb1c000 entry 0x124aeb007 -> L2[0xc]
L1 va 0xffffffff80b1d000 pa 0xb1d000 entry 0x124aea007 -> L2[0xd]
L1 va 0xffffffff80b1e000 pa 0xb1e000 entry 0x124ae9007 -> L2[0xe]
L1 va 0xffffffff80b1f000 pa 0xb1f000 entry 0x124ae8007 -> L2[0xf]
L1 va 0xffffffff80b20000 pa 0xb20000 entry 0x124ae7007 -> L2[0x10]
bt_pgd[PDIR_SLOT_PTE] va 0xffffffff80b0c000 pa 0xb0c000 entry 0x124afa005
pin PGD: b0c000
switch to PGD
bt_pgd[PDIR_SLOT_PTE] now entry 0x124afa005
unpin old PGD
*pde 0x124af2027 addr 0xb15000 pte 0xffffffff80b15908
(XEN) d17:v0: unhandled page fault (ec=0000)
(XEN) Pagetable walk from 00007fffff803f60:
(XEN) L4[0x0ff] = 0000000124afb025 0000000000000b0c
(XEN) L3[0x1ff] = 0000000124af9027 0000000000000b0e
(XEN) L2[0x1fc] = 0000000000000000 ffffffffffffffff
(XEN) domain_crash_sync called from entry.S
(XEN) Domain 17 (vcpu#0) crashed on cpu#1:
(XEN) ----[ Xen-4.3-unstable x86_64 debug=y Not tainted ]----
(XEN) CPU: 1
(XEN) RIP: e033:[<ffffffff802eb85f>]
(XEN) RFLAGS: 0000000000000206 EM: 1 CONTEXT: pv guest
(XEN) rax: 00007fffff803f60 rbx: 0000000124add007 rcx: 0000000000000000
(XEN) rdx: ffffffff8063dd40 rsi: 0000000124add007 rdi: 000ffffffffff000
(XEN) rbp: ffffffff80b24ed0 rsp: ffffffff80b24e40 r8: ffffffff80b25000
(XEN) r9: 0000000000000010 r10: 00000000deadbeef r11: 0000000000000000
(XEN) r12: 0000000000000000 r13: 0000000000000004 r14: ffffffff007ec976
(XEN) r15: 000ffffffffff000 cr0: 000000008005003b cr4: 00000000000026f0
(XEN) cr3: 0000000124afb000 cr2: 00007fffff803f60
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e02b cs: e033
(XEN) Guest stack trace from rsp=ffffffff80b24e40:
(XEN) 0000000000000000 0000000000000000 0000000000000000 ffffffff802eb85f
(XEN) 000000010000e030 0000000000010006 ffffffff80b24e80 000000000000e02b
(XEN) ffffffff80b24ed0 00007fffffc05938 0000000100000000 ffffffff80460288
(XEN) 0000000000000000 0000000000b25000 ffffffff80b21000 0000000000000000
(XEN) 0000000000000000 0000000000000000 ffffffff80b24f00 ffffffff80283b3a
(XEN) 0000000000000000 0000000000000000 00000000756e6547 0000000000000000
(XEN) 0000000000000000 ffffffff8010009b 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
This only happens with XENDEBUG_LOW set.
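For reference, the table indices in Xen's pagetable walk follow directly from the faulting address in cr2. The sketch below assumes standard x86-64 4-level paging with 4 KiB pages (12 offset bits, then 9 index bits per level). The helper name pt_index is made up for illustration and is not part of Xen or NetBSD.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: index into the page table at the given level
 * (4 = L4, 3 = L3, 2 = L2, 1 = L1) for virtual address va.
 */
unsigned
pt_index(uint64_t va, int level)
{
	assert(level >= 1 && level <= 4);
	/* Skip the 12 offset bits, then 9 bits for each lower level. */
	return (unsigned)((va >> (12 + 9 * (level - 1))) & 0x1ff);
}
```

For cr2 = 0x00007fffff803f60 this gives L4 index 0x0ff, L3 index 0x1ff and L2 index 0x1fc, matching the walk above, which stops at L2 because that entry is zero.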
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Roger Pau Monné <royger@NetBSD.org>
Cc: Roger Pau Monné <roger.pau@citrix.com>,
"gnats-bugs@NetBSD.org" <gnats-bugs@NetBSD.org>,
"port-xen-maintainer@netbsd.org" <port-xen-maintainer@NetBSD.org>,
"gnats-admin@netbsd.org" <gnats-admin@NetBSD.org>,
"netbsd-bugs@netbsd.org" <netbsd-bugs@NetBSD.org>
Subject: Re: port-xen/47057: Xen NetBSD DomU file system trash under Linux
Dom0
Date: Mon, 22 Oct 2012 12:01:48 +0200
On Mon, Oct 22, 2012 at 11:42:40AM +0200, Roger Pau Monné wrote:
> Another possibly unrelated problem, I've tried enabling XENDEBUG_LOW
> in x86_xpmap, to see the ma passed by Xen at start, and I've got the
> following fault:
This code has not been used for a long time; it's possible that it
has not kept up with other changes ...
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 years of experience will always make the difference
--
From: Roger Pau Monné <roger.pau@citrix.com>
To: Manuel Bouyer <bouyer@antioche.eu.org>
Cc: Roger Pau Monné <royger@NetBSD.org>,
"gnats-bugs@NetBSD.org" <gnats-bugs@NetBSD.org>,
"port-xen-maintainer@netbsd.org" <port-xen-maintainer@NetBSD.org>,
"gnats-admin@netbsd.org" <gnats-admin@NetBSD.org>, "netbsd-bugs@netbsd.org"
<netbsd-bugs@NetBSD.org>
Subject: Re: port-xen/47057: Xen NetBSD DomU file system trash under Linux
Dom0
Date: Tue, 23 Oct 2012 09:45:48 +0200
I think I've found the problem. It seems to be related to
xengnt_more_entries, but I still haven't been able to pinpoint exactly
when the overwrite of xenstore_interface happens. The ring gets
corrupted just after the call to xengnt_more_entries, but it's not the
call itself that corrupts the ring.
xengnt_more_entries start: prod: 3787 cons: 3787
xengnt_more_entries: map 0x1610ff -> 0xffffa0002da45000
xengnt_more_entries end: prod: 3787 cons: 3787
xb_read: xenstore_interface: 0xffffffff80b0a000
xb_read: cons: 673215352 prod: 1651402104
xb_read EIO
xenbus_xs (process_msg:763) xb_read hdr 5.
panic: XENBUS error 5 while reading message
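The EIO is consistent with an index sanity check failing. With free-running producer and consumer counters, the amount of unread data is prod - cons, and it can never exceed the ring size. A minimal sketch, assuming a 1024-byte ring (XENSTORE_RING_SIZE in Xen's public xs_wire.h) and 32-bit indices; the function names are illustrative and not necessarily what xb_read uses:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 1024u	/* assumed: XENSTORE_RING_SIZE */

/* Masking an index into a buffer offset needs a power-of-two size. */
static_assert((RING_SIZE & (RING_SIZE - 1)) == 0, "ring size must be 2^n");

/* Unsigned subtraction keeps the check valid across counter wraparound. */
bool
ring_indexes_ok(uint32_t cons, uint32_t prod)
{
	return (uint32_t)(prod - cons) <= RING_SIZE;
}

/* Byte offset inside the ring buffer for a free-running index. */
uint32_t
ring_offset(uint32_t idx)
{
	return idx & (RING_SIZE - 1);
}
```

With the values logged above, ring_indexes_ok(3787, 3787) holds, while ring_indexes_ok(673215352, 1651402104) does not, since the difference is 978186752. That points at the indices themselves having been overwritten rather than at a ring that is merely full.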
From: Roger Pau Monné <roger.pau@citrix.com>
To: Manuel Bouyer <bouyer@antioche.eu.org>
Cc: Roger Pau Monné <royger@NetBSD.org>,
"gnats-bugs@NetBSD.org" <gnats-bugs@NetBSD.org>,
"port-xen-maintainer@netbsd.org" <port-xen-maintainer@NetBSD.org>,
"gnats-admin@netbsd.org" <gnats-admin@NetBSD.org>, "netbsd-bugs@netbsd.org"
<netbsd-bugs@NetBSD.org>
Subject: Re: port-xen/47057: Xen NetBSD DomU file system trash under Linux
Dom0
Date: Tue, 23 Oct 2012 16:01:06 +0200
Found the problem: grants 0 to 8 (inclusive) shouldn't be used, because
they are reserved for the tools. I guess that's xenstore, xenconsole and
friends, so that's where the corruption came from. It also explains why
the problem seemed to be related to xengnt_more_entries, because it gets
called when those low grants are used. The attached patch solves the
problem for me.
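A minimal sketch of the idea, not the actual xengnt.c code: when a grant table frame is added, only references above the reserved range are pushed onto the free stack, so the allocator can never hand out 0 to 8. It assumes 512 entries per frame (4096-byte pages and 8-byte grant entries) and a fixed two-frame table. All names here are illustrative.

```c
#include <assert.h>

#define ENTRIES_PER_FRAME 512	/* assumed: PAGE_SIZE / sizeof(grant_entry_t) */
#define FIRST_USABLE_REF  9	/* references 0..8 are left to the tools */
#define MAX_FRAMES        2

static int free_refs[MAX_FRAMES * ENTRIES_PER_FRAME];	/* stack of free refs */
static int nfree;					/* current stack depth */
static int nframes;					/* frames added so far */

/* Add one frame and push its usable references. Returns 0, or -1 if full. */
int
gnt_add_frame(void)
{
	int first = nframes * ENTRIES_PER_FRAME;
	int end = first + ENTRIES_PER_FRAME;

	if (nframes == MAX_FRAMES)
		return -1;
	if (first < FIRST_USABLE_REF)
		first = FIRST_USABLE_REF;	/* skip the reserved range */
	assert(nfree + (end - first) <= MAX_FRAMES * ENTRIES_PER_FRAME);
	for (int i = first; i < end; i++)
		free_refs[nfree++] = i;
	nframes++;
	return 0;
}

/* Pop a free reference, or return -1 when none are left. */
int
gnt_get_ref(void)
{
	if (nfree == 0)
		return -1;
	return free_refs[--nfree];
}
```

Under these assumptions the first frame contributes 503 usable references (9 to 511) and every later frame contributes all 512 of its own.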
From b80f10a3c3d0b95d3cd2a60a4669a2118fdbb9ef Mon Sep 17 00:00:00 2001
From: Roger Pau Monne <roger.pau@citrix.com>
Date: Tue, 23 Oct 2012 15:21:18 +0200
Subject: [PATCH] xen: don't use grants 0-9
Not all grants from the first frame can be used, grants from 0 to 8
(both included) are reserved for external tools. Using these grants
caused system crashes and fs corruption.
---
sys/arch/xen/xen/xengnt.c | 15 +++++++++++----
1 files changed, 11 insertions(+), 4 deletions(-)
diff --git a/sys/arch/xen/xen/xengnt.c b/sys/arch/xen/xen/xengnt.c
index 621d2dc..2de4fd3 100644
--- a/sys/arch/xen/xen/xengnt.c
+++ b/sys/arch/xen/xen/xengnt.c
@@ -51,6 +51,9 @@ __KERNEL_RCSID(0, "$NetBSD: xengnt.c,v 1.24 2012/06/30 23:36:20 jym Exp $");
#define NR_GRANT_ENTRIES_PER_PAGE (PAGE_SIZE / sizeof(grant_entry_t))
+/* External tools reserve first few grant table entries. */
+#define NR_RESERVED_ENTRIES 8
+
/* Current number of frames making up the grant table */
int gnt_nr_grant_frames;
/* Maximum number of frames that can make up the grant table */
@@ -161,7 +164,7 @@ xengnt_more_entries(void)
gnttab_setup_table_t setup;
u_long *pages;
int nframes_new = gnt_nr_grant_frames + 1;
- int i;
+ int i, start_gnt;
KASSERT(mutex_owned(&grant_lock));
if (gnt_nr_grant_frames == gnt_max_grant_frames)
@@ -204,9 +207,13 @@ xengnt_more_entries(void)
/*
* add the grant entries associated to the last grant table frame
- * and mark them as free
+ * and mark them as free. Prevent using the first grants (from 0 to 8)
+ * since they are used by the tools.
*/
- for (i = gnt_nr_grant_frames * NR_GRANT_ENTRIES_PER_PAGE;
+ start_gnt = (gnt_nr_grant_frames * NR_GRANT_ENTRIES_PER_PAGE) <
+ NR_RESERVED_ENTRIES + 1 ? NR_RESERVED_ENTRIES + 1 :
+ (gnt_nr_grant_frames * NR_GRANT_ENTRIES_PER_PAGE);
+ for (i = start_gnt;
i < nframes_new * NR_GRANT_ENTRIES_PER_PAGE;
i++) {
KASSERT(gnt_entries[last_gnt_entry] == XENGNT_NO_ENTRY);
@@ -240,7 +247,7 @@ xengnt_get_entry(void)
last_gnt_entry--;
entry = gnt_entries[last_gnt_entry];
gnt_entries[last_gnt_entry] = XENGNT_NO_ENTRY;
- KASSERT(entry != XENGNT_NO_ENTRY);
+ KASSERT(entry != XENGNT_NO_ENTRY && entry > NR_RESERVED_ENTRIES);
KASSERT(last_gnt_entry >= 0);
KASSERT(last_gnt_entry <= gnt_max_grant_frames * NR_GRANT_ENTRIES_PER_PAGE);
return entry;
--
1.7.7.5 (Apple Git-26)
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Roger Pau Monné <roger.pau@citrix.com>
Cc: Roger Pau Monné <royger@NetBSD.org>,
"gnats-bugs@NetBSD.org" <gnats-bugs@NetBSD.org>,
"port-xen-maintainer@netbsd.org" <port-xen-maintainer@NetBSD.org>,
"gnats-admin@netbsd.org" <gnats-admin@NetBSD.org>,
"netbsd-bugs@netbsd.org" <netbsd-bugs@NetBSD.org>
Subject: Re: port-xen/47057: Xen NetBSD DomU file system trash under Linux
Dom0
Date: Tue, 23 Oct 2012 21:57:48 +0200
On Tue, Oct 23, 2012 at 04:01:06PM +0200, Roger Pau Monné wrote:
> Found the problem: grants 0 to 8 (inclusive) shouldn't be used, because
> they are reserved for the tools. I guess that's xenstore, xenconsole and
> friends, so that's where the corruption came from. It also explains why
> the problem seemed to be related to xengnt_more_entries, because it gets
> called when those low grants are used. The attached patch solves the
> problem for me.
I guess it's new behavior of the tools? Otherwise we should have hit
this sooner, since I regularly see messages saying the kernel grows the
grant entries pool.
Anyway, good catch. One comment about the patch below.
> From b80f10a3c3d0b95d3cd2a60a4669a2118fdbb9ef Mon Sep 17 00:00:00 2001
> From: Roger Pau Monne <roger.pau@citrix.com>
> Date: Tue, 23 Oct 2012 15:21:18 +0200
> Subject: [PATCH] xen: don't use grants 0-9
>
> Not all grants from the first frame can be used, grants from 0 to 8
> (both included) are reserved for external tools. Using these grants
> caused system crashes and fs corruption.
> ---
> sys/arch/xen/xen/xengnt.c | 15 +++++++++++----
> 1 files changed, 11 insertions(+), 4 deletions(-)
>
> diff --git a/sys/arch/xen/xen/xengnt.c b/sys/arch/xen/xen/xengnt.c
> index 621d2dc..2de4fd3 100644
> --- a/sys/arch/xen/xen/xengnt.c
> +++ b/sys/arch/xen/xen/xengnt.c
> @@ -51,6 +51,9 @@ __KERNEL_RCSID(0, "$NetBSD: xengnt.c,v 1.24 2012/06/30 23:36:20 jym Exp $");
>
> #define NR_GRANT_ENTRIES_PER_PAGE (PAGE_SIZE / sizeof(grant_entry_t))
>
> +/* External tools reserve first few grant table entries. */
> +#define NR_RESERVED_ENTRIES 8
> +
> /* Current number of frames making up the grant table */
> int gnt_nr_grant_frames;
> /* Maximum number of frames that can make up the grant table */
> @@ -161,7 +164,7 @@ xengnt_more_entries(void)
> gnttab_setup_table_t setup;
> u_long *pages;
> int nframes_new = gnt_nr_grant_frames + 1;
> - int i;
> + int i, start_gnt;
> KASSERT(mutex_owned(&grant_lock));
>
> if (gnt_nr_grant_frames == gnt_max_grant_frames)
> @@ -204,9 +207,13 @@ xengnt_more_entries(void)
>
> /*
> * add the grant entries associated to the last grant table frame
> - * and mark them as free
> + * and mark them as free. Prevent using the first grants (from 0 to 8)
> + * since they are used by the tools.
> */
> - for (i = gnt_nr_grant_frames * NR_GRANT_ENTRIES_PER_PAGE;
> + start_gnt = (gnt_nr_grant_frames * NR_GRANT_ENTRIES_PER_PAGE) <
> + NR_RESERVED_ENTRIES + 1 ? NR_RESERVED_ENTRIES + 1 :
> + (gnt_nr_grant_frames * NR_GRANT_ENTRIES_PER_PAGE);
Please rewrite with parentheses:
+ start_gnt = (gnt_nr_grant_frames * NR_GRANT_ENTRIES_PER_PAGE) <
+ (NR_RESERVED_ENTRIES + 1) ?
+ (NR_RESERVED_ENTRIES + 1) :
+ (gnt_nr_grant_frames * NR_GRANT_ENTRIES_PER_PAGE);
Then please commit and request pullups for netbsd-5 and netbsd-6.
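Both spellings compute the same value: in C, + binds tighter than <, which in turn binds tighter than ?:, so the parentheses only make the grouping explicit. A sketch that states the intent as a clamp, using an illustrative helper name and a plain int argument rather than the kernel's globals:

```c
#include <assert.h>

#define NR_RESERVED_ENTRIES 8

/*
 * First grant reference to mark free when adding the frame whose
 * entries start at index frame_start (hypothetical helper).
 */
int
first_free_gnt(int frame_start)
{
	int first_usable = NR_RESERVED_ENTRIES + 1;

	assert(frame_start >= 0);
	return (frame_start < first_usable) ? first_usable : frame_start;
}
```

first_free_gnt(0) yields 9 for the first frame. For a later frame, for example one starting at 512, the frame's own first index is returned unchanged.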
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 years of experience will always make the difference
--
From: "Roger Pau Monne" <royger@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/47057 CVS commit: src/sys/arch/xen/xen
Date: Wed, 24 Oct 2012 13:07:47 +0000
Module Name: src
Committed By: royger
Date: Wed Oct 24 13:07:46 UTC 2012
Modified Files:
src/sys/arch/xen/xen: xengnt.c
Log Message:
xen: don't use grants 0-8
Not all grants from the first frame can be used, grants from 0 to 8
(both included) are reserved for external tools. Using these grants
caused system crashes and fs corruption.
Closes PR port-xen/47057 and port-xen/47056
Reviewed by bouyer@
To generate a diff of this commit:
cvs rdiff -u -r1.24 -r1.25 src/sys/arch/xen/xen/xengnt.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Stephen Borrill" <sborrill@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/47057 CVS commit: [netbsd-5] src/sys/arch/xen/xen
Date: Fri, 26 Oct 2012 11:31:50 +0000
Module Name: src
Committed By: sborrill
Date: Fri Oct 26 11:31:50 UTC 2012
Modified Files:
src/sys/arch/xen/xen [netbsd-5]: xengnt.c
Log Message:
Pull up the following revision(s) (requested by royger in ticket #1805):
sys/arch/xen/xen/xengnt.c: revision 1.25 via patch
Prevents a memory corruption issue that freezes a Xen DomU and can also
cause fs corruption. Addresses PR port-xen/47057 and port-xen/47056
To generate a diff of this commit:
cvs rdiff -u -r1.10.4.1 -r1.10.4.2 src/sys/arch/xen/xen/xengnt.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Jeff Rizzo" <riz@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/47057 CVS commit: [netbsd-6] src/sys/arch/xen/xen
Date: Wed, 31 Oct 2012 16:15:09 +0000
Module Name: src
Committed By: riz
Date: Wed Oct 31 16:15:09 UTC 2012
Modified Files:
src/sys/arch/xen/xen [netbsd-6]: xengnt.c
Log Message:
Pull up following revision(s) (requested by royger in ticket #640):
sys/arch/xen/xen/xengnt.c: revision 1.25
xen: don't use grants 0-8
Not all grants from the first frame can be used, grants from 0 to 8
(both included) are reserved for external tools. Using these grants
caused system crashes and fs corruption.
Closes PR port-xen/47057 and port-xen/47056
Reviewed by bouyer@
To generate a diff of this commit:
cvs rdiff -u -r1.22.2.1 -r1.22.2.2 src/sys/arch/xen/xen/xengnt.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Jeff Rizzo" <riz@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/47057 CVS commit: [netbsd-6-0] src/sys/arch/xen/xen
Date: Wed, 31 Oct 2012 16:15:28 +0000
Module Name: src
Committed By: riz
Date: Wed Oct 31 16:15:28 UTC 2012
Modified Files:
src/sys/arch/xen/xen [netbsd-6-0]: xengnt.c
Log Message:
Pull up following revision(s) (requested by royger in ticket #640):
sys/arch/xen/xen/xengnt.c: revision 1.25
xen: don't use grants 0-8
Not all grants from the first frame can be used, grants from 0 to 8
(both included) are reserved for external tools. Using these grants
caused system crashes and fs corruption.
Closes PR port-xen/47057 and port-xen/47056
Reviewed by bouyer@
To generate a diff of this commit:
cvs rdiff -u -r1.22.2.1 -r1.22.2.1.4.1 src/sys/arch/xen/xen/xengnt.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: open->closed
State-Changed-By: royger@NetBSD.org
State-Changed-When: Fri, 30 Nov 2012 09:30:57 +0000
State-Changed-Why:
Fixed in src/sys/arch/xen/xen/xengnt.c version 1.25
>Unformatted: