NetBSD Problem Report #32035

From root@ns-3.iastate.edu  Thu Nov 10 04:28:53 2005
Return-Path: <root@ns-3.iastate.edu>
Received: from mailhub-3.iastate.edu (mailhub-3.iastate.edu [129.186.140.13])
	by narn.netbsd.org (Postfix) with ESMTP id 7523E63BA17
	for <gnats-bugs@gnats.NetBSD.org>; Thu, 10 Nov 2005 04:28:53 +0000 (UTC)
Message-Id: <200511100426.jAA4QBTa000117@ns-3.iastate.edu>
Date: Wed, 9 Nov 2005 22:26:11 -0600 (CST)
From: nb-pr@gendalia.org
Reply-To: nb-pr@gendalia.org
To: gnats-bugs@netbsd.org
Subject: 3.0 MP machines can't keep time on busy nameservers
X-Send-Pr-Version: 3.95

>Number:         32035
>Category:       kern
>Synopsis:       3.0 MP machines can't keep time on busy nameservers
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Nov 10 04:29:00 +0000 2005
>Closed-Date:    Thu Jan 07 14:52:52 +0000 2021
>Last-Modified:  Thu Jan 07 14:52:52 +0000 2021
>Originator:     Tracy Di Marco White
>Release:        NetBSD 3.0_BETA
>Organization:
Iowa State University
>Environment:
System: NetBSD ns-3.iastate.edu 3.0_BETA NetBSD 3.0_BETA (GENERIC.MP) #6: Sun Oct 2 17:56:09 CDT 2005 gendalia@satai.home.:/usr/obj/i386/GENERIC.MP i386
Architecture: i386
Machine: i386
>Description:
Running NetBSD 2.0 (UP) on Dell MP hardware, Poweredge 1850, the machine
can keep time just fine.  Running NetBSD 3.0 (MP) on same hardware, with
pthreaded named 9.3.1 from pkgsrc, ntpd cannot sync the clock, and the
hardware loses just under 1 second every minute.  Running without a
threaded named, still 9.3.1, ntpd can sync the clock for a while, but
eventually it too loses, as the hardware loses only about 1/3rd second
every minute.  I am unable to reproduce this problem without putting
the nameserver into production as one of ISU's nameservers; in tests
with thousands more identical queries a second it does not lose time.
The machine can sync time just fine in production as long as the
name server is not running; starting named causes an immediate loss of
time.  The loss of time is the same without HT (MP) or with HT (MPACPI).
Will be attempting 3.0 UP kernel tomorrow, and possibly a current kernel.
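
For scale, the loss rates above can be converted to parts per million
and compared with the 500 ppm frequency correction limit of ntpd's
clock discipline.  The following is a minimal sketch and not part of
the original measurements; the function name drift_ppm is made up for
illustration.

```python
# Convert an observed clock loss into parts per million (ppm) and compare it
# with the 500 ppm frequency correction limit of ntpd's clock discipline.

NTPD_MAX_FREQ_PPM = 500.0  # largest frequency offset ntpd will correct


def drift_ppm(seconds_lost: float, interval_seconds: float) -> float:
    """Return the clock drift as parts per million of elapsed time."""
    return seconds_lost / interval_seconds * 1e6


# Figures from the description: "just under 1 second every minute" with the
# threaded named, "about 1/3rd second every minute" without it.
threaded = drift_ppm(1.0, 60.0)
unthreaded = drift_ppm(1.0 / 3.0, 60.0)

for label, ppm in (("threaded named", threaded),
                   ("unthreaded named", unthreaded)):
    ratio = ppm / NTPD_MAX_FREQ_PPM
    print(f"{label}: ~{ppm:.0f} ppm, about {ratio:.0f}x the ntpd limit")
```

Both rates are more than an order of magnitude beyond what ntpd can
correct by adjusting the clock frequency alone, which is consistent
with ntpd being unable to hold sync.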
>How-To-Repeat:
Install NetBSD 3.0_BETA on a busy, production name server.  Watch
time pass slower and slower.
>Fix:
Install 2.0 branch NetBSD. :(

>Release-Note:

>Audit-Trail:
From: Tracy Di Marco White <gendalia@gendalia.org>
To: gnats-bugs@netbsd.org
Cc: nb-pr@gendalia.org
Subject: Re: kern/32035: 3.0 MP machines can't keep time on busy nameservers 
Date: Tue, 15 Nov 2005 22:50:15 -0600

 Tests with current kernels:

 GENERIC current:
 # uptime
  1:10AM  up 44 mins, 1 user, load averages: 1.34, 1.28, 1.19
 # ntpq -c peers
      remote           refid      st t when poll reach   delay   offset  jitter
 ==============================================================================
 +ns-1.iastate.ed 209.81.9.7       2 u   66  128  377    0.001  -165.85  38.963
 *ns-2.iastate.ed 139.78.100.163   2 u  126  128  376    0.001  -180.71  29.847
 +ns-0.iastate.ed 129.186.140.200  3 u   60  128  376    0.001  -172.12  35.968
 +vs-1.iastate.ed 129.186.140.200  3 u  107  128  376    0.001  -174.66  34.986
 +vs-2.iastate.ed 129.186.140.200  3 u  110  128  376    0.001  -175.68  34.243
 +vs-3.iastate.ed 129.186.140.200  3 u  105  128  376    0.001  -175.04  34.816
 -avi-lis.gw.ligh .CDMA.           1 u   58  128  377   57.016  -216.19  27.677
 -timekeeper.isi. .GPS.            1 u  112  128  377   58.122  -233.07  43.461
 -caesar.cs.wisc. 128.105.201.11   2 u  108  128  377   24.143  -203.54  22.302

 GENERIC.MP current:
 after 11 minutes (how long it took to sync):
 ns-3# uptime
  1:26AM  up 11 mins, 1 user, load averages: 1.39, 1.15, 0.69
 ns-3# ntpq -c peers
      remote           refid      st t when poll reach   delay   offset  jitter
 ==============================================================================
 +ns-1.iastate.ed 209.81.9.7       2 u   34   64  376    0.001  1992.36 832.536
 +ns-2.iastate.ed 139.78.100.163   2 u   30   64  156    0.001  1991.09 527.578
 +ns-0.iastate.ed 129.186.140.200  3 u   31   64  276    0.001  1992.40 796.380
 +vs-1.iastate.ed 129.186.140.200  3 u   32   64  375    0.001  2101.87 777.235
 +vs-2.iastate.ed 129.186.140.200  3 u   34   64  376    0.001  1989.10 826.415
 +vs-3.iastate.ed 129.186.140.200  3 u   31   64  376    0.001  1997.57 684.249
 +avi-lis.gw.ligh .CDMA.           1 u   25   64  377   56.272  1888.00 636.618
 *timekeeper.isi. .GPS.            1 u   27   64  177   58.033  1876.92 710.250
 xcaesar.cs.wisc. 128.105.201.11   2 u   24   64  377   23.812  799.411 892.636

 It's gradually losing time.

 # uptime
  1:49AM  up 33 mins, 1 user, load averages: 0.75, 1.15, 1.05
 # ntpq -c peers
      remote           refid      st t when poll reach   delay   offset  jitter
 ==============================================================================
  ns-1.iastate.ed 128.252.19.1     2 u   85  256  376    0.001  4465.33 1084.36
  ns-2.iastate.ed 139.78.100.163   2 u   81  256  376    0.001  4478.07 1025.98
  ns-0.iastate.ed 129.186.140.200  3 u   79  256  376    0.001  4479.44 1028.17
  vs-1.iastate.ed 129.186.140.200  3 u   88  256  376    0.001  4469.37 1034.34
  vs-2.iastate.ed 129.186.140.200  3 u   93  256  376    0.001  4456.15 1088.17
  vs-3.iastate.ed 129.186.140.200  3 u  144  256  376    0.001  4366.74 1114.75
 xavi-lis.gw.ligh .CDMA.           1 u  209  256  377   56.361  4044.08 804.286
  timekeeper.isi. .GPS.            1 u  144  256  277   64.982  4692.03 1389.01
  caesar.cs.wisc. 128.105.201.11   2 u  137  256  377   23.910  2790.58 1067.63

 ntpq> lpa
 ind assID status  conf reach auth condition  last_event cnt
 ===========================================================
   1 15412  b024   yes   yes  none    reject   reachable  2
   2 15413  b044   yes   yes  none    reject   reachable  4
   3 15414  b014   yes   yes  none    reject   reachable  1
   4 15415  b034   yes   yes  none    reject   reachable  3
   5 15416  b024   yes   yes  none    reject   reachable  2
   6 15417  b034   yes   yes  none    reject   reachable  3
   7 15418  b114   yes   yes  none falsetick   reachable  1
   8 15419  b014   yes   yes  none    reject   reachable  1
   9 15420  b014   yes   yes  none    reject   reachable  1

 GENERIC.MPACPI current
 ns-3# uptime
  2:01AM  up 8 mins, 1 user, load averages: 1.20, 0.99, 0.55
 ns-3# ntpq -c peers
      remote           refid      st t when poll reach   delay   offset  jitter
 ==============================================================================
  ns-1.iastate.ed 209.81.9.7       2 u   13   64   65    0.004  1937.53 717.329
  ns-2.iastate.ed 139.78.100.163   2 u    9   64   76    0.004  1765.27 543.583
  ns-0.iastate.ed 129.186.140.200  3 u    8   64  377    0.004  1940.07 1132.19
  vs-1.iastate.ed 129.186.140.200  3 u   11   64   57    0.004  1944.99 690.699
  vs-2.iastate.ed 129.186.140.200  3 u   14   64   76    0.004  1754.90 541.711
 +vs-3.iastate.ed 129.186.140.200  3 u   12   64  176    0.004  1761.58 712.149
 *avi-lis.gw.ligh .CDMA.           1 u    8   64  377   56.436  1317.31 664.365
  timekeeper.isi. .GPS.            1 u    7   64  277   61.388  1937.87 1015.03
 +caesar.cs.wisc. 128.105.201.11   2 u    3   64  377   24.059  850.775 659.560

 # uptime 
  2:31AM  up 37 mins, 1 user, load averages: 1.02, 1.05, 0.99
 # ntpq
 ntpq> peers
      remote           refid      st t when poll reach   delay   offset  jitter
 ==============================================================================
  ns-1.iastate.ed 128.252.19.1     2 u   52  128  376    0.004  5155.84 1183.69
  ns-2.iastate.ed 139.78.100.163   2 u   50  128  376    0.004  5158.36 1180.61
  ns-0.iastate.ed 129.186.140.200  3 u   99  128  376    0.004  5075.67 1148.29
  vs-1.iastate.ed 129.186.140.200  3 u  107  128  376    0.004  5074.47 1154.17
  vs-2.iastate.ed 129.186.140.200  3 u   42  128  376    0.004  5175.62 1183.79
  vs-3.iastate.ed 129.186.140.200  3 u  105  128  376    0.004  5074.49 1152.87
  avi-lis.gw.ligh .CDMA.           1 u  101  128  377   56.122  5042.92 1093.60
 xtimekeeper.isi. .GPS.            1 u   97  128  377   59.179  3670.09 851.004
 xcaesar.cs.wisc. 128.105.201.11   2 u   96  128  377   24.147  4499.65 688.713
 ntpq> lpa
 ind assID status  conf reach auth condition  last_event cnt
 ===========================================================
   1  3580  b034   yes   yes  none    reject   reachable  3
   2  3581  b034   yes   yes  none    reject   reachable  3
   3  3582  b014   yes   yes  none    reject   reachable  1
   4  3583  b034   yes   yes  none    reject   reachable  3
   5  3584  b034   yes   yes  none    reject   reachable  3
   6  3585  b024   yes   yes  none    reject   reachable  2
   7  3586  b014   yes   yes  none    reject   reachable  1
   8  3587  b114   yes   yes  none falsetick   reachable  1
   9  3588  b114   yes   yes  none falsetick   reachable  1
 ntpq> 

 GENERIC.UP current (A GENERIC kernel with "options MPBIOS" & ioapic enabled.)
 # uptime
 10:12PM  up 9 mins, 2 users, load averages: 1.32, 1.23, 0.72
 # ntpq -c peers
      remote           refid      st t when poll reach   delay   offset  jitter
 ==============================================================================
  ns-1.iastate.ed 128.252.19.1     2 u   34   64  277    0.001  2317.93 1097.24
  ns-2.iastate.ed 139.78.100.163   2 u   29   64  356    0.001  2155.96 1128.59
  ns-0.iastate.ed 129.186.140.200  3 u   28   64  337    0.001  2329.00 1255.91
  vs-1.iastate.ed 129.186.140.200  3 u   30   64  276    0.001  2157.84 984.052
  vs-2.iastate.ed 129.186.140.200  3 u   28   64  376    0.001  2166.57 1014.74
  vs-3.iastate.ed 129.186.140.200  3 u   25   64  375    0.001  2341.10 1324.62
 xavi-lis.gw.ligh .CDMA.           1 u   30   64  377   56.639  744.180 913.299
 *timekeeper.isi. .GPS.            1 u   20   64  377   57.350  2033.97 949.629
 +caesar.cs.wisc. 128.105.201.11   2 u   15   64  377   24.268  1687.09 718.903

 # uptime
 10:40PM  up 37 mins, 2 users, load averages: 1.40, 1.32, 1.26
 # ntpq -c peers -c lpa
      remote           refid      st t when poll reach   delay   offset  jitter
 ==============================================================================
  ns-1.iastate.ed 128.252.19.1     2 u   34  128  377    0.001  5449.53 1136.97
  ns-2.iastate.ed 139.78.100.163   2 u   31  128  376    0.001  5448.35 1161.95
  ns-0.iastate.ed 129.186.1.200    3 u   31  128  377    0.001  5457.95 1134.81
  vs-1.iastate.ed 152.2.21.1       3 u   36  128  376    0.001  5437.43 1160.13
  vs-2.iastate.ed 129.186.140.200  3 u   22  128  376    0.001  5476.76 1184.05
  vs-3.iastate.ed 129.186.140.200  3 u   78  128  376    0.001  5401.07 1160.89
  avi-lis.gw.ligh .CDMA.           1 u   29  128  377   56.986  5518.57 1159.55
 xtimekeeper.isi. .GPS.            1 u   78  128  377   57.778  4211.38 702.222
 xcaesar.cs.wisc. 128.105.201.11   2 u   76  128  377   24.397  4763.41 667.182
 ind assID status  conf reach auth condition  last_event cnt
 ===========================================================
   1 13484  b024   yes   yes  none    reject   reachable  2
   2 13485  b014   yes   yes  none    reject   reachable  1
   3 13486  b014   yes   yes  none    reject   reachable  1
   4 13487  b024   yes   yes  none    reject   reachable  2
   5 13488  b024   yes   yes  none    reject   reachable  2
   6 13489  b014   yes   yes  none    reject   reachable  1
   7 13490  b014   yes   yes  none    reject   reachable  1
   8 13491  b114   yes   yes  none falsetick   reachable  1
   9 13492  b114   yes   yes  none falsetick   reachable  1

From: Simon Burge <simonb@wasabisystems.com>
To: tech-kern@netbsd.org
Cc: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org,
	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/32035: APIC timer help
Date: Mon, 28 Nov 2005 12:37:14 +1100

 [ Cross-posting to tech-kern to try to get some help on working
   out WTF is going on here. ]

 Hi folks,

 Can anyone explain how the x86 local APIC timer is supposed to work?

 For some reason, just keeping the box that shows the symptoms in the PR
 busy with builds doesn't make time unhappy, but a busy named does.  This
 box is a Dell (I think) dual Xeon with an Intel E7525 chipset.

 I've written a couple of programs that sort of simulate the
 system call sequences that the busy named generates (these are at
 ftp://ftp.NetBSD.org/pub/NetBSD/misc/simonb/named-simul.tar.gz ).
 Running "receiver" and "sender <host> 2000000 2000" on the same box leads
 it to totally lose time the same way that named makes it lose time.

 On that particular i386 MP box when idle, we get to the top of
 lapic_clockintr() approx every 29926950 TSC ticks in most cases.  Here's
 some data showing the TSC cycles between lapic_clockintr() calls
 (measured only on the primary CPU):

 	tsc from last clockintr
 	                 tsc cycles over 29926950

 	29926950                0
 	29926950                0
 	29926950                0
 	29950837            23887
 	29926973               23
 	29926950                0
 	29926950                0
 	29926950                0
 	29926950                0
 	29926950                0

 I would have expected that if we're late getting to lapic_clockintr()
 because say interrupts were blocked at the time, then we'd get to
 the next (or at least some subsequent) call to lapic_clockintr() in
 the future early WRT the TSC tick count - if the local APIC timer
 interrupts are really coming in at a fixed rate.  So above I'd expect to
 see something like:

 	29926950                0
 	29950837            23887
 	29903063           -23887
 	29926950                0

 and thus keeping an average of 29926950 TSC ticks per call to
 lapic_clockintr().
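
 To make that expectation concrete, here is a minimal sketch of a
 strictly periodic timer whose handler is sometimes entered late.  It
 is a toy model, not a measurement; the function name and the latency
 list are hypothetical.

```python
# Toy model: a timer fires at exact multiples of PERIOD, but the handler for
# a given firing may be entered some TSC ticks late.

PERIOD = 29926950  # nominal TSC ticks between lapic_clockintr() calls (idle data)


def observed_deltas(latencies):
    """TSC deltas between handler entries for a strictly periodic timer.

    The timer fires at k * PERIOD; latencies[k] is the hypothetical delay
    (in TSC ticks) before the handler for firing k actually runs.
    """
    entries = [k * PERIOD + lat for k, lat in enumerate(latencies)]
    return [b - a for a, b in zip(entries, entries[1:])]


# One firing is delivered 23887 ticks late; all others are on time.
deltas = observed_deltas([0, 0, 23887, 0, 0])
for d in deltas:
    print(f"{d:>10} {d - PERIOD:>8}")
```

 In this model a late entry is always followed by a correspondingly
 short interval, so the sum of the deltas stays at a whole number of
 periods.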

 Occasionally I do see little cycles like:

 	tsc from last clockintr
 	                 tsc cycles over 29926950

 	29926950                0
 	29927025               75
 	29926875              -75
 	29926935              -15
 	29926965               15
 	29926950                0

 and

 	29926957                7
 	29926950                0
 	29926943               -7
 	29926950                0
 	29926957                7
 	29926943               -7
 	29926950                0
 	29927092              142
 	29926808             -142
 	29926950                0

 where we miss a few TSC ticks but then make those up in the next call
 or two to lapic_clockintr(), which is what I'd expect to see.  But this
 never happens when any significant number of TSC ticks is missed, only
 for small values like those above.

 However, while the system is busy running the programs mentioned above
 we see sequences like the following:

 	tsc from last clockintr
 	                 tsc cycles over 29926950

 	29926950                0
 	30573397           646447 
 	30620333           693383 
 	30596385           669435 
 	30739912           812962 
 	30812250           885300
 	30691905           764955
 	30440408           513458
 	30082155           155205
 	30453150           526200
 	30525195           598245
 	30571875           644925
 	30692910           765960
 	30691215           764265
 	30501015           574065
 	30465150           538200
 	30062040           135090
 	30472365           545415
 	30596445           669495
 	30070425           143475
 	29926995               45
 	29926950                0

 where we have a much larger than normal delay between lapic_clockintr()
 calls.

 Here's some further data.  First here's three sets of the total number
 of TSC ticks for 10,000 lapic_clockintr() calls while the box is idle:

 	299274903660
 	299273899440
 	299274126585

 and three sets while it's busy running the above programs:

 	302256563415
 	302291485440
 	302242349520

 The busy TSC ticks are pretty close to 1% over the idle TSC ticks, and
 the clock on the box has lost approximately 23 seconds in 40 minutes
 which is also close to 1% time loss.
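
 The two percentages can be checked directly from the figures above.
 This is only a worked calculation over the numbers already quoted;
 the variable names are made up for illustration.

```python
# Compare the excess TSC ticks per 10,000 lapic_clockintr() calls with the
# observed wall-clock loss (23 seconds in 40 minutes).

idle = [299274903660, 299273899440, 299274126585]
busy = [302256563415, 302291485440, 302242349520]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


tsc_excess = mean(busy) / mean(idle) - 1.0   # fraction of extra TSC ticks
clock_loss = 23.0 / (40 * 60)                # fraction of wall-clock time lost

print(f"TSC ticks over idle: {tsc_excess * 100:.2f}%")
print(f"clock time lost:     {clock_loss * 100:.2f}%")
```

 The two fractions agree to within about a twentieth of a percent,
 which supports the idea that the clock loss comes from the stretched
 interval between timer interrupts.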


 Looking at i386/i386/vector.S, the lapic_ltimer interrupt vector sends
 an end-of-interrupt (EOI) to the APIC and then calls lapic_clockintr().
 Is it somehow possible that if there's a delay in sending the EOI that
 the APIC timer won't restart counting down automatically?

 Any other theories to describe the behaviour we're seeing?

 I also have a dual Xeon here based on the Intel E7505 chipset that
 doesn't seem to be affected by this problem, so it would seem that
 it may be chipset related.

 Simon.
 --
 Simon Burge                            <simonb@wasabisystems.com>
 NetBSD Support and Service:         http://www.wasabisystems.com/

From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Simon Burge <simonb@wasabisystems.com>
Cc: tech-kern@NetBSD.org, gnats-bugs@NetBSD.org,
	kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org,
	netbsd-bugs@NetBSD.org
Subject: Re: kern/32035: APIC timer help
Date: Mon, 28 Nov 2005 10:35:29 +0100

 On Mon, Nov 28, 2005 at 12:37:14PM +1100, Simon Burge wrote:
 > [ Cross-posting to tech-kern to try to get some help on working
 >   out WTF is going on here. ]
 > 
 > Hi folks,
 > 
 > Can anyone explain how the x86 local APIC timer is supposed to work?
 > 
 > For some reason, just keeping the box that shows the symptoms in the PR
 > busy with builds doesn't make time unhappy, but a busy named does.  This
 > box is a Dell (I think) dual Xeon with an Intel E7525 chipset.
 > 
 > I've written a couple of programs that sort of simulate the
 > system call sequences that the busy named generates (these are at
 > ftp://ftp.NetBSD.org/pub/NetBSD/misc/simonb/named-simul.tar.gz ).
 > Running "receiver" and "sender <host> 2000000 2000" on the same box leads
 > it to totally lose time the same way that named makes it lose time.

 Probably a shot in the dark, I didn't think about it much, but ...
 Do your sender/receiver processes make use of the network adapter ?
 Maybe it could be the same "interrupt aliasing" as pointed out in
 "wm behaviour on NetBSD" on netbsd-users, described here:
 http://lists.freebsd.org/pipermail/freebsd-current/2005-November/058383.html

 If your network adapter happens to be aliased to the clock interrupt,
 this will probably cause the system to see many more clock interrupts than
 there really are.

 I suggest this only because you're seeing this problem on Dell servers, and
 the interrupt aliasing problem also happens on recent Dell servers.

 -- 
 Manuel Bouyer <bouyer@antioche.eu.org>
      NetBSD: 26 ans d'experience feront toujours la difference
 --

From: Simon Burge <simonb@wasabisystems.com>
To: Manuel Bouyer <bouyer@antioche.eu.org>
Cc: tech-kern@NetBSD.org, gnats-bugs@NetBSD.org,
	kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org,
	netbsd-bugs@NetBSD.org
Subject: Re: kern/32035: APIC timer help 
Date: Mon, 28 Nov 2005 20:56:50 +1100

 Hi Manuel,

 Manuel Bouyer wrote:

 > Probably a shot in the dark, I didn't think about it much, but ...
 > Do your sender/receiver processes make use of the network adapter ?

 We don't actually hit the network adapter at all.  Here's "vmstat 1" as
 I start the program:

  procs     memory      page                       disks      faults          cpu
  r b w     avm    fre  flt  re  pi   po   fr   sr f0 m0 c0   in     sy    cs us sy id
  0 0 0 1032856 807988    6   0   0    0    0    0  0  0  0   25     80    12  0 0 100
  0 0 0 1032856 807988    5   0   0    0    0    0  0  0  0   25    169    34  0 0 100
  0 0 0 1032856 807988    5   0   0    0    0    0  0  0  0   21     73    11  0 0 100
  2 0 0 1033028 807800   87   0   0    0    0    0  0  0  0   19 229411 58393  4 20 75
  1 0 0 1033028 807800    5   0   0    0    0    0  0  0  0   20 278233 69764  3 27 70
  1 0 0 1033028 807800    6   0   0    0    0    0  0  0  0   14 275263 69933  3 27 69

 "netstat -i 1" also shows only localhost traffic.

 Simon.
 --
 Simon Burge                            <simonb@wasabisystems.com>
 NetBSD Support and Service:         http://www.wasabisystems.com/

From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Simon Burge <simonb@wasabisystems.com>
Cc: tech-kern@NetBSD.org, gnats-bugs@NetBSD.org,
	kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org,
	netbsd-bugs@NetBSD.org
Subject: Re: kern/32035: APIC timer help
Date: Mon, 28 Nov 2005 11:22:46 +0100

 On Mon, Nov 28, 2005 at 08:56:50PM +1100, Simon Burge wrote:
 > Hi Manuel,
 > 
 > Manuel Bouyer wrote:
 > 
 > > Probably a shot in the dark, I didn't think about it much, but ...
 > > Do your sender/receiver processes make use of the network adapter ?
 > 
 > We don't actually hit the network adapter at all.  Here's "vmstat 1" as
 > I start the program:
 > 
 >  procs     memory      page                       disks      faults          cpu
 >  r b w     avm    fre  flt  re  pi   po   fr   sr f0 m0 c0   in     sy    cs us sy id
 >  0 0 0 1032856 807988    6   0   0    0    0    0  0  0  0   25     80    12  0 0 100
 >  0 0 0 1032856 807988    5   0   0    0    0    0  0  0  0   25    169    34  0 0 100
 >  0 0 0 1032856 807988    5   0   0    0    0    0  0  0  0   21     73    11  0 0 100
 >  2 0 0 1033028 807800   87   0   0    0    0    0  0  0  0   19 229411 58393  4 20 75
 >  1 0 0 1033028 807800    5   0   0    0    0    0  0  0  0   20 278233 69764  3 27 70
 >  1 0 0 1033028 807800    6   0   0    0    0    0  0  0  0   14 275263 69933  3 27 69
 > 
 > "netstat -i 1" also shows only localhost traffic.

 And does 'systat vm' or 'vmstat -i' show anything odd about the interrupts ?

 -- 
 Manuel Bouyer <bouyer@antioche.eu.org>
      NetBSD: 26 ans d'experience feront toujours la difference
 --

From: Simon Burge <simonb@wasabisystems.com>
To: Manuel Bouyer <bouyer@antioche.eu.org>
Cc: tech-kern@NetBSD.org, gnats-bugs@NetBSD.org,
	kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org,
	netbsd-bugs@NetBSD.org
Subject: Re: kern/32035: APIC timer help 
Date: Mon, 28 Nov 2005 23:19:15 +1100

 Manuel Bouyer wrote:

 > And does 'systat vm' or 'vmstat -i' show anything odd about the interrupts ?

 "systat vm" shows "normal" looking interrupts to me - lots of softnet
 interrupts and not much else interesting.  Here's a snapshot:

    79711 Interrupts
      100 lapic_clockintr
      100 cpu0 softclock
    10719 cpu0 softnet
          cpu0 softserial
      100 cpu0 timer
          FPU flush IPI
          FPU synch IPI
        1 TLB shootdown I
    25243 cpu1 softnet
      100 cpu1 timer
        1 timeset IPI
          FPU flush IPI
          FPU synch IPI
        1 TLB shootdown I
    22686 cpu2 softnet
      100 cpu2 timer
        1 timeset IPI
          FPU flush IPI
          FPU synch IPI
        1 TLB shootdown I
    20441 cpu3 softnet
      100 cpu3 timer
        1 timeset IPI
          FPU flush IPI
        1 FPU synch IPI
        1 TLB shootdown I
          ioapic0 pin 6
          ioapic0 pin 4
          ioapic1 pin 14
          ioapic3 pin 10
       14 ioapic3 pin 0
          ioapic3 pin 1

 Simon.
 --
 Simon Burge                            <simonb@wasabisystems.com>
 NetBSD Support and Service:         http://www.wasabisystems.com/

From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Simon Burge <simonb@wasabisystems.com>
Cc: tech-kern@NetBSD.org, gnats-bugs@NetBSD.org,
	kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org,
	netbsd-bugs@NetBSD.org
Subject: Re: kern/32035: APIC timer help
Date: Mon, 28 Nov 2005 13:23:14 +0100

 On Mon, Nov 28, 2005 at 11:19:15PM +1100, Simon Burge wrote:
 > Manuel Bouyer wrote:
 > 
 > > And does 'systat vm' or 'vmstat -i' show anything odd about the interrupts ?
 > 
 > "systat vm" shows "normal" looking interrupts to me - lots of softnet
 > interrupts and not much else interesting.  Here's a snapshot:

 Yes, this looks good. And I don't think the interrupt aliasing problem will
 affect software interrupts. AFAIK there is no hardware interrupt at a lower
 priority than software interrupts on x86.

 -- 
 Manuel Bouyer <bouyer@antioche.eu.org>
      NetBSD: 26 ans d'experience feront toujours la difference
 --

From: Simon Burge <simonb@wasabisystems.com>
To: tech-kern@netbsd.org
Cc: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org,
	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/32035: APIC timer help 
Date: Wed, 07 Dec 2005 18:02:44 +1100

 Simon Burge wrote:

 > [ local APIC timer problem discussed ]

 I've come to the conclusion that on the problematic machines the APIC
 timer just doesn't fire with a constant period, for some unknown
 reason, and that there's nothing we can really do about it.  The
 patch at

    ftp://ftp.netbsd.org/pub/NetBSD/misc/simonb/mp-time-hack.diff

 at least lets time run stably.  The main comment at the top of the patch
 describes what it does:

 	* Some MP systems have been observed to not have a
 	* stable local APIC timer interrupt.  We count the
 	* number of TSC cycles since the last call to
 	* lapic_clockintr(), and if it has been longer than
 	* expected we add in some extra time for hardclock()
 	* to add in when it computes the next value of the
 	* system "time" variable.  Note that we don't skip
 	* time backwards - early arrivals to lapic_clockintr()
 	* have only been observed sporadically, and we'll
 	* soon catch up.
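
 As a rough illustration of the approach that comment describes (this
 is not the patch itself, which is C code in the kernel), the
 compensation can be modelled in a few lines of Python.  The class and
 method names are hypothetical, and the 100 Hz tick rate is an
 assumption based on the 100 lapic_clockintr interrupts per second in
 the systat output earlier in this PR.

```python
# Model of the compensation: measure TSC cycles since the previous clock
# interrupt and, if more than one nominal tick has elapsed, credit the
# surplus as extra time.  Early arrivals are not subtracted.

EXPECTED_CYCLES = 29926950  # nominal TSC cycles per tick (idle data above)
TICK_USEC = 10000           # assumed 100 Hz clock tick, in microseconds


class TickCompensator:
    """Hypothetical stand-in for the bookkeeping done in lapic_clockintr()."""

    def __init__(self, expected_cycles, tick_usec):
        self.expected = expected_cycles
        self.tick_usec = tick_usec
        self.last_tsc = None

    def on_clockintr(self, tsc_now):
        """Return the microseconds to advance system time by for this tick."""
        if self.last_tsc is None:
            # First interrupt: nothing to compare against yet.
            self.last_tsc = tsc_now
            return self.tick_usec
        elapsed = tsc_now - self.last_tsc
        self.last_tsc = tsc_now
        # Only late arrivals are compensated; time never goes backwards.
        extra_cycles = max(0, elapsed - self.expected)
        extra_usec = extra_cycles * self.tick_usec // self.expected
        return self.tick_usec + extra_usec


comp = TickCompensator(EXPECTED_CYCLES, TICK_USEC)
first = comp.on_clockintr(0)
late = comp.on_clockintr(30573397)              # 646447 cycles late
early = comp.on_clockintr(30573397 + 29926875)  # 75 cycles early
print(first, late, early)
```

 With the first "busy" sample from the earlier data (646447 cycles
 over), this model credits 216 extra microseconds on top of the normal
 tick, while the slightly early arrival gets just the normal tick.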

 Longer term, switching to timecounters is a more correct fix since they
 base time calculations on the TSC counter and not the period of the
 clock interrupt.  Using HPET timers where available will also help.

 Until then though, any comments on the patch as is?  Is this too ugly
 to consider using in our source tree until then?  Is the name of the
 option (LAPIC_TIMER_IS_BUGGERED) not quite appropriate? :-)

 I'd be curious if anyone else with SMP boxes that have time keeping
 problems could test this out and see if it fixes the time problem.

 Simon.
 --
 Simon Burge                            <simonb@wasabisystems.com>
 NetBSD Support and Service:         http://www.wasabisystems.com/

From: fredb@immanent.net (Frederick Bruckman)
To: Simon Burge <simonb@wasabisystems.com>
Cc: tech-kern@NetBSD.org, gnats-bugs@NetBSD.org
Subject: Re: kern/32035: APIC timer help
Date: Mon, 19 Dec 2005 11:38:38 -0600 (CST)

 In article <20051207070244.E209723402@thoreau.thistledown.com.au>,
 	Simon Burge <simonb@wasabisystems.com> writes:
 > Simon Burge wrote:
 > 
 >> [ local APIC timer problem discussed ]
 > 
 > I've come to the conclusion that on the problematic machines the APIC
 > timer just doesn't fire with a constant period, for some unknown
 > reason, and that there's nothing we can really do about it.  The
 > patch at
 > 
 >    ftp://ftp.netbsd.org/pub/NetBSD/misc/simonb/mp-time-hack.diff
 > 
 > at least lets time run stably.  The main comment at the top of the patch
 > describes what it does:
 > 
 > 	* Some MP systems have been observed to not have a
 > 	* stable local APIC timer interrupt.  We count the
 > 	* number of TSC cycles since the last call to
 > 	* lapic_clockintr(), and if it has been longer than
 > 	* expected we add in some extra time for hardclock()
 > 	* to add in when it computes the next value of the
 > 	* system "time" variable.  Note that we don't skip
 > 	* time backwards - early arrivals to lapic_clockintr()
 > 	* have only been observed sporadically, and we'll
 > 	* soon catch up.
 > 
 > Longer term, switching to timecounters is a more correct fix since they
 > base time calculations on the TSC counter and not the period of the
 > clock interrupt.  Using HPET timers where available will also help.

 That sounds really interesting. The problem I see with your theory
 is that it's the same APIC timer for the one CPU or two CPU cases.
 I suspect some latency in the IPI/read-TSC code path.  Maybe the
 "rdtsc" instruction simply isn't in the icache on the slow cycles?
 Experimenting as you suggest would help answer the question.

 > I'd be curious if anyone else with SMP boxes that have time keeping
 > problems could test this out and see if it fixes the time problem.

 It helps! The frequency (as logged in "/var/log/loopstats") jumps to
 a few hundred under heavy disk I/O, but then settles back down without
 stepping. (Patch applied to netbsd-3-0). Yet, on the same machine with
 a non-SMP kernel (2.1 to 3.0_RC6), the frequency slowly varies from
 about 5.0 to 11.0, depending on ambient temperature, so it's clearly
 not a complete fix.


 Frederick

From: David Brownlee <abs@absd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/32035: UP ACPI kernels lose time
Date: Tue, 14 Mar 2006 18:12:40 +0000 (GMT)

  	I'm also seeing a similar effect (clock losing minutes per hour),
  	and ntp unable to cope, on two single proc machines, but only
  	when an ACPI kernel is run.

  	diff between working (noacpi) and non-working (acpi) kernels below.

  	I wonder if this could in any way be related to port-i386/32953
  	(Repeated keystrokes under X using ACPI kernel)

 --- dmesg.noacpi	2006-03-14 18:05:11.000000000 +0000
 +++ dmesg.acpi	2006-03-14 18:05:11.000000000 +0000
 -NetBSD 3.0_STABLE (_NOACPI_) #2: Fri Mar 10 17:35:58 GMT 2006
 -	root@tll.i:/var/obj/i386/files/netbsd/3/sys/arch/i386/compile/_NOACPI_
 +NetBSD 3.0_STABLE (_ACPI_) #4: Fri Mar 10 16:47:41 GMT 2006
 +	root@tll.i:/var/obj/i386/files/netbsd/3/sys/arch/i386/compile/_ACPI_
 -cpu0 at mainbus0: (uniprocessor)
 -cpu0: Intel Celeron (686-class), 1716.98 MHz, id 0xf13
 +cpu0 at mainbus0: apid 0 (boot processor)
 +cpu0: Intel Celeron (686-class), 1716.99 MHz, id 0xf13
 -pnpbios0 at mainbus0: nodes 17, max len 92
 -pckbc1 at pnpbios0 index 4 (PNP0303): kbd port
 -npx0 at pnpbios0 index 6 (PNP0C04)
 -npx0: io f0-ff, irq 13
 +cpu0: calibrating local timer
 +cpu0: apic clock running at 100 MHz
 +ioapic0 at mainbus0 apid 2 (I/O APIC)
 +ioapic0: pa 0xfec00000, version 20, 24 pins
 +acpi0 at mainbus0
 +acpi0: using Intel ACPI CA subsystem version 20040211
 +acpi0: X/RSDT: OemId <GBT   ,AWRDACPI,42302e31>, AslId <AWRD,01010101>
 +acpi0: SCI interrupting at int 9
 +acpi0: fixed-feature power button present
 +ACPI Object Type 'Processor' (0x0c) at acpi0 not configured
 +acpibut0 at acpi0 (PNP0C0C): ACPI Power Button
 +acpibut1 at acpi0 (PNP0C0E): ACPI Sleep Button
 +PNP0C01 at acpi0 not configured
 +PNP0A03 at acpi0 not configured
 +PNP0C02 at acpi0 not configured
 +PNP0000 at acpi0 not configured
 +PNP0200 at acpi0 not configured
 +PNP0100 at acpi0 not configured
 +PNP0B00 at acpi0 not configured
 +PNP0800 at acpi0 not configured
 +npx0 at acpi0 (PNP0C04)
 +npx0: io 0xf0-0xff irq 13
 -com3 at pnpbios0 index 12 (PNP0501)
 -com3: io 3f8-3ff, irq 4
 -com3: ns16550a, working fifo
 -com3: console
 -fdc1 at pnpbios0 index 13 (PNP0700)
 -fdc1: io 3f0-3f5 3f7, irq 6, DMA 2
 -lpt3 at pnpbios0 index 14 (PNP0400)
 -lpt3: io 378-37f 778-77f, irq 7
 -com4 at pnpbios0 index 16 (PNP0501)
 -com4: io 2f8-2ff, irq 3
 -com4: ns16550a, working fifo
 +fdc0 at acpi0 (PNP0700)
 +fdc0: io 0x3f0-0x3f5,0x3f7 irq 6 drq 2
 +com0 at acpi0 (PNP0501-1)
 +com0: io 0x3f8-0x3ff irq 4
 +com0: ns16550a, working fifo
 +com0: console
 +com1 at acpi0 (PNP0501-2)
 +com1: io 0x2f8-0x2ff irq 3
 +com1: ns16550a, working fifo
 +lpt0 at acpi0 (PNP0400)
 +lpt0: io 0x378-0x37f irq 7
 +PNP0C02 at acpi0 not configured
 +PNPB006 at acpi0 not configured
 +joy0 at acpi0 (PNPB02F)
 +joy0: io 0x201
 +joy0: joystick not connected
 +PNP0C0F at acpi0 not configured
 +PNP0C0F at acpi0 not configured
 +PNP0C0F at acpi0 not configured
 +PNP0C0F at acpi0 not configured
 +PNP0C0F at acpi0 not configured
 +PNP0C0F at acpi0 not configured
 -uhci0: interrupting at irq 12
 +uhci0: interrupting at ioapic0 pin 16 (irq 12)
 -uhci1: interrupting at irq 12
 +uhci1: interrupting at ioapic0 pin 19 (irq 12)
 -uhci2: interrupting at irq 11
 +uhci2: interrupting at ioapic0 pin 18 (irq 11)
 -ehci0: interrupting at irq 9
 +ehci0: interrupting at ioapic0 pin 23 (irq 9)
 -rtk0: interrupting at irq 11
 +rtk0: interrupting at ioapic0 pin 21 (irq 11)
 -piixide0: primary channel interrupting at irq 14
 +piixide0: primary channel interrupting at ioapic0 pin 14 (irq 14)
 -piixide0: secondary channel interrupting at irq 15
 +piixide0: secondary channel interrupting at ioapic0 pin 15 (irq 15)
 -auich0: interrupting at irq 5
 +auich0: interrupting at ioapic0 pin 17 (irq 5)
 +pckbc0 at isa0 port 0x60-0x64
 -apm0 at mainbus0: Power Management spec V1.2
 -auich0: measured ac97 link rate at 48007 Hz, will use 48000 Hz
 +ioapic0: enabling
 +auich0: measured ac97 link rate at 49413 Hz, will use 48000 Hz

From: Pavel Cahyna <pavel.cahyna@st.mff.cuni.cz>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/32035: UP ACPI kernels lose time
Date: Tue, 14 Mar 2006 21:21:08 +0100

 On Tue, Mar 14, 2006 at 06:15:04PM +0000, David Brownlee wrote:
 >   	I wonder if this could in any way be related to port-i386/32953
 >   	(Repeated keystrokes under X using ACPI kernel)

 I saw a similar thing as in PR 32953 under Linux, and there the kernel was
 not losing, but gaining time very quickly. (I could watch a Gnome clock
 applet with second resolution and sometimes see the seconds counter
 incremented by two seconds instead of one.)

 Pavel Cahyna

From: Frank Kardel <kardel@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org
Subject: Re: kern/32035: 3.0 MP machines can't keep time on busy nameservers
Date: Sat, 28 Oct 2006 12:33:23 +0200

 Tracy Di Marco White wrote:

 >The following reply was made to PR kern/32035; it has been noted by GNATS.
 >
 >From: Tracy Di Marco White <gendalia@gendalia.org>
 >To: gnats-bugs@netbsd.org
 >Cc: nb-pr@gendalia.org
 >Subject: Re: kern/32035: 3.0 MP machines can't keep time on busy nameservers 
 >Date: Tue, 15 Nov 2005 22:50:15 -0600
 >  
 >
 Is this bug still valid after conversion to timecounters ? If not, can 
 we close it ?

 Frank

From: Hauke Fath <hf@spg.tu-darmstadt.de>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org,
	netbsd-bugs@NetBSD.org, nb-pr@gendalia.org
Subject: Re: kern/32035: 3.0 MP machines can't keep time on busy
 nameservers
Date: Mon, 30 Oct 2006 11:07:35 +0100

 On 28.10.2006 at 10:35 +0000, Frank Kardel wrote:
 >  Is this bug still valid after conversion to timecounters ? If not, can
 >  we close it ?

 What did I miss? When has netbsd-3 been converted to timecounters?

 	hauke

 -- 
 /~\  The ASCII Ribbon Campaign                    Hauke Fath
 \ /    No HTML/RTF in email	        Institut für Nachrichtentechnik
   X     No Word docs in email	                  TU Darmstadt
 / \  Respect for open standards              Ruf +49-6151-16-3281

From: Matthias Scheler <tron@zhadum.org.uk>
To: Hauke Fath <hf@spg.tu-darmstadt.de>
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/32035: 3.0 MP machines can't keep time on busy nameservers
Date: Mon, 30 Oct 2006 13:21:52 +0000

 On Mon, Oct 30, 2006 at 11:07:35AM +0100, Hauke Fath wrote:
 > On 28.10.2006 at 10:35 +0000, Frank Kardel wrote:
 > > Is this bug still valid after conversion to timecounters ? If not, can
 > > we close it ?
 > 
 > What did I miss? When has netbsd-3 been converted to timecounters?

 No, it hasn't. And it probably never will. So the answer is really
 "Please update to NetBSD 4.0 after it has been released.".

 	Kind regards

 -- 
 Matthias Scheler                                  http://zhadum.org.uk/

From: Tracy Di Marco White <nb-pr@gendalia.org>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org
Subject: Re: kern/32035: 3.0 MP machines can't keep time on busy nameservers
Date: Tue, 31 Oct 2006 01:45:32 -0600

 On Mon, Oct 30, 2006 at 01:25PM +0000, Matthias Scheler wrote:
 >The following reply was made to PR kern/32035; it has been noted by GNATS.
 >
 >From: Matthias Scheler <tron@zhadum.org.uk>
 >To: Hauke Fath <hf@spg.tu-darmstadt.de>
 >Cc: gnats-bugs@NetBSD.org
 >Subject: Re: kern/32035: 3.0 MP machines can't keep time on busy nameservers
 >Date: Mon, 30 Oct 2006 13:21:52 +0000
 >
 > On Mon, Oct 30, 2006 at 11:07:35AM +0100, Hauke Fath wrote:
 > > On 28.10.2006 at 10:35 +0000, Frank Kardel wrote:
 > > > Is this bug still valid after conversion to timecounters ? If not, can
 > > > we close it ?
 > >
 > > What did I miss? When has netbsd-3 been converted to timecounters?
 >
 > No, it hasn't. And it probably never will. So the answer is really
 > "Please update to NetBSD 4.0 after it has been released.".

 I don't care if we close this one.  I won't be able to test on work's
 production name servers til next summer, and I definitely hope we'll
 have 4.0 out long before that.  And if we still can't keep time, I'll
 reopen it.

 -Tracy

From: Frank Kardel <kardel@netbsd.org>
To: Hauke Fath <hf@spg.tu-darmstadt.de>
Cc: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org,
	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, nb-pr@gendalia.org
Subject: Re: kern/32035: 3.0 MP machines can't keep time on busy nameservers
Date: Tue, 31 Oct 2006 23:31:28 +0100

 Hauke Fath wrote:

 > On 28.10.2006 at 10:35 +0000, Frank Kardel wrote:
 >
 >>  Is this bug still valid after conversion to timecounters ? If not, can
 >>  we close it ?
 >
 >
 > What did I miss? When has netbsd-3 been converted to timecounters?

 my bad - 3.0 MP has had AFAIK no fixes - my interest is whether this problem
 still exists with timecounters (4.0 and up)

 >
 >     hauke
 >
 Frank

State-Changed-From-To: open->feedback
State-Changed-By: maya@NetBSD.org
State-Changed-When: Wed, 28 Nov 2018 06:37:11 +0000
State-Changed-Why:
Is this still an issue?


State-Changed-From-To: feedback->closed
State-Changed-By: maya@NetBSD.org
State-Changed-When: Thu, 07 Jan 2021 14:52:52 +0000
State-Changed-Why:
feedback timeout: this bug was suggested to be fixed in netbsd-4 but not in netbsd-3. Those two branches are now EOL. Assuming it is already fixed.


>Unformatted: