NetBSD Problem Report #45626

From www@NetBSD.org  Thu Nov 17 20:21:36 2011
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by www.NetBSD.org (Postfix) with ESMTP id 8EDC063D8AD
	for <gnats-bugs@gnats.NetBSD.org>; Thu, 17 Nov 2011 20:21:36 +0000 (UTC)
Message-Id: <20111117202135.8F0C563D875@www.NetBSD.org>
Date: Thu, 17 Nov 2011 20:21:35 +0000 (UTC)
From: donaldcallen@gmail.com
Reply-To: donaldcallen@gmail.com
To: gnats-bugs@NetBSD.org
Subject: System time does not advance correctly when noatime is specified for /var
X-Send-Pr-Version: www-1.0

>Number:         45626
>Category:       kern
>Synopsis:       System time does not advance correctly when noatime is specified for /var
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Nov 17 20:25:00 +0000 2011
>Last-Modified:  Wed Jun 25 00:15:01 +0000 2014
>Originator:     Don Allen
>Release:        5.1
>Organization:
>Environment:
NetBSD salome.comcast.net 5.1 NetBSD 5.1 (GENERIC) #0: Sat Nov  6 13:19:33 UTC 2010  builds@b6.netbsd.org:/home/builds/ab/netbsd-5-1-RELEASE/amd64/201011061943Z-obj/home/builds/ab/netbsd-5-1-RELEASE/src/sys/arch/amd64/compile/GENERIC amd64
>Description:
I just installed NetBSD 5.1 on a Lenovo S10 workstation: 4-core Intel processor, 4 GB of RAM, 2 fast SAS disks on an LSI controller in a RAID-0 configuration. In my first attempt, I set up just a root partition (and swap). After installing, I changed fstab to mount / async,noatime (I want high performance and am willing to risk losing the whole thing). After doing that, I noticed that the system wasn't keeping time correctly: the system time was advancing at a snail's pace, roughly 1 second per minute of real time. Other things also took much too long, such as booting, logging off, and shutting the system down.

I did a little experimenting to try to understand what was going on. First, I changed fstab to remove the async,noatime and rebooted. That fixed the problem. Then I reinstalled with multiple filesystems: /, /usr, /var, /tmp, /home. I suspected that making / async,noatime was the problem and mounted all the others async,noatime except /. That did not fix the problem. I then took a wild guess that /var was the issue and set all to async,noatime except /var. That fixed the problem. I then tried setting /var to just async, without the noatime, and rebooted. Time-keeping now worked correctly. Adding the noatime back again brought the problem back after rebooting.

I'm guessing that there's something in the time-keeping logic that depends on the access time of a file somewhere in the /var hierarchy being maintained. Guessing further, I think the slow booting, logging off, and shutdown are due to timeouts that take longer than intended, because the system clock is barely advancing.
>How-To-Repeat:
Install 5.1 with /var mounted async,noatime (I don't know if the async matters -- I haven't tested this without it, but my guess, and it's only a guess, is that it doesn't matter).
>Fix:
Don't mount /var async,noatime. Just async works.

>Audit-Trail:
From: Bernd Ernesti <netbsd@lists.veego.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/45626: System time does not advance correctly when noatime is specified for /var
Date: Thu, 17 Nov 2011 21:42:21 +0100

 hmm, are you using ntpd?

 If so what happens if you disable it and use noatime again?

 Bernd

From: Donald Allen <donaldcallen@gmail.com>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/45626: System time does not advance correctly when noatime
 is specified for /var
Date: Thu, 17 Nov 2011 16:01:36 -0500

 On Thu, Nov 17, 2011 at 3:45 PM, Bernd Ernesti <netbsd@lists.veego.de> wrote:

 > The following reply was made to PR kern/45626; it has been noted by GNATS.
 >
 > From: Bernd Ernesti <netbsd@lists.veego.de>
 > To: gnats-bugs@NetBSD.org
 > Cc:
 > Subject: Re: kern/45626: System time does not advance correctly when
 > noatime is specified for /var
 > Date: Thu, 17 Nov 2011 21:42:21 +0100
 >
 >  hmm, are you using ntpd?
 >

 Not yet -- still setting the system up. This problem occurred *without*
 ntpd running.

 /Don

 >
 >  If so what happens if you disable it and use noatime again?
 >
 >  Bernd
 >
 >

From: Donald Allen <donaldcallen@gmail.com>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/45626: System time does not advance correctly when noatime
 is specified for /var
Date: Fri, 18 Nov 2011 08:23:16 -0500

 On Thu, Nov 17, 2011 at 3:25 PM, <gnats-admin@netbsd.org> wrote:

 > Thank you very much for your problem report.
 > It has the internal identification `kern/45626'.
 > The individual assigned to look at your
 > report is: kern-bug-people.
 >
 > >Category:       kern
 > >Responsible:    kern-bug-people
 > >Synopsis:       System time does not advance correctly when noatime is
 > specified for /var
 > >Arrival-Date:   Thu Nov 17 20:25:00 +0000 2011
 >

 It turns out that attributing this problem to mounting /var with noatime
 was not correct. I booted the system several times with and without the
 noatime option for /var, and the timer misbehaved exactly when noatime was
 set. But with /var mounted without noatime, I began using the system, and
 after it had been up for hours I noticed last night that the clock had gone
 catatonic and was advancing at the same snail's pace that prompted this
 report.

 I had an exchange of emails with Christos Zoulas about this. Christos
 suggested running

 sysctl -a | grep kern.timecounter

 which I did:

 kern.timecounter.choice = TSC(q=3000, f=67750617510 Hz) clockinterrupt(q=0,
 f=100 Hz) ichlpcib0(q=1000, f=3579545 Hz) hpet0(q=2000, f=14318179 Hz)
 ACPI-Fast(q=1000, f=3579545 Hz) lapic(q=-100, f=266097187 Hz) i8254(q=100,
 f=1193182 Hz) dummy(q=-1000000, f=1000000 Hz)
 kern.timecounter.hardware = TSC
 kern.timecounter.timestepwarnings = 0

 He also suggested trying

 > sysctl -w kern.timecounter.hardware=hpet0
 >
 > and see if that fixes it.
 >

 It does:

 root@salome:/home/dca$ date
 Thu Nov 17 17:53:36 EST 2011
 root@salome:/home/dca$ date
 Thu Nov 17 17:53:36 EST 2011
 root@salome:/home/dca$ date
 Thu Nov 17 17:53:36 EST 2011
 root@salome:/home/dca$ date
 Thu Nov 17 17:53:36 EST 2011
 root@salome:/home/dca$ sysctl -w kern.timecounter.hardware=hpet0
 kern.timecounter.hardware: TSC -> hpet0
 root@salome:/home/dca$ date
 Thu Nov 17 17:53:40 EST 2011
 root@salome:/home/dca$ date
 Thu Nov 17 17:53:41 EST 2011
 root@salome:/home/dca$ date
 Thu Nov 17 17:53:42 EST 2011
 root@salome:/home/dca$ date
 Thu Nov 17 17:53:43 EST 2011
 root@salome:/home/dca$ date
 Thu Nov 17 17:53:44 EST 2011
 root@salome:/home/dca$

 However, after doing this, the system behaved in odd ways. I had trouble
 shutting X down, the system seemed not to be hearing the (USB) keyboard (I
 couldn't log in after doing ctrl-alt-f2). I finally got it shut down and
 rebooted, ending this experiment.

 I then decided to try a newer kernel, and installed the kernel from the
 11/17 snapshot. Upon booting the system this morning (with all the file
 systems, including /var, mounted async,noatime), I quickly observed the
 clock problem.

 I cannot use this system because of this problem, but I will leave NetBSD
 installed, so if any further information is needed (would you like dmesg
 output?), or further experimentation, it will be available and I will
 provide whatever help I can.

 /Don

From: Manuel Bouyer <bouyer@antioche.eu.org>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org, netbsd-bugs@NetBSD.org,
        donaldcallen@gmail.com
Subject: Re: kern/45626: System time does not advance correctly when
 noatime is specified for /var
Date: Fri, 18 Nov 2011 15:33:03 +0100

 On Fri, Nov 18, 2011 at 01:25:03PM +0000, Donald Allen wrote:
 >  [...]
 >  sysctl -a | grep kern.timecounter
 >  
 >  which I did:
 >  
 >  kern.timecounter.choice = TSC(q=3000, f=67750617510 Hz) clockinterrupt(q=0,
 >  f=100 Hz) ichlpcib0(q=1000, f=3579545 Hz) hpet0(q=2000, f=14318179 Hz)
 >  ACPI-Fast(q=1000, f=3579545 Hz) lapic(q=-100, f=266097187 Hz) i8254(q=100,
 >  f=1193182 Hz) dummy(q=-1000000, f=1000000 Hz)
 >  kern.timecounter.hardware = TSC
 >  kern.timecounter.timestepwarnings = 0

 TSC is clearly wrong here, I'm sure you don't have a 67 GHz CPU :)
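As a rough illustration of that sanity check, the sketch below parses the kern.timecounter.choice line quoted above and flags any counter whose reported frequency is implausible. It is Python rather than kernel code; the 10 GHz ceiling and the helper names (parse_choice, implausible) are assumptions made for this example, not anything from the NetBSD tree.

```python
import re

# Value of kern.timecounter.choice as quoted in this report.
CHOICE = (
    "TSC(q=3000, f=67750617510 Hz) clockinterrupt(q=0, f=100 Hz) "
    "ichlpcib0(q=1000, f=3579545 Hz) hpet0(q=2000, f=14318179 Hz) "
    "ACPI-Fast(q=1000, f=3579545 Hz) lapic(q=-100, f=266097187 Hz) "
    "i8254(q=100, f=1193182 Hz) dummy(q=-1000000, f=1000000 Hz)"
)

# Assumed plausibility ceiling, chosen for this example only.
MAX_PLAUSIBLE_HZ = 10_000_000_000


def parse_choice(text):
    """Return a list of (name, quality, frequency_hz) tuples."""
    pattern = re.compile(r"([\w-]+)\(q=(-?\d+), f=(\d+) Hz\)")
    return [(name, int(q), int(f)) for name, q, f in pattern.findall(text)]


def implausible(counters, limit=MAX_PLAUSIBLE_HZ):
    """Names of counters whose reported frequency exceeds the limit."""
    return [name for name, _quality, hz in counters if hz > limit]


counters = parse_choice(CHOICE)
best = max(counters, key=lambda c: c[1])  # entry with the highest quality
print("highest quality:", best[0])
print("implausible:", implausible(counters))
```

On the values from this report it flags only TSC, which is also the entry with the highest quality (q=3000) and the one kern.timecounter.hardware shows as selected.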

 >  
 >  He also suggested trying
 >  
 >  sysctl -w kern.timecounter.hardware=hpet0
 >  >
 >  > and see if that fixes it.
 >  >
 >  
 >  It does:
 > [...]
 >  However, after doing this, the system behaved in odd ways. I had trouble
 >  shutting X down, the system seemed not to be hearing the (USB) keyboard (I
 >  couldn't log in after doing ctrl-alt-f2). I finally got it shut down and
 >  rebooted, ending this experiment.
 >  
 >  I then decided to try a newer kernel, and installed the kernel from the
 >  11/17 snapshot. Upon booting the system this morning (with all the file
 >  systems, including /var, mounted async,noatime), I quickly observed the
 >  clock problem.

 You should try setting
 kern.timecounter.hardware=hpet0
 in /etc/sysctl.conf, so that the change is applied at boot.
 The odd behavior you see may be caused by processes started before
 the timecounter change.

 You can also try other available timecounters (try ichlpcib0 or i8254
 for example).

 -- 
 Manuel Bouyer <bouyer@antioche.eu.org>
      NetBSD: 26 ans d'experience feront toujours la difference
 --

From: Donald Allen <donaldcallen@gmail.com>
To: Manuel Bouyer <bouyer@antioche.eu.org>
Cc: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org, gnats-admin@netbsd.org, 
	netbsd-bugs@netbsd.org
Subject: Re: kern/45626: System time does not advance correctly when noatime
 is specified for /var
Date: Fri, 18 Nov 2011 10:06:47 -0500

 On Fri, Nov 18, 2011 at 9:33 AM, Manuel Bouyer <bouyer@antioche.eu.org> wrote:
 >
 > On Fri, Nov 18, 2011 at 01:25:03PM +0000, Donald Allen wrote:
 > >  [...]
 > >  sysctl -a | grep kern.timecounter
 > >
 > >  which I did:
 > >
 > >  kern.timecounter.choice = TSC(q=3000, f=67750617510 Hz) clockinterrupt(q=0,
 > >  f=100 Hz) ichlpcib0(q=1000, f=3579545 Hz) hpet0(q=2000, f=14318179 Hz)
 > >  ACPI-Fast(q=1000, f=3579545 Hz) lapic(q=-100, f=266097187 Hz) i8254(q=100,
 > >  f=1193182 Hz) dummy(q=-1000000, f=1000000 Hz)
 > >  kern.timecounter.hardware = TSC
 > >  kern.timecounter.timestepwarnings = 0
 >
 > TSC is clearly wrong here, I'm sure you don't have a 67 GHz CPU :)

 That's certainly true.

 I've got NetBSD installed next to Slackware Linux on this machine,
 running the 3.0.8 kernel from kernel.org. I've had no problem with
 time-keeping in Linux on the machine, so out of curiosity, I brought
 Linux up and had a look at the dmesg:

 dca@salome:~$ sudo dmesg | fgrep TSC
 [    0.000000] Fast TSC calibration failed
 [    0.000000] TSC: PIT calibration matches HPET. 1 loops
 [    1.336163] Refined TSC clocksource calibration: 2393.976 MHz.

 I could be wrong, but that suggests to me that the Linux kernel
 noticed the TSC oddness and somehow compensated. Perhaps a look at
 what they're doing is in order.
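
For a sense of scale, the arithmetic below compares the frequency NetBSD reported for the TSC with the figure Linux calibrated on the same machine. It is a back-of-the-envelope sketch in Python that assumes the Linux value is close to the true rate and that elapsed time is computed as tick count divided by the assumed frequency; the function names are illustrative.

```python
REPORTED_TSC_HZ = 67_750_617_510  # NetBSD, from kern.timecounter.choice
LINUX_TSC_HZ = 2_393_976_000      # Linux "Refined TSC clocksource calibration"


def slowdown_factor(assumed_hz, actual_hz):
    """How many times slower the clock runs if assumed_hz is used for actual_hz."""
    return assumed_hz / actual_hz


def clock_seconds_per_real_minute(assumed_hz, actual_hz):
    """Seconds the system clock advances during 60 real seconds."""
    return 60.0 / slowdown_factor(assumed_hz, actual_hz)


factor = slowdown_factor(REPORTED_TSC_HZ, LINUX_TSC_HZ)
seconds = clock_seconds_per_real_minute(REPORTED_TSC_HZ, LINUX_TSC_HZ)
print(f"slowdown factor: {factor:.1f}")
print(f"clock seconds per real minute: {seconds:.1f}")
```

Under those assumptions the clock would run about 28 times too slowly, or roughly 2 seconds of clock time per real minute. That is the same order of magnitude as the "about 1 second per about a minute" estimated in the original description.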
 >
 > >
 > >  He also suggested trying
 > >
 > >  sysctl -w kern.timecounter.hardware=hpet0
 > >  >
 > >  > and see if that fixes it.
 > >  >
 > >
 > >  It does:
 > > [...]
 > >  However, after doing this, the system behaved in odd ways. I had trouble
 > >  shutting X down, the system seemed not to be hearing the (USB) keyboard (I
 > >  couldn't log in after doing ctrl-alt-f2). I finally got it shut down and
 > >  rebooted, ending this experiment.
 > >
 > >  I then decided to try a newer kernel, and installed the kernel from the
 > >  11/17 snapshot. Upon booting the system this morning (with all the file
 > >  systems, including /var, mounted async,noatime), I quickly observed the
 > >  clock problem.
 >
 > You should try setting
 > kern.timecounter.hardware=hpet0
 > in /etc/sysctl.conf, so that the change is applied at boot.
 > The odd behavior you see may be caused by processes started before
 > the timecounter change.

 That makes sense.

 I've made the change you suggested to /etc/sysctl.conf and brought the
 system back up. I will use it for my work today and keep an eye on the
 time-keeping. So far it's working fine.

 Thanks for your help.

 /Don

 >
 > You can also try other available timecounters (try ichlpcib0 or i8254
 > for example).
 >
 > --
 > Manuel Bouyer <bouyer@antioche.eu.org>
 >     NetBSD: 26 ans d'experience feront toujours la difference
 > --

From: Donald Allen <donaldcallen@gmail.com>
To: Manuel Bouyer <bouyer@antioche.eu.org>
Cc: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org, gnats-admin@netbsd.org, 
	netbsd-bugs@netbsd.org
Subject: Re: kern/45626: System time does not advance correctly when noatime
 is specified for /var
Date: Fri, 18 Nov 2011 13:47:41 -0500

 On Fri, Nov 18, 2011 at 10:06 AM, Donald Allen <donaldcallen@gmail.com> wrote:
 > On Fri, Nov 18, 2011 at 9:33 AM, Manuel Bouyer <bouyer@antioche.eu.org> wrote:
 >>
 >> On Fri, Nov 18, 2011 at 01:25:03PM +0000, Donald Allen wrote:
 >> >  [...]
 >> >  sysctl -a | grep kern.timecounter
 >> >
 >> >  which I did:
 >> >
 >> >  kern.timecounter.choice = TSC(q=3000, f=67750617510 Hz) clockinterrupt(q=0,
 >> >  f=100 Hz) ichlpcib0(q=1000, f=3579545 Hz) hpet0(q=2000, f=14318179 Hz)
 >> >  ACPI-Fast(q=1000, f=3579545 Hz) lapic(q=-100, f=266097187 Hz) i8254(q=100,
 >> >  f=1193182 Hz) dummy(q=-1000000, f=1000000 Hz)
 >> >  kern.timecounter.hardware = TSC
 >> >  kern.timecounter.timestepwarnings = 0
 >>
 >> TSC is clearly wrong here, I'm sure you don't have a 67 GHz CPU :)
 >
 > That's certainly true.
 >
 > I've got NetBSD installed next to Slackware Linux on this machine,
 > running the 3.0.8 kernel from kernel.org. I've had no problem with
 > time-keeping in Linux on the machine, so out of curiosity, I brought
 > Linux up and had a look at the dmesg:
 >
 > dca@salome:~$ sudo dmesg | fgrep TSC
 > [    0.000000] Fast TSC calibration failed
 > [    0.000000] TSC: PIT calibration matches HPET. 1 loops
 > [    1.336163] Refined TSC clocksource calibration: 2393.976 MHz.
 >
 > I could be wrong, but that suggests to me that the Linux kernel
 > noticed the TSC oddness and somehow compensated. Perhaps a look at
 > what they're doing is in order.
 >>
 >> >
 >> >  He also suggested trying
 >> >
 >> >  sysctl -w kern.timecounter.hardware=hpet0
 >> >  >
 >> >  > and see if that fixes it.
 >> >  >
 >> >
 >> >  It does:
 >> > [...]
 >> >  However, after doing this, the system behaved in odd ways. I had trouble
 >> >  shutting X down, the system seemed not to be hearing the (USB) keyboard (I
 >> >  couldn't log in after doing ctrl-alt-f2). I finally got it shut down and
 >> >  rebooted, ending this experiment.
 >> >
 >> >  I then decided to try a newer kernel, and installed the kernel from the
 >> >  11/17 snapshot. Upon booting the system this morning (with all the file
 >> >  systems, including /var, mounted async,noatime), I quickly observed the
 >> >  clock problem.
 >>
 >> You should try setting
 >> kern.timecounter.hardware=hpet0
 >> in /etc/sysctl.conf, so that the change is applied at boot.
 >> The odd behavior you see may be caused by processes started before
 >> the timecounter change.
 >
 > That makes sense.
 >
 > I've made the change you suggested to /etc/sysctl.conf and brought the
 > system back up. I will use it for my work today and keep an eye on the
 > time-keeping. So far it's working fine.

 The system has been up almost four hours and time-keeping is working
 fine, so your suggestion to switch to another source of ticks has
 fixed my problem.

 I'd also like to mention that I've had this machine for a few years
 now, and, in addition to Linux, I have tried OpenBSD and FreeBSD on it
 (both of which were abandoned for good reasons, which I won't go into
 here). I had no problems with time-keeping with them, either. So
 you've got a couple of BSD references, in addition to Linux, for ideas
 on working around the TSC issue.

 Again, thanks very much for your help.

 /Don

 >
 > Thanks for your help.
 >
 > /Don
 >
 >>
 >> You can also try other available timecounters (try ichlpcib0 or i8254
 >> for example).
 >>
 >> --
 >> Manuel Bouyer <bouyer@antioche.eu.org>
 >>     NetBSD: 26 ans d'experience feront toujours la difference
 >> --
 >

From: christos@zoulas.com (Christos Zoulas)
To: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org, 
	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, donaldcallen@gmail.com
Cc: 
Subject: Re: kern/45626: System time does not advance correctly when noatime is specified for /var
Date: Fri, 18 Nov 2011 14:13:11 -0500

 On Nov 18,  6:50pm, donaldcallen@gmail.com (Donald Allen) wrote:
 -- Subject: Re: kern/45626: System time does not advance correctly when noati

 |  I'd also like to mention that I've had this machine for a few years
 |  now, and, in addition to Linux, I have tried OpenBSD and FreeBSD on it
 |  (both of which were abandoned for good reasons, which I won't go into
 |  here). I had no problems with time-keeping with them, either. So
 |  you've got a couple of BSD references, in addition to Linux, for ideas
 |  on working around the TSC issue.
 |  
 |  Again, thanks very much for your help.

 This might help:

 christos

 Index: tsc.c
 ===================================================================
 RCS file: /cvsroot/src/sys/arch/x86/x86/tsc.c,v
 retrieving revision 1.30
 diff -u -u -r1.30 tsc.c
 --- tsc.c	8 Aug 2011 17:00:23 -0000	1.30
 +++ tsc.c	18 Nov 2011 19:11:53 -0000
 @@ -141,6 +141,11 @@
  		    (long long)tsc_drift_observed);
  		tsc_timecounter.tc_quality = -100;
  		safe = false;
 +	} else if (tsc_freq > 16ULL * 1024 * 1024 * 1024) {
 +		aprint_error("ERROR: TSC reported %llu Hz; frequency too high\n",
 +		    (unsigned long long)tsc_freq);
 +		tsc_timecounter.tc_quality = -100;
 +		safe = false;
  	}

  	if (tsc_freq != 0) {
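
A small user-space model of what the patch is meant to achieve, written in Python for illustration. It assumes the kernel prefers the timecounter with the highest quality value, reuses the quality figures from the sysctl output earlier in this report, and uses hypothetical helper names; it is not the kernel's selection code.

```python
# Same constant as the patch: 16ULL * 1024 * 1024 * 1024 Hz.
TSC_FREQ_LIMIT_HZ = 16 * 1024 * 1024 * 1024

# Quality values from kern.timecounter.choice in this report.
QUALITIES = {
    "TSC": 3000,
    "clockinterrupt": 0,
    "ichlpcib0": 1000,
    "hpet0": 2000,
    "ACPI-Fast": 1000,
    "lapic": -100,
    "i8254": 100,
    "dummy": -1000000,
}


def apply_patch_check(qualities, tsc_freq_hz):
    """Return a copy of qualities with TSC demoted if its frequency is too high."""
    adjusted = dict(qualities)
    if tsc_freq_hz > TSC_FREQ_LIMIT_HZ:
        adjusted["TSC"] = -100  # mirrors tc_quality = -100 in the patch
    return adjusted


def highest_quality(qualities):
    """Name of the counter with the largest quality value (assumed selection rule)."""
    return max(qualities, key=qualities.get)


print(highest_quality(apply_patch_check(QUALITIES, 67_750_617_510)))
print(highest_quality(apply_patch_check(QUALITIES, 2_393_976_000)))
```

The limit works out to 17,179,869,184 Hz. With the 67,750,617,510 Hz reported here, TSC drops to -100 and hpet0 (q=2000) becomes the highest-quality entry; with the 2,393,976,000 Hz that Linux measured, the check does not fire and TSC stays on top.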

From: Simon Nicolussi <sinic@sinic.name>
To: gnats-bugs@NetBSD.org
Cc: Bernd Ernesti <netbsd@lists.veego.de>,
	Manuel Bouyer <bouyer@antioche.eu.org>,
	Christos Zoulas <christos@zoulas.com>
Subject: Re: kern/45626
Date: Wed, 25 Jun 2014 00:52:27 +0200

 It has been a while since this report has last been touched, but maybe
 someone's still interested in my findings:

 I was seeing the same issue on an IBM ThinkCentre and didn't spot any
 obvious errors in the code where the TSC frequency is determined, but
 noticed that i8254_delay is sometimes taking some fraction of a second
 longer than it should take (as reflected in the TSC), despite waiting
 the correct number of i8254 ticks.

 SMM is the first thing that comes to mind as the reason for such weird
 behaviour. If a system receives SMIs right when executing i8254_delay,
 it might cause the two counters (i8254 and TSC) to diverge. One common
 application of SMM is USB keyboard support for real-mode operation [1]
 (e.g., for the BIOS). I'm fortunate enough to have a knob for that in
 my BIOS, so I disabled USB support there and, lo and behold, the bug
 no longer occurs.
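
The mechanism described above can be sketched numerically. The model below assumes the calibration divides the TSC ticks observed across an i8254-timed delay by the nominal delay length, so any extra time the CPU spends stalled (for example in SMM) while the TSC keeps counting inflates the estimate. The 10 ms nominal delay is a placeholder rather than the value NetBSD uses, and the frequencies are the ones from the original reporter's machine, where this cause was never confirmed.

```python
TRUE_HZ = 2_393_976_000       # Linux's calibrated TSC rate from this report
REPORTED_HZ = 67_750_617_510  # rate NetBSD reported for the TSC
NOMINAL_DELAY_S = 0.010       # hypothetical calibration delay, for illustration


def estimated_frequency(true_hz, nominal_delay_s, extra_stall_s):
    """Frequency estimate when the delay really lasts nominal + extra seconds."""
    tsc_ticks = true_hz * (nominal_delay_s + extra_stall_s)
    return tsc_ticks / nominal_delay_s


def stall_for_estimate(true_hz, estimated_hz, nominal_delay_s):
    """Extra stall time that would turn true_hz into estimated_hz."""
    return nominal_delay_s * (estimated_hz / true_hz - 1.0)


stall = stall_for_estimate(TRUE_HZ, REPORTED_HZ, NOMINAL_DELAY_S)
print(f"extra stall needed: {stall * 1000:.0f} ms")
print(f"estimate with that stall: {estimated_frequency(TRUE_HZ, NOMINAL_DELAY_S, stall):.0f} Hz")
```

With these placeholder numbers, a stall of about 273 ms during a 10 ms delay would be enough to produce the reported figure. The point is only that a stall of a fraction of a second can inflate the estimate by an order of magnitude when the calibration window is short.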

 [1] http://blogs.msdn.com/b/carmencr/archive/2005/09/01/459194.aspx

 -- 
 Simon Nicolussi, <sinic@sinic.name>
 http://www.sinic.name/

Copyright © 1994-2007 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.