NetBSD Problem Report #56842

From www@netbsd.org  Mon May 16 18:40:44 2022
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 082CA1A921F
	for <gnats-bugs@gnats.NetBSD.org>; Mon, 16 May 2022 18:40:44 +0000 (UTC)
Message-Id: <20220516184042.5AF2C1A923A@mollari.NetBSD.org>
Date: Mon, 16 May 2022 18:40:42 +0000 (UTC)
From: jspath55@gmail.com
Reply-To: jspath55@gmail.com
To: gnats-bugs@NetBSD.org
Subject: Cron hangs on Raspberry Pi Zero 2W
X-Send-Pr-Version: www-1.0

>Number:         56842
>Category:       port-arm
>Synopsis:       Cron hangs on Raspberry Pi Zero 2W
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    port-arm-maintainer
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon May 16 18:45:00 +0000 2022
>Closed-Date:    Fri Jul 22 18:14:18 +0000 2022
>Last-Modified:  Fri Jul 22 18:14:18 +0000 2022
>Originator:     Jim Spath
>Release:        NetBSD 9.2_STABLE
>Organization:
>Environment:
System: NetBSD n0b 9.2_STABLE NetBSD 9.2_STABLE (GENERIC) #0: Mon Apr 25 12:39:27 UTC 2022 mkrepro@mkrepro.NetBSD.org:/usr/src/sys/arch/evbarm/compile/GENERIC evbarm

>Description:
New install of NetBSD 9.2 (stable) on a Raspberry Pi Zero 2W.
Followed install from: https://mail-index.netbsd.org/port-arm/2022/02/14/msg007592.html

I have a shell script that should run every minute. After several days of uptime, I noticed the cron commands were not being processed.

I restarted cron using its /etc/rc.d script, which resumed processing for a little while.

I can find no obvious error messages in /var/log.

If I run a command to show a crontab listing, that works and the attempt is logged:

= = =
$ crontab -u _httpd -l

#
[...]

$ tail /var/log/cron
[...]
May 15 09:21:00 n0b cron[17560]: (_httpd) CMD FINISH (/usr/local/www/bin/graph-cputemp.sh >>/usr/local/www/logs/crontab-graph.log 2>>/usr/local/www/logs/crontab-graph.err)
May 16 18:25:30 n0b crontab[9766]: (root) LIST (_httpd)
= = =
ls -l  /usr/local/www/logs/crontab-graph.???
-rw-r--r--  1 _httpd  _httpd     684 Apr 28 00:40 /usr/local/www/logs/crontab-graph.err
-rw-r--r--  1 _httpd  _httpd  695088 May 15 09:21 /usr/local/www/logs/crontab-graph.log
= = =
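
The stall above was only noticed because crontab-graph.log stopped changing. A minimal sketch of a freshness check follows; it assumes find(1) on the system supports the -mmin primary, and the helper name is_stale is hypothetical. It cannot be driven from cron itself while cron is the thing that is stuck, so it would have to be run by hand or from some other scheduler.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) when FILE is missing or has not
# been modified for more than MINUTES minutes.
# Assumption: find(1) supports the -mmin primary on this system.
is_stale() {
    file=$1
    minutes=$2
    # A missing file is treated as stale.
    [ -e "$file" ] || return 0
    # find prints the path only when its mtime is older than the limit.
    [ -n "$(find "$file" -mmin +"$minutes" 2>/dev/null)" ]
}

# Example use (log path taken from the report):
#   is_stale /usr/local/www/logs/crontab-graph.log 5 &&
#       echo "cron job output is stale"
```

For a job that runs every minute, a limit of a few minutes leaves room for a slow run without hiding a real stall.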


The system has a USB Ethernet adapter connected, and is otherwise a stock Pi Zero 2W.

The logrotate pkg is set up in cron also:

# Thu Apr 28 00:16:45 UTC 2022
0       0       *       *       *       /usr/pkg/sbin/logrotate /usr/pkg/etc/logrotate.conf

No email from daily root cron jobs since May 4, 2022.


The dmesg output is replicated here:

https://jspath55.blogspot.com/2022/04/raspberry-pi-zero-2-w-netbsd-dmesg-text.html




>How-To-Repeat:
Unsure how to repeat elsewhere.
Issue recurred after a reboot.
>Fix:
Unknown.

>Release-Note:

>Audit-Trail:
From: "David H. Gutteridge" <david@gutteridge.ca>
To: Gnats Bugs <gnats-bugs@netbsd.org>
Cc: 
Subject: Re: port-arm/56842: Cron hangs on Raspberry Pi Zero 2W
Date: Thu, 19 May 2022 21:57:05 -0400

 Hello,

 FWIW, I haven't seen this issue running older NetBSD releases
 (presently 8.0_STABLE) on an "old" Raspberry Pi B+. Granted, I don't
 run anything every minute. That machine has been running for years
 without a hiccup (other than power outages). I could try upgrading it
 to 9.2_STABLE and see if I can replicate this. A few thoughts off the
 top of my head follow.

 If you look at the system after the point cron seems to have stopped
 working, what does ps(1) tell you about the state of cron? What's the
 system load at the time? (I'm assuming you have a standard CRON_WITHIN
 value like 7200 and the machine isn't under an incredibly high load all
 the time, as that seems very unlikely.)

 You might try enabling extra debugging information with the -x option.
 Have you found any core files from cron?

 What happens if you disable particular cron entries, like the script
 meant to run every minute?

 Regards,

 Dave

From: Jim Spath <jspath55@gmail.com>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-arm/56842
Date: Fri, 20 May 2022 09:20:51 -0400

 --000000000000aaee7005df715c67
 Content-Type: text/plain; charset="UTF-8"
 Content-Transfer-Encoding: quoted-printable

 Dave:

 Thank you for the feedback. I have NetBSD running also on a Pi3 and a Pi4;
 this is the first time getting a Zero 2W working. The other systems are
 running current:

 NetBSD [pi3] 9.99.82 NetBSD 9.99.82 (GENERIC64) #0: Tue Apr 27 05:40:29 UTC
 2021 mkrepro@mkrepro.NetBSD.org:/usr/src/sys/arch/evbarm/compile/GENERIC64
 evbarm

 NetBSD [pi4] 9.99.93 NetBSD 9.99.93 (GENERIC64) #0: Sun Jan 2 23:46:21 UTC
 2022 mkrepro@mkrepro.NetBSD.org:/usr/src/sys/arch/evbarm/compile/GENERIC64
 evbarm

 Neither of those, nor an earlier 9.2 system, has shown cron hangs; all of
 them run the identical script.

 Your questions:

 - what does ps(1) tell you about the state of cron?

 Nothing useful to me yet; but see below for results from top.


 USER      PID %CPU %MEM   VSZ   RSS TTY   STAT STARTED    TIME COMMAND
 root    18233  0.0  0.4  6728  1824 ?     Ss   Mon05PM 0:00.00 /usr/sbin/cron

 - You might try enabling extra debugging information with the -x option.

 I tried one iteration with debug flags and captured logs but saw nothing
 useful there.

 - What happens if you disable particular cron entries, like the script
 meant to run every minute?

 I will try lowering the frequency, after doing a reboot and seeing if/when
 the issue recurs. It seems this might be a “slow leak” that will take
 patience to track.

 I investigated further and found hangs on both top and vmstat, at varying
 times.

 For vmstat, the first line (summary) is returned, but then nothing:

 n0b:jim> date ; vmstat 1 10
 Tue May 17 13:09:14 UTC 2022
  procs    memory      page                       disks   faults      cpu
  r b      avm    fre  flt  re  pi   po   fr   sr l0 n0   in   sy  cs us sy id
  1 0   304608  88784   23   0   0    0    0    0  0  0 8882   44  14  0  1 99
 ^C
 n0b:jim> date
 Tue May 17 13:09:45 UTC 2022

 That stall is inconsistent though, as the results today are nominal:

 n0b:jim> date
 Fri May 20 12:58:26 UTC 2022
 n0b:jim> vmstat 1 3
  procs    memory      page                       disks   faults      cpu
  r b      avm    fre  flt  re  pi   po   fr   sr l0 n0   in   sy  cs us sy id
  1 0   310320  82568   22   0   0    0    0    0  0  0 8870   43  13  0  1 99
  0 0   310320  82568    0   0   0    0    0    0  0  0 8826   32  11  0  1 99
  0 0   310320  82568    0   0   0    0    0    0  0  0 8902   30  10  0  1 99
 n0b:jim> date
 Fri May 20 12:58:36 UTC 2022
 n0b:jim>

 The top command starts up, displays some data, but then does not refresh.
 The data are incomplete (values are all 0):

 load averages: 0.01, 0.02, 0.00; up 11+21:37:48 13:06:53
 46 processes: 44 sleeping, 2 on CPU
 CPU0 states: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 0.0% idle
 CPU1 states: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 0.0% idle
 CPU2 states: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 0.0% idle
 CPU3 states: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 0.0% idle
 Memory: 298M Act, 104K Inact, 12M Wired, 15M Exec, 259M File, 86M Free
 Swap:


 Like vmstat, top worked later (except one core shows all zeroes).

 load averages: 0.01, 0.02, 0.00; up 14+21:32:34 13:01:39
 50 processes: 48 sleeping, 2 on CPU
 CPU0 states: 0.0% user, 0.0% nice, 0.0% system, 1.6% interrupt, 98.4% idle
 CPU1 states: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
 CPU2 states: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 0.0% idle
 CPU3 states: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
 Memory: 303M Act, 96K Inact, 12M Wired, 15M Exec, 262M File, 80M Free
 Swap:

 However, cron commands have not run since.

 My next steps will be:

 1. Reboot, taking note of initial state

 2. Try adding a swap device (have seen some odd Pi behavior with 0 swap)

 3. Decrease the cron job frequency


 Jim

 --000000000000aaee7005df715c67--
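
The vmstat and top stalls described in the message above were caught by watching a terminal. A rough sketch of a probe that bounds how long a diagnostic command may run is given below. The helper name run_with_timeout is hypothetical, and the approach assumes the stalled process still responds to SIGTERM (the ^C in the vmstat transcript suggests signals were being delivered, but that is not guaranteed for every hang).

```shell
#!/bin/sh
# Hypothetical helper: run a command, but send it SIGTERM if it has not
# finished after LIMIT seconds. Returns the command's exit status
# (typically 143 when the watchdog had to terminate it).
run_with_timeout() {
    limit=$1
    shift
    "$@" &
    cmd_pid=$!
    # Watchdog runs with its output detached so it cannot hold a pipe open.
    ( sleep "$limit"; kill -TERM "$cmd_pid" 2>/dev/null ) >/dev/null 2>&1 &
    watchdog_pid=$!
    status=0
    wait "$cmd_pid" || status=$?
    # The command is done (or was killed); the watchdog is no longer needed.
    kill -TERM "$watchdog_pid" 2>/dev/null || true
    wait "$watchdog_pid" 2>/dev/null || true
    return "$status"
}

# Example: sample vmstat and note the time whenever it stalls or fails.
#   run_with_timeout 15 vmstat 1 3 || echo "$(date): vmstat stalled or failed"
```

Logging the time of each failed probe would show whether the stalls line up with the point at which cron stops running jobs.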

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-arm/56842
Date: Fri, 20 May 2022 23:21:08 +0000

 On Fri, May 20, 2022 at 01:25:01PM +0000, Jim Spath wrote:
  >  USER      PID %CPU %MEM   VSZ   RSS TTY   STAT STARTED    TIME COMMAND
  >  
  >  root    18233  0.0  0.4  6728  1824 ?     Ss   Mon05PM 0:00.00
  >  /usr/sbin/cron

 ps -l might be interesting (it prints the WCHAN) but more likely not.
 If other programs have similar problems, it's more likely not cron
 itself.

 I have no idea what the situation with timecounters on this hw is, but
 if there's more than one option it might be interesting to try a
 different one. (And, relatedly: is the system time behaving normally?)

 -- 
 David A. Holland
 dholland@netbsd.org
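
To follow up on the ps -l suggestion above, the wait channel can be pulled out of the long listing for just the processes of interest. This is a minimal sketch: the helper name show_wchan is hypothetical, and it assumes the listing has a header line naming the PID, WCHAN and COMMAND columns, with whitespace-separated fields and no empty columns.

```shell
#!/bin/sh
# Hypothetical helper: read "ps -axl" style output on stdin and print
# "PID WCHAN COMMAND" for rows whose command contains PATTERN.
# Column positions are looked up in the header line rather than hard-coded.
show_wchan() {
    awk -v pat="$1" '
        NR == 1 {
            for (i = 1; i <= NF; i++) {
                if ($i == "PID") pid = i
                if ($i == "WCHAN") wchan = i
                if ($i == "COMMAND") cmd = i
            }
            next
        }
        # Only print once all three columns were found in the header.
        pid && wchan && cmd && index($cmd, pat) { print $pid, $wchan, $cmd }
    '
}

# Example use:
#   ps -axl | show_wchan cron
```

Capturing this at intervals, before and after cron stops, would make it easier to compare the states of cron and of the children it started.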

From: "David H. Gutteridge" <david@gutteridge.ca>
To: Gnats Bugs <gnats-bugs@netbsd.org>
Cc: 
Subject: Re: port-arm/56842: Cron hangs on Raspberry Pi Zero 2W
Date: Sat, 21 May 2022 19:22:44 -0400

 It seems there's a general issue with both interactive and daemonized
 processes not running as expected. Another thing to ask, then: have you
 tried a -current kernel to see if there's any difference?

 Dave

From: Jim Spath <jspath55@gmail.com>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-arm/56842
Date: Mon, 23 May 2022 13:17:50 -0400

 Thank you for the tips. Answering several:

 > ps -l

 While I don't see anything obvious in the output from "ps -l" I did
 observe several processes that cron started. These should have
 finished quickly but I can't tell immediately why they stalled. It
 does give me ideas for other tests from cron. I've captured the output
 in a log to compare with future states.

 > timecounters on this hw?

 I don't understand this suggestion, sorry. I did find a fascinating
 page from 2006 on porting NetBSD to a new ARM SoC:

 https://www.netbsd.org/docs/kernel/porting_netbsd_arm_soc.html

 Most of those details are beyond my ken, however.
 The dmesg output shows:

 [     1.000000] timecounter: Timecounters tick every 10.000 msec
 [     1.000000] timecounter: Timecounter "armgtmr0" frequency 19200000 Hz quality 500
 [     1.000003] timecounter: Timecounter "clockinterrupt" frequency 100 Hz quality 0

 My Pi3 running NetBSD shows the same values.

 > system time?

 The ntp daemon looked normal, but then I saw this:

 May  3 11:32:01 n0b ntpd[436]: kernel reports TIME_ERROR: 0x41: Clock Unsynchronized

 After reboot, that message reappeared, as well as similar messages
 that I didn't spot before. On an unrelated note, I wish I could find a
 way to stop ntpd from seeking IPV6 hosts, as my ISP doesn't support
 that path. Just wastes time not getting responses.

 The ntpdate output seems OK.
 $ ntpdate 2.netbsd.pool.ntp.org
 23 May 17:07:03 ntpdate[497]: adjust time server 192.227.183.3 offset -0.014973 sec

 > newer kernel?

 The image I'm running is the first NetBSD version I've found that will
 run on the 02W. I will search for the steps to install a current
 build, on different media so I can preserve the (mostly) working
 install.

 Thank you for the suggestions. I rebooted the system today. Alas, it
 hit errors that required fsck to resolve as it did not halt cleanly
 after a shutdown request, necessitating a power-off. A partial list of
 recovered files:
 - /var/db/entropy-file
 - /var/log/cron

 I found the _httpd user crontab file was corrupted, so I reinstalled
 that. The /var/log/cron file was removed by fsck cleanup and I reset
 that also. I think the entropy-file self-corrected after reboots.

 I had not seen this before (I am unsure whether it had not yet been flushed
 to disk or I simply overlooked it):

 # ls -l /var/cron
 -rw-------  1 root  wheel  415424 May 15 09:02 cron.core

 I will report in a couple days one way or the other. The cron jobs are
 running now.
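
The timecounter question in the message above comes down to which counters the kernel found and how it ranks them. A small sketch that extracts this from dmesg output is shown below; the helper name list_timecounters is hypothetical, and it assumes each timecounter is reported on a single unwrapped line in the format quoted above. As far as I know, the running system also exposes the candidates and the active choice through the kern.timecounter.choice and kern.timecounter.hardware sysctl variables, which is where a different counter would be selected.

```shell
#!/bin/sh
# Hypothetical helper: read dmesg text on stdin and print one line per
# timecounter as "name frequency_hz quality", highest quality first.
# Assumes lines of the form:
#   ... timecounter: Timecounter "NAME" frequency N Hz quality Q
list_timecounters() {
    awk -F'"' '/timecounter: Timecounter "/ {
        # $2 is the quoted name; $3 holds " frequency N Hz quality Q".
        split($3, f, " ")
        print $2, f[2], f[5]
    }' | sort -k3,3nr
}

# Example use:
#   dmesg | list_timecounters
```

Running the same helper on the Pi3 and the Zero 2W would give a quick side-by-side check that both boards really do offer the same counters.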

From: Jim Spath <jspath55@gmail.com>
To: gnats-bugs@netbsd.org
Cc: 
Subject: re: port-arm/56842
Date: Thu, 9 Jun 2022 09:42:42 -0400

 My original install SD card is now corrupt, or at least suspect: a power
 cycle required an fsck, which then failed to recover some files.

 Looking back, I found errors related to the USB/ethernet adapter:

 May  3 10:50:38 n0b /netbsd: [ 58744.3697748] ure0: autoconfiguration error: watchdog timeout
 May  3 10:50:48 n0b /netbsd: [ 58754.3704177] ure0: autoconfiguration error: usb error on tx: TIMEOUT
 ...
 May  5 15:15:10 n0b dhcpcd[153]: ure0: dhcp_sendudp: Host is down
 May  5 15:15:10 n0b dhcpcd[153]: ure0: bpf_send: No buffer space available
 ...
 May 27 00:58:39 n0b dhcpcd[153]: ure0: bpf_send: No buffer space available
 ...
 Jun  4 21:25:03 n0b dhcpcd[260]: ure0: bpf_send: No buffer space available

 The first adapter:
 Jun  4 21:27:21 n0b /netbsd: [   7.0539144] ure0: Realtek (0xbda) USB 10/100/1000 LAN (0x8153), rev 2.10/31.00, addr 4

 The second adapter:
 Jun  4 23:16:43 n0b /netbsd: [ 6554.5805440] ure0: Realtek (0xbda) USB 10/100 LAN (0x8152), rev 2.10/20.00, addr 4

 I installed the same 9.2 stable image on a second Pi, with the same
 10/100 LAN adapter model. So far, no cron hangs on either. I will let
 the new system run for a few more days, then switch to the suspect
 adapter and see if the problem recurs. It seems feasible that network
 driver errors could be a root cause.

 Strangely, both adapter models say "Gigabit LAN" on the case, but the
 one that connected at 1000BT had errors, while the 100BT connection
 does not. And, the model that only connected at 100 works at 1000 on a
 PC.

 (apologies for missending this to gnats-admin first)
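
The ure0 messages quoted in the message above were found by reading back through the logs. A minimal sketch for tallying them is given below, so that the two adapters can be compared by how often each kind of error appears. The helper name summarize_if_errors is hypothetical, and it assumes one syslog entry per line with the interface name followed by a colon and a space, as in the excerpts above.

```shell
#!/bin/sh
# Hypothetical helper: read syslog lines on stdin and count the distinct
# messages logged for interface IFNAME, most frequent first.
# Output format: count<TAB>message
summarize_if_errors() {
    awk -v ifname="$1" '
        {
            tag = ifname ": "
            i = index($0, tag)
            if (i == 0) next
            # Everything after "IFNAME: " is treated as the message text.
            count[substr($0, i + length(tag))]++
        }
        END { for (msg in count) printf "%d\t%s\n", count[msg], msg }
    ' | sort -k1,1nr
}

# Example use (log file path is an assumption):
#   summarize_if_errors ure0 < /var/log/messages
```

Attach messages such as the "Realtek (0xbda)" lines are counted as well, which conveniently shows how many times the adapter was re-detected.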

From: Jim Spath <jspath55@gmail.com>
To: gnats-bugs@netbsd.org
Cc: 
Subject: re: port-arm/56842
Date: Thu, 21 Jul 2022 18:05:27 -0400

 I would like to update this problem report. I have been unable to
 reproduce the original issue with cron jobs. The tests I have done
 with a second Pi and 2 different types of ethernet adapters lead me
 to believe the root cause is one of the adapters not behaving
 properly. The correctly working adapter runs at 1000BT, while the
 suspect adapter drops to 100BT, and sometimes stops working
 altogether, showing various symptoms. If I can isolate this adapter
 issue further I will open a new PR.
 Thank you to those who gave me feedback.

State-Changed-From-To: open->closed
State-Changed-By: gutteridge@NetBSD.org
State-Changed-When: Fri, 22 Jul 2022 18:14:18 +0000
State-Changed-Why:
Closing ticket, per submitter. Thanks for your efforts investigating this!

>Unformatted:

Copyright © 1994-2020 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.