NetBSD Problem Report #13399

Received: (qmail 3204 invoked from network); 7 Jul 2001 04:22:02 -0000
Message-Id: <200107070425.f674P3i00381@tardis.local>
Date: Sat, 7 Jul 2001 00:25:03 -0400 (EDT)
From: entropy@tappedin.com
Reply-To: entropy@tappedin.com
To: gnats-bugs@gnats.netbsd.org
Subject: Bugs in post-1.5 memory probe
X-Send-Pr-Version: 3.95

>Number:         13399
>Category:       port-i386
>Synopsis:       Bugs in post-1.5 memory probe
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    port-i386-maintainer
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Jul 07 04:23:01 +0000 2001
>Closed-Date:    Sun Oct 20 10:37:50 +0000 2002
>Last-Modified:  Sat Nov 22 16:35:10 +0000 2008
>Originator:     
>Release:        <NetBSD-current source date>20010706
>Organization:
entropy -- it's not just a good idea, it's the second law.
>Environment:

System: NetBSD tardis.local 1.5W NetBSD 1.5W (TARDIS) #8: Sat Jul 7 00:04:13 EDT 2001 entropy@zippy.local:/usr/src/sys/arch/i386/compile/TARDIS i386
Architecture: i386
Machine: i386
>Description:


When booting post-1.5 kernels on an old machine, I got crashes immediately
after startup.  The following warnings were printed:

>How-To-Repeat:


Try to boot -current on a DECpc XL 590.

>Fix:


The following patch fixes several bugs in the post-1.5 BIOS memory probes.

- If the BIOS reports the same memory cluster multiple times, only allocate
  the extent once.

- If we fail to allocate an extent, don't add it to mem_clusters and don't
  increment mem_cluster_cnt.

- When loading the physical extents, make sure we don't try to add an
  extent with zero length (seg_start == tmp)


Index: machdep.c
===================================================================
RCS file: /cvsroot/syssrc/sys/arch/i386/i386/machdep.c,v
retrieving revision 1.446
diff -u -u -r1.446 machdep.c
--- machdep.c	2001/06/19 15:54:48	1.446
+++ machdep.c	2001/07/07 04:03:48
@@ -2144,6 +2144,7 @@
 	u_int32_t type;
 {
 	extern struct extent *iomem_ex;
+	int i;

 	if (seg_end > 0x100000000ULL) {
 		printf("WARNING: skipping large "
@@ -2165,6 +2166,17 @@
 	if (seg_end <= seg_start)
 		return;

+	for (i = 0; i < mem_cluster_cnt; i++) {
+		if ((mem_clusters[i].start == round_page(seg_start))
+		    && (mem_clusters[i].size
+			    == trunc_page(seg_end) - mem_clusters[i].start)) {
+#ifdef DEBUG_MEMLOAD
+			printf("WARNING: skipping duplicate segment entry\n");
+#endif
+			return;
+		}
+	}
+
 	/*
 	 * Allocate the physical addresses used by RAM
 	 * from the iomem extent map.  This is done before
@@ -2179,6 +2191,7 @@
 		    "(0x%qx/0x%qx/0x%x) FROM "
 		    "IOMEM EXTENT MAP!\n",
 		    seg_start, seg_end - seg_start, type);
+		return;
 	}

 	/*
@@ -2331,6 +2344,10 @@
 	}
 #endif /* ! REALBASEMEM && ! REALEXTMEM */

+#ifdef DEBUG_MEMLOAD
+	printf("mem_cluster_count: %d\n", mem_cluster_cnt);
+#endif
+
 	/*
 	 * If the loop above didn't find any valid segment, fall back to
 	 * former code.
@@ -2449,14 +2466,17 @@
 					tmp = (16 * 1024 * 1024);
 				else
 					tmp = seg_end;
+
+				if (tmp != seg_start) {
 #if DEBUG_MEMLOAD
-				printf("loading 0x%qx-0x%qx (0x%lx-0x%lx)\n",
+				  printf("loading 0x%qx-0x%qx (0x%lx-0x%lx)\n",
 				    seg_start, tmp,
 				    atop(seg_start), atop(tmp));
 #endif
-				uvm_page_physload(atop(seg_start),
+				  uvm_page_physload(atop(seg_start),
 				    atop(tmp), atop(seg_start),
 				    atop(tmp), first16q);
+				}
 				seg_start = tmp;
 			}

@@ -2482,14 +2502,16 @@
 					tmp = (16 * 1024 * 1024);
 				else
 					tmp = seg_end1;
+				if (tmp != seg_start1) {
 #if DEBUG_MEMLOAD
-				printf("loading 0x%qx-0x%qx (0x%lx-0x%lx)\n",
+				  printf("loading 0x%qx-0x%qx (0x%lx-0x%lx)\n",
 				    seg_start1, tmp,
 				    atop(seg_start1), atop(tmp));
 #endif
-				uvm_page_physload(atop(seg_start1),
+				  uvm_page_physload(atop(seg_start1),
 				    atop(tmp), atop(seg_start1),
 				    atop(tmp), first16q);
+				}
 				seg_start1 = tmp;
 			}

>Release-Note:
>Audit-Trail:

From: Masanori Kanaoka <kanaoka@ann.hi-ho.ne.jp>
To: entropy@tappedin.com
Cc: gnats-bugs@gnats.netbsd.org, kanaoka@ann.hi-ho.ne.jp
Subject: Re: port-i386/13399
Date: Mon, 09 Jul 2001 21:42:44 +0900 (JST)

 Hi,

 I would like to understand your problem and your fixed patch.

 $ The following patch fixes several bugs in the post-1.5 BIOS memory probes.
 $ 
 $ - If the BIOS reports the same memory cluster multiple times, only allocate
 $   the extent once.
 $ 
 $ - If we fail to allocate an extent, don't add it to mem_clusters and don't
 $  increment mem_cluster_cnt.
 $
 $ - When loading the physical extents, make sure we don't try to add an
 $   extent with zero length (seg_start == tmp)

 At first part,Indeed. I also think so.
 But second part, I think that it does not fail to allocate extent,
 because your first part fixed it.

 And at 3-rd part,I don't understand it.Would you explain about it?

 At last,How part need to fix your problem?
 all part,or some of them.I guess it need first part.

 Would you please teach me about it?
 Regards!
 ---
  Masanori Kanaoka	kanaoka@ann.hi-ho.ne.jp


From: maximum entropy <entropy@tappedin.com>
To: kanaoka@ann.hi-ho.ne.jp
Cc: gnats-bugs@gnats.netbsd.org, kanaoka@ann.hi-ho.ne.jp
Subject: Re: port-i386/13399
Date: Mon, 9 Jul 2001 13:28:48 -0400 (EDT)

 >Recebido-para: <entropy@tappedin.com>
 >Date: Mon, 09 Jul 2001 21:42:44 +0900 (JST)
 >From: Masanori Kanaoka <kanaoka@ann.hi-ho.ne.jp>
 >Cc: gnats-bugs@gnats.netbsd.org, kanaoka@ann.hi-ho.ne.jp
 >Content-type: Text/Plain; charset=us-ascii
 >
 >Hi,
 >
 >I would like to understand your problem and your fixed patch.
 >
 >$ The following patch fixes several bugs in the post-1.5 BIOS memory probes.
 >$ 
 >$ - If the BIOS reports the same memory cluster multiple times, only allocate
 >$   the extent once.
 >$ 
 >$ - If we fail to allocate an extent, don't add it to mem_clusters and don't
 >$  increment mem_cluster_cnt.
 >$
 >$ - When loading the physical extents, make sure we don't try to add an
 >$   extent with zero length (seg_start == tmp)
 >
 >At first part,Indeed. I also think so.
 >But second part, I think that it does not fail to allocate extent,
 >because your first part fixed it.
 >
 >And at 3-rd part,I don't understand it.Would you explain about it?
 >
 >At last,How part need to fix your problem?
 >all part,or some of them.I guess it need first part.
 >
 >Would you please teach me about it?
 >Regards!

 Correct, the second part isn't really needed with the first part in
 place.  But I believe good error handling is always a good idea, and
 I'm certain that if an extent isn't allocated correctly it shouldn't
 become part of the configuration of the running system.  So that part
 of the patch handles that situation more gracefully.  The comments in
 the existing code say:
                 /* XXX What should we do? */
 And I think it's obvious that what we should do in this case is skip
 the segment by returning immediately.

 The third part was really the key to getting this machine running.
 What was happening was that the final segment listed by my BIOS was:
 seg_start = 0x1000000, seg_end = 0x3000000.  So the piece of code
 where it adjusts the segment end would go like this:

                 /* First hunk */
                 if (seg_start != seg_end) {
 /* seg_start == 0x1000000, seg_end = 0x3000000 */
                         if (seg_start <= (16 * 1024 * 1024) &&
 /* seg_start == (16*1024*1024) so this tests true */
                             first16q != VM_FREELIST_DEFAULT) {
 /* Have 48M memory, so first16q = VM_FREELIST_FIRST16 from earlier in
    the code, so this tests true */
                                 u_int64_t tmp;

                                 if (seg_end > (16 * 1024 * 1024))
 /* seg_end == 0x3000000, so this tests true */			
                                         tmp = (16 * 1024 * 1024);
 /* tmp = 0x1000000, which is the same value as seg_start */
                                 else
                                         tmp = seg_end;


 /* Now we come to the trouble...the following test I added to avoid
    the trouble by falling into the next block, which will correctly
    add this segment onto the default freelist.  But without this
    test... */
                                 if (tmp != seg_start) {
 #if DEBUG_MEMLOAD
                                   printf("loading 0x%qx-0x%qx (0x%lx-0x%lx)\n",
                                     seg_start, tmp,
                                     atop(seg_start), atop(tmp));
 #endif
                                   uvm_page_physload(atop(seg_start),
                                     atop(tmp), atop(seg_start),
                                     atop(tmp), first16q);
 /* ...we just called uvm_page_physload(0x100, 0x100, 0x100, 0x100,
    first16q) adding a zero-length page...and that caused the system to
    cerash.  */

                                 }
 /* With the "if (tmp != seg_start)" test added above, we end up here
    instead, and then continue on to physload the 0x1000000 - 0x3000000
    region in the next section of code... */
                                 seg_start = tmp;
                         }
                         if (seg_start != seg_end) {
 /* 0x1000000 != 0x3000000 */
 #if DEBUG_MEMLOAD
                                 printf("loading 0x%qx-0x%qx
 (0x%lx-0x%lx)\n",
                                     seg_start, seg_end,
                                     atop(seg_start), atop(seg_end));
 #endif
                                 uvm_page_physload(atop(seg_start),
                                     atop(seg_end), atop(seg_start),
                                     atop(seg_end), VM_FREELIST_DEFAULT);
 /* ...and all is well. */
                         }
                 }

 Finally, there's an analagous section following that for "Second
 hunk", where I fixed the same problem (by inspection).

 Hope that clears it up...

 Cheers,
 entropy

 --
 entropy -- it's not just a good idea, it's the second law.

From: Masanori Kanaoka <kanaoka@ann.hi-ho.ne.jp>
To: entropy@tappedin.com
Cc: gnats-bugs@gnats.netbsd.org, kanaoka@ann.hi-ho.ne.jp
Subject: Re: port-i386/13399
Date: Tue, 10 Jul 2001 09:35:47 +0900 (JST)

 Hi,
        From: maximum entropy <entropy@tappedin.com>
     Subject: Re: port-i386/13399
       Date : Mon, 09 Jul 2001 13:28:48 -0400 (EDT)

 $ The third part was really the key to getting this machine running.

 Thank you for explaining the third part.
 I understood it.
 I think third part is really need patch for -current.

 By the way,I think about the first part of your patch.
 I enumerate two memory cluster relation that BIOS reported.

 memory cluster A(start-a,end-a),memory cluster B(start-b,end-b)

 1. A and B are separate.
      - start-a < end-a < start-b < end-b
      - start-b < end-b < start-a < end-a

 2. A and B are overlap.
      - start-a < start-b  < end-a  < end-b
      - start-a < start-b  < end-b  < end-a
      - start-b < start-a  < end-a  < end-b
      - start-a == start-b < end-a == end-b
      - start-b < start-a  < end-b  < end-a
      - start-a < end-a == start-b < end-b
      .......etc

 The latest memory probe routines assume "A and B are separate".
 But your BIOS reported the same memory cluster multiple times
 that it is special case in "A and B are overlap".The first
 part of your patch fixed it.

 I don't know whether other overlap cases are rare or not.
 What do you think other overlap cases?
 Regards!
 ---
  Masanori Kanaoka	kanaoka@ann.hi-ho.ne.jp


From: maximum entropy <entropy@tappedin.com>
To: kanaoka@ann.hi-ho.ne.jp
Cc: gnats-bugs@gnats.netbsd.org, kanaoka@ann.hi-ho.ne.jp
Subject: Re: port-i386/13399
Date: Mon, 9 Jul 2001 22:01:08 -0400 (EDT)

 >Date: Tue, 10 Jul 2001 09:35:47 +0900 (JST)
 >From: Masanori Kanaoka <kanaoka@ann.hi-ho.ne.jp>
 >
 >The latest memory probe routines assume "A and B are separate".
 >But your BIOS reported the same memory cluster multiple times
 >that it is special case in "A and B are overlap".The first
 >part of your patch fixed it.
 >
 >I don't know whether other overlap cases are rare or not.
 >What do you think other overlap cases?

 I don't know.  I wouldn't worry too much unless a real-world case of
 overlapping (non-identical) regions turns up.

 Cheers,
 entropy

 --
 entropy -- it's not just a good idea, it's the second law.
State-Changed-From-To: open->closed 
State-Changed-By: kanaoka 
State-Changed-When: Sun Oct 20 03:36:20 PDT 2002 
State-Changed-Why:  
I have committed your patch. Thank you for your efforts! 
Please test it. 
From: Christoph Egger <cegger@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/13399 CVS commit: src/sys/arch/amd64/amd64
Date: Wed, 12 Nov 2008 01:33:44 +0000 (UTC)

 Module Name:	src
 Committed By:	cegger
 Date:		Wed Nov 12 01:33:44 UTC 2008

 Modified Files:
 	src/sys/arch/amd64/amd64: machdep.c

 Log Message:
 reduce diff to i386/i386/machdep.c:init386()
 - introduce add_mem_cluster() as done in i386
 - apply rev. 1.480 from i386/i386/machdep.c: fixes PR 17199 for amd64
 - apply rev. 1.492 from i386/i386/machdep.c: fixes PR 13399 for amd64


 To generate a diff of this commit:
 cvs rdiff -r1.106 -r1.107 src/sys/arch/amd64/amd64/machdep.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Soren Jacobsen <snj@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/13399 CVS commit: [netbsd-5] src/sys/arch/amd64/amd64
Date: Tue, 18 Nov 2008 01:48:00 +0000 (UTC)

 Module Name:	src
 Committed By:	snj
 Date:		Tue Nov 18 01:48:00 UTC 2008

 Modified Files:
 	src/sys/arch/amd64/amd64 [netbsd-5]: machdep.c

 Log Message:
 Pull up following revision(s) (requested by cegger in ticket #51):
 	sys/arch/amd64/amd64/machdep.c: revision 1.107
 reduce diff to i386/i386/machdep.c:init386()
 - introduce add_mem_cluster() as done in i386
 - apply rev. 1.480 from i386/i386/machdep.c: fixes PR 17199 for amd64
 - apply rev. 1.492 from i386/i386/machdep.c: fixes PR 13399 for amd64


 To generate a diff of this commit:
 cvs rdiff -r1.102.4.2 -r1.102.4.3 src/sys/arch/amd64/amd64/machdep.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Manuel Bouyer <bouyer@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/13399 CVS commit: [netbsd-3] src/sys/arch/amd64/amd64
Date: Sat, 22 Nov 2008 16:28:43 +0000 (UTC)

 Module Name:	src
 Committed By:	bouyer
 Date:		Sat Nov 22 16:28:43 UTC 2008

 Modified Files:
 	src/sys/arch/amd64/amd64 [netbsd-3]: machdep.c

 Log Message:
 Pull up following revision(s) (requested by cegger in ticket #1979):
 	sys/arch/amd64/amd64/machdep.c: revision 1.106, 1.107 via patch
 cosmetic change (mostly to reduce diff to i386/i386/machdep.c:initi386):
 use PRIx64 rather %qx
 No functional changes intended.
 reduce diff to i386/i386/machdep.c:init386()
 - introduce add_mem_cluster() as done in i386
 - apply rev. 1.480 from i386/i386/machdep.c: fixes PR 17199 for amd64
 - apply rev. 1.492 from i386/i386/machdep.c: fixes PR 13399 for amd64


 To generate a diff of this commit:
 cvs rdiff -r1.31 -r1.31.10.1 src/sys/arch/amd64/amd64/machdep.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Manuel Bouyer <bouyer@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/13399 CVS commit: [netbsd-4] src/sys/arch/amd64/amd64
Date: Sat, 22 Nov 2008 16:33:41 +0000 (UTC)

 Module Name:	src
 Committed By:	bouyer
 Date:		Sat Nov 22 16:33:41 UTC 2008

 Modified Files:
 	src/sys/arch/amd64/amd64 [netbsd-4]: machdep.c

 Log Message:
 Pull up following revision(s) (requested by cegger in ticket #1233):
 	sys/arch/amd64/amd64/machdep.c: revision 1.106, 1.107 via patch
 cosmetic change (mostly to reduce diff to i386/i386/machdep.c:initi386):
 use PRIx64 rather %qx
 No functional changes intended.
 reduce diff to i386/i386/machdep.c:init386()
 - introduce add_mem_cluster() as done in i386
 - apply rev. 1.480 from i386/i386/machdep.c: fixes PR 17199 for amd64
 - apply rev. 1.492 from i386/i386/machdep.c: fixes PR 13399 for amd64


 To generate a diff of this commit:
 cvs rdiff -r1.44.2.4 -r1.44.2.5 src/sys/arch/amd64/amd64/machdep.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

>Unformatted:
 >> NetBSD/i386 BIOS Boot, Revision 2.10
 >> (entropy@zippy.local, Thu Jul  5 05:13:48 EDT 2001)
 >> Memory: 635/48128 k
 Press return to boot now, any other key for boot menu
 booting fd0a:netbsd - starting in 0
 type "?" or "help" for help.
 > boot hd0a:netbsd -s
 booting hd0a:netbsd (howto 0x2)
 4532951+83004+324460 [65+263168+216671]=0x52c6f0
 WARNING: CAN'T ALLOCATE MEMORY SEGMENT (0x100000/0xf00000/0x1) FROM IOMEM EXTENT MAP!
 WARNING: CAN'T ALLOCATE MEMORY SEGMENT (0x1000000/0x2000000/0x1) FROM IOMEM EXTENT MAP!
 (At this point the system resets immediately).

 The machine is a DECpc XL 590 with 48M memory.

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2007 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.