NetBSD Problem Report #44410

From Wolfgang.Stukenbrock@nagler-company.com  Tue Jan 18 14:32:53 2011
Return-Path: <Wolfgang.Stukenbrock@nagler-company.com>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by www.NetBSD.org (Postfix) with ESMTP id 13F0B63B883
	for <gnats-bugs@gnats.NetBSD.org>; Tue, 18 Jan 2011 14:32:53 +0000 (UTC)
Message-Id: <20110118143243.7C2791E80CE@test-s0.nagler-company.com>
Date: Tue, 18 Jan 2011 15:32:43 +0100 (CET)
From: Wolfgang.Stukenbrock@nagler-company.com
Reply-To: Wolfgang.Stukenbrock@nagler-company.com
To: gnats-bugs@gnats.NetBSD.org
Subject: policy reference count not decremented correctly on socket close
X-Send-Pr-Version: 3.95

>Number:         44410
>Category:       kern
>Synopsis:       policy reference count not decremented correctly on socket close
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Jan 18 14:35:00 +0000 2011
>Closed-Date:    Mon Feb 21 17:19:53 +0000 2011
>Last-Modified:  Mon Feb 21 17:19:53 +0000 2011
>Originator:     Dr. W. Stukenbrock
>Release:        NetBSD 5.1
>Organization:
Dr. Nagler & Company GmbH
>Environment:


System: NetBSD test-s0 4.0 NetBSD 4.0 (NSW-WS) #0: Tue Aug 17 17:28:09 CEST 2010 wgstuken@test-s0:/usr/src/sys/arch/amd64/compile/NSW-WS amd64
Architecture: x86_64
Machine: amd64
>Description:
	If the kernel is compiled with FAST_IPSEC some policy information is stored in the PCB of the socket.
	FAST_IPSEC seems to have an additional cache to speed up things.

	Here some things I've figgured out with debugging IPSEC with "sysctl -w net-key.debug=0x41":
	remark: the output is after running the system a short time, we start with 410 references already lost ...
		I used a small program that sends two UPD-packets to the echo port on an other machine with sleep(1)
		between each part.

        I've added some comments about the system calls to the output ...

-- socket(PF_INET, SOCK_DGRAM, 0) ... ipsec_init_policy()

DP key_newsp from ../../../../netipsec/ipsec.c:1176 return SP:0xffff800010207600
DP key_newsp from ../../../../netipsec/ipsec.c:1183 return SP:0xffff80001018ec00

-- sendto echo port of other host

  ... ipsec_getpolicybysock() called from ipsec4_checkpolicy() from ip_output() from udp_output()
DP key_allocsp from ../../../../netipsec/ipsec.c:589
DP key_allocsp return SP:0x0 (ID=0) refcnt 0
DP key_allocsp_default from ../../../../netipsec/ipsec.c:591
DP key_allocsp_default returns SP:0xffffffff806948a0 (411)
DP ipsec_getpolicybysock (priv 1 policy 3) allocates SP:0xffffffff806948a0 (refcnt 411)
DP ipsec_fillpcbcache (dir 2) cause refcnt++:412 SP:0xffffffff806948a0
  ... inside of ipsec4_checkpolicy() - returns NULL
DP key_freesp SP:0xffffffff806948a0 (ID=0) from ../../../../netipsec/ipsec.c:718; refcnt now 411
  ... ipsec_invalpcbcache(IPSEC_DIR_ANY) called from ipsec_pcbdisconn() from
  ...    in_pcbdisconnect() called after udp_output()
DP key_freesp SP:0xffffffff806948a0 (ID=0) from ../../../../netipsec/ipsec.c:389; refcnt now 410

-- reply packet

  ... udp4_realinput() called from udp_input() from ip_input()
  ... in_pcblookup_bind() will return a pointer for this packet ...
  ... udp4_sendup called ...
  ... ipsec4_in_reject_so() called - macro for FAST_IPSEC to ipsec4_in_reject() ...
  ... remark: we have a "inp" pointer -> call ipsec_getpolicybysock() ...
  ... ipsec_checkpcbcache() from ipsec_getpolicybysock() - but nothing in the cache
  ... so ipsec4_setspidx_inpcb() is called that calls ipsec_setspidx()
  ... now we try to get the policy - first try specific (failes) then get default
DP key_allocsp from ../../../../netipsec/ipsec.c:589
DP key_allocsp return SP:0x0 (ID=0) refcnt 0
DP key_allocsp_default from ../../../../netipsec/ipsec.c:591
DP key_allocsp_default returns SP:0xffffffff806948a0 (411)
  ... debug message of ipsec_getpolicybysock() followed by IN-dir cache setup ....
DP ipsec_getpolicybysock (priv 1 policy 3) allocates SP:0xffffffff806948a0 (refcnt 411)
DP ipsec_fillpcbcache (dir 1) cause refcnt++:412 SP:0xffffffff806948a0
  ... back in ipsec4_in_reject() ...
  ... call ipsec_in_reject() that will return OK
  ... free the reference to the policy ...
DP key_freesp SP:0xffffffff806948a0 (ID=0) from ../../../../netipsec/ipsec.c:1732; refcnt now 411

  ... remark: now a reference to the IN-dir SP is still in the cache !!!!!!!

-- sendto echo port of other host

DP key_allocsp from ../../../../netipsec/ipsec.c:589
DP key_allocsp return SP:0x0 (ID=0) refcnt 0
DP key_allocsp_default from ../../../../netipsec/ipsec.c:591
DP key_allocsp_default returns SP:0xffffffff806948a0 (412)
DP ipsec_getpolicybysock (priv 1 policy 3) allocates SP:0xffffffff806948a0 (refcnt 412)
DP ipsec_fillpcbcache (dir 2) cause refcnt++:413 SP:0xffffffff806948a0
DP key_freesp SP:0xffffffff806948a0 (ID=0) from ../../../../netipsec/ipsec.c:718; refcnt now 412
  ... ipsec_invalpcbcache(IPSEC_DIR_ANY) called from ipsec_pcbdisconn() from
  ...    in_pcbdisconnect() called after udp_output()
DP key_freesp SP:0xffffffff806948a0 (ID=0) from ../../../../netipsec/ipsec.c:389; refcnt now 411
DP key_freesp SP:0xffffffff806948a0 (ID=0) from ../../../../netipsec/ipsec.c:389; refcnt now 410
  ... remark: this has freed the IN-bound cache entry too (why? not realy usefull for the next packet ...)

-- reply packet

  ... udp4_realinput() called from udp_input() from ip_input()
  ... in_pcblookup_bind() will return a pointer for this packet ...
  ... udp4_sendup called ...
  ... ipsec4_in_reject_so() called - macro for FAST_IPSEC to ipsec4_in_reject() ...
  ... remark: we have a "inp" pointer -> call ipsec_getpolicybysock() ...
  ... ipsec_checkpcbcache() from ipsec_getpolicybysock() - but nothing in the cache
  ... so ipsec4_setspidx_inpcb() is called that calls ipsec_setspidx()
  ... now we try to get the policy - first try specific (failes) then get default
DP key_allocsp from ../../../../netipsec/ipsec.c:589
DP key_allocsp from ../../../../netipsec/ipsec.c:589
DP key_allocsp return SP:0x0 (ID=0) refcnt 0
DP key_allocsp_default from ../../../../netipsec/ipsec.c:591
DP key_allocsp_default returns SP:0xffffffff806948a0 (411)
  ... debug message of ipsec_getpolicybysock() followed by IN-dir cache setup ....
DP ipsec_getpolicybysock (priv 1 policy 3) allocates SP:0xffffffff806948a0 (refcnt 411)
DP ipsec_fillpcbcache (dir 1) cause refcnt++:412 SP:0xffffffff806948a0
  ... back in ipsec4_in_reject() ...
  ... call ipsec_in_reject() that will return OK
  ... free the reference to the policy ...
DP key_freesp SP:0xffffffff806948a0 (ID=0) from ../../../../netipsec/ipsec.c:1732; refcnt now 411

  ... remark: now a reference to the IN-dir SP is still in the cache !!!!!!!

-- close(socket) ... ipsec4_delete_pcbpolicy()

DP key_freesp SP:0xffff800010207600 (ID=0) from ../../../../netipsec/ipsec.c:1429; refcnt now 0
DP key_freesp SP:0xffff80001018ec00 (ID=0) from ../../../../netipsec/ipsec.c:1432; refcnt now 0

  ... XXXXXX reference stored in cache not freed again XXXXXXX

-- exiting 


        Now what has happend ...
	The first packet is send and the answer has been recieved.
	After a recieve operation there is a reference in the cache to the SP ...
	The second packet will kill the hole cache after sending - including the
	recieve side - not good for performace - the cache will only hit, if there is more than
	packet inbound before another packet is send. Hmmm - but that is not the point here ...

	After all trafic is done, there is again a reference in the cache, but that is
	not freed on socket shutdown ...
	I'm not shure (simply not tested) if this also happens for connection oriented protocols.

	During shutdown ipsec4_delete_pcbpolicy() calls ipsec_delpcbpolicy().
	That one simply frees the memory.
	The following patch adds code to this routine to clear the cache prior freeing the memeory.
	It seems to solve the problem.

>How-To-Repeat:
	Just enable debugging with "sysctl -w net.key.debu=0x41" on a system build with FAST_IPSEC
	and have a look at the reference counter for the policy. No IPSEC rule needs to be entered.
>Fix:
	The following fix seems to solve the problem. It clears the cache prior freeing the memory
	of the ancor structure in /usr/src/sys/netipsec/ipsec.c

--- ipsec.c.old 2011-01-18 15:31:16.000000000 +0100
+++ ipsec.c     2011-01-18 15:22:36.000000000 +0100
@@ -1160,6 +1160,9 @@
 static void
 ipsec_delpcbpolicy(struct inpcbpolicy *p)
 {
+#ifdef __NetBSD__
+       ipsec_invalpcbcache(p, IPSEC_DIR_ANY);
+#endif
        free(p, M_SECA);
 }


>Release-Note:

>Audit-Trail:

State-Changed-From-To: open->feedback
State-Changed-By: drochner@NetBSD.org
State-Changed-When: Fri, 11 Feb 2011 17:55:41 +0000
State-Changed-Why:
does the fix in ipsec.c rev.1.49 work for you?


From: "Matthias Drochner" <drochner@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/44410 CVS commit: src/sys/netipsec
Date: Fri, 11 Feb 2011 17:53:35 +0000

 Module Name:	src
 Committed By:	drochner
 Date:		Fri Feb 11 17:53:35 UTC 2011

 Modified Files:
 	src/sys/netipsec: ipsec.c

 Log Message:
 invalidate the secpolicy cache bin the PCB before destroying, so that
 the refcount in the (global) policies gets decremented
 (This apparently was missed when the policy cache code was copied
 over from KAME IPSEC.)
 From Wolfgang Stukenbrock per PR kern/44410, just fixed differently
 to avoid unecessary differences to KAME.


 To generate a diff of this commit:
 cvs rdiff -u -r1.48 -r1.49 src/sys/netipsec/ipsec.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Wolfgang Stukenbrock <Wolfgang.Stukenbrock@nagler-company.com>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/44410 (policy reference count not decremented correctly on socket close)
Date: Mon, 14 Feb 2011 14:14:25 +0100

 Hi,

 not tested in real world till now, but it does the same as my patch - 
 only the location is different.

 The advantage of my version is, that I placed it at the location where 
 the data structure is destroyed.
 Now it is placed at all locations where we are gooing to call the 
 destruction routine. (The third call of the destruction routine is 
 during creation on error and there cannot be anything in the cache up to 
 now.)
 If for whatever reason another call to the destruction routine is added 
 in the future, we need to remember to flush the cache ...

 I think it would be better to change this in the KAME implementation 
 too, so that the flush is done in the destruction routine.

 Neverless I think this PR can be closed.

 Best regards

 W. Stukenbrock

 drochner@NetBSD.org wrote:

 > Synopsis: policy reference count not decremented correctly on socket close
 > 
 > State-Changed-From-To: open->feedback
 > State-Changed-By: drochner@NetBSD.org
 > State-Changed-When: Fri, 11 Feb 2011 17:55:41 +0000
 > State-Changed-Why:
 > does the fix in ipsec.c rev.1.49 work for you?
 > 
 > 
 > 
 > 


 -- 


 Dr. Nagler & Company GmbH
 Hauptstraße 9
 92253 Schnaittenbach

 Tel. +49 9622/71 97-42
 Fax +49 9622/71 97-50

 Wolfgang.Stukenbrock@nagler-company.com
 http://www.nagler-company.com


 Hauptsitz: Schnaittenbach
 Handelregister: Amberg HRB
 Gerichtsstand: Amberg
 Steuernummer: 201/118/51825
 USt.-ID-Nummer: DE 273143997
 Geschäftsführer: Dr. Martin Nagler, Dr. Dr. Karl-Kuno Kunze


State-Changed-From-To: feedback->closed
State-Changed-By: drochner@NetBSD.org
State-Changed-When: Mon, 21 Feb 2011 17:19:53 +0000
State-Changed-Why:
submitter agrees to close
(I fully agree that your version is better, but I don't dare to
change the KAME code. Hopefully this cal be cleaned up one day...
Actually, the cache hit ratios I've seen so far are not impressive.
Perhaps optimizing the initial lookup and removing the cache is
the better concept. The workload in my tests might be different
from some real-world applications though.)


>Unformatted:

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2007 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.