summaryrefslogtreecommitdiffstats
path: root/net/ipv6
Commit message (Collapse)AuthorAgeFilesLines
* IPv6: preferred lifetime of address not getting updatedBrian Haley2009-07-041-3/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's a bug in addrconf_prefix_rcv() where it won't update the preferred lifetime of an IPv6 address if the current valid lifetime of the address is less than 2 hours (the minimum value in the RA). For example, If I send a router advertisement with a prefix that has valid lifetime = preferred lifetime = 2 hours we'll build this address: 3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000 inet6 2001:1890:1109:a20:217:8ff:fe7d:4718/64 scope global dynamic valid_lft 7175sec preferred_lft 7175sec If I then send the same prefix with valid lifetime = preferred lifetime = 0 it will be ignored since the minimum valid lifetime is 2 hours: 3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000 inet6 2001:1890:1109:a20:217:8ff:fe7d:4718/64 scope global dynamic valid_lft 7161sec preferred_lft 7161sec But according to RFC 4862 we should always reset the preferred lifetime even if the valid lifetime is invalid, which would cause the address to immediately get deprecated. So with this patch we'd see this: 5: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000 inet6 2001:1890:1109:a20:21f:29ff:fe5a:ef04/64 scope global deprecated dynamic valid_lft 7163sec preferred_lft 0sec The comment winds-up being 5x the size of the code to fix the problem. Update the preferred lifetime of IPv6 addresses derived from a prefix info option in a router advertisement even if the valid lifetime in the option is invalid, as specified in RFC 4862 Section 5.5.3e. Fixes an issue where an address will not immediately become deprecated. Reported by Jens Rosenboom. Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* xfrm6: fix the proto and ports decode of sctp protocolWei Yongjun2009-07-041-2/+4
| | | | | | | | | | | The SCTP pushed the skb above the sctp chunk header, so the check of pskb_may_pull(skb, nh + offset + 1 - skb->data) in _decode_session6() will never return 0 and the ports decode of sctp will always fail. (nh + offset + 1 - skb->data < 0) Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* inet: Call skb_orphan before tproxy activatesHerbert Xu2009-06-271-0/+3
| | | | | | | | | | As transparent proxying looks up the socket early and assigns it to the skb for later processing, we must drop any existing socket ownership prior to that in order to distinguish between the case where tproxy is active and where it is not. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv6: Use rcu_barrier() on module unload.Jesper Dangaard Brouer2009-06-261-0/+2
| | | | | | | | | The ipv6 module uses rcu_call() thus it should use rcu_barrier() on module unload. Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv6: avoid wraparound for expired preferred lifetimeJens Rosenboom2009-06-261-1/+4
| | | | | | | | Avoid showing wrong high values when the preferred lifetime of an address is expired. Signed-off-by: Jens Rosenboom <me@jayr.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv6: Use correct data types for ICMPv6 type and codeBrian Haley2009-06-2314-30/+30
| | | | | | | | | Change all the code that deals directly with ICMPv6 type and code values to use u8 instead of a signed int as that's the actual data type. Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: correct off-by-one write allocations reportsEric Dumazet2009-06-182-5/+6
| | | | | | | | | | | | | commit 2b85a34e911bf483c27cfdd124aeb1605145dc80 (net: No more expensive sock_hold()/sock_put() on each tx) changed initial sk_wmem_alloc value. We need to take into account this offset when reporting sk_wmem_alloc to user, in PROC_FS files or various ioctls (SIOCOUTQ/TIOCOUTQ) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of ↵David S. Miller2009-06-152-2/+2
|\ | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: Documentation/feature-removal-schedule.txt drivers/scsi/fcoe/fcoe.c net/core/drop_monitor.c net/core/net-traces.c
| * trivial: Fix a typo in comment of addrconf_dad_start()Masatake YAMATO2009-06-121-1/+1
| | | | | | | | | | Signed-off-by: Masatake YAMATO <yamato@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
| * trivial: Kconfig: .ko is normally not included in module namesPavel Machek2009-06-121-1/+1
| | | | | | | | | | | | | | .ko is normally not included in Kconfig help, make it consistent. Signed-off-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
* | PIM-SM: namespace changesTom Goff2009-06-141-8/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | IPv4: - make PIM register vifs netns local - set the netns when a PIM register vif is created - make PIM available in all network namespaces (if CONFIG_IP_PIMSM_V2) by adding the protocol handler when multicast routing is initialized IPv6: - make PIM register vifs netns local - make PIM available in all network namespaces (if CONFIG_IPV6_PIMSM_V2) by adding the protocol handler when multicast routing is initialized Signed-off-by: Tom Goff <thomas.goff@boeing.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'master' of ↵Patrick McHardy2009-06-1121-155/+245
|\ \ | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
| * | net: No more expensive sock_hold()/sock_put() on each txEric Dumazet2009-06-111-1/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | One of the problem with sock memory accounting is it uses a pair of sock_hold()/sock_put() for each transmitted packet. This slows down bidirectional flows because the receive path also needs to take a refcount on socket and might use a different cpu than transmit path or transmit completion path. So these two atomic operations also trigger cache line bounces. We can see this in tx or tx/rx workloads (media gateways for example), where sock_wfree() can be in top five functions in profiles. We use this sock_hold()/sock_put() so that sock freeing is delayed until all tx packets are completed. As we also update sk_wmem_alloc, we could offset sk_wmem_alloc by one unit at init time, until sk_free() is called. Once sk_free() is called, we atomic_dec_and_test(sk_wmem_alloc) to decrement initial offset and atomicaly check if any packets are in flight. skb_set_owner_w() doesnt call sock_hold() anymore sock_wfree() doesnt call sock_put() anymore, but check if sk_wmem_alloc reached 0 to perform the final freeing. Drawback is that a skb->truesize error could lead to unfreeable sockets, or even worse, prematurely calling __sk_free() on a live socket. Nice speedups on SMP. tbench for example, going from 2691 MB/s to 2711 MB/s on my 8 cpu dev machine, even if tbench was not really hitting sk_refcnt contention point. 5 % speedup on a UDP transmit workload (depends on number of flows), lowering TX completion cpu usage. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | netfilter: Use frag list abstraction interfaces.David S. Miller2009-06-091-2/+2
| | | | | | | | | | | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | ipv6: Use frag list abstraction interfaces.David S. Miller2009-06-092-5/+5
| | | | | | | | | | | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: skb->dst accessorsEric Dumazet2009-06-0318-133/+139
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Define three accessors to get/set dst attached to a skb struct dst_entry *skb_dst(const struct sk_buff *skb) void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst) void skb_dst_drop(struct sk_buff *skb) This one should replace occurrences of : dst_release(skb->dst) skb->dst = NULL; Delete skb->dst field Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | IPv6: Print error value when skb allocation failsBrian Haley2009-06-021-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | Print-out the error value when sock_alloc_send_skb() fails in the IPv6 neighbor discovery code - can be useful for debugging. Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | IPv6: Add 'autoconf' and 'disable_ipv6' module parametersBrian Haley2009-06-012-10/+95
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add 'autoconf' and 'disable_ipv6' parameters to the IPv6 module. The first controls if IPv6 addresses are autoconfigured from prefixes received in Router Advertisements. The IPv6 loopback (::1) and link-local addresses are still configured. The second controls if IPv6 addresses are desired at all. No IPv6 addresses will be added to any interfaces. Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | netfilter: nf_ct_icmp: keep the ICMP ct entries longerJan Kasprzak2009-06-081-12/+4Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current conntrack code kills the ICMP conntrack entry as soon as the first reply is received. This is incorrect, as we then see only the first ICMP echo reply out of several possible duplicates as ESTABLISHED, while the rest will be INVALID. Also this unnecessarily increases the conntrackd traffic on H-A firewalls. Make all the ICMP conntrack entries (including the replied ones) last for the default of nf_conntrack_icmp{,v6}_timeout seconds. Signed-off-by: Jan "Yenya" Kasprzak <kas@fi.muni.cz> Signed-off-by: Patrick McHardy <kaber@trash.net>
* | | netfilter: x_tables: added hook number into match extension parameter structure.Evgeniy Polyakov2009-06-041-1/+1
| | | | | | | | | | | | | | | Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: Patrick McHardy <kaber@trash.net>
* | | netfilter: conntrack: simplify event caching systemPablo Neira Ayuso2009-06-021-1/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch simplifies the conntrack event caching system by removing several events: * IPCT_[*]_VOLATILE, IPCT_HELPINFO and IPCT_NATINFO has been deleted since the have no clients. * IPCT_COUNTER_FILLING which is a leftover of the 32-bits counter days. * IPCT_REFRESH which is not of any use since we always include the timeout in the messages. After this patch, the existing events are: * IPCT_NEW, IPCT_RELATED and IPCT_DESTROY, that are used to identify addition and deletion of entries. * IPCT_STATUS, that notes that the status bits have changes, eg. IPS_SEEN_REPLY and IPS_ASSURED. * IPCT_PROTOINFO, that reports that internal protocol information has changed, eg. the TCP, DCCP and SCTP protocol state. * IPCT_HELPER, that a helper has been assigned or unassigned to this entry. * IPCT_MARK and IPCT_SECMARK, that reports that the mark has changed, this covers the case when a mark is set to zero. * IPCT_NATSEQADJ, to report that there's updates in the NAT sequence adjustment. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* | | Merge branch 'master' of git://dev.medozas.de/linuxPatrick McHardy2009-06-022-85/+85
|\ \ \ | |/ / |/| |
| * | netfilter: xtables: consolidate comefrom debug cast accessJan Engelhardt2009-05-081-5/+8
| | | | | | | | | | | | Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
| * | netfilter: xtables: remove another level of indentJan Engelhardt2009-05-081-23/+21Star
| | | | | | | | | | | | Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
| * | netfilter: xtables: remove some gotoJan Engelhardt2009-05-081-5/+2Star
| | | | | | | | | | | | | | | | | | Combining two ifs, and goto is easily gone. Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
| * | netfilter: xtables: reduce indent level by oneJan Engelhardt2009-05-081-68/+64Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | Cosmetic only. Transformation applied: -if (foo) { long block; } else { short block; } +if (!foo) { short block; continue; } long block; Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
| * | netfilter: xtables: consolidate open-coded logicJan Engelhardt2009-05-081-4/+10
| | | | | | | | | | | | Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
| * | netfilter: xtables: fix const inconsistencyJan Engelhardt2009-05-081-7/+7
| | | | | | | | | | | | Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
| * | netfilter: xtables: remove redundant castsJan Engelhardt2009-05-081-1/+1
| | | | | | | | | | | | Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
| * | netfilter: xtables: use NFPROTO_ in standard targetsJan Engelhardt2009-05-081-3/+3
| | | | | | | | | | | | Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
| * | netfilter: queue: use NFPROTO_ for queue callsitesJan Engelhardt2009-05-081-1/+1
| | | | | | | | | | | | | | | | | | af is an nfproto. Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
| * | netfilter: xtables: use NFPROTO_ for xt_proto_init callsitesJan Engelhardt2009-05-081-2/+2
| | | | | | | | | | | | Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
* | | gro: Avoid unnecessary comparison after skb_gro_headerHerbert Xu2009-05-271-3/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For the overwhelming majority of cases, skb_gro_header's return value cannot be NULL. Yet we must check it because of its current form. This patch splits it up into multiple functions in order to avoid this. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | Merge branch 'master' of ↵David S. Miller2009-05-251-0/+3
|\ \ \ | | |/ | |/| | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/ath/ath5k/phy.c drivers/net/wireless/iwlwifi/iwl-agn.c drivers/net/wireless/iwlwifi/iwl3945-base.c
| * | IPv6: set RTPROT_KERNEL to initial routeJean-Mickael Guerin2009-05-211-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The use of unspecified protocol in IPv6 initial route prevents quagga to install IPv6 default route: # show ipv6 route S ::/0 [1/0] via fe80::1, eth1_0 K>* ::/0 is directly connected, lo, rej C>* ::1/128 is directly connected, lo C>* fe80::/64 is directly connected, eth1_0 # ip -6 route fe80::/64 dev eth1_0 proto kernel metric 256 mtu 1500 advmss 1440 hoplimit -1 ff00::/8 dev eth1_0 metric 256 mtu 1500 advmss 1440 hoplimit -1 unreachable default dev lo proto none metric -1 error -101 hoplimit 255 The attached patch ensures RTPROT_KERNEL to the default initial route and fixes the problem for quagga. This is similar to "ipv6: protocol for address routes" f410a1fba7afa79d2992620e874a343fdba28332. # show ipv6 route S>* ::/0 [1/0] via fe80::1, eth1_0 C>* ::1/128 is directly connected, lo C>* fe80::/64 is directly connected, eth1_0 # ip -6 route fe80::/64 dev eth1_0 proto kernel metric 256 mtu 1500 advmss 1440 hoplimit -1 fe80::/64 dev eth1_0 proto kernel metric 256 mtu 1500 advmss 1440 hoplimit -1 ff00::/8 dev eth1_0 metric 256 mtu 1500 advmss 1440 hoplimit -1 default via fe80::1 dev eth1_0 proto zebra metric 1024 mtu 1500 advmss 1440 hoplimit -1 unreachable default dev lo proto kernel metric -1 error -101 hoplimit 255 Signed-off-by: Jean-Mickael Guerin <jean-mickael.guerin@6wind.com> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | tcp: Unexport TCPv6 GRO functionsHerbert Xu2009-05-221-4/+3Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | Sinec the TCPv6 GRO functions are used in the same file where they are defined, we do not need to export them. This was a cut-n-paste from the IPv4 code which does need to export them. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | net: Remove unused parameter from fill method in fib_rules_ops.Rami Rosen2009-05-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The netlink message header (struct nlmsghdr) is an unused parameter in fill method of fib_rules_ops struct. This patch removes this parameter from this method and fixes the places where this method is called. (include/net/fib_rules.h) Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | sit: stateless autoconf for isatapSascha Hlusiak2009-05-202-0/+59
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | be sent periodically. The rs_delay can be speficied when adding the PRL entry and defaults to 15 minutes. The RS is sent from every link local adress that's assigned to the tunnel interface. It's directed to the (guessed) linklocal address of the router and is sent through the tunnel. Better: send to ff02::2 encapsuled in unicast directed to router-v4. Signed-off-by: Sascha Hlusiak <contact@saschahlusiak.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | addrconf: refuse isatap eui64 for INADDR_ANYSascha Hlusiak2009-05-201-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | A tunnel with no local ipv4 endpoint would otherwise use the ISATAP linklocal address fe80::5efe:0:0, which is invalid. Rather not add a linklocal address at all. Signed-off-by: Sascha Hlusiak <contact@saschahlusiak.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | sit: ipip6_tunnel_del_prl: return errSascha Hlusiak2009-05-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Typo. When deleting a PRL entry, return status to userspace instead of success. Signed-off-by: Sascha Hlusiak <contact@saschahlusiak.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | sit: strictly restrict incoming traffic to tunnel link deviceSascha Hlusiak2009-05-201-9/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Check link device when looking up a tunnel. When a tunnel is linked to a interface, traffic from a different interface must not reach the tunnel. This also allows creating of multiple tunnels with the same endpoints, if the link device differs. Signed-off-by: Sascha Hlusiak <contact@saschahlusiak.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | sit: Fail to create tunnel, if it already existsSascha Hlusiak2009-05-201-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | When locating the tunnel, do not continue if it is found. Otherwise a different tunnel with similar configuration would be returned and parts could be overwritten. Signed-off-by: Sascha Hlusiak <contact@saschahlusiak.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | net: FIX ipv6_forward sysctl restartEric W. Biederman2009-05-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Just returning -ERESTARTSYS without a signal pending is not good that will just leak it to userspace. We need return -ERESTARTNOINTR so we always restart and set signal pending so that we fall of the fast path of syscall return and setup the system call restart. So use restart_syscall() which does all of this for us. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | net: remove needless (now buggy) & from dev->dev_addrJiri Pirko2009-05-171-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | Patch fixes issues with dev->dev_addr changing from array to pointer. Hopefully there are no others. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | ipv4: remove an unused parameter from configure method of fib_rules_ops.Rami Rosen2009-05-171-1/+1
| | | | | | | | | | | | | | | Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | Merge branch 'master' of ↵David S. Miller2009-05-081-3/+3
|\| | | |/ |/| | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: include/net/tcp.h
| * Merge branch 'master' of ↵David S. Miller2009-05-051-3/+3
| |\ | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6
| | * netfilter: ip6t_ipv6header: fix match on packets ending with NEXTHDR_NONEChristoph Paasch2009-05-051-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | As packets ending with NEXTHDR_NONE don't have a last extension header, the check for the length needs to be after the check for NEXTHDR_NONE. Signed-off-by: Christoph Paasch <christoph.paasch@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
* | | Merge branch 'master' of ↵David S. Miller2009-04-301-86/+37Star
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: Documentation/isdn/00-INDEX drivers/net/wireless/iwlwifi/iwl-scan.c drivers/net/wireless/rndis_wlan.c net/mac80211/main.c
| * | netfilter: revised locking for x_tablesStephen Hemminger2009-04-291-86/+37Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The x_tables are organized with a table structure and a per-cpu copies of the counters and rules. On older kernels there was a reader/writer lock per table which was a performance bottleneck. In 2.6.30-rc, this was converted to use RCU and the counters/rules which solved the performance problems for do_table but made replacing rules much slower because of the necessary RCU grace period. This version uses a per-cpu set of spinlocks and counters to allow to table processing to proceed without the cache thrashing of a global reader lock and keeps the same performance for table updates. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>