summaryrefslogtreecommitdiffstats
path: root/net
Commit message (Collapse)AuthorAgeFilesLines
* [NET] NETNS: Omit net_device->nd_net without CONFIG_NET_NS.YOSHIFUJI Hideaki2008-03-2568-200/+200
| | | | | | | | Introduce per-net_device inlines: dev_net(), dev_net_set(). Without CONFIG_NET_NS, no namespace other than &init_net exists. Let's explicitly define them to help compiler optimizations. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
* [IPV6]: Support Source Address Selection API (RFC5014).YOSHIFUJI Hideaki2008-03-258-9/+123
| | | | Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
* [IPV6]: Optimize hop-limit determination.YOSHIFUJI Hideaki2008-03-256-27/+17Star
| | | | | | | | | | | Last part of hop-limit determination is always: hoplimit = dst_metric(dst, RTAX_HOPLIMIT); if (hoplimit < 0) hoplimit = ipv6_get_hoplimit(dst->dev). Let's consolidate it as ip6_dst_hoplimit(dst). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
* [IPV4,IPV6]: Share cork.rt between IPv4 and IPv6.YOSHIFUJI Hideaki2008-03-252-14/+12Star
| | | | Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
* [IPV6] ADDRCONF: Clean-up ipv6_dev_get_saddr().YOSHIFUJI Hideaki2008-03-251-214/+206Star
| | | | | | | | | | | | old: | text data bss dec hex filename | 28599 1416 96 30111 759f net/ipv6/addrconf.o new: | text data bss dec hex filename | 28007 1416 96 29519 734f net/ipv6/addrconf.o Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
* [XFRM] MIP6: Fix address keys for routing search.YOSHIFUJI Hideaki2008-03-251-9/+40
| | | | | | | | | | | | | | | | | Each MIPv6 XFRM state (DSTOPT/RH2) holds either destination or source address to be mangled in the IPv6 header (that is "CoA"). On Inter-MN communication after both nodes binds each other, they use route optimized traffic two MIPv6 states applied, and both source and destination address in the IPv6 header are replaced by the states respectively. The packet format is correct, however, next-hop routing search are not. This patch fixes it by remembering address pairs for later states. Based on patch from Masahide NAKAMURA <nakam@linux-ipv6.org>. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
* [XFRM] IPV6: Optimize __xfrm_tunnel_alloc_spi().YOSHIFUJI Hideaki2008-03-251-22/+23
| | | | | | | | | | % size old/net/ipv6/xfrm6_tunnel.o new/net/ipv6/xfrm6_tunnel.o | text data bss dec hex filename | 1606 40 2080 3726 e8e old/net/ipv6/xfrm6_tunnel.o | 1574 40 2080 3694 e6e new/net/ipv6/xfrm6_tunnel.o Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
* [XFRM] IPV6: Optimize xfrm6_input_addr().YOSHIFUJI Hideaki2008-03-251-41/+14Star
| | | | | | | | | | % size old/net/ipv6/xfrm6_input.o new/net/ipv6/xfrm6_input.o | text data bss dec hex filename | 1026 0 0 1026 402 old/net/ipv6/xfrm6_input.o | 947 0 0 947 3b3 new/net/ipv6/xfrm6_input.o Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
* [XFRM] IPV6: Use distribution counting sort for xfrm_state/xfrm_tmpl chain.YOSHIFUJI Hideaki2008-03-251-97/+74Star
| | | | Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
* [NETNS]: Enable TCP/UDP/ICMP inside namespace.Denis V. Lunev2008-03-242-0/+4
| | | | | Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETNS]: Allow to create sockets in non-initial namespace.Denis V. Lunev2008-03-241-3/+21
| | | | | | | Allow to create sockets in the namespace if the protocol ok with this. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETNS]: Drop packets in the non-initial namespace on the per/protocol basis.Denis V. Lunev2008-03-241-4/+4
| | | | | | | | | IP layer now can handle multiple namespaces normally. So, process such packets normally and drop them only if the transport layer is not aware about namespaces. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETNS]: Process netfilter hooks in initial namespace only.Denis V. Lunev2008-03-241-0/+8
| | | | | | | | | | There were no packets in the namespace other than initial previously. This will be changed in the neareast future. Netfilters are not namespace aware and should be processed in the initial namespace only for now. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETNS]: Process INET socket layer in the correct namespace.Denis V. Lunev2008-03-244-7/+7
| | | | | | | | Replace all the reast of the init_net with a proper net on the socket layer. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETNS]: Process IP layer in the context of the correct namespace.Denis V. Lunev2008-03-245-8/+14
| | | | | | | Replace all the rest of the init_net with a proper net on the IP layer. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETNS]: Add namespace parameter to ip_cmsg_send.Denis V. Lunev2008-03-243-4/+4
| | | | | | | Pass the init_net there for now. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETNS]: Add namespace parameter to ip_options_get(...).Denis V. Lunev2008-03-242-8/+10
| | | | | | | Pass the init_net there for now. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETNS]: Add namespace parameter to ip_options_compile.Denis V. Lunev2008-03-242-4/+5
| | | | | | | | | ip_options_compile uses inet_addr_type which requires a namespace. The packet argument is optional, so parameter is the only way to obtain it. Pass the init_net there for now. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETNS]: /proc/net/arp namespacing.Denis V. Lunev2008-03-241-2/+18
| | | | | | | | Seqfile operation showing /proc/net/arp are already namespace aware. All we need is to register this file for each namespace. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETNS]: Process ARP in the context of the correct namespace.Denis V. Lunev2008-03-241-14/+9Star
| | | | | | | | Get namespace from a device and pass it to the routing engine. Enable ARP packet processing and device notifiers after that. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETNS]: Minor information leak via /proc/net/ptype file.Pavel Emelyanov2008-03-241-3/+4
| | | | | | | | | This file displays the registered packet types, but some of them (packet sockets creates such) can be bound to a net device and showing them in a wrong namespace is not correct. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETNS][UDP-Lite]: Register /proc/net/udplite(6) in a namespace.Pavel Emelyanov2008-03-242-3/+33
| | | | | | | | UDP-Lite sockets are displayed in another files, rather than UDP ones, so make the present in namespaces as well. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [UDP-Lite]: Clean up proc creation a bit.Pavel Emelyanov2008-03-241-3/+11
| | | | | | | | | Just introduce a helper to remove ifdefs from inside the udplite4_register function. This will help to make the next patch nicer. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETNS][TCP]: Register /proc/net/tcp in a namespace.Pavel Emelyanov2008-03-241-2/+17
| | | | | | | | | After the commit f40c8174d3c21bf178283f3ef3aa8c7bf238fdec ([NETNS][IPV4] tcp - make proc handle the network namespaces) it is now possible to make this file present in newly created namespaces. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETNS][UDP]: Register /proc/net/udp in a namespace.Pavel Emelyanov2008-03-241-2/+17
| | | | | | | | | After the commit a91275eff43a527e1a25d6d034cbcd19ee323e64 ([NETNS][IPV6] udp - make proc handle the network namespace) it is now possible to make this file present in newly created namespaces. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of ../net-2.6/David S. Miller2008-03-248-20/+38
|\ | | | | | | | | | | Conflicts: net/ipv6/ndisc.c
| * sch_htb: fix "too many events" situationMartin Devera2008-03-241-6/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | HTB is event driven algorithm and part of its work is to apply scheduled events at proper times. It tried to defend itself from livelock by processing only limited number of events per dequeue. Because of faster computers some users already hit this hardcoded limit. This patch limits processing up to 2 jiffies (why not 1 jiffie ? because it might stop prematurely when only fraction of jiffie remains). Signed-off-by: Martin Devera <devik@cdi.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
| * [ATM]: When proc_create() fails, do some error handling work and return -ENOMEM.Wang Chen2008-03-242-3/+20
| | | | | | | | | | Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * [9P] net/9p/trans_fd.c: remove unused variableJulia Lawall2008-03-231-2/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The variable cb is initialized but never used otherwise. The semantic patch that makes this change is as follows: (http://www.emn.fr/x-info/coccinelle/) // <smpl> @@ type T; identifier i; constant C; @@ ( extern T i; | - T i; <+... when != i - i = C; ...+> ) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
| * [IPV6] net/ipv6/ndisc.c: remove unused variableJulia Lawall2008-03-231-2/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The variable hlen is initialized but never used otherwise. The semantic patch that makes this change is as follows: (http://www.emn.fr/x-info/coccinelle/) // <smpl> @@ type T; identifier i; constant C; @@ ( extern T i; | - T i; <+... when != i - i = C; ...+> ) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
| * [IPV4] fib_trie: fix warning from rcu_assign_poingerStephen Hemminger2008-03-231-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | This gets rid of a warning caused by the test in rcu_assign_pointer. I tried to fix rcu_assign_pointer, but that devolved into a long set of discussions about doing it right that came to no real solution. Since the test in rcu_assign_pointer for constant NULL would never succeed in fib_trie, just open code instead. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * [TCP]: Let skbs grow over a page on fast peersHerbert Xu2008-03-221-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While testing the virtio-net driver on KVM with TSO I noticed that TSO performance with a 1500 MTU is significantly worse compared to the performance of non-TSO with a 16436 MTU. The packet dump shows that most of the packets sent are smaller than a page. Looking at the code this actually is quite obvious as it always stop extending the packet if it's the first packet yet to be sent and if it's larger than the MSS. Since each extension is bound by the page size, this means that (given a 1500 MTU) we're very unlikely to construct packets greater than a page, provided that the receiver and the path is fast enough so that packets can always be sent immediately. The fix is also quite obvious. The push calls inside the loop is just an optimisation so that we don't end up doing all the sending at the end of the loop. Therefore there is no specific reason why it has to do so at MSS boundaries. For TSO, the most natural extension of this optimisation is to do the pushing once the skb exceeds the TSO size goal. This is what the patch does and testing with KVM shows that the TSO performance with a 1500 MTU easily surpasses that of a 16436 MTU and indeed the packet sizes sent are generally larger than 16436. I don't see any obvious downsides for slower peers or connections, but it would be prudent to test this extensively to ensure that those cases don't regress. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
| * [DLCI]: Fix tiny race between module unload and sock_ioctl.Pavel Emelyanov2008-03-211-4/+3Star
| | | | | | | | | | | | | | | | This is a narrow pedantry :) but the dlci_ioctl_hook check and call should not be parted with the mutex lock. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * [IPV4]: Fix null dereference in ip_defragPhil Oester2008-03-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Been seeing occasional panics in my testing of 2.6.25-rc in ip_defrag. Offending line in ip_defrag is here: net = skb->dev->nd_net where dev is NULL. Bisected the problem down to commit ac18e7509e7df327e30d6e073a787d922eaf211d ([NETNS][FRAGS]: Make the inet_frag_queue lookup work in namespaces). Below patch (idea from Patrick McHardy) fixes the problem for me. Signed-off-by: Phil Oester <kernel@linuxace.com> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* | [SCTP]: Remove redundant wrapper functions.Florian Westphal2008-03-242-17/+3Star
| | | | | | | | | | | | | | | | | | | | sctp_datamsg_free and sctp_datamsg_track are just aliases for sctp_datamsg_put and sctp_chunk_hold, respectively. Saves 32 Bytes on x86. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* | [SCTP]: Replace char msg[] with static const char[].Florian Westphal2008-03-242-5/+5
| | | | | | | | | | | | | | | | 133886 2004 220 136110 213ae sctp.new/sctp.o 134018 2004 220 136242 21432 sctp.old/sctp.o Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* | fib_trie: print information on all routing tablesStephen Hemminger2008-03-241-84/+95
| | | | | | | | | | | | | | | | Make /proc/net/fib_trie and /proc/net/fib_triestat display all routing tables, not just local and main. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | [AF_PACKET]: Remove unused variable.Jiri Olsa2008-03-241-2/+1Star
| | | | | | | | | | Signed-off-by: Jiri Olsa <olsajiri@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | [TCP]: Shrink syncookie_secret by 8 byte.Florian Westphal2008-03-242-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the first u32 copied from syncookie_secret is overwritten by the minute-counter four lines below. After adjusting the destination address, the size of syncookie_secret can be reduced accordingly. AFAICS, the only other user of syncookie_secret[] is the ipv6 syncookie support. Because ipv6 syncookies only grab 44 bytes from syncookie_secret[], this shouldn't affect them in any way. With fixes from Glenn Griffin. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Glenn Griffin <ggriffin.kernel@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | [IPV6]: Remove unused code in ndisc_send_redirect().Rami Rosen2008-03-241-3/+0Star
| | | | | | | | | | | | | | | | This patches removes unused code in ndisc_send_redirect() method in net/ipv6/ndisc.c. Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | [IPV4] route: use read_mostlyStephen Hemminger2008-03-231-19/+17Star
| | | | | | | | | | | | | | | | | | | | | | The route table parameters are set based on system memory and sysctl values that almost never change. Also the genid only changes every 10 minutes. RTprint is defined by never used. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | [IPV4]: sk parameter is unused in ipv4_dst_blackhole.Denis V. Lunev2008-03-231-2/+2
| | | | | | | | | | | | | | Just remove it. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | [RAW]: Add raw_hashinfo member on struct proto.Pavel Emelyanov2008-03-232-27/+11Star
| | | | | | | | | | | | | | | | | | | | | | | | Sorry for the patch sequence confusion :| but I found that the similar thing can be done for raw sockets easily too late. Expand the proto.h union with the raw_hashinfo member and use it in raw_prot and rawv6_prot. This allows to drop the protocol specific versions of hash and unhash callbacks. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | [UDP]: Make full use of proto.h.udp_hash innovation.Pavel Emelyanov2008-03-236-40/+18Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | After this we have only udp_lib_get_port to get the port and two stubs for ipv4 and ipv6. No difference in udp and udplite except for initialized h.udp_hash member. I tried to find a graceful way to drop the only difference between udp_v4_get_port and udp_v6_get_port (i.e. the rcv_saddr comparison routine), but adding one more callback on the struct proto didn't appear such :( Maybe later. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | [SOCK]: Add udp_hash member to struct proto.Pavel Emelyanov2008-03-237-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Inspired by the commit ab1e0a13 ([SOCK] proto: Add hashinfo member to struct proto) from Arnaldo, I made similar thing for UDP/-Lite IPv4 and -v6 protocols. The result is not that exciting, but it removes some levels of indirection in udpxxx_get_port and saves some space in code and text. The first step is to union existing hashinfo and new udp_hash on the struct proto and give a name to this union, since future initialization of tcpxxx_prot, dccp_vx_protinfo and udpxxx_protinfo will cause gcc warning about inability to initialize anonymous member this way. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | [IPV4]: Always pass ip_options pointer into ip_options_compile.Denis V. Lunev2008-03-232-12/+10Star
| | | | | | | | | | | | | | This makes code a bit more uniform and straigthforward. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | [IPV4]: Remove unused ip_options->is_data.Denis V. Lunev2008-03-232-6/+0Star
| | | | | | | | | | | | | | | | | | ip_options->is_data is assigned only and never checked. The structure is not a part of kernel interface to the userspace. So, it is safe to remove this field. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | [IPV4]: Remove unnecessary check for opt->is_data in ip_options_compile.Denis V. Lunev2008-03-231-2/+1Star
| | | | | | | | | | | | | | | | | | | | | | | | | | There is the only way to reach ip_options compile with opt != NULL: ip_options_get_finish opt->is_data = 1; ip_options_compile(opt, NULL) So, checking for is_data inside opt != NULL branch is not needed. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | [TCP]: TCP_DEFER_ACCEPT updates - process as establishedPatrick McManus2008-03-226-31/+89
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change TCP_DEFER_ACCEPT implementation so that it transitions a connection to ESTABLISHED after handshake is complete instead of leaving it in SYN-RECV until some data arrvies. Place connection in accept queue when first data packet arrives from slow path. Benefits: - established connection is now reset if it never makes it to the accept queue - diagnostic state of established matches with the packet traces showing completed handshake - TCP_DEFER_ACCEPT timeouts are expressed in seconds and can now be enforced with reasonable accuracy instead of rounding up to next exponential back-off of syn-ack retry. Signed-off-by: Patrick McManus <mcmanus@ducksong.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | [TCP]: TCP_DEFER_ACCEPT updates - dont retxmt synackPatrick McManus2008-03-221-2/+3
| | | | | | | | | | | | | | | | | | | | | | a socket in LISTEN that had completed its 3 way handshake, but not notified userspace because of SO_DEFER_ACCEPT, would retransmit the already acked syn-ack during the time it was waiting for the first data byte from the peer. Signed-off-by: Patrick McManus <mcmanus@ducksong.com> Acked-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>