summaryrefslogtreecommitdiffstats
path: root/include
Commit message (Collapse)AuthorAgeFilesLines
* ipv6 sit: RCU conversion phase IEric Dumazet2009-10-241-0/+1
| | | | | | | | | | SIT tunnels use one rwlock to protect their prl entries. This first patch adds RCU locking for prl management, with standard call_rcu() calls. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* pkt_sched: skbedit add support for setting markjamal2009-10-232-0/+4
| | | | | | | This adds support for setting the skb mark. Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Fix for dst_negative_adviceKrishna Kumar2009-10-211-2/+10
| | | | | | | | | | | | | dst_negative_advice() should check for changed dst and reset sk_tx_queue_mapping accordingly. Pass sock to the callers of dst_negative_advice. (sk_reset_txq is defined just for use by dst_negative_advice. The only way I could find to get around this is to move dst_negative_() from dst.h to dst.c, include sock.h in dst.c, etc) Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Introduce sk_tx_queue_mappingKrishna Kumar2009-10-211-0/+26
| | | | | | | | | | | | Introduce sk_tx_queue_mapping; and functions that set, test and get this value. Reset sk_tx_queue_mapping to -1 whenever the dst cache is set/reset, and in socket alloc. Setting txq to -1 and using valid txq=<0 to n-1> allows the tx path to use the value of sk_tx_queue_mapping directly instead of subtracting 1 on every tx. Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Avoid compiler warning for mmsghdr when CONFIG_COMPAT is not selectedArnaldo Carvalho de Melo2009-10-201-1/+5
| | | | | | Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* filter: Add SKF_AD_QUEUE instructionEric Dumazet2009-10-201-1/+2
| | | | | | | | | | | | | It can help being able to filter packets on their queue_mapping. If filter performance is not good, we could add a "numqueue" field in struct packet_type, so that netif_nit_deliver() and other functions can directly ignore packets with not expected queue number. Lets experiment this simple filter extension first. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* can: provide library functions for skb allocationWolfgang Grandegger2009-10-201-0/+4
| | | | | | | | | | | | | | | | | | | | | | This patch makes the private functions alloc_can_skb() and alloc_can_err_skb() of the at91_can driver public and adapts all drivers to use these. While making the patch I realized, that the skb's are *not* setup consistently. It's now done as shown below: skb->protocol = htons(ETH_P_CAN); skb->pkt_type = PACKET_BROADCAST; skb->ip_summed = CHECKSUM_UNNECESSARY; *cf = (struct can_frame *)skb_put(skb, sizeof(struct can_frame)); memset(*cf, 0, sizeof(struct can_frame)); The frame is zeroed out to avoid uninitialized data to be passed to user space. Some drivers or library code did not set "pkt_type" or "ip_summed". Also, "__constant_htons()" should not be used for runtime invocations, as pointed out by David Miller. Signed-off-by: Wolfgang Grandegger <wg@grandegger.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* pkt_sched: ingress socket filter by markjamal2009-10-201-1/+2
| | | | | | | | Allow bpf to set a filter to drop packets that dont match a specific mark Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
* xfrm: remove skb_icv_walkSteffen Klassert2009-10-191-3/+0Star
| | | | | | | | The last users of skb_icv_walk are converted to ahash now, so skb_icv_walk is unused and can be removed. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ah: Remove obsolete codeSteffen Klassert2009-10-191-26/+3Star
| | | | | | | | ah4 and ah6 are converted to ahash now, so we can remove the code for the obsolete hash algorithm. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ah: Add struct crypto_ahash to ah_dataSteffen Klassert2009-10-191-0/+1
| | | | | | | | To support for ahash algorithms, we add a pointer to a crypto_ahash to ah_data. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* inet: rename some inet_sock fieldsEric Dumazet2009-10-196-34/+34
| | | | | | | | | | | | | | | | In order to have better cache layouts of struct sock (separate zones for rx/tx paths), we need this preliminary patch. Goal is to transfert fields used at lookup time in the first read-mostly cache line (inside struct sock_common) and move sk_refcnt to a separate cache line (only written by rx path) This patch adds inet_ prefix to daddr, rcv_saddr, dport, num, saddr, sport and id fields. This allows a future patch to define these fields as macros, like sk_refcnt, without name clashes. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Phonet: routing table Netlink interfaceRémi Denis-Courmont2009-10-151-0/+1
| | | | | Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Phonet: routing table backendRémi Denis-Courmont2009-10-151-0/+5
| | | | | | | | The Phonet "universe" only has 64 addresses, so we keep a trivial flat routing table. Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Phonet: deliver broadcast packets to broadcast socketsRémi Denis-Courmont2009-10-151-0/+1
| | | | | Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of ↵David S. Miller2009-10-132-5/+7
|\ | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
| * Merge branch 'master' of ↵David S. Miller2009-10-131-0/+2
| |\ | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
| | * mac80211: document ieee80211_rx() context requirementJohannes Berg2009-10-121-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ieee80211_rx() must be called with softirqs disabled since the networking stack requires this for netif_rx() and some code in mac80211 can assume that it can not be processing its own tasklet and this call at the same time. It may be possible to remove this requirement after a careful audit of mac80211 and doing any needed locking improvements in it along with disabling softirqs around netif_rx(). An alternative might be to push all packet processing to process context in mac80211, instead of to the tasklet, and add other synchronisation. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| * | net: Fix struct sock bitfield annotationEric Dumazet2009-10-121-5/+5
| |/ | | | | | | | | | | | | | | | | | | | | | | Since commit a98b65a3 (net: annotate struct sock bitfield), we lost 8 bytes in struct sock on 64bit arches because of kmemcheck_bitfield_end(flags) misplacement. Fix this by putting together sk_shutdown, sk_no_check, sk_userlocks, sk_protocol and sk_type in the 'flags' 32bits bitfield Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: smsc911x: allow platform_data to specify mac addressManuel Lauss2009-10-131-0/+1
| | | | | | | | | | | | | | Extend the driver to accept a MAC address specified in platform_data. Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | can: make the number of echo skb's configurableWolfgang Grandegger2009-10-131-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch allows the CAN controller driver to define the number of echo skb's used for the local loopback (echo), as suggested by Kurt Van Dijck, with the function: struct net_device *alloc_candev(int sizeof_priv, unsigned int echo_skb_max); The CAN drivers have been adapted accordingly. For the ems_usb driver, as suggested by Sebastian Haas, the number of echo skb's has been increased to 10, which improves the transmission performance a lot. Signed-off-by: Wolfgang Grandegger <wg@grandegger.com> Signed-off-by: Kurt Van Dijck <kurt.van.dijck@eia.be> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: Add netdev_alloc_skb_ip_align() helperEric Dumazet2009-10-131-0/+10
| | | | | | | | | | | | | | | | Instead of hardcoding NET_IP_ALIGN stuff in various network drivers, we can add a helper around netdev_alloc_skb() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | tcp: replace ehash_size by ehash_maskEric Dumazet2009-10-131-2/+2
| | | | | | | | | | | | | | | | Storing the mask (size - 1) instead of the size allows fast path to be a bit faster. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: Introduce recvmmsg socket syscallArnaldo Carvalho de Melo2009-10-134-0/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Meaning receive multiple messages, reducing the number of syscalls and net stack entry/exit operations. Next patches will introduce mechanisms where protocols that want to optimize this operation will provide an unlocked_recvmsg operation. This takes into account comments made by: . Paul Moore: sock_recvmsg is called only for the first datagram, sock_recvmsg_nosec is used for the rest. . Caitlin Bestler: recvmmsg now has a struct timespec timeout, that works in the same fashion as the ppoll one. If the underlying protocol returns a datagram with MSG_OOB set, this will make recvmmsg return right away with as many datagrams (+ the OOB one) it has received so far. . Rémi Denis-Courmont & Steven Whitehouse: If we receive N < vlen datagrams and then recvmsg returns an error, recvmmsg will return the successfully received datagrams, store the error and return it in the next call. This paves the way for a subsequent optimization, sk_prot->unlocked_recvmsg, where we will be able to acquire the lock only at batch start and end, not at every underlying recvmsg call. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: Generalize socket rx gap / receive queue overflow cmsgNeil Horman2009-10-123-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Create a new socket level option to report number of queue overflows Recently I augmented the AF_PACKET protocol to report the number of frames lost on the socket receive queue between any two enqueued frames. This value was exported via a SOL_PACKET level cmsg. AFter I completed that work it was requested that this feature be generalized so that any datagram oriented socket could make use of this option. As such I've created this patch, It creates a new SOL_SOCKET level option called SO_RXQ_OVFL, which when enabled exports a SOL_SOCKET level cmsg that reports the nubmer of times the sk_receive_queue overflowed between any two given frames. It also augments the AF_PACKET protocol to take advantage of this new feature (as it previously did not touch sk->sk_drops, which this patch uses to record the overflow count). Tested successfully by me. Notes: 1) Unlike my previous patch, this patch simply records the sk_drops value, which is not a number of drops between packets, but rather a total number of drops. Deltas must be computed in user space. 2) While this patch currently works with datagram oriented protocols, it will also be accepted by non-datagram oriented protocols. I'm not sure if thats agreeable to everyone, but my argument in favor of doing so is that, for those protocols which aren't applicable to this option, sk_drops will always be zero, and reporting no drops on a receive queue that isn't used for those non-participating protocols seems reasonable to me. This also saves us having to code in a per-protocol opt in mechanism. 3) This applies cleanly to net-next assuming that commit 977750076d98c7ff6cbda51858bb5a5894a9d9ab (my af packet cmsg patch) is reverted Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Revert "af_packet: add interframe drop cmsg (v6)"David S. Miller2009-10-121-2/+0Star
| | | | | | | | | | | | | | | | This reverts commit 977750076d98c7ff6cbda51858bb5a5894a9d9ab. Neil is reimplementing this generically, outside of AF_PACKET. Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'master' of ↵David S. Miller2009-10-121-1/+1
|\| | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
| * include/linux/netdevice.h: fix nanodoc mismatchWolfram Sang2009-10-071-1/+1
| | | | | | | | | | | | | | nanodoc was missing an ndo_-prefix. Signed-off-by: Wolfram Sang <w.sang@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'master' of ↵David S. Miller2009-10-095-22/+54
|\ \ | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6
| * | cfg80211: add firmware and hardware version to wiphyKalle Valo2009-10-071-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It's useful to provide firmware and hardware version to user space and have a generic interface to retrieve them. Users can provide the version information in bug reports etc. Add fields for firmware and hardware version to struct wiphy. (Dropped nl80211 bits for now and modified remaining bits in favor of ethtool. -- JWL) Cc: Kalle Valo <kalle.valo@nokia.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| * | wext: refactorJohannes Berg2009-10-074-22/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Refactor wext to * split out iwpriv handling * split out iwspy handling * split out procfs support * allow cfg80211 to have wireless extensions compat code w/o CONFIG_WIRELESS_EXT After this, drivers need to - select WIRELESS_EXT - for wext support - select WEXT_PRIV - for iwpriv support - select WEXT_SPY - for iwspy support except cfg80211 -- which gets new hooks in wext-core.c and can then get wext handlers without CONFIG_WIRELESS_EXT. Wireless extensions procfs support is auto-selected based on PROC_FS and anything that requires the wext core (i.e. WIRELESS_EXT or CFG80211_WEXT). Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| * | nl80211: report age of scan resultsHolger Schurig2009-10-071-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Linux keeps scan results up to 15 seconds. This can be a problem for fast moving clients: they get back stale data. But if the kernel reports the age of the BSS items, then user-space can simply weed out old entries by itself. Signed-off-by: Holger Schurig <hs4233@mail.mn-solutions.de> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
* | | ipv6: Fix the size overflow of addrconf_sysctl arrayJin Dongming2009-10-091-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (This patch fixes bug of commit f7734fdf61ec6bb848e0bafc1fb8bad2c124bb50 title "make TLLAO option for NA packets configurable") When the IPV6 conf is used, the function sysctl_set_parent is called and the array addrconf_sysctl is used as a parameter of the function. The above patch added new conf "force_tllao" into the array addrconf_sysctl, but the size of the array was not modified, the static allocated size is DEVCONF_MAX + 1 but the real size is DEVCONF_MAX + 2, so the problem is that the function sysctl_set_parent accessed wrong address. I got the following information. Call Trace: [<ffffffff8106085d>] sysctl_set_parent+0x29/0x3e [<ffffffff8106085d>] sysctl_set_parent+0x29/0x3e [<ffffffff8106085d>] sysctl_set_parent+0x29/0x3e [<ffffffff8106085d>] sysctl_set_parent+0x29/0x3e [<ffffffff8106085d>] sysctl_set_parent+0x29/0x3e [<ffffffff810622d5>] __register_sysctl_paths+0xde/0x272 [<ffffffff8110892d>] ? __kmalloc_track_caller+0x16e/0x180 [<ffffffffa00cfac3>] ? __addrconf_sysctl_register+0xc5/0x144 [ipv6] [<ffffffff8141f2c9>] register_net_sysctl_table+0x48/0x4b [<ffffffffa00cfaf5>] __addrconf_sysctl_register+0xf7/0x144 [ipv6] [<ffffffffa00cfc16>] addrconf_init_net+0xd4/0x104 [ipv6] [<ffffffff8139195f>] setup_net+0x35/0x82 [<ffffffff81391f6c>] copy_net_ns+0x76/0xe0 [<ffffffff8107ad60>] create_new_namespaces+0xf0/0x16e [<ffffffff8107afee>] copy_namespaces+0x65/0x9f [<ffffffff81056dff>] copy_process+0xb2c/0x12c3 [<ffffffff810576e1>] do_fork+0x14b/0x2d2 [<ffffffff8107ac4e>] ? up_read+0xe/0x10 [<ffffffff81438e73>] ? do_page_fault+0x27a/0x2aa [<ffffffff8101044b>] sys_clone+0x28/0x2a [<ffffffff81011fb3>] stub_clone+0x13/0x20 [<ffffffff81011c72>] ? system_call_fastpath+0x16/0x1b And the information of IPV6 in .config is as following. IPV6 in .config: CONFIG_IPV6=m CONFIG_IPV6_PRIVACY=y CONFIG_IPV6_ROUTER_PREF=y CONFIG_IPV6_ROUTE_INFO=y CONFIG_IPV6_OPTIMISTIC_DAD=y CONFIG_IPV6_MIP6=m CONFIG_IPV6_SIT=m # CONFIG_IPV6_SIT_6RD is not set CONFIG_IPV6_NDISC_NODETYPE=y CONFIG_IPV6_TUNNEL=m CONFIG_IPV6_MULTIPLE_TABLES=y CONFIG_IPV6_SUBTREES=y CONFIG_IPV6_MROUTE=y CONFIG_IPV6_PIMSM_V2=y # CONFIG_IP_VS_IPV6 is not set CONFIG_NF_CONNTRACK_IPV6=m CONFIG_IP6_NF_MATCH_IPV6HEADER=m I confirmed this patch fixes this problem. Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | can: add TI CAN (HECC) driverAnant Gole2009-10-081-0/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | TI HECC (High End CAN Controller) module is found on many TI devices. It has 32 hardware mailboxes with full implementation of CAN protocol 2.0B with bus speeds up to 1Mbps. Specifications of the module are available on TI web <http://www.ti.com> Signed-off-by: Anant Gole <anantgole@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | udp: dynamically size hash tables at boot timeEric Dumazet2009-10-082-6/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | UDP_HTABLE_SIZE was initialy defined to 128, which is a bit small for several setups. 4000 active UDP sockets -> 32 sockets per chain in average. An incoming frame has to lookup all sockets to find best match, so long chains hurt latency. Instead of a fixed size hash table that cant be perfect for every needs, let UDP stack choose its table size at boot time like tcp/ip route, using alloc_large_system_hash() helper Add an optional boot parameter, uhash_entries=x so that an admin can force a size between 256 and 65536 if needed, like thash_entries and rhash_entries. dmesg logs two new lines : [ 0.647039] UDP hash table entries: 512 (order: 0, 4096 bytes) [ 0.647099] UDP Lite hash table entries: 512 (order: 0, 4096 bytes) Maximal size on 64bit arches would be 65536 slots, ie 1 MBytes for non debugging spinlocks. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | IPv6: Fix 6RD build errorBrian Haley2009-10-071-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix build error introduced in commit fa857afcf - ipv6 sit: 6rd (IPv6 Rapid Deployment) Support. Struct in6_addr is the issue. I'm only seeing this on x86_64 systems, not on 32-bit with same IPv6 config options, so it could be there's a missing forward declaration somewhere, but including the correct header file fixes the problem too. CC [M] net/ipv6/ip6_tunnel.o In file included from net/ipv6/ip6_tunnel.c:31: include/linux/if_tunnel.h:59: error: field ‘prefix’ has incomplete type make[2]: *** [net/ipv6/ip6_tunnel.o] Error 1 make[1]: *** [net/ipv6] Error 2 Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | dcb: data center bridging ops should be r/oStephen Hemminger2009-10-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | The data center bridging ops structure can be const Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | net: mark net_proto_ops as constStephen Hemminger2009-10-071-1/+1
| | | | | | | | | | | | | | | | | | | | | All usages of structure net_proto_ops should be declared const. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | make TLLAO option for NA packets configurableOctavian Purdila2009-10-071-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On Friday 02 October 2009 20:53:51 you wrote: > This is good although I would have shortened the name. Ah, I knew I forgot something :) Here is v4. tavi >From 24d96d825b9fa832b22878cc6c990d5711968734 Mon Sep 17 00:00:00 2001 From: Octavian Purdila <opurdila@ixiacom.com> Date: Fri, 2 Oct 2009 00:51:15 +0300 Subject: [PATCH] ipv6: new sysctl for sending TLLAO with unicast NAs Neighbor advertisements responding to unicast neighbor solicitations did not include the target link-layer address option. This patch adds a new sysctl option (disabled by default) which controls whether this option should be sent even with unicast NAs. The need for this arose because certain routers expect the TLLAO in some situations even as a response to unicast NS packets. Moreover, RFC 2461 recommends sending this to avoid a race condition (section 4.4, Target link-layer address) Signed-off-by: Cosmin Ratiu <cratiu@ixiacom.com> Signed-off-by: Octavian Purdila <opurdila@ixiacom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | ethtool: Add reset operationBen Hutchings2009-10-071-0/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After updating firmware stored in flash, users may wish to reset the relevant hardware and start the new firmware immediately. This should not be completely automatic as it may be disruptive. A selective reset may also be useful for debugging or diagnostics. This adds a separate reset operation which takes flags indicating the components to be reset. Drivers are allowed to reset only a subset of those requested, and must indicate the actual subset. This allows the use of generic component masks and some future expansion. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | pkt_sched: gen_estimator: Dont report fake rate estimatorsEric Dumazet2009-10-071-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jarek Poplawski a écrit : > > > Hmm... So you made me to do some "real" work here, and guess what?: > there is one serious checkpatch warning! ;-) Plus, this new parameter > should be added to the function description. Otherwise: > Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> > > Thanks, > Jarek P. > > PS: I guess full "Don't" would show we really mean it... Okay :) Here is the last round, before the night ! Thanks again [RFC] pkt_sched: gen_estimator: Don't report fake rate estimators We currently send TCA_STATS_RATE_EST elements to netlink users, even if no estimator is running. # tc -s -d qdisc qdisc pfifo_fast 0: dev eth0 root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 Sent 112833764978 bytes 1495081739 pkt (dropped 0, overlimits 0 requeues 0) rate 0bit 0pps backlog 0b 0p requeues 0 User has no way to tell if the "rate 0bit 0pps" is a real estimation, or a fake one (because no estimator is active) After this patch, tc command output is : $ tc -s -d qdisc qdisc pfifo_fast 0: dev eth0 root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 Sent 561075 bytes 1196 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 We add a parameter to gnet_stats_copy_rate_est() function so that it can use gen_estimator_active(bstats, r), as suggested by Jarek. This parameter can be NULL if check is not necessary, (htb for example has a mandatory rate estimator) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | ipv6 sit: 6rd (IPv6 Rapid Deployment) Support.YOSHIFUJI Hideaki / 吉藤英明2009-10-072-0/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | IPv6 Rapid Deployment (6rd; draft-ietf-softwire-ipv6-6rd) builds upon mechanisms of 6to4 (RFC3056) to enable a service provider to rapidly deploy IPv6 unicast service to IPv4 sites to which it provides customer premise equipment. Like 6to4, it utilizes stateless IPv6 in IPv4 encapsulation in order to transit IPv4-only network infrastructure. Unlike 6to4, a 6rd service provider uses an IPv6 prefix of its own in place of the fixed 6to4 prefix. With this option enabled, the SIT driver offers 6rd functionality by providing additional ioctl API to configure the IPv6 Prefix for in stead of static 2002::/16 for 6to4. Original patch was done by Alexandre Cassen <acassen@freebox.fr> based on old Internet-Draft. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | add vif using local interface index instead of IPIlia K2009-10-071-4/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When routing daemon wants to enable forwarding of multicast traffic it performs something like: struct vifctl vc = { .vifc_vifi = 1, .vifc_flags = 0, .vifc_threshold = 1, .vifc_rate_limit = 0, .vifc_lcl_addr = ip, /* <--- ip address of physical interface, e.g. eth0 */ .vifc_rmt_addr.s_addr = htonl(INADDR_ANY), }; setsockopt(fd, IPPROTO_IP, MRT_ADD_VIF, &vc, sizeof(vc)); This leads (in the kernel) to calling vif_add() function call which search the (physical) device using assigned IP address: dev = ip_dev_find(net, vifc->vifc_lcl_addr.s_addr); The current API (struct vifctl) does not allow to specify an interface other way than using it's IP, and if there are more than a single interface with specified IP only the first one will be found. The attached patch (against 2.6.30.4) allows to specify an interface by its index, instead of IP address: struct vifctl vc = { .vifc_vifi = 1, .vifc_flags = VIFF_USE_IFINDEX, /* NEW */ .vifc_threshold = 1, .vifc_rate_limit = 0, .vifc_lcl_ifindex = if_nametoindex("eth0"), /* NEW */ .vifc_rmt_addr.s_addr = htonl(INADDR_ANY), }; setsockopt(fd, IPPROTO_IP, MRT_ADD_VIF, &vc, sizeof(vc)); Signed-off-by: Ilia K. <mail4ilia@gmail.com> === modified file 'include/linux/mroute.h' Signed-off-by: David S. Miller <davem@davemloft.net>
* | | Merge branch 'master' of ↵David S. Miller2009-10-071-18/+3Star
|\ \ \ | | |/ | |/| | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
| * | net: Support inclusion of <linux/socket.h> before <sys/socket.h>Ben Hutchings2009-10-051-18/+3Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The following user-space program fails to compile: #include <linux/socket.h> #include <sys/socket.h> int main() { return 0; } The reason is that <linux/socket.h> tests __GLIBC__ to decide whether it should define various structures and macros that are now defined for user-space by <sys/socket.h>, but __GLIBC__ is not defined if no libc headers have yet been included. It seems safe to drop support for libc 5 now. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Bastian Blank <waldi@debian.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | net: speedup sk_wake_async()Eric Dumazet2009-10-071-1/+2
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | An incoming datagram must bring into cpu cache *lot* of cache lines, in particular : (other parts omitted (hash chains, ip route cache...)) On 32bit arches : offsetof(struct sock, sk_rcvbuf) =0x30 (read) offsetof(struct sock, sk_lock) =0x34 (rw) offsetof(struct sock, sk_sleep) =0x50 (read) offsetof(struct sock, sk_rmem_alloc) =0x64 (rw) offsetof(struct sock, sk_receive_queue)=0x74 (rw) offsetof(struct sock, sk_forward_alloc)=0x98 (rw) offsetof(struct sock, sk_callback_lock)=0xcc (rw) offsetof(struct sock, sk_drops) =0xd8 (read if we add dropcount support, rw if frame dropped) offsetof(struct sock, sk_filter) =0xf8 (read) offsetof(struct sock, sk_socket) =0x138 (read) offsetof(struct sock, sk_data_ready) =0x15c (read) We can avoid sk->sk_socket and socket->fasync_list referencing on sockets with no fasync() structures. (socket->fasync_list ptr is probably already in cache because it shares a cache line with socket->wait, ie location pointed by sk->sk_sleep) This avoids one cache line load per incoming packet for common cases (no fasync()) We can leave (or even move in a future patch) sk->sk_socket in a cold location Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: introduce NETDEV_POST_INIT notifierJohannes Berg2009-10-051-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | For various purposes including a wireless extensions bugfix, we need to hook into the netdev creation before before netdev_register_kobject(). This will also ease doing the dev type assignment that Marcel was working on for cfg80211 drivers w/o touching them all. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | usbnet: Use wwan%d interface name for mobile broadband devicesMarcel Holtmann2009-10-051-0/+1
| | | | | | | | | | | | | | | | | | Add support for usbnet based devices like CDC-Ether to indicate that they are actually mobile broadband devices. In that case use wwan%d as default interface name. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | tunnels: Optimize tx pathEric Dumazet2009-10-051-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We currently dirty a cache line to update tunnel device stats (tx_packets/tx_bytes). We better use the txq->tx_bytes/tx_packets counters that already are present in cpu cache, in the cache line shared with txq->_xmit_lock This patch extends IPTUNNEL_XMIT() macro to use txq pointer provided by the caller. Also &tunnel->dev->stats can be replaced by &dev->stats Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | ipv4: fib table algorithm performance improvementStephen Hemminger2009-10-051-11/+14
| | | | | | | | | | | | | | | | | | | | The FIB algorithim for IPV4 is set at compile time, but kernel goes through the overhead of function call indirection at runtime. Save some cycles by turning the indirect calls to direct calls to either hash or trie code. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>