summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* cxgb4: don't offload Rx checksums for IPv6 fragmentsDimitris Michailidis2010-08-032-3/+6
| | | | | | | | The checksum provided by the device doesn't include the L3 headers, as IPv6 expects. Signed-off-by: Dimitris Michailidis <dm@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb4: disable an interrupt that is neither used nor servicedDimitris Michailidis2010-08-031-1/+1
| | | | | Signed-off-by: Dimitris Michailidis <dm@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* hp100: unmap memory on error pathDan Carpenter2010-08-031-2/+4
| | | | | | | There was an error path where "mem_ptr_virt" didn't get unmapped. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Tulip: don't initialize SBE xT3E3 WAN ports.Krzysztof Hałasa2010-08-032-0/+9
| | | | | | | | SBE 2T3E3 cards use DECchips 21143 but they need a different driver. Don't even try to use a normal tulip driver with them. Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
* drivers/net/wan/farsync.c: Use standard pr_<level>Joe Perches2010-08-031-58/+53Star
| | | | | | | Remove locally defined equivalents Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'vhost-net-next' of ↵David S. Miller2010-08-035-91/+515
|\ | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
| * vhost-net: mergeable buffers supportDavid Stevens2010-07-283-18/+315
| | | | | | | | | | | | | | | | | | | | | | | | | | This adds support for mergeable buffers in vhost-net: this is needed for older guests without indirect buffer support, as well as for zero copy with some devices. Includes changes by Michael S. Tsirkin to make the patch as low risk as possible (i.e., close to no changes when feature is disabled). Signed-off-by: David Stevens <dlstevens@us.ibm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vhost: apply cgroup to vhost workersMichael S. Tsirkin2010-07-281-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Apply the cgroup of the owner task to the created vhost worker. Based on patches from Sridhar Samudrala's and Tejun Heo. Later we'll need to also apply cpumask and probably priority of the owner process. Discussion on the best way to do this is still ongoing. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Sridhar Samudrala <samudrala.sridhar@gmail.com> Cc: Li Zefan <lizf@cn.fujitsu.com>
| * cgroups: Add an API to attach a task to current task's cgroupSridhar Samudrala2010-07-282-0/+30
| | | | | | | | | | | | | | | | | | | | Add a new kernel API to attach a task to current task's cgroup in all the active hierarchies. Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Paul Menage <menage@google.com> Acked-by: Li Zefan <lizf@cn.fujitsu.com>
| * vhost: replace vhost_workqueue with per-vhost kthreadTejun Heo2010-07-283-73/+164
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace vhost_workqueue with per-vhost kthread. Other than callback argument change from struct work_struct * to struct vhost_work *, there's no visible change to vhost_poll_*() interface. This conversion is to make each vhost use a dedicated kthread so that resource control via cgroup can be applied. Partially based on Sridhar Samudrala's patch. * Updated to use sub structure vhost_work instead of directly using vhost_poll at Michael's suggestion. * Added flusher wake_up() optimization at Michael's suggestion. Changes by MST: * Converted atomics/barrier use to a spinlock. * Create thread on SET_OWNER * Fix flushing Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Cc: Sridhar Samudrala <samudrala.sridhar@gmail.com>
* | tg3: Update version to 3.113Matt Carlson2010-08-031-2/+2
| | | | | | | | | | | | | | | | | | This patch updates the tg3 version to 3.113. Reviewed-by: Benjamin Li <benli@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | tg3: Migrate tg3_flags to phy_flagsMatt Carlson2010-08-032-152/+159
| | | | | | | | | | | | | | | | | | | | This patch moves most of the phy related flag definitions over to the phyflags member and changes the code accordingly. Reviewed-by: Benjamin Li <benli@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | tg3: Create phy_flags and migrate phy_is_low_powerMatt Carlson2010-08-032-19/+20
| | | | | | | | | | | | | | | | | | | | | | This patch deletes the link_config.phy_is_low_power flag and creates a new phy_flags device member to store all phy related settings. All the code is converted accordingly. Reviewed-by: Benjamin Li <benli@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | tg3: Add phy-related preprocessor constantsMatt Carlson2010-08-032-23/+27
| | | | | | | | | | | | | | | | | | | | This patch replaces some instances of hardcoded phy register values with preprocessor equivalents. Reviewed-by: Benjamin Li <benli@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | tg3: Add error reporting to tg3_phydsp_write()Matt Carlson2010-08-031-31/+20Star
| | | | | | | | | | | | | | | | | | | | | | This patch adds error reporting to the tg3_phydsp_write() function and converts a few more locations to use this function over the inlined equivalent. Reviewed-by: Benjamin Li <benli@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | tg3: Improve small packet performanceMatt Carlson2010-08-031-1/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | smp_mb() inside tg3_tx_avail() is used twice in the normal tg3_start_xmit() path (see illustration below). The full memory barrier is only necessary during race conditions with tx completion. We can speed up the tx path by replacing smp_mb() in tg3_tx_avail() with a compiler barrier. The compiler barrier is to force the compiler to fetch the tx_prod and tx_cons from memory. In the race condition between tg3_start_xmit() and tg3_tx(), we have the following situation: tg3_start_xmit() tg3_tx() if (!tg3_tx_avail()) BUG(); ... if (!tg3_tx_avail()) netif_tx_stop_queue(); update_tx_index(); smp_mb(); smp_mb(); if (tg3_tx_avail()) if (netif_tx_queue_stopped() && netif_tx_wake_queue(); tg3_tx_avail()) With smp_mb() removed from tg3_tx_avail(), we need to add smp_mb() to tg3_start_xmit() as shown above to properly order netif_tx_stop_queue() and tg3_tx_avail() to check the ring index. If it is not strictly ordered, the tx queue can be stopped forever. This improves performance by about 3% with 2 ports running bi-directional 64-byte packets. Reviewed-by: Benjamin Li <benli@broadcom.com> Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | tg3: Remove 5720, 5750, and 5750MMatt Carlson2010-08-032-6/+0Star
| | | | | | | | | | | | | | | | | | These devices were never released to the public. Reviewed-by: Benjamin Li <benli@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | tg3: Detect APE firmware typesMatt Carlson2010-08-032-1/+10
| | | | | | | | | | | | | | | | | | | | This patch adds code to determine the APE firmware type and report this along with the firmware version. Reviewed-by: Benjamin Li <benli@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | tg3: Restrict ASPM workaround devlistMatt Carlson2010-08-031-1/+3
| | | | | | | | | | | | | | | | | | | | The ASPM workaround setting obtained from NVRAM only works with devices older than 5717. This patch enforces the restriction. Reviewed-by: Benjamin Li <benli@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | tg3: Manage gphy power for CPMU-less devs onlyMatt Carlson2010-08-031-1/+4
| | | | | | | | | | | | | | | | | | | | | | This patch changes the code to only manage the PCIe gphy power for CPMU-less devices only. The CPMU takes over management for newer chips. Reviewed-by: Benjamin Li <benli@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | tg3: Don't access phy test ctrl reg for 5717+Matt Carlson2010-08-032-3/+11
| | | | | | | | | | | | | | | | | | | | The phy test register location has been repurposed for 5717+ devices. This patch changes the code to avoid this location for these devices. Reviewed-by: Benjamin Li <benli@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | tg3: Create TG3_FLG3_5717_PLUS flagMatt Carlson2010-08-032-50/+23Star
| | | | | | | | | | | | | | | | | | | | This patch creates a TG3_FLG3_5717_PLUS flag to collectively describe the set of changes in the ASIC that will apply to all future chip revisions. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | tg3: Disable TSS also during tg3_close()Matt Carlson2010-08-031-1/+1
| | | | | | | | | | | | | | | | | | The TSS flag needs to be turned off during tg3_close(). If the device fails to allocate more than one MSI-X vector the next time the device is brought up, transmits will fail. Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | tg3: Add 5784 ASIC rev to earlier PCIe MPS fixMatt Carlson2010-08-031-2/+1Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | tg3 commit e7126997342560533317d8467e8516119ebcbd21 entitled "tg3: Preserve PCIe MPS setting for new devs" attempted to ensure the PCIe link negotiated Maximum Payload Size (MPS) setting was 128 bytes for all devices that didn't support higher speeds. The 5784 device was mistakenly added to this list when it shouldn't have. This patch removes the 5784 ASIC rev devices from that list. Reviewed-by: Benjamin Li <benli@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'master' of ↵David S. Miller2010-08-0345-363/+806
|\ \ | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6
| * | netfilter: nf_conntrack_acct: use skb->len for accountingChangli Gao2010-08-021-2/+1Star
| | | | | | | | | | | | | | | | | | | | | | | | use skb->len for accounting as xt_quota does. Since nf_conntrack works at the network layer, skb_network_offset should always returns ZERO. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: nf_nat: don't check if the tuple is unique when there isn't any ↵Changli Gao2010-08-023-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | other choice The tuple got from unique_tuple() doesn't need to be really unique, so the check for the unique tuple isn't necessary, when there isn't any other choice. Eliminating the unnecessary nf_nat_used_tuple() can save some CPU cycles too. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: nf_nat: make unique_tuple return voidChangli Gao2010-08-0210-32/+30Star
| | | | | | | | | | | | | | | | | | | | | | | | The only user of unique_tuple() get_unique_tuple() doesn't care about the return value of unique_tuple(), so make unique_tuple() return void (nothing). Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: nf_nat: use local variable hdrlenChangli Gao2010-08-021-11/+7Star
| | | | | | | | | | | | | | | | | | | | | Use local variable hdrlen instead of ip_hdrlen(skb). Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | ipvs: provide default ip_vs_conn_{in,out}_get_protoSimon Horman2010-08-025-153/+63Star
| | | | | | | | | | | | | | | | | | | | | | | | This removes duplicate code by providing a default implementation which is used by 3 of the 4 modules that provide these call. Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | ipvs: remove EXPERIMENTAL tagSimon Horman2010-08-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | IPVS was merged into the kernel quite a long time ago and has been seeing wide-spread production use for even longer. It seems appropriate for it to be no longer tagged as EXPERIMENTAL Signed-off-as: Simon Horman <horms@verge.net.au> Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: nf_conntrack_extend: introduce __nf_ct_ext_exist()Changli Gao2010-08-022-12/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | some users of nf_ct_ext_exist() know ct->ext isn't NULL. For these users, the check for ct->ext isn't necessary, the function __nf_ct_ext_exist() can be used instead. the type of the return value of nf_ct_ext_exist() is changed to bool. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: {ip,ip6,arp}_tables: dont block bottom half more than necessaryEric Dumazet2010-08-023-12/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We currently disable BH for the whole duration of get_counters() On machines with a lot of cpus and large tables, this might be too long. We can disable preemption during the whole function, and disable BH only while fetching counters for the current cpu. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: iptables: use skb->len for accountingChangli Gao2010-07-231-1/+1
| | | | | | | | | | | | | | | | | | | | | Use skb->len for accounting as xt_quota does. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: ip6tables: use skb->len for accountingChangli Gao2010-07-231-3/+1Star
| | | | | | | | | | | | | | | | | | | | | | | | ipv6_hdr(skb)->payload_len is ZERO and can't be used for accounting, if the payload is a Jumbo Payload specified in RFC2675. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | xt_quota: report initial quota value instead of current value to userspaceChangli Gao2010-07-232-3/+1Star
| | | | | | | | | | | | | | | | | | | | | | | | We should copy the initial value to userspace for iptables-save and to allow removal of specific quota rules. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: xt_quota: use per-rule spin lockChangli Gao2010-07-231-5/+5
| | | | | | | | | | | | | | | | | | | | | Use per-rule spin lock to improve the scalability. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: arptables: use arp_hdr_len()Changli Gao2010-07-231-4/+1Star
| | | | | | | | | | | | | | | | | | | | | use arp_hdr_len(). Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: nf_nat_core: merge the same linesChangli Gao2010-07-231-7/+2Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | proto->unique_tuple() will be called finally, if the previous calls fail. This patch checks the false condition of (range->flags &IP_NAT_RANGE_PROTO_RANDOM) instead to avoid duplicate line of code: proto->unique_tuple(). Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: add xt_cpu matchEric Dumazet2010-07-235-1/+86
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In some situations a CPU match permits a better spreading of connections, or select targets only for a given cpu. With Remote Packet Steering or multiqueue NIC and appropriate IRQ affinities, we can distribute trafic on available cpus, per session. (all RX packets for a given flow is handled by a given cpu) Some legacy applications being not SMP friendly, one way to scale a server is to run multiple copies of them. Instead of randomly choosing an instance, we can use the cpu number as a key so that softirq handler for a whole instance is running on a single cpu, maximizing cache effects in TCP/UDP stacks. Using NAT for example, a four ways machine might run four copies of server application, using a separate listening port for each instance, but still presenting an unique external port : iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 0 \ -j REDIRECT --to-port 8080 iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 1 \ -j REDIRECT --to-port 8081 iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 2 \ -j REDIRECT --to-port 8082 iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 3 \ -j REDIRECT --to-port 8083 Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | IPVS: make FTP work with full NAT supportHannes Eder2010-07-235-59/+165
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use nf_conntrack/nf_nat code to do the packet mangling and the TCP sequence adjusting. The function 'ip_vs_skb_replace' is now dead code, so it is removed. To SNAT FTP, use something like: % iptables -t nat -A POSTROUTING -m ipvs --vaddr 192.168.100.30/32 \ --vport 21 -j SNAT --to-source 192.168.10.10 and for the data connections in passive mode: % iptables -t nat -A POSTROUTING -m ipvs --vaddr 192.168.100.30/32 \ --vportctl 21 -j SNAT --to-source 192.168.10.10 using '-m state --state RELATED' would also works. Make sure the kernel modules ip_vs_ftp, nf_conntrack_ftp, and nf_nat_ftp are loaded. [ up-port and minor fixes by Simon Horman <horms@verge.net.au> ] Signed-off-by: Hannes Eder <heder@google.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | IPVS: make friends with nf_conntrackHannes Eder2010-07-233-37/+30Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update the nf_conntrack tuple in reply direction, as we will see traffic from the real server (RIP) to the client (CIP). Once this is done we can use netfilters SNAT in POSTROUTING, especially with xt_ipvs, to do source NAT, e.g.: % iptables -t nat -A POSTROUTING -m ipvs --vaddr 192.168.100.30/32 --vport 80 \ -j SNAT --to-source 192.168.10.10 [ minor fixes by Simon Horman <horms@verge.net.au> ] Signed-off-by: Hannes Eder <heder@google.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: xt_ipvs (netfilter matcher for IPVS)Hannes Eder2010-07-236-0/+229
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This implements the kernel-space side of the netfilter matcher xt_ipvs. [ minor fixes by Simon Horman <horms@verge.net.au> ] Signed-off-by: Hannes Eder <heder@google.com> Signed-off-by: Simon Horman <horms@verge.net.au> [ Patrick: added xt_ipvs.h to Kbuild ] Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: correct CHECKSUM header and export itMichael S. Tsirkin2010-07-162-3/+6
| | | | | | | | | | | | | | | Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: add CHECKSUM targetMichael S. Tsirkin2010-07-154-0/+105
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds a `CHECKSUM' target, which can be used in the iptables mangle table. You can use this target to compute and fill in the checksum in a packet that lacks a checksum. This is particularly useful, if you need to work around old applications such as dhcp clients, that do not work well with checksum offloads, but don't want to disable checksum offload in your device. The problem happens in the field with virtualized applications. For reference, see Red Hat bz 605555, as well as http://www.spinics.net/lists/kvm/msg37660.html Typical expected use (helps old dhclient binary running in a VM): iptables -A POSTROUTING -t mangle -p udp --dport bootpc \ -j CHECKSUM --checksum-fill Includes fixes by Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: nf_ct_tcp: fix flow recovery with TCP window tracking enabledPablo Neira Ayuso2010-07-151-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds the missing bits to support the recovery of TCP flows without disabling window tracking (aka be_liberal). To ensure a successful recovery, we have to inject the window scale factor via ctnetlink. This patch has been tested with a development snapshot of conntrackd and the new clause `TCPWindowTracking' that allows to perform strict TCP window tracking recovery across fail-overs. With this patch, we don't update the receiver's window until it's not initiated. We require this to perform a successful recovery. Jozsef confirmed in a private email that this spotted a real issue since that should not happen. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | nfnetlink_log: do not expose NFULNL_COPY_DISABLED to user-spacePablo Neira Ayuso2010-07-152-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch moves NFULNL_COPY_PACKET definition from linux/netfilter/nfnetlink_log.h to net/netfilter/nfnetlink_log.h since this copy mode is only for internal use. I have also changed the value from 0x03 to 0xff. Thus, we avoid a gap from user-space that may confuse users if we add new copy modes in the future. This change was introduced in: http://www.spinics.net/lists/netfilter-devel/msg13535.html Since this change is not included in any stable Linux kernel, I think it's safe to make this change now. Anyway, this copy mode does not make any sense from user-space, so this patch should not break any existing setup. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: xt_TPROXY: the length of lines should be within 80Changli Gao2010-07-091-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | According to the Documentation/CodingStyle, the length of lines should be within 80. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | ipvs: lvs sctp protocol handler is incorrectly invoked ip_vs_app_pkt_outXiaoyu Du2010-07-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | lvs sctp protocol handler is incorrectly invoked ip_vs_app_pkt_out Since there's no sctp helpers at present, it does the same thing as ip_vs_app_pkt_in. Signed-off-by: Xiaoyu Du <tingsrain@gmail.com> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | ipvs: Kconfig cleanupMichal Marek2010-07-051-4/+1Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | IP_VS_PROTO_AH_ESP should be set iff either of IP_VS_PROTO_{AH,ESP} is selected. Express this with standard kconfig syntax. Signed-off-by: Michal Marek <mmarek@suse.cz> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Patrick McHardy <kaber@trash.net>