summaryrefslogtreecommitdiffstats
path: root/net/bridge
Commit message (Collapse)AuthorAgeFilesLines
* bridge: convert br_features_recompute() to ndo_fix_featuresMichał Mirosław2011-04-284-67/+19Star
| | | | | | | | Note: netdev_update_features() needs only rtnl_lock as br->port_list is only changed while holding it. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of ↵David S. Miller2011-04-261-1/+1
|\ | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Resolved logic conflicts causing a build failure due to drivers/net/r8169.c changes using a patch from Stephen Rothwell. Signed-off-by: David S. Miller <davem@davemloft.net>
| * Revert "bridge: Forward reserved group addresses if !STP"David S. Miller2011-04-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 1e253c3b8a1aeed51eef6fc366812f219b97de65. It breaks 802.3ad bonding inside of a bridge. The commit was meant to support transport bridging, and specifically virtual machines bridged to an ethernet interface connected to a switch port wiht 802.1x enabled. But this isn't the way to do it, it breaks too many other things. Signed-off-by: David S. Miller <davem@davemloft.net>
* | inet: constify ip headers and in6_addrEric Dumazet2011-04-222-8/+8
| | | | | | | | | | | | | | | | Add const qualifiers to structs iphdr, ipv6hdr and in6_addr pointers where possible, to make code intention more obvious. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'master' of ↵David S. Miller2011-04-191-4/+2Star
|\| | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/bnx2x/bnx2x_ethtool.c
| * bridge: reset IPCB in br_parse_ip_optionsEric Dumazet2011-04-121-4/+2Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 462fb2af9788a82 (bridge : Sanitize skb before it enters the IP stack), missed one IPCB init before calling ip_options_compile() Thanks to Scot Doyle for his tests and bug reports. Reported-by: Scot Doyle <lkml@scotdoyle.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com> Acked-by: Bandan Das <bandan.das@stratus.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Cc: Jan Lübbe <jluebbe@debian.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bridge: fix accidental creation of sysfs directoryStephen Hemminger2011-04-181-1/+1
| | | | | | | | | | | | | | | | | | | | Commit bb900b27a2f49b37bc38c08e656ea13048fee13b ("bridge: allow creating bridge devices with netlink") introduced a bug in net-next because of a typo in notifier. Every device would have the sysfs bridge directory (and files). Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'master' of ↵David S. Miller2011-04-112-2/+2
|\| | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/smsc911x.c
| * Merge branch 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6Linus Torvalds2011-04-072-2/+2
| |\ | | | | | | | | | | | | * 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6: Fix common misspellings
| | * Fix common misspellingsLucas De Marchi2011-03-312-2/+2
| | | | | | | | | | | | | | | | | | Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
* | | bridge: range check STP parametersstephen hemminger2011-04-057-93/+107
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Apply restrictions on STP parameters based 802.1D 1998 standard. * Fixes missing locking in set path cost ioctl * Uses common code for both ioctl and sysfs This is based on an earlier patch Sasikanth V but with overhaul. Note: 1. It does NOT enforce the restriction on the relationship max_age and forward delay or hello time because in existing implementation these are set as independant operations. 2. If STP is disabled, there is no restriction on forward delay 3. No restriction on holding time because users use Linux code to act as hub or be sticky. 4. Although standard allow 0-255, Linux only allows 0-63 for port priority because more bits are reserved for port number. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | bridge: allow creating bridge devices with netlinkstephen hemminger2011-04-055-87/+101
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add netlink device ops to allow creating bridge device via netlink. This works in a manner similar to vlan, macvlan and bonding. Example: # ip link add link dev br0 type bridge # ip link del dev br0 The change required rearranging initializtion code to deal with being called by create link. Most of the initialization happens in br_dev_setup, but allocation of stats is done in ndo_init callback to deal with allocation failure. Sysfs setup has to wait until after the network device kobject is registered. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | bridge: allow creating/deleting fdb entries via netlinkstephen hemminger2011-04-053-0/+144
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use RTM_NEWNEIGH and RTM_DELNEIGH to allow updating of entries in bridge forwarding table. This allows manipulating static entries which is not possible with existing tools. Example (using bridge extensions to iproute2) # br fdb add 00:02:03:04:05:06 dev eth0 Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | bridge: add netlink notification on forward entry changesstephen hemminger2011-04-053-0/+127
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows applications to query and monitor bridge forwarding table in the same method used for neighbor table. The forward table entries are returned in same structure format as used by the ioctl. If more information is desired in future, the netlink method is extensible. Example (using bridge extensions to iproute2) # br monitor Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | bridge: split rcu and no-rcu cases of fdb lookupstephen hemminger2011-04-051-3/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | In some cases, look up of forward database entry is done with RCU; and for others no RCU is needed because of locking. Split the two cases into two differnt loops (and take off inline). Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | bridge: track last used time in forwarding tablestephen hemminger2011-04-053-8/+10
| | | | | | | | | | | | | | | | | | | | | | | | Adds tracking the last used time in forwarding table. Rename ageing_timer to updated to better describe it. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | bridge: change arguments to fdb_createstephen hemminger2011-04-051-8/+10
|/ / | | | | | | | | | | | | | | | | Later patch provides ability to create non-local static entry. To make this easier move the updating of the flag values to after the code that creates entry. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bridge: mcast snooping, fix length check of snooped MLDv1/2Linus Lüssing2011-03-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "len = ntohs(ip6h->payload_len)" does not include the length of the ipv6 header itself, which the rest of this function assumes, though. This leads to a length check less restrictive as it should be in the following line for one thing. For another, it very likely leads to an integer underrun when substracting the offset and therefore to a very high new value of 'len' due to its unsignedness. This will ultimately lead to the pskb_trim_rcsum() practically never being called, even in the cases where it should. Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bridge: Fix compilation warning in function br_stp_recalculate_bridge_id()Balaji G2011-03-301-1/+1
|/ | | | | | | | | net/bridge/br_stp_if.c: In function ‘br_stp_recalculate_bridge_id’: net/bridge/br_stp_if.c:216:3: warning: ‘return’ with no value, in function returning non-void Signed-off-by: G.Balaji <balajig81@gmail.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bridge: notify applications if address of bridge device changesstephen hemminger2011-03-283-5/+12
| | | | | | | | | | The mac address of the bridge device may be changed when a new interface is added to the bridge. If this happens, then the bridge needs to call the network notifiers to tickle any other systems that care. Since bridge can be a module, this also means exporting the notifier function. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bridge: Fix possibly wrong MLD queries' ethernet source addressLinus Lüssing2011-03-231-1/+1
| | | | | | | | | | | | | | The ipv6_dev_get_saddr() is currently called with an uninitialized destination address. Although in tests it usually seemed to nevertheless always fetch the right source address, there seems to be a possible race condition. Therefore this commit changes this, first setting the destination address and only after that fetching the source address. Reported-by: Jan Beulich <JBeulich@novell.com> Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* bridge: Reset IPCB when entering IP stack on NF_FORWARDHerbert Xu2011-03-181-0/+3
| | | | | | | | | | | | | | | Whenever we enter the IP stack proper from bridge netfilter we need to ensure that the skb is in a form the IP stack expects it to be in. The entry point on NF_FORWARD did not meet the requirements of the IP stack, therefore leading to potential crashes/panics. This patch fixes the problem. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: introduce rx_handler results and logic around thatJiri Pirko2011-03-162-11/+16
| | | | | | | | | | | This patch allows rx_handlers to better signalize what to do next to it's caller. That makes skb->deliver_no_wcard no longer needed. kernel-doc for rx_handler_result is taken from Nicolas' patch. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Reviewed-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of ↵David S. Miller2011-03-151-2/+2
|\ | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
| * bridge: skip forwarding delay if not using STPstephen hemminger2011-03-141-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If Spanning Tree Protocol is not enabled, there is no good reason for the bridge code to wait for the forwarding delay period before enabling the link. The purpose of the forwarding delay is to allow STP to learn about other bridges before nominating itself. The only possible impact is that when starting up a new port the bridge may flood a packet now, where previously it might have seen traffic from the other host and preseeded the forwarding table. Includes change for local variable br already available in that func. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bridge: control carrier based on ports onlinestephen hemminger2011-03-143-13/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This makes the bridge device behave like a physical device. In earlier releases the bridge always asserted carrier. This changes the behavior so that bridge device carrier is on only if one or more ports are in the forwarding state. This should help IPv6 autoconfiguration, DHCP, and routing daemons. I did brief testing with Network and Virt manager and they seem fine, but since this changes behavior of bridge, it should wait until net-next (2.6.39). Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Reviewed-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr> Tested-By: Adam Majer <adamm@zombino.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | ipv4: Create and use route lookup helpers.David S. Miller2011-03-131-5/+2Star
| | | | | | | | | | | | | | The idea here is this minimizes the number of places one has to edit in order to make changes to how flows are defined and used. Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'master' of ↵David S. Miller2011-03-101-0/+1
|\| | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/bnx2x/bnx2x_cmn.c
| * net: bridge builtin vs. ipv6 modularRandy Dunlap2011-03-101-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When configs BRIDGE=y and IPV6=m, this build error occurs: br_multicast.c:(.text+0xa3341): undefined reference to `ipv6_dev_get_saddr' BRIDGE_IGMP_SNOOPING is boolean; if it were tristate, then adding depends on IPV6 || IPV6=n to BRIDGE_IGMP_SNOOPING would be a good fix. As it is currently, making BRIDGE depend on the IPV6 config works. Reported-by: Patrick Schaaf <netdev@bof.de> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'master' of ↵David S. Miller2011-03-041-11/+12
|\| | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/bnx2x/bnx2x.h
| * bridge: Use IPv6 link-local address for multicast listener queriesLinus Lüssing2011-02-221-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the bridge multicast snooping feature periodically issues IPv6 general multicast listener queries to sense the absence of a listener. For this, it uses :: as its source address - however RFC 2710 requires: "To be valid, the Query message MUST come from a link-local IPv6 Source Address". Current Linux kernel versions seem to follow this requirement and ignore our bogus MLD queries. With this commit a link local address from the bridge interface is being used to issue the MLD query, resulting in other Linux devices which are multicast listeners in the network to respond with a MLD response (which was not the case before). Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>
| * bridge: Fix MLD queries' ethernet source addressLinus Lüssing2011-02-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Map the IPv6 header's destination multicast address to an ethernet source address instead of the MLD queries multicast address. For instance for a general MLD query (multicast address in the MLD query set to ::), this would wrongly be mapped to 33:33:00:00:00:00, although an MLD queries destination MAC should always be 33:33:00:00:00:01 which matches the IPv6 header's multicast destination ff02::1. Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>
| * bridge: Allow mcast snooping for transient link local addresses tooLinus Lüssing2011-02-221-5/+4Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the multicast bridge snooping support is not active for link local multicast. I assume this has been done to leave important multicast data untouched, like IPv6 Neighborhood Discovery. In larger, bridged, local networks it could however be desirable to optimize for instance local multicast audio/video streaming too. With the transient flag in IPv6 multicast addresses we have an easy way to optimize such multimedia traffic without tempering with the high priority multicast data from well-known addresses. This patch alters the multicast bridge snooping for IPv6, to take effect for transient multicast addresses instead of non-link-local addresses. Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>
| * bridge: Add missing ntohs()s for MLDv2 report parsingLinus Lüssing2011-02-221-2/+3
| | | | | | | | | | | | | | | | The nsrcs number is 2 Byte wide, therefore we need to call ntohs() before using it. Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>
| * bridge: Fix IPv6 multicast snooping by correcting offset in MLDv2 reportLinus Lüssing2011-02-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | We actually want a pointer to the grec_nsrcr and not the following field. Otherwise we can get very high values for *nsrcs as the first two bytes of the IPv6 multicast address are being used instead, leading to a failing pskb_may_pull() which results in MLDv2 reports not being parsed. Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>
| * bridge: Fix IPv6 multicast snooping by storing correct protocol typeLinus Lüssing2011-02-221-1/+1
| | | | | | | | | | | | | | | | | | | | The protocol type for IPv6 entries in the hash table for multicast bridge snooping is falsely set to ETH_P_IP, marking it as an IPv4 address, instead of setting it to ETH_P_IPV6, which results in negative look-ups in the hash table later. Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* | ipv4: Make output route lookup return rtable directly.David S. Miller2011-03-021-4/+5
| | | | | | | | | | | | Instead of on the stack. Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'master' of ↵David S. Miller2011-03-021-0/+2
|\ \ | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6
| * | bridge: netfilter: fix information leakVasiliy Kulikov2011-02-141-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Struct tmp is copied from userspace. It is not checked whether the "name" field is NULL terminated. This may lead to buffer overflow and passing contents of kernel stack as a module name to try_then_request_module() and, consequently, to modprobe commandline. It would be seen by all userspace processes. Signed-off-by: Vasiliy Kulikov <segoon@openwall.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
* | | Merge branch 'master' of ↵David S. Miller2011-02-203-13/+11Star
|\ \ \ | | |/ | |/| | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: Documentation/feature-removal-schedule.txt drivers/net/e1000e/netdev.c net/xfrm/xfrm_policy.c
| * | bridge: Replace mp->mglist hlist with a boolHerbert Xu2011-02-123-12/+9Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As it turns out we never need to walk through the list of multicast groups subscribed by the bridge interface itself (the only time we'd want to do that is when we shut down the bridge, in which case we simply walk through all multicast groups), we don't really need to keep an hlist for mp->mglist. This means that we can replace it with just a single bit to indicate whether the bridge interface is subscribed to a group. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | bridge: Fix timer typo that may render snooping less effectiveHerbert Xu2011-02-121-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In a couple of spots where we are supposed to modify the port group timer (p->timer) we instead modify the bridge interface group timer (mp->timer). The effect of this is mostly harmless. However, it can cause port subscriptions to be longer than they should be, thus making snooping less effective. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | bridge: Fix mglist corruption that leads to memory corruptionHerbert Xu2011-02-121-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The list mp->mglist is used to indicate whether a multicast group is active on the bridge interface itself as opposed to one of the constituent interfaces in the bridge. Unfortunately the operation that adds the mp->mglist node to the list neglected to check whether it has already been added. This leads to list corruption in the form of nodes pointing to itself. Normally this would be quite obvious as it would cause an infinite loop when walking the list. However, as this list is never actually walked (which means that we don't really need it, I'll get rid of it in a subsequent patch), this instead is hidden until we perform a delete operation on the affected nodes. As the same node may now be pointed to by more than one node, the delete operations can then cause modification of freed memory. This was observed in practice to cause corruption in 512-byte slabs, most commonly leading to crashes in jbd2. Thanks to Josef Bacik for pointing me in the right direction. Reported-by: Ian Page Hands <ihands@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | bridge: implement [add/del]_slave opsJiri Pirko2011-02-142-1/+27
| | | | | | | | | | | | | | | | | | | | | add possibility to addif/delif via rtnetlink Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | Merge branch 'master' of ↵David S. Miller2011-02-041-2/+2
|\| | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
| * | bridge: Don't put partly initialized fdb into hashPavel Emelyanov2011-02-041-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The fdb_create() puts a new fdb into hash with only addr set. This is not good, since there are callers, that search the hash w/o the lock and access all the other its fields. Applies to current netdev tree. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | net: reduce and unify printk level in netdev_fix_features()Michał Mirosław2011-01-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reduce printk() levels to KERN_INFO in netdev_fix_features() as this will be used by ethtool and might spam dmesg unnecessarily. This converts the function to use netdev_info() instead of plain printk(). As a side effect, bonding and bridge devices will now log dropped features on every slave device change. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | net: change netdev->features to u32Michał Mirosław2011-01-252-2/+2
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | Quoting Ben Hutchings: we presumably won't be defining features that can only be enabled on 64-bit architectures. Occurences found by `grep -r` on net/, drivers/net, include/ [ Move features and vlan_features next to each other in struct netdev, as per Eric Dumazet's suggestion -DaveM ] Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
* | netfilter: ebt_ip6: allow matching on ipv6-icmp types/codesFlorian Westphal2011-01-131-12/+34
| | | | | | | | | | | | | | | | | | | | To avoid adding a new match revision icmp type/code are stored in the sport/dport area. Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Holger Eitzenberger <holger@eitzenberger.org> Reviewed-by: Bart De Schuymer<bdschuym@pandora.be> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* | netfilter: x_table: speedup compat operationsEric Dumazet2011-01-131-0/+1
|/ | | | | | | | | | | | | | | | | | | | | | | One iptables invocation with 135000 rules takes 35 seconds of cpu time on a recent server, using a 32bit distro and a 64bit kernel. We eventually trigger NMI/RCU watchdog. INFO: rcu_sched_state detected stall on CPU 3 (t=6000 jiffies) COMPAT mode has quadratic behavior and consume 16 bytes of memory per rule. Switch the xt_compat algos to use an array instead of list, and use a binary search to locate an offset in the sorted array. This halves memory need (8 bytes per rule), and removes quadratic behavior [ O(N*N) -> O(N*log2(N)) ] Time of iptables goes from 35 s to 150 ms. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>