summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
...
| * | | | | | | Merge branch 'gianfar-wol-fixes'David S. Miller2015-08-013-79/+33Star
| |\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Claudiu Manoil says: ==================== gianfar: wol magic packet fixes These changes were already validated as part of FSL SDK. Patch 2 fixes occasional wake-on magic packet failures during traffic, probably due to incorrect traffic stop/ device halt sequence and incorrect usage of txlock. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | gianfar: Enable device wakeup when appropriateClaudiu Manoil2015-08-013-13/+3Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The wol_en flag is 0 by default anyway, and we have the following inconsistency: a MAGIC packet wol capable eth interface is registered as a wake-up source but unable to wake-up the system as wol_en is 0 (wake-on flag set to 'd'). Calling set_wakeup_enable() at netdev open is just redundant because wol_en is 0 by default. Let only ethtool call set_wakeup_enable() for now. The bflock is obviously obsoleted, its utility has been corroded over time. The bitfield flags used today in gianfar are accessed only on the init/ config path, with no real possibility of concurrency - nothing that would justify smth. like bflock. Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | gianfar: Fix suspend/resume for wol magic packetClaudiu Manoil2015-08-011-68/+30Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we disable NAPI in the first place we can mask the device's interrupts (and halt it) without fearing that imask may be concurrently accessed from interrupt context, so there's no need to do local_irq_save() around gfar_halt_nodisable(). lock_rx_qs()/unlock_tx_qs() are just obsoleted and potentially buggy routines. The txlock is currently used in the driver only to manage TX congestion, it has nothing to do with halting the device. With these changes, the TX processing is stopped before gfar_halt(). Compact gfar_halt() is used instead of gfar_halt_nodisable(), as it disables Rx/TX DMA h/w blocks and the Rx/TX h/w queues. gfar_start() re-enables all these blocks on resume. Enabling the magic-packet mode remains the same, note that the RX block is re-enabled just before entering sleep mode. Add IRQF_NO_SUSPEND flag for the error interrupt line, to signal that the interrupt line must remain active during sleep in order to wake the system by magic packet (MAG) reception interrupt. (On some systems the MAG interrupt did trigger w/o this flag as well, but on others it didn't.) Without these fixes, when suspended during fair Tx traffic the interface occasionally failed to be woken up by magic packet. Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | gianfar: Fix warning when CONFIG_PM offClaudiu Manoil2015-08-011-0/+2
| |/ / / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CC drivers/net/ethernet/freescale/gianfar.o drivers/net/ethernet/freescale/gianfar.c:568:13: warning: 'lock_tx_qs' defined but not used [-Wunused-function] static void lock_tx_qs(struct gfar_private *priv) ^ drivers/net/ethernet/freescale/gianfar.c:576:13: warning: 'unlock_tx_qs' defined but not used [-Wunused-function] static void unlock_tx_qs(struct gfar_private *priv) ^ Reported-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | act_pedit: check binding before calling tcf_hash_release()WANG Cong2015-08-011-3/+2Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we share an action within a filter, the bind refcnt should increase, therefore we should not call tcf_hash_release(). Fixes: 1a29321ed045 ("net_sched: act: Dont increment refcnt on replace") Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Cong Wang <cwang@twopensource.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | net: sk_clone_lock() should only do get_net() if the parent is not a kernel ↵Sowmini Varadhan2015-07-311-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | socket The newsk returned by sk_clone_lock should hold a get_net() reference if, and only if, the parent is not a kernel socket (making this similar to sk_alloc()). E.g,. for the SYN_RECV path, tcp_v4_syn_recv_sock->..inet_csk_clone_lock sets up the syn_recv newsk from sk_clone_lock. When the parent (listen) socket is a kernel socket (defined in sk_alloc() as having sk_net_refcnt == 0), then the newsk should also have a 0 sk_net_refcnt and should not hold a get_net() reference. Fixes: 26abe14379f8 ("net: Modify sk_alloc to not reference count the netns of kernel sockets.") Acked-by: Eric Dumazet <edumazet@google.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | net: sched: fix refcount imbalance in actionsDaniel Borkmann2015-07-302-6/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit 55334a5db5cd ("net_sched: act: refuse to remove bound action outside"), we end up with a wrong reference count for a tc action. Test case 1: FOO="1,6 0 0 4294967295," BAR="1,6 0 0 4294967294," tc filter add dev foo parent 1: bpf bytecode "$FOO" flowid 1:1 \ action bpf bytecode "$FOO" tc actions show action bpf action order 0: bpf bytecode '1,6 0 0 4294967295' default-action pipe index 1 ref 1 bind 1 tc actions replace action bpf bytecode "$BAR" index 1 tc actions show action bpf action order 0: bpf bytecode '1,6 0 0 4294967294' default-action pipe index 1 ref 2 bind 1 tc actions replace action bpf bytecode "$FOO" index 1 tc actions show action bpf action order 0: bpf bytecode '1,6 0 0 4294967295' default-action pipe index 1 ref 3 bind 1 Test case 2: FOO="1,6 0 0 4294967295," tc filter add dev foo parent 1: bpf bytecode "$FOO" flowid 1:1 action ok tc actions show action gact action order 0: gact action pass random type none pass val 0 index 1 ref 1 bind 1 tc actions add action drop index 1 RTNETLINK answers: File exists [...] tc actions show action gact action order 0: gact action pass random type none pass val 0 index 1 ref 2 bind 1 tc actions add action drop index 1 RTNETLINK answers: File exists [...] tc actions show action gact action order 0: gact action pass random type none pass val 0 index 1 ref 3 bind 1 What happens is that in tcf_hash_check(), we check tcf_common for a given index and increase tcfc_refcnt and conditionally tcfc_bindcnt when we've found an existing action. Now there are the following cases: 1) We do a late binding of an action. In that case, we leave the tcfc_refcnt/tcfc_bindcnt increased and are done with the ->init() handler. This is correctly handeled. 2) We replace the given action, or we try to add one without replacing and find out that the action at a specific index already exists (thus, we go out with error in that case). In case of 2), we have to undo the reference count increase from tcf_hash_check() in the tcf_hash_check() function. Currently, we fail to do so because of the 'tcfc_bindcnt > 0' check which bails out early with an -EPERM error. Now, while commit 55334a5db5cd prevents 'tc actions del action ...' on an already classifier-bound action to drop the reference count (which could then become negative, wrap around etc), this restriction only accounts for invocations outside a specific action's ->init() handler. One possible solution would be to add a flag thus we possibly trigger the -EPERM ony in situations where it is indeed relevant. After the patch, above test cases have correct reference count again. Fixes: 55334a5db5cd ("net_sched: act: refuse to remove bound action outside") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | Merge branch 'r8152-fixes'David S. Miller2015-07-301-4/+57
| |\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Hayes Wang says: ==================== r8152: device reset v3: For patch #2, remove cancel_delayed_work(). v2: For patch #1, remove usb_autopm_get_interface(), usb_autopm_put_interface(), and the checking of intf->condition. For patch #2, replace the original method with usb_queue_reset_device() to reset the device. v1: Although the driver works normally, we find the device may get all 0xff data when transmitting packets on certain platforms. It would break the device and no packet could be transmitted. The reset is necessary to recover the hw for this situation. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | r8152: reset device when tx timeouthayeswang2015-07-301-4/+3Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The device reset is necessary if the hw becomes abnormal and stops transmitting packets. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | r8152: add pre_reset and post_resethayeswang2015-07-301-0/+54
| |/ / / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add rtl8152_pre_reset() and rtl8152_post_reset() which are used when calling usb_reset_device(). The two functions could reduce the time of reset when calling usb_reset_device() after probe(). Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | qlcnic: Fix corruption while copyingShahed Shaikh2015-07-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use proper typecasting while performing byte-by-byte copy Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | act_bpf: fix memory leaks when replacing bpf programsDaniel Borkmann2015-07-301-18/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We currently trigger multiple memory leaks when replacing bpf actions, besides others: comm "tc", pid 1909, jiffies 4294851310 (age 1602.796s) hex dump (first 32 bytes): 01 00 00 00 03 00 00 00 00 00 00 00 00 00 00 00 ................ 18 b0 98 6d 00 88 ff ff 00 00 00 00 00 00 00 00 ...m............ backtrace: [<ffffffff817e623e>] kmemleak_alloc+0x4e/0xb0 [<ffffffff8120a22d>] __vmalloc_node_range+0x1bd/0x2c0 [<ffffffff8120a37a>] __vmalloc+0x4a/0x50 [<ffffffff811a8d0a>] bpf_prog_alloc+0x3a/0xa0 [<ffffffff816c0684>] bpf_prog_create+0x44/0xa0 [<ffffffffa09ba4eb>] tcf_bpf_init+0x28b/0x3c0 [act_bpf] [<ffffffff816d7001>] tcf_action_init_1+0x191/0x1b0 [<ffffffff816d70a2>] tcf_action_init+0x82/0xf0 [<ffffffff816d4d12>] tcf_exts_validate+0xb2/0xc0 [<ffffffffa09b5838>] cls_bpf_modify_existing+0x98/0x340 [cls_bpf] [<ffffffffa09b5cd6>] cls_bpf_change+0x1a6/0x274 [cls_bpf] [<ffffffff816d56e5>] tc_ctl_tfilter+0x335/0x910 [<ffffffff816b9145>] rtnetlink_rcv_msg+0x95/0x240 [<ffffffff816df34f>] netlink_rcv_skb+0xaf/0xc0 [<ffffffff816b909e>] rtnetlink_rcv+0x2e/0x40 [<ffffffff816deaaf>] netlink_unicast+0xef/0x1b0 Issue is that the old content from tcf_bpf is allocated and needs to be released when we replace it. We seem to do that since the beginning of act_bpf on the filter and insns, later on the name as well. Example test case, after patch: # FOO="1,6 0 0 4294967295," # BAR="1,6 0 0 4294967294," # tc actions add action bpf bytecode "$FOO" index 2 # tc actions show action bpf action order 0: bpf bytecode '1,6 0 0 4294967295' default-action pipe index 2 ref 1 bind 0 # tc actions replace action bpf bytecode "$BAR" index 2 # tc actions show action bpf action order 0: bpf bytecode '1,6 0 0 4294967294' default-action pipe index 2 ref 1 bind 0 # tc actions replace action bpf bytecode "$FOO" index 2 # tc actions show action bpf action order 0: bpf bytecode '1,6 0 0 4294967295' default-action pipe index 2 ref 1 bind 0 # tc actions del action bpf index 2 [...] # echo "scan" > /sys/kernel/debug/kmemleak # cat /sys/kernel/debug/kmemleak | grep "comm \"tc\"" | wc -l 0 Fixes: d23b8ad8ab23 ("tc: add BPF based action") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | Merge branch 'thunderx-fixes'David S. Miller2015-07-306-37/+92
| |\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Aleksey Makarov says: ==================== net: thunderx: Misc fixes Miscellaneous fixes for the ThunderX VNIC driver All the patches can be applied individually. It's ok to drop some if the maintainer feels uncomfortable with applying for 4.2. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | net: thunderx: Fix for crash while BGX teardownThanneeru Srinivasulu2015-07-301-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Cortina phy does not have kernel driver and we don't attach device with phy layer for intefaces like XFI, XLAUI etc, Hence check for interface type before calling disconnect. Signed-off-by: Thanneeru Srinivasulu <tsrinivasulu@caviumnetworks.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | net: thunderx: Add PCI driver shutdown routineSunil Goutham2015-07-301-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | net: thunderx: Fix crash when changing rss with mutliple traffic flowsSunil Goutham2015-07-301-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes a crash when changing rss with multiple traffic flows. While interface teardown, disable tx queues after all NAPI threads are done. If done otherwise tx queues might be woken up inside NAPI if any CQE_TX are processed. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | net: thunderx: Set watchdog timeout valueSunil Goutham2015-07-302-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a txq (SQ) remains in stopped state after this timeout its considered as stuck and interface is reinited. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | net: thunderx: Wakeup TXQ only if CQE_TX are processedSunil Goutham2015-07-303-15/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously TXQ is wakedup whenever napi is executed and irrespective of if any CQE_TX are processed or not. Added 'txq_stop' and 'txq_wake' counters to aid in debugging if there are any future issues. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | net: thunderx: Suppress alloc_pages() failure warningsSunil Goutham2015-07-301-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Suppressing standard alloc_pages() warnings. Some kernel configs limit alloc size and the network driver may fail. Do not drop a kernel warning in this case, instead just drop a oneliner that the network driver could not be loaded since the buffer could not be allocated. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | net: thunderx: Fix TSO packet statisticSunil Goutham2015-07-301-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixing TSO packages not being counted. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | net: thunderx: Fix memory leak when changing queue countSunil Goutham2015-07-301-9/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix for memory leak when changing queue/channel count via ethtool Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | net: thunderx: Fix RQ_DROP miscalculationSunil Goutham2015-07-301-3/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With earlier configured value sufficient number of CQEs are not being reserved for transmitted packets. Hence under heavy incoming traffic load, receive notifications will take away most of the CQ thus transmit notifications will be lost resulting in tx skbs not being freed. Finally SQ will be full and it will be stopped, watchdog timer will kick in. After this fix receive notifications will not take morethan half of CQ reserving the rest for transmit notifications. Also changed CQ & SQ sizes from 16k to 4k. This is also due to the receive notifications taking first half of CQ under heavy load and time taken by NAPI to clear transmit notifications will increase with higher queue sizes. Again results in SQ being stopped. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | net: thunderx: Fix memory leak while tearing down interfaceSunil Goutham2015-07-302-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixed 'tso_hdrs' memory not being freed properly. Also fixed SQ skbuff maintenance issues. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | net: thunderx: Fix data integrity issues with LDWBSunil Goutham2015-07-301-1/+1
| |/ / / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Switching back to LDD transactions from LDWB. While transmitting packets out with LDWB transactions data integrity issues are seen very frequently. hence switching back to LDD. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | ipv6: flush nd cache on IFF_NOARP changeEric Dumazet2015-07-301-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch is the IPv6 equivalent of commit 6c8b4e3ff81b ("arp: flush arp cache on IFF_NOARP change") Without it, we keep buggy neighbours in the cache, with destination MAC address equal to our own MAC address. Tested: tcpdump -i eth0 -s 0 ip6 -n -e & ip link set dev eth0 arp off ping6 remote // sends buggy frames ip link set dev eth0 arp on ping6 remote // should work once kernel is patched Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Mario Fanelli <mariofanelli@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | Merge branch 'netcp-fixes'David S. Miller2015-07-302-33/+30Star
| |\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Murali Karicheri says: ==================== net: netcp: bug fixes for dynamic module support This series fixes few bugs to allow keystone netcp modules to be dynamically loaded and removed. Currently it allows following sequence multiple times insmod cpsw_ale.ko insmod davinci_mdio.ko insmod keystone_netcp.ko insmod keystone_netcp_ethss.ko ifup eth0 ifup eth1 ping <hosts on eth0> ping <hosts on eth1> ifdown eth1 ifdown eth0 rmmod keystone_netcp_ethss.ko rmmod keystone_netcp.ko rmmod davinci_mdio.ko rmmod cpsw_ale.ko ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | net: netcp: ethss: cleanup gbe_probe() and gbe_remove() functionsKaricheri, Muralidharan2015-07-302-28/+17Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch clean up error handle code to use goto label properly. In some cases, the code unnecessarily use goto instead of just returning the error code. Code also make explicit calls to devm_* APIs on error which is not necessary. In the gbe_remove() also it makes similar calls which is also unnecessary. Also fix few checkpatch warnings Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | net: netcp: ethss: fix up incorrect use of list apiKaricheri, Muralidharan2015-07-301-3/+2Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The code seems to assume a null is returned when the list is empty from first_sec_slave() to break the loop which is incorrect. Fix the code by using list_empty(). Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | net: netcp: fix cleanup interface list in netcp_remove()Karicheri, Muralidharan2015-07-301-2/+11
| |/ / / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently if user do rmmod keystone_netcp.ko following warning is seen :- [ 59.035891] ------------[ cut here ]------------ [ 59.040535] WARNING: CPU: 2 PID: 1619 at drivers/net/ethernet/ti/ netcp_core.c:2127 netcp_remove) This is because the interface list is not cleaned up in netcp_remove. This patch fixes this. Also fix some checkpatch related warnings. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | ebpf, x86: fix general protection fault when tail call is invokedDaniel Borkmann2015-07-301-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With eBPF JIT compiler enabled on x86_64, I was able to reliably trigger the following general protection fault out of an eBPF program with a simple tail call, f.e. tracex5 (or a stripped down version of it): [ 927.097918] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC [...] [ 927.100870] task: ffff8801f228b780 ti: ffff880016a64000 task.ti: ffff880016a64000 [ 927.102096] RIP: 0010:[<ffffffffa002440d>] [<ffffffffa002440d>] 0xffffffffa002440d [ 927.103390] RSP: 0018:ffff880016a67a68 EFLAGS: 00010006 [ 927.104683] RAX: 5a5a5a5a5a5a5a5a RBX: 0000000000000000 RCX: 0000000000000001 [ 927.105921] RDX: 0000000000000000 RSI: ffff88014e438000 RDI: ffff880016a67e00 [ 927.107137] RBP: ffff880016a67c90 R08: 0000000000000000 R09: 0000000000000001 [ 927.108351] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880016a67e00 [ 927.109567] R13: 0000000000000000 R14: ffff88026500e460 R15: ffff880220a81520 [ 927.110787] FS: 00007fe7d5c1f740(0000) GS:ffff880265000000(0000) knlGS:0000000000000000 [ 927.112021] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 927.113255] CR2: 0000003e7bbb91a0 CR3: 000000006e04b000 CR4: 00000000001407e0 [ 927.114500] Stack: [ 927.115737] ffffc90008cdb000 ffff880016a67e00 ffff88026500e460 ffff880220a81520 [ 927.117005] 0000000100000000 000000000000001b ffff880016a67aa8 ffffffff8106c548 [ 927.118276] 00007ffcdaf22e58 0000000000000000 0000000000000000 ffff880016a67ff0 [ 927.119543] Call Trace: [ 927.120797] [<ffffffff8106c548>] ? lookup_address+0x28/0x30 [ 927.122058] [<ffffffff8113d176>] ? __module_text_address+0x16/0x70 [ 927.123314] [<ffffffff8117bf0e>] ? is_ftrace_trampoline+0x3e/0x70 [ 927.124562] [<ffffffff810c1a0f>] ? __kernel_text_address+0x5f/0x80 [ 927.125806] [<ffffffff8102086f>] ? print_context_stack+0x7f/0xf0 [ 927.127033] [<ffffffff810f7852>] ? __lock_acquire+0x572/0x2050 [ 927.128254] [<ffffffff810f7852>] ? __lock_acquire+0x572/0x2050 [ 927.129461] [<ffffffff8119edfa>] ? trace_call_bpf+0x3a/0x140 [ 927.130654] [<ffffffff8119ee4a>] trace_call_bpf+0x8a/0x140 [ 927.131837] [<ffffffff8119edfa>] ? trace_call_bpf+0x3a/0x140 [ 927.133015] [<ffffffff8119f008>] kprobe_perf_func+0x28/0x220 [ 927.134195] [<ffffffff811a1668>] kprobe_dispatcher+0x38/0x60 [ 927.135367] [<ffffffff81174b91>] ? seccomp_phase1+0x1/0x230 [ 927.136523] [<ffffffff81061400>] kprobe_ftrace_handler+0xf0/0x150 [ 927.137666] [<ffffffff81174b95>] ? seccomp_phase1+0x5/0x230 [ 927.138802] [<ffffffff8117950c>] ftrace_ops_recurs_func+0x5c/0xb0 [ 927.139934] [<ffffffffa022b0d5>] 0xffffffffa022b0d5 [ 927.141066] [<ffffffff81174b91>] ? seccomp_phase1+0x1/0x230 [ 927.142199] [<ffffffff81174b95>] seccomp_phase1+0x5/0x230 [ 927.143323] [<ffffffff8102c0a4>] syscall_trace_enter_phase1+0xc4/0x150 [ 927.144450] [<ffffffff81174b95>] ? seccomp_phase1+0x5/0x230 [ 927.145572] [<ffffffff8102c0a4>] ? syscall_trace_enter_phase1+0xc4/0x150 [ 927.146666] [<ffffffff817f9a9f>] tracesys+0xd/0x44 [ 927.147723] Code: 48 8b 46 10 48 39 d0 76 2c 8b 85 fc fd ff ff 83 f8 20 77 21 83 c0 01 89 85 fc fd ff ff 48 8d 44 d6 80 48 8b 00 48 83 f8 00 74 0a <48> 8b 40 20 48 83 c0 33 ff e0 48 89 d8 48 8b 9d d8 fd ff ff 4c [ 927.150046] RIP [<ffffffffa002440d>] 0xffffffffa002440d The code section with the instructions that traps points into the eBPF JIT image of the root program (the one invoking the tail call instruction). Using bpf_jit_disasm -o on the eBPF root program image: [...] 4e: mov -0x204(%rbp),%eax 8b 85 fc fd ff ff 54: cmp $0x20,%eax <--- if (tail_call_cnt > MAX_TAIL_CALL_CNT) 83 f8 20 57: ja 0x000000000000007a 77 21 59: add $0x1,%eax <--- tail_call_cnt++ 83 c0 01 5c: mov %eax,-0x204(%rbp) 89 85 fc fd ff ff 62: lea -0x80(%rsi,%rdx,8),%rax <--- prog = array->prog[index] 48 8d 44 d6 80 67: mov (%rax),%rax 48 8b 00 6a: cmp $0x0,%rax <--- check for NULL 48 83 f8 00 6e: je 0x000000000000007a 74 0a 70: mov 0x20(%rax),%rax <--- GPF triggered here! fetch of bpf_func 48 8b 40 20 [ matches <48> 8b 40 20 ... from above ] 74: add $0x33,%rax <--- prologue skip of new prog 48 83 c0 33 78: jmpq *%rax <--- jump to new prog insns ff e0 [...] The problem is that rax has 5a5a5a5a5a5a5a5a, which suggests a tail call jump to map slot 0 is pointing to a poisoned page. The issue is the following: lea instruction has a wrong offset, i.e. it should be ... lea 0x80(%rsi,%rdx,8),%rax ... but it actually seems to be ... lea -0x80(%rsi,%rdx,8),%rax ... where 0x80 is offsetof(struct bpf_array, prog), thus the offset needs to be positive instead of negative. Disassembling the interpreter, we btw similarly do: [...] c88: lea 0x80(%rax,%rdx,8),%rax <--- prog = array->prog[index] 48 8d 84 d0 80 00 00 00 c90: add $0x1,%r13d 41 83 c5 01 c94: mov (%rax),%rax 48 8b 00 [...] Now the other interesting fact is that this panic triggers only when things like CONFIG_LOCKDEP are being used. In that case offsetof(struct bpf_array, prog) starts at offset 0x80 and in non-CONFIG_LOCKDEP case at offset 0x50. Reason is that the work_struct inside struct bpf_map grows by 48 bytes in my case due to the lockdep_map member (which also has CONFIG_LOCK_STAT enabled members). Changing the emitter to always use the 4 byte displacement in the lea instruction fixes the panic on my side. It increases the tail call instruction emission by 3 more byte, but it should cover us from various combinations (and perhaps other future increases on related structures). After patch, disassembly: [...] 9e: lea 0x80(%rsi,%rdx,8),%rax <--- CONFIG_LOCKDEP/CONFIG_LOCK_STAT 48 8d 84 d6 80 00 00 00 a6: mov (%rax),%rax 48 8b 00 [...] [...] 9e: lea 0x50(%rsi,%rdx,8),%rax <--- No CONFIG_LOCKDEP 48 8d 84 d6 50 00 00 00 a6: mov (%rax),%rax 48 8b 00 [...] Fixes: b52f00e6a715 ("x86: bpf_jit: implement bpf_tail_call() helper") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | bridge: mdb: fix delmdb state in the notificationNikolay Aleksandrov2015-07-301-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since mdb states were introduced when deleting an entry the state was left as it was set in the delete request from the user which leads to the following output when doing a monitor (for example): $ bridge mdb add dev br0 port eth3 grp 239.0.0.1 permanent (monitor) dev br0 port eth3 grp 239.0.0.1 permanent $ bridge mdb del dev br0 port eth3 grp 239.0.0.1 permanent (monitor) dev br0 port eth3 grp 239.0.0.1 temp ^^^ Note the "temp" state in the delete notification which is wrong since the entry was permanent, the state in a delete is always reported as "temp" regardless of the real state of the entry. After this patch: $ bridge mdb add dev br0 port eth3 grp 239.0.0.1 permanent (monitor) dev br0 port eth3 grp 239.0.0.1 permanent $ bridge mdb del dev br0 port eth3 grp 239.0.0.1 permanent (monitor) dev br0 port eth3 grp 239.0.0.1 permanent There's one important note to make here that the state is actually not matched when doing a delete, so one can delete a permanent entry by stating "temp" in the end of the command, I've chosen this fix in order not to break user-space tools which rely on this (incorrect) behaviour. So to give an example after this patch and using the wrong state: $ bridge mdb add dev br0 port eth3 grp 239.0.0.1 permanent (monitor) dev br0 port eth3 grp 239.0.0.1 permanent $ bridge mdb del dev br0 port eth3 grp 239.0.0.1 temp (monitor) dev br0 port eth3 grp 239.0.0.1 permanent Note the state of the entry that got deleted is correct in the notification. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Fixes: ccb1c31a7a87 ("bridge: add flags to distinguish permanent mdb entires") Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | bridge: mcast: give fast leave precedence over multicast router and querierSatish Ashok2015-07-291-24/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When fast leave is configured on a bridge port and an IGMP leave is received for a group, the group is not deleted immediately if there is a router detected or if multicast querier is configured. Ideally the group should be deleted immediately when fast leave is configured. Signed-off-by: Satish Ashok <sashok@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | bridge: Fix network header pointer for vlan tagged packetsToshiaki Makita2015-07-291-7/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are several devices that can receive vlan tagged packets with CHECKSUM_PARTIAL like tap, possibly veth and xennet. When (multiple) vlan tagged packets with CHECKSUM_PARTIAL are forwarded by bridge to a device with the IP_CSUM feature, they end up with checksum error because before entering bridge, the network header is set to ETH_HLEN (not including vlan header length) in __netif_receive_skb_core(), get_rps_cpu(), or drivers' rx functions, and nobody fixes the pointer later. Since the network header is exepected to be ETH_HLEN in flow-dissection and hash-calculation in RPS in rx path, and since the header pointer fix is needed only in tx path, set the appropriate network header on forwarding packets. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | packet: tpacket_snd(): fix signed/unsigned comparisonAlexander Drozdov2015-07-291-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | tpacket_fill_skb() can return a negative value (-errno) which is stored in tp_len variable. In that case the following condition will be (but shouldn't be) true: tp_len > dev->mtu + dev->hard_header_len as dev->mtu and dev->hard_header_len are both unsigned. That may lead to just returning an incorrect EMSGSIZE errno to the user. Fixes: 52f1454f629fa ("packet: allow to transmit +4 byte in TX_RING slot for VLAN case") Signed-off-by: Alexander Drozdov <al.drozdov@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | arp: filter NOARP neighbours for SIOCGARPEric Dumazet2015-07-291-7/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When arp is off on a device, and ioctl(SIOCGARP) is queried, a buggy answer is given with MAC address of the device, instead of the mac address of the destination/gateway. We filter out NUD_NOARP neighbours for /proc/net/arp, we must do the same for SIOCGARP ioctl. Tested: lpaa23:~# ./arp 10.246.7.190 MAC=00:01:e8:22:cb:1d // correct answer lpaa23:~# ip link set dev eth0 arp off lpaa23:~# cat /proc/net/arp # check arp table is now 'empty' IP address HW type Flags HW address Mask Device lpaa23:~# ./arp 10.246.7.190 MAC=00:1a:11:c3:0d:7f // buggy answer before patch (this is eth0 mac) After patch : lpaa23:~# ip link set dev eth0 arp off lpaa23:~# ./arp 10.246.7.190 ioctl(SIOCGARP) failed: No such device or address Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Vytautas Valancius <valas@google.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | net/ipv4: suppress NETDEV_UP notification on address lifetime updateDavid Ward2015-07-291-1/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This notification causes the FIB to be updated, which is not needed because the address already exists, and more importantly it may undo intentional changes that were made to the FIB after the address was originally added. (As a point of comparison, when an address becomes deprecated because its preferred lifetime expired, a notification on this chain is not generated.) The motivation for this commit is fixing an incompatibility between DHCP clients which set and update the address lifetime according to the lease, and a commercial VPN client which replaces kernel routes in a way that outbound traffic is sent only through the tunnel (and disconnects if any further route changes are detected via netlink). Signed-off-by: David Ward <david.ward@ll.mit.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | bridge: stp: when using userspace stp stop kernel hello and hold timersNikolay Aleksandrov2015-07-293-4/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These should be handled only by the respective STP which is in control. They become problematic for devices with limited resources with many ports because the hold_timer is per port and fires each second and the hello timer fires each 2 seconds even though it's global. While in user-space STP mode these timers are completely unnecessary so it's better to keep them off. Also ensure that when the bridge is up these timers are started only when running with kernel STP. Signed-off-by: Satish Ashok <sashok@cumulusnetworks.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | packet: missing dev_put() in packet_do_bind()Lars Westerhoff2015-07-281-5/+3Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When binding a PF_PACKET socket, the use count of the bound interface is always increased with dev_hold in dev_get_by_{index,name}. However, when rebound with the same protocol and device as in the previous bind the use count of the interface was not decreased. Ultimately, this caused the deletion of the interface to fail with the following message: unregister_netdevice: waiting for dummy0 to become free. Usage count = 1 This patch moves the dev_put out of the conditional part that was only executed when either the protocol or device changed on a bind. Fixes: 902fefb82ef7 ('packet: improve socket create/bind latency in some cases') Signed-off-by: Lars Westerhoff <lars.westerhoff@newtec.eu> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | macvtap: fix network header pointer for VLAN tagged pktsIvan Vecera2015-07-271-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Network header is set with offset ETH_HLEN but it is not true for VLAN (multiple-)tagged and results in checksum issues in lower devices. v2: leave skb->protocol untouched (thx Vlad), comment added v3: moved after skb_probe_transport_header() call (thx Toshiaki) Signed-off-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | fib_trie: Drop unnecessary calls to leaf_pull_suffixAlexander Duyck2015-07-271-4/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It was reported that update_suffix was taking a long time on systems where a large number of leaves were attached to a single node. As it turns out fib_table_flush was calling update_suffix for each leaf that didn't have all of the aliases stripped from it. As a result, on this large node removing one leaf would result in us calling update_suffix for every other leaf on the node. The fix is to just remove the calls to leaf_pull_suffix since they are redundant as we already have a call in resize that will go through and update the suffix length for the node before we exit out of fib_table_flush or fib_table_flush_external. Reported-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Tested-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | macb: Fix build with macro'ized readl/writel.David S. Miller2015-07-272-15/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If an architecture defines readl/writel using CPP macros, we get the following kinds of build failure: > > > drivers/net/ethernet/cadence/macb.c:164:1: error: macro "writel" > > > passed 3 arguments, but takes just 2 > macb_or_gem_writel(bp, SA1B, bottom); > ^ Rename the methods so that this doesn't happen. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | net: fec: Ensure clocks are enabled while using mdio busAndrew Lunn2015-07-271-13/+76
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a switch is attached to the mdio bus, the mdio bus can be used while the interface is not open. If the IPG clock is not enabled, MDIO reads/writes will simply time out. Add support for runtime PM to control this clock. Enable/disable this clock using runtime PM, with open()/close() and mdio read()/write() function triggering runtime PM operations. Since PM is optional, the IPG clock is enabled at probe and is no longer modified by fec_enet_clk_enable(), thus if PM is not enabled in the kernel, it is guaranteed the clock is running when MDIO operations are performed. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Cc: tyler.baker@linaro.org Cc: fabio.estevam@freescale.com Cc: shawn.guo@linaro.org Tested-by: Fabio Estevam <fabio.estevam@freescale.com> Tested-by: Tyler Baker <tyler.baker@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | net: netcp: Fixes SGMII reset on network interface shutdownWingMan Kwok2015-07-273-2/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch asserts SGMII RTRESET, i.e. resetting the SGMII Tx/Rx logic, during network interface shutdown to avoid having the hardware wedge when shutting down with high incoming traffic rates. This is cleared (brought out of RTRESET) when the interface is brought back up. Signed-off-by: WingMan Kwok <w-kwok2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | Merge branch 'macb-fixes'David S. Miller2015-07-273-65/+108
| |\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Andy Shevchenko says: ==================== net/macb: fix for AVR32 and clean up It seems no one had tested recently the driver on AVR32 platforms such as ATNGW100. This series bring it back to work. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | net/macb: convert to kernel docAndy Shevchenko2015-07-271-3/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch coverts struct description to the kernel doc format. There is no functional change. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | net/macb: replace macb_count_tx_descriptors() by DIV_ROUND_UP()Andy Shevchenko2015-07-271-8/+2Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | macb_count_tx_descriptors() repeats the generic macro DIV_ROUND_UP(). The patch does a replacement. There is no functional change. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | net/macb: suppress compiler warningsAndy Shevchenko2015-07-272-6/+5Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes the following warnings: drivers/net/ethernet/cadence/macb.c: In function ‘macb_handle_link_change’: drivers/net/ethernet/cadence/macb.c:266: warning: comparison between signed and unsigned drivers/net/ethernet/cadence/macb.c:267: warning: comparison between signed and unsigned drivers/net/ethernet/cadence/macb.c:291: warning: comparison between signed and unsigned drivers/net/ethernet/cadence/macb.c: In function ‘gem_update_stats’: drivers/net/ethernet/cadence/macb.c:1908: warning: comparison between signed and unsigned drivers/net/ethernet/cadence/macb.c: In function ‘gem_get_ethtool_strings’: drivers/net/ethernet/cadence/macb.c:1988: warning: comparison between signed and unsigned Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | net/macb: use dev_*() when netdev is not yet registeredAndy Shevchenko2015-07-271-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To avoid messages like macb macb.0 (unnamed net_device) (uninitialized): Cadence caps 0x00000000 macb macb.0 (unnamed net_device) (uninitialized): invalid hw address, using random let's use dev_*() macros. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | net/macb: check if macb_config presentAndy Shevchenko2015-07-271-2/+1Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The commit 98b5a0f4a228 introduces jumbo frame support, but also it assumes that macb_config present which is not always true. The configuration without macb_config fails to boot. Unable to handle kernel NULL pointer dereference at virtual address 00000010 ptbr = 90350000 pgd = 00000000 Oops: Kernel access of bad area, sig: 11 [#1] FRAME_POINTER chip: 0x01f:0x1e82 rev 2 Modules linked in: CPU: 0 PID: 1 Comm: swapper Not tainted 4.2.0-rc3-next-20150723+ #13 task: 91c26000 ti: 91c28000 task.ti: 91c28000 PC is at macb_probe+0x140/0x61c Fixes: 98b5a0f4a228 (net: macb: Add support for jumbo frames) Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | net/macb: improve big endian CPU supportAndy Shevchenko2015-07-272-44/+87
| |/ / / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The commit a50dad355a53 (net: macb: Add big endian CPU support) converted I/O accessors to readl_relaxed() and writel_relaxed() and consequentially broke MACB driver on AVR32 platforms such as ATNGW100. This patch improves I/O access by checking endiannes first and use the corresponding methods. Fixes: a50dad355a53 (net: macb: Add big endian CPU support) Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>