Commit message (Author, Date; Files, -Lines/+Lines)

* net: dsa: include dsa.h only once (Vivien Didelot, 2017-05-18; 13 files, -12/+14)

    The public include/net/dsa.h file is meant for DSA drivers, while all DSA
    core files share a common private header, net/dsa/dsa_priv.h. Ensure that
    dsa_priv.h is the only DSA core file to include net/dsa.h, and add a new
    line to separate absolute and relative headers at the same time.

    Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

* net: fix __skb_try_recv_from_queue to return the old behavior (Andrey Vagin, 2017-05-18; 2 files, -15/+11)

    This function has to return NULL in the error case, because there is a
    separate error variable. The offset has to be changed only if an skb is
    returned.

    v2: fix the udp code to not use an extra variable

    Cc: Paolo Abeni <pabeni@redhat.com>
    Cc: Eric Dumazet <edumazet@google.com>
    Cc: David S. Miller <davem@davemloft.net>
    Fixes: 65101aeca522 ("net/sock: factor out dequeue/peek with offset code")
    Signed-off-by: Andrei Vagin <avagin@openvz.org>
    Acked-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

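    For illustration, a minimal userspace sketch of the calling convention this
    fix restores (all names here are made up, not the kernel API): failure is
    reported through a separate error pointer with a NULL return, and the
    offset is advanced only when an entry is actually handed back.

        #include <stddef.h>
        #include <errno.h>

        struct item {
                struct item *next;
                size_t len;
        };

        struct queue {
                struct item *head;
        };

        /* Return an item or NULL.  The reason for a NULL return travels
         * through *err, and the caller-visible offset *off is advanced only
         * when an item is actually handed back.
         */
        static struct item *try_recv_from_queue(struct queue *q, size_t *off,
                                                int *err)
        {
                struct item *it = q->head;

                if (!it) {
                        *err = -EAGAIN;  /* error reported separately, NULL returned */
                        return NULL;
                }
                q->head = it->next;
                *off += it->len;         /* offset changes only on success */
                return it;
        }
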
* net: make struct dst_entry::dev first member (Alexey Dobriyan, 2017-05-18; 1 file, -1/+1)

    struct dst_entry::dev is used most often. Move it so it can be accessed
    without imm8 offset on x86_64.

    add/remove: 0/0 grow/shrink: 9/239 up/down: 52/-413 (-361)
    function                            old     new   delta
    dst_rcu_free                        126     138     +12
    fnhe_flush_routes                   211     219      +8
    rt_set_nexthop                      747     754      +7
    rt_cache_route                       85      91      +6
    rt6_release                         209     215      +6
    dst_release                         107     111      +4
    dst_destroy_rcu                      29      33      +4
    dn_dst_check_expire                 329     333      +4
    dn_insert_route                     484     485      +1
    xfrm_resolve_and_create_bundle     2991    2990      -1
    ...
    ip_route_me_harder                 1163    1157      -6
    __ip_append_data.isra              2730    2724      -6
    ip6_forward                        3052    3045      -7
    callforward_do_filter               659     651      -8
    dst_gc_task                         571     549     -22

    Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

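    As a rough illustration of the idea (the struct below is invented, not the
    real dst_entry): putting the hottest member first places it at offset
    zero, so loads of it need no displacement byte in the x86_64 addressing
    mode.

        #include <stdio.h>
        #include <stddef.h>

        struct fake_dst_entry {
                void *dev;              /* hottest member first: offset 0 */
                unsigned long refcnt;
                unsigned long lastuse;
        };

        int main(void)
        {
                /* dev sits at offset 0, so dst->dev needs no displacement,
                 * while a later member like lastuse needs an imm8 offset.
                 */
                printf("dev at %zu, lastuse at %zu\n",
                       offsetof(struct fake_dst_entry, dev),
                       offsetof(struct fake_dst_entry, lastuse));
                return 0;
        }
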
* Merge branch 'fsl_ucc_hdlc-enhancements' (David S. Miller, 2017-05-18; 4 files, -49/+55)

    Signed-off-by: David S. Miller <davem@davemloft.net>

| * powerpc/85xx/kmcent2: use hdlc busmode for UCC1 (Holger Brunck, 2017-05-18; 1 file, -3/+1)

    Signed-off-by: Holger Brunck <holger.brunck@keymile.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

| * net/wan/fsl_ucc_hdlc: add hdlc-bus support (Holger Brunck, 2017-05-18; 3 files, -0/+38)

    This adds support for hdlc-bus mode to the fsl_ucc_hdlc driver. It can be
    enabled with the "fsl,hdlc-bus" property in the DTS node of the
    corresponding UCC.

    This aligns the configuration of the UPSMR and GUMR registers with what
    is done in our ucc_hdlc driver (which only supports hdlc-bus mode) and
    with the QUICC Engine's documentation for hdlc-bus mode. GUMR/SYNL is set
    to AUTO for the bus mode, as in this case the CD signal is ignored. The
    brkpt_support is enabled to set the HBM1 bit in the CMXUCR register to
    configure an open-drain connected HDLC bus.

    Signed-off-by: Holger Brunck <holger.brunck@keymile.com>
    Cc: Zhao Qiang <qiang.zhao@nxp.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

| * fsl/qe: add bit description for SYNL register for GUMR (Holger Brunck, 2017-05-18; 1 file, -0/+4)

    Add the bitmask for the two-bit SYNL field of the GUMR register,
    according to the QUICC Engine Reference Manual.

    Signed-off-by: Holger Brunck <holger.brunck@keymile.com>
    Cc: Zhao Qiang <qiang.zhao@nxp.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

| * net/wan/fsl_ucc_hdlc: call qe_setbrg only for loopback mode (Holger Brunck, 2017-05-18; 1 file, -4/+3)

    We can't assume that we are always in loopback mode just because the rx
    and tx clocks have the same clock source. If we want to use HDLC bus mode
    we also have the same clock source, but we are not in loopback mode. So
    move the setting of the baudrate generator after the check for the
    loopback-mode property.

    Signed-off-by: Holger Brunck <holger.brunck@keymile.com>
    Cc: Zhao Qiang <qiang.zhao@nxp.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

| * net/wan/fsl_ucc_hdlc: fix incorrect memory allocation (Holger Brunck, 2017-05-18; 1 file, -6/+6)

    We need space for the struct qe_bd and not for a pointer to this struct.

    Signed-off-by: Holger Brunck <holger.brunck@keymile.com>
    Cc: Zhao Qiang <qiang.zhao@nxp.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

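    A small generic sketch of the bug class being fixed (not the driver's
    code): an array of structures must be sized by the element type, not the
    pointer type, and writing sizeof(*ptr) makes that hard to get wrong.

        #include <stdlib.h>

        struct qe_bd_like {     /* stand-in for the driver's buffer descriptor */
                unsigned short status;
                unsigned short length;
                unsigned int buf;
        };

        int main(void)
        {
                int n = 32;
                struct qe_bd_like *ring;

                /* Buggy pattern: n * sizeof(struct qe_bd_like *) only
                 * reserves room for n pointers.  sizeof(*ring) sizes the
                 * elements themselves and tracks the type automatically.
                 */
                ring = malloc(n * sizeof(*ring));
                if (!ring)
                        return 1;
                free(ring);
                return 0;
        }
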
| * net/wan/fsl_ucc_hdlc: fix wrong indentation (Holger Brunck, 2017-05-18; 1 file, -1/+1)

    Signed-off-by: Holger Brunck <holger.brunck@keymile.com>
    Cc: Zhao Qiang <qiang.zhao@nxp.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

| * net/wan/fsl_ucc_hdlc: fix uninitialized variable warnings (Holger Brunck, 2017-05-18; 1 file, -2/+2)

    This fixes the following compiler warnings:

    drivers/net/wan/fsl_ucc_hdlc.c: In function 'ucc_hdlc_poll':
    warning: 'skb' may be used uninitialized in this function [-Wmaybe-uninitialized]
      skb->mac_header = skb->data - skb->head;

    and

    drivers/net/wan/fsl_ucc_hdlc.c: In function 'ucc_hdlc_probe':
    drivers/net/wan/fsl_ucc_hdlc.c:1127:3: warning: 'utdm' may be used uninitialized in this function [-Wmaybe-uninitialized]
      kfree(utdm);

    Signed-off-by: Holger Brunck <holger.brunck@keymile.com>
    Cc: Zhao Qiang <qiang.zhao@nxp.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

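    A minimal sketch of the usual way such warnings are silenced, in plain C
    rather than the driver code: give the variable a safe initial value so
    every later path that reads or frees it is well defined.

        #include <stdlib.h>

        static void *grab_buffer(int available)
        {
                return available ? malloc(64) : NULL;
        }

        int setup(int want_tdm)
        {
                void *utdm = NULL;  /* initialized so every exit path is defined */

                if (want_tdm) {
                        utdm = grab_buffer(1);
                        if (!utdm)
                                return -1;
                }
                /* ... further setup that may bail out ... */
                free(utdm);         /* free(NULL) is a no-op, so this stays safe */
                return 0;
        }
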
| * net/wan/fsl_ucc_hdlc: cleanup debug traces (Holger Brunck, 2017-05-18; 1 file, -33/+0)

    Some of the tracing appears to be left over from basic driver
    development. It can be removed now, as it causes noisy printouts.

    Signed-off-by: Holger Brunck <holger.brunck@keymile.com>
    Cc: Zhao Qiang <qiang.zhao@nxp.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

* net: make struct net_device::tx_queue_len unsigned int (Alexey Dobriyan, 2017-05-18; 4 files, -6/+11)

    A 4-billion-packet queue is something unthinkable, so use a 32-bit value
    for now.

    Space savings on x86_64:

    add/remove: 0/0 grow/shrink: 3/70 up/down: 16/-131 (-115)
    function                old     new   delta
    change_tx_queue_len      94     108     +14
    qdisc_create           1176    1177      +1
    alloc_netdev_mqs       1124    1125      +1
    xenvif_alloc            533     532      -1
    x25_asy_setup           167     166      -1
    ...
    tun_queue_resize        945     940      -5
    pfifo_fast_enqueue      167     162      -5
    qfq_init_qdisc          168     158     -10
    tap_queue_resize        810     799     -11
    transmit                719     698     -21

    Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

* udp: make function udp_skb_dtor_locked static (Colin Ian King, 2017-05-18; 1 file, -1/+1)

    Function udp_skb_dtor_locked does not need to be in global scope, so make
    it static to fix the sparse warning:

    net/ipv4/udp.c: warning: symbol 'udp_skb_dtor_locked' was not declared.
    Should it be static?

    Fixes: 6dfb4367cd911d ("udp: keep the sk_receive_queue held when splicing")
    Signed-off-by: Colin Ian King <colin.king@canonical.com>
    Acked-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

* Merge branch 'vhost_net-rx-batch-dequeuing' (David S. Miller, 2017-05-18; 7 files, -18/+327)

    Jason Wang says:

    ====================
    vhost_net rx batch dequeuing

    This series tries to implement rx batching for vhost-net. This is done by
    batching the dequeuing from the skb_array which is exported by the
    underlying socket, and passing the skb back through msg_control to finish
    the userspace copying. This is also a requirement for more batching
    implementations on the rx path.

    Tests show at most a 7.56% improvement in rx pps on top of batch zeroing,
    and no obvious changes for TCP_STREAM/TCP_RR results.

    Please review.

    Thanks

    Changes from V4:
    - drop batch zeroing patch
    - renew the performance numbers
    - move skb pointer array out of vhost_net structure

    Changes from V3:
    - add batch zeroing patch to fix the build warnings

    Changes from V2:
    - rebase to net-next HEAD
    - use unconsume helpers to put skb back on releasing
    - introduce and use vhost_net internal buffer helpers
    - renew performance numbers on top of batch zeroing

    Changes from V1:
    - switch to use for() in __ptr_ring_consume_batched()
    - rename peek_head_len_batched() to fetch_skbs()
    - use skb_array_consume_batched() instead of
      skb_array_consume_batched_bh() since no consumer runs in bh
    - drop the lockless peeking patch since skb_array could be resized, so
      it's not safe to call the lockless one
    ====================

    Acked-by: Michael S. Tsirkin <mst@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

| * vhost_net: try batch dequeuing from skb array (Jason Wang, 2017-05-18; 1 file, -6/+122)

    We used to dequeue one skb during recvmsg() from the skb_array; this
    could be inefficient because of the bad cache utilization and the
    spinlock touching for each packet. This patch tries to batch them by
    calling the batch dequeuing helpers explicitly on the exported skb array
    and passing the skb back through msg_control for the underlying socket to
    finish the userspace copying. Batch dequeuing is also the requirement for
    more batching improvements on the receive path.

    Tests were done by pktgen on tap with XDP1 in the guest. Host is an
    Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz.

    rx batch | pps
         0     2.25Mpps
         1     2.33Mpps (+3.56%)
         4     2.33Mpps (+3.56%)
        16     2.35Mpps (+4.44%)
        64     2.42Mpps (+7.56%) <- Default rx batching
       128     2.40Mpps (+6.67%)
       256     2.38Mpps (+5.78%)

    Signed-off-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

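    A rough userspace sketch of the batching idea described above (not the
    vhost code; the names are invented): instead of taking the queue lock
    once per packet, the consumer pulls up to a fixed batch into a local
    array under a single lock round-trip and processes the batch afterwards.

        #include <pthread.h>
        #include <stddef.h>

        #define QUEUE_SLOTS 1024
        #define RX_BATCH    64

        struct pkt_queue {
                pthread_mutex_t lock;
                void *slots[QUEUE_SLOTS];
                size_t head, tail;
        };

        /* Take up to max packets in one lock round-trip. */
        static size_t fetch_batch(struct pkt_queue *q, void *out[], size_t max)
        {
                size_t n = 0;

                pthread_mutex_lock(&q->lock);
                while (n < max && q->head != q->tail) {
                        out[n++] = q->slots[q->head];
                        q->head = (q->head + 1) % QUEUE_SLOTS;
                }
                pthread_mutex_unlock(&q->lock);
                return n;
        }

        void rx_poll(struct pkt_queue *q, void (*deliver)(void *pkt))
        {
                void *batch[RX_BATCH];
                size_t i, n = fetch_batch(q, batch, RX_BATCH);

                for (i = 0; i < n; i++)  /* per-packet work runs without the lock */
                        deliver(batch[i]);
        }
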
| * tap: support receiving skb from msg_control (Jason Wang, 2017-05-18; 1 file, -4/+8)

    This patch makes tap_recvmsg() able to receive an skb from its caller
    through msg_control. Vhost_net will be the first user.

    Signed-off-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

| * tun: support receiving skb through msg_control (Jason Wang, 2017-05-18; 1 file, -8/+10)

    This patch makes tun_recvmsg() able to receive an skb from its caller
    through msg_control. Vhost_net will be the first user.

    Signed-off-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

| * tap: export skb_array (Jason Wang, 2017-05-18; 2 files, -0/+18)

    This patch exports skb_array through tap_get_skb_array(). Caller can then
    manipulate the skb array directly.

    Signed-off-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

| * tun: export skb_array (Jason Wang, 2017-05-18; 2 files, -0/+18)

    This patch exports skb_array through tun_get_skb_array(). Caller can then
    manipulate the skb array directly.

    Signed-off-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

| * skb_array: introduce batch dequeuing (Jason Wang, 2017-05-18; 1 file, -0/+25)

    Signed-off-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

| * ptr_ring: introduce batch dequeuing (Jason Wang, 2017-05-18; 1 file, -0/+65)

    This patch introduces a batched version of consuming: a consumer can
    dequeue more than one pointer from the ring at a time. We don't care
    about the reordering of reads here, so there is no need for a compiler
    barrier.

    Signed-off-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

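    A simplified single-consumer sketch of what batched consuming can look
    like (illustrative only; the real ptr_ring code differs): empty slots are
    marked NULL, so the batched helper simply repeats the single-entry
    dequeue until the caller's array is full or an empty slot is reached.

        #include <stddef.h>

        #define RING_SIZE 256

        struct toy_ring {
                void *queue[RING_SIZE];  /* NULL marks an empty slot */
                unsigned int consumer;
        };

        static void *ring_consume_one(struct toy_ring *r)
        {
                void *ptr = r->queue[r->consumer];

                if (ptr) {
                        r->queue[r->consumer] = NULL;
                        r->consumer = (r->consumer + 1) % RING_SIZE;
                }
                return ptr;
        }

        /* Dequeue up to n entries into array[]; returns how many were taken. */
        static int ring_consume_batched(struct toy_ring *r, void **array, int n)
        {
                int i;

                for (i = 0; i < n; i++) {
                        void *ptr = ring_consume_one(r);

                        if (!ptr)
                                break;
                        array[i] = ptr;
                }
                return i;
        }
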
| * skb_array: introduce skb_array_unconsume (Jason Wang, 2017-05-18; 1 file, -0/+6)

    Signed-off-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

| * ptr_ring: add ptr_ring_unconsume (Michael S. Tsirkin, 2017-05-18; 1 file, -0/+55)

    Applications that consume a batch of entries in one go can benefit from
    the ability to return some of them back into the ring. Add an API for
    that - assuming there's space. If there's no space we naturally can't do
    this and have to drop entries, but this implies the ring is full, so we'd
    likely drop some anyway.

    Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    Signed-off-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

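    Continuing the toy ring from the sketch a few entries above (still
    illustrative, not the actual ptr_ring implementation): entries can be
    pushed back in front of the consumer only while a free slot remains;
    whatever no longer fits has to be dropped, which is the trade-off the
    changelog describes.

        #include <stddef.h>

        #define RING_SIZE 256

        struct toy_ring {                /* same toy ring as in the sketch above */
                void *queue[RING_SIZE];  /* NULL marks an empty slot */
                unsigned int consumer;
        };

        /* Push entries back in front of the consumer while there is room.
         * Entries that no longer fit are destroyed: the ring being full
         * means something would have been dropped anyway.
         */
        static void ring_unconsume(struct toy_ring *r, void **batch, int n,
                                   void (*destroy)(void *ptr))
        {
                while (n) {
                        unsigned int prev =
                                (r->consumer + RING_SIZE - 1) % RING_SIZE;

                        if (r->queue[prev])  /* no free slot behind the consumer */
                                break;
                        r->queue[prev] = batch[--n];
                        r->consumer = prev;
                }
                while (n)
                        destroy(batch[--n]);
        }
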
* Merge branch 'phy-marvell-cleanups' (David S. Miller, 2017-05-17; 1 file, -284/+352)

    Andrew Lunn says:

    ====================
    net: phy: marvell: Checkpatch cleanup

    I will be contributing a few new features to the Marvell PHY driver
    soon. Start by making the code mostly checkpatch clean.

    There should not be any functional changes. Just comments set into the
    correct format, missing blank lines, turning some comparisons around, and
    refactoring to reduce indentation depth.

    There is still one camel in the code, but it actually makes sense, so
    leave it in peace.
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>

| * net: phy: marvell: checkpatch - Fix remaining long lines (Andrew Lunn, 2017-05-17; 1 file, -4/+8)

    Fold lines longer than 80 characters.

    Signed-off-by: Andrew Lunn <andrew@lunn.ch>
    Signed-off-by: David S. Miller <davem@davemloft.net>

| * net: phy: marvell: Add helpers to get/set page (Andrew Lunn, 2017-05-17; 1 file, -56/+59)

    Makes the code a bit more readable, and solves quite a few checkpatch
    warnings of lines longer than 80 characters.

    Signed-off-by: Andrew Lunn <andrew@lunn.ch>
    Signed-off-by: David S. Miller <davem@davemloft.net>

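    The shape of such helpers is roughly as follows (a sketch, not the patch
    itself: the helper names are invented, and the page selector living in
    register 22 is an assumption based on common Marvell PHYs).

        #include <linux/phy.h>

        #define EXAMPLE_PAGE_REG  22   /* page selector register (assumed) */

        static int example_get_page(struct phy_device *phydev)
        {
                return phy_read(phydev, EXAMPLE_PAGE_REG);
        }

        static int example_set_page(struct phy_device *phydev, int page)
        {
                return phy_write(phydev, EXAMPLE_PAGE_REG, page);
        }

        /* Typical use: remember the current page, switch, access a paged
         * register, then restore - without long lines at the call sites.
         */
        static int example_read_paged(struct phy_device *phydev, int page,
                                      u32 regnum)
        {
                int oldpage, ret;

                oldpage = example_get_page(phydev);
                if (oldpage < 0)
                        return oldpage;
                example_set_page(phydev, page);
                ret = phy_read(phydev, regnum);
                example_set_page(phydev, oldpage);
                return ret;
        }
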
| * net: phy: marvell: Refactor some bigger functions (Andrew Lunn, 2017-05-17; 1 file, -213/+271)

    Break big functions up by using a number of smaller helper functions.
    Solves some of the over-80-character line warnings by reducing the
    indentation level.

    Signed-off-by: Andrew Lunn <andrew@lunn.ch>
    Signed-off-by: David S. Miller <davem@davemloft.net>

| * net: phy: marvell: Checkpatch - assignments and comparisons (Andrew Lunn, 2017-05-17; 1 file, -3/+5)

    Avoid multiple assignments.
    Comparisons should place the constant on the right side of the test.

    Signed-off-by: Andrew Lunn <andrew@lunn.ch>
    Signed-off-by: David S. Miller <davem@davemloft.net>

| * net: phy: marvell: Checkpatch - Missing or extra blank lines (Andrew Lunn, 2017-05-17; 1 file, -3/+2)

    Remove the extra blank lines, add one in where recommended.

    Signed-off-by: Andrew Lunn <andrew@lunn.ch>
    Signed-off-by: David S. Miller <davem@davemloft.net>

| * net: phy: Marvell: checkpatch - Comments (Andrew Lunn, 2017-05-17; 1 file, -13/+15)

    Use net style comment blocks, and wrap one block with long lines.

    Signed-off-by: Andrew Lunn <andrew@lunn.ch>
    Signed-off-by: David S. Miller <davem@davemloft.net>

* Merge branch 'tcp-TCP-TS-option-use-1-ms-clock' (David S. Miller, 2017-05-17; 24 files, -274/+259)

    Eric Dumazet says:

    ====================
    tcp: TCP TS option use 1 ms clock

    TCP Timestamps option is defined in RFC 7323.

    Traditionally on linux, it has been tied to the internal 'jiffy'
    variable, because it had been a cheap and good enough generator.

    Unfortunately some distros use HZ=250 or even HZ=100, leading to not very
    useful TCP timestamps.

    For TCP flows in the DC, Google has used usec resolution for more than
    two years with great success [1]. RCVBUF autotuning is more precise.

    This series converts tp->tcp_mstamp to a plain u64 value storing a 1 usec
    TCP clock.

    This choice will allow us to upstream the 1 usec TS option as discussed
    in IETF 97.

    Kathleen Nichols [2] and others advocate for 1ms TS clocks for network
    analysis. (1ms being the lowest value supported by RFC 7323.)

    [1] https://www.ietf.org/proceedings/97/slides/slides-97-tcpm-tcp-options-for-low-latency-00.pdf
    [2] http://netseminar.stanford.edu/seminars/02_02_17.pdf
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>

| * tcp: switch TCP TS option (RFC 7323) to 1ms clock (Eric Dumazet, 2017-05-17; 17 files, -199/+178)

    TCP Timestamps option is defined in RFC 7323.

    Traditionally on linux, it has been tied to the internal 'jiffies'
    variable, because it had been a cheap and good enough generator.

    For TCP flows on the Internet, 1 ms resolution would be much better than
    4 ms or 10 ms (HZ=250 or HZ=100 respectively).

    For TCP flows in the DC, Google has used usec resolution for more than
    two years with great success [1]. Receive size autotuning (DRS) is indeed
    more precise and converges faster to the optimal window size.

    This patch converts tp->tcp_mstamp to a plain u64 value storing a 1 usec
    TCP clock.

    This choice will allow us to upstream the 1 usec TS option as discussed
    in IETF 97.

    [1] https://www.ietf.org/proceedings/97/slides/slides-97-tcpm-tcp-options-for-low-latency-00.pdf

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

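    A standalone sketch of the clock arithmetic involved (not the kernel
    implementation): keep one monotonic 64-bit microsecond counter as the
    source of truth and derive the 32-bit, 1 ms timestamp-option value by
    dividing it down.

        #include <stdint.h>
        #include <stdio.h>
        #include <time.h>

        /* 64-bit monotonic clock in usec, standing in for tp->tcp_mstamp. */
        static uint64_t clock_us(void)
        {
                struct timespec ts;

                clock_gettime(CLOCK_MONOTONIC, &ts);
                return (uint64_t)ts.tv_sec * 1000000 +
                       (uint64_t)ts.tv_nsec / 1000;
        }

        /* Value carried in the TCP TS option: 32 bits at 1 ms resolution. */
        static uint32_t ts_option_val(uint64_t us)
        {
                return (uint32_t)(us / 1000);
        }

        int main(void)
        {
                uint64_t now = clock_us();

                printf("usec clock %llu -> 1 ms TS value %u\n",
                       (unsigned long long)now, ts_option_val(now));
                return 0;
        }
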
| * tcp: replace misc tcp_time_stamp to tcp_jiffies32 (Eric Dumazet, 2017-05-17; 5 files, -6/+6)

    After this patch, all uses of tcp_time_stamp will require a change when
    we introduce 1 ms and/or 1 us TCP TS option.

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

| * tcp_lp: cache tcp_time_stamp (Eric Dumazet, 2017-05-17; 1 file, -3/+4)

    tcp_time_stamp will become slightly more expensive soon, cache its value.

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

| * tcp_westwood: use tcp_jiffies32 instead of tcp_time_stamp (Eric Dumazet, 2017-05-17; 1 file, -3/+3)

    This CC does not need 1 ms tcp_time_stamp and can use the jiffy based
    'timestamp'.

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

| * tcp: use tcp_jiffies32 in __tcp_oow_rate_limited() (Eric Dumazet, 2017-05-17; 1 file, -2/+2)

    This place wants to use tcp_jiffies32, this is good enough.

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

| * tcp: uses jiffies_32 to feed tp->chrono_start (Eric Dumazet, 2017-05-17; 2 files, -2/+2)

    tcp_time_stamp will no longer be tied to jiffies.

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

| * tcp: use tcp_jiffies32 to feed probe_timestamp (Eric Dumazet, 2017-05-17; 2 files, -4/+4)

    Use tcp_jiffies32 instead of tcp_time_stamp, since tcp_time_stamp will
    soon be only used for TCP TS option.

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

| * tcp: use tcp_jiffies32 for rcv_tstamp and lrcvtime (Eric Dumazet, 2017-05-17; 5 files, -8/+8)

    Use tcp_jiffies32 instead of tcp_time_stamp, since tcp_time_stamp will
    soon be only used for TCP TS option.

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

| * tcp: bic, cubic: use tcp_jiffies32 instead of tcp_time_stamp (Eric Dumazet, 2017-05-17; 2 files, -9/+9)

    Use tcp_jiffies32 instead of tcp_time_stamp, since tcp_time_stamp will
    soon be only used for TCP TS option.

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

| * tcp_bbr: use tcp_jiffies32 instead of tcp_time_stamp (Eric Dumazet, 2017-05-17; 1 file, -6/+6)

    Use tcp_jiffies32 instead of tcp_time_stamp, since tcp_time_stamp will
    soon be only used for TCP TS option.

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

| * tcp: use tcp_jiffies32 to feed tp->snd_cwnd_stamp (Eric Dumazet, 2017-05-17; 3 files, -12/+12)

    Use tcp_jiffies32 instead of tcp_time_stamp to feed tp->snd_cwnd_stamp.

    tcp_time_stamp will soon be a little bit more expensive than simply
    reading 'jiffies'.

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

| * tcp: use tcp_jiffies32 to feed tp->lsndtime (Eric Dumazet, 2017-05-17; 6 files, -9/+9)

    Use tcp_jiffies32 instead of tcp_time_stamp to feed tp->lsndtime.

    tcp_time_stamp will soon be a little bit more expensive than simply
    reading 'jiffies'.

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

| * dccp: do not use tcp_time_stamp (Eric Dumazet, 2017-05-17; 2 files, -5/+5)

    Use our own macro instead of abusing tcp_time_stamp.

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

| * tcp: introduce tcp_jiffies32 (Eric Dumazet, 2017-05-17; 1 file, -5/+8)

    We abuse tcp_time_stamp for two different cases:

    1) base to generate TCP Timestamp options (RFC 7323)

    2) A 32bit version of jiffies since some TCP fields are 32bit wide to
       save memory.

    Since we want in the future to have 1ms TCP TS clock, regardless of HZ
    value, we want to cleanup things.

    tcp_jiffies32 is the truncated jiffies value, which will be used only in
    places where we want a 'host' timestamp.

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

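    In spirit the new helper is just a 32-bit truncation of the tick counter;
    a toy sketch of that, plus the wrap-safe comparison such truncated stamps
    require (details of the real definition may differ).

        #include <stdint.h>

        unsigned long fake_jiffies;     /* stand-in for the kernel tick counter */

        /* Cheap 32-bit 'host' timestamp: truncate the tick counter.  It
         * wraps, so comparisons must be wrap-safe, as below.
         */
        static inline uint32_t fake_tcp_jiffies32(void)
        {
                return (uint32_t)fake_jiffies;
        }

        /* True if a is before b despite 32-bit wrap-around. */
        static inline int stamp_before(uint32_t a, uint32_t b)
        {
                return (int32_t)(a - b) < 0;
        }
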
| * tcp: use tp->tcp_mstamp in output path (Eric Dumazet, 2017-05-17; 4 files, -12/+14)

    Idea is to later convert tp->tcp_mstamp to a full u64 counter using usec
    resolution, so that we can later have fine grained TCP TS clock
    (RFC 7323), regardless of HZ value.

    We try to refresh tp->tcp_mstamp only when necessary.

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

* sch_dsmark: Fix uninitialized variable warning. (David S. Miller, 2017-05-17; 1 file, -1/+1)

    We still need to initialize err to -EINVAL for the case where 'opt' is
    NULL in dsmark_init().

    Fixes: 6529eaba33f0 ("net: sched: introduce tcf block infractructure")
    Signed-off-by: David S. Miller <davem@davemloft.net>

* Merge branch 'net-sched-multichain-filters' (David S. Miller, 2017-05-17; 23 files, -260/+625)

    Jiri Pirko says:

    ====================
    net: sched: introduce multichain support for filters

    Currently, each classful qdisc holds one chain of filters. This chain is
    traversed and each filter could be matched on, which may lead to
    execution of a list of actions. One such action could be "reclassify",
    which "resets" the processing of the filter chain. So this filter chain
    could be looked at as a flat table.

    Sometimes it is convenient for the user to configure a hierarchy of
    tables. An example use case is encapsulation. A hierarchy of tables is a
    common way this is done in HW pipelines, so it is much more convenient to
    offload this.

    This patchset contains two major patches:
    8/10  - This patch introduces the support for having multiple chains of
            filters.
    10/10 - This patch adds a new control action to allow going to a
            specified chain.
    The rest of the patches are smaller or bigger dependencies of those 2.
    Please see the individual patch descriptions for details.

    Corresponding iproute2 patches are appended as a reply to this cover
    letter.

    Simple example:

    $ tc qdisc add dev eth0 ingress
    $ tc filter add dev eth0 parent ffff: protocol ip pref 33 flower dst_mac 52:54:00:3d:c7:6d action goto chain 11
    $ tc filter add dev eth0 parent ffff: protocol ip pref 22 chain 11 flower dst_ip 192.168.40.1 action drop
    $ tc filter show dev eth0 root
    filter parent ffff: protocol ip pref 33 flower chain 0
    filter parent ffff: protocol ip pref 33 flower chain 0 handle 0x1
      dst_mac 52:54:00:3d:c7:6d
      eth_type ipv4
            action order 1: gact action goto chain 11
             random type none pass val 0
             index 2 ref 1 bind 1

    filter parent ffff: protocol ip pref 22 flower chain 11
    filter parent ffff: protocol ip pref 22 flower chain 11 handle 0x1
      eth_type ipv4
      dst_ip 192.168.40.1
            action order 1: gact action drop
             random type none pass val 0
             index 3 ref 1 bind 1
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>

| * net: sched: add termination action to allow goto chain (Jiri Pirko, 2017-05-17; 5 files, -3/+54)
| | | | | | | | | | | | | | | | | | | | | | | | Introduce new type of termination action called "goto_chain". This allows user to specify a chain to be processed. This action type is then processed as a return value in tcf_classify loop in similar way as "reclassify" is, only it does not reset to the first filter in chain but rather reset to the first filter of the desired chain. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>