summaryrefslogtreecommitdiffstats
path: root/drivers/net/xen-netback
Commit message (Collapse)AuthorAgeFilesLines
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds2015-02-113-106/+4Star
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking updates from David Miller: 1) More iov_iter conversion work from Al Viro. [ The "crypto: switch af_alg_make_sg() to iov_iter" commit was wrong, and this pull actually adds an extra commit on top of the branch I'm pulling to fix that up, so that the pre-merge state is ok. - Linus ] 2) Various optimizations to the ipv4 forwarding information base trie lookup implementation. From Alexander Duyck. 3) Remove sock_iocb altogether, from CHristoph Hellwig. 4) Allow congestion control algorithm selection via routing metrics. From Daniel Borkmann. 5) Make ipv4 uncached route list per-cpu, from Eric Dumazet. 6) Handle rfs hash collisions more gracefully, also from Eric Dumazet. 7) Add xmit_more support to r8169, e1000, and e1000e drivers. From Florian Westphal. 8) Transparent Ethernet Bridging support for GRO, from Jesse Gross. 9) Add BPF packet actions to packet scheduler, from Jiri Pirko. 10) Add support for uniqu flow IDs to openvswitch, from Joe Stringer. 11) New NetCP ethernet driver, from Muralidharan Karicheri and Wingman Kwok. 12) More sanely handle out-of-window dupacks, which can result in serious ACK storms. From Neal Cardwell. 13) Various rhashtable bug fixes and enhancements, from Herbert Xu, Patrick McHardy, and Thomas Graf. 14) Support xmit_more in be2net, from Sathya Perla. 15) Group Policy extensions for vxlan, from Thomas Graf. 16) Remove Checksum Offload support for vxlan, from Tom Herbert. 17) Like ipv4, support lockless transmit over ipv6 UDP sockets. From Vlad Yasevich. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1494+1 commits) crypto: fix af_alg_make_sg() conversion to iov_iter ipv4: Namespecify TCP PMTU mechanism i40e: Fix for stats init function call in Rx setup tcp: don't include Fast Open option in SYN-ACK on pure SYN-data openvswitch: Only set TUNNEL_VXLAN_OPT if VXLAN-GBP metadata is set ipv6: Make __ipv6_select_ident static ipv6: Fix fragment id assignment on LE arches. bridge: Fix inability to add non-vlan fdb entry net: Mellanox: Delete unnecessary checks before the function call "vunmap" cxgb4: Add support in cxgb4 to get expansion rom version via ethtool ethtool: rename reserved1 memeber in ethtool_drvinfo for expansion ROM version net: dsa: Remove redundant phy_attach() IB/mlx4: Reset flow support for IB kernel ULPs IB/mlx4: Always use the correct port for mirrored multicast attachments net/bonding: Fix potential bad memory access during bonding events tipc: remove tipc_snprintf tipc: nl compat add noop and remove legacy nl framework tipc: convert legacy nl stats show to nl compat tipc: convert legacy nl net id get to nl compat tipc: convert legacy nl net id set to nl compat ...
| * xen-netback: fix sparse warningLad, Prabhakar2015-02-061-1/+1
| | | | | | | | | | | | | | | | | | | | this patch fixes following sparse warning: interface.c:83:5: warning: symbol 'xenvif_poll' was not declared. Should it be static? Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2015-02-052-2/+3
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/vxlan.c drivers/vhost/net.c include/linux/if_vlan.h net/core/dev.c The net/core/dev.c conflict was the overlap of one commit marking an existing function static whilst another was adding a new function. In the include/linux/if_vlan.h case, the type used for a local variable was changed in 'net', whereas the function got rewritten to fix a stacked vlan bug in 'net-next'. In drivers/vhost/net.c, Al Viro's iov_iter conversions in 'net-next' overlapped with an endainness fix for VHOST 1.0 in 'net'. In drivers/net/vxlan.c, vxlan_find_vni() added a 'flags' parameter in 'net-next' whereas in 'net' there was a bug fix to pass in the correct network namespace pointer in calls to this function. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | xen-netback: always fully coalesce guest Rx packetsDavid Vrabel2015-01-242-105/+3Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Always fully coalesce guest Rx packets into the minimum number of ring slots. Reducing the number of slots per packet has significant performance benefits when receiving off-host traffic. Results from XenServer's performance benchmarks: Baseline Full coalesce Interhost VM receive 7.2 Gb/s 11 Gb/s Interhost aggregate 24 Gb/s 24 Gb/s Intrahost single stream 14 Gb/s 14 Gb/s Intrahost aggregate 34 Gb/s 34 Gb/s However, this can increase the number of grant ops per packet which decreases performance of backend (dom0) to VM traffic (by ~10%) /unless/ grant copy has been optimized for adjacent ops with the same source or destination (see "grant-table: defer releasing pages acquired in a grant copy"[1] expected in Xen 4.6). [1] http://lists.xen.org/archives/html/xen-devel/2015-01/msg01118.html Signed-off-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | Merge tag 'stable/for-linus-3.20-rc0-tag' of ↵Linus Torvalds2015-02-102-101/+12Star
|\ \ \ | |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen features and fixes from David Vrabel: - Reworked handling for foreign (grant mapped) pages to simplify the code, enable a number of additional use cases and fix a number of long-standing bugs. - Prefer the TSC over the Xen PV clock when dom0 (and the TSC is stable). - Assorted other cleanup and minor bug fixes. * tag 'stable/for-linus-3.20-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (25 commits) xen/manage: Fix USB interaction issues when resuming xenbus: Add proper handling of XS_ERROR from Xenbus for transactions. xen/gntdev: provide find_special_page VMA operation xen/gntdev: mark userspace PTEs as special on x86 PV guests xen-blkback: safely unmap grants in case they are still in use xen/gntdev: safely unmap grants in case they are still in use xen/gntdev: convert priv->lock to a mutex xen/grant-table: add a mechanism to safely unmap pages that are in use xen-netback: use foreign page information from the pages themselves xen: mark grant mapped pages as foreign xen/grant-table: add helpers for allocating pages x86/xen: require ballooned pages for grant maps xen: remove scratch frames for ballooned pages and m2p override xen/grant-table: pre-populate kernel unmap ops for xen_gnttab_unmap_refs() mm: add 'foreign' alias for the 'pinned' page flag mm: provide a find_special_page vma operation x86/xen: cleanup arch/x86/xen/mmu.c x86/xen: add some __init annotations in arch/x86/xen/mmu.c x86/xen: add some __init and static annotations in arch/x86/xen/setup.c x86/xen: use correct types for addresses in arch/x86/xen/setup.c ...
| * | xen-netback: use foreign page information from the pages themselvesJennifer Herbert2015-01-281-91/+9Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use the foreign page flag in netback to get the domid and grant ref needed for the grant copy. This signficiantly simplifies the netback code and makes netback work with foreign pages from other backends (e.g., blkback). This allows blkback to use iSCSI disks provided by domUs running on the same host. Signed-off-by: Jennifer Herbert <jennifer.herbert@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
| * | xen/grant-table: add helpers for allocating pagesDavid Vrabel2015-01-281-4/+3Star
| | | | | | | | | | | | | | | | | | | | | | | | Add gnttab_alloc_pages() and gnttab_free_pages() to allocate/free pages suitable to for granted maps. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
| * | x86/xen: require ballooned pages for grant mapsJennifer Herbert2015-01-281-6/+0Star
| |/ | | | | | | | | | | | | | | | | | | | | | | | | Ballooned pages are always used for grant maps which means the original frame does not need to be saved in page->index nor restored after the grant unmap. This allows the workaround in netback for the conflicting use of the (unionized) page->index and page->pfmemalloc to be removed. Signed-off-by: Jennifer Herbert <jennifer.herbert@citrix.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
* / xen-netback: stop the guest rx thread after a fatal errorDavid Vrabel2015-02-032-2/+3
|/ | | | | | | | | | | | | | | | | | | After commit e9d8b2c2968499c1f96563e6522c56958d5a1d0d (xen-netback: disable rogue vif in kthread context), a fatal (protocol) error would leave the guest Rx thread spinning, wasting CPU time. Commit ecf08d2dbb96d5a4b4bcc53a39e8d29cc8fef02e (xen-netback: reintroduce guest Rx stall detection) made this even worse by removing a cond_resched() from this path. Since a fatal error is non-recoverable, just allow the guest Rx thread to exit. This requires taking additional refs to the task so the thread exiting early is handled safely. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reported-by: Julien Grall <julien.grall@linaro.org> Tested-by: Julien Grall <julien.grall@linaro.org> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* xen-netback: fixing the propagation of the transmit shaper timeoutPalik, Imre2015-01-061-0/+1
| | | | | | | | | | | | | Since e9ce7cb6b107 ("xen-netback: Factor queue-specific data into queue struct"), the transimt shaper timeout is always set to 0. The value the user sets via xenbus is never propagated to the transmit shaper. This patch fixes the issue. Cc: Anthony Liguori <aliguori@amazon.com> Signed-off-by: Imre Palik <imrep@amazon.de> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* xen-netback: support frontends without feature-rx-notify againDavid Vrabel2014-12-184-18/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit bc96f648df1bbc2729abbb84513cf4f64273a1f1 (xen-netback: make feature-rx-notify mandatory) incorrectly assumed that there were no frontends in use that did not support this feature. But the frontend driver in MiniOS does not and since this is used by (qemu) stubdoms, these stopped working. Netback sort of works as-is in this mode except: - If there are no Rx requests and the internal Rx queue fills, only the drain timeout will wake the thread. The default drain timeout of 10 s would give unacceptable pauses. - If an Rx stall was detected and the internal Rx queue is drained, then the Rx thread would never wake. Handle these two cases (when feature-rx-notify is disabled) by: - Reducing the drain timeout to 30 ms. - Disabling Rx stall detection. Reported-by: John <jw@nuclearfallout.net> Tested-by: John <jw@nuclearfallout.net> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2014-12-101-4/+5
|\ | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/ethernet/amd/xgbe/xgbe-desc.c drivers/net/ethernet/renesas/sh_eth.c Overlapping changes in both conflict cases. Signed-off-by: David S. Miller <davem@davemloft.net>
| * netback: don't store invalid vif pointerJan Beulich2014-12-101-4/+5
| | | | | | | | | | | | | | | | | | | | When xenvif_alloc() fails, it returns a non-NULL error indicator. To avoid eventual races, we shouldn't store that into struct backend_info as readers of it only check for NULL. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2014-11-301-6/+9
|\|
| * xen-netback: do not report success if backend_create_xenvif() failsAlexey Khoroshilov2014-11-241-6/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | If xenvif_alloc() or xenbus_scanf() fail in backend_create_xenvif(), xenbus is left in offline mode but netback_probe() reports success. The patch implements propagation of error code for backend_create_xenvif(). Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | xen-netback: remove unconditional __pskb_pull_tail() in guest Tx pathMalcolm Crossley2014-11-061-14/+12Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Unconditionally pulling 128 bytes into the linear area is not required for: - security: Every protocol demux starts with pskb_may_pull() to pull frag data into the linear area, if necessary, before looking at headers. - performance: Netback has already grant copied up-to 128 bytes from the first slot of a packet into the linear area. The first slot normally contain all the IPv4/IPv6 and TCP/UDP headers. The unconditional pull would often copy frag data unnecessarily. This is a performance problem when running on a version of Xen where grant unmap avoids TLB flushes for pages which are not accessed. TLB flushes can now be avoided for > 99% of unmaps (it was 0% before). Grant unmap TLB flush avoidance will be available in a future version of Xen (probably 4.6). Signed-off-by: Malcolm Crossley <malcolm.crossley@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2014-11-014-203/+251
|\| | | | | | | | | | | | | | | | | Conflicts: drivers/net/phy/marvell.c Simple overlapping changes in drivers/net/phy/marvell.c Signed-off-by: David S. Miller <davem@davemloft.net>
| * xen-netback: reintroduce guest Rx stall detectionDavid Vrabel2014-10-254-1/+86
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a frontend not receiving packets it is useful to detect this and turn off the carrier so packets are dropped early instead of being queued and drained when they expire. A to-guest queue is stalled if it doesn't have enough free slots for a an extended period of time (default 60 s). If at least one queue is stalled, the carrier is turned off (in the expectation that the other queues will soon stall as well). The carrier is only turned on once all queues are ready. When the frontend connects, all the queues start in the stalled state and only become ready once the frontend queues enough Rx requests. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * xen-netback: fix unlimited guest Rx internal queue and carrier flappingDavid Vrabel2014-10-254-178/+161Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Netback needs to discard old to-guest skb's (guest Rx queue drain) and it needs detect guest Rx stalls (to disable the carrier so packets are discarded earlier), but the current implementation is very broken. 1. The check in hard_start_xmit of the slot availability did not consider the number of packets that were already in the guest Rx queue. This could allow the queue to grow without bound. The guest stops consuming packets and the ring was allowed to fill leaving S slot free. Netback queues a packet requiring more than S slots (ensuring that the ring stays with S slots free). Netback queue indefinately packets provided that then require S or fewer slots. 2. The Rx stall detection is not triggered in this case since the (host) Tx queue is not stopped. 3. If the Tx queue is stopped and a guest Rx interrupt occurs, netback will consider this an Rx purge event which may result in it taking the carrier down unnecessarily. It also considers a queue with only 1 slot free as unstalled (even though the next packet might not fit in this). The internal guest Rx queue is limited by a byte length (to 512 Kib, enough for half the ring). The (host) Tx queue is stopped and started based on this limit. This sets an upper bound on the amount of memory used by packets on the internal queue. This allows the estimatation of the number of slots for an skb to be removed (it wasn't a very good estimate anyway). Instead, the guest Rx thread just waits for enough free slots for a maximum sized packet. skbs queued on the internal queue have an 'expires' time (set to the current time plus the drain timeout). The guest Rx thread will detect when the skb at the head of the queue has expired and discard expired skbs. This sets a clear upper bound on the length of time an skb can be queued for. For a guest being destroyed the maximum time needed to wait for all the packets it sent to be dropped is still the drain timeout (10 s) since it will not be sending new packets. Rx stall detection is reintroduced in a later commit. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * xen-netback: make feature-rx-notify mandatoryDavid Vrabel2014-10-253-25/+5Star
| | | | | | | | | | | | | | | | | | | | | | | | | | Frontends that do not provide feature-rx-notify may stall because netback depends on the notification from frontend to wake the guest Rx thread (even if can_queue is false). This could be fixed but feature-rx-notify was introduced in 2006 and I am not aware of any frontends that do not implement this. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | xen-netback: Remove __GFP_COLDZoltan Kiss2014-10-291-1/+1
| | | | | | | | | | | | | | | | | | | | This flag is unnecessary, it came from some old code. Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Zoltan Kiss <zoltan.kiss@linaro.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | xen-netback: Disable NAPI after disabling interruptsZoltan Kiss2014-10-291-1/+1
|/ | | | | | | | | | | | Otherwise the interrupt handler still calls napi_complete. Although it won't schedule NAPI again as either NAPI_STATE_DISABLE or NAPI_STATE_SCHED is set, it is just unnecessary, and it makes more sense to do this way. Signed-off-by: Zoltan Kiss <zoltan.kiss@linaro.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* xen: remove DEFINE_XENBUS_DRIVER() macroDavid Vrabel2014-10-061-7/+3Star
| | | | | | | | | | The DEFINE_XENBUS_DRIVER() macro looks a bit weird and causes sparse errors. Replace the uses with standard structure definitions instead. This is similar to pci and usb device registration. Signed-off-by: David Vrabel <david.vrabel@citrix.com>
* xen-netback: move netif_napi_add before binding interruptWei Liu2014-08-261-3/+3
| | | | | | | | | | | | | | | | | | | | | Interrupt is enabled when bind_interdomain_evtchn_to_irqhandler returns. If there's interrupt pending interrupt handler is invoked. NAPI needs to be initialised before binding interrupt otherwise the interrupt handler will try to scheduling a NAPI instance that is not initialised yet, resulting in kernel OOPS. This fixes a regression introduced in ea2c5e13 ("xen-netback: move NAPI add/remove calls"). Ideally function calls to create kthreads should also be moved before binding but I intent to fix this regression with minimal changes and refactor the code with another patch. Reported-by: Thomas Leonard <talex5@gmail.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* xen-netback: remove loop waiting functionWei Liu2014-08-141-29/+0Star
| | | | | | | | | | | The original implementation relies on a loop to check if all inflight packets are freed. Now we have proper reference counting, there's no need to use loop anymore. Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* xen-netback: don't stop dealloc kthread too earlyWei Liu2014-08-143-7/+42
| | | | | | | | | | | | Reference count the number of packets in host stack, so that we don't stop the deallocation thread too early. If not, we can end up with xenvif_free permanently waiting for deallocation thread to unmap grefs. Reported-by: Thomas Leonard <talex5@gmail.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* xen-netback: move NAPI add/remove callsWei Liu2014-08-141-4/+5
| | | | | | | | | | | | | Originally netif_napi_add was in xenvif_init_queue and netif_napi_del was in xenvif_deinit_queue, while kthreads were handled in xenvif_connect and xenvif_disconnect. Move netif_napi_add and netif_napi_del to xenvif_connect and xenvif_disconnect so that they reside together with kthread operations. Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* xen-netback: fix debugfs entry creationWei Liu2014-08-141-4/+4
| | | | | | | | | | The original code is bogus. The function gets called in a loop which leaks entries created in previous rounds. Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: Zoltan Kiss <zoltan.kiss@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* xen-netback: fix debugfs write length checkWei Liu2014-08-141-3/+6
| | | | | | | | | | | | | | Enlarge buffer size and check input length properly, so that we don't misuse -ENOSPC. Note that command like "kickXXXX" is still allowed, that's one patch for another day if we really want to be very strict on this. Reported-by: SeeChen Ng <seechen81@gmail.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: Zoltan Kiss <zoltan.kiss@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* xen-netback: Don't deschedule NAPI when carrier offZoltan Kiss2014-08-111-5/+1Star
| | | | | | | | | | | | | | In the patch called "xen-netback: Turn off the carrier if the guest is not able to receive" NAPI was descheduled when the carrier was set off. That's not what most of the drivers do, and we don't have any specific reason to do so as well, so revert that change. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: xen-devel@lists.xenproject.org Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* xen-netback: Fix vif->disable handlingZoltan Kiss2014-08-081-2/+8
| | | | | | | | | | | | | In the patch called "xen-netback: Turn off the carrier if the guest is not able to receive" new branches were introduced to this if statement, risking that a queue with non-zero id can reenable the disabled interface. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: xen-devel@lists.xenproject.org Signed-off-by: David S. Miller <davem@davemloft.net>
* xen-netback: Turn off the carrier if the guest is not able to receiveZoltan Kiss2014-08-063-38/+123
| | | | | | | | | | | | | | | | | | | | | | Currently when the guest is not able to receive more packets, qdisc layer starts a timer, and when it goes off, qdisc is started again to deliver a packet again. This is a very slow way to drain the queues, consumes unnecessary resources and slows down other guests shutdown. This patch change the behaviour by turning the carrier off when that timer fires, so all the packets are freed up which were stucked waiting for that vif. Instead of the rx_queue_purge bool it uses the VIF_STATUS_RX_PURGE_EVENT bit to signal the thread that either the timeout happened or an RX interrupt arrived, so the thread can check what it should do. It also disables NAPI, so the guest can't transmit, but leaves the interrupts on, so it can resurrect. Only the queues which brought down the interface can enable it again, the bit QUEUE_STATUS_RX_STALLED makes sure of that. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: xen-devel@lists.xenproject.org Signed-off-by: David S. Miller <davem@davemloft.net>
* xen-netback: Using a new state bit instead of carrierZoltan Kiss2014-08-063-9/+18
| | | | | | | | | | | | | | | | | This patch introduces a new state bit VIF_STATUS_CONNECTED to track whether the vif is in a connected state. Using carrier will not work with the next patch in this series, which aims to turn the carrier temporarily off if the guest doesn't seem to be able to receive packets. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: xen-devel@lists.xenproject.org v2: - rename the bitshift type to "enum state_bit_shift" here, not in the next patch Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2014-07-221-23/+63
|\ | | | | | | | | | | | | | | | | Conflicts: drivers/infiniband/hw/cxgb4/device.c The cxgb4 conflict was simply overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
| * xen-netback: Fix pointer incrementation to avoid incorrect loggingZoltan Kiss2014-07-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | Due to this pointer is increased prematurely, the error log contains rubbish. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Reported-by: Armin Zentai <armin.zentai@ezit.hu> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: xen-devel@lists.xenproject.org Signed-off-by: David S. Miller <davem@davemloft.net>
| * xen-netback: Fix releasing header slot on error pathZoltan Kiss2014-07-211-5/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch makes this function aware that the first frag and the header might share the same ring slot. That could happen if the first slot is bigger than PKT_PROT_LEN. Due to this the error path might release that slot twice or never, depending on the error scenario. xenvif_idx_release is also removed from xenvif_idx_unmap, and called separately. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Reported-by: Armin Zentai <armin.zentai@ezit.hu> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: xen-devel@lists.xenproject.org Signed-off-by: David S. Miller <davem@davemloft.net>
| * xen-netback: Fix releasing frag_list skbs in error pathZoltan Kiss2014-07-211-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | When the grant operations failed, the skb is freed up eventually, and it tries to release the frags, if there is any. For the main skb nr_frags is set to 0 to avoid this, but on the frag_list it iterates through the frags array, and tries to call put_page on the page pointer which contains garbage at that time. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Reported-by: Armin Zentai <armin.zentai@ezit.hu> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: xen-devel@lists.xenproject.org Signed-off-by: David S. Miller <davem@davemloft.net>
| * xen-netback: Fix handling frag_list on grant op error pathZoltan Kiss2014-07-211-17/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | The error handling for skb's with frag_list was completely wrong, it caused double unmap attempts to happen if the error was on the first skb. Move it to the right place in the loop. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Reported-by: Armin Zentai <armin.zentai@ezit.hu> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: xen-devel@lists.xenproject.org Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: set name_assign_type in alloc_netdev()Tom Gundersen2014-07-161-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Extend alloc_netdev{,_mq{,s}}() to take name_assign_type as argument, and convert all users to pass NET_NAME_UNKNOWN. Coccinelle patch: @@ expression sizeof_priv, name, setup, txqs, rxqs, count; @@ ( -alloc_netdev_mqs(sizeof_priv, name, setup, txqs, rxqs) +alloc_netdev_mqs(sizeof_priv, name, NET_NAME_UNKNOWN, setup, txqs, rxqs) | -alloc_netdev_mq(sizeof_priv, name, setup, count) +alloc_netdev_mq(sizeof_priv, name, NET_NAME_UNKNOWN, setup, count) | -alloc_netdev(sizeof_priv, name, setup) +alloc_netdev(sizeof_priv, name, NET_NAME_UNKNOWN, setup) ) v9: move comments here from the wrong commit Signed-off-by: Tom Gundersen <teg@jklm.no> Reviewed-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | xen-netback: Adding debugfs "io_ring_qX" filesZoltan Kiss2014-07-094-2/+200
|/ | | | | | | | | | | | | | | | This patch adds debugfs capabilities to netback. There used to be a similar patch floating around for classic kernel, but it used procfs. It is based on a very similar blkback patch. It creates xen-netback/[vifname]/io_ring_q[queueno] files, reading them output various ring variables etc. Writing "kick" into it imitates an interrupt happened, it can be useful to check whether the ring is just stalled due to a missed interrupt. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: xen-devel@lists.xenproject.org Signed-off-by: David S. Miller <davem@davemloft.net>
* xen-netback: bookkeep number of active queues in our own moduleWei Liu2014-06-263-52/+26Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The original code uses netdev->real_num_tx_queues to bookkeep number of queues and invokes netif_set_real_num_tx_queues to set the number of queues. However, netif_set_real_num_tx_queues doesn't allow real_num_tx_queues to be smaller than 1, which means setting the number to 0 will not work and real_num_tx_queues is untouched. This is bogus when xenvif_free is invoked before any number of queues is allocated. That function needs to iterate through all queues to free resources. Using the wrong number of queues results in NULL pointer dereference. So we bookkeep the number of queues in xen-netback to solve this problem. This fixes a regression introduced by multiqueue patchset in 3.16-rc1. There's another bug in original code that the real number of RX queues is never set. In current Xen multiqueue design, the number of TX queues and RX queues are in fact the same. We need to set the numbers of TX and RX queues to the same value. Also remove xenvif_select_queue and leave queue selection to core driver, as suggested by David Miller. Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> CC: Ian Campbell <ian.campbell@citrix.com> CC: Paul Durrant <paul.durrant@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: xen-netback: include linux/vmalloc.h againArnd Bergmann2014-06-121-0/+1
| | | | | | | | | | | | | | | | | | | commit e9ce7cb6b107 ("xen-netback: Factor queue-specific data into queue struct") added a use of vzalloc/vfree to interface.c, but removed the #include <linux/vmalloc.h> statement at the same time, which causes this build error: drivers/net/xen-netback/interface.c: In function 'xenvif_free': drivers/net/xen-netback/interface.c:754:2: error: implicit declaration of function 'vfree' [-Werror=implicit-function-declaration] vfree(vif->queues); ^ cc1: some warnings being treated as errors Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Andrew J. Bennieston <andrew.bennieston@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2014-06-061-11/+25
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/xen-netback/netback.c net/core/filter.c A filter bug fix overlapped some cleanups and a conversion over to some new insn generation macros. A xen-netback bug fix overlapped the addition of multi-queue support. Signed-off-by: David S. Miller <davem@davemloft.net>
| * xen-netback: Fix handling of skbs requiring too many slotsZoltan Kiss2014-06-061-11/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A recent commit (a02eb4 "xen-netback: worse-case estimate in xenvif_rx_action is underestimating") capped the slot estimation to MAX_SKB_FRAGS, but that triggers the next BUG_ON a few lines down, as the packet consumes more slots than estimated. This patch introduces full_coalesce on the skb callback buffer, which is used in start_new_rx_buffer() to decide whether netback needs coalescing more aggresively. By doing that, no packet should need more than (XEN_NETIF_MAX_TX_SIZE + 1) / PAGE_SIZE data slots (excluding the optional GSO slot, it doesn't carry data, therefore irrelevant in this case), as the provided buffers are fully utilized. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Cc: Paul Durrant <paul.durrant@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Paul Durrant <paul.durrant@gmail.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | xen-netback: Add support for multiple queuesAndrew J. Bennieston2014-06-044-18/+116
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Builds on the refactoring of the previous patch to implement multiple queues between xen-netfront and xen-netback. Writes the maximum supported number of queues into XenStore, and reads the values written by the frontend to determine how many queues to use. Ring references and event channels are read from XenStore on a per-queue basis and rings are connected accordingly. Also adds code to handle the cleanup of any already initialised queues if the initialisation of a subsequent queue fails. Signed-off-by: Andrew J. Bennieston <andrew.bennieston@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | xen-netback: Factor queue-specific data into queue structWei Liu2014-06-044-590/+819
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In preparation for multi-queue support in xen-netback, move the queue-specific data from struct xenvif into struct xenvif_queue, and update the rest of the code to use this. Also adds loops over queues where appropriate, even though only one is configured at this point, and uses alloc_netdev_mq() and the corresponding multi-queue netif wake/start/stop functions in preparation for multiple active queues. Finally, implements a trivial queue selection function suitable for ndo_select_queue, which simply returns 0 for a single queue and uses skb_get_hash() to compute the queue index otherwise. Signed-off-by: Andrew J. Bennieston <andrew.bennieston@citrix.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | xen-netback: Move grant_copy_op array back into struct xenvif.Andrew J. Bennieston2014-06-042-11/+1Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This array was allocated separately in commit ac3d5ac2 ("xen-netback: fix guest-receive-side array sizes") due to it being very large, and a struct xenvif is allocated as the netdev_priv part of a struct net_device, i.e. via kmalloc() but falling back to vmalloc() if the initial alloc. fails. In preparation for the multi-queue patches, where this array becomes part of struct xenvif_queue and is always allocated through vzalloc(), move this back into the struct xenvif. Signed-off-by: Andrew J. Bennieston <andrew.bennieston@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2014-05-243-48/+86
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/bonding/bond_alb.c drivers/net/ethernet/altera/altera_msgdma.c drivers/net/ethernet/altera/altera_sgdma.c net/ipv6/xfrm6_output.c Several cases of overlapping changes. The xfrm6_output.c has a bug fix which overlaps the renaming of skb->local_df to skb->ignore_df. In the Altera TSE driver cases, the register access cleanups in net-next overlapped with bug fixes done in net. Similarly a bug fix to send ALB packets in the bonding driver using the right source address overlaps with cleanups in net-next. Signed-off-by: David S. Miller <davem@davemloft.net>
| * xen-netback: fix race between napi_complete() and interrupt handlerDavid Vrabel2014-05-163-30/+6Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the NAPI budget was not all used, xenvif_poll() would call napi_complete() /after/ enabling the interrupt. This resulted in a race between the napi_complete() and the napi_schedule() in the interrupt handler. The use of local_irq_save/restore() avoided by race iff the handler is running on the same CPU but not if it was running on a different CPU. Fix this properly by calling napi_complete() before reenabling interrupts (in the xenvif_napi_schedule_or_enable_irq() call). Signed-off-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * xen-netback: Fix grant ref resolution in RX pathZoltan Kiss2014-05-161-18/+80
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The original series for reintroducing grant mapping for netback had a patch [1] to handle receiving of packets from an another VIF. Grant copy on the receiving side needs the grant ref of the page to set up the op. The original patch assumed (wrongly) that the frags array haven't changed. In the case reported by Sander, the sending guest sent a packet where the linear buffer and the first frag were under PKT_PROT_LEN (=128) bytes. xenvif_tx_submit() then pulled up the linear area to 128 bytes, and ditched the first frag. The receiving side had an off-by-one problem when gathered the grant refs. This patch fixes that by checking whether the actual frag's page pointer is the same as the page in the original frag list. It can handle any kind of changes on the original frags array, like: - removing granted frags from the array at any point - adding local pages to the frags list anywhere - reordering the frags It's optimized to the most common case, when there is 1:1 relation between the frags and the list, plus works optimal when frags are removed from the end or the beginning. [1]: 3e2234: xen-netback: Handle foreign mapped pages on the guest RX path Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>