summaryrefslogtreecommitdiffstats
path: root/net
Commit message (Collapse)AuthorAgeFilesLines
...
| | * | | | | svcrdma: Add backward direction service for RPC/RDMA transportChuck Lever2015-11-022-0/+64
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On NFSv4.1 mount points, the Linux NFS client uses this transport endpoint to receive backward direction calls and route replies back to the NFSv4.1 server. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: "J. Bruce Fields" <bfields@fieldses.org> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Tested-By: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| | * | | | | xprtrdma: Handle incoming backward direction RPC callsChuck Lever2015-11-023-0/+161
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce a code path in the rpcrdma_reply_handler() to catch incoming backward direction RPC calls and route them to the ULP's backchannel server. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Tested-By: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| | * | | | | xprtrdma: Add support for sending backward direction RPC repliesChuck Lever2015-11-023-0/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backward direction RPC replies are sent via the client transport's send_request method, the same way forward direction RPC calls are sent. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Tested-By: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| | * | | | | xprtrdma: Pre-allocate Work Requests for backchannelChuck Lever2015-11-023-2/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pre-allocate extra send and receive Work Requests needed to handle backchannel receives and sends. The transport doesn't know how many extra WRs to pre-allocate until the xprt_setup_backchannel() call, but that's long after the WRs are allocated during forechannel setup. So, use a fixed value for now. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Tested-By: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| | * | | | | xprtrdma: Pre-allocate backward rpc_rqst and send/receive buffersChuck Lever2015-11-025-12/+309
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | xprtrdma's backward direction send and receive buffers are the same size as the forechannel's inline threshold, and must be pre- registered. The consumer has no control over which receive buffer the adapter chooses to catch an incoming backwards-direction call. Any receive buffer can be used for either a forward reply or a backward call. Thus both types of RPC message must all be the same size. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Tested-By: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| | * | | | | SUNRPC: Abstract backchannel operationsChuck Lever2015-11-022-2/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | xprt_{setup,destroy}_backchannel() won't be adequate for RPC/RMDA bi-direction. In particular, receive buffers have to be pre- registered and posted in order to receive incoming backchannel requests. Add a virtual function call to allow the insertion of appropriate backchannel setup and destruction methods for each transport. In addition, freeing a backchannel request is a little different for RPC/RDMA. Introduce an rpc_xprt_op to handle the difference. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Tested-By: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| | * | | | | xprtrdma: Saving IRQs no longer needed for rb_lockChuck Lever2015-11-021-14/+10Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that RPC replies are processed in a workqueue, there's no need to disable IRQs when managing send and receive buffers. This saves noticeable overhead per RPC. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Tested-By: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| | * | | | | xprtrdma: Remove reply taskletChuck Lever2015-11-021-43/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Clean up: The reply tasklet is no longer used. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Tested-By: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| | * | | | | xprtrdma: Use workqueue to process RPC/RDMA repliesChuck Lever2015-11-024-18/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The reply tasklet is fast, but it's single threaded. After reply traffic saturates a single CPU, there's no more reply processing capacity. Replace the tasklet with a workqueue to spread reply handling across all CPUs. This also moves RPC/RDMA reply handling out of the soft IRQ context and into a context that allows sleeps. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Tested-By: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| | * | | | | xprtrdma: Replace send and receive arraysChuck Lever2015-11-022-91/+73Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The rb_send_bufs and rb_recv_bufs arrays are used to implement a pair of stacks for keeping track of free rpcrdma_req and rpcrdma_rep structs. Replace those arrays with free lists. To allow more than 512 RPCs in-flight at once, each of these arrays would be larger than a page (assuming 8-byte addresses and 4KB pages). Allowing up to 64K in-flight RPCs (as TCP now does), each buffer array would have to be 128 pages. That's an order-6 allocation. (Not that we're going there.) A list is easier to expand dynamically. Instead of allocating a larger array of pointers and copying the existing pointers to the new array, simply append more buffers to each list. This also makes it simpler to manage receive buffers that might catch backwards-direction calls, or to post receive buffers in bulk to amortize the overhead of ib_post_recv. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Devesh Sharma <devesh.sharma@avagotech.com> Tested-By: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| | * | | | | xprtrdma: Refactor reply handler error handlingChuck Lever2015-11-023-40/+53
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Clean up: The error cases in rpcrdma_reply_handler() almost never execute. Ensure the compiler places them out of the hot path. No behavior change expected. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Devesh Sharma <devesh.sharma@avagotech.com> Tested-By: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| | * | | | | xprtrdma: Prevent loss of completion signalsChuck Lever2015-11-022-41/+38Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 8301a2c047cc ("xprtrdma: Limit work done by completion handler") was supposed to prevent xprtrdma's upcall handlers from starving other softIRQ work by letting them return to the provider before all CQEs have been polled. The logic assumes the provider will call the upcall handler again immediately if the CQ is re-armed while there are still queued CQEs. This assumption is invalid. The IBTA spec says that after a CQ is armed, the hardware must interrupt only when a new CQE is inserted. xprtrdma can't rely on the provider calling again, even though some providers do. Therefore, leaving CQEs on queue makes sense only when there is another mechanism that ensures all remaining CQEs are consumed in a timely fashion. xprtrdma does not have such a mechanism. If a CQE remains queued, the transport can wait forever to send the next RPC. Finally, move the wcs array back onto the stack to ensure that the poll array is always local to the CPU where the completion upcall is running. Fixes: 8301a2c047cc ("xprtrdma: Limit work done by completion ...") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Devesh Sharma <devesh.sharma@avagotech.com> Tested-By: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| | * | | | | xprtrdma: Re-arm after missed eventsChuck Lever2015-11-021-56/+10Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ib_req_notify_cq(IB_CQ_REPORT_MISSED_EVENTS) returns a positive value if WCs were added to a CQ after the last completion upcall but before the CQ has been re-armed. Commit 7f23f6f6e388 ("xprtrmda: Reduce lock contention in completion handlers") assumed that when ib_req_notify_cq() returned a positive RC, the CQ had also been successfully re-armed, making it safe to return control to the provider without losing any completion signals. That is an invalid assumption. Change both completion handlers to continue polling while ib_req_notify_cq() returns a positive value. Fixes: 7f23f6f6e388 ("xprtrmda: Reduce lock contention in ...") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Devesh Sharma <devesh.sharma@avagotech.com> Tested-By: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| | * | | | | xprtrdma: Enable swap-on-NFS/RDMAChuck Lever2015-11-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After adding a swapfile on an NFS/RDMA mount and removing the normal swap partition, I was able to push the NFS client well into swap without any issue. I forgot to swapoff the NFS file before rebooting. This pinned the NFS mount and the IB core and provider, causing shutdown to hang. I think this is expected and safe behavior. Probably shutdown scripts should "swapoff -a" before unmounting any filesystems. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Tested-By: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| | * | | | | xprtrdma: don't log warnings for flushed completionsSteve Wise2015-11-021-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Unsignaled send WRs can get flushed as part of normal unmount, so don't log them as warnings. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| * | | | | | SUNRPC: Use MSG_SENDPAGE_NOTLAST in xs_send_pagedata()Trond Myklebust2015-10-081-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we're sending more than one page via kernel_sendpage(), then set MSG_SENDPAGE_NOTLAST between the pages so that we don't send suboptimal frames (see commit 2f5338442425 and commit 35f9c09fe9c7). Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | SUNRPC: Move AF_LOCAL receive data path into a workqueue contextTrond Myklebust2015-10-081-30/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that we've done it for TCP and UDP, let's convert AF_LOCAL as well. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | SUNRPC: Move UDP receive data path into a workqueue contextTrond Myklebust2015-10-081-23/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that we've done it for TCP, let's convert UDP as well. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | SUNRPC: Move TCP receive data path into a workqueue contextTrond Myklebust2015-10-081-15/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Stream protocols such as TCP can often build up a backlog of data to be read due to ordering. Combine this with the fact that some workloads such as NFS read()-intensive workloads need to receive a lot of data per RPC call, and it turns out that receiving the data from inside a softirq context can cause starvation. The following patch moves the TCP data receive into a workqueue context. We still end up calling tcp_read_sock(), but we do so from a process context, meaning that softirqs are enabled for most of the time. With this patch, I see a doubling of read bandwidth when running a multi-threaded iozone workload between a virtual client and server setup. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | SUNRPC: Refactor TCP receiveTrond Myklebust2015-10-081-15/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the TCP data receive loop out of xs_tcp_data_ready(). Doing so will allow us to move the data receive out of the softirq context in a set of followup patches. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* | | | | | | Merge branch 'akpm' (patches from Andrew)Linus Torvalds2015-11-076-11/+13
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge second patch-bomb from Andrew Morton: - most of the rest of MM - procfs - lib/ updates - printk updates - bitops infrastructure tweaks - checkpatch updates - nilfs2 update - signals - various other misc bits: coredump, seqfile, kexec, pidns, zlib, ipc, dma-debug, dma-mapping, ... * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (102 commits) ipc,msg: drop dst nil validation in copy_msg include/linux/zutil.h: fix usage example of zlib_adler32() panic: release stale console lock to always get the logbuf printed out dma-debug: check nents in dma_sync_sg* dma-mapping: tidy up dma_parms default handling pidns: fix set/getpriority and ioprio_set/get in PRIO_USER mode kexec: use file name as the output message prefix fs, seqfile: always allow oom killer seq_file: reuse string_escape_str() fs/seq_file: use seq_* helpers in seq_hex_dump() coredump: change zap_threads() and zap_process() to use for_each_thread() coredump: ensure all coredumping tasks have SIGNAL_GROUP_COREDUMP signal: remove jffs2_garbage_collect_thread()->allow_signal(SIGCONT) signal: introduce kernel_signal_stop() to fix jffs2_garbage_collect_thread() signal: turn dequeue_signal_lock() into kernel_dequeue_signal() signals: kill block_all_signals() and unblock_all_signals() nilfs2: fix gcc uninitialized-variable warnings in powerpc build nilfs2: fix gcc unused-but-set-variable warnings MAINTAINERS: nilfs2: add header file for tracing nilfs2: add tracepoints for analyzing reading and writing metadata files ...
| * | | | | | | mm, page_alloc: distinguish between being unable to sleep, unwilling to ↵Mel Gorman2015-11-076-11/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sleep and avoiding waking kswapd __GFP_WAIT has been used to identify atomic context in callers that hold spinlocks or are in interrupts. They are expected to be high priority and have access one of two watermarks lower than "min" which can be referred to as the "atomic reserve". __GFP_HIGH users get access to the first lower watermark and can be called the "high priority reserve". Over time, callers had a requirement to not block when fallback options were available. Some have abused __GFP_WAIT leading to a situation where an optimisitic allocation with a fallback option can access atomic reserves. This patch uses __GFP_ATOMIC to identify callers that are truely atomic, cannot sleep and have no alternative. High priority users continue to use __GFP_HIGH. __GFP_DIRECT_RECLAIM identifies callers that can sleep and are willing to enter direct reclaim. __GFP_KSWAPD_RECLAIM to identify callers that want to wake kswapd for background reclaim. __GFP_WAIT is redefined as a caller that is willing to enter direct reclaim and wake kswapd for background reclaim. This patch then converts a number of sites o __GFP_ATOMIC is used by callers that are high priority and have memory pools for those requests. GFP_ATOMIC uses this flag. o Callers that have a limited mempool to guarantee forward progress clear __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall into this category where kswapd will still be woken but atomic reserves are not used as there is a one-entry mempool to guarantee progress. o Callers that are checking if they are non-blocking should use the helper gfpflags_allow_blocking() where possible. This is because checking for __GFP_WAIT as was done historically now can trigger false positives. Some exceptions like dm-crypt.c exist where the code intent is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to flag manipulations. o Callers that built their own GFP flags instead of starting with GFP_KERNEL and friends now also need to specify __GFP_KSWAPD_RECLAIM. The first key hazard to watch out for is callers that removed __GFP_WAIT and was depending on access to atomic reserves for inconspicuous reasons. In some cases it may be appropriate for them to use __GFP_HIGH. The second key hazard is callers that assembled their own combination of GFP flags instead of starting with something like GFP_KERNEL. They may now wish to specify __GFP_KSWAPD_RECLAIM. It's almost certainly harmless if it's missed in most cases as other activity will wake kswapd. Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Vitaly Wool <vitalywool@gmail.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | | Merge tag 'for-linus' of ↵Linus Torvalds2015-11-0717-350/+339Star
|\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma updates from Doug Ledford: "This is my initial round of 4.4 merge window patches. There are a few other things I wish to get in for 4.4 that aren't in this pull, as this represents what has gone through merge/build/run testing and not what is the last few items for which testing is not yet complete. - "Checksum offload support in user space" enablement - Misc cxgb4 fixes, add T6 support - Misc usnic fixes - 32 bit build warning fixes - Misc ocrdma fixes - Multicast loopback prevention extension - Extend the GID cache to store and return attributes of GIDs - Misc iSER updates - iSER clustering update - Network NameSpace support for rdma CM - Work Request cleanup series - New Memory Registration API" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (76 commits) IB/core, cma: Make __attribute_const__ declarations sparse-friendly IB/core: Remove old fast registration API IB/ipath: Remove fast registration from the code IB/hfi1: Remove fast registration from the code RDMA/nes: Remove old FRWR API IB/qib: Remove old FRWR API iw_cxgb4: Remove old FRWR API RDMA/cxgb3: Remove old FRWR API RDMA/ocrdma: Remove old FRWR API IB/mlx4: Remove old FRWR API support IB/mlx5: Remove old FRWR API support IB/srp: Dont allocate a page vector when using fast_reg IB/srp: Remove srp_finish_mapping IB/srp: Convert to new registration API IB/srp: Split srp_map_sg RDS/IW: Convert to new memory registration API svcrdma: Port to new memory registration API xprtrdma: Port to new memory registration API iser-target: Port to new memory registration API IB/iser: Port to new fast registration API ...
| * | | | | | | | RDS/IW: Convert to new memory registration APISagi Grimberg2015-10-293-115/+75Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Get rid of fast_reg page list and its construction. Instead, just pass the RDS sg list to ib_map_mr_sg and post the new ib_reg_wr. This is done both for server IW RDMA_READ registration and the client remote key registration. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Acked-by: Christoph Hellwig <hch@lst.de> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | | | | | | | svcrdma: Port to new memory registration APISagi Grimberg2015-10-292-57/+53Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of maintaining a fastreg page list, keep an sg table and convert an array of pages to a sg list. Then call ib_map_mr_sg and construct ib_reg_wr. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Acked-by: Christoph Hellwig <hch@lst.de> Tested-by: Steve Wise <swise@opengridcomputing.com> Tested-by: Selvin Xavier <selvin.xavier@avagotech.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | | | | | | | xprtrdma: Port to new memory registration APISagi Grimberg2015-10-292-52/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of maintaining a fastreg page list, keep an sg table and convert an array of pages to a sg list. Then call ib_map_mr_sg and construct ib_reg_wr. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Acked-by: Christoph Hellwig <hch@lst.de> Tested-by: Steve Wise <swise@opengridcomputing.com> Tested-by: Selvin Xavier <selvin.xavier@avagotech.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | | | | | | | Merge branch 'wr-cleanup' into k.o/for-4.4Doug Ledford2015-10-298-160/+176
| |\ \ \ \ \ \ \ \
| | * \ \ \ \ \ \ \ Merge branch 'wr-cleanup' of git://git.infradead.org/users/hch/rdma into ↵Doug Ledford2015-10-2912-200/+181Star
| | |\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | wr-cleanup Signed-off-by: Doug Ledford <dledford@redhat.com> Conflicts: drivers/infiniband/ulp/isert/ib_isert.c - Commit 4366b19ca5eb (iser-target: Change the recv buffers posting logic) changed the logic in isert_put_datain() and had to be hand merged
| | | * | | | | | | | IB: split struct ib_send_wrChristoph Hellwig2015-10-088-160/+176
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch split up struct ib_send_wr so that all non-trivial verbs use their own structure which embedds struct ib_send_wr. This dramaticly shrinks the size of a WR for most common operations: sizeof(struct ib_send_wr) (old): 96 sizeof(struct ib_send_wr): 48 sizeof(struct ib_rdma_wr): 64 sizeof(struct ib_atomic_wr): 96 sizeof(struct ib_ud_wr): 88 sizeof(struct ib_fast_reg_wr): 88 sizeof(struct ib_bind_mw_wr): 96 sizeof(struct ib_sig_handover_wr): 80 And with Sagi's pending MR rework the fast registration WR will also be down to a reasonable size: sizeof(struct ib_fastreg_wr): 64 Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> [srp, srpt] Reviewed-by: Chuck Lever <chuck.lever@oracle.com> [sunrpc] Tested-by: Haggai Eran <haggaie@mellanox.com> Tested-by: Sagi Grimberg <sagig@mellanox.com> Tested-by: Steve Wise <swise@opengridcomputing.com>
| * | | | | | | | | | IB/cma: Add support for network namespacesGuy Shapiro2015-10-288-11/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for network namespaces in the ib_cma module. This is accomplished by: 1. Adding network namespace parameter for rdma_create_id. This parameter is used to populate the network namespace field in rdma_id_private. rdma_create_id keeps a reference on the network namespace. 2. Using the network namespace from the rdma_id instead of init_net inside of ib_cma, when listening on an ID and when looking for an ID for an incoming request. 3. Decrementing the reference count for the appropriate network namespace when calling rdma_destroy_id. In order to preserve the current behavior init_net is passed when calling from other modules. Signed-off-by: Guy Shapiro <guysh@mellanox.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Yotam Kenneth <yotamke@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | | | | | | | | | Merge branch 'k.o/for-4.3-v1' into k.o/for-4.4Doug Ledford2015-10-215-40/+5Star
| |\ \ \ \ \ \ \ \ \ \ | | |/ / / / / / / / / | |/| / / / / / / / / | | |/ / / / / / / / | | | | | | | | | | Pick up the late fixes from the 4.3 cycle so we have them in our next branch.
* | | | | | | | | | Merge branch 'for-linus' of ↵Linus Torvalds2015-11-073-5/+5
|\ \ \ \ \ \ \ \ \ \ | |_|_|/ / / / / / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial updates from Jiri Kosina: "Trivial stuff from trivial tree that can be trivially summed up as: - treewide drop of spurious unlikely() before IS_ERR() from Viresh Kumar - cosmetic fixes (that don't really affect basic functionality of the driver) for pktcdvd and bcache, from Julia Lawall and Petr Mladek - various comment / printk fixes and updates all over the place" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: bcache: Really show state of work pending bit hwmon: applesmc: fix comment typos Kconfig: remove comment about scsi_wait_scan module class_find_device: fix reference to argument "match" debugfs: document that debugfs_remove*() accepts NULL and error values net: Drop unlikely before IS_ERR(_OR_NULL) mm: Drop unlikely before IS_ERR(_OR_NULL) fs: Drop unlikely before IS_ERR(_OR_NULL) drivers: net: Drop unlikely before IS_ERR(_OR_NULL) drivers: misc: Drop unlikely before IS_ERR(_OR_NULL) UBI: Update comments to reflect UBI_METAONLY flag pktcdvd: drop null test before destroy functions
| * | | | | | | | | net: Drop unlikely before IS_ERR(_OR_NULL)Viresh Kumar2015-09-293-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | IS_ERR(_OR_NULL) already contain an 'unlikely' compiler flag and there is no need to do that again from its callers. Drop it. Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
* | | | | | | | | | Merge branch 'next' of ↵Linus Torvalds2015-11-0610-45/+54
|\ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem update from James Morris: "This is mostly maintenance updates across the subsystem, with a notable update for TPM 2.0, and addition of Jarkko Sakkinen as a maintainer of that" * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (40 commits) apparmor: clarify CRYPTO dependency selinux: Use a kmem_cache for allocation struct file_security_struct selinux: ioctl_has_perm should be static selinux: use sprintf return value selinux: use kstrdup() in security_get_bools() selinux: use kmemdup in security_sid_to_context_core() selinux: remove pointless cast in selinux_inode_setsecurity() selinux: introduce security_context_str_to_sid selinux: do not check open perm on ftruncate call selinux: change CONFIG_SECURITY_SELINUX_CHECKREQPROT_VALUE default KEYS: Merge the type-specific data with the payload data KEYS: Provide a script to extract a module signature KEYS: Provide a script to extract the sys cert list from a vmlinux file keys: Be more consistent in selection of union members used certs: add .gitignore to stop git nagging about x509_certificate_list KEYS: use kvfree() in add_key Smack: limited capability for changing process label TPM: remove unnecessary little endian conversion vTPM: support little endian guests char: Drop owner assignment from i2c_driver ...
| * | | | | | | | | | KEYS: Merge the type-specific data with the payload dataDavid Howells2015-10-2110-45/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge the type-specific data with the payload data into one four-word chunk as it seems pointless to keep them separate. Use user_key_payload() for accessing the payloads of overloaded user-defined keys. Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-cifs@vger.kernel.org cc: ecryptfs@vger.kernel.org cc: linux-ext4@vger.kernel.org cc: linux-f2fs-devel@lists.sourceforge.net cc: linux-nfs@vger.kernel.org cc: ceph-devel@vger.kernel.org cc: linux-ima-devel@lists.sourceforge.net
* | | | | | | | | | | Merge tag 'tty-4.4-rc1' of ↵Linus Torvalds2015-11-051-30/+1Star
|\ \ \ \ \ \ \ \ \ \ \ | |_|_|_|_|_|_|/ / / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial driver updates from Greg KH: "Here is the big tty and serial driver update for 4.4-rc1. Lots of serial driver updates and a few small tty core changes. Full details in the shortlog. All of these have been in linux-next for a while" * tag 'tty-4.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (148 commits) tty: Use unbound workqueue for all input workers tty: Abstract tty buffer work tty: Prevent tty teardown during tty_write_message() tty: core: Use correct spinlock flavor in tiocspgrp() tty: Combine SIGTTOU/SIGTTIN handling serial: amba-pl011: fix incorrect integer size in pl011_fifo_to_tty() ttyFDC: Fix build problems due to use of module_{init,exit} tty: remove unneeded return statement serial: 8250_mid: add support for DMA engine handling from UART MMIO dmaengine: hsu: remove platform data dmaengine: hsu: introduce stubs for the exported functions dmaengine: hsu: make the UART driver in control of selecting this driver serial: fix mctrl helper functions serial: 8250_pci: Intel MID UART support to its own driver serial: fsl_lpuart: add earlycon support tty: disable unbind for old 74xx based serial/mpsc console port serial: pl011: Spelling s/clocks-names/clock-names/ n_tty: Remove reader wakeups for TTY_BREAK/TTY_PARITY chars tty: synclink, fix indentation serial: at91, fix rs485 properties ...
| * | | | | | | | | | tty: Remove ASYNC_CLOSING checks in open()/hangup() methodsPeter Hurley2015-10-181-30/+1Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since at least before 2.6.30, tty drivers that do not drop the tty lock while closing cannot observe ASYNC_CLOSING set while holding the tty lock; this includes the tty driver's open() and hangup() methods, since the tty core calls these methods holding the tty lock. For these drivers, waiting for ASYNC_CLOSING to clear while opening is not required, since this condition cannot occur. Similarly, even when the open() method drops and reacquires the tty lock after blocking, ASYNC_CLOSING cannot be set (again, for drivers that do not drop the tty lock while closing). Now that tty port drivers no longer drop the tty lock while closing (since 'tty: Remove tty_wait_until_sent_from_close()'), the same conditions apply: waiting for ASYNC_CLOSING to clear while opening is not required, nor is re-checking ASYNC_CLOSING after dropping and reacquiring the tty lock while blocking (eg., in *_block_til_ready()). Note: The ASYNC_CLOSING flag state is still maintained since several bitrotting drivers use it for (dubious) other purposes. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | | | | | | | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2015-11-037-37/+46
|\ \ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Minor overlapping changes in net/ipv4/ipmr.c, in 'net' we were fixing the "BH-ness" of the counter bumps whilst in 'net-next' the functions were modified to take an explicit 'net' parameter. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | | | ipv6: fix tunnel error handlingMichal Kubeček2015-11-031-1/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Both tunnel6_protocol and tunnel46_protocol share the same error handler, tunnel6_err(), which traverses through tunnel6_handlers list. For ipip6 tunnels, we need to traverse tunnel46_handlers as we do e.g. in tunnel46_rcv(). Current code can generate an ICMPv6 error message with an IPv4 packet embedded in it. Fixes: 73d605d1abbd ("[IPSEC]: changing API of xfrm6_tunnel_register") Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | | | sit: fix sit0 percpu double allocationsEric Dumazet2015-11-031-22/+4Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sit0 device allocates its percpu storage twice : - One time in ipip6_tunnel_init() - One time in ipip6_fb_tunnel_init() Thus we leak 48 bytes per possible cpu per network namespace dismantle. ipip6_fb_tunnel_init() can be much simpler and does not return an error, and should be called after register_netdev() Note that ipip6_tunnel_clone_6rd() also needs to be called after register_netdev() (calling ipip6_tunnel_init()) Fixes: ebe084aafb7e ("sit: Use ipip6_tunnel_init as the ndo_init function.") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | | | ipv6: fix crash on ICMPv6 redirects with prohibited/blackholed sourceMatthias Schiffer2015-11-021-2/+1Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are other error values besides ip6_null_entry that can be returned by ip6_route_redirect(): fib6_rule_action() can also result in ip6_blk_hole_entry and ip6_prohibit_entry if such ip rules are installed. Only checking for ip6_null_entry in rt6_do_redirect() causes ip6_ins_rt() to be called with rt->rt6i_table == NULL in these cases, making the kernel crash. Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | | | ipmr: fix possible race resulting from improper usage of IP_INC_STATS_BH() ↵Ani Sinha2015-11-021-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | in preemptible context. Fixes the following kernel BUG : BUG: using __this_cpu_add() in preemptible [00000000] code: bash/2758 caller is __this_cpu_preempt_check+0x13/0x15 CPU: 0 PID: 2758 Comm: bash Tainted: P O 3.18.19 #2 ffffffff8170eaca ffff880110d1b788 ffffffff81482b2a 0000000000000000 0000000000000000 ffff880110d1b7b8 ffffffff812010ae ffff880007cab800 ffff88001a060800 ffff88013a899108 ffff880108b84240 ffff880110d1b7c8 Call Trace: [<ffffffff81482b2a>] dump_stack+0x52/0x80 [<ffffffff812010ae>] check_preemption_disabled+0xce/0xe1 [<ffffffff812010d4>] __this_cpu_preempt_check+0x13/0x15 [<ffffffff81419d60>] ipmr_queue_xmit+0x647/0x70c [<ffffffff8141a154>] ip_mr_forward+0x32f/0x34e [<ffffffff8141af76>] ip_mroute_setsockopt+0xe03/0x108c [<ffffffff810553fc>] ? get_parent_ip+0x11/0x42 [<ffffffff810e6974>] ? pollwake+0x4d/0x51 [<ffffffff81058ac0>] ? default_wake_function+0x0/0xf [<ffffffff810553fc>] ? get_parent_ip+0x11/0x42 [<ffffffff810613d9>] ? __wake_up_common+0x45/0x77 [<ffffffff81486ea9>] ? _raw_spin_unlock_irqrestore+0x1d/0x32 [<ffffffff810618bc>] ? __wake_up_sync_key+0x4a/0x53 [<ffffffff8139a519>] ? sock_def_readable+0x71/0x75 [<ffffffff813dd226>] do_ip_setsockopt+0x9d/0xb55 [<ffffffff81429818>] ? unix_seqpacket_sendmsg+0x3f/0x41 [<ffffffff813963fe>] ? sock_sendmsg+0x6d/0x86 [<ffffffff813959d4>] ? sockfd_lookup_light+0x12/0x5d [<ffffffff8139650a>] ? SyS_sendto+0xf3/0x11b [<ffffffff810d5738>] ? new_sync_read+0x82/0xaa [<ffffffff813ddd19>] compat_ip_setsockopt+0x3b/0x99 [<ffffffff813fb24a>] compat_raw_setsockopt+0x11/0x32 [<ffffffff81399052>] compat_sock_common_setsockopt+0x18/0x1f [<ffffffff813c4d05>] compat_SyS_setsockopt+0x1a9/0x1cf [<ffffffff813c4149>] compat_SyS_socketcall+0x180/0x1e3 [<ffffffff81488ea1>] cstar_dispatch+0x7/0x1e Signed-off-by: Ani Sinha <ani@arista.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | | | ipv4: update RTNH_F_LINKDOWN flag on UP eventJulian Anastasov2015-11-011-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When nexthop is part of multipath route we should clear the LINKDOWN flag when link goes UP or when first address is added. This is needed because we always set LINKDOWN flag when DEAD flag was set but now on UP the nexthop is not dead anymore. Examples when LINKDOWN bit can be forgotten when no NETDEV_CHANGE is delivered: - link goes down (LINKDOWN is set), then link goes UP and device shows carrier OK but LINKDOWN remains set - last address is deleted (LINKDOWN is set), then address is added and device shows carrier OK but LINKDOWN remains set Steps to reproduce: modprobe dummy ifconfig dummy0 192.168.168.1 up here add a multipath route where one nexthop is for dummy0: ip route add 1.2.3.4 nexthop dummy0 nexthop SOME_OTHER_DEVICE ifconfig dummy0 down ifconfig dummy0 up now ip route shows nexthop that is not dead. Now set the sysctl var: echo 1 > /proc/sys/net/ipv4/conf/dummy0/ignore_routes_with_linkdown now ip route will show a dead nexthop because the forgotten RTNH_F_LINKDOWN is propagated as RTNH_F_DEAD. Fixes: 8a3d03166f19 ("net: track link-status of ipv4 nexthops") Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | | | ipv4: fix to not remove local route on link downJulian Anastasov2015-11-012-9/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When fib_netdev_event calls fib_disable_ip on NETDEV_DOWN event we should not delete the local routes if the local address is still present. The confusion comes from the fact that both fib_netdev_event and fib_inetaddr_event use the NETDEV_DOWN constant. Fix it by returning back the variable 'force'. Steps to reproduce: modprobe dummy ifconfig dummy0 192.168.168.1 up ifconfig dummy0 down ip route list table local | grep dummy | grep host local 192.168.168.1 dev dummy0 proto kernel scope host src 192.168.168.1 Fixes: 8a3d03166f19 ("net: track link-status of ipv4 nexthops") Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | | | tipc: linearize arriving NAME_DISTR and LINK_PROTO buffersJon Paul Maloy2015-11-011-0/+5
| | |_|_|_|_|_|_|_|_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Testing of the new UDP bearer has revealed that reception of NAME_DISTRIBUTOR, LINK_PROTOCOL/RESET and LINK_PROTOCOL/ACTIVATE message buffers is not prepared for the case that those may be non-linear. We now linearize all such buffers before they are delivered up to the generic reception layer. In order for the commit to apply cleanly to 'net' and 'stable', we do the change in the function tipc_udp_recv() for now. Later, we will post a commit to 'net-next' moving the linearization to generic code, in tipc_named_rcv() and tipc_link_proto_rcv(). Fixes: commit d0f91938bede ("tipc: add ip/udp media type") Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | | | | | | switchdev: respect SKIP_EOPNOTSUPP flag in case there is no recursionJiri Pirko2015-11-031-5/+4Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Caller passing down the SKIP_EOPNOTSUPP switchdev flag expects that -EOPNOTSUPP cannot be returned. But in case of direct op call without recurtion, this may happen. So fix this by checking it always on the end of __switchdev_port_attr_set function. Fixes: 464314ea6c11 ("switchdev: skip over ports returning -EOPNOTSUPP when recursing ports") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | | | | | | net: sched: kill dead code in sch_choke.cPhil Sutter2015-11-031-59/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It looks like this has never been used at all. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | | | | | | irda: Delete an unnecessary check before the function call ↵Markus Elfring2015-11-031-2/+1Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "irlmp_unregister_service" The irlmp_unregister_service() function tests whether its argument is NULL and then returns immediately. Thus the test around the call is not needed. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | | | | | | Merge tag 'mac80211-for-davem-2015-11-03' of ↵David S. Miller2015-11-0317-210/+264
|\ \ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== Another set of fixes: * remove a warning on a check that can trigger without any errors having happened (Andrei) * correctly handle deauth request while in the process of associating (Andrei) * fix TDLS HT operation (Arik) * allow changing AID/listen interval during client setup (Ayala) * be more forgiving with WMM parameters to get HT/VHT in case of broken APs with bad WMM settings (Emmanuel, myself) * a number of other fixes (some in documentation) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | | | cfg80211: allow AID/listen interval changes for unassociated stationAyala Beker2015-11-031-9/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, cfg80211 rejects updates of AID and listen interval parameters for existing entries. This information is known only at association stage and as a result it's impossible to update entries that were added unassociated. Fix this by allowing updates of these properies for stations that the driver (or mac80211) assigned unassociated state. This then fixes mac80211's use of NL80211_FEATURE_FULL_AP_CLIENT_STATE. Signed-off-by: Ayala Beker <ayala.beker@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>