summaryrefslogtreecommitdiffstats
path: root/drivers/vhost
Commit message (Collapse)AuthorAgeFilesLines
* vhost/scsi: fix reuse of &vq->iov[out] in responseBenjamin Coddington2016-08-231-3/+3
| | | | | | | | | | | | | The address of the iovec &vq->iov[out] is not guaranteed to contain the scsi command's response iovec throughout the lifetime of the command. Rather, it is more likely to contain an iovec from an immediately following command after looping back around to vhost_get_vq_desc(). Pass along the iovec entirely instead. Fixes: 79c14141a487 ("vhost/scsi: Convert completion path to use copy_to_iter") Cc: stable@vger.kernel.org Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* vhost/test: fix after swiotlb changesMichael S. Tsirkin2016-08-151-4/+4
| | | | Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* vhost/vsock: fix vhost virtio_vsock_pkt use-after-freeStefan Hajnoczi2016-08-091-1/+5
| | | | | | | | | | | | | Stash the packet length in a local variable before handing over ownership of the packet to virtio_transport_recv_pkt() or virtio_transport_free_pkt(). This patch solves the use-after-free since pkt is no longer guaranteed to be alive. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds2016-08-067-192/+1612
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull virtio/vhost updates from Michael Tsirkin: - new vsock device support in host and guest - platform IOMMU support in host and guest, including compatibility quirks for legacy systems. - misc fixes and cleanups. * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: VSOCK: Use kvfree() vhost: split out vringh Kconfig vhost: detect 32 bit integer wrap around vhost: new device IOTLB API vhost: drop vringh dependency vhost: convert pre sorted vhost memory array to interval tree vhost: introduce vhost memory accessors VSOCK: Add Makefile and Kconfig VSOCK: Introduce vhost_vsock.ko VSOCK: Introduce virtio_transport.ko VSOCK: Introduce virtio_vsock_common.ko VSOCK: defer sock removal to transports VSOCK: transport-specific vsock_transport functions vhost: drop vringh dependency vop: pull in vhost Kconfig virtio: new feature to detect IOMMU device quirk balloon: check the number of available pages in leak balloon vhost: lockless enqueuing vhost: simplify work flushing
| * VSOCK: Use kvfree()Wei Yongjun2016-08-021-4/+1Star
| | | | | | | | | | | | | | Use kvfree() instead of open-coding it. Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vhost: split out vringh KconfigMichael S. Tsirkin2016-08-022-6/+5Star
| | | | | | | | | | | | | | | | | | | | | | vringh is pulled in by caif and mic, but the other vhost config does not need to be there. In particular, it makes no sense to have vhost net/scsi/sock under caif/mic. Create a separate Kconfig file and put vringh bits there. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vhost: detect 32 bit integer wrap aroundMichael S. Tsirkin2016-08-021-2/+14
| | | | | | | | | | | | Detect and fail early if long wrap around is triggered. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vhost: new device IOTLB APIJason Wang2016-08-023-50/+677
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch tries to implement an device IOTLB for vhost. This could be used with userspace(qemu) implementation of DMA remapping to emulate an IOMMU for the guest. The idea is simple, cache the translation in a software device IOTLB (which is implemented as an interval tree) in vhost and use vhost_net file descriptor for reporting IOTLB miss and IOTLB update/invalidation. When vhost meets an IOTLB miss, the fault address, size and access can be read from the file. After userspace finishes the translation, it writes the translated address to the vhost_net file to update the device IOTLB. When device IOTLB is enabled by setting VIRTIO_F_IOMMU_PLATFORM all vq addresses set by ioctl are treated as iova instead of virtual address and the accessing can only be done through IOTLB instead of direct userspace memory access. Before each round or vq processing, all vq metadata is prefetched in device IOTLB to make sure no translation fault happens during vq processing. In most cases, virtqueues are contiguous even in virtual address space. The IOTLB translation for virtqueue itself may make it a little slower. We might add fast path cache on top of this patch. Signed-off-by: Jason Wang <jasowang@redhat.com> [mst: use virtio feature bit: VHOST_F_DEVICE_IOTLB -> VIRTIO_F_IOMMU_PLATFORM ] [mst: fix build warnings ] Signed-off-by: Michael S. Tsirkin <mst@redhat.com> [ weiyj.lk: missing unlock on error ] Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
| * vhost: convert pre sorted vhost memory array to interval treeJason Wang2016-08-023-89/+128
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current pre-sorted memory region array has some limitations for future device IOTLB conversion: 1) need extra work for adding and removing a single region, and it's expected to be slow because of sorting or memory re-allocation. 2) need extra work of removing a large range which may intersect several regions with different size. 3) need trick for a replacement policy like LRU To overcome the above shortcomings, this patch convert it to interval tree which can easily address the above issue with almost no extra work. The patch could be used for: - Extend the current API and only let the userspace to send diffs of memory table. - Simplify Device IOTLB implementation. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vhost: introduce vhost memory accessorsJason Wang2016-08-021-15/+35
| | | | | | | | | | | | | | | | | | | | This patch introduces vhost memory accessors which were just wrappers for userspace address access helpers. This is a requirement for vhost device iotlb implementation which will add iotlb translations in those accessors. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * VSOCK: Add Makefile and KconfigAsias He2016-08-022-0/+18
| | | | | | | | | | | | | | | | Enable virtio-vsock and vhost-vsock. Signed-off-by: Asias He <asias@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * VSOCK: Introduce vhost_vsock.koAsias He2016-08-021-0/+722
| | | | | | | | | | | | | | | | | | VM sockets vhost transport implementation. This driver runs on the host. Signed-off-by: Asias He <asias@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vhost: drop vringh dependencyMichael S. Tsirkin2016-08-021-2/+0Star
| | | | | | | | | | | | | | vringh isn't used by vhost net or scsi - it's used by CAIF only at the moment. Drop the dependency. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vhost: lockless enqueuingJason Wang2016-08-012-30/+29Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We use spinlock to synchronize the work list now which may cause unnecessary contentions. So this patch switch to use llist to remove this contention. Pktgen tests shows about 5% improvement: Before: ~1300000 pps After: ~1370000 pps Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vhost: simplify work flushingJason Wang2016-08-011-32/+21Star
| | | | | | | | | | | | | | | | | | | | | | We used to implement the work flushing through tracking queued seq, done seq, and the number of flushing. This patch simplify this by just implement work flushing through another kind of vhost work with completion. This will be used by lockless enqueuing patch. Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* | tun: switch to use skb array for txJason Wang2016-07-011-1/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We used to queue tx packets in sk_receive_queue, this is less efficient since it requires spinlocks to synchronize between producer and consumer. This patch tries to address this by: - switch from sk_receive_queue to a skb_array, and resize it when tx_queue_len was changed. - introduce a new proto_ops peek_len which was used for peeking the skb length. - implement a tun version of peek_len for vhost_net to use and convert vhost_net to use peek_len if possible. Pktgen test shows about 15.3% improvement on guest receiving pps for small buffers: Before: ~1300000pps After : ~1500000pps Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | vhost_net: stop polling socket during rx processingJason Wang2016-06-071-31/+33
|/ | | | | | | | | | | | | | | | | | | We don't stop rx polling socket during rx processing, this will lead unnecessary wakeups from under layer net devices (E.g sock_def_readable() form tun). Rx will be slowed down in this way. This patch avoids this by stop polling socket during rx processing. A small drawback is that this introduces some overheads in light load case because of the extra start/stop polling, but single netperf TCP_RR does not notice any change. In a super heavy load case, e.g using pktgen to inject packet to guest, we get about ~8.8% improvement on pps: before: ~1240000 pkt/s after: ~1350000 pkt/s Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* target: make close_session optionalChristoph Hellwig2016-05-101-6/+0Star
| | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: make ->shutdown_session optionalChristoph Hellwig2016-05-101-6/+0Star
| | | | | | | | | | | | | | Turns out the template and thus many drivers got the return value wrong: 0 means the fabrics driver needs to put a session reference, which no driver except for the iSCSI target drivers did. Fortunately none of these drivers supports explicit Node ACLs, so the bug was harmless. Even without that only qla2xxx and iscsi every did real work in shutdown_session, so get rid of the boilerplate code in all other drivers. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* Merge branch 'for-next' of ↵Linus Torvalds2016-03-221-58/+41Star
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull SCSI target updates from Nicholas Bellinger: "The highlights this round include: - Add target_alloc_session() w/ callback helper for doing se_session allocation + tag + se_node_acl lookup. (HCH + nab) - Tree-wide fabric driver conversion to use target_alloc_session() - Convert sbp-target to use percpu_ida tag pre-allocation, and TARGET_SCF_ACK_KREF I/O krefs (Chris Boot + nab) - Convert usb-gadget to use percpu_ida tag pre-allocation, and TARGET_SCF_ACK_KREF I/O krefs (Andrzej Pietrasiewicz + nab) - Convert xen-scsiback to use percpu_ida tag pre-allocation, and TARGET_SCF_ACK_KREF I/O krefs (Juergen Gross + nab) - Convert tcm_fc to use TARGET_SCF_ACK_KREF I/O + TMR krefs - Convert ib_srpt to use percpu_ida tag pre-allocation - Add DebugFS node for qla2xxx target sess list (Quinn) - Rework iser-target connection termination (Jenny + Sagi) - Convert iser-target to new CQ API (HCH) - Add pass-through WRITE_SAME support for IBLOCK (Mike Christie) - Introduce data_bitmap for asynchronous access of data area (Sheng Yang + Andy) - Fix target_release_cmd_kref shutdown comp leak (Himanshu Madhani) Also, there is a separate PULL request coming for cxgb4 NIC driver prerequisites for supporting hw iscsi segmentation offload (ISO), that will be the base for a number of v4.7 developments involving iscsi-target hw offloads" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (36 commits) target: Fix target_release_cmd_kref shutdown comp leak target: Avoid DataIN transfers for non-GOOD SAM status target/user: Report capability of handling out-of-order completions to userspace target/user: Fix size_t format-spec build warning target/user: Don't free expired command when time out target/user: Introduce data_bitmap, replace data_length/data_head/data_tail target/user: Free data ring in unified function target/user: Use iovec[] to describe continuous area target: Remove enum transport_lunflags_table target/iblock: pass WRITE_SAME to device if possible iser-target: Kill the ->isert_cmd back pointer in struct iser_tx_desc iser-target: Kill struct isert_rdma_wr iser-target: Convert to new CQ API iser-target: Split and properly type the login buffer iser-target: Remove ISER_RECV_DATA_SEG_LEN iser-target: Remove impossible condition from isert_wait_conn iser-target: Remove redundant wait in release_conn iser-target: Rework connection termination iser-target: Separate flows for np listeners and connections cma events iser-target: Add new state ISER_CONN_BOUND to isert_conn ...
| * vhost/scsi: Convert to target_alloc_session usageNicholas Bellinger2016-03-111-58/+41Star
| | | | | | | | | | | | | | | | This patch converts vhost/scsi pre-allocation of vhost_scsi_cmd descriptors to use the new alloc_session callback(). Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* | vhost_net: basic polling supportJason Wang2016-03-113-5/+88
| | | | | | | | | | | | | | | | | | This patch tries to poll for new added tx buffer or socket receive queue for a while at the end of tx/rx processing. The maximum time spent on polling were specified through a new kind of vring ioctl. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* | vhost: introduce vhost_vq_avail_empty()Jason Wang2016-03-112-0/+15
| | | | | | | | | | | | | | | | | | | | This patch introduces a helper which will return true if we're sure that the available ring is empty for a specific vq. When we're not sure, e.g vq access failure, return false instead. This could be used for busy polling code to exit the busy loop. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* | vhost: introduce vhost_has_work()Jason Wang2016-03-112-0/+8
| | | | | | | | | | | | | | | | | | This path introduces a helper which can give a hint for whether or not there's a work queued in the work list. This could be used for busy polling code to exit the busy loop. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* | vhost: rename vhost_init_used()Greg Kurz2016-03-025-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Looking at how callers use this, maybe we should just rename init_used to vhost_vq_init_access. The _used suffix was a hint that we access the vq used ring. But maybe what callers care about is that it must be called after access_ok. Also, this function manipulates the vq->is_le field which isn't related to the vq used ring. This patch simply renames vhost_init_used() to vhost_vq_init_access() as suggested by Michael. No behaviour change. Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* | vhost: rename cross-endian helpersGreg Kurz2016-03-021-6/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The default use case for vhost is when the host and the vring have the same endianness (default native endianness). But there are cases where they differ and vhost should byteswap when accessing the vring. The first case is when the host is big endian and the vring belongs to a virtio 1.0 device, which is always little endian. This is covered by the vq->is_le field. This field is initialized when userspace calls the VHOST_SET_FEATURES ioctl. It is reset when the device stops. We already have a vhost_init_is_le() helper, but the reset operation is opencoded as follows: vq->is_le = virtio_legacy_is_little_endian(); It isn't clear that we are resetting vq->is_le here. This patch moves the code to a helper with a more explicit name. The other case where we may have to byteswap is when the architecture can switch endianness at runtime (bi-endian). If endianness differs in the host and in the guest, then legacy devices need to be used in cross-endian mode. This mode is available with CONFIG_VHOST_CROSS_ENDIAN_LEGACY=y, which introduces a vq->user_be field. Userspace may enable cross-endian mode by calling the SET_VRING_ENDIAN ioctl before the device is started. The cross-endian mode is disabled when the device is stopped. The current names of the helpers that manipulate vq->user_be are unclear. This patch renames those helpers to clearly show that this is cross-endian stuff and with explicit enable/disable semantics. No behaviour change. Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* | vhost: fix error path in vhost_init_used()Greg Kurz2016-03-021-4/+11
|/ | | | | | | | We don't want side effects. If something fails, we rollback vq->is_le to its previous value. Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* vhost: replace % with & on data pathMichael S. Tsirkin2015-12-071-3/+3
| | | | | | | We know vring num is a power of 2, so use & to mask the high bits. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* vhost: relax log address alignmentMichael S. Tsirkin2015-12-071-1/+1
| | | | | | | | | | | | commit 5d9a07b0de512b77bf28d2401e5fe3351f00a240 ("vhost: relax used address alignment") fixed the alignment for the used virtual address, but not for the physical address used for logging. That's a mistake: alignment should clearly be the same for virtual and physical addresses, Cc: stable@vger.kernel.org Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* Merge branch 'for-next' of ↵Linus Torvalds2015-11-141-22/+19Star
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull SCSI target updates from Nicholas Bellinger: "This series contains HCH's changes to absorb configfs attribute ->show() + ->store() function pointer usage from it's original tree-wide consumers, into common configfs code. It includes usb-gadget, target w/ drivers, netconsole and ocfs2 changes to realize the improved simplicity, that now renders the original include/target/configfs_macros.h CPP magic for fabric drivers and others, unnecessary and obsolete. And with common code in place, new configfs attributes can be added easier than ever before. Note, there are further improvements in-flight from other folks for v4.5 code in configfs land, plus number of target fixes for post -rc1 code" In the meantime, a new user of the now-removed old configfs API came in through the char/misc tree in commit 7bd1d4093c2f ("stm class: Introduce an abstraction for System Trace Module devices"). This merge resolution comes from Alexander Shishkin, who updated his stm class tracing abstraction to account for the removal of the old show_attribute and store_attribute methods in commit 517982229f78 ("configfs: remove old API") from this pull. As Alexander says about that patch: "There's no need to keep an extra wrapper structure per item and the awkward show_attribute/store_attribute item ops are no longer needed. This patch converts policy code to the new api, all the while making the code quite a bit smaller and easier on the eyes. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>" That patch was folded into the merge so that the tree should be fully bisectable. * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (23 commits) configfs: remove old API ocfs2/cluster: use per-attribute show and store methods ocfs2/cluster: move locking into attribute store methods netconsole: use per-attribute show and store methods target: use per-attribute show and store methods spear13xx_pcie_gadget: use per-attribute show and store methods dlm: use per-attribute show and store methods usb-gadget/f_serial: use per-attribute show and store methods usb-gadget/f_phonet: use per-attribute show and store methods usb-gadget/f_obex: use per-attribute show and store methods usb-gadget/f_uac2: use per-attribute show and store methods usb-gadget/f_uac1: use per-attribute show and store methods usb-gadget/f_mass_storage: use per-attribute show and store methods usb-gadget/f_sourcesink: use per-attribute show and store methods usb-gadget/f_printer: use per-attribute show and store methods usb-gadget/f_midi: use per-attribute show and store methods usb-gadget/f_loopback: use per-attribute show and store methods usb-gadget/ether: use per-attribute show and store methods usb-gadget/f_acm: use per-attribute show and store methods usb-gadget/f_hid: use per-attribute show and store methods ...
| * target: use per-attribute show and store methodsChristoph Hellwig2015-10-141-22/+19Star
| | | | | | | | | | | | | | | | | | | | | | This also allows to remove the target-specific old configfs macros, and gets rid of the target_core_fabric_configfs.h header which only had one function declaration left that could be moved to a better place. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Nicholas Bellinger <nab@linux-iscsi.org> Acked-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* | vhost: fix performance on LE hostsMichael S. Tsirkin2015-10-281-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 2751c9882b947292fcfb084c4f604e01724af804 ("vhost: cross-endian support for legacy devices") introduced a minor regression: even with cross-endian disabled, and even on LE host, vhost_is_little_endian is checking is_le flag so there's always a branch. To fix, simply check virtio_legacy_is_little_endian first. Cc: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds2015-09-184-6/+8
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | Pull virtio fixes and cleanups from Michael Tsirkin: "This fixes the virtio-test tool, and improves the error handling for virtio-ccw" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: virtio/s390: handle failures of READ_VQ_CONF ccw tools/virtio: propagate V=X to kernel build vhost: move features to core tools/virtio: fix build after 4.2 changes
| * vhost: move features to coreMichael S. Tsirkin2015-09-164-6/+8
| | | | | | | | | | | | | | virtio 1 and any layout are core features, move them there. This fixes vhost test. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* | Merge 4.2-rc6 into char-misc-nextGreg Kroah-Hartman2015-08-101-4/+2Star
|\| | | | | | | | | | | We want the fixes in Linus's tree in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * vhost: fix error handling for memory region allocIgor Mammedov2015-07-271-4/+1Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | callers of vhost_kvzalloc() expect the same behaviour on allocation error as from kmalloc/vmalloc i.e. NULL return value. So just return vzmalloc() returned value instead of returning ERR_PTR(-ENOMEM) Fixes: 4de7255f7d2be5 ("vhost: extend memory regions allocation to vmalloc") Spotted-by: Dan Carpenter <dan.carpenter@oracle.com> Suggested-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vhost: actually track log eventfd fileMarc-André Lureau2015-07-271-0/+1
| | | | | | | | | | | | | | | | | | While reviewing vhost log code, I found out that log_file is never set. Note: I haven't tested the change (QEMU doesn't use LOG_FD yet). Cc: stable@vger.kernel.org Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* | char: make misc_deregister a void functionGreg Kroah-Hartman2015-08-051-2/+2
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | With well over 200+ users of this api, there are a mere 12 users that actually checked the return value of this function. And all of them really didn't do anything with that information as the system or module was shutting down no matter what. So stop pretending like it matters, and just return void from misc_deregister(). If something goes wrong in the call, you will get a WARNING splat in the syslog so you know how to fix up your driver. Other than that, there's nothing that can go wrong. Cc: Alasdair Kergon <agk@redhat.com> Cc: Neil Brown <neilb@suse.com> Cc: Oleg Drokin <oleg.drokin@intel.com> Cc: Andreas Dilger <andreas.dilger@intel.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Wim Van Sebroeck <wim@iguana.be> Cc: Christine Caulfield <ccaulfie@redhat.com> Cc: David Teigland <teigland@redhat.com> Cc: Mark Fasheh <mfasheh@suse.com> Acked-by: Joel Becker <jlbec@evilplan.org> Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Acked-by: Alessandro Zummo <a.zummo@towertech.it> Acked-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds2015-07-231-16/+51
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull virtio/vhost fixes from Michael Tsirkin: "Bugfixes and documentation fixes. Igor's patch that allows users to tweak memory table size is borderline, but it does fix known crashes, so I merged it" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vhost: add max_mem_regions module parameter vhost: extend memory regions allocation to vmalloc 9p/trans_virtio: reset virtio device on remove virtio/s390: rename drivers/s390/kvm -> drivers/s390/virtio MAINTAINERS: separate section for s390 virtio drivers virtio: define virtio_pci_cfg_cap in header. virtio: Fix typecast of pointer in vring_init() virtio scsi: fix unused variable warning vhost: use binary search instead of linear in find_region() virtio_net: document VIRTIO_NET_CTRL_GUEST_OFFLOADS
| * vhost: add max_mem_regions module parameterIgor Mammedov2015-07-131-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | it became possible to use a bigger amount of memory slots, which is used by memory hotplug for registering hotplugged memory. However QEMU crashes if it's used with more than ~60 pc-dimm devices and vhost-net enabled since host kernel in module vhost-net refuses to accept more than 64 memory regions. Allow to tweak limit via max_mem_regions module paramemter with default value set to 64 slots. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vhost: extend memory regions allocation to vmallocIgor Mammedov2015-07-131-4/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | with large number of memory regions we could end up with high order allocations and kmalloc could fail if host is under memory pressure. Considering that memory regions array is used on hot path try harder to allocate using kmalloc and if it fails resort to vmalloc. It's still better than just failing vhost_set_memory() and causing guest crash due to it when a new memory hotplugged to guest. I'll still look at QEMU side solution to reduce amount of memory regions it feeds to vhost to make things even better, but it doesn't hurt for kernel to behave smarter and don't crash older QEMU's which could use large amount of memory regions. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vhost: use binary search instead of linear in find_region()Igor Mammedov2015-07-011-10/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For default region layouts performance stays the same as linear search i.e. it takes around 210ns average for translate_desc() that inlines find_region(). But it scales better with larger amount of regions, 235ns BS vs 300ns LS with 55 memory regions and it will be about the same values when allowed number of slots is increased to 509 like it has been done in kvm. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* | Merge branch 'for-next' of ↵Linus Torvalds2015-07-041-216/+3Star
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull SCSI target updates from Nicholas Bellinger: "It's been a busy development cycle for target-core in a number of different areas. The fabric API usage for se_node_acl allocation is now within target-core code, dropping the external API callers for all fabric drivers tree-wide. There is a new conversion to RCU hlists for se_node_acl and se_portal_group LUN mappings, that turns fast-past LUN lookup into a completely lockless code-path. It also removes the original hard-coded limitation of 256 LUNs per fabric endpoint. The configfs attributes for backends can now be shared between core and driver code, allowing existing drivers to use common code while still allowing flexibility for new backend provided attributes. The highlights include: - Merge sbc_verify_dif_* into common code (sagi) - Remove iscsi-target support for obsolete IFMarker/OFMarker (Christophe Vu-Brugier) - Add bidi support in target/user backend (ilias + vangelis + agover) - Move se_node_acl allocation into target-core code (hch) - Add crc_t10dif_update common helper (akinobu + mkp) - Handle target-core odd SGL mapping for data transfer memory (akinobu) - Move transport ID handling into target-core (hch) - Move task tag into struct se_cmd + support 64-bit tags (bart) - Convert se_node_acl->device_list[] to RCU hlist (nab + hch + paulmck) - Convert se_portal_group->tpg_lun_list[] to RCU hlist (nab + hch + paulmck) - Simplify target backend driver registration (hch) - Consolidate + simplify target backend attribute implementations (hch + nab) - Subsume se_port + t10_alua_tg_pt_gp_member into se_lun (hch) - Drop lun_sep_lock for se_lun->lun_se_dev RCU usage (hch + nab) - Drop unnecessary core_tpg_register TFO parameter (nab) - Use 64-bit LUNs tree-wide (hannes) - Drop left-over TARGET_MAX_LUNS_PER_TRANSPORT limit (hannes)" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (76 commits) target: Bump core version to v5.0 target: remove target_core_configfs.h target: remove unused TARGET_CORE_CONFIG_ROOT define target: consolidate version defines target: implement WRITE_SAME with UNMAP bit using ->execute_unmap target: simplify UNMAP handling target: replace se_cmd->execute_rw with a protocol_data field target/user: Fix inconsistent kmap_atomic/kunmap_atomic target: Send UA when changing LUN inventory target: Send UA upon LUN RESET tmr completion target: Send UA on ALUA target port group change target: Convert se_lun->lun_deve_lock to normal spinlock target: use 'se_dev_entry' when allocating UAs target: Remove 'ua_nacl' pointer from se_ua structure target_core_alua: Correct UA handling when switching states xen-scsiback: Fix compile warning for 64-bit LUN target: Remove TARGET_MAX_LUNS_PER_TRANSPORT target: use 64-bit LUNs target: Drop duplicate + unused se_dev_check_wce target: Drop unnecessary core_tpg_register TFO parameter ...
| * | target: Drop unnecessary core_tpg_register TFO parameterNicholas Bellinger2015-06-161-3/+1Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch drops unnecessary target_core_fabric_ops parameter usage for core_tpg_register() during fabric driver TFO->fabric_make_tpg() se_portal_group creation callback execution. Instead, use the existing se_wwn->wwn_tf->tf_ops pointer to ensure fabric driver is really using the same TFO provided at module_init time. Also go ahead and drop the forward TFO declarations tree-wide, and handling the special case for iscsi-target discovery TPG. Cc: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * | target: target_core_configfs.h is not needed in fabric driversChristoph Hellwig2015-05-311-1/+0Star
| | | | | | | | | | | | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * | target: Move task tag into struct se_cmd + support 64-bit tagsBart Van Assche2015-05-311-6/+1Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Simplify target core and target drivers by storing the task tag a.k.a. command identifier inside struct se_cmd. For several transports (e.g. SRP) tags are 64 bits wide. Hence add support for 64-bit tags. (Fix core_tmr_abort_task conversion spec warnings - nab) (Fix up usb-gadget to use 16-bit tags - HCH + bart) Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Andy Grover <agrover@redhat.com> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: <qla2xxx-upstream@qlogic.com> Cc: Felipe Balbi <balbi@ti.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Juergen Gross <jgross@suse.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * | target: move transport ID handling to the coreChristoph Hellwig2015-05-311-94/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | Now that struct se_portal_group contains a protocol identifier field we can take all the code to format an parse protocol identifiers in CDBs into common code instead of leaving this to low-level drivers. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * | target: remove the get_fabric_proto_ident methodChristoph Hellwig2015-05-311-23/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | Now that we store the protocol identifier in the tpg structure we don't need this method. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * | target: change core_tpg_register prototypeChristoph Hellwig2015-05-311-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | Remove the unneeded fabric_ptr argument, and change the type argument to pass in a SPC protocol identifier. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * | target: move node ACL allocation to core codeChristoph Hellwig2015-05-311-15/+0Star
| | | | | | | | | | | | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>