summaryrefslogtreecommitdiffstats
path: root/drivers/vhost
Commit message (Collapse)AuthorAgeFilesLines
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2018-12-101-3/+0Star
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking fixes from David Miller: "A decent batch of fixes here. I'd say about half are for problems that have existed for a while, and half are for new regressions added in the 4.20 merge window. 1) Fix 10G SFP phy module detection in mvpp2, from Baruch Siach. 2) Revert bogus emac driver change, from Benjamin Herrenschmidt. 3) Handle BPF exported data structure with pointers when building 32-bit userland, from Daniel Borkmann. 4) Memory leak fix in act_police, from Davide Caratti. 5) Check RX checksum offload in RX descriptors properly in aquantia driver, from Dmitry Bogdanov. 6) SKB unlink fix in various spots, from Edward Cree. 7) ndo_dflt_fdb_dump() only works with ethernet, enforce this, from Eric Dumazet. 8) Fix FID leak in mlxsw driver, from Ido Schimmel. 9) IOTLB locking fix in vhost, from Jean-Philippe Brucker. 10) Fix SKB truesize accounting in ipv4/ipv6/netfilter frag memory limits otherwise namespace exit can hang. From Jiri Wiesner. 11) Address block parsing length fixes in x25 from Martin Schiller. 12) IRQ and ring accounting fixes in bnxt_en, from Michael Chan. 13) For tun interfaces, only iface delete works with rtnl ops, enforce this by disallowing add. From Nicolas Dichtel. 14) Use after free in liquidio, from Pan Bian. 15) Fix SKB use after passing to netif_receive_skb(), from Prashant Bhole. 16) Static key accounting and other fixes in XPS from Sabrina Dubroca. 17) Partially initialized flow key passed to ip6_route_output(), from Shmulik Ladkani. 18) Fix RTNL deadlock during reset in ibmvnic driver, from Thomas Falcon. 19) Several small TCP fixes (off-by-one on window probe abort, NULL deref in tail loss probe, SNMP mis-estimations) from Yuchung Cheng" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (93 commits) net/sched: cls_flower: Reject duplicated rules also under skip_sw bnxt_en: Fix _bnxt_get_max_rings() for 57500 chips. bnxt_en: Fix NQ/CP rings accounting on the new 57500 chips. bnxt_en: Keep track of reserved IRQs. bnxt_en: Fix CNP CoS queue regression. net/mlx4_core: Correctly set PFC param if global pause is turned off. Revert "net/ibm/emac: wrong bit is used for STA control" neighbour: Avoid writing before skb->head in neigh_hh_output() ipv6: Check available headroom in ip6_xmit() even without options tcp: lack of available data can also cause TSO defer ipv6: sr: properly initialize flowi6 prior passing to ip6_route_output mlxsw: spectrum_switchdev: Fix VLAN device deletion via ioctl mlxsw: spectrum_router: Relax GRE decap matching check mlxsw: spectrum_switchdev: Avoid leaking FID's reference count mlxsw: spectrum_nve: Remove easily triggerable warnings ipv4: ipv6: netfilter: Adjust the frag mem limit when truesize changes sctp: frag_point sanity check tcp: fix NULL ref in tail loss probe tcp: Do not underestimate rwnd_limited net: use skb_list_del_init() to remove from RX sublists ...
| * vhost: fix IOTLB lockingJean-Philippe Brucker2018-12-041-3/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 78139c94dc8c ("net: vhost: lock the vqs one by one") moved the vq lock to improve scalability, but introduced a possible deadlock in vhost-iotlb. vhost_iotlb_notify_vq() now takes vq->mutex while holding the device's IOTLB spinlock. And on the vhost_iotlb_miss() path, the spinlock is taken while holding vq->mutex. Since calling vhost_poll_queue() doesn't require any lock, avoid the deadlock by not taking vq->mutex. Fixes: 78139c94dc8c ("net: vhost: lock the vqs one by one") Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | vhost/vsock: fix use-after-free in network stack callersStefan Hajnoczi2018-12-061-24/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the network stack calls .send_pkt()/.cancel_pkt() during .release(), a struct vhost_vsock use-after-free is possible. This occurs because .release() does not wait for other CPUs to stop using struct vhost_vsock. Switch to an RCU-enabled hashtable (indexed by guest CID) so that .release() can wait for other CPUs by calling synchronize_rcu(). This also eliminates vhost_vsock_lock acquisition in the data path so it could have a positive effect on performance. This is CVE-2018-14625 "kernel: use-after-free Read in vhost_transport_send_pkt". Cc: stable@vger.kernel.org Reported-and-tested-by: syzbot+bd391451452fb0b93039@syzkaller.appspotmail.com Reported-by: syzbot+e3e074963495f92a89ed@syzkaller.appspotmail.com Reported-by: syzbot+d5a0a170c5069658b141@syzkaller.appspotmail.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
* | vhost/vsock: fix reset orphans race with close timeoutStefan Hajnoczi2018-12-061-7/+15
|/ | | | | | | | | | | | | | | | | | If a local process has closed a connected socket and hasn't received a RST packet yet, then the socket remains in the table until a timeout expires. When a vhost_vsock instance is released with the timeout still pending, the socket is never freed because vhost_vsock has already set the SOCK_DONE flag. Check if the close timer is pending and let it close the socket. This prevents the race which can leak sockets. Reported-by: Maximilian Riemensberger <riemensberger@cadami.net> Cc: Graham Whaley <graham.whaley@gmail.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds2018-11-011-97/+329
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull virtio/vhost updates from Michael Tsirkin: "Fixes and tweaks: - virtio balloon page hinting support - vhost scsi control queue - misc fixes" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: MAINTAINERS: remove reference to bogus vsock file vhost/scsi: Use common handling code in request queue handler vhost/scsi: Extract common handling code from control queue handler vhost/scsi: Respond to control queue operations vhost/scsi: truncate T10 PI iov_iter to prot_bytes virtio-balloon: VIRTIO_BALLOON_F_PAGE_POISON mm/page_poison: expose page_poisoning_enabled to kernel modules virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT kvm_config: add CONFIG_VIRTIO_MENU
| * vhost/scsi: Use common handling code in request queue handlerBijan Mottahedeh2018-10-251-197/+164Star
| | | | | | | | | | | | | | | | Change the request queue handler to use common handling routines same as the control queue handler. Signed-off-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vhost/scsi: Extract common handling code from control queue handlerBijan Mottahedeh2018-10-251-99/+172
| | | | | | | | | | | | | | | | Prepare to change the request queue handler to use common handling routines. Signed-off-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vhost/scsi: Respond to control queue operationsBijan Mottahedeh2018-10-251-0/+190
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The vhost-scsi driver currently does not handle any control queue operations. In particular, vhost_scsi_ctl_handle_kick, merely prints out a debug message but does nothing else. This can cause guest VMs to hang. As part of SCSI recovery from an error, e.g., an I/O timeout, the SCSI midlayer attempts to abort the failed operation. The SCSI virtio driver translates the abort to a SCSI TMF request that gets put on the control queue (virtscsi_abort -> virtscsi_tmf). The SCSI virtio driver then waits indefinitely for this request to be completed, but it never will because vhost-scsi never responds to that request. To avoid a hang, always respond to control queue operations; explicitly reject TMF requests, and return a no-op response to event requests. Signed-off-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vhost/scsi: truncate T10 PI iov_iter to prot_bytesGreg Edwards2018-10-251-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commands with protection information included were not truncating the protection iov_iter to the number of protection bytes in the command. This resulted in vhost_scsi mis-calculating the size of the protection SGL in vhost_scsi_calc_sgls(), and including both the protection and data SG entries in the protection SGL. Fixes: 09b13fa8c1a1 ("vhost/scsi: Add ANY_LAYOUT support in vhost_scsi_handle_vq") Signed-off-by: Greg Edwards <gedwards@ddn.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Fixes: 09b13fa8c1a1093e9458549ac8bb203a7c65c62a Cc: stable@vger.kernel.org Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
* | vhost: Fix Spectre V1 vulnerabilityJason Wang2018-10-311-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | The idx in vhost_vring_ioctl() was controlled by userspace, hence a potential exploitation of the Spectre variant 1 vulnerability. Fixing this by sanitizing idx before using it to index d->vqs. Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: vhost: remove bad code lineTonghao Zhang2018-10-081-1/+0Star
| | | | | | | | | | Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: vhost: add rx busy polling in tx pathTonghao Zhang2018-09-271-21/+14Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch improves the guest receive performance. On the handle_tx side, we poll the sock receive queue at the same time. handle_rx do that in the same way. We set the poll-us=100us and use the netperf to test throughput and mean latency. When running the tests, the vhost-net kthread of that VM, is alway 100% CPU. The commands are shown as below. Rx performance is greatly improved by this patch. There is not notable performance change on tx with this series though. This patch is useful for bi-directional traffic. netperf -H IP -t TCP_STREAM -l 20 -- -O "THROUGHPUT, THROUGHPUT_UNITS, MEAN_LATENCY" Topology: [Host] ->linux bridge -> tap vhost-net ->[Guest] TCP_STREAM: * Without the patch: 19842.95 Mbps, 6.50 us mean latency * With the patch: 37598.20 Mbps, 3.43 us mean latency Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: vhost: factor out busy polling logic to vhost_net_busy_poll()Tonghao Zhang2018-09-271-40/+70
| | | | | | | | | | | | | | | | | | | | | | | | | | Factor out generic busy polling logic and will be used for in tx path in the next patch. And with the patch, qemu can set differently the busyloop_timeout for rx queue. To avoid duplicate codes, introduce the helper functions: * sock_has_rx_data(changed from sk_has_rx_data) * vhost_net_busy_poll_try_queue Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: vhost: replace magic number of lock annotationTonghao Zhang2018-09-271-3/+3
| | | | | | | | | | | | | | | | Use the VHOST_NET_VQ_XXX as a subclass for mutex_lock_nested. Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: vhost: lock the vqs one by oneTonghao Zhang2018-09-271-17/+7Star
| | | | | | | | | | | | | | | | | | | | | | This patch changes the way that lock all vqs at the same, to lock them one by one. It will be used for next patch to avoid the deadlock. Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | vhost_net: add a missing error returnDan Carpenter2018-09-211-0/+1
| | | | | | | | | | | | | | | | | | | | | | We accidentally left out this error return so it leads to some use after free bugs later on. Fixes: 0a0be13b8fe2 ("vhost_net: batch submitting XDP buffers to underlayer sockets") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | vhost_net: batch submitting XDP buffers to underlayer socketsJason Wang2018-09-131-13/+161
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements XDP batching for vhost_net. The idea is first to try to do userspace copy and build XDP buff directly in vhost. Instead of submitting the packet immediately, vhost_net will batch them in an array and submit every 64 (VHOST_NET_BATCH) packets to the under layer sockets through msg_control of sendmsg(). When XDP is enabled on the TUN/TAP, TUN/TAP can process XDP inside a loop without caring GUP thus it can do batch map flushing. When XDP is not enabled or not supported, the underlayer socket need to build skb and pass it to network core. The batched packet submission allows us to do batching like netif_receive_skb_list() in the future. This saves lots of indirect calls for better cache utilization. For the case that we can't so batching e.g when sndbuf is limited or packet size is too large, we will go for usual one packet per sendmsg() way. Doing testpmd on various setups gives us: Test /+pps% XDP_DROP on TAP /+44.8% XDP_REDIRECT on TAP /+29% macvtap (skb) /+26% Netperf tests shows obvious improvements for small packet transmission: size/session/+thu%/+normalize% 64/ 1/ +2%/ 0% 64/ 2/ +3%/ +1% 64/ 4/ +7%/ +5% 64/ 8/ +8%/ +6% 256/ 1/ +3%/ 0% 256/ 2/ +10%/ +7% 256/ 4/ +26%/ +22% 256/ 8/ +27%/ +23% 512/ 1/ +3%/ +2% 512/ 2/ +19%/ +14% 512/ 4/ +43%/ +40% 512/ 8/ +45%/ +41% 1024/ 1/ +4%/ 0% 1024/ 2/ +27%/ +21% 1024/ 4/ +38%/ +73% 1024/ 8/ +15%/ +24% 2048/ 1/ +10%/ +7% 2048/ 2/ +16%/ +12% 2048/ 4/ 0%/ +2% 2048/ 8/ 0%/ +2% 4096/ 1/ +36%/ +60% 4096/ 2/ -11%/ -26% 4096/ 4/ 0%/ +14% 4096/ 8/ 0%/ +4% 16384/ 1/ -1%/ +5% 16384/ 2/ 0%/ +2% 16384/ 4/ 0%/ -3% 16384/ 8/ 0%/ +4% 65535/ 1/ 0%/ +10% 65535/ 2/ 0%/ +8% 65535/ 4/ 0%/ +1% 65535/ 8/ 0%/ +3% Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | tun: switch to new type of msg_controlJason Wang2018-09-131-2/+5
|/ | | | | | | | | | | | | | | | | | | | | This patch introduces to a new tun/tap specific msg_control: #define TUN_MSG_UBUF 1 #define TUN_MSG_PTR 2 struct tun_msg_ctl { int type; void *ptr; }; This allows us to pass different kinds of msg_control through sendmsg(). The first supported type is ubuf (TUN_MSG_UBUF) which will be used by the existed vhost_net zerocopy code. The second is XDP buff, which allows vhost_net to pass XDP buff to TUN. This could be used to implement accepting an array of XDP buffs from vhost_net in the following patches. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2018-08-271-1/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking fixes from David Miller: 1) ICE, E1000, IGB, IXGBE, and I40E bug fixes from the Intel folks. 2) Better fix for AB-BA deadlock in packet scheduler code, from Cong Wang. 3) bpf sockmap fixes (zero sized key handling, etc.) from Daniel Borkmann. 4) Send zero IPID in TCP resets and SYN-RECV state ACKs, to prevent attackers using it as a side-channel. From Eric Dumazet. 5) Memory leak in mediatek bluetooth driver, from Gustavo A. R. Silva. 6) Hook up rt->dst.input of ipv6 anycast routes properly, from Hangbin Liu. 7) hns and hns3 bug fixes from Huazhong Tan. 8) Fix RIF leak in mlxsw driver, from Ido Schimmel. 9) iova range check fix in vhost, from Jason Wang. 10) Fix hang in do_tcp_sendpages() with tls, from John Fastabend. 11) More r8152 chips need to disable RX aggregation, from Kai-Heng Feng. 12) Memory exposure in TCA_U32_SEL handling, from Kees Cook. 13) TCP BBR congestion control fixes from Kevin Yang. 14) hv_netvsc, ignore non-PCI devices, from Stephen Hemminger. 15) qed driver fixes from Tomer Tayar. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (77 commits) net: sched: Fix memory exposure from short TCA_U32_SEL qed: fix spelling mistake "comparsion" -> "comparison" vhost: correctly check the iova range when waking virtqueue qlge: Fix netdev features configuration. net: macb: do not disable MDIO bus at open/close time Revert "net: stmmac: fix build failure due to missing COMMON_CLK dependency" net: macb: Fix regression breaking non-MDIO fixed-link PHYs mlxsw: spectrum_switchdev: Do not leak RIFs when removing bridge i40e: fix condition of WARN_ONCE for stat strings i40e: Fix for Tx timeouts when interface is brought up if DCB is enabled ixgbe: fix driver behaviour after issuing VFLR ixgbe: Prevent unsupported configurations with XDP ixgbe: Replace GFP_ATOMIC with GFP_KERNEL igb: Replace mdelay() with msleep() in igb_integrated_phy_loopback() igb: Replace GFP_ATOMIC with GFP_KERNEL in igb_sw_init() igb: Use an advanced ctx descriptor for launchtime e1000: ensure to free old tx/rx rings in set_ringparam() e1000: check on netif_running() before calling e1000_up() ixgb: use dma_zalloc_coherent instead of allocator/memset ice: Trivial formatting fixes ...
| * vhost: correctly check the iova range when waking virtqueueJason Wang2018-08-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We don't wakeup the virtqueue if the first byte of pending iova range is the last byte of the range we just got updated. This will lead a virtqueue to wait for IOTLB updating forever. Fixing by correct the check and wake up the virtqueue in this case. Fixes: 6b1e6cc7855b ("vhost: new device IOTLB API") Reported-by: Peter Xu <peterx@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Tested-by: Peter Xu <peterx@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds2018-08-242-2/+2
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull virtio updates from Michael Tsirkin: "virtio, vhost: fixes, tweaks No new features but a bunch of tweaks such as switching balloon from oom notifier to shrinker" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vhost/scsi: increase VHOST_SCSI_PREALLOC_PROT_SGLS to 2048 vhost: allow vhost-scsi driver to be built-in virtio: pci-legacy: Validate queue pfn virtio: mmio-v1: Validate queue PFN virtio_balloon: replace oom notifier with shrinker virtio-balloon: kzalloc the vb struct virtio-balloon: remove BUG() in init_vqs
| * vhost/scsi: increase VHOST_SCSI_PREALLOC_PROT_SGLS to 2048Greg Edwards2018-08-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | The current value of VHOST_SCSI_PREALLOC_PROT_SGLS is too small to accommodate larger I/Os, e.g. 16-32 MiB, when the VIRTIO_SCSI_F_T10_PI feature bit is negotiated and the backing store supports T10 PI. vhost-scsi rejects the command with errors like: [ 59.581317] vhost_scsi_calc_sgls: requested sgl_count: 1820 exceeds pre-allocated max_sgls: 512 Signed-off-by: Greg Edwards <gedwards@ddn.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vhost: allow vhost-scsi driver to be built-inGreg Edwards2018-08-221-1/+1
| | | | | | | | | | | | | | | | It's useful to allow vhost-scsi to be built-in when testing vhost in L1 + L2 VMs and booting L1 VM with QEMU '-kernel' option. Signed-off-by: Greg Edwards <gedwards@ddn.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* | Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds2018-08-161-9/+7Star
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull SCSI updates from James Bottomley: "This is mostly updates to the usual drivers: mpt3sas, lpfc, qla2xxx, hisi_sas, smartpqi, megaraid_sas, arcmsr. In addition, with the continuing absence of Nic we have target updates for tcmu and target core (all with reviews and acks). The biggest observable change is going to be that we're (again) trying to switch to mulitqueue as the default (a user can still override the setting on the kernel command line). Other major core stuff is the removal of the remaining Microchannel drivers, an update of the internal timers and some reworks of completion and result handling" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (203 commits) scsi: core: use blk_mq_run_hw_queues in scsi_kick_queue scsi: ufs: remove unnecessary query(DM) UPIU trace scsi: qla2xxx: Fix issue reported by static checker for qla2x00_els_dcmd2_sp_done() scsi: aacraid: Spelling fix in comment scsi: mpt3sas: Fix calltrace observed while running IO & reset scsi: aic94xx: fix an error code in aic94xx_init() scsi: st: remove redundant pointer STbuffer scsi: qla2xxx: Update driver version to 10.00.00.08-k scsi: qla2xxx: Migrate NVME N2N handling into state machine scsi: qla2xxx: Save frame payload size from ICB scsi: qla2xxx: Fix stalled relogin scsi: qla2xxx: Fix race between switch cmd completion and timeout scsi: qla2xxx: Fix Management Server NPort handle reservation logic scsi: qla2xxx: Flush mailbox commands on chip reset scsi: qla2xxx: Fix unintended Logout scsi: qla2xxx: Fix session state stuck in Get Port DB scsi: qla2xxx: Fix redundant fc_rport registration scsi: qla2xxx: Silent erroneous message scsi: qla2xxx: Prevent sysfs access when chip is down scsi: qla2xxx: Add longer window for chip reset ...
| * | scsi: target: loop, usb, vhost, xen: use target_remove_sessionMike Christie2018-08-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This converts drivers that were only calling transport_deregister_session to use target_remove_session. The calling of transport_deregister_session_configfs via target_remove_session for these types of drivers is ok, because they were not exporting info from fields like sess_acl_list, sess->se_tpg and sess->fabric_sess_ptr from configfs accessible functions, so they will see no difference. Signed-off-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Felipe Balbi <balbi@kernel.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Andrzej Pietrasiewicz <andrzej.p@samsung.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Juergen Gross <jgross@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * | scsi: target: rename target_alloc_sessionMike Christie2018-08-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename target_alloc_session to target_setup_session to avoid confusion with the other transport session allocation function that only allocates the session and because the target_alloc_session does so much more. It allocates the session, sets up the nacl and registers the session. The next patch will then add a remove function to match the setup in this one, so it should make sense for all drivers, except iscsi, to just call those 2 functions to setup and remove a session. iscsi will continue to be the odd driver. Signed-off-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Chris Boot <bootc@bootc.net> Cc: Bryant G. Ly <bryantly@linux.vnet.ibm.com> Cc: Michael Cyr <mikecyr@linux.vnet.ibm.com> Cc: <qla2xxx-upstream@qlogic.com> Cc: Johannes Thumshirn <jth@kernel.org> Cc: Felipe Balbi <balbi@kernel.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Andrzej Pietrasiewicz <andrzej.p@samsung.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Juergen Gross <jgross@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * | scsi: target: Remove second argument from fabric_make_tpg()Bart Van Assche2018-07-021-3/+1Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since most target drivers do not use the second fabric_make_tpg() argument ("group") and since it is trivial to derive the group pointer from the wwn pointer, do not pass the group pointer to fabric_make_tpg(). Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Mike Christie <mchristi@redhat.com> Cc: Felipe Balbi <felipe.balbi@linux.intel.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * | scsi: target: Convert target drivers to use sbitmapMatthew Wilcox2018-06-201-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The sbitmap and the percpu_ida perform essentially the same task, allocating tags for commands. The sbitmap outperforms the percpu_ida as documented here: https://lkml.org/lkml/2014/4/22/553 The sbitmap interface is a little harder to use, but being able to remove the percpu_ida code and getting better performance justifies the additional complexity. Signed-off-by: Matthew Wilcox <willy@infradead.org> Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com> # f_tcm Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * | scsi: target: Abstract tag freeingMatthew Wilcox2018-06-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Introduce target_free_tag() and convert all drivers to use it. Signed-off-by: Matthew Wilcox <willy@infradead.org> Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | | Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/netDavid S. Miller2018-08-091-3/+6
|\ \ \ | | |/ | |/| | | | | | | | | | | | | Overlapping changes in RXRPC, changing to ktime_get_seconds() whilst adding some tracepoints. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | vhost: reset metadata cache when initializing new IOTLBJason Wang2018-08-081-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We need to reset metadata cache during new IOTLB initialization, otherwise the stale pointers to previous IOTLB may be still accessed which will lead a use after free. Reported-by: syzbot+c51e6736a1bf614b3272@syzkaller.appspotmail.com Fixes: f88949138058 ("vhost: introduce O(1) vq metadata cache") Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | vhost: switch to use new message formatJason Wang2018-08-063-19/+93
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We use to have message like: struct vhost_msg { int type; union { struct vhost_iotlb_msg iotlb; __u8 padding[64]; }; }; Unfortunately, there will be a hole of 32bit in 64bit machine because of the alignment. This leads a different formats between 32bit API and 64bit API. What's more it will break 32bit program running on 64bit machine. So fixing this by introducing a new message type with an explicit 32bit reserved field after type like: struct vhost_msg_v2 { __u32 type; __u32 reserved; union { struct vhost_iotlb_msg iotlb; __u8 padding[64]; }; }; We will have a consistent ABI after switching to use this. To enable this capability, introduce a new ioctl (VHOST_SET_BAKCEND_FEATURE) for userspace to enable this feature (VHOST_BACKEND_F_IOTLB_V2). Fixes: 6b1e6cc7855b ("vhost: new device IOTLB API") Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | vhost_net: batch update used ring for datacopy TXJason Wang2018-07-221-15/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Like commit e2b3b35eb989 ("vhost_net: batch used ring update in rx"), this patches implements batch used ring update for datacopy TX (zerocopy has already done some kind of batching). Testpmd transmission from guest to host (XDP_DROP on tap) shows 25.8% improvement (from ~3.1Mpps to ~3.9Mpps) on Broadwell i7-5600U CPU @ 2.60GHz machine. Netperf TCP tests does not show obvious differences. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | vhost_net: rename VHOST_RX_BATCH to VHOST_NET_BATCHJason Wang2018-07-221-4/+4
| | | | | | | | | | | | | | | | | | | | | A more generic name which could be used for TX as well. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | vhost_net: rename vhost_rx_signal_used() to vhost_net_signal_used()Jason Wang2018-07-221-4/+4
| | | | | | | | | | | | | | | | | | | | | Rename for reusing this for TX. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | vhost_net: split out datacopy logicJason Wang2018-07-221-20/+90
| | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of mixing zerocopy and datacopy logics, this patch tries to split datacopy logic out. This results for a more compact code and ad-hoc optimization could be done on top more easily. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | vhost_net: introduce tx_can_batch()Jason Wang2018-07-221-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | Introduce tx_can_batch() to determine whether TX could be batched. This will help to reduce the code duplication in the future. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | vhost_net: introduce get_tx_bufs()Jason Wang2018-07-221-17/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | Factor out logic of getting tx buffer and iov iter initialization. This will be used for reducing codes duplication in the future. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | vhost_net: introduce vhost_exceeds_weight()Jason Wang2018-07-221-5/+8
| | | | | | | | | | | | | | | Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | vhost_net: introduce helper to initialize tx iov iterJason Wang2018-07-221-9/+17
| | | | | | | | | | | | | | | | | | | | | Introduce init_iov_iter() in order to be reused by future patch. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | vhost_net: drop unnecessary parameterJason Wang2018-07-221-4/+2Star
| | | | | | | | | | | | | | | Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | vhost_net: Avoid rx vring kicks during busyloopToshiaki Makita2018-07-041-3/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We may run out of avail rx ring descriptor under heavy load but busypoll did not detect it so busypoll may have exited prematurely. Avoid this by checking rx ring full during busypoll. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | vhost_net: Avoid rx queue wake-ups during busypollToshiaki Makita2018-07-041-7/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We may run handle_rx() while rx work is queued. For example a packet can push the rx work during the window before handle_rx calls vhost_net_disable_vq(). In that case busypoll immediately exits due to vhost_has_work() condition and enables vq again. This can lead to another unnecessary rx wake-ups, so poll rx work instead of enabling the vq. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | vhost_net: Avoid tx vring kicks during busyloopToshiaki Makita2018-07-041-13/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Under heavy load vhost busypoll may run without suppressing notification. For example tx zerocopy callback can push tx work while handle_tx() is running, then busyloop exits due to vhost_has_work() condition and enables notification but immediately reenters handle_tx() because the pushed work was tx. In this case handle_tx() tries to disable notification again, but when using event_idx it by design cannot. Then busyloop will run without suppressing notification. Another example is the case where handle_tx() tries to enable notification but avail idx is advanced so disables it again. This case also leads to the same situation with event_idx. The problem is that once we enter this situation busyloop does not work under heavy load for considerable amount of time, because notification is likely to happen during busyloop and handle_tx() immediately enables notification after notification happens. Specifically busyloop detects notification by vhost_has_work() and then handle_tx() calls vhost_enable_notify(). Because the detected work was the tx work, it enters handle_tx(), and enters busyloop without suppression again. This is likely to be repeated, so with event_idx we are almost not able to suppress notification in this case. To fix this, poll the work instead of enabling notification when busypoll is interrupted by something. IMHO vhost_has_work() is kind of interruption rather than a signal to completely cancel the busypoll, so let's run busypoll after the necessary work is done. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | vhost_net: Rename local variables in vhost_net_rx_peek_head_lenToshiaki Makita2018-07-041-17/+17
|/ / | | | | | | | | | | | | | | So we can easily see which variable is for which, tx or rx. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* / vhost_net: validate sock before trying to put its fdJason Wang2018-06-231-1/+2
|/ | | | | | | | | | | | Sock will be NULL if we pass -1 to vhost_net_set_backend(), but when we meet errors during ubuf allocation, the code does not check for NULL before calling sockfd_put(), this will lead NULL dereferencing. Fixing by checking sock pointer before. Fixes: bab632d69ee4 ("vhost: vhost TX zero-copy support") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds2018-06-151-0/+3
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull virtio updates from Michael Tsirkin: "virtio, vhost: features, fixes - PCI virtual function support for virtio - DMA barriers for virtio strong barriers - bugfixes" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: virtio: update the comments for transport features virtio_pci: support enabling VFs vhost: fix info leak due to uninitialized memory virtio_ring: switch to dma_XX barriers for rpmsg
| * vhost: fix info leak due to uninitialized memoryMichael S. Tsirkin2018-06-121-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | struct vhost_msg within struct vhost_msg_node is copied to userspace. Unfortunately it turns out on 64 bit systems vhost_msg has padding after type which gcc doesn't initialize, leaking 4 uninitialized bytes to userspace. This padding also unfortunately means 32 bit users of this interface are broken on a 64 bit kernel which will need to be fixed separately. Fixes: CVE-2018-1118 Cc: stable@vger.kernel.org Reported-by: Kevin Easton <kevin@guarana.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reported-by: syzbot+87cfa083e727a224754b@syzkaller.appspotmail.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* | treewide: kzalloc() -> kcalloc()Kees Cook2018-06-131-6/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The kzalloc() function has a 2-factor argument form, kcalloc(). This patch replaces cases of: kzalloc(a * b, gfp) with: kcalloc(a * b, gfp) as well as handling cases of: kzalloc(a * b * c, gfp) with: kzalloc(array3_size(a, b, c), gfp) as it's slightly less ugly than: kzalloc_array(array_size(a, b), c, gfp) This does, however, attempt to ignore constant size factors like: kzalloc(4 * 1024, gfp) though any constants defined via macros get caught up in the conversion. Any factors with a sizeof() of "unsigned char", "char", and "u8" were dropped, since they're redundant. The Coccinelle script used for this was: // Fix redundant parens around sizeof(). @@ type TYPE; expression THING, E; @@ ( kzalloc( - (sizeof(TYPE)) * E + sizeof(TYPE) * E , ...) | kzalloc( - (sizeof(THING)) * E + sizeof(THING) * E , ...) ) // Drop single-byte sizes and redundant parens. @@ expression COUNT; typedef u8; typedef __u8; @@ ( kzalloc( - sizeof(u8) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(__u8) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(char) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(unsigned char) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(u8) * COUNT + COUNT , ...) | kzalloc( - sizeof(__u8) * COUNT + COUNT , ...) | kzalloc( - sizeof(char) * COUNT + COUNT , ...) | kzalloc( - sizeof(unsigned char) * COUNT + COUNT , ...) ) // 2-factor product with sizeof(type/expression) and identifier or constant. @@ type TYPE; expression THING; identifier COUNT_ID; constant COUNT_CONST; @@ ( - kzalloc + kcalloc ( - sizeof(TYPE) * (COUNT_ID) + COUNT_ID, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * COUNT_ID + COUNT_ID, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * (COUNT_CONST) + COUNT_CONST, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * COUNT_CONST + COUNT_CONST, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * (COUNT_ID) + COUNT_ID, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * COUNT_ID + COUNT_ID, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * (COUNT_CONST) + COUNT_CONST, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * COUNT_CONST + COUNT_CONST, sizeof(THING) , ...) ) // 2-factor product, only identifiers. @@ identifier SIZE, COUNT; @@ - kzalloc + kcalloc ( - SIZE * COUNT + COUNT, SIZE , ...) // 3-factor product with 1 sizeof(type) or sizeof(expression), with // redundant parens removed. @@ expression THING; identifier STRIDE, COUNT; type TYPE; @@ ( kzalloc( - sizeof(TYPE) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(TYPE) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(TYPE) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(TYPE) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(THING) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kzalloc( - sizeof(THING) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kzalloc( - sizeof(THING) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kzalloc( - sizeof(THING) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) ) // 3-factor product with 2 sizeof(variable), with redundant parens removed. @@ expression THING1, THING2; identifier COUNT; type TYPE1, TYPE2; @@ ( kzalloc( - sizeof(TYPE1) * sizeof(TYPE2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kzalloc( - sizeof(THING1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kzalloc( - sizeof(THING1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kzalloc( - sizeof(TYPE1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) | kzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) ) // 3-factor product, only identifiers, with redundant parens removed. @@ identifier STRIDE, SIZE, COUNT; @@ ( kzalloc( - (COUNT) * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - (COUNT) * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - (COUNT) * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - (COUNT) * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) ) // Any remaining multi-factor products, first at least 3-factor products, // when they're not all constants... @@ expression E1, E2, E3; constant C1, C2, C3; @@ ( kzalloc(C1 * C2 * C3, ...) | kzalloc( - (E1) * E2 * E3 + array3_size(E1, E2, E3) , ...) | kzalloc( - (E1) * (E2) * E3 + array3_size(E1, E2, E3) , ...) | kzalloc( - (E1) * (E2) * (E3) + array3_size(E1, E2, E3) , ...) | kzalloc( - E1 * E2 * E3 + array3_size(E1, E2, E3) , ...) ) // And then all remaining 2 factors products when they're not all constants, // keeping sizeof() as the second factor argument. @@ expression THING, E1, E2; type TYPE; constant C1, C2, C3; @@ ( kzalloc(sizeof(THING) * C2, ...) | kzalloc(sizeof(TYPE) * C2, ...) | kzalloc(C1 * C2 * C3, ...) | kzalloc(C1 * C2, ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * (E2) + E2, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * E2 + E2, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * (E2) + E2, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * E2 + E2, sizeof(THING) , ...) | - kzalloc + kcalloc ( - (E1) * E2 + E1, E2 , ...) | - kzalloc + kcalloc ( - (E1) * (E2) + E1, E2 , ...) | - kzalloc + kcalloc ( - E1 * E2 + E1, E2 , ...) ) Signed-off-by: Kees Cook <keescook@chromium.org>
* | treewide: kmalloc() -> kmalloc_array()Kees Cook2018-06-135-10/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The kmalloc() function has a 2-factor argument form, kmalloc_array(). This patch replaces cases of: kmalloc(a * b, gfp) with: kmalloc_array(a * b, gfp) as well as handling cases of: kmalloc(a * b * c, gfp) with: kmalloc(array3_size(a, b, c), gfp) as it's slightly less ugly than: kmalloc_array(array_size(a, b), c, gfp) This does, however, attempt to ignore constant size factors like: kmalloc(4 * 1024, gfp) though any constants defined via macros get caught up in the conversion. Any factors with a sizeof() of "unsigned char", "char", and "u8" were dropped, since they're redundant. The tools/ directory was manually excluded, since it has its own implementation of kmalloc(). The Coccinelle script used for this was: // Fix redundant parens around sizeof(). @@ type TYPE; expression THING, E; @@ ( kmalloc( - (sizeof(TYPE)) * E + sizeof(TYPE) * E , ...) | kmalloc( - (sizeof(THING)) * E + sizeof(THING) * E , ...) ) // Drop single-byte sizes and redundant parens. @@ expression COUNT; typedef u8; typedef __u8; @@ ( kmalloc( - sizeof(u8) * (COUNT) + COUNT , ...) | kmalloc( - sizeof(__u8) * (COUNT) + COUNT , ...) | kmalloc( - sizeof(char) * (COUNT) + COUNT , ...) | kmalloc( - sizeof(unsigned char) * (COUNT) + COUNT , ...) | kmalloc( - sizeof(u8) * COUNT + COUNT , ...) | kmalloc( - sizeof(__u8) * COUNT + COUNT , ...) | kmalloc( - sizeof(char) * COUNT + COUNT , ...) | kmalloc( - sizeof(unsigned char) * COUNT + COUNT , ...) ) // 2-factor product with sizeof(type/expression) and identifier or constant. @@ type TYPE; expression THING; identifier COUNT_ID; constant COUNT_CONST; @@ ( - kmalloc + kmalloc_array ( - sizeof(TYPE) * (COUNT_ID) + COUNT_ID, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(TYPE) * COUNT_ID + COUNT_ID, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(TYPE) * (COUNT_CONST) + COUNT_CONST, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(TYPE) * COUNT_CONST + COUNT_CONST, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * (COUNT_ID) + COUNT_ID, sizeof(THING) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * COUNT_ID + COUNT_ID, sizeof(THING) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * (COUNT_CONST) + COUNT_CONST, sizeof(THING) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * COUNT_CONST + COUNT_CONST, sizeof(THING) , ...) ) // 2-factor product, only identifiers. @@ identifier SIZE, COUNT; @@ - kmalloc + kmalloc_array ( - SIZE * COUNT + COUNT, SIZE , ...) // 3-factor product with 1 sizeof(type) or sizeof(expression), with // redundant parens removed. @@ expression THING; identifier STRIDE, COUNT; type TYPE; @@ ( kmalloc( - sizeof(TYPE) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kmalloc( - sizeof(TYPE) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kmalloc( - sizeof(TYPE) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kmalloc( - sizeof(TYPE) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kmalloc( - sizeof(THING) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kmalloc( - sizeof(THING) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kmalloc( - sizeof(THING) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kmalloc( - sizeof(THING) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) ) // 3-factor product with 2 sizeof(variable), with redundant parens removed. @@ expression THING1, THING2; identifier COUNT; type TYPE1, TYPE2; @@ ( kmalloc( - sizeof(TYPE1) * sizeof(TYPE2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kmalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kmalloc( - sizeof(THING1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kmalloc( - sizeof(THING1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kmalloc( - sizeof(TYPE1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) | kmalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) ) // 3-factor product, only identifiers, with redundant parens removed. @@ identifier STRIDE, SIZE, COUNT; @@ ( kmalloc( - (COUNT) * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - COUNT * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - COUNT * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - (COUNT) * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - COUNT * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - (COUNT) * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - (COUNT) * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - COUNT * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) ) // Any remaining multi-factor products, first at least 3-factor products, // when they're not all constants... @@ expression E1, E2, E3; constant C1, C2, C3; @@ ( kmalloc(C1 * C2 * C3, ...) | kmalloc( - (E1) * E2 * E3 + array3_size(E1, E2, E3) , ...) | kmalloc( - (E1) * (E2) * E3 + array3_size(E1, E2, E3) , ...) | kmalloc( - (E1) * (E2) * (E3) + array3_size(E1, E2, E3) , ...) | kmalloc( - E1 * E2 * E3 + array3_size(E1, E2, E3) , ...) ) // And then all remaining 2 factors products when they're not all constants, // keeping sizeof() as the second factor argument. @@ expression THING, E1, E2; type TYPE; constant C1, C2, C3; @@ ( kmalloc(sizeof(THING) * C2, ...) | kmalloc(sizeof(TYPE) * C2, ...) | kmalloc(C1 * C2 * C3, ...) | kmalloc(C1 * C2, ...) | - kmalloc + kmalloc_array ( - sizeof(TYPE) * (E2) + E2, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(TYPE) * E2 + E2, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * (E2) + E2, sizeof(THING) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * E2 + E2, sizeof(THING) , ...) | - kmalloc + kmalloc_array ( - (E1) * E2 + E1, E2 , ...) | - kmalloc + kmalloc_array ( - (E1) * (E2) + E1, E2 , ...) | - kmalloc + kmalloc_array ( - E1 * E2 + E1, E2 , ...) ) Signed-off-by: Kees Cook <keescook@chromium.org>