summaryrefslogtreecommitdiffstats
path: root/hw
Commit message (Collapse)AuthorAgeFilesLines
* Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into ↵Peter Maydell2016-06-0224-1328/+7586
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | staging # gpg: Signature made Thu 02 Jun 2016 07:23:18 BST using RSA key ID 398D6211 # gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>" # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211 * remotes/jasowang/tags/net-pull-request: (31 commits) Add ENET device to i.MX6 SOC. Add ENET/Gbps Ethernet support to FEC device i.MX: move FEC device to a register array structure. i.MX: Rename i.MX FEC defines to ENET_XXX i.MX: reset TX/RX descriptors when FEC is disabled. i.MX: Fix FEC code for ECR register reset value. i.MX: Fix FEC code for MDIO address selection i.MX: Fix FEC code for MDIO operation selection net: handle optional VLAN header in checksum computation. net: improve UDP/TCP checksum computation. e1000e: Introduce qtest for e1000e device net: Introduce e1000e device emulation e1000: Move out code that will be reused in e1000e e1000_regs: Add definitions for Intel 82574-specific bits vmxnet3: Use pci_dma_* API instead of cpu_physical_memory_* net_pkt: Extend packet abstraction as required by e1000e functionality rtl8139: Move more TCP definitions to common header net_pkt: Name vmxnet3 packet abstractions more generic vmxnet3: Use common MAC address tracing macros net: Add macros for MAC address tracing ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * Add ENET device to i.MX6 SOC.Jean-Christophe Dubois2016-06-021-0/+17
| | | | | | | | | | | | | | | | | | | | | | This adds the ENET device to the i.MX6 SOC. This was tested by booting Linux on an Qemu i.MX6 instance and accessing the internet from the linux guest. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net> Signed-off-by: Jason Wang <jasowang@redhat.com>
| * Add ENET/Gbps Ethernet support to FEC deviceJean-Christophe Dubois2016-06-022-101/+573
| | | | | | | | | | | | | | | | | | | | | | The ENET device (present in i.MX6) is "derived" from FEC and backward compatible with it. This patch adds the necessary support of the added feature in the ENET device to allow Linux to use it (on supported processors). Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net> Signed-off-by: Jason Wang <jasowang@redhat.com>
| * i.MX: move FEC device to a register array structure.Jean-Christophe Dubois2016-06-021-176/+222
| | | | | | | | | | | | | | This is to prepare for the ENET Gb device of the i.MX6. Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net> Signed-off-by: Jason Wang <jasowang@redhat.com>
| * i.MX: Rename i.MX FEC defines to ENET_XXXJean-Christophe Dubois2016-06-021-27/+27
| | | | | | | | | | Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net> Signed-off-by: Jason Wang <jasowang@redhat.com>
| * i.MX: reset TX/RX descriptors when FEC is disabled.Jean-Christophe Dubois2016-06-021-0/+2
| | | | | | | | | | | | | | | | | | According to the FEC chapter of i.MX25 reference manual RX adn TX descriptors are reseted when the FEC device is disabled through ECR. Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net> Signed-off-by: Jason Wang <jasowang@redhat.com>
| * i.MX: Fix FEC code for ECR register reset value.Jean-Christophe Dubois2016-06-021-1/+1
| | | | | | | | | | | | | | | | | | | | According to the FEC chapter of i.MX25 reference manual ECR register is initialized at 0xf0000000 at reset time. We fix the value. Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net> Signed-off-by: Jason Wang <jasowang@redhat.com>
| * i.MX: Fix FEC code for MDIO address selectionJean-Christophe Dubois2016-06-021-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | According to the FEC chapter of i.MX25 reference manual When writing to MMFR register, the MDIO device and adress are selected by bit 27 to 23 and bit 22 to 18 respectively. This is a total of 10 bits that need to be used by the Phy chip/address decoding function. This patch fixes the number of bits used from 9 to 10. Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net> Signed-off-by: Jason Wang <jasowang@redhat.com>
| * i.MX: Fix FEC code for MDIO operation selectionJean-Christophe Dubois2016-06-021-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | According to the FEC chapter of i.MX25 reference manual When writing the MMFR register, bit 29 and 28 select the requested operation. * 10 means read operation with valid MII mgmt frame * 11 means read operation with non compliant MII mgmt frame * 01 means write operation with valid MII mgmt frame * 00 means write operation with non compliant MII mgmt frame So while bit 28 does change beween read/write for valid MII mgmt frame, the mening is inverted for non compliant MII mgmt frame. Bit 29 on the other hand means read/write whatever the type of mgmt frame involved. So this patch change the operation selection from bit 28 to bit 29 as it is more generic. Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net> Signed-off-by: Jason Wang <jasowang@redhat.com>
| * net: Introduce e1000e device emulationDmitry Fleytman2016-06-025-0/+4366
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces emulation for the Intel 82574 adapter, AKA e1000e. This implementation is derived from the e1000 emulation code, and utilizes the TX/RX packet abstractions that were initially developed for the vmxnet3 device. Although some parts of the introduced code may be shared with e1000, the differences are substantial enough so that the only shared resources for the two devices are the definitions in hw/net/e1000_regs.h. Similarly to vmxnet3, the new device uses virtio headers for task offloads (for backends that support virtio extensions). Usage of virtio headers may be forcibly disabled via a boolean device property "vnet" (which is enabled by default). In such case task offloads will be performed in software, in the same way it is done on backends that do not support virtio headers. The device code is split into two parts: 1. hw/net/e1000e.c: QEMU-specific code for a network device; 2. hw/net/e1000e_core.[hc]: Device emulation according to the spec. The new device name is e1000e. Intel specifications for the 82574 controller are available at: http://www.intel.com/content/dam/doc/datasheet/82574l-gbe-controller-datasheet.pdf Throughput measurement results (iperf2): Fedora 22 guest, TCP, RX 4 ++------------------------------------------+ | | | X X X X X 3.5 ++ X X X X | | X | | | 3 ++ | G | X | b | | / 2.5 ++ | s | | | | 2 ++ | | | | | 1.5 X+ | | | + + + + + + + + + + + + 1 ++--+---+---+---+---+---+---+---+---+---+---+ 32 64 128 256 512 1 2 4 8 16 32 64 B B B B B KB KB KB KB KB KB KB Buffer size Fedora 22 guest, TCP, TX 18 ++-------------------------------------------+ | X | 16 ++ X X X X X | X | 14 ++ | | | 12 ++ | G | X | b 10 ++ | / | | s 8 ++ | | | 6 ++ X | | | 4 ++ | | X | 2 ++ X | X + + + + + + + + + + + 0 ++--+---+---+---+---+----+---+---+---+---+---+ 32 64 128 256 512 1 2 4 8 16 32 64 B B B B B KB KB KB KB KB KB KB Buffer size Fedora 22 guest, UDP, RX 3 ++------------------------------------------+ | X | | 2.5 ++ | | | | | 2 ++ X | G | | b | | / 1.5 ++ | s | X | | | 1 ++ | | | | X | 0.5 ++ | | X | X + + + + + 0 ++-------+--------+-------+--------+--------+ 32 64 128 256 512 1 B B B B B KB Datagram size Fedora 22 guest, UDP, TX 1 ++------------------------------------------+ | X 0.9 ++ | | | 0.8 ++ | 0.7 ++ | | | G 0.6 ++ | b | | / 0.5 ++ | s | X | 0.4 ++ | | | 0.3 ++ | 0.2 ++ X | | | 0.1 ++ X | X X + + + + 0 ++-------+--------+-------+--------+--------+ 32 64 128 256 512 1 B B B B B KB Datagram size Windows 2012R2 guest, TCP, RX 3.2 ++------------------------------------------+ | X | 3 ++ | | | 2.8 ++ | | | 2.6 ++ X | G | X X X X X b 2.4 ++ X X | / | | s 2.2 ++ | | | 2 ++ | | X X | 1.8 ++ | | | 1.6 X+ | + + + + + + + + + + + + 1.4 ++--+---+---+---+---+---+---+---+---+---+---+ 32 64 128 256 512 1 2 4 8 16 32 64 B B B B B KB KB KB KB KB KB KB Buffer size Windows 2012R2 guest, TCP, TX 14 ++-------------------------------------------+ | | | X X 12 ++ | | | 10 ++ | | | G | | b 8 ++ | / | X | s 6 ++ | | | | | 4 ++ X | | | 2 ++ | | X X X | + X X + + X X + + + + + 0 X+--+---+---+---+---+----+---+---+---+---+---+ 32 64 128 256 512 1 2 4 8 16 32 64 B B B B B KB KB KB KB KB KB KB Buffer size Windows 2012R2 guest, UDP, RX 1.6 ++------------------------------------------X | | 1.4 ++ | | | 1.2 ++ | | X | | | G 1 ++ | b | | / 0.8 ++ | s | | 0.6 ++ X | | | 0.4 ++ | | X | | | 0.2 ++ X | X + + + + + 0 ++-------+--------+-------+--------+--------+ 32 64 128 256 512 1 B B B B B KB Datagram size Windows 2012R2 guest, UDP, TX 0.6 ++------------------------------------------+ | X | | 0.5 ++ | | | | | 0.4 ++ | G | | b | | / 0.3 ++ X | s | | | | 0.2 ++ | | | | X | 0.1 ++ | | X | X X + + + + 0 ++-------+--------+-------+--------+--------+ 32 64 128 256 512 1 B B B B B KB Datagram size Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com> Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
| * e1000: Move out code that will be reused in e1000eDmitry Fleytman2016-06-024-320/+573
| | | | | | | | | | | | | | | | | | Code that will be shared moved to a separate files. Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com> Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
| * e1000_regs: Add definitions for Intel 82574-specific bitsDmitry Fleytman2016-06-021-3/+342
| | | | | | | | | | | | | | Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com> Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
| * vmxnet3: Use pci_dma_* API instead of cpu_physical_memory_*Dmitry Fleytman2016-06-023-28/+44
| | | | | | | | | | | | | | | | | | | | To make this device and network packets abstractions ready for IOMMU. Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com> Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
| * net_pkt: Extend packet abstraction as required by e1000e functionalityDmitry Fleytman2016-06-024-117/+813
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch extends the TX/RX packet abstractions with features that will be used by the e1000e device implementation. Changes are: 1. Support iovec lists for RX buffers 2. Deeper RX packets parsing 3. Loopback option for TX packets 4. Extended VLAN headers handling 5. RSS processing for RX packets Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com> Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
| * rtl8139: Move more TCP definitions to common headerDmitry Fleytman2016-06-021-5/+0Star
| | | | | | | | | | | | | | Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com> Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
| * net_pkt: Name vmxnet3 packet abstractions more genericDmitry Fleytman2016-06-026-196/+196
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch drops "vmx" prefix from packet abstractions names to emphasize the fact they are generic and not tied to any specific network device. These abstractions will be reused by e1000e emulation implementation introduced by following patches so their names need generalization. This patch (except renamed files, adjusted comments and changes in MAINTAINTERS) was produced by: git grep -lz 'vmxnet_tx_pkt' | xargs -0 perl -i'' -pE "s/vmxnet_tx_pkt/net_tx_pkt/g" git grep -lz 'vmxnet_rx_pkt' | xargs -0 perl -i'' -pE "s/vmxnet_rx_pkt/net_rx_pkt/g" git grep -lz 'VmxnetTxPkt' | xargs -0 perl -i'' -pE "s/VmxnetTxPkt/NetTxPkt/g" git grep -lz 'VMXNET_TX_PKT' | xargs -0 perl -i'' -pE "s/VMXNET_TX_PKT/NET_TX_PKT/g" git grep -lz 'VmxnetRxPkt' | xargs -0 perl -i'' -pE "s/VmxnetRxPkt/NetRxPkt/g" git grep -lz 'VMXNET_RX_PKT' | xargs -0 perl -i'' -pE "s/VMXNET_RX_PKT/NET_RX_PKT/g" sed -ie 's/VMXNET_/NET_/g' hw/net/vmxnet_rx_pkt.c sed -ie 's/VMXNET_/NET_/g' hw/net/vmxnet_tx_pkt.c Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com> Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
| * vmxnet3: Use common MAC address tracing macrosDmitry Fleytman2016-06-022-7/+4Star
| | | | | | | | | | | | | | Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com> Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
| * vmxnet3: Use generic function for DSN capability definitionDmitry Fleytman2016-06-021-7/+5Star
| | | | | | | | | | | | | | Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com> Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
| * pcie: Introduce function for DSN capability creationDmitry Fleytman2016-06-021-0/+10
| | | | | | | | | | | | | | Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com> Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
| * pcie: Add support for PCIe CAP v1Dmitry Fleytman2016-06-021-18/+66
| | | | | | | | | | | | | | | | | | | | Added support for PCIe CAP v1, while reusing some of the existing v2 infrastructure. Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com> Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
| * msix: make msix_clr_pending() visible for clientsDmitry Fleytman2016-06-021-1/+1
| | | | | | | | | | | | | | | | | | This function will be used by e1000e device code. Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com> Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
| * net: mipsnet: check packet length against bufferPrasad J Pandit2016-05-251-0/+3
| | | | | | | | | | | | | | | | | | | | | | When receiving packets over MIPSnet network device, it uses receive buffer of size 1514 bytes. In case the controller accepts large(MTU) packets, it could lead to memory corruption. Add check to avoid it. Reported by: Oleksandr Bazhaniuk <oleksandr.bazhaniuk@intel.com> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Signed-off-by: Jason Wang <jasowang@redhat.com>
* | Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.7-20160531' ↵Peter Maydell2016-05-311-3/+11
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | into staging ppc patch queue for 2016-05-31 Here's another ppc patch queue. This batch is all preliminaries towards two significant features: 1) Full hypervisor-mode support for POWER8 Patches 1-8 start fixing various bugs with TCG's handling of hypervisor mode 2) CPU hotplug support Patches 9-12 make some preliminary fixes towards implementing CPU hotplug on ppc64 (and other non-x86 platforms). These patches are actually to generic code, not ppc, but are included here with Paolo's ACK. # gpg: Signature made Tue 31 May 2016 01:39:44 BST using RSA key ID 20D9B392 # gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" # gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" # gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392 * remotes/dgibson/tags/ppc-for-2.7-20160531: cpu: Add a sync version of cpu_remove() cpu: Reclaim vCPU objects exec: Do vmstate unregistration from cpu_exec_exit() exec: Remove cpu from cpus list during cpu_exec_exit() ppc: Add PPC_64H instruction flag to POWER7 and POWER8 ppc: Get out of emulation on SMT "OR" ops ppc: Fix sign extension issue in mtmsr(d) emulation ppc: Change 'invalid' bit mask of tlbiel and tlbie ppc: tlbie, tlbia and tlbisync are HV only ppc: Do some batching of TCG tlb flushes ppc: Use split I/D mmu modes to avoid flushes on interrupts ppc: Remove MMU_MODEn_SUFFIX definitions Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | ppc: Do some batching of TCG tlb flushesBenjamin Herrenschmidt2016-05-301-3/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On ppc64 especially, we flush the tlb on any slbie or tlbie instruction. However, those instructions often come in bursts of 3 or more (context switch will favor a series of slbie's for example to an slbia if the SLB has less than a certain number of entries in it, and tlbie's can happen in a series, with PAPR, H_BULK_REMOVE can remove up to 4 entries at a time. Doing a tlb_flush() each time is a waste of time. We end up doing a memset of the whole TLB, reloading it for the next instruction, memset'ing again, etc... Those instructions don't have to take effect immediately. For slbie, they can wait for the next context synchronizing event. For tlbie, the next tlbsync. This implements batching by keeping a flag that indicates that we have a TLB in need of flushing. We check it on interrupts, rfi's, isync's and tlbsync and flush the TLB if needed. This reduces the number of tlb_flush() on a boot to a ubuntu installer first dialog screen from roughly 360K down to 36K. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [clg: added a 'CPUPPCState *' variable in h_remove() and h_bulk_remove() ] Signed-off-by: Cédric Le Goater <clg@kaod.org> [dwg: removed spurious whitespace change, use 0/1 not true/false consistently, since tlb_need_flush has int type] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* | | memory: split memory_region_from_host from qemu_ram_addr_from_hostPaolo Bonzini2016-05-291-9/+7Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the old qemu_ram_addr_from_host to memory_region_from_host and make it return an offset within the region. For qemu_ram_addr_from_host return the ram_addr_t directly, similar to what it was before commit 1b5ec23 ("memory: return MemoryRegion from qemu_ram_addr_from_host", 2013-07-04). Reviewed-by: Marc-André Lureau <marcandre.lureau@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | | memory: remove qemu_get_ram_fd, qemu_set_ram_fd, qemu_ram_block_host_ptrPaolo Bonzini2016-05-292-11/+9Star
| | | | | | | | | | | | | | | | | | | | | | | | Remove direct uses of ram_addr_t and optimize memory_region_{get,set}_fd now that a MemoryRegion knows its RAMBlock directly. Reviewed-by: Marc-André Lureau <marcandre.lureau@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | | scsi-generic: Merge block max xfer len in INQUIRY responseFam Zheng2016-05-291-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The rationale is similar to the above mode sense response interception: this is practically the only channel to communicate restraints from elsewhere such as host and block driver. The scsi bus we attach onto can have a larger max xfer len than what is accepted by the host file system (guarding between the host scsi LUN and QEMU), in which case the SG_IO we generate would get -EINVAL. Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <1464243305-10661-3-git-send-email-famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | | scsi-block: always use SG_IOPaolo Bonzini2016-05-291-18/+196
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Using pread/pwrite or io_submit has the advantage of eliminating the bounce buffer, but drops the SCSI status. This keeps the guest from seeing unit attention codes, as well as statuses such as RESERVATION CONFLICT. Because we know scsi-block operates on an SBC device we can still use the DMA helpers with SG_IO; just remember to patch the CDBs if the transfer is split into multiple segments. This means that scsi-block will always use the thread-pool unfortunately, instead of respecting aio=native. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | | scsi-disk: introduce scsi_disk_req_check_errorPaolo Bonzini2016-05-291-67/+22Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commonize all the checks for canceled requests and errors. The next patch will add another case to check for, in order to handle passthrough commands. There is no semantic change here; the only nontrivial modification is in scsi_write_do_fua, where cancellation has been checked earlier by both callers. Thus, the check is replaced with an assertion. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | | scsi-disk: add need_fua_emulation to SCSIDiskClassPaolo Bonzini2016-05-291-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | scsi-block will be able to do FUA just by passing the request through to the LUN (which is also more efficient); there is no need to emulate it like we do for scsi-disk. Add a new method to distinguish this. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | | scsi-disk: introduce dma_readv and dma_writevPaolo Bonzini2016-05-291-15/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These are replacements for blk_aio_readv and blk_aio_writev that allow customization of the data path. They reuse the DMA helpers' DMAIOFunc callback type, so that the same function can be used in either the QEMUSGList or the bounce-buffered case. This customization will be needed in the next patch to do zero-copy SG_IO on scsi-block. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | | scsi-disk: introduce a common base classPaolo Bonzini2016-05-291-14/+22
| | | | | | | | | | | | | | | | | | | | | | | | This will be the place to add DMAIOFuncs in the next patch. There are also a couple DeviceClass members that can be moved to the abstract class's initialization function. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | | bt: rewrite csrhci_write to avoid out-of-bounds writesPaolo Bonzini2016-05-291-21/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The usage of INT_MAX in this function confuses Coverity. I think the defect is bogus, however there is no protection against getting more than sizeof(s->inpkt) bytes from the character device backend. Rewrite the function to only fill in as much data as needed from buf into s->inpkt. The plen variable is replaced by a simple state machine and there is no need anymore to shift contents to the beginning of s->inpkt. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | | scsi: megasas: check 'read_queue_head' index valuePrasad J Pandit2016-05-291-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While doing MegaRAID SAS controller command frame lookup, routine 'megasas_lookup_frame' uses 'read_queue_head' value as an index into 'frames[MEGASAS_MAX_FRAMES=2048]' array. Limit its value within array bounds to avoid any OOB access. Reported-by: Li Qiang <liqiang6-s@360.cn> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Message-Id: <1464179110-18593-1-git-send-email-ppandit@redhat.com> Reviewed-by: Alexander Graf <agraf@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | | scsi: megasas: initialise local configuration data bufferPrasad J Pandit2016-05-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When reading MegaRAID SAS controller configuration via MegaRAID Firmware Interface(MFI) commands, routine megasas_dcmd_cfg_read uses an uninitialised local data buffer. Initialise this buffer to avoid stack information leakage. Reported-by: Li Qiang <liqiang6-s@360.cn> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Message-Id: <1464178304-12831-1-git-send-email-ppandit@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | | scsi: megasas: use appropriate property buffer sizePrasad J Pandit2016-05-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When setting MegaRAID SAS controller properties via MegaRAID Firmware Interface(MFI) commands, a user supplied size parameter is used to set property value. Use appropriate size value to avoid OOB access issues. Reported-by: Li Qiang <liqiang6-s@360.cn> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Message-Id: <1464172291-2856-2-git-send-email-ppandit@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | | scsi: mptsas: infinite loop while fetching requestsPrasad J Pandit2016-05-291-5/+4Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The LSI SAS1068 Host Bus Adapter emulator in Qemu, periodically looks for requests and fetches them. A loop doing that in mptsas_fetch_requests() could run infinitely if 's->state' was not operational. Move check to avoid such a loop. Reported-by: Li Qiang <liqiang6-s@360.cn> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Cc: qemu-stable@nongnu.org Message-Id: <1464077264-25473-1-git-send-email-ppandit@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | | scsi: pvscsi: check command descriptor ring buffer size (CVE-2016-4952)Prasad J Pandit2016-05-291-4/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Vmware Paravirtual SCSI emulation uses command descriptors to process SCSI commands. These descriptors come with their ring buffers. A guest could set the ring buffer size to an arbitrary value leading to OOB access issue. Add check to avoid it. Reported-by: Li Qiang <liqiang6-s@360.cn> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Cc: qemu-stable@nongnu.org Message-Id: <1464000485-27041-1-git-send-email-ppandit@redhat.com> Reviewed-by: Shmulik Ladkani <shmulik.ladkani@ravellosystems.com> Reviewed-by: Dmitry Fleytman <dmitry@daynix.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | | hw/char: QOM'ify milkymist-uart.cxiaoqiang zhao2016-05-293-6/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | drop the qemu_char_get_next_serial and use chardev prop instead Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com> Message-Id: <1464158344-12266-6-git-send-email-zxq_yx_007@163.com> Tested-by: Michael Walle <michael@walle.cc> Acked-by: Michael Walle <michael@walle.cc> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | | hw/char: QOM'ify lm32_uart.cxiaoqiang zhao2016-05-293-13/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Drop the old SysBus init function and use instance_init * Call qemu_chr_add_handlers in the realize callback * Use qdev chardev prop instead of qemu_char_get_next_serial * Add lm32_uart_create function to create lm32 uart device Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com> Message-Id: <1464158344-12266-5-git-send-email-zxq_yx_007@163.com> Tested-by: Michael Walle <michael@walle.cc> Acked-by: Michael Walle <michael@walle.cc> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | | hw/char: QOM'ify lm32_juart.cxiaoqiang zhao2016-05-294-13/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Drop the old SysBus init function * Call qemu_chr_add_handlers in the realize callback * Use qdev chardev prop instead of qemu_char_get_next_serial Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com> Message-Id: <1464158344-12266-4-git-send-email-zxq_yx_007@163.com> Tested-by: Michael Walle <michael@walle.cc> Acked-by: Michael Walle <michael@walle.cc> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | | hw/char: QOM'ify etraxfs_ser.cxiaoqiang zhao2016-05-292-12/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Drop the old SysBus init function and use instance_init * Call qemu_chr_add_handlers in the realize callback * Use qdev chardev prop instead of qemu_char_get_next_serial * Add etraxfs_ser_create function to create etraxfs serial device Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com> Message-Id: <1464158344-12266-3-git-send-email-zxq_yx_007@163.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | | hw/char: QOM'ify escc.cxiaoqiang zhao2016-05-291-11/+19
|/ / | | | | | | | | | | | | | | | | * Drop the old SysBus init function and use instance_init * Call qemu_chr_add_handlers in the realize callback Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com> Message-Id: <1464158344-12266-2-git-send-email-zxq_yx_007@163.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | spapr_iommu: Move table allocation to helpersAlexey Kardashevskiy2016-05-271-19/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At the moment presence of vfio-pci devices on a bus affect the way the guest view table is allocated. If there is no vfio-pci on a PHB and the host kernel supports KVM acceleration of H_PUT_TCE, a table is allocated in KVM. However, if there is vfio-pci and we do yet not KVM acceleration for these, the table has to be allocated by the userspace. At the moment the table is allocated once at boot time but next patches will reallocate it. This moves kvmppc_create_spapr_tce/g_malloc0 and their counterparts to helpers. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* | spapr_pci: Use correct DMA LIOBN when composing the device treeAlexey Kardashevskiy2016-05-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | The user could have picked LIOBN via the CLI but the device tree rendering code would still use the value derived from the PHB index (which is the default fallback if LIOBN is not set in the CLI). This replaces SPAPR_PCI_LIOBN() with the actual DMA LIOBN value. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* | spapr: ensure device trees are always associated with DRCJianjun Duan2016-05-272-17/+11Star
| | | | | | | | | | | | | | | | | | | | | | | | There are possible racing situations involving hotplug events and guest migration. For cases where a hotplug event is migrated, or the guest is in the process of fetching device tree at the time of migration, we need to ensure the device tree is created and associated with the corresponding DRC for devices that were hotplugged on the source, but 'coldplugged' on the target. Signed-off-by: Jianjun Duan <duanj@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* | Added negative check for get_image_size()Zhou Jie2016-05-271-0/+4
| | | | | | | | | | | | | | | | This patch adds check for negative return value from get_image_size(), where it is missing. It avoids unnecessary two function calls. Signed-off-by: Zhou Jie <zhoujie2011@cn.fujitsu.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* | hw/net/spapr_llan: Provide counter with dropped rx frames to the guestThomas Huth2016-05-271-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | The last 8 bytes of the receive buffer list page (that has been supplied by the guest with the H_REGISTER_LOGICAL_LAN call) contain a counter for frames that have been dropped because there was no suitable receive buffer available. This patch introduces code to use this field to provide the information about dropped rx packets to the guest. There it can be queried with "ethtool -S eth0 | grep rx_no_buffer". Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* | hw/net/spapr_llan: Delay flushing of the RX queue while adding new RX buffersThomas Huth2016-05-271-3/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, the spapr-vlan device is trying to flush the RX queue after each RX buffer that has been added by the guest via the H_ADD_LOGICAL_LAN_BUFFER hypercall. In case the receive buffer pool was empty before, we only pass single packets to the guest this way. This can cause very bad performance if a sender is trying to stream fragmented UDP packets to the guest. For example when using the UDP_STREAM test from netperf with UDP packets that are much bigger than the MTU size, almost all UDP packets are dropped in the guest since the chances are quite high that at least one of the fragments got lost on the way. When flushing the receive queue, it's much better if we'd have a bunch of receive buffers available already, so that fragmented packets can be passed to the guest in one go. To do this, the spapr_vlan_receive() function should return 0 instead of -1 if there are no more receive buffers available, so that receive_disabled = 1 gets temporarily set for the receive queue, and we have to delay the queue flushing at the end of h_add_logical_lan_buffer() a little bit by using a timer, so that the guest gets a chance to add multiple RX buffers before we flush the queue again. This improves the UDP_STREAM test with the spapr-vlan device a lot: Running netserver -p 44444 -L <guestip> -f -D -4 in the guest, and netperf -p 44444 -L <hostip> -H <guestip> -t UDP_STREAM -l 60 -- -m 16384 in the host, I get the following values _without_ this patch: Socket Message Elapsed Messages Size Size Time Okay Errors Throughput bytes bytes secs # # 10^6bits/sec 229376 16384 60.00 1738970 0 3798.83 229376 60.00 23 0.05 That "0.05" means that almost all UDP packets got lost/discarded at the receiving side. With this patch applied, the value look much better: Socket Message Elapsed Messages Size Size Time Okay Errors Throughput bytes bytes secs # # 10^6bits/sec 229376 16384 60.00 1789104 0 3908.35 229376 60.00 22818 49.85 Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* | Merge remote-tracking branch 'remotes/awilliam/tags/vfio-update-20160526.1' ↵Peter Maydell2016-05-265-73/+871
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | into staging VFIO updates 2016-05-26 - Infrastructure and quirks to support IGD assignment (Alex Williamson) - Fixes to 128bit handling, IOMMU replay, IOMMU translation sanity checking (Alexey Kardashevskiy) # gpg: Signature made Thu 26 May 2016 18:50:29 BST using RSA key ID 3BB08B22 # gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>" # gpg: aka "Alex Williamson <alex@shazbot.org>" # gpg: aka "Alex Williamson <alwillia@redhat.com>" # gpg: aka "Alex Williamson <alex.l.williamson@gmail.com>" * remotes/awilliam/tags/vfio-update-20160526.1: vfio: Check that IOMMU MR translates to system address space memory: Fix IOMMU replay base address vfio: Fix 128 bit handling when deleting region vfio/pci: Add IGD documentation vfio/pci: Add a separate option for IGD OpRegion support vfio/pci: Intel graphics legacy mode assignment vfio/pci: Setup BAR quirks after capabilities probing vfio/pci: Consolidate VGA setup vfio/pci: Fix return of vfio_populate_vga() vfio: Create device specific region info helper vfio: Enable sparse mmap capability Signed-off-by: Peter Maydell <peter.maydell@linaro.org>