summaryrefslogtreecommitdiffstats
path: root/drivers/dma/ioat
Commit message (Collapse)AuthorAgeFilesLines
* ioat: cleanup ->timer_fn() and ->cleanup_fn() prototypesDan Williams2010-03-045-67/+40Star
| | | | | | | | | | | If the calling convention of ->timer_fn() and ->cleanup_fn() are unified across hardware versions we can drop parameters to ioat_init_channel() and unify ioat_is_dma_complete() implementations. Both ->timer_fn() and ->cleanup_fn() are modified to expect a struct dma_chan pointer. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* ioat3: interrupt coalescingDan Williams2010-03-042-5/+34
| | | | | | | | | | | | | The hardware automatically disables further interrupts after each event until rearmed. This allows a delay to be injected between the occurence of the interrupt and the running of the cleanup routine. The delay is scaled by the descriptor backlog and then written to the INTRDELAY register which specifies the number of microseconds to hold off interrupt delivery after an interrupt event occurs. According to powertop this reduces the interrupt rate from ~5000 intr/s to ~150 intr/s per without affecting throughput (simple dd to a raid6 array). Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* ioat: close potential BUG_ON race in the descriptor cleanup pathDan Williams2010-03-042-2/+2
| | | | | | | | | Since ioat_cleanup_preamble() and the update of the last completed descriptor are not synchronized there is a chance that two cleanup threads can see descriptors to clean. If the first cleans up all pending descriptors then the second will trigger the BUG_ON. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* ioat2: kill pending flagDan Williams2010-03-032-24/+12Star
| | | | | | | | The pending == 2 case no longer exists in the driver so, we can use ioat2_ring_pending() outside the lock to determine if there might be any descriptors in the ring that the hardware has not seen. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* ioat3: use ioat2_quiesce()Dan Williams2010-03-031-9/+1Star
| | | | | | Replace open coded ioat2_quiesce() call in ioat3_restart_channel Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* ioat3: cleanup, don't enable DCA completion writesDan Williams2010-03-031-2/+1Star
| | | | | | | We already disallow raid operations while DCA is globally enabled, so having it locally enabled is a nop and confusing when reading the code. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* ioat2,3: put channel hardware in known state at initDan Williams2009-12-196-32/+114
| | | | | | | | | | | | | | Put the ioat2 and ioat3 state machines in the halted state with all errors cleared. The ioat1 init path is not disturbed for stability, there are no reported ioat1 initiaization issues. Cc: <stable@kernel.org> Reported-by: Roland Dreier <rdreier@cisco.com> Tested-by: Roland Dreier <rdreier@cisco.com> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* ioat3: fix p-disabled q-continuationDan Williams2009-12-171-2/+4
| | | | | | | | | | | | | | | When continuing a pq calculation the driver needs 3 extra sources. The driver can perform a 3 source calculation with a single descriptor, but needs an extended descriptor to process up to 8 sources in one operation. However, in the p-disabled case only one extra source is needed. When continuing a p-disabled operation there are occasions (i.e. 0 < src_cnt % 8 < 3) where the tail operation does not need an extended descriptor. Properly account for this fact otherwise invalid 'dmacount' values will be written to hardware usually causing the channel to halt with 'invalid descriptor' errors. Cc: <stable@kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* ioat3: fix pq completion versus channel deallocation raceDan Williams2009-11-201-2/+2
| | | | | | | | | | | The completion of a pq operation is notified with a null descriptor appended to the end of the chain. This descriptor needs to be visible to dma clients otherwise the client is precluded from ensuring all operations are quiesced before freeing channel resources, i.e. due to descriptor polling it may get the completion notification ahead of the interrupt delivered by the null descriptor. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* async_tx: build-time toggling of async_{syndrome,xor}_val dma supportDan Williams2009-11-201-0/+10
| | | | | | | | | | | | | ioat3.2 does not support asynchronous error notifications which makes the driver experience latencies when non-zero pq validate results are expected. Provide a mechanism for turning off async_xor_val and async_syndrome_val via Kconfig. This approach is generally useful for any driver that specifies ASYNC_TX_DISABLE_CHANNEL_SWITCH and would like to force the async_tx api to fall back to the synchronous path for certain operations. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* ioat2,3: report all uncorrectable errorsDan Williams2009-11-203-3/+5
| | | | | | | Modify is_ioat_bug() to catch all errors that are uncorrectable, or not currently handled. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* ioat3: specify valid address for disabled-Q or disabled-PDan Williams2009-11-201-5/+17
| | | | | | | Although disabled, hardware still checks address validity, so duplicate the known address. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* ioat2,3: disable asynchronous error notificationsDan Williams2009-11-201-3/+1Star
| | | | | | | Error interrupts and error completions may cause channel hangs, so poll the channel status register after a timeout. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* ioat3: dca and raid operations are incompatibleDan Williams2009-11-203-1/+9
| | | | | | | | RAID operations cause a system hang on platforms with DCA (Direct-Cache-Access) enabled. So turn off RAID capabilities in this case. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* ioat: silence "dca disabled" messagesDan Williams2009-11-171-2/+2
| | | | | | | Turning off dca is not an "error", and the dca-enabled state can be viewed from sysfs. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* ioat3: fix uninitialized var warningsDan Williams2009-09-211-6/+9
| | | | | | | | | | | | | | | | | | drivers/dma/ioat/dma_v3.c: In function 'ioat3_prep_memset_lock': drivers/dma/ioat/dma_v3.c:439: warning: 'fill' may be used uninitialized in this function drivers/dma/ioat/dma_v3.c:437: warning: 'desc' may be used uninitialized in this function drivers/dma/ioat/dma_v3.c: In function '__ioat3_prep_xor_lock': drivers/dma/ioat/dma_v3.c:489: warning: 'xor' may be used uninitialized in this function drivers/dma/ioat/dma_v3.c:486: warning: 'desc' may be used uninitialized in this function drivers/dma/ioat/dma_v3.c: In function '__ioat3_prep_pq_lock': drivers/dma/ioat/dma_v3.c:631: warning: 'pq' may be used uninitialized in this function drivers/dma/ioat/dma_v3.c:628: warning: 'desc' may be used uninitialized in this function gcc-4.0, unlike gcc-4.3, does not see that these variables are initialized before use. Convert the descriptor loops to do-while make this initialization apparent. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* drivers/dma/ioat/dma_v2.c: fix warningsAndrew Morton2009-09-211-2/+3
| | | | | | | | | | drivers/dma/ioat/dma_v2.c: In function 'ioat2_dma_prep_memcpy_lock': drivers/dma/ioat/dma_v2.c:680: warning: 'hw' may be used uninitialized in this function drivers/dma/ioat/dma_v2.c:681: warning: 'desc' may be used uninitialized in this function Cc: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* ioat2: clarify ring size limitsDan Williams2009-09-171-3/+4
| | | | | | | | | With the addition of ioat_max_alloc_order it is not clear what the maximum allocation order is, so document that in the modinfo. Also take an opportunity to kill a stray semicolon. Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* ioat: driver version 4.0Dan Williams2009-09-101-1/+1
| | | | | | | | A new ring implementation and the addition of raid functionality constitutes a bump in the driver major version number. Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* dca: registering requesters in multiple dca domainsMaciej Sosnowski2009-09-101-1/+1
| | | | | | | | | | | | | This patch enables DCA support on multiple-IOH/multiple-IIO architectures. It modifies dca module by replacing single dca_providers list with dca_domains list, each domain containing separate list of providers. This approach lets dca driver manage multiple domains, i.e. sets of providers and requesters mapped back to the same PCI root complex device. The driver takes care to register each requester to a provider from the same domain. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
* Merge branch 'dmaengine' into async-tx-nextDan Williams2009-09-095-8/+26
|\ | | | | | | | | | | | | | | Conflicts: crypto/async_tx/async_xor.c drivers/dma/ioat/dma_v2.h drivers/dma/ioat/pci.c drivers/md/raid5.c
| * ioat2,3: cacheline align software descriptor allocationsDan Williams2009-09-093-4/+20
| | | | | | | | | | | | | | | | | | All the necessary fields for handling an ioat2,3 ring entry can fit into one cacheline. Move ->len prior to ->txd in struct ioat_ring_ent, and move allocation of these entries to a hw-cache-aligned kmem cache to reduce the number of cachelines dirtied for descriptor management. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| * ioat: implement a private tx_listDan Williams2009-09-092-4/+6
| | | | | | | | | | | | | | | | | | Drop ioatdma's use of tx_list from struct dma_async_tx_descriptor in preparation for removal of this field. Cc: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* | I/OAT: Convert to PCI_VDEVICE()Roland Dreier2009-09-091-23/+23
| | | | | | | | | | | | | | | | Trivial cleanup to make the PCI ID table easier to read. [dan.j.williams@intel.com: extended to v3.2 devices] Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* | Add MODULE_DEVICE_TABLE() so ioatdma module is autoloadedRoland Dreier2009-09-091-0/+1
| | | | | | | | | | | | | | | | | | The ioatdma module is missing aliases for the PCI devices it supports, so it is not autoloaded on boot. Add a MODULE_DEVICE_TABLE() to get these aliases. Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* | ioat3: segregate raid enginesDan Williams2009-09-093-9/+22
| | | | | | | | | | | | | | | | | | The cleanup routine for the raid cases imposes extra checks for handling raid descriptors and extended descriptors. If the channel does not support raid it can avoid this extra overhead by using the ioat2 cleanup path. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* | ioat3: ioat3.2 pci ids for Jasper ForestTom Picard2009-09-091-0/+13
| | | | | | | | | | | | | | | | | | | | Jasper Forest introduces raid offload support via ioat3.2 support. When raid offload is enabled two (out of 8 channels) will report raid5/raid6 offload capabilities. The remaining channels will only report ioat3.0 capabilities (memcpy). Signed-off-by: Tom Picard <tom.s.picard@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* | ioat3: interrupt descriptor supportDan Williams2009-09-091-1/+38
| | | | | | | | | | | | | | The async_tx api uses the DMA_INTERRUPT operation type to terminate a chain of issued operations with a callback routine. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* | ioat3: support xor via pq descriptorsDan Williams2009-09-091-0/+49
| | | | | | | | | | | | | | If a platform advertises pq capabilities, but not xor, then use ioat3_prep_pqxor and ioat3_prep_pqxor_val to simulate xor support. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* | ioat3: pq supportDan Williams2009-09-091-1/+264
| | | | | | | | | | | | | | | | | | ioat3.2 adds support for raid6 syndrome generation (xor sum of galois field multiplication products) using up to 8 sources. It can also perform an pq-zero-sum operation to validate whether the syndrome for a given set of sources matches a previously computed syndrome. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* | ioat3: xor self testDan Williams2009-09-094-3/+282
| | | | | | | | | | | | | | | | This adds a hardware specific self test to be called from ioat_probe. In the ioat3 case we will have tests for all the different raid operations, while ioat1 and ioat2 will continue to just test memcpy. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* | ioat3: xor supportDan Williams2009-09-094-3/+222
| | | | | | | | | | | | | | | | | | | | | | | | ioat3.2 adds xor offload support for up to 8 sources. It can also perform an xor-zero-sum operation to validate whether all given sources sum to zero, without writing to a destination. Xor descriptors differ from memcpy in that one operation may require multiple descriptors depending on the number of sources. When the number of sources exceeds 5 an extended descriptor is needed. These descriptors need to be accounted for when updating the DMA_COUNT register. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* | ioat3: enable dca for completion writesDan Williams2009-09-092-1/+3
| | | | | | | | | | | | | | Tag completion writes for direct cache access to reduce the latency of checking for descriptor completions. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* | ioat: add 'ioat' sysfs attributesDan Williams2009-09-096-6/+166
| | | | | | | | | | | | | | | | | | | | | | | | | | Export driver attributes for diagnostic purposes: 'ring_size': total number of descriptors available to the engine 'ring_active': number of descriptors in-flight 'capabilities': supported operation types for this channel 'version': Intel(R) QuickData specfication revision This also allows some chattiness to be removed from the driver startup as this information is now available via sysfs. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* | ioat3: split ioat3 support to its own file, add memsetDan Williams2009-09-097-84/+421
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Up until this point the driver for Intel(R) QuickData Technology engines, specification versions 2 and 3, were mostly identical save for a few quirks. Version 3.2 hardware adds many new capabilities (like raid offload support) requiring some infrastructure that is not relevant for v2. For better code organization of the new funcionality move v3 and v3.2 support to its own file dma_v3.c, and export some routines from the base files (dma.c and dma_v2.c) that can be reused directly. The first new capability included in this code reorganization is support for v3.2 memset operations. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* | ioat3: hardware version 3.2 register / descriptor definitionsDan Williams2009-09-094-2/+185
| | | | | | | | | | | | | | ioat3.2 adds raid5 and raid6 offload capabilities. Signed-off-by: Tom Picard <tom.s.picard@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* | ioat2+: add fence supportDan Williams2009-09-091-0/+1
|/ | | | | | | | | | | | | | | In preparation for adding more operation types to the ioat3 path the driver needs to honor the DMA_PREP_FENCE flag. For example the async_tx api will hand xor->memcpy->xor chains to the driver with the 'fence' flag set on the first xor and the memcpy operation. This flag in turn sets the 'fence' flag in the descriptor control field telling the hardware that future descriptors in the chain depend on the result of the current descriptor, so wait for all writes to complete before starting the next operation. Note that ioat1 does not prefetch the descriptor chain, so does not require/support fenced operations. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* ioat2,3: dynamically resize descriptor ringDan Williams2009-09-093-31/+187
| | | | | | | | | | | | Increment the allocation order of the descriptor ring every time we run out of descriptors up to a maximum of allocation order specified by the module parameter 'ioat_max_alloc_order'. After each idle period decrement the allocation order to a minimum order of 'ioat_ring_alloc_order' (i.e. the default ring size, tunable as a module parameter). Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* ioat: switch watchdog and reset handler from workqueue to timerDan Williams2009-09-095-428/+388Star
| | | | | | | | | | | | | | | | | | In order to support dynamic resizing of the descriptor ring or polling for a descriptor in the presence of a hung channel the reset handler needs to make progress while in a non-preemptible context. The current workqueue implementation precludes polling channel reset completion under spin_lock(). This conversion also allows us to return to opportunistic cleanup in the ioat2 case as the timer implementation guarantees at least one cleanup after every descriptor is submitted. This means the worst case completion latency becomes the timer frequency (for exceptional circumstances), but with the benefit of avoiding busy waiting when the lock is contended. Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* ioat1: trim ioat_dma_desc_swDan Williams2009-09-093-5/+7
| | | | | | | | Save 4 bytes per software descriptor by transmitting tx_cnt in an unused portion of the hardware descriptor. Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* ioat: ___devinit annotate the initialization pathsDan Williams2009-09-095-20/+24
| | | | | | | Mark all single use initialization routines with __devinit. Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* ioat: preserve chanctrl bits when re-arming interruptsDan Williams2009-09-093-14/+10Star
| | | | | | | | | | | | | The register write in ioat_dma_cleanup_tasklet is unfortunate in two ways: 1/ It clears the extra 'enable' bits that we set at alloc_chan_resources time 2/ It gives the impression that it disables interrupts when it is in fact re-arming interrupts [ Impact: fix, persist the value of the chanctrl register when re-arming ] Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* ioat: ignore reserved bits for chancnt and xfercapDan Williams2009-09-092-0/+14
| | | | | | | | Don't trust that the reserved bits are always zero, also sanity check the returned value. Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* ioat: cleanup completion status readsDan Williams2009-09-094-75/+46Star
| | | | | | | | | | The cleanup path makes an effort to only perform an atomic read of the 64-bit completion address. However in the 32-bit case it does not matter if we read the upper-32 and lower-32 non-atomically because the upper-32 will always be zero. Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* ioat: add some dev_dbg() callsDan Williams2009-09-094-4/+81
| | | | | | | Provide some output for debugging the driver. Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* ioat1: kill unused unmap parametersDan Williams2009-09-092-4/+0Star
| | | | | | | | | The unified ioat1/ioat2 ioat_dma_unmap() implementation derives the source and dest addresses from the unmap descriptor. There is no longer a need to track this information in struct ioat_desc_sw. Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* ioat2,3: convert to a true ring bufferDan Williams2009-09-096-686/+1122
| | | | | | | | | | | | | | | | | | | | Replace the current linked list munged into a ring with a native ring buffer implementation. The benefit of this approach is reduced overhead as many parameters can be derived from ring position with simple pointer comparisons and descriptor allocation/freeing becomes just a manipulation of head/tail pointers. It requires a contiguous allocation for the software descriptor information. Since this arrangement is significantly different from the ioat1 chain, move ioat2,3 support into its own file and header. Common routines are exported from driver/dma/ioat/dma.[ch]. Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* ioat: prepare the code for ioat[12]_dma_chan splitDan Williams2009-09-092-370/+390
| | | | | | | | | | | | | | | | | | | | | | Prepare the code for the conversion of the ioat2 linked-list-ring into a native ring buffer. After this conversion ioat2 channels will share less of the ioat1 infrastructure, but there will still be places where sharing is possible. struct ioat_chan_common is created to house the channel attributes that will remain common between ioat1 and ioat2 channels. For every routine that accesses both common and hardware specific fields the old unified 'ioat_chan' pointer is split into an 'ioat' and 'chan' pointer. Where 'chan' references common fields and 'ioat' the hardware/version specific. [ Impact: pure structure member movement/variable renames, no logic changes ] Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* ioat: fix self test interruptsDan Williams2009-09-091-1/+2
| | | | | | | | | | | If a callback is to be attached to a descriptor the channel needs to know at ->prep time so it can set the interrupt enable bit. This is in preparation for moving descriptor ioat2 descriptor preparation from ->submit to ->prep. Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* ioat1: move descriptor allocation from submit to prepDan Williams2009-09-091-89/+65Star
| | | | | | | | | | | | | | The async_tx api assumes that after a successful ->prep a subsequent ->submit will not fail due to a lack of resources. This also fixes a bug in the allocation failure case. Previously the descriptors allocated prior to the allocation failure would not be returned to the free list. Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>