openslx/kernel-qcow2-linux.git - In-kernel qcow2 (Kernel part)

	Commit message (Collapse)	Author	Age	Files	Lines
*	[SCSI] libsas: fix panic when single phy is disabled on a wide port	Mark Salyzyn	2011-10-02	1	-4/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When a wide port is being utilized to a target, if one disables only one of the phys, we get an OS crash: BUG: unable to handle kernel NULL pointer dereference at 0000000000000238 IP: [<ffffffff814ca9b1>] mutex_lock+0x21/0x50 PGD 4103f5067 PUD 41dba9067 PMD 0 Oops: 0002 [#1] SMP last sysfs file: /sys/bus/pci/slots/5/address CPU 0 Modules linked in: pm8001(U) ses enclosure fuse nfsd exportfs autofs4 ipmi_devintf ipmi_si ipmi_msghandler nfs lockd fscache nfs_acl auth_rpcgss 8021q fcoe libfcoe garp libfc scsi_transport_fc stp scsi_tgt llc sunrpc cpufreq_ondemand acpi_cpufreq freq_table ipv6 sr_mod cdrom dm_mirror dm_region_hash dm_log uinput sg i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support e1000e mlx4_ib ib_mad ib_core mlx4_en mlx4_core ext3 jbd mbcache sd_mod crc_t10dif usb_storage ata_generic pata_acpi ata_piix libsas(U) scsi_transport_sas dm_mod [last unloaded: pm8001] Modules linked in: pm8001(U) ses enclosure fuse nfsd exportfs autofs4 ipmi_devintf ipmi_si ipmi_msghandler nfs lockd fscache nfs_acl auth_rpcgss 8021q fcoe libfcoe garp libfc scsi_transport_fc stp scsi_tgt llc sunrpc cpufreq_ondemand acpi_cpufreq freq_table ipv6 sr_mod cdrom dm_mirror dm_region_hash dm_log uinput sg i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support e1000e mlx4_ib ib_mad ib_core mlx4_en mlx4_core ext3 jbd mbcache sd_mod crc_t10dif usb_storage ata_generic pata_acpi ata_piix libsas(U) scsi_transport_sas dm_mod [last unloaded: pm8001] Pid: 5146, comm: scsi_wq_5 Not tainted 2.6.32-71.29.1.el6.lustre.7.x86_64 #1 Storage Server RIP: 0010:[<ffffffff814ca9b1>] [<ffffffff814ca9b1>] mutex_lock+0x21/0x50 RSP: 0018:ffff8803e4e33d30 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000238 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff8803e664c800 RDI: 0000000000000238 RBP: ffff8803e4e33d40 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000 R13: 0000000000000238 R14: ffff88041acb7200 R15: ffff88041c51ada0 FS: 0000000000000000(0000) GS:ffff880028200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 0000000000000238 CR3: 0000000410143000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process scsi_wq_5 (pid: 5146, threadinfo ffff8803e4e32000, task ffff8803e4e294a0) Stack: ffff8803e664c800 0000000000000000 ffff8803e4e33d70 ffffffffa001f06e <0> ffff8803e4e33d60 ffff88041c51ada0 ffff88041acb7200 ffff88041bc0aa00 <0> ffff8803e4e33d90 ffffffffa0032b6c 0000000000000014 ffff88041acb7200 Call Trace: [<ffffffffa001f06e>] sas_port_delete_phy+0x2e/0xa0 [scsi_transport_sas] [<ffffffffa0032b6c>] sas_unregister_devs_sas_addr+0xac/0xe0 [libsas] [<ffffffffa0034914>] sas_ex_revalidate_domain+0x204/0x330 [libsas] [<ffffffffa00307f0>] ? sas_revalidate_domain+0x0/0x90 [libsas] [<ffffffffa0030855>] sas_revalidate_domain+0x65/0x90 [libsas] [<ffffffff8108c7d0>] worker_thread+0x170/0x2a0 [<ffffffff81091ea0>] ? autoremove_wake_function+0x0/0x40 [<ffffffff8108c660>] ? worker_thread+0x0/0x2a0 [<ffffffff81091b36>] kthread+0x96/0xa0 [<ffffffff810141ca>] child_rip+0xa/0x20 [<ffffffff81091aa0>] ? kthread+0x0/0xa0 [<ffffffff810141c0>] ? child_rip+0x0/0x20 Code: ff ff 85 c0 75 ed eb d6 66 90 55 48 89 e5 48 83 ec 10 48 89 1c 24 4c 89 64 24 08 0f 1f 44 00 00 48 89 fb e8 92 f4 ff ff 48 89 df <f0> ff 0f 79 05 e8 25 00 00 00 65 48 8b 04 25 08 cc 00 00 48 2d RIP [<ffffffff814ca9b1>] mutex_lock+0x21/0x50 RSP <ffff8803e4e33d30> CR2: 0000000000000238 The following patch is admittedly a band-aid, and does not solve the root cause, but it still is a good candidate for hardening as a pointer check before reference. Signed-off-by: Mark Salyzyn <mark_salyzyn@us.xyratex.com> Tested-by: Jack Wang <jack_wang@usish.com> Cc: stable@kernel.org Signed-off-by: James Bottomley <JBottomley@Parallels.com>
*	[SCSI] qla2xxx: Fix crash in qla2x00_abort_all_cmds() on unload	Roland Dreier	2011-10-02	1	-4/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I hit a crash in qla2x00_abort_all_cmds() if the qla2xxx module is unloaded right after it is loaded. I debugged this down to the abort handling improperly treating a command of type SRB_ADISC_CMD as if it had a bsg_job to complete when that command actually uses the iocb_cmd part of the union. (I guess to hit this one has to unload the module while the async FC initialization is still in progress) It seems we should only look for a bsg_job if type is SRB_ELS_CMD_RPT, SRB_ELS_CMD_HST or SRB_CT_CMD, so switch the test to make that explicit. Signed-off-by: Roland Dreier <roland@purestorage.com> Acked-by: Chad Dupuis <chad.dupuis@qlogic.com> Cc: stable@kernel.org Signed-off-by: James Bottomley <JBottomley@Parallels.com>
*	Merge git://bedivere.hansenpartnership.com/git/scsi-rc-fixes-2.6	Linus Torvalds	2011-09-28	5	-3/+7
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* git://bedivere.hansenpartnership.com/git/scsi-rc-fixes-2.6: [SCSI] 3w-9xxx: fix iommu_iova leak [SCSI] cxgb3i: convert cdev->l2opt to use rcu to prevent NULL dereference [SCSI] scsi: qla4xxx needs libiscsi.o [SCSI] libsas: fix failure to revalidate domain for anything but the first expander child. [SCSI] aacraid: reset should disable MSI interrupt
\| *	[SCSI] 3w-9xxx: fix iommu_iova leak	James Bottomley	2011-09-26	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Following reports on the list, it looks like the 3e-9xxx driver will leak dma mappings every time we get a transient queueing error back from the card. This is because it maps the sg list in the routine that sends the command, but doesn't unmap again in the transient failure path (even though the command is sent back to the block layer). Fix by unmapping before returning the status. Reported-by: Chris Boot <bootc@bootc.net> Tested-by: Chris Boot <bootc@bootc.net> Acked-by: Adam Radford <aradford@gmail.com> Cc: stable@kernel.org Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] cxgb3i: convert cdev->l2opt to use rcu to prevent NULL dereference	Neil Horman	2011-09-26	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This oops was reported recently: d:mon> e cpu 0xd: Vector: 300 (Data Access) at [c0000000fd4c7120] pc: d00000000076f194: .t3_l2t_get+0x44/0x524 [cxgb3] lr: d000000000b02108: .init_act_open+0x150/0x3d4 [cxgb3i] sp: c0000000fd4c73a0 msr: 8000000000009032 dar: 0 dsisr: 40000000 current = 0xc0000000fd640d40 paca = 0xc00000000054ff80 pid = 5085, comm = iscsid d:mon> t [c0000000fd4c7450] d000000000b02108 .init_act_open+0x150/0x3d4 [cxgb3i] [c0000000fd4c7500] d000000000e45378 .cxgbi_ep_connect+0x784/0x8e8 [libcxgbi] [c0000000fd4c7650] d000000000db33f0 .iscsi_if_rx+0x71c/0xb18 [scsi_transport_iscsi2] [c0000000fd4c7740] c000000000370c9c .netlink_data_ready+0x40/0xa4 [c0000000fd4c77c0] c00000000036f010 .netlink_sendskb+0x4c/0x9c [c0000000fd4c7850] c000000000370c18 .netlink_sendmsg+0x358/0x39c [c0000000fd4c7950] c00000000033be24 .sock_sendmsg+0x114/0x1b8 [c0000000fd4c7b50] c00000000033d208 .sys_sendmsg+0x218/0x2ac [c0000000fd4c7d70] c00000000033f55c .sys_socketcall+0x228/0x27c [c0000000fd4c7e30] c0000000000086a4 syscall_exit+0x0/0x40 --- Exception: c01 (System Call) at 00000080da560cfc The root cause was an EEH error, which sent us down the offload_close path in the cxgb3 driver, which in turn sets cdev->l2opt to NULL, without regard for upper layer driver (like the cxgbi drivers) which might have execution contexts in the middle of its use. The result is the oops above, when t3_l2t_get attempts to dereference L2DATA(cdev)->nentries in arp_hash right after the EEH error handler sets it to NULL. The fix is to prevent the setting of the NULL pointer until after there are no further users of it. The t3cdev->l2opt pointer is now converted to be an rcu pointer and the L2DATA macro is now called under the protection of the rcu_read_lock(). When the EEH error path: t3_adapter_error->offload_close->cxgb3_offload_deactivate Is exectured, setting of that l2opt pointer to NULL, is now gated on an rcu quiescence point, preventing, allowing L2DATA callers to safely check for a NULL pointer without concern that the underlying data will be freeded before the pointer is dereferenced. This has been tested by the reporter and shown to fix the reproted oops [nhorman: fix up unitinialised variable reported by Dan Carpenter] Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Reviewed-by: Karen Xie <kxie@chelsio.com> Cc: stable@kernel.org Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] scsi: qla4xxx needs libiscsi.o	Randy Dunlap	2011-09-22	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	qla4xxx driver needs to be linked with libiscsi.o to fix build errors. This happens when no other drivers that use libiscsi.o are enabled. ERROR: "iscsi_conn_stop" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined! ERROR: "iscsi_conn_get_addr_param" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined! ERROR: "iscsi_session_teardown" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined! ERROR: "iscsi_host_alloc" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined! ERROR: "iscsi_conn_start" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined! ERROR: "iscsi_conn_send_pdu" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined! ERROR: "iscsi_session_get_param" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined! ERROR: "iscsi_conn_get_param" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined! ERROR: "iscsi_set_param" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined! ERROR: "iscsi_session_failure" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined! ERROR: "iscsi_complete_pdu" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined! ERROR: "iscsi_session_setup" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined! ERROR: "iscsi_conn_bind" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined! ERROR: "iscsi_conn_setup" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined! ERROR: "iscsi_itt_to_task" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined! Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Cc: stable@kernel.org Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] libsas: fix failure to revalidate domain for anything but the first ↵	Mark Salyzyn	2011-09-22	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	expander child. In an enclosure model where there are chaining expanders to a large body of storage, it was discovered that libsas, responding to a broadcast event change, would only revalidate the domain of first child expander in the list. The issue is that the pointer value to the discovered source device was used to break out of the loop, rather than the content of the pointer. This still remains non-compliant as the revalidate domain code is supposed to loop through all child expanders, and not stop at the first one it finds that reports a change count. However, the design of this routine does not allow multiple device discoveries and that would be a more complicated set of patches reserved for another day. We are fixing the glaring bug rather than refactoring the code. Signed-off-by: Mark Salyzyn <msalyzyn@us.xyratex.com> Cc: stable@kernel.org Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] aacraid: reset should disable MSI interrupt	Vasily Averin	2011-09-22	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	scsi reset on hardware with enabled MSI interrupts generates WARNING message [11027.798722] aacraid: Host adapter abort request (0,0,0,0) [11027.798814] aacraid: Host adapter reset request. SCSI hang ? [11087.762237] aacraid: SCSI bus appears hung [11135.082543] ------------[ cut here ]------------ [11135.082646] WARNING: at drivers/pci/msi.c:658 pci_enable_msi_block+0x251/0x290() Signed-off-by: Vasily Averin <vvs@sw.ru> Acked-by: Mark Salyzyn <mark_salyzyn@us.xyratex.com> Cc: stable@kernel.org Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* \|	scsi: fix qla2xxx printk format warning	Randy Dunlap	2011-09-24	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	sector_t can be different types, so cast it to its largest possible type. drivers/scsi/qla2xxx/qla_isr.c:1509:5: warning: format '%lx' expects type 'long unsigned int', but argument 5 has type 'sector_t' Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* \|	scsi: SCSI_ISCI needs to select SCSI_SAS_HOST_SMP, fixes build error	Randy Dunlap	2011-09-24	1	-0/+1
\|/ \| \| \| \| \| \| \| \| \| \| \|	SCSI_ISCI needs to select SCSI_SAS_HOST_SMP to ensure that all needed symbols are available to it. Fixes this build error: ERROR: "try_test_sas_gpio_gp_bit" [drivers/scsi/isci/isci.ko] undefined! Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	Merge git://bedivere.hansenpartnership.com/git/scsi-rc-fixes-2.6	Linus Torvalds	2011-09-15	26	-200/+607
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* git://bedivere.hansenpartnership.com/git/scsi-rc-fixes-2.6: (25 commits) [SCSI] bnx2i: Fixed the endian on TTT for NOP out transmission [SCSI] libfc: fix referencing to fc_fcp_pkt from the frame pointer via fr_fsp() [SCSI] libfc: block SCSI eh thread for blocked rports [SCSI] libfc: fix fc_eh_host_reset [SCSI] fcoe: Fix deadlock between fip's recv_work and rtnl [SCSI] qla2xxx: Update version number to 8.03.07.07-k. [SCSI] qla2xxx: Set the task attributes after memsetting fcp cmnd. [SCSI] qla2xxx: Correct inadvertent loop state transitions during port-update handling. [SCSI] qla2xxx: Save and restore irq in the response queue interrupt handler. [SCSI] qla2xxx: Double check for command completion if abort mailbox command fails. [SCSI] qla2xxx: Acquire hardware lock while manipulating dsd list. [SCSI] qla2xxx: Fix qla24xx revision check while enabling interrupts. [SCSI] qla2xxx: T10 DIF - Fix incorrect error reporting. [SCSI] qla2xxx: T10 DIF - Handle uninitalized sectors. [SCSI] hpsa: fix physical device lun and target numbering problem [SCSI] hpsa: fix problem that OBDR devices are not detected [SCSI] isci: add version number [SCSI] isci: fix event-get pointer increment [SCSI] isci: dynamic interrupt coalescing [SCSI] isci: Leave requests alone if already terminating. ...
\| *	[SCSI] bnx2i: Fixed the endian on TTT for NOP out transmission	Eddie Wai	2011-08-29	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The iscsi_nopout task's TTT is defined as __be32 while the DMA memory to the chip is CPU specific. This creates a problem for unsolicited NOP-In responses where the TTT is not the RESERVED tag of 0xFFs. This patch adds a call to be32_to_cpu for the TTT specified. Signed-off-by: Eddie Wai <eddie.wai@broadcom.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] libfc: fix referencing to fc_fcp_pkt from the frame pointer via fr_fsp()	Yi Zou	2011-08-29	1	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In commit 6a716a8, while releasing the DDP context in case frame_send() failed, the frame may already be freed, so we should store the pointer to fc_fcp_pkt and release the DDP context using the locally stored fsp instead of getting fsp from the fr_fsp(fp) on a frame. Signed-off-by: Yi Zou <yi.zou@intel.com> Reported-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] libfc: block SCSI eh thread for blocked rports	Vasu Dev	2011-08-29	1	-2/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Call fc_block_scsi_eh() in all fcoe eh to blocks the scsi_eh thread for blocked rports. Signed-off-by: Vasu Dev <vasu.dev@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Reviewed-by: Yi Zou <yi.zou@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] libfc: fix fc_eh_host_reset	Vasu Dev	2011-08-29	2	-17/+45
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Current fc_eh_host_reset leaves lport offline permanently due to FLOGI response getting handled by LOGO response from last reset as both had same exchange id. So fix this by having end to end exches clean-up using exchange abort along exches reset done from fc_eh_host_reset. This would avoid exchanges collision between the sessions across the reset. In this case implicit login should have done that but no aborting support for FIP frames, so just wait till lport->r_a_tov before restarting next flogi to ensure all exchanges are good to use again for next session. Below is the trace of LOGO from older session coming ahead of FLOGI response with same exche id 0x203:- 617 86.435165 4e.00.0b -> ff.ff.fc FC ELS LOGO 0x203 618 86.435195 4e.00.0b -> b6.02.00 FC ELS LOGO 0x213 619 86.435220 4e.00.0b -> 18.03.00 FC ELS LOGO 0x223 620 86.435244 4e.00.0b -> 18.02.00 FC ELS LOGO 0x233 621 86.435267 4e.00.0b -> 18.01.00 FC ELS LOGO 0x243 622 86.435349 00.00.00 -> ff.ff.fe FC ELS FLOGI 0x203 623 86.435549 ff.ff.fc -> 4e.00.0b FC ELS ACC (LOGO) 0x203 624 86.438721 ff.ff.fe -> 4e.00.0b FC ELS ACC (FLOGI) 0x203 625 86.442059 18.03.00 -> 4e.00.0b FC ELS ACC (LOGO) 0x223 626 86.443683 b6.02.00 -> 4e.00.0b FC ELS ACC (LOGO) 0x213 627 86.447693 18.01.00 -> 4e.00.0b FC ELS ACC (LOGO) 0x243 628 86.453499 18.02.00 -> 4e.00.0b FC ELS ACC (LOGO) 0x233 Signed-off-by: Vasu Dev <vasu.dev@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Reviewed-by: Yi Zou <yi.zou@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] fcoe: Fix deadlock between fip's recv_work and rtnl	Robert Love	2011-08-29	1	-5/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The rtnl cannot be held durrng the fcoe_interface_put. If it is the last reference on the fcoe_interface the fcoe_ctlr_destroy will be called as a part of the cleanup, ultimately calling cancel_work_sync(&fip->recv_work); If we are processing a flogi response we will be in the recv_work context and we will lock the rtnl to add a new unicast MAC address. This is how the deadlock can occur. The fix is simply to move the rtnl_lock/unlock into fcoe_interface_cleanup so that it can be unlocked before fcoe_interface_put is called. Here is the lockdep report: Jul 21 11:26:35 bubba [ 223.870702] ul 21 11:26:35 bubba [ 223.870704] ======================================================= Jul 21 11:26:35 bubba [ 223.871255] [ INFO: possible circular locking dependency detected ] Jul 21 11:26:35 bubba [ 223.871530] 3.0.0-rc7+ #1 Jul 21 11:26:35 bubba [ 223.871797] ------------------------------------------------------- Jul 21 11:26:35 bubba [ 223.872072] lockdeptest.sh/3464 is trying to acquire lock: Jul 21 11:26:35 bubba [ 223.872345] ((&fip->recv_work) Jul 21 11:26:35 bubba ){+.+.+.} Jul 21 11:26:35 bubba , at: Jul 21 11:26:35 bubba [<ffffffff810531f1>] wait_on_work+0x0/0xbd Jul 21 11:26:35 bubba [ 223.873022] Jul 21 11:26:35 bubba [ 223.873023] but task is already holding lock: Jul 21 11:26:35 bubba [ 223.873555] (rtnl_mutex Jul 21 11:26:35 bubba ){+.+.+.} Jul 21 11:26:35 bubba , at: Jul 21 11:26:35 bubba [<ffffffff813e8233>] rtnl_lock+0x12/0x14 Jul 21 11:26:35 bubba [ 223.874229] Jul 21 11:26:35 bubba [ 223.874230] which lock already depends on the new lock. Jul 21 11:26:35 bubba [ 223.874231] Jul 21 11:26:35 bubba [ 223.875032] Jul 21 11:26:35 bubba [ 223.875033] the existing dependency chain (in reverse order) is: Jul 21 11:26:35 bubba [ 223.875573] Jul 21 11:26:35 bubba [ 223.875573] -> #1 Jul 21 11:26:35 bubba (rtnl_mutex Jul 21 11:26:35 bubba ){+.+.+.} Jul 21 11:26:35 bubba : Jul 21 11:26:35 bubba [ 223.876301] Jul 21 11:26:35 bubba [<ffffffff8106c14a>] lock_acquire+0xd2/0xf7 Jul 21 11:26:35 bubba [ 223.876645] Jul 21 11:26:35 bubba [<ffffffff8151d975>] __mutex_lock_common+0x47/0x30d Jul 21 11:26:35 bubba [ 223.876991] Jul 21 11:26:35 bubba [<ffffffff8151dd36>] mutex_lock_nested+0x3b/0x40 Jul 21 11:26:35 bubba [ 223.877334] Jul 21 11:26:35 bubba [<ffffffff813e8233>] rtnl_lock+0x12/0x14 Jul 21 11:26:35 bubba [ 223.877675] Jul 21 11:26:35 bubba [<ffffffffa003d5a0>] fcoe_update_src_mac+0x2b/0x80 [fcoe] Jul 21 11:26:35 bubba [ 223.878022] Jul 21 11:26:35 bubba [<ffffffffa003d698>] fcoe_flogi_resp+0x5e/0x79 [fcoe] Jul 21 11:26:35 bubba [ 223.878366] Jul 21 11:26:35 bubba [<ffffffffa001566f>] fc_exch_recv+0x7f5/0x9da [libfc] Jul 21 11:26:35 bubba [ 223.878713] Jul 21 11:26:35 bubba [<ffffffffa00327d8>] fcoe_ctlr_recv_work+0x71f/0x10dc [libfcoe] Jul 21 11:26:35 bubba [ 223.879258] Jul 21 11:26:35 bubba [<ffffffff81053761>] process_one_work+0x1d7/0x347 Jul 21 11:26:35 bubba [ 223.879601] Jul 21 11:26:35 bubba [<ffffffff81054ade>] worker_thread+0xf8/0x17c Jul 21 11:26:35 bubba [ 223.879944] Jul 21 11:26:35 bubba [<ffffffff81058184>] kthread+0x7d/0x85 Jul 21 11:26:35 bubba [ 223.880287] Jul 21 11:26:35 bubba [<ffffffff81526414>] kernel_thread_helper+0x4/0x10 Jul 21 11:26:35 bubba [ 223.880634] Jul 21 11:26:35 bubba [ 223.880635] -> #0 Jul 21 11:26:35 bubba ((&fip->recv_work) Jul 21 11:26:35 bubba ){+.+.+.} Jul 21 11:26:35 bubba : Jul 21 11:26:35 bubba [ 223.881357] Jul 21 11:26:35 bubba [<ffffffff8106b93e>] __lock_acquire+0xb1d/0xe2c Jul 21 11:26:35 bubba [ 223.881695] Jul 21 11:26:35 bubba [<ffffffff8106c14a>] lock_acquire+0xd2/0xf7 Jul 21 11:26:35 bubba [ 223.882033] Jul 21 11:26:35 bubba [<ffffffff81053241>] wait_on_work+0x50/0xbd Jul 21 11:26:35 bubba [ 223.882378] Jul 21 11:26:35 bubba [<ffffffff81053b32>] __cancel_work_timer+0xb6/0xf4 Jul 21 11:26:35 bubba [ 223.882718] Jul 21 11:26:35 bubba [<ffffffff81053b8a>] cancel_work_sync+0xb/0xd Jul 21 11:26:35 bubba [ 223.883057] Jul 21 11:26:35 bubba [<ffffffffa00317e6>] fcoe_ctlr_destroy+0x1d/0x67 [libfcoe] Jul 21 11:26:35 bubba [ 223.883399] Jul 21 11:26:35 bubba [<ffffffffa003e51e>] fcoe_interface_release+0x21/0x45 [fcoe] Jul 21 11:26:35 bubba [ 223.883940] Jul 21 11:26:35 bubba [<ffffffff811fbbe6>] kref_put+0x43/0x4d Jul 21 11:26:35 bubba [ 223.884280] Jul 21 11:26:35 bubba [<ffffffffa003ebba>] fcoe_interface_put+0x17/0x19 [fcoe] Jul 21 11:26:35 bubba [ 223.884624] Jul 21 11:26:35 bubba [<ffffffffa003f2a6>] fcoe_interface_cleanup+0x188/0x193 [fcoe] Jul 21 11:26:35 bubba [ 223.885163] Jul 21 11:26:35 bubba [<ffffffffa003f303>] fcoe_destroy+0x52/0x72 [fcoe] Jul 21 11:26:35 bubba [ 223.885502] Jul 21 11:26:35 bubba [<ffffffffa00340a4>] fcoe_transport_destroy+0xab/0x110 [libfcoe] Jul 21 11:26:35 bubba [ 223.886045] Jul 21 11:26:35 bubba [<ffffffff81056153>] param_attr_store+0x43/0x62 Jul 21 11:26:35 bubba [ 223.886385] Jul 21 11:26:35 bubba [<ffffffff8105602d>] module_attr_store+0x21/0x25 Jul 21 11:26:35 bubba [ 223.886728] Jul 21 11:26:35 bubba [<ffffffff8114c23d>] sysfs_write_file+0x103/0x13f Jul 21 11:26:35 bubba [ 223.887068] Jul 21 11:26:35 bubba [<ffffffff810f3e7b>] vfs_write+0xa7/0xfa Jul 21 11:26:35 bubba [ 223.887406] Jul 21 11:26:35 bubba [<ffffffff810f4073>] sys_write+0x45/0x69 Jul 21 11:26:35 bubba [ 223.887742] Jul 21 11:26:35 bubba [<ffffffff815252bb>] system_call_fastpath+0x16/0x1b Jul 21 11:26:35 bubba [ 223.888083] Jul 21 11:26:35 bubba [ 223.888084] other info that might help us debug this: Jul 21 11:26:35 bubba [ 223.888085] Jul 21 11:26:35 bubba [ 223.888879] Possible unsafe locking scenario: Jul 21 11:26:35 bubba [ 223.888881] Jul 21 11:26:35 bubba [ 223.889411] CPU0 CPU1 Jul 21 11:26:35 bubba [ 223.889683] ---- ---- Jul 21 11:26:35 bubba [ 223.889955] lock( Jul 21 11:26:35 bubba rtnl_mutex Jul 21 11:26:35 bubba ); Jul 21 11:26:35 bubba [ 223.890349] lock( Jul 21 11:26:35 bubba (&fip->recv_work) Jul 21 11:26:35 bubba ); Jul 21 11:26:35 bubba [ 223.890751] lock( Jul 21 11:26:35 bubba rtnl_mutex Jul 21 11:26:35 bubba ); Jul 21 11:26:35 bubba [ 223.891154] lock( Jul 21 11:26:35 bubba (&fip->recv_work) Jul 21 11:26:35 bubba ); Jul 21 11:26:35 bubba [ 223.891549] Jul 21 11:26:35 bubba [ 223.891550] * DEADLOCK * Jul 21 11:26:35 bubba [ 223.891551] Jul 21 11:26:35 bubba [ 223.892347] 6 locks held by lockdeptest.sh/3464: Jul 21 11:26:35 bubba [ 223.892621] #0: Jul 21 11:26:35 bubba (&buffer->mutex Jul 21 11:26:35 bubba ){+.+.+.} Jul 21 11:26:35 bubba , at: Jul 21 11:26:35 bubba [<ffffffff8114c171>] sysfs_write_file+0x37/0x13f Jul 21 11:26:35 bubba [ 223.893359] #1: Jul 21 11:26:35 bubba (s_active Jul 21 11:26:35 bubba ){++++.+} Jul 21 11:26:35 bubba , at: Jul 21 11:26:35 bubba [<ffffffff8114c21c>] sysfs_write_file+0xe2/0x13f Jul 21 11:26:35 bubba [ 223.894094] #2: Jul 21 11:26:35 bubba (param_lock Jul 21 11:26:35 bubba ){+.+.+.} Jul 21 11:26:35 bubba , at: Jul 21 11:26:35 bubba [<ffffffff81056146>] param_attr_store+0x36/0x62 Jul 21 11:26:35 bubba [ 223.894835] #3: Jul 21 11:26:35 bubba (ft_mutex Jul 21 11:26:35 bubba ){+.+.+.} Jul 21 11:26:35 bubba , at: Jul 21 11:26:35 bubba [<ffffffffa0034017>] fcoe_transport_destroy+0x1e/0x110 [libfcoe] Jul 21 11:26:35 bubba [ 223.895574] #4: Jul 21 11:26:35 bubba (fcoe_config_mutex Jul 21 11:26:35 bubba ){+.+.+.} Jul 21 11:26:35 bubba , at: Jul 21 11:26:35 bubba [<ffffffffa003f2c9>] fcoe_destroy+0x18/0x72 [fcoe] Jul 21 11:26:35 bubba [ 223.896314] #5: Jul 21 11:26:35 bubba (rtnl_mutex Jul 21 11:26:35 bubba ){+.+.+.} Jul 21 11:26:35 bubba , at: Jul 21 11:26:35 bubba [<ffffffff813e8233>] rtnl_lock+0x12/0x14 Jul 21 11:26:35 bubba [ 223.897047] Jul 21 11:26:35 bubba [ 223.897048] stack backtrace: Jul 21 11:26:35 bubba [ 223.897578] Pid: 3464, comm: lockdeptest.sh Not tainted 3.0.0-rc7+ #1 Jul 21 11:26:35 bubba [ 223.897853] Call Trace: Jul 21 11:26:35 bubba [ 223.898128] [<ffffffff81068e16>] print_circular_bug+0x1f8/0x209 Jul 21 11:26:35 bubba [ 223.898416] [<ffffffff8106b93e>] __lock_acquire+0xb1d/0xe2c Jul 21 11:26:35 bubba [ 223.898699] [<ffffffff810531f1>] ? wait_on_cpu_work+0xe6/0xe6 Jul 21 11:26:35 bubba [ 223.898982] [<ffffffff8106c14a>] lock_acquire+0xd2/0xf7 Jul 21 11:26:35 bubba [ 223.899263] [<ffffffff810531f1>] ? wait_on_cpu_work+0xe6/0xe6 Jul 21 11:26:35 bubba [ 223.899547] [<ffffffff8104a097>] ? mod_timer+0x8f/0x98 Jul 21 11:26:35 bubba [ 223.899827] [<ffffffff81053241>] wait_on_work+0x50/0xbd Jul 21 11:26:35 bubba [ 223.900108] [<ffffffff810531f1>] ? wait_on_cpu_work+0xe6/0xe6 Jul 21 11:26:35 bubba [ 223.900390] [<ffffffff81053b32>] __cancel_work_timer+0xb6/0xf4 Jul 21 11:26:35 bubba [ 223.900671] [<ffffffff81053b8a>] cancel_work_sync+0xb/0xd Jul 21 11:26:35 bubba [ 223.900953] [<ffffffffa00317e6>] fcoe_ctlr_destroy+0x1d/0x67 [libfcoe] Jul 21 11:26:35 bubba [ 223.901237] [<ffffffffa003e51e>] fcoe_interface_release+0x21/0x45 [fcoe] Jul 21 11:26:35 bubba [ 223.901522] [<ffffffffa003e4fd>] ? fcoe_enable+0x6b/0x6b [fcoe] Jul 21 11:26:35 bubba [ 223.901803] [<ffffffff811fbbe6>] kref_put+0x43/0x4d Jul 21 11:26:35 bubba [ 223.902083] [<ffffffffa003ebba>] fcoe_interface_put+0x17/0x19 [fcoe] Jul 21 11:26:35 bubba [ 223.902367] [<ffffffffa003f2a6>] fcoe_interface_cleanup+0x188/0x193 [fcoe] Jul 21 11:26:35 bubba [ 223.902653] [<ffffffff8151dd36>] ? mutex_lock_nested+0x3b/0x40 Jul 21 11:26:35 bubba [ 223.902939] [<ffffffffa003f303>] fcoe_destroy+0x52/0x72 [fcoe] Jul 21 11:26:35 bubba [ 223.903223] [<ffffffffa00340a4>] fcoe_transport_destroy+0xab/0x110 [libfcoe] Jul 21 11:26:35 bubba [ 223.903508] [<ffffffff81056153>] param_attr_store+0x43/0x62 Jul 21 11:26:35 bubba [ 223.903792] [<ffffffff8105602d>] module_attr_store+0x21/0x25 Jul 21 11:26:35 bubba [ 223.904075] [<ffffffff8114c23d>] sysfs_write_file+0x103/0x13f Jul 21 11:26:35 bubba [ 223.904357] [<ffffffff810f3e7b>] vfs_write+0xa7/0xfa Jul 21 11:26:35 bubba [ 223.904642] [<ffffffff810f51d6>] ? fget_light+0x35/0x96 Jul 21 11:26:35 bubba [ 223.904923] [<ffffffff810f4073>] sys_write+0x45/0x69 Jul 21 11:26:35 bubba [ 223.905204] [<ffffffff815252bb>] system_call_fastpath+0x16/0x1b Jul 21 11:26:36 bubba [ 223.964438] ixgbe 0000:05:00.0: eth3: detected SFP+: 5 Jul 21 11:26:37 bubba [ 225.196702] ixgbe 0000:05:00.0: eth3: NIC Link is Up 10 Gbps, Flow Control: None Signed-off-by: Robert Love <robert.w.love@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Reviewed-by: Yi Zou <yi.zou@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] qla2xxx: Update version number to 8.03.07.07-k.	Chad Dupuis	2011-08-27	1	-1/+1
\| \| \| \| \| \| \| \|	Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] qla2xxx: Set the task attributes after memsetting fcp cmnd.	Saurav Kashyap	2011-08-27	1	-10/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The memset of the fcp_cmnd struct needs to be moved so that it will not zero-out valid data. Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com> Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] qla2xxx: Correct inadvertent loop state transitions during ↵	Andrew Vasquez	2011-08-27	2	-4/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	port-update handling. Transitioning to a LOOP_UPDATE loop-state could cause the driver to miss normal link/target processing. LOOP_UPDATE is a crufty artifact leftover from at time the driver performed it's own internal command-queuing. Safely remove this state. Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] qla2xxx: Save and restore irq in the response queue interrupt handler.	Saurav Kashyap	2011-08-27	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com> Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] qla2xxx: Double check for command completion if abort mailbox command ↵	Chad Dupuis	2011-08-27	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	fails. Close a small window where we could falsely fail an abort request if the mailbox command fails but the command was returned during interrupt context. Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] qla2xxx: Acquire hardware lock while manipulating dsd list.	Saurav Kashyap	2011-08-27	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The dsd list shouldn't be manipulated without taking the per host hardware lock to prevent multiple callers from trampling upon one another. Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com> Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] qla2xxx: Fix qla24xx revision check while enabling interrupts.	Chad Dupuis	2011-08-27	1	-3/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since we enable interrupts before initializing the firmware, use the chip revision from PCI config space directly to perform the chip revision check. Also remove the unnecessary firmware attributes test. Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] qla2xxx: T10 DIF - Fix incorrect error reporting.	Arun Easi	2011-08-27	8	-64/+62
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This fix: - Disables app tag peeking; correct tag check will be added when the SCSI API is available. - Always derive ref_tag from scsi_get_lba() - Removes incorrect swap of FCP_LUN in FCP_CMND - Moves app-tag error check before ref-tag check. The reason being, currently there is no interface in SCSI to retrieve the app-tag for protection I/Os, so driver puts zero for app-tag in the firmware interface, but requests not to validate it, but when a ref-tag error is detected by firmware, it would put expected/actual tags for all the protection tags (guard/app/ref). As driver checks for app tag error first, a ref-tag error is incorrectly flagged as app-tag error. - Convert HBA specific checks to capability based. Signed-off-by: Arun Easi <arun.easi@qlogic.com> Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] qla2xxx: T10 DIF - Handle uninitalized sectors.	Arun Easi	2011-08-27	6	-38/+335
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Driver needs to update protection bytes for uninitialized sectors as they are not DMA-d. Signed-off-by: Arun Easi <arun.easi@qlogic.com> Reviewed-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] hpsa: fix physical device lun and target numbering problem	Stephen M. Cameron	2011-08-26	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a physical device exposed to the OS by hpsa is replaced (e.g. one hot plug tape drive is replaced by another, or a tape drive is placed into "OBDR" mode in which it acts like a CD-ROM device) and a rescan is initiated, the replaced device will be added to the SCSI midlayer with target and lun numbers set to -1. After that, a panic is likely to ensue. When a physical device is replaced, the lun and target number should be preserved. Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Cc: stable@kernel.org Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] hpsa: fix problem that OBDR devices are not detected	Stephen M. Cameron	2011-08-26	1	-20/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The test to detect OBDR ("One Button Disaster Recovery") cd-rom devices was comparing against uninitialized data. Fixed by moving the test for the device to where the inquiry data is collected, and uninitialized variable altogether as it wasn't really being used. Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Cc: stable@kernel.org Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] isci: add version number	Dan Williams	2011-08-24	1	-1/+10
\| \| \| \| \| \| \| \| \| \|	Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] isci: fix event-get pointer increment	Dan Williams	2011-08-24	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Hardware only increments the put pointer on event types >= 4. Do not increment the get pointer for event type 3. Reported-by: Kapil Karkra <kapil.karkra@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] isci: dynamic interrupt coalescing	Dan Williams	2011-08-24	2	-1/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Hardware allows both an outstanding number commands and a timeout value (whichever occurs first) as a gate to the next interrupt generation. This scheme at completion time looks at the remaining number of outstanding tasks and sets the timeout to maximize small transaction operation. If transactions are large (take more than a few 10s of microseconds to complete) then performance is not interrupt processing bound, so the small timeouts this scheme generates are overridden by the time it takes for a completion to arrive. Tested-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] isci: Leave requests alone if already terminating.	Jeff Skirvin	2011-08-24	1	-2/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead of immediately completing any request that has a second termination call made on it, wait for the TC done/abort HW event. Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] isci: Adding documentation to API change and fixup sysfs registration	Dave Jiang	2011-08-24	1	-18/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Adding API update for adding isci_id entry scsi_host sysfs entry. Also fixing up the sysfs registration to the scsi_host template Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] isci: change sas phy timeouts from 54us to 59us	Marcin Tomczak	2011-08-24	2	-0/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Need the following workaround in the driver for interoperability with the older Intel SSD drives and any other SATA drive that may exhibit the same behavior. This is a corner case where SCU speed is limited to either 3G or 1.5G and the drive has a period of DC idle when it switches speed during SATA speed negotiation. Workaround :change PHYTOV[31:24] from 0x36 to 0x3B. Signed-off-by: Marcin Tomczak <marcin.tomczak@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] isci: fix 32-bit operation when CONFIG_HIGHMEM64G=n	Dan Williams	2011-08-24	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The unsolicited frame control infrastructure requires a table of dma addresses for the hardware to lookup the frame buffer location by an index. The hardware expects the elements of this table to be 64-bit quantities, so we cannot reference these elements as dma_addr_t. All unsolicited frame protocols are affected, particularly SATA-PIO and SMP which prevented direct-attached SATA drives and expander-attached drives to not be discovered. Cc: <stable@kernel.org> Reported-by: Jacek Danecki <jacek.danecki@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] isci: fix sata response handling	Dan Williams	2011-08-24	1	-12/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A bug (likely copy/paste) that has been carried from the original implementation. The unsolicited frame handling structure returns the d2h fis in the isci_request.stp.rsp buffer. Cc: <stable@kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* \|	scsi: qla4xxx driver depends on NET	Randy Dunlap	2011-09-11	1	-1/+1
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When CONFIG_NET is disabled, SCSI_QLA_ISCSI selects SCSI_ISCSI_ATTRS, which uses network interfaces, so the build fails with multiple errors: warning: (ISCSI_TCP && SCSI_CXGB3_ISCSI && SCSI_CXGB4_ISCSI && SCSI_QLA_ISCSI && INFINIBAND_ISER) selects SCSI_ISCSI_ATTRS which has unmet direct dependencies (SCSI && NET) ERROR: "skb_trim" [drivers/scsi/scsi_transport_iscsi.ko] undefined! ERROR: "netlink_kernel_create" [drivers/scsi/scsi_transport_iscsi.ko] undefined! ERROR: "netlink_kernel_release" [drivers/scsi/scsi_transport_iscsi.ko] undefined! ... so make SCSI_QLA_ISCSI also depend on NET to prevent the build errors. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Cc: Ravi Anand <ravi.anand@qlogic.com> Cc: Vikas Chaudhary <vikas.chaudhary@qlogic.com> Cc: iscsi-driver@qlogic.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6	Linus Torvalds	2011-07-30	90	-4487/+11870
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (71 commits) [SCSI] fcoe: cleanup cpu selection for incoming requests [SCSI] fcoe: add fip retry to avoid missing critical keep alive [SCSI] libfc: fix warn on in lport retry [SCSI] libfc: Remove the reference to FCP packet from scsi_cmnd in case of error [SCSI] libfc: cleanup sending SRR request [SCSI] libfc: two minor changes in comments [SCSI] libfc, fcoe: ignore rx frame with wrong xid info [SCSI] libfc: release exchg cache [SCSI] libfc: use FC_MAX_ERROR_CNT [SCSI] fcoe: remove unused ptype field in fcoe_rcv_info [SCSI] bnx2fc: Update copyright and bump version to 1.0.4 [SCSI] bnx2fc: Tx BDs cache in write tasks [SCSI] bnx2fc: Do not arm CQ when there are no CQEs [SCSI] bnx2fc: hold tgt lock when calling cmd_release [SCSI] bnx2fc: Enable support for sequence level error recovery [SCSI] bnx2fc: HSI changes for tape [SCSI] bnx2fc: Handle REC_TOV error code from firmware [SCSI] bnx2fc: REC/SRR link service request and response handling [SCSI] bnx2fc: Support 'sequence cleanup' task [SCSI] dh_rdac: Associate HBA and storage in rdac_controller to support partitions in storage ...
\| *	[SCSI] fcoe: cleanup cpu selection for incoming requests	Vasu Dev	2011-07-28	1	-30/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Cleanup to: - have selection for all types of frames, not just FCP. - remove redundant cpu_online check once fcoe_select_cpu called as this is not required since later code flow check for offlined cpu. - Simplify fcoe_select_cpu() by removing unnecessary checks to skip curr_cpu, this also fixes possibly infinite loop in case of curr_cpu is the only cpu while iterating in the loop. This cleanup mainly applies to target as incoming request are mostly for target, therefore Kiran has verified the patch with target also. Signed-off-by: Vasu Dev <vasu.dev@intel.com> Tested-by: Kiran Patil <kiran.patil@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] fcoe: add fip retry to avoid missing critical keep alive	Vasu Dev	2011-07-28	1	-6/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use pending queue to retry FIP frame in case its tx fails and use common pending queue for both fcoe and fip frames using fcoe_port_send. Signed-off-by: Vasu Dev <vasu.dev@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] libfc: fix warn on in lport retry	Vasu Dev	2011-07-28	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The lport retry timer hits warn on in case it has become ready in response from fip login from fcoe_ctlr_flogi_send(), this is possible but safe code path, therefore removing this warn on. Jun 22 03:16:30 10.0.16.6 [488198.316517] host3: Assigned Port ID 180f02 Jun 22 03:16:32 10.0.16.6 [488200.091561] ------------[ cut here ]------------ Jun 22 03:16:32 10.0.16.6 [488200.091586] WARNING: at drivers/scsi/libfc/fc_lport.c:1355 fc_lport_timeout+0xd9/0xe0 [libfc]() Signed-off-by: Vasu Dev <vasu.dev@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] libfc: Remove the reference to FCP packet from scsi_cmnd in case of error	Neerav Parikh	2011-07-28	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	fc_queuecommand() allocates an FCP packet for each SCSI command and sends it out on the wire. In the process it stores the reference to the FCP packet in the scsi_cmnd structure. Now, in case under stress testing the libfc exchange layer runs out of exchanges the fc_queuecommand() may not be able to send out commands out on the wire. In such a scenario if there is an error in sending the FCP packet out the wire; fc_queuecommand() deletes the FCP packet from internal queue, releases the FCP packet and returns a SCSI_MLQUEUE_HOST_BUSY status to the scsi-ml. But, the reference to the FCP packet set in the scsi_cmnd is not removed from the scsi_cmnd in this code path. This might lead to a crash under stress testing where the scsi_cmnd failed by fc_queuecommand() comes up to fc_eh_abort() via scsi eh thread. fc_eh_abort() will get reference to the FCP packet to be aborted from the scsi_cmnd for further FCP abort related processing and then try to release the FCP packet that has already been released. This patch removes the FCP packet reference from the scsi_cmnd before returning back from fc_queuecommand() in case of an error in sending out the FCP packet. Signed-off-by: Neerav Parikh <Neerav.Parikh@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] libfc: cleanup sending SRR request	Hillf Danton	2011-07-28	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The variable on stack, namely cdb_op, is not used but removed. [ Patch reworked by Robert Love due to invalid patch format ] Signed-off-by: Hillf Danton <dhillf@gmail.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] libfc: two minor changes in comments	Hillf Danton	2011-07-28	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	One change is to cleanup typo in comment for fc_fcp_recv(), another corrects the misleading comment for fc_fcp_abts_resp(). [ Patch reworked by Robert Love due to invalid patch format ] Signed-off-by: Hillf Danton <dhillf@gmail.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] libfc, fcoe: ignore rx frame with wrong xid info	Vasu Dev	2011-07-28	2	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Drop the rx frame having xid with wrong cpu info or received with xid not matching to our xid. Not dropping such frame is causing panic as that causes accessing data struct beyond their bounds. Signed-off-by: Vasu Dev <vasu.dev@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] libfc: release exchg cache	Hillf Danton	2011-07-28	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If fail to create workqueue, the newly created cache for exchg has to be released. Signed-off-by: Hillf Danton <dhillf@gmail.com> Reviewed-by: Vasu Dev <vasu.dev@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] libfc: use FC_MAX_ERROR_CNT	Hillf Danton	2011-07-28	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Though defined, FC_MAX_ERROR_CNT is not used. It is used now for CRC error in the path of receiving FCP frame. Signed-off-by: Hillf Danton <dhillf@gmail.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] fcoe: remove unused ptype field in fcoe_rcv_info	Yi Zou	2011-07-28	2	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is no need to cache the ptype in fcoe_rcv_info struct as it is never used anywhere. Signed-off-by: Yi Zou <yi.zou@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] bnx2fc: Update copyright and bump version to 1.0.4	Bhanu Prakash Gollapudi	2011-07-28	6	-8/+8
\| \| \| \| \| \| \| \| \| \|	Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] bnx2fc: Tx BDs cache in write tasks	Bhanu Prakash Gollapudi	2011-07-28	1	-7/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When there is a single BD for the entire data to be transmitted, use the BD inside the SGL context and set the cached SGE indication in the task context Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
\| *	[SCSI] bnx2fc: Do not arm CQ when there are no CQEs	Bhanu Prakash Gollapudi	2011-07-28	1	-2/+4
\| \| \| \| \| \| \| \| \| \|	Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>