summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* nfs: count DIO good bytes correctly with mirroringPeng Tao2015-02-031-4/+8
| | | | | | | | | | | | | When resending to MDS, we might resend multiple mirroring requests to MDS. As a result, nfs_direct_good_bytes() ends up counting bytes multiple times, causing application to get wrong return results in read/write syscalls. Fix it by tracking start of a dreq and checking the range of pgio header. Cc: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Peng Tao <tao.peng@primarydata.com>
* nfs41: wait for LAYOUTRETURN before retrying LAYOUTGETPeng Tao2015-02-033-3/+45
| | | | | | Also take care to stop waiting if someone clears retry bit. Signed-off-by: Peng Tao <tao.peng@primarydata.com>
* nfs: add a helper to set NFS_ODIRECT_RESCHED_WRITES to direct writesPeng Tao2015-02-032-0/+7
| | | | | | To allow pnfs LD to ask direct writes to be resend. Signed-off-by: Peng Tao <tao.peng@primarydata.com>
* nfs41: add NFS_LAYOUT_RETRY_LAYOUTGET to layout header flagsPeng Tao2015-02-032-0/+21
| | | | | | | | | | | | Use it to indicate that LD wants to retry layoutget. LD can set it whenever it wants the common pnfs code to return and retry pnfs path through a new layout. The bit gets cleared when client does a new layoutget, when client closes the file (ROC case), or when kernel needs to evict the inode (non-ROC case). Signed-off-by: Peng Tao <tao.peng@primarydata.com>
* nfs/flexfiles: send layoutreturn before freeing lsegPeng Tao2015-02-031-25/+56
| | | | | | | | | | | | Otherwise we'll lose error tracking information when encoding layoutreturn. pnfs_put_lseg may be called from rpc callbacks. So we should not call pnfs_send_layoutreturn directly because it can deadlock in the rpc layer. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <loghyr@primarydata.com>
* nfs41: introduce NFS_LAYOUT_RETURN_BEFORE_CLOSEPeng Tao2015-02-033-7/+36
| | | | | | | | | | | | | | | When it is set, generic pnfs would try to send layoutreturn right before last close/delegation_return regard less NFS_LAYOUT_ROC is set or not. LD can then make sure layoutreturn is always sent rather than being omitted. The difference against NFS_LAYOUT_RETURN is that NFS_LAYOUT_RETURN_BEFORE_CLOSE does not block usage of the layout so LD can set it and expect generic layer to try pnfs path at the same time. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <loghyr@primarydata.com>
* nfs41: allow async version layoutreturnPeng Tao2015-02-033-8/+16
| | | | | Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <loghyr@primarydata.com>
* nfs41: add range to layoutreturn argsPeng Tao2015-02-033-5/+7
| | | | | | | So that callers can specify which range to return. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <loghyr@primarydata.com>
* pnfs: allow LD to ask to resend read through pnfsPeng Tao2015-02-032-1/+16
| | | | | | | | | If current IO cannot be completed due to some transient errors, LD may want to ask generic layer to resend the request through pnfs again. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <loghyr@primarydata.com>
* nfs: add nfs_pgio_current_mirror helperPeng Tao2015-02-034-14/+24
| | | | | | | | | | Let it return current nfs_pgio_mirror in use depending on pg_mirror_count. For read, we always use pg_mirrors[0], so this effectively gives us freedom to use pg_mirror_idx to track the actual mirror to read from through out the IO stack. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <loghyr@primarydata.com>
* nfs: only reset desc->pg_mirror_idx when mirroring is supportedPeng Tao2015-02-032-4/+11
| | | | | | | | | so that we don't reset desc->pg_mirror_idx for read unnecessarily. Remove WARN_ON_ONCE from __nfs_pageio_add_request to allow LD to set pg_mirror_idx for read where pg_mirror_count is always 1. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <loghyr@primarydata.com>
* nfs41: add a debug warning if we destroy an unempty layoutPeng Tao2015-02-031-0/+2
| | | | | | | So that we can detect the case if some layout segments are still pinned which is surely a bug that we need to fix. Signed-off-by: Peng Tao <tao.peng@primarydata.com>
* pnfs: fail comparison when bucket verifier not setWeston Andros Adamson2015-02-031-1/+5
| | | | | | | | This skips the WARN_ON_ONCE, but doesnt change behavior (the memcmp would fail). Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>
* nfs: mirroring support for direct ioWeston Andros Adamson2015-02-031-14/+57
| | | | | | | | The current mirroring code only notices short writes to the first mirror. This patch keeps per-mirror byte counts and only considers a byte to be written once all mirrors report so. Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
* nfs: add mirroring support to pgio layerWeston Andros Adamson2015-02-039-67/+311
| | | | | | | | | | | This patch adds mirrored write support to the pgio layer. The default is to use one mirror, but pgio callers may define callbacks to change this to any value up to the (arbitrarily selected) limit of 16. The basic idea is to break out members of nfs_pageio_descriptor that cannot be shared between mirrored DSes and put them in a new structure. Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
* pnfs: pass ds_commit_idx through the commit pathWeston Andros Adamson2015-02-036-17/+24
| | | | | | | | | | Pass ds_commit_idx through the nfs commit path. It's used to select the commit bucket when using pnfs and is ignored when not using pnfs. Several functions had to be changed: nfs_retry_commit, nfs_mark_request_commit, pnfs_mark_request_commit and the pnfs layout driver .mark_request_commit functions. Signed-off-by: Tom Haynes <loghyr@primarydata.com>
* nfs: rename pgio header ds_idx to ds_commit_idxWeston Andros Adamson2015-02-033-11/+9Star
| | | | | | | 'ds_commit_idx' is a better name - it is used to select the right commit bucket for pnfs. Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
* nfs: handle overlapping reqs in lock_and_joinWeston Andros Adamson2015-02-031-6/+11
| | | | | | | This is needed for mirrored DS support, where multuple requests cover the same range. Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
* pnfs: release lseg in pnfs_generic_pg_cleanupWeston Andros Adamson2015-02-035-18/+21
| | | | | | | | This is needed to support mirrored writes - the first write can't just trash the lseg, we need to keep it around until all mirrors have written. Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
* nfs: introduce pg_cleanup op for pgio descriptorsWeston Andros Adamson2015-02-032-1/+5
| | | | | | Add a new operation to nfs_pageio_ops that is called on nfs_pageio_complete. Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
* nfs/filelayout: use pnfs_error_mark_layout_for_returnPeng Tao2015-02-032-11/+1Star
| | | | | | | | | | | | Instead of calling layoutreturn directly, call pnfs_error_mark_layout_for_return to mark layouts for return and let generic code return layout when layout segments are freed. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com> Conflicts: fs/nfs/filelayout/filelayout.c
* nfs41: clear NFS_LAYOUT_RETURN if layoutreturn is sent or failed to sendPeng Tao2015-02-032-0/+6
| | | | | | | So that pnfs path is not disabled for ever. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>
* nfs41: send layoutreturn in last put_lsegPeng Tao2015-02-031-1/+37
| | | | | | | | If current lseg is the last lseg marked with NFS_LSEG_LAYOUTRETURN, send layoutreturn. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>
* nfs41: don't use a layout if it is marked for returningPeng Tao2015-02-033-5/+20
| | | | | | | | And if we are to return the same type of layouts, don't bother sending more layoutgets. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>
* nfs41: add a helper to mark layout for returnPeng Tao2015-02-032-0/+59
| | | | | | | | | | | | | It marks all matching layout segments as NFS_LSEG_LAYOUTRETURN, which is an indicator for pnfs_put_lseg() to send layoutreturn, and also prevents pnfs_update_layout() from using the returning segments. Once it is set, it never gets cleared. It also sets proper io failure bit so that pnfs path can be retried after PNFS_LAYOUTGET_RETRY_TIMEOUT second. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>
* nfs41: make a helper function to send layoutreturnPeng Tao2015-02-031-20/+33
| | | | | | | It allows to specify different iomode to return. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>
* nfs41: pass iomode through layoutreturn argsPeng Tao2015-02-033-1/+3
| | | | | | | So that it is possible to return a specific iomode layouts. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>
* nfs: save server READ/WRITE/COMMIT statusPeng Tao2015-02-034-3/+15
| | | | | | | Flexfiles layout would want to use them to report DS IO status. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>
* nfs41: serialize first layoutget of a filePeng Tao2015-02-032-4/+32
| | | | | | | | | | | | | | | | | Per RFC 5661 Errata 3208: | A client MAY always forget its layout state and associated | layout stateid at any time (See also section 12.5.5.1). | In such case, the client MUST use a non-layout stateid for the next | LAYOUTGET operation. This will signal the server that the client has | no more layouts on the file and its respective layout state can be | released before issuing a new layout in response to LAYOUTGET. In order to make such a signal unique to server, client needs to serialize all layoutgets using non-layout stateid. We implement this by serializing layoutgets when client has no layout segments at hand. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>
* nfs41: close a small race window when adding new layout to global listPeng Tao2015-02-031-5/+3Star
| | | | | Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>
* nfs/flexclient: export pnfs_layoutcommit_inodePeng Tao2015-02-031-0/+1
| | | | | | flexfiles needs to start layoutcommit when necessary Signed-off-by: Peng Tao <tao.peng@primarydata.com>
* nfs: set hostname when creating nfsv3 ds connectionPeng Tao2015-02-031-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | lockd assumes hostname exists otherwise kernel oops. It can be reproduced by following steps: 1. mount flexfile MDS 2. write some files 3. mount DS via nfsv3 BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff8134f332>] strlen+0x2/0x20 PGD 0 Oops: 0000 [#1] SMP Modules linked in: nfsd(F) nfs_layout_flexfiles(F) rpcsec_gss_krb5(F) auth_rpcgss(F) nfsv4(F) dns_resolver(F) nfsv3(F) nfs_acl(F) nfs(F) lockd(F) sunrpc(F) fscache(F) ebtable_nat(F) nf_conntrack_netbios_ns(F) nf_conntrack_broadcast(F) ipt_MASQUERADE(F) ip6table_nat(F) nf_nat_ipv6(F) ip6table_mangle(F) ip6t_REJECT(F) nf_conntrack_ipv6(F) nf_defrag_ipv6(F) iptable_nat(F) nf_nat_ipv4(F) nf_nat(F) iptable_mangle(F) nf_conntrack_ipv4(F) nf_defrag_ipv4(F) xt_conntrack(F) nf_conntrack(F) ebtable_filter(F) ebtables(F) ip6table_filter(F) ip6_tables(F) bnep(F) snd_ens1371(F) snd_rawmidi(F) snd_ac97_codec(F) btusb(F) ac97_bus(F) snd_seq(F) snd_seq_device(F) snd_pcm(F) ppdev(F) bluetooth(F) 6lowpan_iphc(F) rfkill(F) vmw_balloon(F) snd_timer(F) snd(F) soundcore(F) gameport(F) i2c_piix4(F) e1000(F) vmw_vmci(F) parport_pc(F) parport(F) shpchp(F) uinput(F) xfs(F) libcrc32c(F) vmwgfx(F) ttm(F) drm(F) mptspi(F) scsi_transport_spi(F) mptscsih(F) mptbase(F) i2c_core(F) CPU: 0 PID: 10397 Comm: mount.nfs Tainted: GF 3.14.7-100.pd_client.001.fc16.x86_64 #1 Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/31/2013 task: ffff880008942600 ti: ffff880007990000 task.ti: ffff880007990000 RIP: 0010:[<ffffffff8134f332>] [<ffffffff8134f332>] strlen+0x2/0x20 RSP: 0018:ffff880007991aa0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff880038d39c20 RCX: 0000000000000004 RDX: 0000000000000006 RSI: 0000000000000010 RDI: 0000000000000000 RBP: ffff880007991b38 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000014600 R11: 0000000000000400 R12: ffffffff81cc8580 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000004 FS: 00007f90cd2ef880(0000) GS:ffff88003f600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001710000 CR4: 00000000001407f0 Stack: ffffffffa045f52c ffff880001782230 ffff880004141e28 0006880007991ac8 ffffffff816dc14b ffff880000000000 ffff880038d39c20 0000000000000010 0000000481cc0006 0000000000000000 ffffffffa0410be8 000000000000c014 Call Trace: [<ffffffffa045f52c>] ? nlmclnt_lookup_host+0x4c/0x2c0 [lockd] [<ffffffff816dc14b>] ? _raw_spin_unlock_bh+0x1b/0x20 [<ffffffffa0410be8>] ? svc_destroy+0xb8/0x140 [sunrpc] [<ffffffffa045c323>] nlmclnt_init+0x53/0xc0 [lockd] [<ffffffffa047d2dc>] ? nfs_get_client+0x1cc/0x340 [nfs] [<ffffffffa047c2e7>] nfs_start_lockd+0xa7/0xd0 [nfs] [<ffffffffa047df71>] nfs_create_server+0x181/0x5c0 [nfs] [<ffffffffa04460f3>] nfs3_create_server+0x13/0x30 [nfsv3] [<ffffffffa048a0bc>] nfs_try_mount+0x21c/0x300 [nfs] [<ffffffff811ca32d>] ? __kmalloc_track_caller+0x1ad/0x240 [<ffffffffa048b677>] ? nfs_fs_mount+0xc37/0xd80 [nfs] [<ffffffffa048ad05>] nfs_fs_mount+0x2c5/0xd80 [nfs] [<ffffffffa048a830>] ? nfs_clone_super+0x140/0x140 [nfs] [<ffffffffa048a240>] ? nfs_clone_sb_security+0x40/0x40 [nfs] [<ffffffff811e7e43>] mount_fs+0x43/0x1b0 [<ffffffff81193100>] ? __alloc_percpu+0x10/0x20 [<ffffffff812026e6>] vfs_kern_mount+0x76/0x120 [<ffffffff81204917>] do_mount+0x237/0xa80 Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* sunrpc: add rpc_count_iostats_idxWeston Andros Adamson2015-02-032-7/+21
| | | | | | | | | | | Add a call to tally stats for a task under a different statsidx than what's contained in the task structure. This is needed to properly account for pnfs reads/writes when the DS nfs version != the MDS version. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>
* NFSv4.1/NFSv3: Add pNFS callbacks for nfs3_(read|write|commit)_done()Trond Myklebust2015-02-031-0/+9
| | | | | | | Enable pNFS callbacks to allow flex files to work correctly with a NFSv3-enabled data server. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* nfs: allow to specify cred in nfs_initiate_pgioPeng Tao2015-02-033-11/+14
| | | | | | | | so that flexfile layout client can pass in DS credential instead of using user cred, which will be done in the next patch. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>
* nfs4: export nfs4_sequence_donePeng Tao2015-02-032-4/+7
| | | | | Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>
* nfs4: pass slot table to nfs40_setup_sequencePeng Tao2015-02-032-11/+17
| | | | | | | flexclient needs this as there is no nfs_server to DS connection. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>
* nfs: allow different protocol in nfs_initiate_commitPeng Tao2015-02-034-4/+7
| | | | | | | | pnfs flexfile layout client may want to use NFSv3 ops rather than the default MDS v4 ops. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>
* pnfs: Add nfs_rpc_ops in calls to nfs_initiate_pgioTom Haynes2015-02-036-8/+13
| | | | Signed-off-by: Tom Haynes <loghyr@primarydata.com>
* nfs41: create NFSv3 DS connection if specifiedPeng Tao2015-02-033-5/+93
| | | | | Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>
* nfs41: allow LD to choose DS connection version/minor_versionPeng Tao2015-02-035-8/+14
| | | | | | | flexfile layout may need to set such when making DS connections. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>
* nfsv3: introduce nfs3_set_ds_clientPeng Tao2015-02-035-5/+46
| | | | | | | | The flexfiles layout wants to create DS connection over NFSv3. Add nfs3_set_ds_client to allow that to happen. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>
* nfs41: move file layout macros to generic pnfsPeng Tao2015-02-032-10/+11
| | | | | | | | | | | | | They can be reused by flexfile layout as well. Also add a code such that if read fails on one DS and there are other DSes available to use, don't resend through MDS but through pNFS so that client can read from other DSes. Reviewed-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>
* nfs41: allow LD to choose DS connection auth flavorPeng Tao2015-02-035-9/+14
| | | | | | | | flexfile layout may use different auth flavor as specified by MDS. Reviewed-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>
* nfs41: pull nfs4_ds_connect from file layout to generic pnfsPeng Tao2015-02-033-73/+89
| | | | | | | | It can be reused by flexfiles layout client. Reviewed-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>
* nfs41: pull decode_ds_addr from file layout to generic pnfsPeng Tao2015-02-033-150/+154
| | | | | | | | It can be reused by flexfile layout. Reviewed-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>
* nfs41: pull data server cache from file layout to generic pnfsPeng Tao2015-02-034-252/+265
| | | | | | | | | | Also pull nfs4_pnfs_ds_addr and nfs4_pnfs_ds to generic pnfs. They can all be reused by flexfile layout as well. Reviewed-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>
* pnfs: Do not grab the commit_info lock twice when rescheduling writesTom Haynes2015-02-033-26/+23Star
| | | | | Acked-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: Tom Haynes <loghyr@primarydata.com>
* pnfs: Prepare for flexfiles by pulling out common codeTom Haynes2015-02-036-290/+330
| | | | | | | | The flexfilelayout driver will share some common code with the filelayout driver. This set of changes refactors that common code out to avoid any module depenencies. Signed-off-by: Tom Haynes <loghyr@primarydata.com>
* Linux 3.19-rc5Linus Torvalds2015-01-181-1/+1
|