summaryrefslogtreecommitdiffstats
path: root/fs/nfs
Commit message (Collapse)AuthorAgeFilesLines
* NFSv4.1: Fix the CREATE_SESSION slot number accountingTrond Myklebust2016-09-111-2/+10
| | | | | | | | | | Ensure that we conform to the algorithm described in RFC5661, section 18.36.4 for when to bump the sequence id. In essence we do it for all cases except when the RPC call timed out, or in case of the server returning NFS4ERR_DELAY or NFS4ERR_STALE_CLIENTID. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org
* pNFS: Don't forget the layout stateid if there are outstanding LAYOUTGETsTrond Myklebust2016-09-041-1/+2
| | | | | | | | | | | | | | | | | | | | | | | If there are outstanding LAYOUTGET rpc calls, then we want to ensure that we keep the layout stateid around so we that don't inadvertently pick up an old/misordered sequence id. The race is as follows: Client Server ====== ====== LAYOUTGET(seqid) LAYOUTGET(seqid) return LAYOUTGET(seqid+1) return LAYOUTGET(seqid+2) process LAYOUTGET(seqid+2) forget layout process LAYOUTGET(seqid+1) If it forgets the layout stateid before processing seqid+1, then the client will not check the layout->plh_barrier, and so will set the stateid with seqid+1. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* pNFS: Clear out all layout segments if the server unsets lrp->res.lrs_presentTrond Myklebust2016-09-031-3/+6
| | | | | | | | If the server fails to set lrp->res.lrs_present in the LAYOUTRETURN reply, then that means it believes the client holds no more layout state for that file, and that the layout stateid is now invalid. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* pNFS: Fix pnfs_set_layout_stateid() to clear NFS_LAYOUT_INVALID_STIDTrond Myklebust2016-09-031-17/+19
| | | | | | | If the layout was marked as invalid, we want to ensure to initialise the layout header fields correctly. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* pNFS: Ensure LAYOUTGET and LAYOUTRETURN are properly serialisedTrond Myklebust2016-09-031-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | According to RFC5661, the client is responsible for serialising LAYOUTGET and LAYOUTRETURN to avoid ambiguity. Consider the case where we send both in parallel. Client Server ====== ====== LAYOUTGET(seqid=X) LAYOUTRETURN(seqid=X) LAYOUTGET return seqid=X+1 LAYOUTRETURN return seqid=X+2 Process LAYOUTRETURN Forget layout stateid Process LAYOUTGET Set seqid=X+1 The client processes the layoutget/layoutreturn in the wrong order, and since the result of the layoutreturn was to clear the only existing layout segment, the client forgets the layout stateid. When the LAYOUTGET comes in, it is treated as having a completely new stateid, and so the client sets the wrong sequence id... Fix is to check if there are outstanding LAYOUTGET requests before we send the LAYOUTRETURN (note that LAYOUGET will already wait if it sees an outstanding LAYOUTRETURN). Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org # v4.5+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* NFS: Fix error reporting in nfs_file_write()Trond Myklebust2016-09-031-1/+4
| | | | | | | | | When doing O_DSYNC writes, the actual write errors are reported through generic_write_sync(), so we must test the result. Reported-by: J. R. Okajima <hooanon05g@gmail.com> Fixes: 18290650b1c8 ("NFS: Move buffered I/O locking into nfs_file_write()") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* NFSv4.x: Fix a refcount leak in nfs_callback_up_netTrond Myklebust2016-08-301-0/+1
| | | | | | | | On error, the callers expect us to return without bumping nn->cb_users[]. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org # v3.7+
* NFS4: Avoid migration loopsBenjamin Coddington2016-08-301-0/+5
| | | | | | | | | | | If a server returns itself as a location while migrating, the client may end up getting stuck attempting to migrate twice to the same server. Catch this by checking if the nfs_client found is the same as the existing client. For the other two callers to nfs4_set_client, the nfs_client will always be ERR_PTR(-EINVAL). Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* pNFS/flexfiles: Fix an Oopsable condition when connection to the DS failsTrond Myklebust2016-08-292-28/+28
| | | | | | | | | | | | | If the attempt to connect to a DS fails inside ff_layout_pg_init_read or ff_layout_pg_init_write, then we currently end up clearing the layout segment carried by the struct nfs_pageio_descriptor, causing an Oops when we later call into ff_layout_read_pagelist/ff_layout_write_pagelist. The fix is to ensure we return the layout and then retry. Fixes: 446ca2195303 ("pNFS/flexfiles: When initing reads or writes, we...") Cc: stable@vger.kernel.org # v4.7+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* NFSv4.1: Remove obsolete and incorrrect assignment in nfs4_callback_sequenceTrond Myklebust2016-08-281-1/+0Star
| | | | Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* NFSv4.1: Close callback races for OPEN, LAYOUTGET and LAYOUTRETURNTrond Myklebust2016-08-281-13/+65
| | | | | | | | | | Defer freeing the slot until after we have processed the results from OPEN and LAYOUTGET. This means that the server can rely on the mechanism in RFC5661 Section 2.10.6.3 to ensure that replies to an OPEN or LAYOUTGET/RETURN RPC call don't race with the callbacks that apply to them. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* NFSv4.1: Defer bumping the slot sequence number until we free the slotTrond Myklebust2016-08-282-3/+9
| | | | | | | | | | For operations like OPEN or LAYOUTGET, which return recallable state (i.e. delegations and layouts) we want to enable the mechanism for resolving recall races in RFC5661 Section 2.10.6.3. To do so, we will want to defer bumping the slot's sequence number until we have finished processing the RPC results. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* NFSv4.1: Delay callback processing when there are referring triplesTrond Myklebust2016-08-284-4/+29
| | | | | | | | | | | If CB_SEQUENCE tells us that the processing of this request depends on the completion of one or more referring triples (see RFC 5661 Section 2.10.6.3), delay the callback processing until after the RPC requests being referred to have completed. If we end up delaying for more than 1/2 second, then fall back to returning NFS4ERR_DELAY in reply to the callback. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* NFSv4.1: Fix Oopsable condition in server callback racesTrond Myklebust2016-08-283-4/+35
| | | | | | | | | The slot table hasn't been an array since v3.7. Ensure that we use nfs4_lookup_slot() to access the slot correctly. Fixes: 87dda67e7386 ("NFSv4.1: Allow SEQUENCE to resize the slot table...") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org # v3.8+
* pnfs/blocklayout: update last_write_offset atomically with extentsBenjamin Coddington2016-08-233-5/+10
| | | | | | | | | | | | | | | Block/SCSI layout write completion may add committable extents to the extent tree before updating the layout's last-written byte under the inode lock. If a sync happens before this value is updated, then prepare_layoutcommit may find and encode these extents which would produce a LAYOUTCOMMIT request whose encoded extents are larger than the request's loca_length. Fix this by using a last-written byte value that is updated atomically with the extent tree so that commitable extents always match. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* pNFS: The client must not do I/O to the DS if it's lease has expiredTrond Myklebust2016-08-231-0/+1
| | | | | | | | | | | | | Ensure that the client conforms to the normative behaviour described in RFC5661 Section 12.7.2: "If a client believes its lease has expired, it MUST NOT send I/O to the storage device until it has validated its lease." So ensure that we wait for the lease to be validated before using the layout. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org # v3.20+
* pNFS: Handle NFS4ERR_OLD_STATEID correctly in LAYOUTSTAT callsTrond Myklebust2016-08-192-6/+29
| | | | | | We normally want to update the stateid and then retry, Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* pNFS/flexfiles: Set reasonable default retrans values for the data channelTrond Myklebust2016-08-161-2/+2
| | | | | | | | | | | | | Prior to this patch, the retrans value was set at 5, meaning that we could see a maximum retransmission timeout value of more than 6 minutes. That's a tad high for NFSv3 where the protocol does allow the server to drop requests at any time. Since this is a data channel, let's just set retrans to 0, and the default timeout to 60s. The user can continue to adjust these defaults using the dataserver_retrans and dataserver_timeo module parameters. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* NFS: Allow the mount option retrans=0Trond Myklebust2016-08-163-8/+26
| | | | | | | | | | | We should allow retrans=0 as just meaning that every timeout is a major timeout, and that there is no increment in the timeout value. For instance, this means that we would allow TCP users to specify a flat timeout value of 60s, by specifying "timeo=600,retrans=0" in their mount option string. Siged-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* pNFS/flexfiles: Fix layoutstat periodic reportingTrond Myklebust2016-08-152-5/+5
| | | | | | | | | | Putting the periodicity timer in the mirror instances is causing non-scalable reporting behaviour and missed reporting intervals. When you recall layouts and/or implement client side mirroring, it leads to consecutive reports with only a few ms between RPC calls. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Fixes: d0379a5d066a9 ("pNFS/flexfiles: Support server-supplied...")
* Merge tag 'nfs-for-4.8-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds2016-08-125-12/+32
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull NFS client bugfixes from Trond Myklebust: "Highlights include: - Stable patch from Olga to fix RPCSEC_GSS upcalls when the same user needs multiple different security services (e.g. krb5i and krb5p). - Stable patch to fix a regression introduced by the use of SO_REUSEPORT, and that prevented the use of multiple different NFS versions to the same server. - TCP socket reconnection timer fixes. - Patch from Neil to disable the use of IPv6 temporary addresses" * tag 'nfs-for-4.8-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFSv4: Cap the transport reconnection timer at 1/2 lease period NFSv4: Cleanup the setting of the nfs4 lease period SUNRPC: Limit the reconnect backoff timer to the max RPC message timeout SUNRPC: Fix reconnection timeouts NFSv4.2: LAYOUTSTATS may return NFS4ERR_ADMIN/DELEG_REVOKED SUNRPC: disable the use of IPv6 temporary addresses. SUNRPC: allow for upcalls for same uid but different gss service SUNRPC: Fix up socket autodisconnect SUNRPC: Handle EADDRNOTAVAIL on connection failures
| * NFSv4: Cap the transport reconnection timer at 1/2 lease periodTrond Myklebust2016-08-061-0/+3
| | | | | | | | | | | | | | | | | | We don't want to miss a lease period renewal due to the TCP connection failing to reconnect in a timely fashion. To ensure this doesn't happen, cap the reconnection timer so that we retry the connection attempt at least every 1/2 lease period. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * NFSv4: Cleanup the setting of the nfs4 lease periodTrond Myklebust2016-08-064-12/+27
| | | | | | | | | | | | | | Make a helper function nfs4_set_lease_period() and have nfs41_setup_state_renewal() and nfs4_do_fsinfo() use it. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * NFSv4.2: LAYOUTSTATS may return NFS4ERR_ADMIN/DELEG_REVOKEDTrond Myklebust2016-08-051-0/+2
| | | | | | | | | | | | | | We should handle those errors in the same way we handle the other stateid errors: by invalidating the faulty layout stateid. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* | Merge branch 'work.const-qstr' of ↵Linus Torvalds2016-08-066-21/+22
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull qstr constification updates from Al Viro: "Fairly self-contained bunch - surprising lot of places passes struct qstr * as an argument when const struct qstr * would suffice; it complicates analysis for no good reason. I'd prefer to feed that separately from the assorted fixes (those are in #for-linus and with somewhat trickier topology)" * 'work.const-qstr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: qstr: constify instances in adfs qstr: constify instances in lustre qstr: constify instances in f2fs qstr: constify instances in ext2 qstr: constify instances in vfat qstr: constify instances in procfs qstr: constify instances in fuse qstr constify instances in fs/dcache.c qstr: constify instances in nfs qstr: constify instances in ocfs2 qstr: constify instances in autofs4 qstr: constify instances in hfs qstr: constify instances in hfsplus qstr: constify instances in logfs qstr: constify dentry_init_security
| * | qstr: constify instances in nfsAl Viro2016-07-216-21/+22
| | | | | | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | | Merge tag 'nfs-for-4.8-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds2016-07-3128-552/+866
|\ \ \ | | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull NFS client updates from Trond Myklebust: "Highlights include: Stable bugfixes: - nfs: don't create zero-length requests - several LAYOUTGET bugfixes Features: - several performance related features - more aggressive caching when we can rely on close-to-open cache consistency - remove serialisation of O_DIRECT reads and writes - optimise several code paths to not flush to disk unnecessarily. However allow for the idiosyncracies of pNFS for those layout types that need to issue a LAYOUTCOMMIT before the metadata can be updated on the server. - SUNRPC updates to the client data receive path - pNFS/SCSI support RH/Fedora dm-mpath device nodes - pNFS files/flexfiles can now use unprivileged ports when the generic NFS mount options allow it. Bugfixes: - Don't use RDMA direct data placement together with data integrity or privacy security flavours - Remove the RDMA ALLPHYSICAL memory registration mode as it has potential security holes. - Several layout recall fixes to improve NFSv4.1 protocol compliance. - Fix an Oops in the pNFS files and flexfiles connection setup to the DS - Allow retry of operations that used a returned delegation stateid - Don't mark the inode as revalidated if a LAYOUTCOMMIT is outstanding - Fix writeback races in nfs4_copy_range() and nfs42_proc_deallocate()" * tag 'nfs-for-4.8-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (104 commits) pNFS: Actively set attributes as invalid if LAYOUTCOMMIT is outstanding NFSv4: Clean up lookup of SECINFO_NO_NAME NFSv4.2: Fix warning "variable ‘stateids’ set but not used" NFSv4: Fix warning "no previous prototype for ‘nfs4_listxattr’" SUNRPC: Fix a compiler warning in fs/nfs/clnt.c pNFS: Remove redundant smp_mb() from pnfs_init_lseg() pNFS: Cleanup - do layout segment initialisation in one place pNFS: Remove redundant stateid invalidation pNFS: Remove redundant pnfs_mark_layout_returned_if_empty() pNFS: Clear the layout metadata if the server changed the layout stateid pNFS: Cleanup - don't open code pnfs_mark_layout_stateid_invalid() NFS: pnfs_mark_matching_lsegs_return() should match the layout sequence id pNFS: Do not set plh_return_seq for non-callback related layoutreturns pNFS: Ensure layoutreturn acts as a completion for layout callbacks pNFS: Fix CB_LAYOUTRECALL stateid verification pNFS: Always update the layout barrier seqid on LAYOUTGET pNFS: Always update the layout stateid if NFS_LAYOUT_INVALID_STID is set pNFS: Clear the layout return tracking on layout reinitialisation pNFS: LAYOUTRETURN should only update the stateid if the layout is valid nfs: don't create zero-length requests ...
| * | pNFS: Actively set attributes as invalid if LAYOUTCOMMIT is outstandingBenjamin Coddington2016-07-281-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A LAYOUTCOMMIT then subsequent GETATTR may both return the same attributes, and in that case NFS_INO_INVALID_ATTR is never set on the second pass through nfs_update_inode(). The existing check to skip the clearing of NFS_INO_INVALID_ATTR if a LAYOUTCOMMIT is outstanding does not help in this case (see commit 10b7e9ad4488: "pNFS: Don't mark the inode as revalidated if a LAYOUTCOMMIT is outstanding"). We know that if a LAYOUTCOMMIT is outstanding then attributes will need upating, so always set NFS_INO_INVALID_ATTR. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | NFSv4: Clean up lookup of SECINFO_NO_NAMETrond Myklebust2016-07-261-8/+2Star
| | | | | | | | | | | | | | | | | | | | | Use the minor version ops cached in struct nfs_client instead of looking them up again. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | NFSv4.2: Fix warning "variable ‘stateids’ set but not used"Trond Myklebust2016-07-241-2/+10
| | | | | | | | | | | | | | | | | | | | | Replace it with a test for whether or not the sent a stateid in violation of what we asked for. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | NFSv4: Fix warning "no previous prototype for ‘nfs4_listxattr’"Trond Myklebust2016-07-241-1/+1
| | | | | | | | | | | | | | | | | | Make it static Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | Merge branch 'nfs-rdma'Trond Myklebust2016-07-241-1/+5
| |\ \
| | * | NFS: Don't drop CB requests with invalid principalsChuck Lever2016-07-111-1/+5
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before commit 778be232a207 ("NFS do not find client in NFSv4 pg_authenticate"), the Linux callback server replied with RPC_AUTH_ERROR / RPC_AUTH_BADCRED, instead of dropping the CB request. Let's restore that behavior so the server has a chance to do something useful about it, and provide a warning that helps admins correct the problem. Fixes: 778be232a207 ("NFS do not find client in NFSv4 ...") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| * | Merge branch 'pnfs'Trond Myklebust2016-07-246-136/+218
| |\ \
| | * | pNFS: Remove redundant smp_mb() from pnfs_init_lseg()Trond Myklebust2016-07-241-1/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | It's not visible yet, and won't be until after we grab the inode->i_lock. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | pNFS: Cleanup - do layout segment initialisation in one placeTrond Myklebust2016-07-241-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | ...instead of splitting the initialisation over init_lseg() and pnfs_layout_process(). Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | pNFS: Remove redundant stateid invalidationTrond Myklebust2016-07-241-1/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The layout stateid will be invalidated once it holds no more layout segments anyway. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | pNFS: Remove redundant pnfs_mark_layout_returned_if_empty()Trond Myklebust2016-07-244-16/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | That's already being taken care of in pnfs_layout_remove_lseg(). Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | pNFS: Clear the layout metadata if the server changed the layout stateidTrond Myklebust2016-07-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the server changed the layout stateid's "other" field, then we should treat the old layout as being completely gone. In that case, we want to clear the metadata such as scheduled layoutreturns. Do this by calling pnfs_mark_layout_stateid_invalid(). Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | pNFS: Cleanup - don't open code pnfs_mark_layout_stateid_invalid()Trond Myklebust2016-07-244-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Ensure nfs42_layoutstat_done() layoutget don't open code layout stateid invalidation. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | NFS: pnfs_mark_matching_lsegs_return() should match the layout sequence idTrond Myklebust2016-07-241-14/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When determining which layout segments to return, we do want pnfs_mark_matching_lsegs_return to check that they match the layout sequence id. This ensures that we don't waste time if the server is replaying a layout recall that has already been satisfied. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | pNFS: Do not set plh_return_seq for non-callback related layoutreturnsTrond Myklebust2016-07-241-7/+6Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | In cases where we need to send a layoutreturn in order to propagate an error, we should not tie that to a specific layout stateid. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | pNFS: Ensure layoutreturn acts as a completion for layout callbacksTrond Myklebust2016-07-241-15/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we return NFS_OK to the CB_LAYOUTRECALL, we are required to send a layoutreturn that "completes" that layout recall request, using the correct stateid. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | pNFS: Fix CB_LAYOUTRECALL stateid verificationTrond Myklebust2016-07-241-19/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We want to evaluate in this order: If the client holds no layout for this inode, then return NFS4ERR_NOMATCHING_LAYOUT; it probably forgot the layout. If the client finds the inode among the list of layouts, but the corresponding stateid has not yet been initialised, then return NFS4ERR_DELAY to ask the server to retry once the outstanding LAYOUTGET is complete. If the current layout stateid's "other" field does not match the recalled stateid, return NFS4ERR_BAD_STATEID. If already processing a layout recall with a newer stateid, return NFS4ERR_OLD_STATEID. This can only happens for servers that are non-compliant with the NFSv4.1 protocol. If already processing a layout recall with an older stateid, return NFS4ERR_DELAY to ask the server to retry once the outstanding LAYOUTRETURN is complete. Again, this is technically incompliant with the NFSv4.1 protocol. If the current layout sequence id is newer than the recalled stateid's sequence id, return NFS4ERR_OLD_STATEID. This too implies protocol non-compliance. If the current layout sequence id is older than the recalled stateid's sequence id+1, return NFS4ERR_DELAY. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | pNFS: Always update the layout barrier seqid on LAYOUTGETTrond Myklebust2016-07-241-13/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, pnfs_set_layout_stateid() will update the layout sequence id barrier only if the stateid itself is newer than the current layout stateid. However in a situation where multiple LAYOUTGET calls and a LAYOUTRETURN raced, it is entirely possible for one of the LAYOUTGET to set the current stateid to something newer than the LAYOUTRETURN that needs to set the barrier. The fix is to allow the "update_barrier" flag to force a check as to whether or not the barrier needs to be updated. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | pNFS: Always update the layout stateid if NFS_LAYOUT_INVALID_STID is setTrond Myklebust2016-07-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | If the layout stateid is invalid, then pnfs_set_layout_stateid() must always initialise it. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | pNFS: Clear the layout return tracking on layout reinitialisationTrond Myklebust2016-07-241-5/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Ensure that we don't carry over layoutreturn info from a previous incarnation of this layout. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | pNFS: LAYOUTRETURN should only update the stateid if the layout is validTrond Myklebust2016-07-242-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | If the layout was completely returned, then ignore the returned layout stateid. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | Merge commit 'e7bdea7750eb'Trond Myklebust2016-07-248-23/+45
| | |\| | | | | | | | | | | | | Needed in order to work on top of pNFS changes in Linus' upstream kernel.
| | * | Fix NULL pointer dereference in bl_free_device().Artem Savkov2016-07-221-9/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When bl_parse_deviceid() fails in bl_alloc_deviceid_node() on blkdev_get_by_*() step we get an pnfs_block_dev struct that is uninitialized except for bdev field which is set to whatever error blkdev_get_by_*() returns. bl_free_device() then tries to call blkdev_put() if bdev is not 0 resulting in a wrong pointer dereference. Fixing this by setting bdev in struct pnfs_block_dev only if we didn't get an error from blkdev_get_by_*(). Signed-off-by: Artem Savkov <asavkov@redhat.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>