summaryrefslogtreecommitdiffstats
path: root/fs/xfs/libxfs/xfs_da_btree.c
Commit message (Collapse)AuthorAgeFilesLines
* xfs: rename flist/free_list to dfopsDarrick J. Wong2016-08-031-3/+3
| | | | | | | | | | Mechanical change of flist/free_list to dfops, since they're now deferred ops, not just a freeing list. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* libxfs: directory node splitting does not have an extra blockDave Chinner2016-07-221-30/+29Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | xfsprogs source commit 4280e59dcbc4cd8e01585efe788a68eb378048e8 xfs_da3_split() has to handle all three versions of the directory/attribute btree structure. The attr tree is v1, the dir tre is v2 or v3. The main difference between the v1 and v2/3 trees is the way tree nodes are split - in the v1 tree we can require a double split to occur because the object to be inserted may be larger than the space made by splitting a leaf. In this case we need to do a double split - one to split the full leaf, then another to allocate an empty leaf block in the correct location for the new entry. This does not happen with dir (v2/v3) formats as the objects being inserted are always guaranteed to fit into the new space in the split blocks. Indeed, for directories they *may* be an extra block on this buffer pointer. However, it's guaranteed not to be a leaf block (i.e. a directory data block) - the directory code only ever places hash index or free space blocks in this pointer (as a cursor of sorts), and so to use it as a directory data block will immediately corrupt the directory. The problem is that the code assumes that there may be extra blocks that we need to link into the tree once we've split the root, but this is not true for either dir or attr trees, because the extra attr block is always consumed by the last node split before we split the root. Hence the linking in an extra block is always wrong at the root split level, and this manifests itself in repair as a directory corruption in a repaired directory, leaving the directory rebuild incomplete. This is a dir v2 zero-day bug - it was in the initial dir v2 commit that was made back in February 1998. Fix this by ensuring the linking of the blocks after the root split never tries to make use of the extra blocks that may be held in the cursor. They are held there for other purposes and should never be touched by the root splitting code. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: print name of verifier if it failsEric Sandeen2016-01-041-0/+1
| | | | | | | | | | | | This adds a name to each buf_ops structure, so that if a verifier fails we can print the type of verifier that failed it. Should be a slight debugging aid, I hope. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: validate metadata LSNs against log on v5 superblocksBrian Foster2015-10-121-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since the onset of v5 superblocks, the LSN of the last modification has been included in a variety of on-disk data structures. This LSN is used to provide log recovery ordering guarantees (e.g., to ensure an older log recovery item is not replayed over a newer target data structure). While this works correctly from the point a filesystem is formatted and mounted, userspace tools have some problematic behaviors that defeat this mechanism. For example, xfs_repair historically zeroes out the log unconditionally (regardless of whether corruption is detected). If this occurs, the LSN of the filesystem is reset and the log is now in a problematic state with respect to on-disk metadata structures that might have a larger LSN. Until either the log catches up to the highest previously used metadata LSN or each affected data structure is modified and written out without incident (which resets the metadata LSN), log recovery is susceptible to filesystem corruption. This problem is ultimately addressed and repaired in the associated userspace tools. The kernel is still responsible to detect the problem and notify the user that something is wrong. Check the superblock LSN at mount time and fail the mount if it is invalid. From that point on, trigger verifier failure on any metadata I/O where an invalid LSN is detected. This results in a filesystem shutdown and guarantees that we do not log metadata changes with invalid LSNs on disk. Since this is a known issue with a known recovery path, present a warning to instruct the user how to recover. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* Merge branch 'xfs-misc-fixes-for-4.3-4' into for-nextDave Chinner2015-09-011-0/+1
|\
| * libxfs: bad magic number should set da block buffer errorDarrick J. Wong2015-08-281-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | If xfs_da3_node_read_verify() doesn't recognize the magic number of a buffer it's just read, set the buffer error to -EFSCORRUPTED so that the error can be sent up to userspace. Without this patch we'll notice the bad magic eventually while trying to traverse or change the block, but we really ought to fail early in the verifier. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* | Merge branch 'xfs-misc-fixes-for-4.3-2' into for-nextDave Chinner2015-08-201-9/+14
|\ \
| * | xfs: swap leaf buffer into path struct atomically during path shiftBrian Foster2015-08-191-9/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The node directory lookup code uses a state structure that tracks the path of buffers used to search for the hash of a filename through the leaf blocks. When the lookup encounters a block that ends with the requested hash, but the entry has not yet been found, it must shift over to the next block and continue looking for the entry (i.e., duplicate hashes could continue over into the next block). This shift mechanism involves walking back up and down the state structure, replacing buffers at the appropriate btree levels as necessary. When a buffer is replaced, the old buffer is released and the new buffer read into the active slot in the path structure. Because the buffer is read directly into the path slot, a buffer read failure can result in setting a NULL buffer pointer in an active slot. This throws off the state cleanup code in xfs_dir2_node_lookup(), which expects to release a buffer from each active slot. Instead, a BUG occurs due to a NULL pointer dereference: BUG: unable to handle kernel NULL pointer dereference at 00000000000001e8 IP: [<ffffffffa0585063>] xfs_trans_brelse+0x2a3/0x3c0 [xfs] ... RIP: 0010:[<ffffffffa0585063>] [<ffffffffa0585063>] xfs_trans_brelse+0x2a3/0x3c0 [xfs] ... Call Trace: [<ffffffffa05250c6>] xfs_dir2_node_lookup+0xa6/0x2c0 [xfs] [<ffffffffa0519f7c>] xfs_dir_lookup+0x1ac/0x1c0 [xfs] [<ffffffffa055d0e1>] xfs_lookup+0x91/0x290 [xfs] [<ffffffffa05580b3>] xfs_vn_lookup+0x73/0xb0 [xfs] [<ffffffff8122de8d>] lookup_real+0x1d/0x50 [<ffffffff8123330e>] path_openat+0x91e/0x1490 [<ffffffff81235079>] do_filp_open+0x89/0x100 ... This has been reproduced via a parallel fsstress and filesystem shutdown workload in a loop. The shutdown triggers the read error in the aforementioned codepath and causes the BUG in xfs_dir2_node_lookup(). Update xfs_da3_path_shift() to update the active path slot atomically with respect to the caller when a buffer is replaced. This ensures that the caller always sees the old or new buffer in the slot and prevents the NULL pointer dereference. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* | | Merge branch 'xfs-meta-uuid' into for-nextDave Chinner2015-07-291-2/+2
|\| |
| * | xfs: create new metadata UUID field and incompat flagEric Sandeen2015-07-291-2/+2
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds a new superblock field, sb_meta_uuid. If set, along with a new incompat flag, the code will use that field on a V5 filesystem to compare to metadata UUIDs, which allows us to change the user- visible UUID at will. Userspace handles the setting and clearing of the incompat flag as appropriate, as the UUID gets changed; i.e. setting the user-visible UUID back to the original UUID (as stored in the new field) will remove the incompatible feature flag. If the incompat flag is not set, this copies the user-visible UUID into into the meta_uuid slot in memory when the superblock is read from disk; the meta_uuid field is not written back to disk in this case. The remainder of this patch simply switches verifiers, initializers, etc to use the new sb_meta_uuid field. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* / xfs: xfs_bunmapi() does not need XFS_BMAPI_METADATA flagDave Chinner2015-07-291-2/+2
|/ | | | | | | | | | | xfs_bunmapi() doesn't care what type of extent is being freed and does not look at the XFS_BMAPI_METADATA flag at all. As such we can remove the XFS_BMAPI_METADATA from all callers that use it. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: fix shadow warning in xfs_da3_root_split()Fabian Frederick2015-03-251-4/+4
| | | | | | | | | | Use icnodehdr for struct xfs_da3_icnode_hdr instead of nodehdr (already declared above). Signed-off-by: Fabian Frederick <fabf@skynet.be> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* Merge branch 'xfs-misc-fixes-for-3.19-2' into for-nextDave Chinner2014-12-031-4/+0Star
|\ | | | | | | | | Conflicts: fs/xfs/xfs_iops.c
| * xfs: fix set-but-unused warningsDave Chinner2014-12-031-4/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | The kernel compile doesn't turn on these checks by default, so it's only when I do a kernel-user sync that I find that there are lots of compiler warnings waiting to be fixed. Fix up these set-but-unused warnings. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
* | xfs: move most of xfs_sb.h to xfs_format.hChristoph Hellwig2014-11-281-1/+0Star
| | | | | | | | | | | | | | | | | | More on-disk format consolidation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* | xfs: merge xfs_ag.h into xfs_format.hChristoph Hellwig2014-11-281-1/+0Star
|/ | | | | | | | | | More on-disk format consolidation. A few declarations that weren't on-disk format related move into better suitable spots. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: don't send null bp to xfs_trans_brelse()Eric Sandeen2014-10-021-1/+2
| | | | | | | | | | | | | | | In this case, if bp is NULL, error is set, and we send a NULL bp to xfs_trans_brelse, which will try to dereference it. Test whether we actually have a buffer before we try to free it. Coverity spotted this. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: require 64-bit sector_tChristoph Hellwig2014-07-301-1/+1
| | | | | | | | | | | | | Trying to support tiny disks only and saving a bit memory might have made sense on an SGI O2 15 years ago, but is pretty pointless today. Remove the rarely tested codepath that uses various smaller in-memory types to reduce our test matrix and make the codebase a little bit smaller and less complicated. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: global error sign conversionDave Chinner2014-06-251-21/+21
| | | | | | | | | | | | | | | | | | | | | | | | | Convert all the errors the core XFs code to negative error signs like the rest of the kernel and remove all the sign conversion we do in the interface layers. Errors for conversion (and comparison) found via searches like: $ git grep " E" fs/xfs $ git grep "return E" fs/xfs $ git grep " E[A-Z].*;$" fs/xfs Negation points found via searches like: $ git grep "= -[a-z,A-Z]" fs/xfs $ git grep "return -[a-z,A-D,F-Z]" fs/xfs $ git grep " -[a-z].*;" fs/xfs [ with some bits I missed from Brian Foster ] Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* libxfs: move source filesDave Chinner2014-06-251-0/+2665
Move all the source files that are shared with userspace into libxfs/. This is done as one big chunk simpy to get it done quickly Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>