summaryrefslogtreecommitdiffstats
path: root/fs
Commit message (Collapse)AuthorAgeFilesLines
...
| * | | | | ext4: avoid potential extra brelse in setup_new_flex_group_blocks() Vasily Averin2018-11-031-6/+2Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently bh is set to NULL only during first iteration of for cycle, then this pointer is not cleared after end of using. Therefore rollback after errors can lead to extra brelse(bh) call, decrements bh counter and later trigger an unexpected warning in __brelse() Patch moves brelse() calls in body of cycle to exclude requirement of brelse() call in rollback. Fixes: 33afdcc5402d ("ext4: add a function which sets up group blocks ...") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org # 3.3+
* | | | | | Merge branch 'for-linus' of ↵Linus Torvalds2018-11-101-5/+17
|\ \ \ \ \ \ | | |_|/ / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull namespace fixes from Eric Biederman: "I believe all of these are simple obviously correct bug fixes. These fall into two groups: - Fixing the implementation of MNT_LOCKED which prevents lesser privileged users from seeing unders mounts created by more privileged users. - Fixing the extended uid and group mapping in user namespaces. As well as ensuring the code looks correct I have spot tested these changes as well and in my testing the fixes are working" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: mount: Prevent MNT_DETACH from disconnecting locked mounts mount: Don't allow copying MNT_UNBINDABLE|MNT_LOCKED mounts mount: Retest MNT_LOCKED in do_umount userns: also map extents in the reverse map to kernel IDs
| * | | | | mount: Prevent MNT_DETACH from disconnecting locked mountsEric W. Biederman2018-11-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Timothy Baldwin <timbaldwin@fastmail.co.uk> wrote: > As per mount_namespaces(7) unprivileged users should not be able to look under mount points: > > Mounts that come as a single unit from more privileged mount are locked > together and may not be separated in a less privileged mount namespace. > > However they can: > > 1. Create a mount namespace. > 2. In the mount namespace open a file descriptor to the parent of a mount point. > 3. Destroy the mount namespace. > 4. Use the file descriptor to look under the mount point. > > I have reproduced this with Linux 4.16.18 and Linux 4.18-rc8. > > The setup: > > $ sudo sysctl kernel.unprivileged_userns_clone=1 > kernel.unprivileged_userns_clone = 1 > $ mkdir -p A/B/Secret > $ sudo mount -t tmpfs hide A/B > > > "Secret" is indeed hidden as expected: > > $ ls -lR A > A: > total 0 > drwxrwxrwt 2 root root 40 Feb 12 21:08 B > > A/B: > total 0 > > > The attack revealing "Secret": > > $ unshare -Umr sh -c "exec unshare -m ls -lR /proc/self/fd/4/ 4<A" > /proc/self/fd/4/: > total 0 > drwxr-xr-x 3 root root 60 Feb 12 21:08 B > > /proc/self/fd/4/B: > total 0 > drwxr-xr-x 2 root root 40 Feb 12 21:08 Secret > > /proc/self/fd/4/B/Secret: > total 0 I tracked this down to put_mnt_ns running passing UMOUNT_SYNC and disconnecting all of the mounts in a mount namespace. Fix this by factoring drop_mounts out of drop_collected_mounts and passing 0 instead of UMOUNT_SYNC. There are two possible behavior differences that result from this. - No longer setting UMOUNT_SYNC will no longer set MNT_SYNC_UMOUNT on the vfsmounts being unmounted. This effects the lazy rcu walk by kicking the walk out of rcu mode and forcing it to be a non-lazy walk. - No longer disconnecting locked mounts will keep some mounts around longer as they stay because the are locked to other mounts. There are only two users of drop_collected mounts: audit_tree.c and put_mnt_ns. In audit_tree.c the mounts are private and there are no rcu lazy walks only calls to iterate_mounts. So the changes should have no effect except for a small timing effect as the connected mounts are disconnected. In put_mnt_ns there may be references from process outside the mount namespace to the mounts. So the mounts remaining connected will be the bug fix that is needed. That rcu walks are allowed to continue appears not to be a problem especially as the rcu walk change was about an implementation detail not about semantics. Cc: stable@vger.kernel.org Fixes: 5ff9d8a65ce8 ("vfs: Lock in place mounts from more privileged users") Reported-by: Timothy Baldwin <timbaldwin@fastmail.co.uk> Tested-by: Timothy Baldwin <timbaldwin@fastmail.co.uk> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
| * | | | | mount: Don't allow copying MNT_UNBINDABLE|MNT_LOCKED mountsEric W. Biederman2018-11-081-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jonathan Calmels from NVIDIA reported that he's able to bypass the mount visibility security check in place in the Linux kernel by using a combination of the unbindable property along with the private mount propagation option to allow a unprivileged user to see a path which was purposefully hidden by the root user. Reproducer: # Hide a path to all users using a tmpfs root@castiana:~# mount -t tmpfs tmpfs /sys/devices/ root@castiana:~# # As an unprivileged user, unshare user namespace and mount namespace stgraber@castiana:~$ unshare -U -m -r # Confirm the path is still not accessible root@castiana:~# ls /sys/devices/ # Make /sys recursively unbindable and private root@castiana:~# mount --make-runbindable /sys root@castiana:~# mount --make-private /sys # Recursively bind-mount the rest of /sys over to /mnnt root@castiana:~# mount --rbind /sys/ /mnt # Access our hidden /sys/device as an unprivileged user root@castiana:~# ls /mnt/devices/ breakpoint cpu cstate_core cstate_pkg i915 intel_pt isa kprobe LNXSYSTM:00 msr pci0000:00 platform pnp0 power software system tracepoint uncore_arb uncore_cbox_0 uncore_cbox_1 uprobe virtual Solve this by teaching copy_tree to fail if a mount turns out to be both unbindable and locked. Cc: stable@vger.kernel.org Fixes: 5ff9d8a65ce8 ("vfs: Lock in place mounts from more privileged users") Reported-by: Jonathan Calmels <jcalmels@nvidia.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
| * | | | | mount: Retest MNT_LOCKED in do_umountEric W. Biederman2018-11-081-2/+8
| | |_|/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It was recently pointed out that the one instance of testing MNT_LOCKED outside of the namespace_sem is in ksys_umount. Fix that by adding a test inside of do_umount with namespace_sem and the mount_lock held. As it helps to fail fails the existing test is maintained with an additional comment pointing out that it may be racy because the locks are not held. Cc: stable@vger.kernel.org Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Fixes: 5ff9d8a65ce8 ("vfs: Lock in place mounts from more privileged users") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* | | | | Merge tag 'ceph-for-4.20-rc2' of https://github.com/ceph/ceph-clientLinus Torvalds2018-11-093-12/+14
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull Ceph fixes from Ilya Dryomov: "Two CephFS fixes (copy_file_range and quota) and a small feature bit cleanup" * tag 'ceph-for-4.20-rc2' of https://github.com/ceph/ceph-client: libceph: assume argonaut on the server side ceph: quota: fix null pointer dereference in quota check ceph: add destination file data sync before doing any remote copy
| * | | | | libceph: assume argonaut on the server sideIlya Dryomov2018-11-081-9/+3Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | No one is running pre-argonaut. In addition one of the argonaut features (NOSRCADDR) has been required since day one (and a half, 2.6.34 vs 2.6.35) of the kernel client. Allow for the possibility of reusing these feature bits later. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com>
| * | | | | ceph: quota: fix null pointer dereference in quota checkLuis Henriques2018-11-081-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a possible null pointer dereference in check_quota_exceeded, detected by the static checker smatch, with the following warning:    fs/ceph/quota.c:240 check_quota_exceeded()     error: we previously assumed 'realm' could be null (see line 188) Fixes: b7a2921765cf ("ceph: quota: support for ceph.quota.max_files") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Luis Henriques <lhenriques@suse.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
| * | | | | ceph: add destination file data sync before doing any remote copyLuis Henriques2018-11-081-2/+9
| |/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we try to copy into a file that was just written, any data that is remote copied will be overwritten by our buffered writes once they are flushed.  When this happens, the call to invalidate_inode_pages2_range will also return a -EBUSY error. This patch fixes this by also sync'ing the destination file before starting any copy. Fixes: 503f82a9932d ("ceph: support copy_file_range file operation") Signed-off-by: Luis Henriques <lhenriques@suse.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
* | | | | xfs: fix overflow in xfs_attr3_leaf_verifyDave Chinner2018-11-061-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | generic/070 on 64k block size filesystems is failing with a verifier corruption on writeback or an attribute leaf block: [ 94.973083] XFS (pmem0): Metadata corruption detected at xfs_attr3_leaf_verify+0x246/0x260, xfs_attr3_leaf block 0x811480 [ 94.975623] XFS (pmem0): Unmount and run xfs_repair [ 94.976720] XFS (pmem0): First 128 bytes of corrupted metadata buffer: [ 94.978270] 000000004b2e7b45: 00 00 00 00 00 00 00 00 3b ee 00 00 00 00 00 00 ........;....... [ 94.980268] 000000006b1db90b: 00 00 00 00 00 81 14 80 00 00 00 00 00 00 00 00 ................ [ 94.982251] 00000000433f2407: 22 7b 5c 82 2d 5c 47 4c bb 31 1c 37 fa a9 ce d6 "{\.-\GL.1.7.... [ 94.984157] 0000000010dc7dfb: 00 00 00 00 00 81 04 8a 00 0a 18 e8 dd 94 01 00 ................ [ 94.986215] 00000000d5a19229: 00 a0 dc f4 fe 98 01 68 f0 d8 07 e0 00 00 00 00 .......h........ [ 94.988171] 00000000521df36c: 0c 2d 32 e2 fe 20 01 00 0c 2d 58 65 fe 0c 01 00 .-2.. ...-Xe.... [ 94.990162] 000000008477ae06: 0c 2d 5b 66 fe 8c 01 00 0c 2d 71 35 fe 7c 01 00 .-[f.....-q5.|.. [ 94.992139] 00000000a4a6bca6: 0c 2d 72 37 fc d4 01 00 0c 2d d8 b8 f0 90 01 00 .-r7.....-...... [ 94.994789] XFS (pmem0): xfs_do_force_shutdown(0x8) called from line 1453 of file fs/xfs/xfs_buf.c. Return address = ffffffff815365f3 This is failing this check: end = ichdr.freemap[i].base + ichdr.freemap[i].size; if (end < ichdr.freemap[i].base) >>>>> return __this_address; if (end > mp->m_attr_geo->blksize) return __this_address; And from the buffer output above, the freemap array is: freemap[0].base = 0x00a0 freemap[0].size = 0xdcf4 end = 0xdd94 freemap[1].base = 0xfe98 freemap[1].size = 0x0168 end = 0x10000 freemap[2].base = 0xf0d8 freemap[2].size = 0x07e0 end = 0xf8b8 These all look valid - the block size is 0x10000 and so from the last check in the above verifier fragment we know that the end of freemap[1] is valid. The problem is that end is declared as: uint16_t end; And (uint16_t)0x10000 = 0. So we have a verifier bug here, not a corruption. Fix the verifier to use uint32_t types for the check and hence avoid the overflow. Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=201577 Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
* | | | | xfs: print buffer offsets when dumping corrupt buffersDarrick J. Wong2018-11-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use DUMP_PREFIX_OFFSET when printing hex dumps of corrupt buffers because modern Linux now prints a 32-bit hash of our 64-bit pointer when using DUMP_PREFIX_ADDRESS: 00000000b4bb4297: 00 00 00 00 00 00 00 00 3b ee 00 00 00 00 00 00 ........;....... 00000005ec77e26: 00 00 00 00 02 d0 5a 00 00 00 00 00 00 00 00 00 ......Z......... 000000015938018: 21 98 e8 b4 fd de 4c 07 bc ea 3c e5 ae b4 7c 48 !.....L...<...|H This is totally worthless for a sequential dump since we probably only care about tracking the buffer offsets and afaik there's no way to recover the actual pointer from the hashed value. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* | | | | xfs: Fix error code in 'xfs_ioc_getbmap()'Christophe JAILLET2018-11-061-1/+1
|/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In this function, once 'buf' has been allocated, we unconditionally return 0. However, 'error' is set to some error codes in several error handling paths. Before commit 232b51948b99 ("xfs: simplify the xfs_getbmap interface") this was not an issue because all error paths were returning directly, but now that some cleanup at the end may be needed, we must propagate the error code. Fixes: 232b51948b99 ("xfs: simplify the xfs_getbmap interface") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
* | | | Merge tag 'tags/upstream-4.20-rc1' of git://git.infradead.org/linux-ubifsLinus Torvalds2018-11-0421-292/+1982
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull UBIFS updates from Richard Weinberger: - Full filesystem authentication feature, UBIFS is now able to have the whole filesystem structure authenticated plus user data encrypted and authenticated. - Minor cleanups * tag 'tags/upstream-4.20-rc1' of git://git.infradead.org/linux-ubifs: (26 commits) ubifs: Remove unneeded semicolon Documentation: ubifs: Add authentication whitepaper ubifs: Enable authentication support ubifs: Do not update inode size in-place in authenticated mode ubifs: Add hashes and HMACs to default filesystem ubifs: authentication: Authenticate super block node ubifs: Create hash for default LPT ubfis: authentication: Authenticate master node ubifs: authentication: Authenticate LPT ubifs: Authenticate replayed journal ubifs: Add auth nodes to garbage collector journal head ubifs: Add authentication nodes to journal ubifs: authentication: Add hashes to index nodes ubifs: Add hashes to the tree node cache ubifs: Create functions to embed a HMAC in a node ubifs: Add helper functions for authentication support ubifs: Add separate functions to init/crc a node ubifs: Format changes for authentication support ubifs: Store read superblock node ubifs: Drop write_node ...
| * | | | ubifs: Remove unneeded semicolonDing Xiang2018-10-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | delete redundant semicolon Signed-off-by: Ding Xiang <dingxiang@cmss.chinamobile.com> Signed-off-by: Richard Weinberger <richard@nod.at>
| * | | | ubifs: Enable authentication supportSascha Hauer2018-10-232-1/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the preparations all being done this patch now enables authentication support for UBIFS. Authentication is enabled when the newly introduced auth_key and auth_hash_name mount options are passed. auth_key provides the key which is used for authentication whereas auth_hash_name provides the hashing algorithm used for this FS. Passing these options make authentication mandatory and only UBIFS images that can be authenticated with the given key are allowed. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Richard Weinberger <richard@nod.at>
| * | | | ubifs: Do not update inode size in-place in authenticated modeSascha Hauer2018-10-233-38/+113
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In authenticated mode we cannot fixup the inode sizes in-place during recovery as this would invalidate the hashes and HMACs we stored for this inode. Instead, we just write the updated inodes to the journal. We can only do this after ubifs_rcvry_gc_commit() is done though, so for authenticated mode call ubifs_recover_size() after ubifs_rcvry_gc_commit() and not vice versa as normally done. Calling ubifs_recover_size() after ubifs_rcvry_gc_commit() has the drawback that after a commit the size fixup information is gone, so when a powercut happens while recovering from another powercut we may lose some data written right before the first powercut. This is why we only do this in authenticated mode and leave the behaviour for unauthenticated mode untouched. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Richard Weinberger <richard@nod.at>
| * | | | ubifs: Add hashes and HMACs to default filesystemSascha Hauer2018-10-231-7/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch calculates the necessary hashes and HMACs for the default filesystem so that the dynamically created default fs can be authenticated. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Richard Weinberger <richard@nod.at>
| * | | | ubifs: authentication: Authenticate super block nodeSascha Hauer2018-10-231-1/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds a HMAC covering the super block node and adds the logic that decides if a filesystem shall be mounted unauthenticated or authenticated. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Richard Weinberger <richard@nod.at>
| * | | | ubifs: Create hash for default LPTSascha Hauer2018-10-233-3/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During creation of the default filesystem on an empty flash the default LPT is created. With this patch a hash over the default LPT is calculated which can be added to the default filesystems master node. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Richard Weinberger <richard@nod.at>
| * | | | ubfis: authentication: Authenticate master nodeSascha Hauer2018-10-233-10/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The master node contains hashes over the root index node and the LPT. This patch adds a HMAC to authenticate the master node itself. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Richard Weinberger <richard@nod.at>
| * | | | ubifs: authentication: Authenticate LPTSascha Hauer2018-10-233-0/+134
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The LPT needs to be authenticated aswell. Since the LPT is only written during commit it is enough to authenticate the whole LPT with a single hash which is stored in the master node. Only the leaf nodes (pnodes) are hashed which makes the implementation much simpler than it would be to hash the complete LPT. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Richard Weinberger <richard@nod.at>
| * | | | ubifs: Authenticate replayed journalSascha Hauer2018-10-231-2/+144
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make sure that during replay all buds can be authenticated. To do this we calculate the hash chain until we find an authentication node and check the HMAC in that node against the current status of the hash chain. After a power cut it can happen that some nodes have been written, but not yet the authentication node for them. These nodes have to be discarded during replay. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Richard Weinberger <richard@nod.at>
| * | | | ubifs: Add auth nodes to garbage collector journal headSascha Hauer2018-10-231-3/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To be able to authenticate the garbage collector journal head add authentication nodes to the buds the garbage collector creates. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Richard Weinberger <richard@nod.at>
| * | | | ubifs: Add authentication nodes to journalSascha Hauer2018-10-236-18/+153
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Nodes that are written to flash can only be authenticated through the index after the next commit. When a journal replay is necessary the nodes are not yet referenced by the index and thus can't be authenticated. This patch overcomes this situation by creating a hash over all nodes beginning from the commit start node over the reference node(s) and the buds themselves. From time to time we insert authentication nodes. Authentication nodes contain a HMAC from the current hash state, so that they can be used to authenticate a journal replay up to the point where the authentication node is. The hash is continued afterwards so that theoretically we would only have to check the HMAC of the last authentication node we find. Overall we get this picture: ,,,,,,,, ,......,........................................... ,. CS , hash1.----. hash2.----. ,. | , . |hmac . |hmac ,. v , . v . v ,.REF#0,-> bud -> bud -> bud.-> auth -> bud -> bud.-> auth ... ,..|...,........................................... , | , , | ,,,,,,,,,,,,,,, . | hash3,----. , | , |hmac , v , v , REF#1 -> bud -> bud,-> auth ... ,,,|,,,,,,,,,,,,,,,,,, v REF#2 -> ... | V ... Note how hash3 covers CS, REF#0 and REF#1 so that it is not possible to exchange or skip any reference nodes. Unlike the picture suggests the auth nodes themselves are not hashed. With this it is possible for an offline attacker to cut each journal head or to drop the last reference node(s), but not to skip any journal heads or to reorder any operations. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Richard Weinberger <richard@nod.at>
| * | | | ubifs: authentication: Add hashes to index nodesSascha Hauer2018-10-237-14/+81
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With this patch the hashes over the index nodes stored in the tree node cache are written to flash and are checked when read back from flash. The hash of the root index node is stored in the master node. During journal replay the hashes are regenerated from the read nodes and stored in the tree node cache. This means the nodes must previously be authenticated by other means. This is done in a later patch. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Richard Weinberger <richard@nod.at>
| * | | | ubifs: Add hashes to the tree node cacheSascha Hauer2018-10-234-30/+135
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As part of the UBIFS authentication support every branch in the index gets a hash covering the referenced node. To make that happen the tree node cache needs hashes over the nodes. This patch adds a hash argument to ubifs_tnc_add() and ubifs_tnc_add_nm(). The hashes are calculated from the callers of these functions which actually prepare the nodes. With this patch all the leaf nodes of the index tree get hashes, but currently nothing is done with these hashes, this is left for a later patch. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Richard Weinberger <richard@nod.at>
| * | | | ubifs: Create functions to embed a HMAC in a nodeSascha Hauer2018-10-232-6/+70
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With authentication support some nodes (master node, super block node) get a HMAC embedded into them. This patch adds functions to prepare and write such a node. The difficulty is that besides the HMAC the nodes also have a CRC which must stay valid. This means we first have to initialize all fields in the node, then calculate the HMAC (not covering the CRC) and finally calculate the CRC. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Richard Weinberger <richard@nod.at>
| * | | | ubifs: Add helper functions for authentication supportSascha Hauer2018-10-234-0/+722
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds the various helper functions needed for authentication support. We need functions to hash nodes, to embed HMACs into a node and to compare hashes and HMACs. Most functions first check if this filesystem is authenticated and bail out early if not, which makes the functions safe to be called with disabled authentication. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Richard Weinberger <richard@nod.at>
| * | | | ubifs: Add separate functions to init/crc a nodeSascha Hauer2018-10-232-15/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When adding authentication support we will embed a HMAC into some nodes. To prepare these nodes we have to first initialize the nodes, then add a HMAC and finally add a CRC. To accomplish this add separate ubifs_init_node/ubifs_crc_node functions. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Richard Weinberger <richard@nod.at>
| * | | | ubifs: Format changes for authentication supportSascha Hauer2018-10-233-3/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds the changes to the on disk format needed for authentication support. We'll add: * a HMAC covering super block node * a HMAC covering the master node * a hash over the root index node to the master node * a hash over the LPT to the master node * a flag to the filesystem flag indicating the filesystem is authenticated * an authentication node necessary to authenticate the nodes written to the journal heads while they are written. * a HMAC of a well known message to the super block node to be able to check if the correct key is provided And finally, not visible in this patch, nevertheless explained here: * hashes over the referenced child nodes in each branch of a index node Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Richard Weinberger <richard@nod.at>
| * | | | ubifs: Store read superblock nodeSascha Hauer2018-10-233-22/+8Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The superblock node is read/modified/written several times throughout the UBIFS code. Instead of reading it from the device each time just keep a copy in memory and write back the modified copy when necessary. This patch helps for authentication support, here we not only have to read the superblock node, but also have to authenticate it, which is easier if we do it once during initialization. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Richard Weinberger <richard@nod.at>
| * | | | ubifs: Drop write_nodeSascha Hauer2018-10-231-34/+5Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | write_node() is used only once and can easily be replaced with calls to ubifs_prepare_node()/write_head() which makes the code a bit shorter. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Richard Weinberger <richard@nod.at>
| * | | | ubifs: Implement ubifs_lpt_lookup using ubifs_pnode_lookupSascha Hauer2018-10-231-18/+2Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ubifs_lpt_lookup() starts by looking up the nth pnode in the LPT. We already have this functionality in ubifs_pnode_lookup(). Use this function rather than open coding its functionality. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Richard Weinberger <richard@nod.at>
| * | | | ubifs: Export pnode_lookup as ubifs_pnode_lookupSascha Hauer2018-10-233-36/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ubifs_lpt_lookup could be implemented using pnode_lookup. To make that possible move pnode_lookup from lpt.c to lpt_commit.c. Rename it to ubifs_pnode_lookup since it's now exported. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Richard Weinberger <richard@nod.at>
| * | | | ubifs: Pass ubifs_zbranch to read_znode()Sascha Hauer2018-10-231-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | read_znode() takes len, lnum and offs arguments which the caller all extracts from the same struct ubifs_zbranch *. When adding authentication support we would have to add a pointer to a hash to the arguments which is also part of struct ubifs_zbranch. Pass the ubifs_zbranch * instead so that we do not have to add another argument. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Richard Weinberger <richard@nod.at>
| * | | | ubifs: Pass ubifs_zbranch to try_read_node()Sascha Hauer2018-10-231-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | try_read_node() takes len, lnum and offs arguments which the caller all extracts from the same struct ubifs_zbranch *. When adding authentication support we would have to add a pointer to a hash to the arguments which is also part of struct ubifs_zbranch. Pass the ubifs_zbranch * instead so that we do not have to add another argument. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Richard Weinberger <richard@nod.at>
| * | | | ubifs: Refactor create_default_filesystem()Sascha Hauer2018-10-231-48/+47Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | create_default_filesystem() allocates memory for a node, writes that node and frees the memory directly afterwards. With this patch we allocate memory for all nodes at the beginning of the function and free the memory at the end. This makes it easier to implement authentication support since with authentication support we'll need the contents of some nodes when creating other nodes. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Richard Weinberger <richard@nod.at>
* | | | | Merge tag 'nfs-for-4.20-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds2018-11-041-1/+1
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull NFS client bugfixes from Trond Myklebust: "Highlights include: Bugfix: - Fix build issues on architectures that don't provide 64-bit cmpxchg Cleanups: - Fix a spelling mistake" * tag 'nfs-for-4.20-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFS: fix spelling mistake, EACCESS -> EACCES SUNRPC: Use atomic(64)_t for seq_send(64)
| * | | | | NFS: fix spelling mistake, EACCESS -> EACCESColin Ian King2018-11-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Trivial fix to a spelling mistake of the error access name EACCESS, rename to EACCES Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
* | | | | | Merge tag '4.20-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds2018-11-0312-99/+530
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull cifs fixes and updates from Steve French: "Three small fixes (one Kerberos related, one for stable, and another fixes an oops in xfstest 377), two helpful debugging improvements, three patches for cifs directio and some minor cleanup" * tag '4.20-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: fix signed/unsigned mismatch on aio_read patch cifs: don't dereference smb_file_target before null check CIFS: Add direct I/O functions to file_operations CIFS: Add support for direct I/O write CIFS: Add support for direct I/O read smb3: missing defines and structs for reparse point handling smb3: allow more detailed protocol info on open files for debugging smb3: on kerberos mount if server doesn't specify auth type use krb5 smb3: add trace point for tree connection cifs: fix spelling mistake, EACCESS -> EACCES cifs: fix return value for cifs_listxattr
| * | | | | | cifs: fix signed/unsigned mismatch on aio_read patchSteve French2018-11-021-6/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The patch "CIFS: Add support for direct I/O read" had a signed/unsigned mismatch (ssize_t vs. size_t) in the return from one function. Similar trivial change in aio_write Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reported-by: Julia Lawall <julia.lawall@lip6.fr>
| * | | | | | cifs: don't dereference smb_file_target before null checkColin Ian King2018-11-021-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a null check on dst_file->private data which suggests it can be potentially null. However, before this check, pointer smb_file_target is derived from dst_file->private and dereferenced in the call to tlink_tcon, hence there is a potential null pointer deference. Fix this by assigning smb_file_target and target_tcon after the null pointer sanity checks. Detected by CoverityScan, CID#1475302 ("Dereference before null check") Fixes: 04b38d601239 ("vfs: pull btrfs clone API to vfs layer") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Steve French <stfrench@microsoft.com>
| * | | | | | CIFS: Add direct I/O functions to file_operationsLong Li2018-11-021-6/+4Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With direct read/write functions implemented, add them to file_operations. Dircet I/O is used under two conditions: 1. When mounting with "cache=none", CIFS uses direct I/O for all user file data transfer. 2. When opening a file with O_DIRECT, CIFS uses direct I/O for all data transfer on this file. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
| * | | | | | CIFS: Add support for direct I/O writeLong Li2018-11-022-41/+164
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With direct I/O write, user supplied buffers are pinned to the memory and data are transferred directly from user buffers to the transport layer. Change in v3: add support for kernel AIO Change in v4: Refactor common write code to __cifs_writev for direct and non-direct I/O. Retry on direct I/O failure. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
| * | | | | | CIFS: Add support for direct I/O readLong Li2018-11-023-39/+192
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With direct I/O read, we transfer the data directly from transport layer to the user data buffer. Change in v3: add support for kernel AIO Change in v4: Refactor common read code to __cifs_readv for direct and non-direct I/O. Retry on direct I/O failure. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
| * | | | | | smb3: missing defines and structs for reparse point handlingSteve French2018-11-022-0/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We were missing some structs from MS-FSCC relating to reparse point handling. Add them to protocol defines in smb2pdu.h Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com>
| * | | | | | smb3: allow more detailed protocol info on open files for debuggingSteve French2018-11-024-0/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to debug complex problems it is often helpful to have detailed information on the client and server view of the open file information. Add the ability for root to view the list of smb3 open files and dump the persistent handle and other info so that it can be more easily correlated with server logs. Sample output from "cat /proc/fs/cifs/open_files" # Version:1 # Format: # <tree id> <persistent fid> <flags> <count> <pid> <uid> <filename> <mid> 0x5 0x800000378 0x8000 1 7704 0 some-file 0x14 0xcb903c0c 0x84412e67 0x8000 1 7754 1001 rofile 0x1a6d 0xcb903c0c 0x9526b767 0x8000 1 7720 1000 file 0x1a5b 0xcb903c0c 0x9ce41a21 0x8000 1 7715 0 smallfile 0xd67 Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
| * | | | | | smb3: on kerberos mount if server doesn't specify auth type use krb5Steve French2018-11-021-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some servers (e.g. Azure) do not include a spnego blob in the SMB3 negotiate protocol response, so on kerberos mounts ("sec=krb5") we can fail, as we expected the server to list its supported auth types (OIDs in the spnego blob in the negprot response). Change this so that on krb5 mounts we default to trying krb5 if the server doesn't list its supported protocol mechanisms. Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> CC: Stable <stable@vger.kernel.org>
| * | | | | | smb3: add trace point for tree connectionSteve French2018-11-022-1/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In debugging certain scenarios, especially reconnect cases, it can be helpful to have a dynamic trace point for the result of tree connect. See sample output below from a reconnect event. The new event is 'smb3_tcon' TASK-PID CPU# |||| TIMESTAMP FUNCTION | | | |||| | | cifsd-6071 [001] .... 2659.897923: smb3_reconnect: server=localhost current_mid=0xa kworker/1:1-71 [001] .... 2666.026342: smb3_cmd_done: sid=0x0 tid=0x0 cmd=0 mid=0 kworker/1:1-71 [001] .... 2666.026576: smb3_cmd_err: sid=0xc49e1787 tid=0x0 cmd=1 mid=1 status=0xc0000016 rc=-5 kworker/1:1-71 [001] .... 2666.031677: smb3_cmd_done: sid=0xc49e1787 tid=0x0 cmd=1 mid=2 kworker/1:1-71 [001] .... 2666.031921: smb3_cmd_done: sid=0xc49e1787 tid=0x6e78f05f cmd=3 mid=3 kworker/1:1-71 [001] .... 2666.031923: smb3_tcon: xid=0 sid=0xc49e1787 tid=0x0 unc_name=\\localhost\test rc=0 kworker/1:1-71 [001] .... 2666.032097: smb3_cmd_done: sid=0xc49e1787 tid=0x6e78f05f cmd=11 mid=4 kworker/1:1-71 [001] .... 2666.032265: smb3_cmd_done: sid=0xc49e1787 tid=0x7912332f cmd=3 mid=5 kworker/1:1-71 [001] .... 2666.032266: smb3_tcon: xid=0 sid=0xc49e1787 tid=0x0 unc_name=\\localhost\IPC$ rc=0 kworker/1:1-71 [001] .... 2666.032386: smb3_cmd_done: sid=0xc49e1787 tid=0x7912332f cmd=11 mid=6 Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
| * | | | | | cifs: fix spelling mistake, EACCESS -> EACCESColin Ian King2018-11-022-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Trivial fix to a spelling mistake of the error access name EACCESS, rename to EACCES Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Steve French <stfrench@microsoft.com>