path: root/fs
Commit message [Author, Age, Files, Lines]
* fs, proc: introduce CONFIG_PROC_CHILDREN [Iago López Galeiras, 2015-06-26, 3 files, -3/+7]

    Commit 818411616baf ("fs, proc: introduce /proc/<pid>/task/<tid>/children
    entry") introduced the children entry for checkpoint restore, and the
    file is only available on kernels configured with CONFIG_EXPERT and
    CONFIG_CHECKPOINT_RESTORE.

    This is available in most distributions (Fedora, Debian, Ubuntu, CoreOS)
    because they usually enable CONFIG_EXPERT and CONFIG_CHECKPOINT_RESTORE.
    But Arch does not enable CONFIG_EXPERT or CONFIG_CHECKPOINT_RESTORE.

    However, the children proc file is useful outside of checkpoint restore.
    I would like to use it in rkt. The rkt process exec()s another program
    it does not control, and that other program will fork()+exec() a child
    process. I would like to find the pid of the child process from an
    external tool without iterating in /proc over all processes to find
    which one has a parent pid equal to rkt's.

    This commit introduces CONFIG_PROC_CHILDREN and makes
    CONFIG_CHECKPOINT_RESTORE select it. This allows enabling
    /proc/<pid>/task/<tid>/children without needing to enable
    CONFIG_CHECKPOINT_RESTORE and CONFIG_EXPERT.

    Alban tested that /proc/<pid>/task/<tid>/children is present when the
    kernel is configured with CONFIG_PROC_CHILDREN=y but without
    CONFIG_CHECKPOINT_RESTORE.

    Signed-off-by: Iago López Galeiras <iago@endocode.com>
    Tested-by: Alban Crequy <alban@endocode.com>
    Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
    Cc: Oleg Nesterov <oleg@redhat.com>
    Cc: Kees Cook <keescook@chromium.org>
    Cc: Pavel Emelyanov <xemul@parallels.com>
    Cc: Serge Hallyn <serge.hallyn@canonical.com>
    Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
    Cc: Alexander Viro <viro@zeniv.linux.org.uk>
    Cc: Andy Lutomirski <luto@amacapital.net>
    Cc: Djalal Harouni <djalal@endocode.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
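The use case in the commit message above (finding a child pid without walking all of /proc) could look roughly like the following from userspace. This is a minimal sketch, not part of the commit: the helper names `parse_children` and `read_children` are hypothetical, and it assumes the file holds a whitespace-separated list of pids and that the kernel was built with CONFIG_PROC_CHILDREN=y.

```python
from pathlib import Path
from typing import List, Optional


def parse_children(text: str) -> List[int]:
    """Parse the whitespace-separated pid list read from a 'children' file."""
    return [int(field) for field in text.split()]


def read_children(pid: int, tid: Optional[int] = None) -> List[int]:
    """Return the child pids listed for one thread of a process.

    Hypothetical helper: assumes CONFIG_PROC_CHILDREN=y. On kernels without
    it the file is absent and FileNotFoundError propagates to the caller.
    """
    tid = pid if tid is None else tid  # default to the main thread
    path = Path(f"/proc/{pid}/task/{tid}/children")
    return parse_children(path.read_text())
```

Note that children are listed per thread, so a tool that needs every child of a multi-threaded process would have to repeat the lookup for each entry under /proc/<pid>/task/.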
* proc: fix PAGE_SIZE limit of /proc/$PID/cmdline [Alexey Dobriyan, 2015-06-26, 1 file, -9/+196]

    /proc/$PID/cmdline truncates output at PAGE_SIZE. It is easy to see with

        $ cat /proc/self/cmdline $(seq 1037) 2>/dev/null

    However, command line size was never limited to PAGE_SIZE but to 128 KB,
    and relatively recently the limitation was removed altogether.

    People have noticed and asked questions:
    http://stackoverflow.com/questions/199130/how-do-i-increase-the-proc-pid-cmdline-4096-byte-limit

    The seq file interface is not OK, because it kmallocs for the whole
    output, and open + read(, 1) + sleep will pin arbitrary amounts of
    kernel memory. To not do that, a limit must be imposed, which is
    incompatible with arbitrarily sized command lines.

    I apologize for the hairy code, but this is a direct consequence of the
    command line layout in memory and hacks to support things like
    "init [3]".

    The loops are "unrolled"; otherwise it is either macros which hide
    control flow or functions with 7-8 arguments with equal line count.

    There should be real setproctitle(2) or something.

    [akpm@linux-foundation.org: fix a billion min() warnings]
    Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
    Tested-by: Jarod Wilson <jarod@redhat.com>
    Acked-by: Jarod Wilson <jarod@redhat.com>
    Cc: Cyrill Gorcunov <gorcunov@openvz.org>
    Cc: Jan Stancek <jstancek@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
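To observe the truncation described above from a script, one can read the file as bytes and split it into arguments. The following is a sketch under stated assumptions: the helper names are hypothetical, and it relies only on the arguments being NUL-terminated in /proc/<pid>/cmdline.

```python
from typing import List


def split_cmdline(raw: bytes) -> List[str]:
    """Split the NUL-separated contents of /proc/<pid>/cmdline into arguments."""
    if not raw:
        return []  # e.g. kernel threads expose an empty cmdline
    # Each argument is NUL-terminated, so drop one trailing NUL before splitting.
    if raw.endswith(b"\0"):
        raw = raw[:-1]
    return [arg.decode(errors="replace") for arg in raw.split(b"\0")]


def read_cmdline(pid: str = "self") -> List[str]:
    """Read and split the command line of a process (hypothetical helper)."""
    with open(f"/proc/{pid}/cmdline", "rb") as handle:
        return split_cmdline(handle.read())
```

On a kernel with the old behaviour, the raw byte length of a very long command line would stop at PAGE_SIZE; comparing `len(raw)` against the expected total is one way to detect that.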
* Merge branch 'akpm' (patches from Andrew) [Linus Torvalds, 2015-06-25, 21 files, -177/+180]
|\
    Merge first patchbomb from Andrew Morton:

     - a few misc things
     - ocfs2 updates
     - kernel/watchdog.c feature work (took ages to get right)
     - most of MM. A few tricky bits are held up and probably won't make
       4.2.

    * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (91 commits)
      mm: kmemleak_alloc_percpu() should follow the gfp from per_alloc()
      mm, thp: respect MPOL_PREFERRED policy with non-local node
      tmpfs: truncate prealloc blocks past i_size
      mm/memory hotplug: print the last vmemmap region at the end of hot add memory
      mm/mmap.c: optimization of do_mmap_pgoff function
      mm: kmemleak: optimise kmemleak_lock acquiring during kmemleak_scan
      mm: kmemleak: avoid deadlock on the kmemleak object insertion error path
      mm: kmemleak: do not acquire scan_mutex in kmemleak_do_cleanup()
      mm: kmemleak: fix delete_object_*() race when called on the same memory block
      mm: kmemleak: allow safe memory scanning during kmemleak disabling
      memcg: convert mem_cgroup->under_oom from atomic_t to int
      memcg: remove unused mem_cgroup->oom_wakeups
      frontswap: allow multiple backends
      x86, mirror: x86 enabling - find mirrored memory ranges
      mm/memblock: allocate boot time data structures from mirrored memory
      mm/memblock: add extra "flags" to memblock to allow selection of memory based on attribute
      mm: do not ignore mapping_gfp_mask in page cache allocation paths
      mm/cma.c: fix typos in comments
      mm/oom_kill.c: print points as unsigned int
      mm/hugetlb: handle races in alloc_huge_page and hugetlb_reserve_pages
      ...
| * mm: do not ignore mapping_gfp_mask in page cache allocation paths [Michal Hocko, 2015-06-25, 2 files, -2/+3]

      page_cache_read, do_generic_file_read, __generic_file_splice_read and
      __ntfs_grab_cache_pages currently ignore mapping_gfp_mask when calling
      add_to_page_cache_lru, which might cause recursion into the fs down in
      the direct reclaim path if the mapping really relies on GFP_NOFS
      semantics.

      This doesn't seem to be the case now because page_cache_read (page
      fault path) doesn't seem to suffer from the reclaim recursion issues,
      and do_generic_file_read and __generic_file_splice_read also shouldn't
      be called under fs locks which would deadlock in the reclaim path.
      Anyway, it is better to obey the mapping gfp mask and prevent later
      breakage.

      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Michal Hocko <mhocko@suse.cz>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Anton Altaparmakov <anton@tuxera.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * mm/hugetlb: reduce arch dependent code about hugetlb_prefault_arch_hook [Zhang Zhen, 2015-06-25, 1 file, -1/+0]

      Currently we have many duplicates in definitions of
      hugetlb_prefault_arch_hook. In all architectures this function is
      empty.

      Signed-off-by: Zhang Zhen <zhenzhang.zhang@huawei.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * procfs: treat parked tasks as sleeping for task state [Chris Metcalf, 2015-06-25, 1 file, -0/+8]

      Allowing watchdog threads to be parked means that we now have the
      opportunity of actually seeing persistent parked threads in the output
      of /proc/<pid>/stat and /proc/<pid>/status. The existing code reported
      such threads as "Running", which is kind-of true if you think of the
      case where we park them as part of taking cpus offline. But if we
      allow parking them indefinitely, "Running" is pretty misleading, so we
      report them as "Sleeping" instead.

      We could simply report them with a new string, "Parked", but it feels
      like it's a bit risky for userspace to see unexpected new values; the
      output is already documented in Documentation/filesystems/proc.txt,
      and it seems like a mistake to change that lightly.

      The scheduler does report parked tasks with a "P" in debugging output
      from sched_show_task() or dump_cpu_task(), but that's a different API.
      Similarly, the trace_ctxwake_* routines report a "P" for parked tasks,
      but again, different API.

      This change seemed slightly cleaner than updating the task_state_array
      to have additional rows. TASK_DEAD should be subsumed by the
      exit_state bits; TASK_WAKEKILL is just a modifier; and TASK_WAKING can
      very reasonably be reported as "Running" (as it is now). Only
      TASK_PARKED shows up with unreasonable output here.

      Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
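The userspace-compatibility concern above is about tools that read the state letter out of /proc/<pid>/stat. A hedged sketch of such a reader follows; `parse_stat_state` is a hypothetical name, and the only format assumption is that the state letter is the first field after the parenthesised command name.

```python
def parse_stat_state(stat_line: str) -> str:
    """Return the one-letter task state from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so locate the last ')' instead of splitting naively.
    close = stat_line.rindex(")")
    return stat_line[close + 1:].split()[0]
```

With the change described in this commit, a parked watchdog thread would be expected to show up here as "S" rather than "R"; a parser like this one sees no new letter.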
| * ocfs2: mark local functions as static [Joseph Qi, 2015-06-25, 2 files, -6/+6]

      Some functions are only used locally, so mark them as static.

      Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * ocfs2: use swap() in ocfs2_double_lock() [Fabian Frederick, 2015-06-25, 1 file, -9/+2]

      Use kernel.h macro definition.

      Thanks to Julia Lawall for Coccinelle scripting support.

      Signed-off-by: Fabian Frederick <fabf@skynet.be>
      Cc: Julia Lawall <julia.lawall@lip6.fr>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * ocfs2: use swap() in swap_refcount_rec() [Fabian Frederick, 2015-06-25, 1 file, -4/+2]

      Use kernel.h macro definition.

      Thanks to Julia Lawall for Coccinelle scripting support.

      Signed-off-by: Fabian Frederick <fabf@skynet.be>
      Cc: Julia Lawall <julia.lawall@lip6.fr>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * ocfs2: use swap() in dx_leaf_sort_swap() [Fabian Frederick, 2015-06-25, 1 file, -4/+1]

      Use kernel.h macro definition.

      Thanks to Julia Lawall for Coccinelle scripting support.

      Signed-off-by: Fabian Frederick <fabf@skynet.be>
      Cc: Julia Lawall <julia.lawall@lip6.fr>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * ocfs2: fix wrong check in ocfs2_direct_IO_get_blocks [Joseph Qi, 2015-06-25, 2 files, -3/+15]

      contig_blocks, as returned by ocfs2_extent_map_get_blocks, is counted
      in blocks and so cannot be compared directly with clusters_to_alloc.
      Convert it to clusters first.

      Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
      Reviewed-by: Weiwei Wang <wangww631@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
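The unit mismatch fixed above can be illustrated with a small conversion. This is only an illustrative sketch: the function name and its parameters are hypothetical, it assumes block and cluster sizes are powers of two expressed as bit counts, and rounding down is an assumption rather than a statement about the ocfs2 helper.

```python
def blocks_to_clusters(blocks: int, block_bits: int, cluster_bits: int) -> int:
    """Convert a block count to whole clusters, rounding down (assumed)."""
    if cluster_bits < block_bits:
        raise ValueError("a cluster cannot be smaller than a block")
    # One cluster holds 2 ** (cluster_bits - block_bits) blocks.
    return blocks >> (cluster_bits - block_bits)
```

For instance, with 4 KiB blocks (12 bits) and 1 MiB clusters (20 bits), 512 contiguous blocks correspond to 2 clusters, so comparing the raw value 512 against a cluster count would overstate the contiguous range by a factor of 256.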
| * ocfs2: fix NULL pointer dereference in function ocfs2_abort_trigger() [Xue jiufei, 2015-06-25, 1 file, -3/+1]

      ocfs2_abort_trigger() uses bh->b_assoc_map to get sb. But nothing in
      ocfs2 sets bh->b_assoc_map, so calling this function triggers a NULL
      pointer dereference. We can get sb from bh->b_bdev->bd_super instead
      of b_assoc_map.

      [akpm@linux-foundation.org: update comment, per Joseph]
      Signed-off-by: joyce.xue <xuejiufei@huawei.com>
      Cc: Joseph Qi <joseph.qi@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * ocfs2: o2net: should remove debugfs in o2net_init() out branch [alex chen, 2015-06-25, 1 file, -1/+1]

      Signed-off-by: Alex Chen <alex.chen@huawei.com>
      Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * ocfs2: remove OCFS2_IOCB_SEM lock type in direct io [WeiWei Wang, 2015-06-25, 3 files, -37/+4]

      In ocfs2 direct read/write, the OCFS2_IOCB_SEM lock type was used to
      protect the inode->i_alloc_sem rw semaphore in earlier kernel
      versions. However, in the latest kernel, inode->i_alloc_sem is not
      used at all, so the OCFS2_IOCB_SEM lock type needs to be removed.

      Signed-off-by: Weiwei Wang <wangww631@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * ocfs2: do not BUG if jbd2_journal_dirty_metadata fails [Joseph Qi, 2015-06-25, 1 file, -1/+14]

      jbd2_journal_dirty_metadata may fail. Currently ocfs2_journal_dirty
      cannot handle a non-zero return value and just calls BUG. This patch
      aborts the handle and journal instead of calling BUG.

      Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
      Cc: joyce.xue <xuejiufei@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * ocfs2: remove BUG_ON(!empty_extent) in __ocfs2_rotate_tree_left() [Xue jiufei, 2015-06-25, 1 file, -1/+2]

      ocfs2_rotate_tree_left() calls __ocfs2_rotate_tree_left() for left
      rotation when a non-rightmost path contains an empty extent in the
      leaf block. __ocfs2_rotate_tree_left() returns -EAGAIN if the right
      subtree has an empty extent and passes the empty_extent_path to the
      caller. The caller, ocfs2_rotate_tree_left(), will restart rotation
      from the returned path.

      It will trigger the BUG_ON(!ocfs2_is_empty_extent) when the et on
      disk is as follows:

       - eb0 is the leaf block of the path (say path_a) passed to
         ocfs2_rotate_tree_left, which has an empty rec[0].
       - eb1 is the leaf block of the path (say path_b) just to the right
         of path_a, which has no empty record.
       - eb2 is the leaf block of the path (say path_c) just to the right
         of path_b, which has an empty rec[0]. path_c is also the
         rightmost path.

      Now we want to remove the empty rec[0] in eb0:

      ocfs2_rotate_tree_left:
      -> call __ocfs2_rotate_tree_left with path_a as its input *path*
      -> call ocfs2_rotate_subtree_left with path_a as its input
         *left_path* and path_b as its input *right_path*. It will move
         rec[0] in eb1 to eb0, and rec[0] in eb0 is not empty now.
      -> continue to call ocfs2_rotate_subtree_left with path_b as its
         input *left_path* and path_c as its input *right_path*, and
         return -EAGAIN because eb2 has an empty rec[0]
      -> call __ocfs2_rotate_tree_left with path_c as its input, rotate
         all records in eb2 to the left and return 0.
      -> call __ocfs2_rotate_tree_left with path_a as its input, which
         triggers the BUG_ON(!ocfs2_is_empty_extent) as rec[0] in eb0 is
         not empty.

      So the BUG_ON() should be removed, and 0 returned if rec[0] is no
      longer an empty extent.

      Signed-off-by: joyce.xue <xuejiufei@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * ocfs2: return error when ocfs2_figure_merge_contig_type() fails [Xue jiufei, 2015-06-25, 1 file, -10/+24]

      ocfs2_figure_merge_contig_type() still returns CONTIG_NONE when an
      error occurs, which causes unpredictable errors later. So return a
      proper errno when ocfs2_figure_merge_contig_type() fails.

      Signed-off-by: joyce.xue <xuejiufei@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * ocfs2/dlm: cleanup unused function __dlm_wait_on_lockres_flags_set [Joseph Qi, 2015-06-25, 1 file, -1/+0]

      __dlm_wait_on_lockres_flags_set() is declared but neither implemented
      nor used. So remove it.

      Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * ocfs2: use retval instead of status for checking error [Daeseok Youn, 2015-06-25, 1 file, -10/+10]

      The use of 'status' in __ocfs2_add_entry() can return the wrong value.
      The return values of some functions called in __ocfs2_add_entry(),
      e.g. ocfs2_journal_access_di(), are saved to 'status'. But 'status' is
      not used at the 'bail' label for returning the result of
      __ocfs2_add_entry(). So use retval instead of status.

      Signed-off-by: Daeseok Youn <daeseok.youn@gmail.com>
      Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * ocfs2: fix a tiny race when truncate dio orohaned entry [Joseph Qi, 2015-06-25, 4 files, -46/+39]

      If dio crashes, it leaves an entry in the orphan dir, and the orphan
      scan takes care of the cleanup. There is a tiny race in which the same
      entry is truncated twice, which then triggers the BUG in
      ocfs2_del_inode_from_orphan.

      Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * ocfs2: remove __mlog_cpu_guess [Andrew Morton, 2015-06-25, 1 file, -16/+3]

      raw_smp_processor_id() is the means of avoiding the runtime
      preemptibility check.

      [akpm@linux-foundation.org: fix printk warning]
      Cc: Joe Perches <joe@perches.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * ocfs2: reduce object size of mlog uses [Joe Perches, 2015-06-25, 2 files, -30/+59]

      Using a function for __mlog_printk instead of a macro reduces the
      object size of built-in.o by about 190KB, or ~18% overall (x86-64
      defconfig with all ocfs2 options)

      $ size fs/ocfs2/built-in.o*
         text    data     bss     dec     hex filename
       870954  118471  134408 1123833  1125f9 fs/ocfs2/built-in.o,new
      1064081  118071  134408 1316560  1416d0 fs/ocfs2/built-in.o.old

      Miscellanea:

       - Move the used-once __mlog_cpu_guess statement expression macro to
         the masklog.c file above the use in __mlog_printk function
       - Simplify the mlog macro moving the and/or logic and level code into
         __mlog_printk

      [akpm@linux-foundation.org: export __mlog_printk() to other ocfs2 modules]
      Signed-off-by: Joe Perches <joe@perches.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
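The "about 190KB, or ~18%" figure can be checked against the quoted `size` output. The arithmetic below uses only the numbers shown above; reading the 18% as applying to the text segment (rather than the total) is an interpretation, since the total shrinks by a smaller percentage.

```python
# Section sizes in bytes, copied from the `size` output quoted above.
new = {"text": 870954, "data": 118471, "bss": 134408}
old = {"text": 1064081, "data": 118071, "bss": 134408}

# Savings in the text segment alone.
text_saved = old["text"] - new["text"]
text_saved_kib = text_saved / 1024
text_saved_pct = 100 * text_saved / old["text"]

# Savings over the whole object (the "dec" column is text + data + bss).
total_saved = sum(old.values()) - sum(new.values())
total_saved_pct = 100 * total_saved / sum(old.values())

print(f"text:  {text_saved} bytes ({text_saved_kib:.1f} KiB, {text_saved_pct:.1f}%)")
print(f"total: {total_saved} bytes ({total_saved_pct:.1f}%)")
```

The text segment shrinks by 193127 bytes, a little over 18% of the old text size, while the data segment grows by 400 bytes, so the total drops by 192727 bytes.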
| * configfs: unexport/make static config_item_init() [Fabian Frederick, 2015-06-25, 1 file, -2/+1]

      config_item_init() is only used in item.c

      Signed-off-by: Fabian Frederick <fabf@skynet.be>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * NTFS: use kvfree() in ntfs_free() [Pekka Enberg, 2015-06-25, 1 file, -6/+1]

      Use kvfree() instead of open-coding it.

      Signed-off-by: Pekka Enberg <penberg@kernel.org>
      Cc: Anton Altaparmakov <anton@tuxera.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge tag 'please-pull-pstore' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux [Linus Torvalds, 2015-06-25, 2 files, -19/+39]
|\ \
    Pull pstore updates from Tony Luck:
     "Miscellaneous pstore improvements"

    * tag 'please-pull-pstore' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
      ramoops: make it possible to change mem_type param.
      pstore/ram: verify ramoops header before saving record
      fs/pstore: Optimization function ramoops_init_przs
      fs/pstore: update the backend parameter in pstore module
      pstore: do not use message compression without lock
| * | ramoops: make it possible to change mem_type param. [Wang Long, 2015-05-21, 1 file, -1/+1]

      If we set ramoops.mem_type=1 on the command line, the current code
      cannot change mem_type to 1, because it is assigned 0 in the function
      ramoops_register_dummy.

      This patch makes it possible to change the mem_type parameter on the
      command line.

      Signed-off-by: Wang Long <long.wanglong@huawei.com>
      Acked-by: Tony Lindgren <tony@atomide.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
| * | pstore/ram: verify ramoops header before saving record [Ben Zhang, 2015-05-21, 1 file, -12/+28]

      On some devices the persistent memory contains junk after a cold boot,
      and /dev/pstore/dmesg-ramoops-* are created with random data which is
      not the result of a kernel crash.

      This patch adds a ramoops header check and skips any
      persistent_ram_zone that does not have a valid header.

      Signed-off-by: Ben Zhang <benzh@chromium.org>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
| * | fs/pstore: Optimization function ramoops_init_przs [long.wanglong, 2015-05-21, 1 file, -5/+3]

      The value of cxt->record_size does not change in the loop, so this
      patch optimizes the assignment statement by dropping sz entirely and
      using cxt->record_size in its place.

      Signed-off-by: Wang Long <long.wanglong@huawei.com>
      Acked-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
| * | fs/pstore: update the backend parameter in pstore module [Wang Long, 2015-05-21, 1 file, -0/+6]

      This patch updates the module parameter backend, so it is visible
      through /sys/module/pstore/parameters/backend.

      For example, if the pstore backend is ramoops, with this patch:

          # cat /sys/module/pstore/parameters/backend
          ramoops

      and without this patch:

          # cat /sys/module/pstore/parameters/backend
          (null)

      Signed-off-by: Wang Long <long.wanglong@huawei.com>
      Acked-by: Mark Salyzyn <salyzyn@android.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
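A script that wants the active backend could read the same sysfs file shown in the commit message. The sketch below uses hypothetical helper names and treats both an empty value and the literal "(null)" seen on unpatched kernels as "no backend reported"; that normalisation is a design choice of this example, not kernel behaviour.

```python
from pathlib import Path
from typing import Optional

# Path taken from the commit message above.
BACKEND_PARAM = Path("/sys/module/pstore/parameters/backend")


def normalize_backend(text: str) -> Optional[str]:
    """Map the raw sysfs contents to a backend name, or None if unset."""
    value = text.strip()
    return None if value in ("", "(null)") else value


def read_pstore_backend(path: Path = BACKEND_PARAM) -> Optional[str]:
    """Return the pstore backend name, or None if it cannot be determined."""
    try:
        return normalize_backend(path.read_text())
    except OSError:
        return None  # pstore not loaded, or sysfs not available
```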
| * | pstore: do not use message compression without lock [Konstantin Khlebnikov, 2015-05-21, 1 file, -1/+1]

      pstore_compress() uses a static stream buffer for zlib-deflate, which
      easily crashes when several concurrent threads use one shared state.

      Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
* | | Merge tag 'for-f2fs-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs [Linus Torvalds, 2015-06-25, 29 files, -616/+3775]
|\ \ \
    Pull f2fs updates from Jaegeuk Kim:
     "New features:
       - per-file encryption (e.g., ext4)
       - FALLOC_FL_ZERO_RANGE
       - FALLOC_FL_COLLAPSE_RANGE
       - RENAME_WHITEOUT

      Major enhancement/fixes:
       - recovery broken superblocks
       - enhance f2fs_trim_fs with a discard_map
       - fix a race condition on dentry block allocation
       - fix a deadlock during summary operation
       - fix a missing fiemap result

      .. and many minor bug fixes and clean-ups were done"

    * tag 'for-f2fs-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (83 commits)
      f2fs: do not trim preallocated blocks when truncating after i_size
      f2fs crypto: add alloc_bounce_page
      f2fs crypto: fix to handle errors likewise ext4
      f2fs: drop the volatile_write flag only
      f2fs: skip committing valid superblock
      f2fs: setting discard option in parse_options()
      f2fs: fix to return exact trimmed size
      f2fs: support FALLOC_FL_INSERT_RANGE
      f2fs: hide common code in f2fs_replace_block
      f2fs: disable the discard option when device doesn't support
      f2fs crypto: remove alloc_page for bounce_page
      f2fs: fix a deadlock for summary page lock vs. sentry_lock
      f2fs crypto: clean up error handling in f2fs_fname_setup_filename
      f2fs crypto: avoid f2fs_inherit_context for symlink
      f2fs crypto: do not set encryption policy for non-directory by ioctl
      f2fs crypto: allow setting encryption policy once
      f2fs crypto: check context consistent for rename2
      f2fs: avoid duplicated code by reusing f2fs_read_end_io
      f2fs crypto: use per-inode tfm structure
      f2fs: recovering broken superblock during mount
      ...
| * | | f2fs: do not trim preallocated blocks when truncating after i_size [Chao Yu, 2015-06-12, 1 file, -4/+4]

      When we perform generic/092 in xfstests, output is like below:

           XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
           0: [0..10239]: data
           0: [0..10239]: data
          -1: [10240..20479]: unwritten
          +1: [10240..14335]: unwritten

      This is because with this testcase, we redefine the rule for truncate
      in preallocated space past i_size as below:

      "There was some confused about what the fs was supposed to do when you
      truncate at i_size with preallocated space past i_size. We decided on
      the following things.

      1) truncate(i_size) will trim all blocks past i_size.
      2) truncate(x) where x > i_size will not trim all blocks past i_size.
      "

      This method is used in xfs, and ext4/btrfs will then follow the rule.

      This patch makes f2fs follow the new rule.

      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
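The two quoted rules can be captured in a tiny model that reproduces the expected extent from the generic/092 output above. This is a sketch, not f2fs code: the function name is hypothetical, all quantities are in the same units as the quoted extent map, and the shrinking case (x < i_size) is an assumption added for completeness since the quoted rules do not cover it.

```python
def allocated_after_truncate(allocated_end: int, i_size: int, new_size: int) -> int:
    """Model where the allocated range ends after truncate(new_size).

    allocated_end: end of allocated space, including preallocation past i_size.
    """
    if new_size > i_size:
        # Rule 2: growing the file leaves preallocated blocks untouched.
        return allocated_end
    # Rule 1 (new_size == i_size) trims everything past i_size; the same
    # trimming is assumed when shrinking below i_size.
    return min(allocated_end, new_size)
```

With data ending at 10240 and preallocation up to 20480, truncating to 14336 keeps the allocation at 20480 under the new rule (the `-` line in the diff), whereas the old f2fs behaviour trimmed it to 14336 (the `+` line).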
| * | | f2fs crypto: add alloc_bounce_page [Jaegeuk Kim, 2015-06-12, 1 file, -8/+15]

      This patch adds alloc_bounce_page, as in ext4.

      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | | f2fs crypto: fix to handle errors likewise ext4 [Jaegeuk Kim, 2015-06-12, 1 file, -3/+3]

      This patch makes some error handling policies the same as ext4's.

      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | | f2fs: drop the volatile_write flag only [Jaegeuk Kim, 2015-06-09, 1 file, -4/+2]

      When aborting volatile_writes, let's drop its flag and give up any
      further volatile_writes.

      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | | f2fs: skip committing valid superblock [Chao Yu, 2015-06-08, 3 files, -5/+7]

      In the recovery procedure for the superblock, we write the data of the
      valid superblock into the invalid one. The work should be finished at
      that point, but we still go on to write the valid one with its
      original data. This operation is not needed.

      Let's skip doing this unnecessary work.

      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | | f2fs: setting discard option in parse_options() [Chao Yu, 2015-06-08, 1 file, -11/+9]

      For the first mount of an f2fs image with the realtime discard option,
      we disable the discard option if the device does not support it. But
      on a remount operation the discard option can still be set, which
      should be avoided.

      This patch moves the configuring of the discard option to
      parse_options() to fix this issue.

      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | | f2fs: fix to return exact trimmed size [Jaegeuk Kim, 2015-06-03, 1 file, -1/+1]

      Now, we add all the candidates for trim commands and then finally
      issue discard commands. So, we should count the trimmed size in
      back-end.

      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | | f2fs: support FALLOC_FL_INSERT_RANGE [Chao Yu, 2015-06-02, 1 file, -2/+100]

      The FALLOC_FL_INSERT_RANGE flag for ->fallocate was introduced in
      commit dd46c787788d ("fs: Add support FALLOC_FL_INSERT_RANGE for
      fallocate").

      The effect of the FALLOC_FL_INSERT_RANGE command is the opposite of
      FALLOC_FL_COLLAPSE_RANGE: when this command is performed, all data
      from offset to EOF in the file is shifted right by the given length,
      and the range [offset, offset + length] becomes a hole.

      This command is useful for users who want to add some data in the
      middle of a file. For example, a video/music editor can insert a
      keyframe at a specified position of a media file; with this command
      we can easily create a hole for inserting without removing the
      original data.

      This patch introduces f2fs_insert_range() to support
      FALLOC_FL_INSERT_RANGE.

      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Yuan Zhong <yuan.mark.zhong@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
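The effect described above can be modelled on an in-memory byte string. The following is a pure illustration with a hypothetical function name: it does not call fallocate(2), it ignores the block-alignment requirements a real filesystem imposes, and it represents the hole as zero bytes because that is how a hole reads back.

```python
def insert_range(data: bytes, offset: int, length: int) -> bytes:
    """Model FALLOC_FL_INSERT_RANGE: shift data at `offset` right by `length`."""
    if not 0 <= offset < len(data):
        raise ValueError("offset must lie inside the existing data")
    if length < 0:
        raise ValueError("length must be non-negative")
    # Everything from `offset` to EOF moves right; the gap reads as zeros.
    return data[:offset] + bytes(length) + data[offset:]
```

For example, `insert_range(b"ABCDEF", 2, 3)` yields `b"AB\x00\x00\x00CDEF"`: the original bytes are all preserved and the file grows by `length`, which is the inverse of what a collapse-range operation does.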
| * | | f2fs: hide common code in f2fs_replace_block [Chao Yu, 2015-06-02, 4 files, -20/+23]

      This patch cleans up the code by:

      1. renaming f2fs_replace_block to __f2fs_replace_block().
      2. introducing a new f2fs_replace_block() that wraps
         __f2fs_replace_block() together with some common related code
         around it.

      The newly introduced f2fs_replace_block() can then be used by a
      following patch.

      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | | f2fs: disable the discard option when device doesn't support [Chenxi Mao, 2015-06-02, 1 file, -1/+3]

      Current f2fs checks whether the block device can support discard.
      However, the code prevents the discard option from ever being enabled,
      because clear_opt(sbi, DISCARD) is always invoked. This patch fixes
      the issue.

      Jaegeuk Kim:
      The original patch was intended to disable the discard option when the
      device does not support the trim command. Rather than keeping the
      buggy patch, let's replace it with this patch as an integrated one.

      Signed-off-by: Chenxi Mao <chenxi.mao2013@gmail.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | | f2fs crypto: remove alloc_page for bounce_page [Jaegeuk Kim, 2015-06-02, 2 files, -23/+13]

      We don't need to call alloc_page() prior to mempool_alloc(), since the
      mempool_alloc() calls alloc_page() internally. And, if __GFP_WAIT is
      set, it never fails on page allocation, so let's give GFP_NOWAIT and
      handle ENOMEM by writepage().

      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | | f2fs: fix a deadlock for summary page lock vs. sentry_lock  Jaegeuk Kim  2015-06-02  1  -1/+10
| | | | In f2fs_gc:                      In f2fs_replace_block:
| | | |  - lock_page(sum_page)
| | | |  - check_valid_map()             - mutex_lock(sentry_lock)
| | | |    - mutex_lock(sentry_lock)     - change_curseg()
| | | |                                    - lock_page(sum_page)
| | | |
| | | | This patch fixes the deadlock condition.
| | | |
| | | | Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | | f2fs crypto: clean up error handling in f2fs_fname_setup_filename  Jaegeuk Kim  2015-06-02  1  -14/+10
| | | | Sync with:
| | | |   ext4 crypto: clean up error handling in ext4_fname_setup_filename
| | | |
| | | | Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | | f2fs crypto: avoid f2fs_inherit_context for symlink  Jaegeuk Kim  2015-06-02  1  -4/+0
| | | | This patch avoids calling f2fs_inherit_context() twice for a newly created symlink. The first call is made by f2fs_add_link(), which invokes f2fs_setxattr(). If it is called a second time, f2fs_setxattr() is triggered again with the same encryption index.
| | | |
| | | | Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | | f2fs crypto: do not set encryption policy for non-directory by ioctl  Chao Yu  2015-06-02  2  -6/+3
| | | | An encryption policy should only be set on an empty directory through ioctl. This patch adds a check on the type of the target inode to avoid incorrectly configuring a non-directory.
| | | |
| | | | Additionally, remove the unneeded inline data conversion, since regular files and symlinks should not be processed here.
| | | |
| | | | Signed-off-by: Chao Yu <chao2.yu@samsung.com>
| | | | Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | | f2fs crypto: allow setting encryption policy once  Chao Yu  2015-06-02  1  -2/+2
| | | | This patch adds the XATTR_CREATE flag to setxattr when setting the encryption context for an inode. Without this flag the context could be set more than once, which should never happen. So, fix it.
| | | |
| | | | Signed-off-by: Chao Yu <chao2.yu@samsung.com>
| | | | Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | | f2fs crypto: check context consistent for rename2  Chao Yu  2015-06-02  1  -0/+8
| | | | For an exchange rename, we should check that the encryption context is consistent between new_dir and old_inode, or between old_dir and new_inode. Otherwise inheritance of the parent's encryption context will be broken.
| | | |
| | | | Signed-off-by: Chao Yu <chao2.yu@samsung.com>
| | | | [Jaegeuk Kim: sync with ext4 approach]
| | | | Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | | f2fs: avoid duplicated code by reusing f2fs_read_end_io  Chao Yu  2015-06-02  1  -28/+4
| | | | This patch cleans up the code: parts of f2fs_read_end_io and mpage_end_io are the same, so it is better to merge and reuse them.
| | | |
| | | | Signed-off-by: Chao Yu <chao2.yu@samsung.com>
| | | | Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | | f2fs crypto: use per-inode tfm structure  Jaegeuk Kim  2015-06-02  9  -167/+96
| | | | This patch applies the following ext4 patch:
| | | |
| | | |   ext4 crypto: use per-inode tfm structure
| | | |
| | | |   As suggested by Herbert Xu, we shouldn't allocate a new tfm each time we read or write a page. Instead we can use a single tfm hanging off the inode's crypt_info structure for all of our encryption needs for that inode, since the tfm can be used by multiple crypto requests in parallel.
| | | |
| | | |   Also use cmpxchg() to avoid races that could result in crypt_info structure getting doubly allocated or doubly freed.
| | | |
| | | | Signed-off-by: Theodore Ts'o <tytso@mit.edu>
| | | | Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
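The cmpxchg() idea mentioned in this commit message, installing a per-inode structure exactly once even when several callers race, can be sketched in user space with C11 atomics. This is an illustration under assumptions, not the kernel code: `struct crypt_info_model`, `install_once` and `get_or_create` are hypothetical names, and `atomic_compare_exchange_strong` stands in for the kernel's cmpxchg().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-inode crypt_info structure. */
struct crypt_info_model {
    int key_id;
};

/*
 * Try to publish `fresh` into an empty slot. Exactly one caller wins the
 * compare-and-swap; a loser frees its own copy and returns the winner's
 * pointer, so the structure is never installed twice and never freed twice.
 */
static struct crypt_info_model *
install_once(_Atomic(struct crypt_info_model *) *slot,
             struct crypt_info_model *fresh)
{
    struct crypt_info_model *expected = NULL;

    assert(fresh != NULL);
    if (atomic_compare_exchange_strong(slot, &expected, fresh))
        return fresh;        /* we installed it */

    free(fresh);             /* somebody else won; drop our copy */
    return expected;         /* the CAS stored the current value here */
}

/* Return the installed structure, allocating one only if the slot is empty. */
static struct crypt_info_model *
get_or_create(_Atomic(struct crypt_info_model *) *slot, int key_id)
{
    struct crypt_info_model *cur = atomic_load(slot);
    if (cur != NULL)
        return cur;          /* fast path: reuse what is already there */

    struct crypt_info_model *fresh = malloc(sizeof(*fresh));
    if (fresh == NULL)
        return NULL;
    fresh->key_id = key_id;
    return install_once(slot, fresh);
}
```

A second `get_or_create` call on the same slot returns the structure installed by the first call, and a late `install_once` with a separately allocated structure frees that structure and returns the existing one.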