summaryrefslogtreecommitdiffstats
path: root/fs/ocfs2/super.c
Commit message (Collapse)AuthorAgeFilesLines
...
* | push BKL down into ->put_superChristoph Hellwig2009-06-121-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move BKL into ->put_super from the only caller. A couple of filesystems had trivial enough ->put_super (only kfree and NULLing of s_fs_info + stuff in there) to not get any locking: coda, cramfs, efs, hugetlbfs, omfs, qnx4, shmem, all others got the full treatment. Most of them probably don't need it, but I'd rather sort that out individually. Preferably after all the other BKL pushdowns in that area. [AV: original used to move lock_super() down as well; these changes are removed since we don't do lock_super() at all in generic_shutdown_super() now] [AV: fuse, btrfs and xfs are known to need no damn BKL, exempt] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | ocfs2: remove ->write_super and stop maintaining ->s_dirtChristoph Hellwig2009-06-121-14/+0Star
| | | | | | | | | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | block: Do away with the notion of hardsect_sizeMartin K. Petersen2009-05-221-1/+1
|/ | | | | | | | | | | | | | Until now we have had a 1:1 mapping between storage device physical block size and the logical block sized used when addressing the device. With SATA 4KB drives coming out that will no longer be the case. The sector size will be 4KB but the logical block size will remain 512-bytes. Hence we need to distinguish between the physical block size and the logical ditto. This patch renames hardsect_size to logical_block_size. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* ocfs2: recover orphans in offline slots during recovery and mountSrinivas Eeda2009-04-031-0/+6
| | | | | | | | | | | | | | | | | During recovery, a node recovers orphans in it's slot and the dead node(s). But if the dead nodes were holding orphans in offline slots, they will be left unrecovered. If the dead node is the last one to die and is holding orphans in other slots and is the first one to mount, then it only recovers it's own slot, which leaves orphans in offline slots. This patch queues complete_recovery to clean orphans for all offline slots during mount and node recovery. Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com> Acked-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
* ocfs2: Add a name indexed b-tree to directory inodesMark Fasheh2009-04-031-0/+6
| | | | | | | | | | | | | | | | This patch makes use of Ocfs2's flexible btree code to add an additional tree to directory inodes. The new tree stores an array of small, fixed-length records in each leaf block. Each record stores a hash value, and pointer to a block in the traditional (unindexed) directory tree where a dirent with the given name hash resides. Lookup exclusively uses this tree to find dirents, thus providing us with constant time name lookups. Some of the hashing code was copied from ext3. Unfortunately, it has lots of unfixed checkpatch errors. I left that as-is so that tracking changes would be easier. Signed-off-by: Mark Fasheh <mfasheh@suse.com> Acked-by: Joel Becker <joel.becker@oracle.com>
* ocfs2: Expose the file system state via debugfsSunil Mushran2009-04-031-0/+176
| | | | | | | | | | This patch creates a per mount debugfs file, fs_state, which exposes information like, cluster stack in use, states of the downconvert, recovery and commit threads, number of journal txns, some allocation stats, list of all slots, etc. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
* ocfs2: add IO error check in ocfs2_get_sector()wengang wang2009-02-261-0/+7
| | | | | | | Check for IO error in ocfs2_get_sector(). Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
* ocfs2: lock the metaecc process for xattr bucketTao Ma2009-02-261-0/+1
| | | | | | | | | | | For other metadata in ocfs2, metaecc is checked in ocfs2_read_blocks with io_mutex held. While for xattr bucket, it is calculated by the whole buckets. So we have to add a spin_lock to prevent multiple processes calculating metaecc. Signed-off-by: Tao Ma <tao.ma@oracle.com> Tested-by: Tristan Ye <tristan.ye@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
* ocfs2: Push out dropping of dentry lock to ocfs2_wqJan Kara2009-02-021-0/+3
| | | | | | | | | | Dropping of last reference to dentry lock is a complicated operation involving dropping of reference to inode. This can get complicated and quota code in particular needs to obtain some quota locks which leads to potential deadlock. Thus we defer dropping of inode reference to ocfs2_wq. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
* ocfs2: Validate superblock with checksum and ecc.Joel Becker2009-01-051-0/+11
| | | | | | | | The superblock is read via a raw call. Validate it after we find it from its signature. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
* ocfs2: Enable quota accounting on mount, disable on umountJan Kara2009-01-051-0/+222
| | | | | | | | | Enable quota usage tracking on mount and disable it on umount. Also add support for quota on and quota off quotactls and usrquota and grpquota mount options. Add quota features among supported ones. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
* ocfs2: Periodic quota syncingMark Fasheh2009-01-051-0/+7
| | | | | | | | | This patch creates a work queue for periodic syncing of locally cached quota information to the global quota files. We constantly queue a delayed work item, to get the periodic behavior. Signed-off-by: Mark Fasheh <mfasheh@suse.com> Acked-by: Jan Kara <jack@suse.cz>
* ocfs2: Implementation of local and global quota file handlingJan Kara2009-01-051-2/+36
| | | | | | | | | | | | | | | | | For each quota type each node has local quota file. In this file it stores changes users have made to disk usage via this node. Once in a while this information is synced to global file (and thus with other nodes) so that limits enforcement at least aproximately works. Global quota files contain all the information about usage and limits. It's mostly handled by the generic VFS code (which implements a trie of structures inside a quota file). We only have to provide functions to convert structures from on-disk format to in-memory one. We also have to provide wrappers for various quota functions starting transactions and acquiring necessary cluster locks before the actual IO is really started. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
* ocfs2: Assign feature bits and system inodes to quota feature and quota filesJan Kara2009-01-051-0/+17
| | | | | Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
* ocfs2: add mount option and Kconfig option for aclTiger Yang2009-01-051-0/+33
| | | | | | | | This patch adds the Kconfig option "CONFIG_OCFS2_FS_POSIX_ACL" and mount options "acl" to enable acls in Ocfs2. Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
* ocfs2: Don't check for NULL before brelse()Mark Fasheh2008-10-141-2/+1Star
| | | | | | This is pointless as brelse() already does the check. Signed-off-by: Mark Fasheh
* ocfs2: Add xattr mount option in ocfs2_show_options()Sunil Mushran2008-10-141-0/+5
| | | | | | | | Patch adds check for [no]user_xattr in ocfs2_show_options() that completes the list of all mount options. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
* ocfs2: Switch over to JBD2.Joel Becker2008-10-141-4/+6
| | | | | | | | | | | | | | | | | | | | | | ocfs2 wants JBD2 for many reasons, not the least of which is that JBD is limiting our maximum filesystem size. It's a pretty trivial change. Most functions are just renamed. The only functional change is moving to Jan's inode-based ordered data mode. It's better, too. Because JBD2 reads and writes JBD journals, this is compatible with any existing filesystem. It can even interact with JBD-based ocfs2 as long as the journal is formated for JBD. We provide a compatibility option so that paranoid people can still use JBD for the time being. This will go away shortly. [ Moved call of ocfs2_begin_ordered_truncate() from ocfs2_delete_inode() to ocfs2_truncate_for_delete(). --Mark ] Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
* ocfs2: Add the 'inode64' mount option.Joel Becker2008-10-141-0/+17
| | | | | | | | | | | Now that ocfs2 limits inode numbers to 32bits, add a mount option to disable the limit. This parallels XFS. 64bit systems can handle the larger inode numbers. [ Added description of inode64 mount option in ocfs2.txt. --Mark ] Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
* ocfs2: Add incompatible flag for extended attributeTiger Yang2008-10-141-1/+2
| | | | | | | | | This patch adds the s_incompat flag for extended attribute support. This helps us ensure that older versions of Ocfs2 or ocfs2-tools will not be able to mount a volume with xattr support. Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
* ocfs2: Add extended attribute supportTiger Yang2008-10-141-0/+14
| | | | | | | | | | This patch implements storing extended attributes both in inode or a single external block. We only store EA's in-inode when blocksize > 512 or that inode block has free space for it. When an EA's value is larger than 80 bytes, we will store the value via b-tree outside inode or block. Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
* ocfs2: reserve inline space for extended attributeTiger Yang2008-10-141-0/+2
| | | | | | | | | | Add the structures and helper functions we want for handling inline extended attributes. We also update the inline-data handlers so that they properly function in the event that we have both inline data and inline attributes sharing an inode block. Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
* ocfs2: throttle back local alloc when low on disk spaceMark Fasheh2008-10-131-1/+3
| | | | | | | | | | | | | | | | | | | | | | | Ocfs2's local allocator disables itself for the duration of a mount point when it has trouble allocating a large enough area from the primary bitmap. That can cause performance problems, especially for disks which were only temporarily full or fragmented. This patch allows for the allocator to shrink it's window first, before being disabled. Later, it can also be re-enabled so that any performance drop is minimized. To do this, we allow the value of osb->local_alloc_bits to be shrunk when needed. The default value is recorded in a mostly read-only variable so that we can re-initialize when required. Locking had to be updated so that we could protect changes to local_alloc_bits. Mostly this involves protecting various local alloc values with the osb spinlock. A new state is also added, OCFS2_LA_THROTTLED, which is used when the local allocator is has shrunk, but is not disabled. If the available space dips below 1 megabyte, the local alloc file is disabled. In either case, local alloc is re-enabled 30 seconds after the event, or when an appropriate amount of bits is seen in the primary bitmap. Signed-off-by: Mark Fasheh <mfasheh@suse.com>
* ocfs2: Track local alloc bits internallyMark Fasheh2008-10-131-3/+5
| | | | | | | | | Do this instead of tracking absolute local alloc size. This avoids needless re-calculatiion of bits from bytes in localalloc.c. Additionally, the value is now in a more natural unit for internal file system bitmap work. Signed-off-by: Mark Fasheh <mfasheh@suse.com>
* vfs: Use const for kernel parser tableSteven Whitehouse2008-10-131-1/+1
| | | | | | | | | | | | | | This is a much better version of a previous patch to make the parser tables constant. Rather than changing the typedef, we put the "const" in all the various places where its required, allowing the __initconst exception for nfsroot which was the cause of the previous trouble. This was posted for review some time ago and I believe its been in -mm since then. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Alexander Viro <aviro@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [PATCH 2/2] ocfs2: Fix race between mount and recoverySunil Mushran2008-08-011-1/+11
| | | | | | | | | | | | | | | | | | | | | | | | As the fs recovery is asynchronous, there is a small chance that another node can mount (and thus recover) the slot before the recovery thread gets to it. If this happens, the recovery thread will block indefinitely on the journal/slot lock as that lock will be held for the duration of the mount (by design) by the node assigned to that slot. The solution implemented is to keep track of the journal replays using a recovery generation in the journal inode, which will be incremented by the thread replaying that journal. The recovery thread, before attempting the blocking lock on the journal/slot lock, will compare the generation on disk with what it has cached and skip recovery if it does not match. This bug appears to have been inadvertently introduced during the mount/umount vote removal by mainline commit 34d024f84345807bf44163fac84e921513dde323. In the mount voting scheme, the messaging would indirectly indicate that the slot was being recovered. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
* SL*B: drop kmem cache argument from constructorAlexey Dobriyan2008-07-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Kmem cache passed to constructor is only needed for constructors that are themselves multiplexeres. Nobody uses this "feature", nor does anybody uses passed kmem cache in non-trivial way, so pass only pointer to object. Non-trivial places are: arch/powerpc/mm/init_64.c arch/powerpc/mm/hugetlbpage.c This is flag day, yes. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Christoph Lameter <cl@linux-foundation.org> Cc: Jon Tollefson <kniht@linux.vnet.ibm.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Matt Mackall <mpm@selenic.com> [akpm@linux-foundation.org: fix arch/powerpc/mm/hugetlbpage.c] [akpm@linux-foundation.org: fix mm/slab.c] [akpm@linux-foundation.org: fix ubifs] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ocfs2: Handle error during journal loadWengang Wang2008-07-141-1/+5
| | | | | | | | This patch ensures the mount fails if the fs is unable to load the journal. Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Acked-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
* ocfs2: Add inode stealing for ocfs2_reserve_new_inodeTao Ma2008-04-181-0/+1
| | | | | | | | | | Inode allocation is modified to look in other nodes allocators during extreme out of space situations. We retry our own slot when space is freed back to the global bitmap, or whenever we've allocated more than 1024 inodes from another slot. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
* ocfs2: Add the USERSPACE_STACK incompat bit.Joel Becker2008-04-181-1/+89
| | | | | | | | | | | | | | | | | | The filesystem gains the USERSPACE_STACK incomat bit and the s_cluster_info field on the superblock. When a userspace stack is in use, the name of the stack is stored on-disk for mount-time verification. The "cluster_stack" option is added to mount(2) processing. The mount process needs to pass the matching stack name. If the passed name and the on-disk name do not match, the mount is failed. When using the classic o2cb stack, the incompat bit is *not* set and no mount option is used other than the usual heartbeat=local. Thus, the filesystem is compatible with older tools. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
* ocfs2: Break out stackglue into modules.Joel Becker2008-04-181-7/+9
| | | | | | | | | | | | | | | | | | We define the ocfs2_stack_plugin structure to represent a stack driver. The o2cb stack code is split into stack_o2cb.c. This becomes the ocfs2_stack_o2cb.ko module. The stackglue generic functions are similarly split into the ocfs2_stackglue.ko module. This module now provides an interface to register drivers. The ocfs2_stack_o2cb driver registers itself. As part of this interface, ocfs2_stackglue can load drivers on demand. This is accomplished in ocfs2_cluster_connect(). ocfs2_cluster_disconnect() is now notified when a _hangup() is pending. If a hangup is pending, it will not release the driver module and will let _hangup() do that. Signed-off-by: Joel Becker <joel.becker@oracle.com>
* ocfs2: Clean up stackglue initializationJoel Becker2008-04-181-4/+2Star
| | | | | | | | The stack glue initialization function needs a better name so that it can be used cleanly when stackglue becomes a module. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
* ocfs2: Fill node number during cluster stack initMark Fasheh2008-04-181-33/+0Star
| | | | | | | | | | | | | It doesn't make sense to query for a node number before connecting to the cluster stack. This should be safe to do because node_num is only just printed, and we're actually only moving the setting of node num a small amount further in the mount process. [ Disconnect when node query fails -- Joel ] Reviewed-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
* ocfs2: Move o2hb functionality into the stack glue.Joel Becker2008-04-181-13/+10Star
| | | | | | | | | | | | | | | | | The last bit of classic stack used directly in ocfs2 code is o2hb. Specifically, the check for heartbeat during mount and the call to ocfs2_hb_ctl during unmount. We create an extra API, ocfs2_cluster_hangup(), to encapsulate the call to ocfs2_hb_ctl. Other stacks will just leave hangup() empty. The check for heartbeat is moved into ocfs2_cluster_connect(). It will be matched by a similar check for other stacks. With this change, only stackglue.c includes cluster/ headers. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
* ocfs2: Abstract out node number queries.Joel Becker2008-04-181-11/+11
| | | | | | | | | | | | | | | | | ocfs2 asks the cluster stack for the local node's node number for two reasons; to fill the slot map and to print it. While the slot map isn't necessary for userspace cluster stacks, the printing is very nice for debugging. Thus we add ocfs2_cluster_this_node() as a generic API to get this value. It is anticipated that the slot map will not be used under a userspace cluster stack, so validity checks of the node num only need to exist in the slot map code. Otherwise, it just gets used and printed as an opaque value. [ Fixed up some "int" versus "unsigned int" issues and made osb->node_num truly opaque. --Mark ] Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
* ocfs2: Introduce the new ocfs2_cluster_connect/disconnect() API.Joel Becker2008-04-181-8/+5Star
| | | | | | | | | | | | | | | | | | | | | | | | This step introduces a cluster stack agnostic API for initializing and exiting. fs/ocfs2/dlmglue.c no longer uses o2cb/o2dlm knowledge to connect to the stack. It is all handled in stackglue.c. heartbeat.c no longer needs to know how it gets called. ocfs2_do_node_down() is now a clean recovery trigger. The big gotcha is the ordering of initializations and de-initializations done underneath ocfs2_cluster_connect(). ocfs2_dlm_init() used to do all o2dlm initialization in one block. Thus, the o2dlm functionality of ocfs2_cluster_connect() is very straightforward. ocfs2_dlm_shutdown(), however, did a few things between de-registration of the eviction callback and actually shutting down the domain. Now de-registration and shutdown of the domain are wrapped within the single ocfs2_cluster_disconnect() call. I've checked the code paths to make sure we can safely tear down things in ocfs2_dlm_shutdown() before calling ocfs2_cluster_disconnect(). The filesystem has already set itself to ignore the callback. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
* ocfs2: Separate out dlm lock functions.Joel Becker2008-04-181-0/+4
| | | | | | | | | | | This is the first in a series of patches to isolate ocfs2 from the underlying cluster stack. Here we wrap the dlm locking functions with ocfs2-specific calls. Because ocfs2 always uses the same dlm lock status callbacks, we can eliminate the callbacks from the filesystem visible functions. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
* ocfs2: Change the recovery map to an array of node numbers.Joel Becker2008-04-181-25/+8Star
| | | | | | | | | | | | | | | | | | | | | | | | | | The old recovery map was a bitmap of node numbers. This was sufficient for the maximum node number of 254. Going forward, we want node numbers to be UINT32. Thus, we need a new recovery map. Note that we can't keep track of slots here. We must write down the node number to recovery *before* we get the locks needed to convert a node number into a slot number. The recovery map is now an array of unsigned ints, max_slots in size. It moves to journal.c with the rest of recovery. Because it needs to be initialized, we move all of recovery initialization into a new function, ocfs2_recovery_init(). This actually cleans up ocfs2_initialize_super() a little as well. Following on, recovery cleaup becomes part of ocfs2_recovery_exit(). A number of node map functions are rendered obsolete and are removed. Finally, waiting on recovery is wrapped in a function rather than naked checks on the recovery_event. This is a cleanup from Mark. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
* ocfs2: Move slot map access into slot_map.cMark Fasheh2008-04-181-2/+1Star
| | | | | | | | | journal.c and dlmglue.c would refresh the slot map by hand. Instead, have the update and clear functions do the work inside slot_map.c. The eventual result is to make ocfs2_slot_info defined privately in slot_map.c Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
* ocfs2: Negotiate locking protocol versions.Joel Becker2008-02-071-0/+1
| | | | | | | | | | | | | | | | | | | | | | | Currently, when ocfs2 nodes connect via TCP, they advertise their compatibility level. If the versions do not match, two nodes cannot speak to each other and they disconnect. As a result, this provides no forward or backwards compatibility. This patch implements a simple protocol negotiation at the dlm level by introducing a major/minor version number scheme for entities that communicate. Specifically, o2dlm has a major/minor version for interaction with o2dlm on other nodes, and ocfs2 itself has a major/minor version for interacting with the filesystem on other nodes. This will allow rolling upgrades of ocfs2 clusters when changes to the locking or network protocols can be done in a backwards compatible manner. In those cases, only the minor number is changed and the negotatied protocol minor is returned from dlm join. In the far less likely event that a required protocol change makes backwards compatibility impossible, we simply bump the major number. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Silence false lockdep warningsJan Kara2008-01-261-2/+2
| | | | | | | | | Create separate lockdep lock classes for system file's i_mutexes. They are used to guard allocations and similar things and thus rank differently than i_mutex of a regular file or directory. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* [PATCH 2/2] ocfs2: cluster aware flock()Mark Fasheh2008-01-261-0/+19
| | | | | | | | Hook up ocfs2_flock(), using the new flock lock type in dlmglue.c. A new mount option, "localflocks" is added so that users can revert to old functionality as need be. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Local alloc window size changeable via mount optionSunil Mushran2008-01-261-0/+17
| | | | | | | | | | | | | | Local alloc is a performance optimization in ocfs2 in which a node takes a window of bits from the global bitmap and then uses that for all small local allocations. This window size is fixed to 8MB currently. This patch allows users to specify the window size in MB including disabling it by passing in 0. If the number specified is too large, the fs will use the default value of 8MB. mount -o localalloc=X /dev/sdX /mntpoint Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Support commit= mount optionMark Fasheh2008-01-261-0/+23
| | | | | | | | | Mostly taken from ext3. This allows the user to set the jbd commit interval, in seconds. The default of 5 seconds stays the same, but now users can easily increase the commit interval. Typically, this would be increased in order to benefit performance at the expense of data-safety. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Initalize bitmap_cpg of ocfs2_super to be the maximum.Tao Ma2008-01-251-18/+1Star
| | | | | | | | | | | | | | | | | | | | This value is initialized from global_bitmap->id2.i_chain.cl_cpg. If there is only 1 group, it will be equal to the total clusters in the volume. So as for online resize, it should change for all the nodes in the cluster. It isn't easy and there is no corresponding lock for it. bitmap_cpg is only used in 2 areas: 1. Check whether the suballoc is too large for us to allocate from the global bitmap, so it is little used. And now the suballoc size is 2048, it rarely meet this situation and the check is almost useless. 2. Calculate which group a cluster belongs to. We use it during truncate to figure out which cluster group an extent belongs too. But we should be OK if we increase it though as the cluster group calculated shouldn't change and we only ever have a small bitmap_cpg on file systems with a single cluster group. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Rename ocfs2_meta_[un]lockMark Fasheh2008-01-251-3/+3
| | | | | | | Call this the "inode_lock" now, since it covers both data and meta data. This patch makes no functional changes. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Remove data locksMark Fasheh2008-01-251-1/+0Star
| | | | | | | | | | | | | | The meta lock now covers both meta data and data, so this just removes the now-redundant data lock. Combining locks saves us a round of lock mastery per inode and one less lock to ping between nodes during read/write. We don't lose much - since meta locks were always held before a data lock (and at the same level) ordered writeout mode (the default) ensured that flushing for the meta data lock also pushed out data anyways. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Remove mount/unmount votesMark Fasheh2008-01-251-38/+5Star
| | | | | | | | | | | | The node maps that are set/unset by these votes are no longer relevant, thus we can remove the mount and umount votes. Since those are the last two remaining votes, we can also remove the entire vote infrastructure. The vote thread has been renamed to the downconvert thread, and the small amount of functionality related to managing it has been moved into fs/ocfs2/dlmglue.c. All references to votes have been removed or updated. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Remove fs dependency on ocfs2_heartbeat moduleMark Fasheh2008-01-251-8/+0Star
| | | | | | | | | Now that the dlm exposes domain information to us, we don't need generic node up / node down callbacks. And since the DLM is only telling us when a node goes down unexpectedly, we no longer need to optimize away node down callbacks via the umount map. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Reset journal parameters after s_mount_opt updateMark Fasheh2007-11-281-3/+3
| | | | | | | Right now we're just setting them from the existing parameters, not the new ones that a remount specified. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>