summaryrefslogtreecommitdiffstats
path: root/block/qcow2.h
Commit message (Collapse)AuthorAgeFilesLines
* qcow2: remove n_start and n_end of qcow2_alloc_cluster_offset()Hu Tao2014-02-091-1/+1
| | | | | | | | | | | | | | | | | | | n_start can be actually calculated from offset. The number of sectors to be allocated(n_end - n_start) can be passed in in num. By removing n_start and n_end, we can save two parameters. The side effect is there is a bug in qcow2.c:preallocate() that passes incorrect n_start to qcow2_alloc_cluster_offset() is fixed. The bug can be triggerred by a larger cluster size than the default value(65536), for example: ./qemu-img create -f qcow2 \ -o 'cluster_size=131072,preallocation=metadata' file.img 4G Signed-off-by: Hu Tao <hutao@cn.fujitsu.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: fix wrong value of L1E_OFFSET_MASK, L2E_OFFSET_MASK and REFT_OFFSET_MASKHu Tao2014-01-241-3/+3
| | | | | | | | | Accoring to qcow spec, the offset fields in l1e, l2e and ref table entry start at bit 9. The offset is cluster offset, and the smallest possible cluster size is 512 bytes. Signed-off-by: Hu Tao <hutao@cn.fujitsu.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* snapshot: distinguish id and name in load_tmpWenchao Xia2013-12-041-1/+4
| | | | | | | | | | | | Since later this function will be used so improve it. The only caller of it now is qemu-img, and it is not impacted by introduce function bdrv_snapshot_load_tmp_by_id_or_name() that call bdrv_snapshot_load_tmp() twice to keep old search logic. bdrv_snapshot_load_tmp_by_id_or_name() return int to let caller know the errno, and errno will be used later. Also fix a typo in comments of bdrv_snapshot_delete(). Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* qcow2: Add more overlap check bitmask macrosMax Reitz2013-10-111-3/+11
| | | | | | | | | | Introduces the macros QCOW2_OL_CONSTANT and QCOW2_OL_ALL in addition to the already existing QCOW2_OL_CACHED, signifying all metadata overlap checks that can be performed in constant time (regardless of image size etc.) and truly all available overlap checks, respectively. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Add overlap-check optionsMax Reitz2013-10-111-0/+9
| | | | | | | | Add runtime options to tune the overlap checks to be performed before write accesses. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Make overlap check mask variableMax Reitz2013-10-111-3/+2Star
| | | | | | | | Replace the QCOW2_OL_DEFAULT macro by a variable overlap_check in BDRVQcowState. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Use negated overflow check maskMax Reitz2013-10-111-2/+2
| | | | | | | | | | | In qcow2_check_metadata_overlap and qcow2_pre_write_overlap_check, change the parameter signifying the checks to perform from its current positive form to a negative one, i.e., it will no longer explicitly specify every check to perform but rather a mask of checks not to perform. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: qcow2 - used QEMU_PACKED for on-disk structuresJeff Cody2013-09-251-1/+1
| | | | | | | | | | QCowHeader and QCowExtension are structs that reside in the on-disk image format, and are read and written directly via bdrv_pread()/write(), and as such should be packed to avoid any unintentional struct padding. Signed-off-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* snapshot: distinguish id and name in snapshot deleteWenchao Xia2013-09-121-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | Snapshot creation actually already distinguish id and name since it take a structured parameter *sn, but delete can't. Later an accurate delete is needed in qmp_transaction abort and blockdev-snapshot-delete-sync, so change its prototype. Also *errp is added to tip error, but return value is kepted to let caller check what kind of error happens. Existing caller for it are savevm, delvm and qemu-img, they are not impacted by introducing a new function bdrv_snapshot_delete_by_id_or_name(), which check the return value and do the operation again. Before this patch: For qcow2, it search id first then name to find the one to delete. For rbd, it search name. For sheepdog, it does nothing. After this patch: For qcow2, logic is the same by call it twice in caller. For rbd, it always fails in delete with id, but still search for name in second try, no change to user. Some code for *errp is based on Pavel's patch. Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com> Signed-off-by: Pavel Hrdina <phrdina@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Save refcount order in BDRVQcowStateMax Reitz2013-09-121-0/+1
| | | | | | | | | | Save the image refcount order in BDRVQcowState. This will be relevant for future code supporting different refcount orders than four and also for code that needs to verify a certain refcount order for an opened image. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2-cluster: Expand zero clustersMax Reitz2013-09-121-0/+5
| | | | | | | | | | | Add functionality for expanding zero clusters. This is necessary for downgrading the image version to one without zero cluster support. For non-backed images, this function may also just discard zero clusters instead of truly expanding them. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2-cache: Empty cacheMax Reitz2013-09-121-0/+2
| | | | | | | | Add a function for emptying a cache, i.e., flushing it and marking all elements invalid. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Discard VM state in active L1 after creating snapshotKevin Wolf2013-09-121-0/+5
| | | | | | | | | | | | | | | | During savevm, the VM state is written to the active L1 of the image and then a snapshot is taken. After that, the VM state isn't needed any more in the active L1 and should be discarded. This is implemented by this patch. The impact of not discarding the VM state is that a snapshot can never become smaller than any previous snapshot (because it would be padded with old VM state), and more importantly that future savevm operations cause unnecessary COWs (with associated flushes), which makes subsequent snapshots much slower. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
* qcow2: Pass discard type to qcow2_discard_clusters()Kevin Wolf2013-09-121-1/+1
| | | | | | | | The function will be used internally instead of only being called for guest discard requests. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
* qcow2-refcount: Repair OFLAG_COPIED errorsMax Reitz2013-08-301-0/+1
| | | | | | | | | | | Since the OFLAG_COPIED checks are now executed after the refcounts have been repaired (if repairing), it is safe to assume that they are correct but the OFLAG_COPIED flag may be not. Therefore, if its value differs from what it should be (considering the according refcount), that discrepancy can be repaired by correctly setting (or clearing that flag. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Metadata overlap checksMax Reitz2013-08-301-0/+39
| | | | | | | | | | | | | | | | | | Two new functions are added; the first one checks a given range in the image file for overlaps with metadata (main header, L1 tables, L2 tables, refcount table and blocks). The second one should be used immediately before writing to the image file as it calls the first function and, upon collision, marks the image as corrupt and makes the BDS unusable, thereby preventing further access. Both functions take a bitmask argument specifying the structures which should be checked for overlaps, making it possible to also check metadata writes against colliding with other structures. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Add corrupt bitMax Reitz2013-08-301-1/+6
| | | | | | | | | This adds an incompatible bit indicating corruption to qcow2. Any image with this bit set may not be written to unless for repairing (and subsequently clearing the bit if the repair has been successful). Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block/qcow2.h: Avoid "1LL << 63" (shifts into sign bit)Peter Maydell2013-08-301-3/+3
| | | | | | | | | | | | | | | | The expression "1LL << 63" tries to shift the 1 into the sign bit of a 'long long', which provokes a clang sanitizer warning: runtime error: left shift of 1 by 63 places cannot be represented in type 'long long' Use "1ULL << 63" as the definition of QCOW_OFLAG_COPIED instead to avoid this. For consistency, we also update the other QCOW_OFLAG definitions to use the ULL suffix rather than LL, though only the shift by 63 is undefined behaviour. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Use dashes instead of underscores in optionsKevin Wolf2013-07-261-4/+4
| | | | | | | | This is what QMP wants to use. The options haven't been enabled in any release yet, so we're still free to change them. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
* qcow2: Batch discardsKevin Wolf2013-06-241-0/+11
| | | | | | | | | | | | | This optimises the discard operation for freed clusters by batching discard requests (both snapshot deletion and bdrv_discard end up updating the refcounts cluster by cluster). Note that we don't discard asynchronously, but keep s->lock held. This is to avoid that a freed cluster is reallocated and written to while the discard is still in flight. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* qcow2: Options to enable discard for freed clustersKevin Wolf2013-06-241-0/+5
| | | | | | | | | Deleted snapshots are discarded in the image file by default, discard requests take their default from the -drive discard=... option and other places that free clusters must always be enabled explicitly. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* qcow2: Add refcount update reason to all callersKevin Wolf2013-06-241-3/+13
| | | | | | | | | | This adds a refcount update reason to all callers of update_refcounts(), so that a follow-up patch can use this information to decide whether clusters that reach a refcount of 0 should be discarded in the image file. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* qcow2: Catch some L1 table index overflowsKevin Wolf2013-05-141-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This catches the situation that is described in the bug report at https://bugs.launchpad.net/qemu/+bug/865518 and goes like this: $ qemu-img create -f qcow2 huge.qcow2 $((1024*1024))T Formatting 'huge.qcow2', fmt=qcow2 size=1152921504606846976 encryption=off cluster_size=65536 lazy_refcounts=off $ qemu-io /tmp/huge.qcow2 -c "write $((1024*1024*1024*1024*1024*1024 - 1024)) 512" Segmentation fault With this patch applied the segfault will be avoided, however the case will still fail, though gracefully: $ qemu-img create -f qcow2 /tmp/huge.qcow2 $((1024*1024))T Formatting 'huge.qcow2', fmt=qcow2 size=1152921504606846976 encryption=off cluster_size=65536 lazy_refcounts=off qemu-img: The image size is too large for file format 'qcow2' Note that even long before these overflow checks kick in, you get insanely high memory usage (up to INT_MAX * sizeof(uint64_t) = 16 GB for the L1 table), so with somewhat smaller image sizes you'll probably see qemu aborting for a failed g_malloc(). If you need huge image sizes, you should increase the cluster size to the maximum of 2 MB in order to get higher limits. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* aes: move aes.h from include/block to include/qemuAurelien Jarno2013-04-131-1/+1
| | | | | | | | | | | Move aes.h from include/block to include/qemu to show it can be reused by other subsystems. Cc: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@gmail.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
* qcow2: Allow requests with multiple l2metasKevin Wolf2013-03-281-0/+3
| | | | | | | | | | Instead of expecting a single l2meta, have a list of them. This allows to still have a single I/O request for the guest data, even though multiple l2meta may be needed in order to describe both a COW overwrite and a new cluster allocation (typical sequential write case). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* qcow2: Finalise interface of handle_alloc()Kevin Wolf2013-03-281-0/+5
| | | | | | | | The interface works completely on a byte granularity now and duplicated parameters are removed. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* qcow2: handle_alloc(): Get rid of keep_clusters parameterKevin Wolf2013-03-281-0/+5
| | | | | | | | | handle_alloc() is now called with the offset at which the actual new allocation starts instead of the offset at which the whole write request starts, part of which may already be processed. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* qcow2: Change handle_dependency to byte granularityKevin Wolf2013-03-281-0/+11
| | | | | | | | | | | This is a more precise description of what really constitutes a dependency. The behaviour doesn't change at this point because the COW area of the old request is still aligned to cluster boundaries and therefore an overlap is detected wheneven the requests touch any part of the same cluster. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* qcow2: Handle dependencies earlierKevin Wolf2013-03-281-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | Handling overlapping allocations isn't just a detail of cluster allocation. It is rather one of three ways to get the host cluster offset for a write request: 1. If a request overlaps an in-flight allocations, the cluster offset can be taken from there (this is what handle_dependencies will evolve into) or the request must just wait until the allocation has completed. Accessing the L2 is not valid in this case, it has outdated information. 2. Outside overlapping areas, check the clusters that can be written to as they are, with no COW involved. 3. If a COW is required, allocate new clusters Changing the code to reflect this doesn't change the behaviour because overlaps cannot exist for clusters that are kept in step 2. It does however make it easier for later patches to work on clusters that belong to an allocation that is still in flight. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* qcow2: Fix segfault in qcow2_invalidate_cacheKevin Wolf2013-03-191-0/+3
| | | | | | | Need to pass an options QDict to qcow2_open() now. This fixes a segfault on the migration target with qcow2. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Allow lazy refcounts to be enabled on the command lineKevin Wolf2013-03-151-0/+1
| | | | | | | | | | | | | | | | qcow2 images now accept a boolean lazy_refcounts options. Use it like this: -drive file=test.qcow2,lazy_refcounts=on If the option is specified on the command line, it overrides the default specified by the qcow2 header flags that were set when creating the image. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* block: move include files to include/block/Paolo Bonzini2012-12-191-2/+2
| | | | Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* qcow2: Enable dirty flag in qcow2_alloc_cluster_link_l2Kevin Wolf2012-12-131-0/+2
| | | | | | | | This is closer to where the dirty flag is really needed, and it avoids having checks for special cases related to cluster allocation directly in the writev loop. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Allocate l2meta only for cluster allocationsKevin Wolf2012-12-131-2/+5
| | | | | | | | | Even for writes to already allocated clusters, an l2meta is allocated, though it stays effectively unused. After this patch, only allocating requests still have one. Each l2meta now describes an in-flight request that writes to clusters that are not yet hooked up in the L2 table. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Drop l2meta.cluster_offsetKevin Wolf2012-12-131-4/+1Star
| | | | | | | | There's no real reason to have an l2meta for normal requests that don't allocate anything. Before we can get rid of it, we must return the host cluster offset in a different way. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Introduce Qcow2COWRegionKevin Wolf2012-12-131-6/+23
| | | | | | | | This makes it easier to address the areas for which a COW must be performed. As a nice side effect, the COW code in qcow2_alloc_cluster_link_l2 becomes really trivial. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Round QCowL2Meta.offset down to cluster boundaryKevin Wolf2012-12-131-0/+22
| | | | | | | | The offset within the cluster is already present as n_start and this is what the code uses. QCowL2Meta.offset is only needed at a cluster granularity. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: implement lazy refcountsStefan Hajnoczi2012-08-061-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | Lazy refcounts is a performance optimization for qcow2 that postpones refcount metadata updates and instead marks the image dirty. In the case of crash or power failure the image will be left in a dirty state and repaired next time it is opened. Reducing metadata I/O is important for cache=writethrough and cache=directsync because these modes guarantee that data is on disk after each write (hence we cannot take advantage of caching updates in RAM). Refcount metadata is not needed for guest->file block address translation and therefore does not need to be on-disk at the time of write completion - this is the motivation behind the lazy refcount optimization. The lazy refcount optimization must be enabled at image creation time: qemu-img create -f qcow2 -o compat=1.1,lazy_refcounts=on a.qcow2 10G qemu-system-x86_64 -drive if=virtio,file=a.qcow2,cache=writethrough Update qemu-iotests 031 and 036 since the extension header size changes when we add feature bit table entries. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: introduce dirty bitStefan Hajnoczi2012-08-061-0/+8
| | | | | | | | | | | | This patch adds an incompatible feature bit to mark images that have not been closed cleanly. When a dirty image file is opened a consistency check and repair is performed. Update qemu-iotests 031 and 036 since the extension header size changes when we add feature bit table entries. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: always operate caches in writeback modePaolo Bonzini2012-06-151-4/+1Star
| | | | | | | | | Writethrough does not need special-casing anymore in the qcow2 caches. The block layer adds flushes after every guest-initiated data write, and these will also flush the qcow2 caches to the OS. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Support for fixing refcount inconsistenciesKevin Wolf2012-06-151-1/+2
| | | | Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Zero write supportKevin Wolf2012-04-201-0/+1
| | | | Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Support for feature table header extensionKevin Wolf2012-04-201-0/+12
| | | | | | | Instead of printing an ugly bitmask, qemu can now print a more helpful string even for yet unknown features. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Support reading zero clustersKevin Wolf2012-04-201-0/+5
| | | | | | This adds support for reading zero clusters in version 3 images. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Version 3 imagesKevin Wolf2012-04-201-1/+16
| | | | | | | | | | | | | This adds the basic infrastructure to qcow2 to handle version 3 images. It includes code to create v3 images, allow header updates for v3 images and checks feature bits. It still misses support for zero clusters, so this is not a fully compliant implementation of v3 yet. The default for creating new images stays at v2 for now. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Ignore reserved bits in refcount table entriesKevin Wolf2012-04-201-0/+2
| | | | Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Ignore reserved bits in get_cluster_offsetKevin Wolf2012-04-201-0/+21
| | | | | | | | | | | With this change, reading from a qcow2 image ignores all reserved bits that are set in an L1 or L2 table entry. Now get_cluster_offset() assigns *cluster_offset only the offset without any other flags. The cluster type is not longer encoded in the offset, but a positive return value in case of success. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Save disk size in snapshot headerKevin Wolf2012-04-201-0/+1
| | | | | | | | | | | | | This allows that different snapshots of an image can have different sizes, which is a requirement for enabling image resizing even with images that have internal snapshots. We don't do the actual support for it now, but make sure that the additional field is present and not completely ignored in all version 3 images. When trying to load a snapshot of different size, it returns an error. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Reduce number of I/O requestsKevin Wolf2012-03-121-0/+1
| | | | | | | | | | | | | | If the first part of a write request is allocated, but the second isn't and it can be allocated so that the resulting area is contiguous, handle it at once. This is a common case for sequential writes. After this patch, alloc_cluster_offset() only checks if the clusters are already allocated or how many new clusters can be allocated contigouosly. The actual cluster allocation is split off into a new function do_alloc_cluster_offset(). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
* qcow2: Add qcow2_alloc_clusters_at()Kevin Wolf2012-03-121-0/+2
| | | | | | | | | This function allows to allocate clusters at a given offset in the image file. This is useful if you want to allocate the second part of an area that must be contiguous. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>