summaryrefslogtreecommitdiffstats
path: root/block/qcow2-refcount.c
Commit message (Collapse)AuthorAgeFilesLines
* qcow2: Make qcow2_free_any_clusters() free only one clusterAlberto Garcia2020-09-151-4/+4
| | | | | | | | | | | | | | | | This function takes an L2 entry and a number of clusters to free. Although in principle it can free any type of cluster (using the L2 entry to determine its type) in practice the API is broken because compressed clusters have a variable size and there is no way to free more than one without having the L2 entry of each one of them. The good news all callers are passing nb_clusters=1 so we can simply get rid of that parameter. Signed-off-by: Alberto Garcia <berto@igalia.com> Message-Id: <77cea0f4616f921d37e971b3c5b18a2faa24b173.1599573989.git.berto@igalia.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
* qcow2: Use macros for the L1, refcount and bitmap table entry sizesAlberto Garcia2020-09-151-43/+46
| | | | | | | | | This patch replaces instances of sizeof(uint64_t) in the qcow2 driver with macros that indicate what those sizes are actually referring to. Signed-off-by: Alberto Garcia <berto@igalia.com> Message-Id: <20200828110828.13833-1-berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
* qcow2: Add subcluster support to check_refcounts_l2()Alberto Garcia2020-08-251-5/+11
| | | | | | | | | | | | | | | | | | | | | The offset field of an uncompressed cluster's L2 entry must be aligned to the cluster size, otherwise it is invalid. If the cluster has no data then it means that the offset points to a preallocation, so we can clear the offset field without affecting the guest-visible data. This is what 'qemu-img check' does when run in repair mode. On traditional qcow2 images this can only happen when QCOW_OFLAG_ZERO is set, and repairing such entries turns the clusters from ZERO_ALLOC into ZERO_PLAIN. Extended L2 entries have no ZERO_ALLOC clusters and no QCOW_OFLAG_ZERO but the idea is the same: if none of the subclusters are allocated then we can clear the offset field and leave the bitmap untouched. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <9f4ed1d0a34b0a545b032c31ecd8c14734065342.1594396418.git.berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
* qcow2: Add l2_entry_size()Alberto Garcia2020-08-251-6/+8
| | | | | | | | | | | | | | qcow2 images with subclusters have 128-bit L2 entries. The first 64 bits contain the same information as traditional images and the last 64 bits form a bitmap with the status of each individual subcluster. Because of that we cannot assume that L2 entries are sizeof(uint64_t) anymore. This function returns the proper value for the image. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <d34d578bd0380e739e2dde3e8dd6187d3d249fa9.1594396418.git.berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
* qcow2: Add get_l2_entry() and set_l2_entry()Alberto Garcia2020-08-251-8/+9
| | | | | | | | | | | | | | | | | The size of an L2 entry is 64 bits, but if we want to have subclusters we need extended L2 entries. This means that we have to access L2 tables and slices differently depending on whether an image has extended L2 entries or not. This patch replaces all l2_slice[] accesses with calls to get_l2_entry() and set_l2_entry(). Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <9586363531fec125ba1386e561762d3e4224e9fc.1594396418.git.berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
* block: Comment cleanupsEric Blake2020-05-051-1/+1
| | | | | | | | | | | It's been a while since we got rid of the sector-based bdrv_read and bdrv_write (commit 2e11d756); let's finish the job on a few remaining comments. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20200428213807.776655-1-eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
* block: Add flags to bdrv(_co)_truncate()Kevin Wolf2020-04-301-1/+1
| | | | | | | | | | | | | Now that block drivers can support flags for .bdrv_co_truncate, expose the parameter in the node level interfaces bdrv_co_truncate() and bdrv_truncate(). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200424125448.63318-3-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: update_refcount(): Reset old_table_index after qcow2_cache_put()Kevin Wolf2020-02-181-0/+1
| | | | | | | | | | | | | | In the case that update_refcount() frees a refcount block, it evicts it from the metadata cache. Before doing so, however, it returns the currently used refcount block to the cache because it might be the same. Returning the refcount block early means that we need to reset old_table_index so that we reload the refcount block in the next iteration if it is actually still in use. Fixes: f71c08ea8e60f035485a512fd2af8908567592f0 Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20200211094900.17315-2-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Don't round the L1 table allocation up to the sector sizeAlberto Garcia2020-02-061-1/+1
| | | | | | | | | | | The L1 table is read from disk using the byte-based bdrv_pread() and is never accessed beyond its last element, so there's no need to allocate more memory than that. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: b2e27214ec7b03a585931bcf383ee1ac3a641a10.1579374329.git.berto@igalia.com Signed-off-by: Max Reitz <mreitz@redhat.com>
* block: Add @exact parameter to bdrv_co_truncate()Max Reitz2019-10-281-1/+1
| | | | | | | | | | | | | | | | | | | | We have two drivers (iscsi and file-posix) that (in some cases) return success from their .bdrv_co_truncate() implementation if the block device is larger than the requested offset, but cannot be shrunk. Some callers do not want that behavior, so this patch adds a new parameter that they can use to turn off that behavior. This patch just adds the parameter and lets the block/io.c and block/block-backend.c functions pass it around. All other callers always pass false and none of the implementations evaluate it, so that this patch does not change existing behavior. Future patches take care of that. Suggested-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20190918095144.955-5-mreitz@redhat.com Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
* qcow2: Fix corruption bug in qcow2_detect_metadata_preallocation()Kevin Wolf2019-10-251-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | qcow2_detect_metadata_preallocation() calls qcow2_get_refcount() which requires s->lock to be taken to protect its accesses to the refcount table and refcount blocks. However, nothing in this code path actually took the lock. This could cause the same cache entry to be used by two requests at the same time, for different tables at different offsets, resulting in image corruption. As it would be preferable to base the detection on consistent data (even though it's just heuristics), let's take the lock not only around the qcow2_get_refcount() calls, but around the whole function. This patch takes the lock in qcow2_co_block_status() earlier and asserts in qcow2_detect_metadata_preallocation() that we hold the lock. Fixes: 69f47505ee66afaa513305de0c1895a224e52c45 Cc: qemu-stable@nongnu.org Reported-by: Michael Weiser <michael.weiser@gmx.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Tested-by: Michael Weiser <michael.weiser@gmx.de> Reviewed-by: Michael Weiser <michael.weiser@gmx.de> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
* Include qemu-common.h exactly where neededMarkus Armbruster2019-06-121-1/+0Star
| | | | | | | | | | | | | | | | No header includes qemu-common.h after this commit, as prescribed by qemu-common.h's file comment. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190523143508.25387-5-armbru@redhat.com> [Rebased with conflicts resolved automatically, except for include/hw/arm/xlnx-zynqmp.h hw/arm/nrf51_soc.c hw/arm/msf2-soc.c block/qcow2-refcount.c block/qcow2-cluster.c block/qcow2-cache.c target/arm/cpu.h target/lm32/cpu.h target/m68k/cpu.h target/mips/cpu.h target/moxie/cpu.h target/nios2/cpu.h target/openrisc/cpu.h target/riscv/cpu.h target/tilegx/cpu.h target/tricore/cpu.h target/unicore32/cpu.h target/xtensa/cpu.h; bsd-user/main.c and net/tap-bsd.c fixed up]
* block/qcow2-refcount: add trace-point to qcow2_process_discardsVladimir Sementsov-Ogievskiy2019-06-041-1/+6
| | | | | | | | Let's at least trace ignored failure. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: avoid recursive block_status call if possibleVladimir Sementsov-Ogievskiy2019-06-041-0/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | drv_co_block_status digs bs->file for additional, more accurate search for hole inside region, reported as DATA by bs since 5daa74a6ebc. This accuracy is not free: assume we have qcow2 disk. Actually, qcow2 knows, where are holes and where is data. But every block_status request calls lseek additionally. Assume a big disk, full of data, in any iterative copying block job (or img convert) we'll call lseek(HOLE) on every iteration, and each of these lseeks will have to iterate through all metadata up to the end of file. It's obviously ineffective behavior. And for many scenarios we don't need this lseek at all. However, lseek is needed when we have metadata-preallocated image. So, let's detect metadata-preallocation case and don't dig qcow2's protocol file in other cases. The idea is to compare allocation size in POV of filesystem with allocations size in POV of Qcow2 (by refcounts). If allocation in fs is significantly lower, consider it as metadata-preallocation case. 102 iotest changed, as our detector can't detect shrinked file as metadata-preallocation, which don't seem to be wrong, as with metadata preallocation we always have valid file length. Two other iotests have a slight change in their QMP output sequence: Active 'block-commit' returns earlier because the job coroutine yields earlier on a blocking operation. This operation is loading the refcount blocks in qcow2_detect_metadata_preallocation(). Suggested-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2.h: add missing includeVladimir Sementsov-Ogievskiy2019-05-281-1/+0Star
| | | | | | | | | | | | qcow2.h depends on block_int.h. Compilation isn't broken currently only due to block_int.h always included before qcow2.h. Though, it seems better to directly include block_int.h in qcow2.h. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20190506142741.41731-2-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>
* qcow2: Define and use QCOW2_COMPRESSED_SECTOR_SIZEAlberto Garcia2019-05-201-11/+14
| | | | | | | | | | | | When an L2 table entry points to a compressed cluster the space used by the data is specified in 512-byte sectors. This size is independent from BDRV_SECTOR_SIZE and is specific to the qcow2 file format. The QCOW2_COMPRESSED_SECTOR_SIZE constant defined in this patch makes this explicit. Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Replace bdrv_write() with bdrv_pwrite()Alberto Garcia2019-05-101-2/+2
| | | | | | | | | There's only one bdrv_write() call left in the qcow2 code, and it can be trivially replaced with the byte-based bdrv_pwrite(). Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2-refcount: don't mask corruptions under internal errorsVladimir Sementsov-Ogievskiy2019-05-071-10/+9Star
| | | | | | | | | | | | | No reasons for not reporting found corruptions as corruptions in case of some internal errors, especially in case of just failed to fix l2 entry (and in this case, missed corruptions may influence comparing logic, when we calculate difference between corruptions fields of two results) Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20190227131433.197063-6-vsementsov@virtuozzo.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
* qcow2-refcount: check_refcounts_l2: don't count fixed cluster as allocatedVladimir Sementsov-Ogievskiy2019-05-071-9/+9
| | | | | | | | | Do not count a cluster which is fixed to be ZERO as allocated. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20190227131433.197063-5-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>
* qcow2-refcount: check_refcounts_l2: reduce ignored overlapsVladimir Sementsov-Ogievskiy2019-05-071-7/+9
| | | | | | | | | | | Reduce number of structures ignored in overlap check: when checking active table ignore active tables, when checking inactive table ignore inactive ones. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20190227131433.197063-4-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>
* qcow2-refcount: avoid eating RAMVladimir Sementsov-Ogievskiy2019-05-071-0/+19
| | | | | | | | | | | | | | | | qcow2_inc_refcounts_imrt() (through realloc_refcount_array()) can eat an unpredictable amount of memory on corrupted table entries, which are referencing regions far beyond the end of file. Prevent this, by skipping such regions from further processing. Interesting that iotest 138 checks exactly the behavior which we fix here. So, change the test appropriately. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20190227131433.197063-3-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>
* qcow2-refcount: fix check_oflag_copiedVladimir Sementsov-Ogievskiy2019-05-071-4/+4
| | | | | | | | | Increase corruptions_fixed only after successful fix. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20190227131433.197063-2-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>
* qcow2: Support external data file in qemu-img checkKevin Wolf2019-03-081-11/+30
| | | | | | | | | | | | For external data files, data clusters must be excluded from the refcount calculations. Instead, an implicit refcount of 1 is assumed for the COPIED flag. Compressed clusters and internal snapshots are incompatible with external data files, so print an error if they are in use for images with an external data file. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: External file I/OKevin Wolf2019-03-081-9/+30
| | | | | | | | | This changes the qcow2 implementation to direct all guest data I/O to s->data_file rather than bs->file, while metadata I/O still uses bs->file. At the moment, this is still always the same, but soon we'll add options to set s->data_file to an external data file. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Pass bs to qcow2_get_cluster_type()Kevin Wolf2019-03-081-5/+5
| | | | Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Assert that refcount block offsets fit in the refcount tableAlberto Garcia2019-02-011-0/+3
| | | | | | | | | | | | | | | | | | | Refcount table entries have a field to store the offset of the refcount block. The rest of the bits of the entry are currently reserved. The offset is always taken from the entry using REFT_OFFSET_MASK to ensure that we only use the bits that belong to that field. While that mask is used every time we read from the refcount table, it is never used when we write to it. Due to the other constraints of the qcow2 format QEMU can never produce refcount block offsets that don't fit in that field so any such offset when allocating a refcount block would indicate a bug in QEMU. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Don't allow overflow during cluster allocationEric Blake2018-11-191-7/+13
| | | | | | | | | | | | | | | | | | | | | | | Our code was already checking that we did not attempt to allocate more clusters than what would fit in an INT64 (the physical maximimum if we can access a full off_t's worth of data). But this does not catch smaller limits enforced by various spots in the qcow2 image description: L1 and normal clusters of L2 are documented as having bits 63-56 reserved for other purposes, capping our maximum offset at 64PB (bit 55 is the maximum bit set). And for compressed images with 2M clusters, the cap drops the maximum offset to bit 48, or a maximum offset of 512TB. If we overflow that offset, we would write compressed data into one place, but try to decompress from another, which won't work. It's actually possible to prove that overflow can cause image corruption without this patch; I'll add the iotests separately in the next commit. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Read outside array bounds in qcow2_pre_write_overlap_check()Liam Merwick2018-11-121-8/+10
| | | | | | | | | | | | | | | | The commit for 0e4e4318eaa5 increments QCOW2_OL_MAX_BITNR but does not add an array entry for QCOW2_OL_BITMAP_DIRECTORY_BITNR to metadata_ol_names[]. As a result, an array dereference of metadata_ol_names[8] in qcow2_pre_write_overlap_check() could result in a read outside of the array bounds. Fixes: 0e4e4318eaa5 ('qcow2: add overlap check for bitmap directory') Cc: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Liam Merwick <Liam.Merwick@oracle.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 1541453919-25973-6-git-send-email-Liam.Merwick@oracle.com Signed-off-by: Max Reitz <mreitz@redhat.com>
* Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into stagingPeter Maydell2018-07-101-1/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Block layer patches: - Copy offloading fixes for when the copy increases the image size - Temporary revert of the removal of deprecated -drive options - Fix request serialisation in the image fleecing scenario - Fix copy-on-read crash with unaligned image size - Fix another drain crash # gpg: Signature made Tue 10 Jul 2018 16:37:52 BST # gpg: using RSA key 7F09B272C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * remotes/kevin/tags/for-upstream: (24 commits) block: Use common write req handling in truncate block: Fix bdrv_co_truncate overlap check block: Use common req handling in copy offloading block: Use common req handling for discard block: Fix handling of image enlarging write block: Extract common write req handling block: Use uint64_t for BdrvTrackedRequest byte fields block: Use BdrvChild to discard block: Add copy offloading trace points block: Prefix file driver trace points with "file_" Revert "block: Remove deprecated -drive geometry options" Revert "block: Remove deprecated -drive option addr" Revert "block: Remove deprecated -drive option serial" Revert "block: Remove dead deprecation warning code" block/blklogwrites: Make sure the log sector size is not too small qapi/block-core.json: Add missing documentation for blklogwrites log-append option block/backup: fix fleecing scheme: use serialized writes block: add BDRV_REQ_SERIALISING flag block: split flags in copy_range block/io: fix copy_range ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * block: Use BdrvChild to discardFam Zheng2018-07-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | Other I/O functions are already using a BdrvChild pointer in the API, so make discard do the same. It makes it possible to initiate the same permission checks before doing I/O, and much easier to share the helper functions for this, which will be added and used by write, truncate and copy range paths. Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* | qcow2: add overlap check for bitmap directoryVladimir Sementsov-Ogievskiy2018-07-091-0/+10
|/ | | | | | Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20180705151515.779173-1-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>
* qcow2: Repair OFLAG_COPIED when fixing leaksMax Reitz2018-06-111-8/+17
| | | | | | | | | | | | | | | | | | | Repairing OFLAG_COPIED is usually safe because it is done after the refcounts have been repaired. Therefore, it we did not find anyone else referencing a data or L2 cluster, it makes no sense to not set OFLAG_COPIED -- and the other direction (clearing OFLAG_COPIED) is always safe, anyway, it may just induce leaks. Furthermore, if OFLAG_COPIED is actually consistent with a wrong (leaky) refcount, we will decrement the refcount with -r leaks, but OFLAG_COPIED will then be wrong. qemu-img check should not produce images that are more corrupted afterwards then they were before. Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1527085 Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20180509200059.31125-2-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
* block: use local path for local headersMichael S. Tsirkin2018-05-311-1/+1
| | | | | | | | | | When pulling in headers that are in the same directory as the C file (as opposed to one in include/), we should use its relative path, without a directory. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
* Fix error message about compressed clusters with OFLAG_COPIEDAlberto Garcia2018-05-151-2/+2
| | | | | | | | | | | | | | | | Compressed clusters are not supposed to have the COPIED bit set. "qemu-img check" detects that and prints an error message reporting the number of the affected host cluster. This doesn't make much sense because compressed clusters are not aligned to host clusters, so it would be better to report the offset instead. Plus, the calculation is wrong and it uses the raw L2 entry as if it was simply an offset. This patch fixes the error message and reports the offset of the compressed cluster. Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: 0f687957feb72e80c740403191a47e607c2463fe.1523376013.git.berto@igalia.com Signed-off-by: Max Reitz <mreitz@redhat.com>
* qcow2: Reset free_cluster_index when allocating a new refcount blockAlberto Garcia2018-03-261-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we try to allocate new clusters we first look for available ones starting from s->free_cluster_index and once we find them we increase their reference counts. Before we get to call update_refcount() to do this last step s->free_cluster_index is already pointing to the next cluster after the ones we are trying to allocate. During update_refcount() it may happen however that we also need to allocate a new refcount block in order to store the refcounts of these new clusters (and to complicate things further that may also require us to grow the refcount table). After all this we don't know if the clusters that we originally tried to allocate are still available, so we return -EAGAIN to ask the caller to restart the search for free clusters. This is what can happen in a common scenario: 1) We want to allocate a new cluster and we see that cluster N is free. 2) We try to increase N's refcount but all refcount blocks are full, so we allocate a new one at N+1 (where s->free_cluster_index was pointing at). 3) Once we're done we return -EAGAIN to look again for a free cluster, but now s->free_cluster_index points at N+2, so that's the one we allocate. Cluster N remains unallocated and we have a hole in the qcow2 file. This can be reproduced easily: qemu-img create -f qcow2 -o cluster_size=512 hd.qcow2 1M qemu-io -c 'write 0 124k' hd.qcow2 After this the image has 132608 bytes (256 clusters), and the refcount block is full. If we write 512 more bytes it should allocate two new clusters: the data cluster itself and a new refcount block. qemu-io -c 'write 124k 512' hd.qcow2 However the image has now three new clusters (259 in total), and the first one of them is empty (and unallocated): dd if=hd.qcow2 bs=512c skip=256 count=1 | hexdump -C If we write larger amounts of data in the last step instead of the 512 bytes used in this example we can create larger holes in the qcow2 file. What this patch does is reset s->free_cluster_index to its previous value when alloc_refcount_block() returns -EAGAIN. This way the caller will try to allocate again the original clusters if they are still free. The output of iotest 026 also needs to be updated because now that images have no holes some tests fail at a different point and the number of leaked clusters is different. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Make qemu-img check detect corrupted L1 tables in snapshotsAlberto Garcia2018-03-091-0/+14
| | | | | | | | | | | | | 'qemu-img check' cannot detect if a snapshot's L1 table is corrupted. This patch checks the table's offset and size and reports corruption if the values are not valid. This patch doesn't add code to fix that corruption yet, only to detect and report it. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Check snapshot L1 tables in qcow2_check_metadata_overlap()Alberto Garcia2018-03-091-1/+9
| | | | | | | | | | | The inactive-l2 overlap check iterates uses the L1 tables from all snapshots, but it does not validate them first. We now have a function to take care of this, so let's use it. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: introduce qcow2_write_caches and qcow2_flush_cachesPaolo Bonzini2018-03-091-0/+28
| | | | | | | | | | They will be used to avoid recursively taking s->lock during bdrv_open or bdrv_check. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1516279431-30424-7-git-send-email-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qcow2: Replace align_offset() with ROUND_UP()Alberto Garcia2018-03-021-2/+2
| | | | | | | | | | | | | | | | The align_offset() function is equivalent to the ROUND_UP() macro so there's no need to use the former. The ROUND_UP() name is also a bit more explicit. This patch uses ROUND_UP() instead of the slower QEMU_ALIGN_UP() because align_offset() already requires that the second parameter is a power of two. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20180215131008.5153-1-berto@igalia.com Signed-off-by: Max Reitz <mreitz@redhat.com>
* qcow2: Update qcow2_update_snapshot_refcount() to support L2 slicesAlberto Garcia2018-02-131-14/+18
| | | | | | | | | | | | | | | | | qcow2_update_snapshot_refcount() increases the refcount of all clusters of a given snapshot. In order to do that it needs to load all its L2 tables and iterate over their entries. Since we'll be loading L2 slices instead of full tables we need to add an extra loop that iterates over all slices of each L2 table. This function doesn't need any additional changes so apart from that this patch simply updates the variable name from l2_table to l2_slice. Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: 5f4db199b9637f4833b58487135124d70add8cf0.1517840877.git.berto@igalia.com Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
* qcow2: Prepare qcow2_update_snapshot_refcount() for adding L2 slice supportAlberto Garcia2018-02-131-69/+75
| | | | | | | | | | | | | | | | | | | | | Adding support for L2 slices to qcow2_update_snapshot_refcount() needs (among other things) an extra loop that iterates over all slices of each L2 table. Putting all changes in one patch would make it hard to read because all semantic changes would be mixed with pure indentation changes. To make things easier this patch simply creates a new block and changes the indentation of all lines of code inside it. Thus, all modifications in this patch are cosmetic. There are no semantic changes and no variables are renamed yet. The next patch will take care of that. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 8ffaa5e55bd51121f80e498f4045b64902a94293.1517840877.git.berto@igalia.com Signed-off-by: Max Reitz <mreitz@redhat.com>
* qcow2: Remove BDS parameter from qcow2_cache_is_table_offset()Alberto Garcia2018-02-131-3/+3
| | | | | | | | | | | | This function was only using the BlockDriverState parameter to pass it to qcow2_cache_get_table_addr(). This is no longer necessary so this parameter can be removed. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: eb0ed90affcf302e5a954bafb5931b5215483d3a.1517840877.git.berto@igalia.com Signed-off-by: Max Reitz <mreitz@redhat.com>
* qcow2: Remove BDS parameter from qcow2_cache_discard()Alberto Garcia2018-02-131-3/+3
| | | | | | | | | | | | This function was only using the BlockDriverState parameter to pass it to qcow2_cache_get_table_idx() and qcow2_cache_table_release(). This is no longer necessary so this parameter can be removed. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 9724f7e38e763ad3be32627c6b7fe8df9edb1476.1517840877.git.berto@igalia.com Signed-off-by: Max Reitz <mreitz@redhat.com>
* qcow2: Remove BDS parameter from qcow2_cache_put()Alberto Garcia2018-02-131-15/+15
| | | | | | | | | | | | This function was only using the BlockDriverState parameter to pass it to qcow2_cache_get_table_idx(). This is no longer necessary so this parameter can be removed. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 6f98155489054a457563da77cdad1a66ebb3e896.1517840876.git.berto@igalia.com Signed-off-by: Max Reitz <mreitz@redhat.com>
* qcow2: Remove BDS parameter from qcow2_cache_entry_mark_dirty()Alberto Garcia2018-02-131-8/+6Star
| | | | | | | | | | | | This function was only using the BlockDriverState parameter to pass it to qcow2_cache_get_table_idx(). This is no longer necessary so this parameter can be removed. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 5c40516a91782b083c1428b7b6a41bb9e2679bfb.1517840876.git.berto@igalia.com Signed-off-by: Max Reitz <mreitz@redhat.com>
* qcow2: Repair unaligned preallocated zero clustersMax Reitz2018-01-231-12/+58
| | | | | | | | | | We can easily repair unaligned preallocated zero clusters by discarding them, so why not do it? Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20171110203759.14018-2-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
* qcow2: Add bounds check to get_refblock_offset()Max Reitz2017-11-171-1/+25
| | | | | | | | | | Reported-by: R. Nageswara Sastry <nasastry@in.ibm.com> Buglink: https://bugs.launchpad.net/qemu/+bug/1728661 Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20171110203111.7666-5-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
* qcow2: Prevent allocating compressed clusters at offset 0Alberto Garcia2017-11-141-0/+7
| | | | | | | | | | | | | | | | If the refcount data is corrupted then we can end up trying to allocate a new compressed cluster at offset 0 in the image, triggering an assertion in qcow2_alloc_bytes() that would crash QEMU: qcow2_alloc_bytes: Assertion `offset' failed. This patch adds an explicit check for this scenario and a new test case. Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: fb53467cf48e95ff3330def1cf1003a5b862b7d9.1509718618.git.berto@igalia.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
* qcow2: Prevent allocating refcount blocks at offset 0Alberto Garcia2017-11-141-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Each entry in the qcow2 cache contains an offset field indicating the location of the data in the qcow2 image. If the offset is 0 then it means that the entry contains no data and is available to be used when needed. Because of that it is not possible to store in the cache the first cluster of the qcow2 image (offset = 0). This is not a problem because that cluster always contains the qcow2 header and we're not using this cache for that. However, if the qcow2 image is corrupted it can happen that we try to allocate a new refcount block at offset 0, triggering this assertion and crashing QEMU: qcow2_cache_entry_mark_dirty: Assertion `c->entries[i].offset != 0' failed This patch adds an explicit check for this scenario and a new test case. This problem was originally reported here: https://bugs.launchpad.net/qemu/+bug/1728615 Reported-by: R.Nageswara Sastry <nasastry@in.ibm.com> Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 92a2fadd10d58b423f269c1d1a309af161cdc73f.1509718618.git.berto@igalia.com Signed-off-by: Max Reitz <mreitz@redhat.com>
* qcow2: truncate the tail of the image file after shrinking the imagePavel Butsykin2017-10-061-0/+22
| | | | | | | | | | | Now after shrinking the image, at the end of the image file, there might be a tail that probably will never be used. So we can find the last used cluster and cut the tail. Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20170929121613.25997-3-pbutsykin@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>