author | Michael Scherle | 2022-08-02 12:09:36 +0200 |
---|---|---|
committer | Michael Scherle | 2022-08-02 12:09:36 +0200 |
commit | 7d6836356d70ee2b3beef067f95c9295d068f602 (patch) | |
tree | ab4d5ffcf843ebf972b5eff484a2be32517e71e7 | |
parent | corrected retun values for cowtest (diff) | |
improved cow readme.md
-rw-r--r-- | src/fuse/cowDoc/readme.md | 70 |
1 files changed, 36 insertions, 34 deletions
diff --git a/src/fuse/cowDoc/readme.md b/src/fuse/cowDoc/readme.md
index 1e77802..f3784e9 100644
--- a/src/fuse/cowDoc/readme.md
+++ b/src/fuse/cowDoc/readme.md
@@ -10,17 +10,19 @@
 # Introduction
-This extension to the fuse dnbd3 client allows images to be mounted writable. The changes are saved in a separate file (also called Copy on Write, cow for short) on the client computer. These changes are uploaded to the cow server in the background. Once the user unmounts the image, any remaining changes are uploaded. As soon as all changes have been uploaded, the changes can be merged into a copy of the original image on the cow server (this can be set in the start parameters).
+This extension of the fuse dnbd3 client makes it possible to mount images in a writable way. The changes are saved in a separate file ) on the client computer (also called Copy on Write, cow for short). These changes are uploaded to the cow server in the background. As soon as the user unmounts the image, all remaining changes are uploaded. Once all have been uploaded, the changes can be merged into a copy of the original image on the cow server (this can be set in the start parameters).
+
 A typical use case is updating or adding software to an existing image.
 # Usage
 ### New Parameters
-- `-c <path>` Enables the cow functionality, the argument sets the path for the temporary `meta` and `data` file in which the writes are stored.
-- `-C <address>` sets the address of the cow server. The cow server is responsible for merging the original image with the changes from the client.
-- `-L <path>` Similar to `-c <path>` but instead of creating a new session, it loads an existing from the given path.
-- `-m` if set, the client will request a merge after the image is unmounted and all change are uploaded.
+- `-c <path>` Enables the cow functionality. The `path` parameter sets the path for the temporary `meta` and `data` files in which the changes are saved.
+- `-C <address>` sets the address of the cow server. The Cow server is responsible for merging the original image with the client's changes.
+
+- `- L <path>` Similar to `-c <path>`, but instead of creating a new session, an existing one is loaded from the specified path.
+- `-m` the client requests a merge after the image has been unmounted and all changes have been uploaded.
 - `cowStatFile` creates a status file at the same location as the data and meta file. The file contains information about the current session, for more information see [here](#status).
 - `--cowStatStdout` similar to `--cowStatFile` but the information will be printed in the stdout.
@@ -39,16 +41,16 @@ Example parameters for creating a new cow session:
 ## Data structure
-The data structure is split in two main parts. The actual data from the write on the image and its corresponding metadata. Its also important to distinguish between a dnbd3 block which is 4096byte and and cow block which groups 320 dnbd3 blocks together. An cow block has an `cow_block_metadata_t` struct which holds the corresponding meta data. The metadata is used to determine if and block has been written on, where this block is stored in the data file, when it was last modified and when it was uploaded. But more later.
+The data structure is divided into two main parts. The actual data of the writing on the image and the corresponding metadata. It is also important to distinguish between a dnbd3 block, which is 4096 bytes in size, and a cow block, which combines 320 dnbd3 blocks. A cow block has a `cow_block_metadata_t` structure that contains the corresponding metadata. The metadata is used to determine if a dnbd3 block has been written to, where that block is stored in the `data` file, when it was last modified and when it was uploaded. But more on this later.
 ### Blockmetadata
 ![Datastructure](img/datastructure.jpg)
-The data structure for storing metadata cow blocks contains a Layer 1(L1) and a Layer 2 (L2). L1 contains pointers to the L2's.
-The whole L1 array is initialized at the beginning and cannot be resized, so the size of the L1 array limits the total size of the image.
-The L2's are dynamically created once needed. So at the beginning, all L1 pointers will be null. The L2's are arrays which contain 1024
+TThe data structure for storing "cow_block_metadata_t" contains a layer 1 (L1) and a layer 2 (L2). L1 contains pointers to the L2's.
+The entire L1 array is initialised at the beginning and cannot be resized, therefore the size of the L1 array limits the total size of the image.
+The L2's are created dynamically as they are needed. So at the beginning all L1 pointers are zero. The L2's are arrays containing 1024
 `cow_block_metadata_t` structs.
 ```C
@@ -60,19 +62,19 @@ typedef struct cow_block_metadata
 atomic_char bitfield[40];
 } cow_block_metadata_t;
 ```
-Each `cow_block_metadata_t` contains a 40 byte so 320 bit bit field. The bit field indicates whether the corresponding dnbd3 block contains data or not. For e.g. if the bit field starts with 01.., the first 4096 contains not data and the next 4096 contain data.
-So each `cow_block_metadata_t` stores the metadata of up to 320*4096 byte if all bits are set to 1. The offset is the offset where in the data file is the corresponding data stored. The timeChanged property contains the unix when the block was last modified. It's 0 if it was never modified or if the last changes are already uploaded.
+Each `cow_block_metadata_t` contains a 40 byte, 320 bit bit field. The bit field indicates whether the corresponding dnbd3 block contains data or not. If, for example, the bit field begins with 01... the first 4096 contain no data and the next 4096 contain data.
+So each `cow_block_metadata_t` stores the metadata of up to 320*4096 bytes if all bits are set to 1. The offset field is the offset where in the data file the corresponding data is stored. The property timeChanged contains the Unix time when the block was last changed. It is 0 if it has never been changed or if the last changes have already been uploaded.
-The L2 arrays and `cow_block_metadata_t` are sorted to original Offsets. So the first L1 pointer, thus the first L2 array, addresses the first 1024 * 320 * 4096 Bytes (L2Size * bitfieldsize * DNBD3Blocksize) of Data and so on.
+The L2 arrays and `cow_block_metadata_t` are sorted according to the original offsets of the image. Thus the first L1 pointer, i.e. the first L2 array, addresses the first 1024 * 320 * 4096 bytes (L2Size * bitfieldsize * DNBD3Blocksize) of data and so on.
-So for example, to get the `cow_block_metadata_t` for offset 4033085440 you would take L1[3] since
+For example, to get the "cow_block_metadata_t" for offset 4033085440, one would take L1[3], since
 ```
 4033085440 / ( COW_L2_STORAGE_CAPACITY ) ≈ 3.005
 ```
- Then you would take the fifth `cow_block_metadata_t` in the L2 array because of
+ Then one would take the fifth `cow_block_metadata_t` in the L2 array because of
 ```
 (4033085440 mod COW_L2_STORAGE_CAPACITY) / COW_METADATA_STORAGE_CAPACITY = 5
 ```
@@ -88,12 +90,14 @@ COW_METADATA_STORAGE_CAPACITY = 320 * 4096
-For an read request, for every 4096byte block it will be checked if the block is already locally on the computer (therefore was already written before). If so it will be read from disk, otherwise it will be requested from the dnbd3 server. To increase performance, multiple following blocks that are also local/non local like the block before will be combined to to one larger reads from disc respectively one larger request from the server.
+When a read request is made, it is checked for each 4096-byte block whether the block already exists locally on the computer (i.e. has already been written once). If so, it is read from the hard disk, otherwise it is requested from the dnbd3 server. To increase performance, several subsequent blocks that are also local/non-local are combined into a larger read from the hard disk or request from the server.
 ![readrequest](img/readrequest.svg)
-The graph shown above is somewhat simplified for better visibility. The reads from the server happen async. So it will not be waiting for the server to respond, rather the it will move on with the next blocks. As soon as the respond from the server is finished, the data will be written in
-the fuse buffer. Each request to the dnbd3 server will increase the `workCounter` variable by one and every time a request is done it will be decreased by one. Once `workCounter` is 0 again, fuse_request will be returned.
+The diagram above is somewhat simplified for clarity. The server's read operations are asynchronous. This means that it does not wait for a response from the server, but continues with the next blocks. As soon as the server's response is complete, the data is written to the fuse buffer.
+Each request to the dnbd3 server increases the variable `workCounter` by one, and each time a request is completed, it is decreased by one. As soon as `workCounter` is 0 again, fuse_request is returned. This is done to ensure that all asyncronous requests are completed before the request is returned.
+
+
 Also on the local side, it has to break the loop once the end of an `cow_block_metadata_t` is reached, since the next data offset of the next `cow_block_metadata_t` is very likely not directly after it in the data file.
@@ -165,14 +169,14 @@ typedef struct cowfile_metadata_header
 char imageName[200]; // 200byte
 } cowfile_metadata_header_t;
 ```
-After this header at byte 8192 starts the l1 and then the l2 data structure mentioned above.
+After this header, the above-mentioned l1 and then the l2 data structure begins at byte 8192.
 ### data
-The `data` files contain the magicValue and at the 40 * 8 * 4096 Offset(capacity of one cowfile_metadata_header_t) starts the first block data.
+The `data` files contain the magicValue and at the 40 * 8 * 4096 offset (capacity of a cowfile_metadata_header_t) the first data block starts.
 ### magic values in the file headers
-The magic values in both files are used to ensure that a suitable file is read and that the machine has the correct endianness.
+The magic values in both files are used to ensure that an appropriate file is read and that the machine has the correct endianness.
 ```C
 //config.h
 #define COW_FILE_META_MAGIC_VALUE ((uint64_t)0xEBE44D6E72F7825E) // Magic Value to recognize a Cow meta file
@@ -188,8 +192,7 @@ tidStatUpdater
 ```
 ```tidCowUploader``` is the thread that uploads blocks to the cow server.
-```tidStatUpdater``` updates the stats in stdout or the stats files
-(depending on parameters).
+```tidStatUpdater``` updates the stats in stdout or the stats files (depending on parameters).
 ### Locks
@@ -218,17 +221,15 @@ The following configuration variables have been added to ```config.h```.
 #define COW_API_START_MERGE "%s/api/File/StartMerge"
 ```
-- ```COW_MIN_UPLOAD_DELAY``` defines the minimum time in seconds that must have elapsed since the last modification of a cow block before it is uploaded. This value can be fine tuned. A larger value usually reduces duplicate block uploads. While a lower value usually reduces the time for the final upload after the image got unmounted. If you define `COW_DUMP_BLOCK_UPLOADS` and have set the command line parameter `--cowStatFile`, then after the block upload is complete, a list of all blocks and sorted by the number of uploads will be dumped into status.txt. This can help adjusting `COW_MIN_UPLOAD_DELAY`.
-
-- ```COW_STATS_UPDATE_TIME``` defines the update frequency in seconds of the stdout print/ stats file update. Setting this too low could impact the performance since it hast to loop over all blocks.
-- ```COW_MAX_PARALLEL_UPLOADS``` defines to maximal number of parallel block uploads. These number is used once the image hast was dismounted and the final blocks are uploaded.
-- ```COW_MAX_PARALLEL_BACKGROUND_UPLOADS``` defines to maximal number of parallel block uploads. These number is used will the image is still mounted and the user is still using it.
+- ```COW_MIN_UPLOAD_DELAY``` sets the minimum time in seconds that must have elapsed since the last change to a cow block before it is uploaded. This value can be fine-tuned. A larger value usually reduces the double uploading of blocks. A smaller value reduces the time for the final upload after the image has been unmounted. If you set `COW_DUMP_BLOCK_UPLOADS` and set the command line parameter `--cowStatFile`, then a list of all blocks, sorted by the number of uploads, will be written to the status.txt file after the block upload is complete. This can help in fine-tuning `COW_MIN_UPLOAD_DELAY`.
+- ```COW_STATS_UPDATE_TIME``` defines the update frequency of the stdout output/statistics file in seconds. Setting it too low could affect performance as a loop runs over all blocks.
+- ```COW_MAX_PARALLEL_BACKGROUND_UPLOADS``` defines the maximum number of parallel block uploads. This number is used when the image is still mounted and the user is still using it.
+- ```COW_MAX_PARALLEL_UPLOADS``` defines the maximum number of parallel block uploads. This number is used once the image has been unmounted to upload the remaining modified blocks.
 # REST Api
-To transfer the data to the cow server, the following rest API is used:
-
+The following Rest API is used to transmit the data and commands to the cow server:
 ### /api/File/Create
@@ -239,7 +240,7 @@ To transfer the data to the cow server, the following rest API is used:
 | ---- | ----------- |
 | 200 | Success |
-This request is used once a new cow session is created. The returned guid is used in all later requests to identify the session.
+This request is used as soon as a new cow session is created. The returned guid is used in all subsequent requests to identify the session.
 ### /api/File/Update
@@ -258,7 +259,8 @@ This request is used once a new cow session is created. The returned guid is us
 | ---- | ----------- |
 | 200 | Success |
-Used for uploading a block of data. The blocknumber is the absolute block number. The body contains an "application/octet-stream" where the first bytes are the bit field directly followed by the actual blockdata.
+Used to upload a data block. The block number is the absolute block number. The body contains an "application/octet-stream", where the first bytes are the bit field, directly followed by the actual block data.
+
 ### /api/File/StartMerge
@@ -275,7 +277,7 @@ Used for uploading a block of data. The blocknumber is the absolute block number
 | Code | Description |
 | ---- | ----------- |
 | 200 | Success |
-Used to start the merging on the server.
+Used to start the merge on the server.
 ### /api/File/GetTopModifiedBlocks
@@ -293,7 +295,7 @@ Used to start the merging on the server.
 | ---- | ----------- |
 | 200 | Success |
-This request returns a list that contains Block Ids and the amount of times this block got uploaded, sorted by the amount of uploads. This is useful to adjust the `COW_MIN_UPLOAD_DELAY`.
+This request returns a list containing the block IDs and the number of uploads of this block, sorted by the number of uploads. This is useful to adjust the `COW_MIN_UPLOAD_DELAY`.
 ### /api/File/Status
@@ -310,7 +312,7 @@ This request returns a list that contains Block Ids and the amount of times this
 | ---- | ----------- |
 | 200 | Success |
-Returns the SessionStatus Model, which gives information about the session.
+Returns the SessionStatus model that provides information about the session.
 ### Models
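
The L1/L2 lookup that the readme text in this diff describes can be sketched in C roughly as follows. This is a minimal illustration, not code from the dnbd3 sources: the helper names (`cow_l1_index`, `cow_l2_index`, `cow_bit_index`, `cow_bit_is_set`) are hypothetical, the macro values are derived from the sizes stated in the readme (4096-byte dnbd3 blocks, 320 bits per bit field, 1024 entries per L2 array), and the bit order inside each bit field byte (least significant bit first) is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Sizes as stated in the readme; the definitions are written out here
 * as assumptions rather than copied from the dnbd3 headers. */
#define DNBD3_BLOCK_SIZE 4096ULL
#define COW_BITFIELD_BITS 320ULL /* 40 bytes * 8 bits */
#define COW_L2_ENTRIES 1024ULL
#define COW_METADATA_STORAGE_CAPACITY (COW_BITFIELD_BITS * DNBD3_BLOCK_SIZE)
#define COW_L2_STORAGE_CAPACITY (COW_L2_ENTRIES * COW_METADATA_STORAGE_CAPACITY)

/* Hypothetical helper: which L1 pointer (L2 array) covers this image offset. */
static inline uint64_t cow_l1_index(uint64_t offset)
{
    return offset / COW_L2_STORAGE_CAPACITY;
}

/* Hypothetical helper: zero-based index of the cow block inside that L2 array. */
static inline uint64_t cow_l2_index(uint64_t offset)
{
    return (offset % COW_L2_STORAGE_CAPACITY) / COW_METADATA_STORAGE_CAPACITY;
}

/* Hypothetical helper: which of the 320 bits describes the dnbd3 block
 * that contains this offset. */
static inline uint64_t cow_bit_index(uint64_t offset)
{
    return (offset % COW_METADATA_STORAGE_CAPACITY) / DNBD3_BLOCK_SIZE;
}

/* Hypothetical helper: test one bit of a 40-byte bit field.
 * Assumes bit 0 is the least significant bit of byte 0. */
static inline int cow_bit_is_set(const unsigned char bitfield[40], uint64_t bit)
{
    assert(bit < COW_BITFIELD_BITS);
    return (bitfield[bit / 8] >> (bit % 8)) & 1;
}
```

For the readme's example offset 4033085440, `cow_l1_index` yields 3 and `cow_l2_index` yields 5, matching the two formulas quoted in the diff. Note that 5 is a zero-based array index here.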