| | Commit message | Author | Age | Files | Lines |
---|---|---|---|---|---|
* | [FUSE] In panic mode, use a pending range for alt check | Simon Rettberg | 2018-07-05 | 1 | -5/+43 |
| | If we lose the connection and then check all known alt servers, see whether a request is pending in the queue and, if so, use its offset and length for the alt server probe. This ensures that the server being tested is able to satisfy at least the next request we'll send. | | | | |
* | [SERVER] Don't keep bg replication blocks in fs cache | Simon Rettberg | 2018-07-05 | 1 | -1/+5 |
| | To further improve cache handling, don't keep blocks in the fs cache that were requested via background replication. These are unlikely to be needed in the near future. | | | | |
* | [SERVER] Always use fsync instead of fdatasync | Simon Rettberg | 2018-07-05 | 2 | -9/+2 |
| | Now that we support sparse files, using just fdatasync isn't safe anymore. Instead of handling both cases differently, just drop fdatasync; the difference has probably been marginal all along anyway. | | | | |
* | [SERVER] Use O_DIRECT for integrity checks | Simon Rettberg | 2018-07-04 | 2 | -30/+67 |
| | The idea is that for full image checks, we don't want to pollute the fs cache with gigabytes of data that won't be needed again soon. This would certainly hurt performance on servers that don't have hundreds of GBs of RAM. For single block checks during replication, this has the advantage that we don't check the block in memory before it has hit the disk, but actually flush the data to disk, then remove it from the page cache, and only then read it back from disk. TODO: Might be worth making this a config option. | | | | |
* | [SERVER] Refactor uplink/cache handling, improve crc checking | Simon Rettberg | 2018-07-04 | 9 | -290/+351 |
| | The cacheFd is now moved to the uplink data structure and will only be handled by the uplink thread. The integrity checker now supports checking all blocks of an image. This will be triggered automatically whenever a check for a single block fails. Also, if a crc check on startup fails, the image won't be discarded anymore, but rather a full check will be initiated. Furthermore, when calling image_updateCacheMap() on an image that was previously complete, the cache map will now be re-initialized, and a new uplink connection created. | | | | |
* | [SERVER] Use likely/unlikely in uplink disk writing loop | Simon Rettberg | 2018-06-25 | 2 | -4/+12 |
* | [SERVER] Try to re-open cacheFd if writing fails | Simon Rettberg | 2018-06-25 | 3 | -7/+58 |
| | In scenarios where the proxy is using an NFS server as storage (for whatever crazy reason) or when the cacheFd goes bad through e.g. a switchroot, try to re-open it instead of just disabling caching forever. | | | | |
* | [SERVER] Make sure image has read fd before reading | Simon Rettberg | 2018-06-13 | 3 | -29/+60 |
* | [FUSE] Return 0 instead of EIO if trying to read past end | Simon Rettberg | 2018-06-13 | 1 | -1/+1 |
| | read() calls are supposed to return 0 when reading at EOF, so properly mimic that behavior. | | | | |
* | [FUSE] Move variables into block where they're being used | Simon Rettberg | 2018-06-13 | 1 | -4/+4 |
* | [SERVER] Print info about signal sender | Simon Rettberg | 2018-05-03 | 1 | -5/+46 |
* | [SERVER] Don't spam log in vmdkLegacyMode for unknown images | Simon Rettberg | 2018-05-02 | 1 | -3/+7 |
* | [SERVER] Proper exit code and message when shutting down due to error or signal | Simon Rettberg | 2018-04-27 | 1 | -2/+6 |
* | [SERVER] Fix deadlock on shutdown (via image_tryFreeAll) | Simon Rettberg | 2018-04-24 | 1 | -4/+8 |
| | imageListLock was locked twice in the call stack, which is bad if you're using non-recursive locks. | | | | |
* | [SERVER] Acquire write lock before initializing array | Simon Rettberg | 2018-04-16 | 1 | -1/+5 |
* | [SERVER] Add bgrMinClients: Threshold to control when BGR starts | Simon Rettberg | 2018-04-12 | 3 | -5/+16 |
| | Background replication will not kick in if there aren't at least that many clients connected. | | | | |
* | [SERVER] Mark spammy replication messages as DEBUG2 instead of 1 | Simon Rettberg | 2018-04-11 | 1 | -3/+3 |
* | [SERVER] Option to disable timestamps on stdout/console (default: disabled) | Simon Rettberg | 2018-04-11 | 4 | -9/+31 |
* | [SERVER] More error handling and logging when caching received data to disk | Simon Rettberg | 2018-04-10 | 1 | -4/+13 |
* | [SERVER] Ignore SIGPIPE | Simon Rettberg | 2018-04-10 | 1 | -0/+1 |
* | [SERVER] Error handling and logging when saving cache map | Simon Rettberg | 2018-04-10 | 1 | -24/+37 |
* | [SHARED] Reset errno | Simon Rettberg | 2018-04-05 | 1 | -0/+2 |
* | [KERNEL] Pre/post 4.11 handling of request ops | Jonathan Bauer | 2018-04-05 | 2 | -11/+30 |
* | [KERNEL] #ifs and #defines for timer pre/post 4.15 | Simon Rettberg | 2018-04-05 | 2 | -13/+18 |
* | [KERNEL] Macros for packing CMD_* into struct request | Simon Rettberg | 2018-03-27 | 1 | -4/+21 |
| | Version check for pre or post 4.11 | | | | |
* | Follow https://lwn.net/Articles/735887/ | Rafael Gieschke | 2018-03-24 | 2 | -4/+16 |
* | Include `linux/signal.h` for `siginitsetinv`, `sigmask`, `sigprocmask` | Rafael Gieschke | 2018-03-24 | 1 | -0/+1 |
* | Follow https://github.com/torvalds/linux/commit/aebf526b53aea164508730427597d45f3e06b376 | Rafael Gieschke | 2018-03-23 | 2 | -11/+11 |
* | [SERVER] Delete image files after releasing image to get rid of stale .map files | Simon Rettberg | 2018-03-19 | 1 | -7/+9 |
* | [SERVER] image.c: Add size to RPC data, rename bytesReceived, always add uplink if existent | Simon Rettberg | 2018-03-19 | 1 | -7/+11 |
* | [SERVER] Increase read() block size when calculating CRC32 | Simon Rettberg | 2018-03-19 | 1 | -1/+1 |
* | [SERVER] image_getCompletenessEstimate: Fix reversed logic in timeout check | Simon Rettberg | 2018-03-19 | 1 | -1/+3 |
* | [SERVER] Fix int overflows on 32bit builds in CRC generation | Simon Rettberg | 2018-03-16 | 2 | -7/+7 |
* | [SERVER] Make sparse file mode actually work | Simon Rettberg | 2018-03-16 | 3 | -9/+24 |
* | [SERVER] Experimental support for sparse files in proxy mode | Simon Rettberg | 2018-03-15 | 5 | -11/+67 |
| | Images are not preallocated in this mode. Old images are only deleted if the disk is full, determined by write() calls to the cache file yielding ENOSPC or EDQUOT. In such a case, the least recently used image(s) will be deleted to free up at least 256MiB, and then the write() call will be repeated. This *should* work somewhat reliably unless the cache partition is ridiculously small. Performance might suffer a little, and disk fragmentation might occur much faster than in prealloc mode. Testing is needed. | | | | |
* | [SERVER] Make TSAN happy | Simon Rettberg | 2017-12-19 | 1 | -1/+3 |
* | [SERVER] jansson < 2.6 compat | Simon Rettberg | 2017-11-10 | 1 | -0/+5 |
* | [SERVER] Check RLIMIT_NOFILE on startup and try to increase if required | Simon Rettberg | 2017-11-08 | 1 | -0/+39 |
* | [SERVER] altservers: Short timeout during RTT measurement, round request range | Simon Rettberg | 2017-11-08 | 2 | -5/+10 |
| | Rounding to 4k so caching works efficiently. This should now close #3231. | | | | |
* | [SERVER] rpc: Add q=logfile, q=altservers and q=config to /query | Simon Rettberg | 2017-11-08 | 5 | -8/+77 |
* | [SERVER] Add multiple config options for limiting stuff | Simon Rettberg | 2017-11-08 | 6 | -20/+183 |
| | maxClients, maxImages, maxPayload, maxReplicationSize. Refs #3231 | | | | |
* | [SERVER] altservers: Tweak, cleanup, refactor, rename | Simon Rettberg | 2017-11-08 | 5 | -27/+40 |
* | [SERVER] Properly clamp to 4k borders in updateCachemap() | Simon Rettberg | 2017-11-07 | 1 | -2/+9 |
| | Refs #3231 | | | | |
* | [SERVER] Use multiConnect() to find uplink for replication | Simon Rettberg | 2017-11-07 | 2 | -13/+38 |
| | Just as in the fuse client, this will speed things up if we have several alt-servers in our list that are not reachable. | | | | |
* | [FUSE] Split final multiConnect-wait across multiple calls | Simon Rettberg | 2017-11-07 | 1 | -8/+8 |
| | There might be more than one pending connect, but each call to multiConnect() can return at most one fd, so we could be ignoring some successful connections. | | | | |
* | [SHARED] Add log_hasMask() to check if a certain loglevel is set | Simon Rettberg | 2017-11-07 | 2 | -0/+11 |
* | [FUSE] Reset salen before getpeername() call | Simon Rettberg | 2017-11-07 | 1 | -1/+2 |
* | [FUSE] Make use of sock_multiConnect() for initial connection | Simon Rettberg | 2017-11-06 | 1 | -10/+36 |
| | This speeds up initialization with a long list of servers where the first servers in the list don't work, as the delay between servers is now lowered to 100ms. | | | | |
* | [SHARED] Add sockaddr2dnbd3 func, add multiConnect func, EINTR handling | Simon Rettberg | 2017-11-06 | 2 | -30/+141 |
| | EINTR was apparently not handled properly on non-Linux systems for the connect() syscall. sockaddr2dnbd3 is what resolveToDnbd3Host already did internally; now it's its own function. sock_multiConnect() is a wrapper around connect() and poll(), making it easy to connect to multiple hosts in a cascaded manner, with a slight delay between connect calls. | | | | |
* | [FUSE] Remember up to 16 alt servers, but work only with 5 | Simon Rettberg | 2017-11-04 | 1 | -8/+59 |
| | 5 servers are considered "active", that is, they are measured for their RTT regularly. If we have more than 5 servers and one of the active ones is repeatedly unreachable, it will swap positions with one of the inactive servers. | | | | |