path: root/src/server/image.c
Commit message | Author | Age | Files | Lines
* [SERVER] Introduce ignoreAllocErrorsSimon Rettberg2020-02-241-2/+7
| | | | | If enabled, a failed fallocate will not abort image replication, but retry with sparse mode.
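A minimal sketch of the fallback described above. It assumes preallocation is done with posix_fallocate() and that "sparse mode" means setting the file length with ftruncate(); the helper name reserve_image_space and its parameters are hypothetical and not taken from the server's code.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not the server's actual function.
 * Tries to preallocate 'size' bytes for the image file. If that fails and
 * ignoreAllocErrors is set, falls back to a sparse file of the same length.
 * '*sparse' reports which mode ended up being used.
 */
static bool reserve_image_space(int fd, off_t size, bool ignoreAllocErrors, bool *sparse)
{
	assert(fd >= 0 && size > 0 && sparse != NULL);
	*sparse = false;
	/* posix_fallocate() returns 0 on success and an error number otherwise */
	if (posix_fallocate(fd, 0, size) == 0)
		return true;
	if (!ignoreAllocErrors)
		return false; /* without the option: abort replication of this image */
	/* Sparse fallback: set the length without reserving any blocks */
	*sparse = true;
	return ftruncate(fd, size) == 0;
}
```

A caller would check the return value and, if `*sparse` is set, treat the image like any other sparse-mode image from then on.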
* [SERVER] Lookup image on storage even in proxy modeSimon Rettberg2020-01-281-8/+11
| | | | | | | In proxy mode, when rid 0 is requested, we now first query our uplink servers for the latest revision and if this fails, like in non-proxy mode, we'll see what the latest version on disk is.
* [SERVER] Fix checking images without cache mapSimon Rettberg2019-10-291-7/+11
|
* [SERVER] Make buffer when reading for crc check largerSimon Rettberg2019-09-111-1/+1
|
* [SERVER] Make integrity checks on startup asyncSimon Rettberg2019-09-101-25/+24
|
* [SERVER] rpc: Add cachemap featureSimon Rettberg2019-09-061-0/+16
|
* [SERVER] Introduce autoFreeDiskSpaceDelaySimon Rettberg2019-09-051-6/+8
| | | | | | | | This setting allows you to control the formerly hard-coded timeout of 10 hours before a proxy would start deleting old images in order to free up space for new images. Setting it to -1 entirely disables automatic deletion, in case you have an external process for freeing up disk space.
* [SERVER] Support limiting alt-servers to specific namespaceSimon Rettberg2019-09-041-1/+1
| | | | | | Not really namespace but simple string matching for the image path. Path is matched from start with no support for glob or regex, so usually you want to have a trailing '/' to limit to certain directories.
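The matching rule can be illustrated with a plain prefix comparison. This is a sketch only; the function name path_in_namespace is hypothetical, and treating an empty prefix as "matches everything" is a property of this sketch, not something the commit states.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: returns true if imagePath starts with prefix.
 * No glob or regex support, just a byte-wise comparison from the start.
 */
static bool path_in_namespace(const char *imagePath, const char *prefix)
{
	assert(imagePath != NULL && prefix != NULL);
	return strncmp(imagePath, prefix, strlen(prefix)) == 0;
}
```

With the prefix "windows/" the path "windows/win10.vmdk" matches and "windows10/img" does not, whereas the prefix "windows" without the trailing slash would match both. That is why a trailing '/' is usually what you want.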
* [SERVER] Fix indentationSimon Rettberg2019-09-031-4/+4
|
* [SERVER] Fix image_updateCachemap()Simon Rettberg2019-09-031-4/+8
|
* [SERVER] No uplink_init when checking working state; improve loggingSimon Rettberg2019-08-301-8/+10
|
* [SERVER] Use weakref for cache mapsSimon Rettberg2019-08-291-76/+132
| | | | | | Gets rid of a bunch of locking, especially the hot path in net.c where clients are requesting data. Many clients using the same incomplete image previously created a bottleneck here.
* [SERVER] Reintroduce check whether readFd is actually != -1Simon Rettberg2019-08-281-1/+3
|
* [SERVER] Make signal handling more POSIXSimon Rettberg2019-08-281-8/+2
| | | | | | | | | | | | | | | According to POSIX, a signal sent to a PID can be delivered to an arbitrary thread of that process that does not have the signal blocked. This seems to never happen on Linux, but would mess things up since the code expected the main signal handler to only be executed by the main thread. This should now be fixed by examining the destination PID of the signal as well as the ID of the thread currently running the signal handler. If we notice the signal wasn't sent by our own PID and the handler is not currently run by the main thread, we re-send the signal to the main thread. Otherwise, if the signal was sent by our own PID but the handler is not run in the main thread, do nothing. This way we can use pthread_kill() to wake up threads that might be stuck in a blocking syscall when it's time to shut down.
* [SERVER] Remove old commentsSimon Rettberg2019-08-281-30/+0
|
* [SERVER] Handle closeUnusedFd via timerSimon Rettberg2019-08-281-17/+19
|
* [SERVER] Use reference counting for uplinkSimon Rettberg2019-08-271-22/+17
| | | | First step towards less locking for proxy mode
* [SERVER] Get rid of alt-servers thread, per-uplink rtt historySimon Rettberg2019-08-221-3/+3
| | | | | | | | | | Alt-Server checks are now run using the threadpool, so we don't need a queue and dedicated thread anymore. The rtt history is now kept per uplink, so many uplinks won't overwhelm the history, making its time window very short. Also the fail counter is now split up; a global one for when the server actually isn't reachable, a local (per-uplink) one for when the server is reachable but doesn't serve the requested image.
* [SERVER] Add struct representing active connection to uplink serverSimon Rettberg2019-08-181-1/+1
|
* [SERVER] Better lock debugging: Always check lock orderSimon Rettberg2019-08-071-5/+5
| | | | | | Lock order is predefined in locks.h. Immediately bail out if a lock with lower priority is obtained while the same thread already holds one with higher priority.
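One way to implement such a check is a per-thread bitmask of the priorities currently held. The sketch below is not the code from locks.h; the type ordered_lock_t, the 32-level priority range, and the assumption that a thread holds at most one lock per priority level are all made up for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct {
	pthread_mutex_t mutex;
	unsigned prio; /* 0..31, locks must be taken in ascending order */
} ordered_lock_t;

/* Bit n is set while the current thread holds a lock with priority n */
static _Thread_local uint32_t heldMask = 0;

/* True if taking a lock with 'prio' is allowed given the 'held' bitmask */
static bool order_ok(uint32_t held, unsigned prio)
{
	assert(prio < 32);
	uint32_t higher = (prio == 31) ? 0 : (~UINT32_C(0) << (prio + 1));
	return (held & higher) == 0;
}

static void ordered_lock(ordered_lock_t *l)
{
	if (!order_ok(heldMask, l->prio)) {
		fprintf(stderr, "lock order violation at priority %u\n", l->prio);
		abort(); /* bail out immediately, as the commit describes */
	}
	pthread_mutex_lock(&l->mutex);
	heldMask |= UINT32_C(1) << l->prio;
}

static void ordered_unlock(ordered_lock_t *l)
{
	heldMask &= ~(UINT32_C(1) << l->prio);
	pthread_mutex_unlock(&l->mutex);
}
```

Because the check runs on every acquisition, an ordering bug shows up the first time the offending code path is executed, not only when two threads happen to deadlock.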
* [SERVER] Make image->users atomic and get rid of some lockingSimon Rettberg2019-08-021-52/+39
| | | | | | | | With this change it should be safe to read the users count of an image without locking first, assuming you already have a reference on the image or are otherwise sure it cannot be freed, i.e. in an active uplink. Updating users, or checking whether it's 0 in order to free the image should only be done while holding the imageListLock.
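The rule above, sketched with C11 atomics. The struct layout and the helper names are hypothetical; only the field name users and the lock name imageListLock come from the commit message.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
	atomic_int users; /* number of references currently held */
} image_t;

static pthread_mutex_t imageListLock = PTHREAD_MUTEX_INITIALIZER;

/* Safe without locking, provided the caller already holds a reference */
static int image_peek_users(image_t *img)
{
	return atomic_load(&img->users);
}

/* Updates happen under imageListLock */
static void image_acquire(image_t *img)
{
	pthread_mutex_lock(&imageListLock);
	atomic_fetch_add(&img->users, 1);
	pthread_mutex_unlock(&imageListLock);
}

/* Returns true if this was the last reference and the image may be freed */
static bool image_release(image_t *img)
{
	pthread_mutex_lock(&imageListLock);
	int prev = atomic_fetch_sub(&img->users, 1);
	assert(prev > 0);
	bool last = (prev == 1);
	pthread_mutex_unlock(&imageListLock);
	return last;
}
```

The lock-free read only covers the counter itself; the decision to free is still made while holding the list lock, so no other thread can look up the image and take a new reference in between.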
* [SERVER] Turn all spinlocks into mutexesSimon Rettberg2019-07-261-97/+99
| | | | | | | | Just assume sane platforms offer smart mutexes that have a fast-path with spinlocks internally for locks that have little to no congestion. In all other cases, mutexes should perform better anyways.
* [SERVER] Export image idle time in json rpcSimon Rettberg2019-01-311-3/+6
| | | | Counter in seconds for how long this image hasn't been used.
* [SERVER] Use O_DIRECT for integrity checksSimon Rettberg2018-07-041-4/+12
| | | | | | | | | | | | | | The idea is that for full image checks, we don't want to pollute the fs cache with gigabytes of data that won't be needed again soon. This would certainly hurt performance on servers that don't have hundreds of GBs of RAM. For single block checks during replication this has the advantage that we don't check the block in memory before it has hit the disk once, but actually flush the data to disk, then remove it from the page cache, and only then read it again, from disk. TODO: Might be worth making this a config option
* [SERVER] Refactor uplink/cache handling, improve crc checkingSimon Rettberg2018-07-041-216/+73
| | | | | | | | | | | | | The cacheFd is now moved to the uplink data structure and will only be handled by the uplink thread. The integrity checker now supports checking all blocks of an image. This will be triggered automatically whenever a check for a single block failed. Also, if a crc check on startup fails, the image won't be discarded anymore, but rather a full check will be initiated. Furthermore, when calling image_updateCacheMap() on an image that was previously complete, the cache map will now be re-initialized, and a new uplink connection created.
* [SERVER] Try to re-open cacheFd if writing failsSimon Rettberg2018-06-251-1/+44
| | | | | | | In scenarios where the proxy is using an NFS server as storage (for whatever crazy reason) or when the cacheFd goes bad through e.g. a switchroot, try to re-open it instead of just disabling caching forever.
* [SERVER] Make sure image has read fd before readingSimon Rettberg2018-06-131-29/+52
|
* [SERVER] Don't spam log in vmdkLegacyMode for unknown imagesSimon Rettberg2018-05-021-3/+7
|
* [SERVER] Fix deadlock on shutdown (via image_tryFreeAll)Simon Rettberg2018-04-241-4/+8
| | | | | imageListLock was locked twice in the call stack, which is bad if you're using non-recursive locks.
* [SERVER] Mark spammy replication messages as DEBUG2 instead of 1Simon Rettberg2018-04-111-3/+3
|
* [SERVER] Error handling and logging when saving cache mapSimon Rettberg2018-04-101-24/+37
|
* [SERVER] Delete image files after releasing image to get rid of stale .map filesSimon Rettberg2018-03-191-7/+9
|
* [SERVER] image.c: Add size to RPC data, rename bytesReceived, always add uplink if existentSimon Rettberg2018-03-191-7/+11
|
* [SERVER] Increase read() block size when calculating CRC32Simon Rettberg2018-03-191-1/+1
|
* [SERVER] image_getCompletenessEstimate: Fix reversed logic in timeout checkSimon Rettberg2018-03-191-1/+3
|
* [SERVER] Fix int overflows on 32bit builds in CRC generationSimon Rettberg2018-03-161-6/+6
|
* [SERVER] Make sparse file mode actually workSimon Rettberg2018-03-161-5/+12
|
* [SERVER] Experimental support for sparse files in proxy modeSimon Rettberg2018-03-151-10/+34
| | | | | | | | | | | | Will not preallocate images in this mode. Old images are only deleted if the disk is full, determined by write() calls to the cache file yielding ENOSPC or EDQUOT. In such a case, the least recently used image(s) will be deleted to free up at least 256MiB, and then the write() call will be repeated. This *should* work somewhat reliably unless the cache partition is ridiculously small. Performance might suffer a little, and disk fragmentation might occur much faster than in prealloc mode. Testing is needed.
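The write path described above can be sketched as a retry wrapper. Assumptions: pwrite() stands in for the write() call, the callback type free_space_fn represents "delete least recently used images" and is hypothetical, and the sketch retries only once after freeing space so it cannot loop forever.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define MIN_FREE_BYTES ((size_t)256 * 1024 * 1024) /* 256MiB, as in the commit */

/* Hypothetical callback: delete LRU images until bytesWanted are free */
typedef bool (*free_space_fn)(size_t bytesWanted);

/* Placeholder for illustration: pretends there is nothing left to delete */
static bool free_nothing(size_t bytesWanted)
{
	(void)bytesWanted;
	return false;
}

static ssize_t cache_write(int fd, const void *buf, size_t len, off_t offset,
		free_space_fn freeSpace)
{
	assert(buf != NULL && freeSpace != NULL);
	bool retried = false;
	for (;;) {
		ssize_t ret = pwrite(fd, buf, len, offset);
		if (ret >= 0)
			return ret;
		if (errno == EINTR)
			continue;
		/* Only a full disk or exhausted quota triggers cleanup */
		if ((errno != ENOSPC && errno != EDQUOT) || retried)
			return -1;
		if (!freeSpace(MIN_FREE_BYTES))
			return -1;
		retried = true; /* space was freed: repeat the write once */
	}
}
```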
* [SERVER] Add multiple config options for limiting stuffSimon Rettberg2017-11-081-2/+5
| | | | | maxClients, maxImages, maxPayload, maxReplicationSize Refs #3231
* [SERVER] altservers: Tweak, cleanup, refactor, renameSimon Rettberg2017-11-081-1/+1
|
* [SERVER] Properly clamp to 4k borders in updateCachemap()Simon Rettberg2017-11-071-2/+9
| | | | Refs #3231
* [SERVER] Use multiConnect() to find uplink for replicationSimon Rettberg2017-11-071-12/+37
| | | | | Just as in the fuse client, this will speed things up if we have several alt-servers in our list which are not reachable.
* [SERVER] Support finer control over replication when a proxy connects to a proxySimon Rettberg2017-11-021-1/+1
| | | | | | | Introduce new flag in "select image" message to tell the uplink server whether we have background replication enabled or not. Also reject a connecting proxy if the connecting proxy uses BGR but we don't, as this would basically force the image to be replicated locally too.
* [*] Mark logadd() as printf-style function, fix errors that it revealedSimon Rettberg2017-10-311-1/+1
| | | | ...there were quite a few format string errors as it turns out :/
* [SERVER] Image list private to image.cSimon Rettberg2017-10-311-2/+2
|
* [SERVER] Only start reloading images if no other reload is in progressSimon Rettberg2017-10-251-4/+12
|
* [SERVER] Initialize PRNGSimon Rettberg2017-10-241-0/+1
|
* [SERVER] Get rid of zlib dependencySimon Rettberg2017-10-241-16/+16
| | | | | | We only used it for CRC-32, so now the source tree includes a stripped down version of the crc32 code from the zlib project.
* [SERVER] Fix types or add explicit casts everywhere we might have type conversion problemsSimon Rettberg2017-10-241-40/+45
|
* [SERVER] Use monotonic clock for measuring timeSimon Rettberg2017-10-191-19/+26
| | | | | Introduces new shared source unit timing.[ch] Closes #3214