| Commit message | Author | Age | Files | Lines |
|
|
|
|
| |
There was a logic bug that would favor a remotely looked up image rid,
even if we already found a higher revision locally.
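The corrected selection can be sketched as a plain maximum over both lookups. This is a minimal illustration, not the actual patch: `pick_rid` is a hypothetical name, and the 16-bit rid type and the use of 0 for "nothing found" are assumptions.

```c
#include <stdint.h>

/* Hypothetical helper: choose the revision id (rid) to serve.
 * Assumption: 0 means "nothing found" on that side. */
uint16_t pick_rid(uint16_t local_rid, uint16_t remote_rid)
{
    /* Take whichever revision is higher, so a remote result can
     * never override a newer revision that is already on disk. */
    return remote_rid > local_rid ? remote_rid : local_rid;
}
```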
|
|
|
|
|
| |
Still needs some cleanup and optimizations, variable naming sucks,
comments, etc.
|
|
|
|
| |
Entries in _images array might be NULL
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
| |
- Now uses linked lists instead of huge array
- Does prefetch data on client requests
- Can have multiple replication requests in-flight
|
| |
|
| |
|
| |
|
|
|
|
|
|
| |
If an image is incomplete, but has no upstream server that can be used
for replication, reload the cache map from disk periodically, in case
some other server instance is writing to the image.
|
|
|
|
|
|
|
|
|
| |
Cache maps will now be saved periodically, but only if either they have
a "dirty" bit set, which happens if any bits in the map get cleared
again (due to corruption), or if new data has been replicated from an
uplink server. New data counts once at least one byte was received and 5
minutes have passed, or once at least 500MB have been downloaded. The timer
currently runs every 20 seconds.
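The save condition described above could be expressed roughly as follows. The struct, its field names, and `should_save_cache_map` are hypothetical, and reading "500MB" as 500 * 1024 * 1024 bytes is an assumption; the function would be called from the 20-second timer.

```c
#include <stdbool.h>
#include <stdint.h>

#define SAVE_INTERVAL_SECONDS (5 * 60)
#define SAVE_BYTES_THRESHOLD  (500ull * 1024 * 1024) /* assumption: binary MB */

/* Hypothetical per-image bookkeeping, reset after each save. */
struct cache_map_state {
    bool     dirty;              /* bits in the map were cleared again */
    uint64_t bytes_since_save;   /* bytes replicated from the uplink */
    uint64_t seconds_since_save; /* time since the map was last written */
};

bool should_save_cache_map(const struct cache_map_state *s)
{
    if (s->dirty)
        return true;
    if (s->bytes_since_save >= SAVE_BYTES_THRESHOLD)
        return true;
    /* Any new data at all, but only after the interval has elapsed. */
    return s->bytes_since_save > 0
        && s->seconds_since_save >= SAVE_INTERVAL_SECONDS;
}
```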
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
Tracking the "working" state of images using one boolean is insufficient
regarding the different ways in which providing an image can fail.
Introduce separate flags for different conditions, like "file not
readable", "file not writable", "no uplink server available", "file
content has changed".
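One way to picture the separate conditions is a small set of bit flags. The names below are hypothetical, and the two policy helpers are only an assumed example of how such flags might be combined; the commit itself does not spell out that policy.

```c
#include <stdbool.h>

/* Hypothetical flag names, one per failure condition from the message. */
enum image_problem {
    PROBLEM_READ    = 1 << 0, /* file not readable */
    PROBLEM_WRITE   = 1 << 1, /* file not writable */
    PROBLEM_UPLINK  = 1 << 2, /* no uplink server available */
    PROBLEM_CHANGED = 1 << 3  /* file content has changed */
};

/* Assumed policy: cached data can be served while the file is
 * readable and its content is unchanged. */
bool can_serve_cached(unsigned problems)
{
    return (problems & (PROBLEM_READ | PROBLEM_CHANGED)) == 0;
}

/* Assumed policy: replication additionally needs a writable file
 * and a reachable uplink. */
bool can_replicate(unsigned problems)
{
    return can_serve_cached(problems)
        && (problems & (PROBLEM_WRITE | PROBLEM_UPLINK)) == 0;
}
```

With a single boolean, "no uplink" and "file not readable" would be indistinguishable, although only the latter prevents serving already cached blocks under this assumed policy.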
|
|
|
|
|
| |
If enabled, a failed fallocate will not abort image replication, but
retry with sparse mode.
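A minimal sketch of such a fallback, assuming preallocation is done with `posix_fallocate()` and sparse mode with `ftruncate()`. The function name and the `sparse_fallback` parameter (standing in for the config option) are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 0 on success, -1 if the file could not be sized. */
int preallocate_image(int fd, off_t size, bool sparse_fallback)
{
    /* posix_fallocate() returns an error number directly, not via errno. */
    if (posix_fallocate(fd, 0, size) == 0)
        return 0;
    if (!sparse_fallback)
        return -1; /* old behaviour: give up on replication */
    /* Sparse mode: only set the file length, blocks get allocated on write. */
    return ftruncate(fd, size) == 0 ? 0 : -1;
}
```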
|
|
|
|
|
|
|
| |
In proxy mode, when rid 0 is requested, we now first query
our uplink servers for the latest revision and if this fails,
like in non-proxy mode, we'll see what the latest version on
disk is.
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
This setting allows you to control the formerly hard-coded timeout of 10
hours before a proxy would start deleting old images in order to free up
space for new images. Setting it to -1 entirely disables automatic
deletion, in case you have an external process for freeing up disk
space.
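The semantics of the setting can be sketched as below. The function and parameter names are hypothetical, and expressing the value in seconds is an assumption; the message only states the old default of 10 hours and that -1 disables deletion.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the configured value is in seconds; 10 hours = 36000. */
bool may_delete_for_space(int64_t idle_seconds, int64_t min_idle_seconds)
{
    if (min_idle_seconds < 0)
        return false; /* -1: never delete automatically */
    return idle_seconds >= min_idle_seconds;
}
```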
|
|
|
|
|
|
| |
Not really namespace but simple string matching for the image path. Path
is matched from start with no support for glob or regex, so usually you
want to have a trailing '/' to limit to certain directories.
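The matching described here amounts to a prefix comparison. A minimal sketch, with a hypothetical function name and example paths:

```c
#include <stdbool.h>
#include <string.h>

/* True if image_path starts with prefix; no glob or regex handling. */
bool path_has_prefix(const char *image_path, const char *prefix)
{
    return strncmp(image_path, prefix, strlen(prefix)) == 0;
}
```

Without the trailing slash, a prefix of `windows` would also match `windows-old/img`, which is why limiting to a directory usually calls for `windows/`.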
|
| |
|
| |
|
| |
|
|
|
|
|
|
| |
Gets rid of a bunch of locking, especially the hot path in net.c where
clients are requesting data. Many clients using the same incomplete
image previously created a bottleneck here.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
According to POSIX, a signal sent to a PID can be delivered to an
arbitrary thread of that process that doesn't have the signal blocked. This
seems to never happen on Linux, but would mess things up since the code
expected the main signal handler to only be executed by the main thread.
This should now be fixed by examining the sending PID of the signal
as well as the ID of the thread currently running the signal handler. If
we notice the signal wasn't sent by our own PID and the handler is not
currently run by the main thread, we re-send the signal to the main
thread. Otherwise, if the signal was sent by our own PID but the handler
is not run in the main thread, do nothing. This way we can use
pthread_kill() to wake up threads that might be stuck in a blocking
syscall when it's time to shut down.
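The decision logic can be sketched as follows, assuming the handler is installed with `SA_SIGINFO` so that `si_pid` is available. All names are hypothetical, and `main_thread` is assumed to be set once via `pthread_self()` before any other thread starts.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

enum sig_decision { ACT_HANDLE, ACT_FORWARD, ACT_IGNORE };

/* Pure decision function mirroring the rules in the message. */
enum sig_decision classify_signal(pid_t sender, pid_t self, bool on_main_thread)
{
    if (on_main_thread)
        return ACT_HANDLE;  /* the main thread does the real work */
    if (sender != self)
        return ACT_FORWARD; /* external signal landed on the wrong thread */
    return ACT_IGNORE;      /* internal wake-up via pthread_kill() */
}

/* Assumption: assigned in main() before threads are created. */
pthread_t main_thread;

void on_signal(int signo, siginfo_t *info, void *ctx)
{
    (void)ctx;
    bool on_main = pthread_equal(pthread_self(), main_thread) != 0;
    switch (classify_signal(info->si_pid, getpid(), on_main)) {
    case ACT_FORWARD:
        pthread_kill(main_thread, signo); /* re-send to the main thread */
        break;
    case ACT_IGNORE:
        break; /* interrupting the blocking syscall was the whole point */
    case ACT_HANDLE:
        /* e.g. set a volatile sig_atomic_t shutdown flag here */
        break;
    }
}
```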
|
| |
|
| |
|
|
|
|
| |
First step towards less locking for proxy mode
|
|
|
|
|
|
|
|
|
|
| |
Alt-Server checks are now run using the threadpool, so we don't need a
queue and dedicated thread anymore. The rtt history is now kept per
uplink, so many uplinks won't overwhelm the history, making its time
window very short.
Also the fail counter is now split up; a global one for when the server
actually isn't reachable, a local (per-uplink) one for when the server
is reachable but doesn't serve the requested image.
|
| |
|
|
|
|
|
|
| |
Lock order is predefined in locks.h. Immediately bail out if a lock with
lower priority is obtained while the same thread already holds one with
higher priority.
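A rough sketch of such a check, using per-thread bookkeeping of which priorities are currently held. The names and the number of priority levels are hypothetical and do not reflect the contents of locks.h; where the real check bails out immediately, this version returns false so it can be exercised in isolation.

```c
#include <assert.h>
#include <stdbool.h>

#define LOCK_PRIO_COUNT 8 /* assumption: number of distinct priorities */

/* Per-thread count of held locks for each priority level. */
static _Thread_local unsigned held[LOCK_PRIO_COUNT];

/* Call before taking a lock. Returns false on an order violation. */
bool lock_order_acquire(unsigned prio)
{
    assert(prio < LOCK_PRIO_COUNT);
    for (unsigned p = prio + 1; p < LOCK_PRIO_COUNT; ++p) {
        if (held[p] > 0)
            return false; /* already holding a higher-priority lock */
    }
    held[prio]++;
    return true;
}

/* Call after releasing a lock. */
void lock_order_release(unsigned prio)
{
    assert(prio < LOCK_PRIO_COUNT && held[prio] > 0);
    held[prio]--;
}
```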
|
|
|
|
|
|
|
|
| |
With this change it should be safe to read the users count of an image
without locking first, assuming you already have a reference on the
image or are otherwise sure it cannot be freed, i.e. in an active
uplink. Updating users, or checking whether it's 0 in order to free the
image should only be done while holding the imageListLock.
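The access rules could look roughly like this with C11 atomics. The struct layout and function names are hypothetical, and `image_list_lock` merely stands in for the imageListLock mentioned above.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

struct image {
    atomic_int users; /* reference count */
};

pthread_mutex_t image_list_lock = PTHREAD_MUTEX_INITIALIZER;

/* Safe without the lock, provided the caller already holds a reference. */
int image_users_peek(struct image *img)
{
    return atomic_load(&img->users);
}

void image_acquire(struct image *img)
{
    pthread_mutex_lock(&image_list_lock);
    atomic_fetch_add(&img->users, 1);
    pthread_mutex_unlock(&image_list_lock);
}

/* Returns true if the count dropped to zero while the lock was held.
 * A real implementation would also unlink the image from the list
 * before unlocking, so that nobody can acquire it again. */
bool image_release(struct image *img)
{
    pthread_mutex_lock(&image_list_lock);
    bool last = atomic_fetch_sub(&img->users, 1) == 1;
    pthread_mutex_unlock(&image_list_lock);
    return last;
}
```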
|
|
|
|
|
|
|
|
| |
Just assume sane platforms offer smart mutexes
that have a fast-path with spinlocks internally
for locks that have little to no congestion.
In all other cases, mutexes should perform better
anyway.
|
|
|
|
| |
Counter in seconds for how long this image hasn't been used.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The idea is that for full image checks, we don't want to
pollute the fs cache with gigabytes of data that won't be
needed again soon. This would certainly hurt performance
on servers that don't have hundreds of GBs of RAM.
For single block checks during replication this has the
advantage that we don't check the block in memory before
it has hit the disk, but actually flush the data to disk,
then remove it from the page cache, and only then read it
again, from disk.
TODO: Might be worth making this a config option
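The flush-then-drop sequence can be sketched as follows. Using `fdatasync()` and `posix_fadvise()` with `POSIX_FADV_DONTNEED` is an assumption about how this is done, and the function name is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Write dirty pages to disk, then ask the kernel to drop the given
 * range from the page cache. Returns 0 on success, -1 on failure. */
int flush_and_drop(int fd, off_t offset, off_t length)
{
    if (fdatasync(fd) != 0)
        return -1;
    /* posix_fadvise() returns an error number directly, not via errno. */
    return posix_fadvise(fd, offset, length, POSIX_FADV_DONTNEED) == 0 ? 0 : -1;
}
```

A subsequent read of that range for the CRC check then has a good chance of coming from disk, although `POSIX_FADV_DONTNEED` is only a hint to the kernel.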
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The cacheFd is now moved to the uplink data structure and will
only be handled by the uplink thread.
The integrity checker now supports checking all blocks of an
image. This will be triggered automatically whenever a check for
a single block failed.
Also, if a crc check on startup fails, the image won't be discarded
anymore, but rather a full check will be initiated.
Furthermore, when calling image_updateCacheMap() on an image that
was previously complete, the cache map will now be re-initialized,
and a new uplink connection created.
|
|
|
|
|
|
|
| |
In scenarios where the proxy is using an NFS server as
storage (for whatever crazy reason) or when the cacheFd
goes bad through e.g. a switchroot, try to re-open it
instead of just disabling caching forever.
|
| |
|
| |
|
|
|
|
|
| |
imageListLock was locked twice in the call stack, which
is bad if you're using non-recursive locks.
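This class of bug can be made visible with an error-checking mutex, which reports the second lock attempt instead of hanging. The sketch below is a standalone demonstration with a hypothetical function name, not code from the fix itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>

/* Locks the same mutex twice from one thread and returns the result
 * of the second attempt. */
int relock_same_thread(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t lock;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_init(&lock, &attr);
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&lock);
    /* With PTHREAD_MUTEX_ERRORCHECK this fails with EDEADLK; a default
     * non-recursive mutex may simply deadlock here. */
    int second = pthread_mutex_lock(&lock);

    pthread_mutex_unlock(&lock);
    pthread_mutex_destroy(&lock);
    return second;
}
```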
|
| |
|