bwlp/qemu.git - Experimental fork of QEMU with video encoding patches

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	virtiofsd: Show submounts	Max Reitz	2020-05-01	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, setup_mounts() bind-mounts the shared directory without MS_REC. This makes all submounts disappear. Pass MS_REC so that the guest can see submounts again. Fixes: 5baa3b8e95064c2434bd9e2f312edd5e9ae275dc Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200424133516.73077-1-mreitz@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Changed Fixes to point to the commit with the problem rather than the commit that turned it on
*	virtiofsd: jail lo->proc_self_fd	Miklos Szeredi	2020-05-01	1	-2/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	While it's not possible to escape the proc filesystem through lo->proc_self_fd, it is possible to escape to the root of the proc filesystem itself through "../..". Use a temporary mount for opening lo->proc_self_fd, that has it's root at /proc/self/fd/, preventing access to the ancestor directories. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Message-Id: <20200429124733.22488-1-mszeredi@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: stay below fs.file-max sysctl value (CVE-2020-10717)	Stefan Hajnoczi	2020-05-01	1	-1/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The system-wide fs.file-max sysctl value determines how many files can be open. It defaults to a value calculated based on the machine's RAM size. Previously virtiofsd would try to set RLIMIT_NOFILE to 1,000,000 and this allowed the FUSE client to exhaust the number of open files system-wide on Linux hosts with less than 10 GB of RAM! Take fs.file-max into account when choosing the default RLIMIT_NOFILE value. Fixes: CVE-2020-10717 Reported-by: Yuval Avrahami <yavrahami@paloaltonetworks.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20200501140644.220940-3-stefanha@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: add --rlimit-nofile=NUM option	Stefan Hajnoczi	2020-05-01	3	-14/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make it possible to specify the RLIMIT_NOFILE on the command-line. Users running multiple virtiofsd processes should allocate a certain number to each process so that the system-wide limit can never be exhausted. When this option is set to 0 the rlimit is left at its current value. This is useful when a management tool wants to configure the rlimit itself. The default behavior remains unchanged: try to set the limit to 1,000,000 file descriptors if the current rlimit is lower. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20200501140644.220940-2-stefanha@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	tools/virtiofsd/passthrough_ll: Fix double close()	Philippe Mathieu-Daudé	2020-03-25	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On success, the fdopendir() call closes fd. Later on the error path we try to close an already-closed fd. This can lead to use-after-free. Fix by only closing the fd if the fdopendir() call failed. Cc: qemu-stable@nongnu.org Fixes: b39bce121b (add dirp_map to hide lo_dirp pointers) Reported-by: Coverity (CID 1421933 USE_AFTER_FREE) Suggested-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200321120654.7985-1-philmd@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: Fix xattr operations	Misono Tomohiro	2020-03-03	3	-47/+77
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Current virtiofsd has problems about xattr operations and they does not work properly for directory/symlink/special file. The fundamental cause is that virtiofsd uses openat() + f...xattr() systemcalls for xattr operation but we should not open symlink/special file in the daemon. Therefore the function is restricted. Fix this problem by: 1. during setup of each thread, call unshare(CLONE_FS) 2. in xattr operations (i.e. lo_getxattr), if inode is not a regular file or directory, use fchdir(proc_loot_fd) + ...xattr() + fchdir(root.fd) instead of openat() + f...xattr() (Note: for a regular file/directory openat() + f...xattr() is still used for performance reason) With this patch, xfstests generic/062 passes on virtiofs. This fix is suggested by Miklos Szeredi and Stefan Hajnoczi. The original discussion can be found here: https://www.redhat.com/archives/virtio-fs/2019-October/msg00046.html Signed-off-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com> Message-Id: <20200227055927.24566-3-misono.tomohiro@jp.fujitsu.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: passthrough_ll: cleanup getxattr/listxattr	Misono Tomohiro	2020-03-03	1	-32/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a cleanup patch to simplify the following xattr fix and there is no functional changes. - Move memory allocation to head of the function - Unify fgetxattr/flistxattr call for both size == 0 and size != 0 case - Remove redundant lo_inode_put call in error path (Note: second call is ignored now since @inode is already NULL) Signed-off-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com> Message-Id: <20200227055927.24566-2-misono.tomohiro@jp.fujitsu.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: Remove fuse.h and struct fuse_module	Xiao Yang	2020-02-21	2	-1245/+0
\| \| \| \| \| \| \| \| \| \|	All code in fuse.h and struct fuse_module are not used by virtiofsd so removing them is safe. Signed-off-by: Xiao Yang <yangx.jy@cn.fujitsu.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	tools/virtiofsd/fuse_lowlevel: Fix fuse_out_header::error value	Philippe Mathieu-Daudé	2020-02-21	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix warning reported by Clang static code analyzer: CC tools/virtiofsd/fuse_lowlevel.o tools/virtiofsd/fuse_lowlevel.c:195:9: warning: Value stored to 'error' is never read error = -ERANGE; ^ ~~~~~~~ Fixes: 3db2876 Reported-by: Clang Static Analyzer Reviewed-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	tools/virtiofsd/passthrough_ll: Remove unneeded variable assignment	Philippe Mathieu-Daudé	2020-02-21	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix warning reported by Clang static code analyzer: CC tools/virtiofsd/passthrough_ll.o tools/virtiofsd/passthrough_ll.c:925:9: warning: Value stored to 'newfd' is never read newfd = -1; ^ ~~ tools/virtiofsd/passthrough_ll.c:942:9: warning: Value stored to 'newfd' is never read newfd = -1; ^ ~~ Fixes: 7c6b66027 Reported-by: Clang Static Analyzer Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	tools/virtiofsd/passthrough_ll: Remove unneeded variable assignment	Philippe Mathieu-Daudé	2020-02-21	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix warning reported by Clang static code analyzer: CC tools/virtiofsd/passthrough_ll.o tools/virtiofsd/passthrough_ll.c:1083:5: warning: Value stored to 'saverr' is never read saverr = ENOMEM; ^ ~~~~~~ Fixes: 7c6b66027 Reported-by: Clang Static Analyzer Reviewed-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: Help message fix for 'seconds'	Dr. David Alan Gilbert	2020-02-21	1	-1/+1
\| \| \| \| \| \| \|	second should be seconds. Reported-by: Christophe de Dinechin <dinechin@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: do_read missing NULL check	Dr. David Alan Gilbert	2020-02-10	1	-0/+4
\| \| \| \| \| \| \| \| \|	Missing a NULL check if the argument fetch fails. Fixes: Coverity CID 1413119 Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
*	virtiofsd: load_capng missing unlock	Dr. David Alan Gilbert	2020-02-10	1	-0/+1
\| \| \| \| \| \| \| \| \|	Missing unlock in error path. Fixes: Covertiy CID 1413123 Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
*	virtiofsd: fv_create_listen_socket error path socket leak	Dr. David Alan Gilbert	2020-02-10	1	-0/+2
\| \| \| \| \| \| \| \| \| \|	If we fail when bringing up the socket we can leak the listen_fd; in practice the daemon will exit so it's not really a problem. Fixes: Coverity CID 1413121 Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
*	virtiofsd: Remove fuse_req_getgroups	Dr. David Alan Gilbert	2020-02-10	3	-118/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Remove fuse_req_getgroups that's unused in virtiofsd; it came in from libfuse but we don't actually use it. It was called from fuse_getgroups which we previously removed (but had left it's header in). Coverity had complained about null termination in it, but removing it is the easiest answer. Fixes: Coverity CID: 1413117 (String not null terminated) Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
*	virtiofsd: add some options to the help message	Masayoshi Mizuma	2020-01-23	1	-1/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add following options to the help message: - cache - flock\|no_flock - norace - posix_lock\|no_posix_lock - readdirplus\|no_readdirplus - timeout - writeback\|no_writeback - xattr\|no_xattr Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> dgilbert: Split cache, norace, posix_lock, readdirplus off into our own earlier patches that added the options Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: stop all queue threads on exit in virtio_loop()	Eryu Guan	2020-01-23	1	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On guest graceful shutdown, virtiofsd receives VHOST_USER_GET_VRING_BASE request from VMM and shuts down virtqueues by calling fv_set_started(), which joins fv_queue_thread() threads. So when virtio_loop() returns, there should be no thread is still accessing data in fuse session and/or virtio dev. But on abnormal exit, e.g. guest got killed for whatever reason, vhost-user socket is closed and virtio_loop() breaks out the main loop and returns to main(). But it's possible fv_queue_worker()s are still working and accessing fuse session and virtio dev, which results in crash or use-after-free. Fix it by stopping fv_queue_thread()s before virtio_loop() returns, to make sure there's no-one could access fuse session and virtio dev. Reported-by: Qingming Su <qingming.su@linux.alibaba.com> Signed-off-by: Eryu Guan <eguan@linux.alibaba.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd/passthrough_ll: Pass errno to fuse_reply_err()	Xiao Yang	2020-01-23	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	lo_copy_file_range() passes -errno to fuse_reply_err() and then fuse_reply_err() changes it to errno again, so that subsequent fuse_send_reply_iov_nofree() catches the wrong errno.(i.e. reports "fuse: bad error value: ..."). Make fuse_send_reply_iov_nofree() accept the correct -errno by passing errno directly in lo_copy_file_range(). Signed-off-by: Xiao Yang <yangx.jy@cn.fujitsu.com> Reviewed-by: Eryu Guan <eguan@linux.alibaba.com> dgilbert: Sent upstream and now Merged as aa1185e153f774f1df65 Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: Convert lo_destroy to take the lo->mutex lock itself	Dr. David Alan Gilbert	2020-01-23	1	-14/+17
\| \| \| \| \| \| \| \| \| \| \|	lo_destroy was relying on some implicit knowledge of the locking; we can avoid this if we create an unref_inode that doesn't take the lock and then grab it for the whole of the lo_destroy. Suggested-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: add --thread-pool-size=NUM option	Stefan Hajnoczi	2020-01-23	3	-3/+10
\| \| \| \| \| \| \| \| \| \|	Add an option to control the size of the thread pool. Requests are now processed in parallel by default. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: fix lo_destroy() resource leaks	Stefan Hajnoczi	2020-01-23	1	-21/+20
\| \| \| \| \| \| \| \| \| \|	Now that lo_destroy() is serialized we can call unref_inode() so that all inode resources are freed. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: prevent FUSE_INIT/FUSE_DESTROY races	Stefan Hajnoczi	2020-01-23	2	-0/+19
\| \| \| \| \| \| \| \| \| \| \|	When running with multiple threads it can be tricky to handle FUSE_INIT/FUSE_DESTROY in parallel with other request types or in parallel with themselves. Serialize FUSE_INIT and FUSE_DESTROY so that malicious clients cannot trigger race conditions. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: process requests in a thread pool	Stefan Hajnoczi	2020-01-23	1	-158/+201
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Introduce a thread pool so that fv_queue_thread() just pops VuVirtqElements and hands them to the thread pool. For the time being only one worker thread is allowed since passthrough_ll.c is not thread-safe yet. Future patches will lift this restriction so that multiple FUSE requests can be processed in parallel. The main new concept is struct FVRequest, which contains both VuVirtqElement and struct fuse_chan. We now have fv_VuDev for a device, fv_QueueInfo for a virtqueue, and FVRequest for a request. Some of fv_QueueInfo's fields are moved into FVRequest because they are per-request. The name FVRequest conforms to QEMU coding style and I expect the struct fv_* types will be renamed in a future refactoring. This patch series is not optimal. fbuf reuse is dropped so each request does malloc(se->bufsize), but there is no clean and cheap way to keep this with a thread pool. The vq_lock mutex is held for longer than necessary, especially during the eventfd_write() syscall. Performance can be improved in the future. prctl(2) had to be added to the seccomp whitelist because glib invokes it. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: use fuse_buf_writev to replace fuse_buf_write for better performance	piaojun	2020-01-23	1	-2/+18
\| \| \| \| \| \| \| \| \| \| \| \| \|	fuse_buf_writev() only handles the normal write in which src is buffer and dest is fd. Specially if src buffer represents guest physical address that can't be mapped by the daemon process, IO must be bounced back to the VMM to do it by fuse_buf_copy(). Signed-off-by: Jun Piao <piaojun@huawei.com> Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Suggested-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: add definition of fuse_buf_writev()	piaojun	2020-01-23	1	-0/+38
\| \| \| \| \| \| \| \| \| \| \| \|	Define fuse_buf_writev() which use pwritev and writev to improve io bandwidth. Especially, the src bufs with 0 size should be skipped as their mems are not block_size aligned which will cause writev failed in direct io mode. Signed-off-by: Jun Piao <piaojun@huawei.com> Suggested-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: passthrough_ll: Use cache_readdir for directory open	Misono Tomohiro	2020-01-23	1	-1/+1
\| \| \| \| \| \| \| \| \|	Since keep_cache(FOPEN_KEEP_CACHE) has no effect for directory as described in fuse_common.h, use cache_readdir(FOPNE_CACHE_DIR) for diretory open when cache=always mode. Signed-off-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: Fix data corruption with O_APPEND write in writeback mode	Misono Tomohiro	2020-01-23	1	-33/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When writeback mode is enabled (-o writeback), O_APPEND handling is done in kernel. Therefore virtiofsd clears O_APPEND flag when open. Otherwise O_APPEND flag takes precedence over pwrite() and write data may corrupt. Currently clearing O_APPEND flag is done in lo_open(), but we also need the same operation in lo_create(). So, factor out the flag update operation in lo_open() to update_open_flags() and call it in both lo_open() and lo_create(). This fixes the failure of xfstest generic/069 in writeback mode (which tests O_APPEND write data integrity). Signed-off-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: Reset O_DIRECT flag during file open	Vivek Goyal	2020-01-23	1	-0/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If an application wants to do direct IO and opens a file with O_DIRECT in guest, that does not necessarily mean that we need to bypass page cache on host as well. So reset this flag on host. If somebody needs to bypass page cache on host as well (and it is safe to do so), we can add a knob in daemon later to control this behavior. I check virtio-9p and they do reset O_DIRECT flag. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: convert more fprintf and perror to use fuse log infra	Eryu Guan	2020-01-23	2	-5/+11
\| \| \| \| \| \| \| \|	Signed-off-by: Eryu Guan <eguan@linux.alibaba.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: do not always set FUSE_FLOCK_LOCKS	Peng Tao	2020-01-23	1	-3/+8
\| \| \| \| \| \| \| \| \| \|	Right now we always enable it regardless of given commandlines. Fix it by setting the flag relying on the lo->flock bit. Signed-off-by: Peng Tao <tao.peng@linux.alibaba.com> Reviewed-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com> Reviewed-by: Sergio Lopez <slp@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: introduce inode refcount to prevent use-after-free	Stefan Hajnoczi	2020-01-23	1	-23/+146
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If thread A is using an inode it must not be deleted by thread B when processing a FUSE_FORGET request. The FUSE protocol itself already has a counter called nlookup that is used in FUSE_FORGET messages. We cannot trust this counter since the untrusted client can manipulate it via FUSE_FORGET messages. Introduce a new refcount to keep inodes alive for the required lifespan. lo_inode_put() must be called to release a reference. FUSE's nlookup counter holds exactly one reference so that the inode stays alive as long as the client still wants to remember it. Note that the lo_inode->is_symlink field is moved to avoid creating a hole in the struct due to struct field alignment. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com> Reviewed-by: Sergio Lopez <slp@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: passthrough_ll: fix refcounting on remove/rename	Miklos Szeredi	2020-01-23	1	-1/+49
\| \| \| \| \| \|	Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Reviewed-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: rename inode->refcount to inode->nlookup	Stefan Hajnoczi	2020-01-23	1	-12/+25
\| \| \| \| \| \| \| \| \| \|	This reference counter plays a specific role in the FUSE protocol. It's not a generic object reference counter and the FUSE kernel code calls it "nlookup". Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: prevent races with lo_dirp_put()	Stefan Hajnoczi	2020-01-23	1	-6/+35
\| \| \| \| \| \| \| \| \| \| \| \|	Introduce lo_dirp_put() so that FUSE_RELEASEDIR does not cause use-after-free races with other threads that are accessing lo_dirp. Also make lo_releasedir() atomic to prevent FUSE_RELEASEDIR racing with itself. This prevents double-frees. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: make lo_release() atomic	Stefan Hajnoczi	2020-01-23	1	-4/+8
\| \| \| \| \| \| \| \| \| \|	Hold the lock across both lo_map_get() and lo_map_remove() to prevent races between two FUSE_RELEASE requests. In this case I don't see a serious bug but it's safer to do things atomically. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: prevent fv_queue_thread() vs virtio_loop() races	Stefan Hajnoczi	2020-01-23	1	-1/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We call into libvhost-user from the virtqueue handler thread and the vhost-user message processing thread without a lock. There is nothing protecting the virtqueue handler thread if the vhost-user message processing thread changes the virtqueue or memory table while it is running. This patch introduces a read-write lock. Virtqueue handler threads are readers. The vhost-user message processing thread is a writer. This will allow concurrency for multiqueue in the future while protecting against fv_queue_thread() vs virtio_loop() races. Note that the critical sections could be made smaller but it would be more invasive and require libvhost-user changes. Let's start simple and improve performance later, if necessary. Another option would be an RCU-style approach with lighter-weight primitives. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: use fuse_lowlevel_is_virtio() in fuse_session_destroy()	Stefan Hajnoczi	2020-01-23	1	-3/+4
\| \| \| \| \| \| \| \| \|	vu_socket_path is NULL when --fd=FDNUM was used. Use fuse_lowlevel_is_virtio() instead. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: Support remote posix locks	Vivek Goyal	2020-01-23	2	-0/+192
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Doing posix locks with-in guest kernel are not sufficient if a file/dir is being shared by multiple guests. So we need the notion of daemon doing the locks which are visible to rest of the guests. Given posix locks are per process, one can not call posix lock API on host, otherwise bunch of basic posix locks properties are broken. For example, If two processes (A and B) in guest open the file and take locks on different sections of file, if one of the processes closes the fd, it will close fd on virtiofsd and all posix locks on file will go away. This means if process A closes the fd, then locks of process B will go away too. Similar other problems exist too. This patch set tries to emulate posix locks while using open file description locks provided on Linux. Daemon provides two options (-o posix_lock, -o no_posix_lock) to enable or disable posix locking in daemon. By default it is enabled. There are few issues though. - GETLK() returns pid of process holding lock. As we are emulating locks using OFD, and these locks are not per process and don't return pid of process, so GETLK() in guest does not reuturn process pid. - As of now only F_SETLK is supported and not F_SETLKW. We can't block the thread in virtiofsd for arbitrary long duration as there is only one thread serving the queue. That means unlock request will not make it to daemon and F_SETLKW will block infinitely and bring virtio-fs to a halt. This is a solvable problem though and will require significant changes in virtiofsd and kernel. Left as a TODO item for now. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	Virtiofsd: fix memory leak on fuse queueinfo	Liu Bo	2020-01-23	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \|	For fuse's queueinfo, both queueinfo array and queueinfos are allocated in fv_queue_set_started() but not cleaned up when the daemon process quits. This fixes the leak in proper places. Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Signed-off-by: Eric Ren <renzhen@linux.alibaba.com> Reviewed-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: fix incorrect error handling in lo_do_lookup	Eric Ren	2020-01-23	1	-1/+0
\| \| \| \| \| \|	Signed-off-by: Eric Ren <renzhen@linux.alibaba.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: enable PARALLEL_DIROPS during INIT	Liu Bo	2020-01-23	1	-0/+3
\| \| \| \| \| \| \| \|	lookup is a RO operations, PARALLEL_DIROPS can be enabled. Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: Prevent multiply running with same vhost_user_socket	Masayoshi Mizuma	2020-01-23	2	-1/+49
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	virtiofsd can run multiply even if the vhost_user_socket is same path. ]# ./virtiofsd -o vhost_user_socket=/tmp/vhostqemu -o source=/tmp/share & [1] 244965 virtio_session_mount: Waiting for vhost-user socket connection... ]# ./virtiofsd -o vhost_user_socket=/tmp/vhostqemu -o source=/tmp/share & [2] 244966 virtio_session_mount: Waiting for vhost-user socket connection... ]# The user will get confused about the situation and maybe the cause of the unexpected problem. So it's better to prevent the multiple running. Create a regular file under localstatedir directory to exclude the vhost_user_socket. To create and lock the file, use qemu_write_pidfile() because the API has some sanity checks and file lock. Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Applied fixes from Stefan's review and moved osdep include Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: add helper for lo_data cleanup	Liu Bo	2020-01-23	1	-16/+21
\| \| \| \| \| \| \| \|	This offers an helper function for lo_data's cleanup. Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: fix memory leak on lo.source	Liu Bo	2020-01-23	1	-3/+4
\| \| \| \| \| \| \| \| \|	valgrind reported that lo.source is leaked on quiting, but it was defined as (const char*) as it may point to a const string "/". Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Reviewed-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: cleanup allocated resource in se	Liu Bo	2020-01-23	3	-1/+15
\| \| \| \| \| \| \| \| \|	This cleans up unfreed resources in se on quiting, including se->virtio_dev, se->vu_socket_path, se->vu_socketfd. Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: fix error handling in main()	Liu Bo	2020-01-23	1	-2/+3
\| \| \| \| \| \| \| \| \|	Neither fuse_parse_cmdline() nor fuse_opt_parse() goes to the right place to do cleanup. Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: support nanosecond resolution for file timestamp	Jiufei Xue	2020-01-23	1	-0/+1
\| \| \| \| \| \| \| \| \| \|	Define HAVE_STRUCT_STAT_ST_ATIM to 1 if `st_atim' is member of `struct stat' which means support nanosecond resolution for the file timestamp fields. Signed-off-by: Jiufei Xue <jiufei.xue@linux.alibaba.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: Clean up inodes on destroy	Dr. David Alan Gilbert	2020-01-23	1	-0/+26
\| \| \| \| \| \| \| \| \|	Clear out our inodes and fd's on a 'destroy' - so we get rid of them if we reboot the guest. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
*	virtiofsd: passthrough_ll: use hashtable	Miklos Szeredi	2020-01-23	1	-36/+45
\| \| \| \| \| \| \| \| \| \|	Improve performance of inode lookup by using a hash table. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>