This lock-while-mounted behaviour is similar to what QEMU does, and can
help avoid duplicate mounts.
Allowing for an explicit lock path that differs from the filesystem
image / block device path was intentional, to ensure non-flock
supporting filesystems can still be used. Also, there are cases where
a block device and partition (e.g. sda and sda1) can both provide access
to the same filesystem image, in which case an FS-ID based lock would
make sense.
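
A minimal sketch of the lock-while-mounted idea, assuming a plain
flock() on a separate lock file. The helper name lock_path_exclusive()
and its lock_path parameter are illustrative only and are not taken
from lklfuse:

```c
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Take an exclusive, non-blocking flock() on lock_path (hypothetical
 * parameter; it may differ from the filesystem image path).
 * Returns an fd that holds the lock, or -1 if the file cannot be
 * opened or the lock is already held through another open file.
 */
static int lock_path_exclusive(const char *lock_path)
{
	int fd = open(lock_path, O_RDWR | O_CREAT, 0600);

	if (fd < 0)
		return -1;
	if (flock(fd, LOCK_EX | LOCK_NB) < 0) {
		close(fd);
		return -1;
	}
	/* Keep the fd open while mounted; close() releases the lock. */
	return fd;
}
```

A second call for the same path fails until the first fd is closed,
which is what would prevent a duplicate mount.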
Signed-off-by: David Disseldorp <ddiss@suse.de>
The copy_file_range() syscall allows filesystems to optimize a copy
workload by using reflinks or server-side offload. It can be triggered
via e.g.
xfs_io -f -c "copy_range copysrc" copydest
Without this hook, (kernel) fuse currently falls back to manual
read/write via splice_copy_file_range().
Signed-off-by: David Disseldorp <ddiss@suse.de>
This matches the behaviour of the Linux mount helper, e.g.
mount("/dev/loop0p1", "/mnt", "iso9660", 0, NULL) = -1 EACCES
mount("/dev/loop0p1", "/mnt", "iso9660", MS_RDONLY, NULL) = 0
WARNING: source write-protected, mounted read-only.
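
The retry shown in the trace above can be sketched as follows. This is
an illustration under stated assumptions: DEMO_MS_RDONLY, demo_mount_fn
and demo_write_protected_mount() are hypothetical stand-ins, not LKL or
lklfuse symbols:

```c
#include <errno.h>

/* Hypothetical stand-in for the read-only mount flag. */
#define DEMO_MS_RDONLY 1UL

/* Hypothetical mount callback: returns 0 or a negative errno. */
typedef long (*demo_mount_fn)(const char *dev, unsigned long flags);

/*
 * Try the mount with the requested flags. If that fails with EACCES
 * and the request was not already read-only, retry read-only.
 * *used_flags reports the flags of the attempt whose result is returned.
 */
static long mount_with_ro_fallback(demo_mount_fn do_mount, const char *dev,
				   unsigned long flags,
				   unsigned long *used_flags)
{
	long ret = do_mount(dev, flags);

	*used_flags = flags;
	if (ret == -EACCES && !(flags & DEMO_MS_RDONLY)) {
		*used_flags = flags | DEMO_MS_RDONLY;
		ret = do_mount(dev, *used_flags);
	}
	return ret;
}

/* Stand-in for a write-protected device: only read-only mounts succeed. */
static long demo_write_protected_mount(const char *dev, unsigned long flags)
{
	(void)dev;
	return (flags & DEMO_MS_RDONLY) ? 0 : -EACCES;
}
```

A caller can compare *used_flags with the requested flags to decide
whether to print the "mounted read-only" warning.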
Signed-off-by: David Disseldorp <ddiss@suse.de>
On 32-bit architectures, struct lkl_timespec is 2*sizeof(long) while
__lkl__kernel_timespec is 2*sizeof(long long); casting these pointer
types is unsafe.
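
A sketch of the safe alternative to the pointer cast: convert field by
field. The two struct definitions below are hypothetical stand-ins that
mirror the layouts described above; they are not the LKL definitions:

```c
/* Hypothetical stand-ins for the two layouts described above. */
struct demo_timespec {
	long tv_sec;
	long tv_nsec;
};

struct demo_kernel_timespec {
	long long tv_sec;
	long long tv_nsec;
};

/* Convert by value; never cast one struct pointer to the other. */
static struct demo_kernel_timespec
to_kernel_timespec(const struct demo_timespec *ts)
{
	struct demo_kernel_timespec kts = {
		.tv_sec = ts->tv_sec,
		.tv_nsec = ts->tv_nsec,
	};

	return kts;
}
```

Each field is widened individually, so the result is correct whether
long is 32 or 64 bits wide.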
Fixes: 3d4047ac9a ("lkl: follow up fixes after v5.1 merge (y2038)")
Signed-off-by: David Disseldorp <ddiss@suse.de>
Proper open flag semantics are important for some applications and tests
such as xfstests generic/130 (for O_TRUNC). Pass all open flags from
fuse through to the corresponding LKL open syscall, with the exception
of O_CREAT, O_EXCL and O_NOCTTY, which fuse handles internally. Other
flags may also be filtered out by fuse.
Signed-off-by: David Disseldorp <ddiss@suse.de>
Generic VFS mount options are normally parsed by the mount binary and
mapped to MS_X mount flags. lklfuse should do the same for kernel mount
options.
This implementation includes a simple string -> flag map, with
corresponding inverted noX flags.
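
A minimal sketch of such a string -> flag map. The DEMO_MS_* values,
the table contents and demo_apply_mount_opt() are assumptions made for
illustration; a real implementation would use the kernel's mount flag
constants and may cover a different option set:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical flag values standing in for the real MS_X constants. */
#define DEMO_MS_RDONLY (1UL << 0)
#define DEMO_MS_NOSUID (1UL << 1)
#define DEMO_MS_NODEV  (1UL << 2)
#define DEMO_MS_NOEXEC (1UL << 3)

struct demo_opt_map {
	const char *name;    /* option string that sets the flag */
	const char *inverse; /* option string that clears the flag */
	unsigned long flag;
};

static const struct demo_opt_map demo_opts[] = {
	{ "ro",     "rw",   DEMO_MS_RDONLY },
	{ "nosuid", "suid", DEMO_MS_NOSUID },
	{ "nodev",  "dev",  DEMO_MS_NODEV },
	{ "noexec", "exec", DEMO_MS_NOEXEC },
};

/*
 * Apply one option string to *flags.
 * Returns 0 if the option was a generic one, -1 otherwise (the caller
 * would then pass it on as filesystem-specific mount data).
 */
static int demo_apply_mount_opt(const char *opt, unsigned long *flags)
{
	size_t i;

	for (i = 0; i < sizeof(demo_opts) / sizeof(demo_opts[0]); i++) {
		if (!strcmp(opt, demo_opts[i].name)) {
			*flags |= demo_opts[i].flag;
			return 0;
		}
		if (!strcmp(opt, demo_opts[i].inverse)) {
			*flags &= ~demo_opts[i].flag;
			return 0;
		}
	}
	return -1;
}
```

Options are applied in order, so a later "exec" undoes an earlier
"noexec".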
Signed-off-by: David Disseldorp <ddiss@suse.de>
lkl provides valid st_ino values via the getattr and readdir hooks, so
it makes sense to have fuse use them.
Signed-off-by: David Disseldorp <ddiss@suse.de>
As found by xfstests generic/002, fuse attr caching appears to be broken
in that it doesn't invalidate/update st_nlink following a link or
unlink, instead returning a stale value.
Regardless, there's no need to enable fuse caching, given that lkl
provides its own regular dcache.
Signed-off-by: David Disseldorp <ddiss@suse.de>
Update to the new libfuse3 API, which has a number of attractive
features for lkl:
- an lseek() hook with SEEK_DATA / SEEK_HOLE
- copy_file_range()
- RENAME_NOREPLACE and RENAME_EXCHANGE.
Aside from passing through the rename flags, the remaining features
still need to be added to lklfuse. This commit attempts to be a direct
switch over to the new API.
The last upstream release of libfuse2 was in 2019 (2.9.9), so it makes
sense to upgrade regardless of the new functionality.
Signed-off-by: David Disseldorp <ddiss@suse.de>
The utimens array layout is { atime, mtime } but in copying the values
from timespec to lkl_timespec, fuse source array offset 0 (atime) is
used for both lkl_timespec atime and mtime values.
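
A sketch of the corrected copy, where each destination slot takes its
value from the matching source index. struct demo_lkl_timespec and
copy_utimens() are hypothetical names used only for this illustration:

```c
#include <time.h>

/* Hypothetical stand-in for the LKL-side timespec type. */
struct demo_lkl_timespec {
	long long tv_sec;
	long long tv_nsec;
};

/* tv[0] is atime and tv[1] is mtime; keep the indices paired. */
static void copy_utimens(const struct timespec tv[2],
			 struct demo_lkl_timespec out[2])
{
	int i;

	for (i = 0; i < 2; i++) {
		out[i].tv_sec = tv[i].tv_sec;
		out[i].tv_nsec = tv[i].tv_nsec;
	}
}
```

Looping over the index makes it harder to reuse offset 0 for both
entries by accident.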
Signed-off-by: David Disseldorp <ddiss@suse.de>
Add new APIs for LKL initialization and cleanup in preparation for
KASAN support:
* lkl_init - should be called before any other LKL API
* lkl_cleanup - should be called after lkl_sys_halt and once called
no other LKL APIs can be issued until lkl_init is called again
This change is not API backwards compatible. All LKL applications need
to be changed to call lkl_init() and update the lkl_start_kernel()
call to match the new function prototype.
Signed-off-by: Octavian Purdila <tavip@google.com>
Resolving symlinks on `chown()` calls will have already been handled when
the call reaches the FUSE layer. Thus, `chown()` actually has to be
treated as `lchown()` internally.
Test (before patch):
```
truncate --size 10M test.raw
mkfs.ext4 test.raw
mkdir mnt
lklfuse -d -o type=ext4 test.raw mnt
cd mnt
ln -s missing-dest test
chown 100:100 test
# chown: cannot dereference 'test': No such file or directory
# readlink /test 4097
# unique: 247, success, outsize: 28
# unique: 248, opcode: LOOKUP (1), nodeid: 1, insize: 53, pid: 2201
# LOOKUP /missing-dest
# getattr /missing-dest
# unique: 248, error: -2 (No such file or directory), outsize: 16
chown 100:100 test -h
# chown: changing ownership of 'test': No such file or directory
# chown /test 100 100
# unique: 253, error: -2 (No such file or directory), outsize: 16
```
Signed-off-by: Rafael Gieschke <rafael.gieschke@rz.uni-freiburg.de>
This patch changes the API of lkl_start_kernel(). The memory size can
now be specified via this env variable with a "mem=100M" string.
This patch also introduces two API incompatibilities:
1. lkl_start_kernel() no longer takes a memsize argument
2. LKL_HIJACK_NET_IP=dhcp is now obsolete - use ip=dhcp in
LKL_HIJACK_BOOT_CMDLINE instead.
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
This adds support for mounting and unmounting disk partitions.
NOTE: this patch breaks the user APIs.
Signed-off-by: Octavian Purdila <tavi@cs.pub.ro>
We can't properly support it because multithreading LKL leaks threads
and the number of concurrent threads is limited to the maximum number
of interrupts (64 on 64bit systems, 32 otherwise).
Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Some system headers use unprotected global defines such as
sa_handler which collides with the names of LKL structure fields.
Define them in uapi/asm/syscalls.h so that they are replaced by the
headers install scripts.
Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
lklfuse_utimens doesn't call utimensat with the AT_SYMLINK_NOFOLLOW
flag.
This makes untar fail if it extracts a symlink before its target and
attempts to set the mtime. lklfuse will follow the broken link and
return ENOENT.
Upper layers translate utimensat on a symlink to the link or the target
depending on the flags before calling utimens on the VFS, which is why
FUSE does not receive the flags.
Simple repro that fails in an lklfuse mount:
ln -s bad link
touch --no-dereference link
This bug originates in older FUSE examples, and has been fixed:
http://marc.info/?l=fuse-devel&m=130902410132197&w=2
http://unix.stackexchange.com/questions/89886/modify-date-of-symlink-on-bindfs
This fixes github issue #60.
Reported-by: Ryan Hitchman <hitchmanr@gmail.com>
Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Use asm-generic system call numbers instead of redefining them and use
the __SYSCALL macro to initialize the system call table.
System call stubs are now generated automatically by gathering the
SYSCALL_DEFINE() macros and inserting them in uapi/asm/syscalls.h with
the headers_install.py script.
This patch also changes the way we deal with 64/32 bit specific system
calls (e.g. newstatfs vs statfs64). Instead of always forcing the
definition of the *64 version, use wrappers in the host library headers.
Note that this change breaks the library APIs.
Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Allow the user to specify how much memory to allocate for LKL. Also
remove a debug printf.
Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Because we open the disk file image after fuse_daemonize() (which
changes the current directory), opening the disk file failed if the
filename was not absolute.
Also, the error path was wrong after commit 407679a460 (lkl tools:
lklfuse: initialize lkl after going in the background) which caused a
lkl syscall to be issued even though lkl was not started.
To fix this issue and to allow better error reporting, we now open the
disk file image before fuse_daemonize().
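
The ordering can be sketched as follows, using a plain chdir("/") as a
stand-in for the directory change described above. The helper name
open_image_then_chdir_root() is hypothetical:

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Open the image while the caller's working directory is still in
 * effect, then change to "/" (stand-in for the daemonizing step).
 * Returns the open fd, or -1 on error.
 */
static int open_image_then_chdir_root(const char *image_path)
{
	int fd = open(image_path, O_RDWR);

	if (fd < 0)
		return -1; /* still in the foreground: easy to report */
	if (chdir("/") < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

The fd stays valid after the directory change, whereas reopening the
same relative path afterwards would resolve against "/".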
Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
This fixes an issue where fuse blocks every filesystem request if
lklfuse is not started in the background.
Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
cptofs.c: In function ‘main’:
cptofs.c:429:2: warning: comparison of unsigned expression < 0 is always false [-Wtype-limits]
if (disk_id < 0) {
fs2tar.c:355:2: warning: comparison of unsigned expression < 0 is always false [-Wtype-limits]
if (disk_id < 0) {
^
tests/boot.c: In function ‘printk’:
tests/boot.c:54:8: error: ignoring return value of ‘write’, declared with attribute warn_unused_result [-Werror=unused-result]
write(STDOUT_FILENO, str, len);
^
lib/posix-host.c: In function ‘print’:
lib/posix-host.c:20:7: error: ignoring return value of ‘write’, declared with attribute warn_unused_result [-Werror=unused-result]
write(STDOUT_FILENO, str, len);
lib/fs.c: In function ‘lkl_mount_dev’:
lib/fs.c:72:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (mnt_str_len < sizeof("/mnt/xxxxxxxx"))
^
lib/utils.c: In function ‘lkl_strerror’:
lib/utils.c:147:10: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (err >= sizeof(lkl_err_strings) / sizeof(const char *))
lib/virtio.c: In function ‘virtio_read’:
lib/virtio.c:216:19: warning: ‘val’ may be used uninitialized in this function [-Wmaybe-uninitialized]
*(uint32_t *)res = htole32(val);
lklfuse.c: In function ‘lklfuse_readlink’:
lklfuse.c:136:10: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (ret == len)
^
lklfuse.c: In function ‘lklfuse_read’:
lklfuse.c:234:23: warning: signed and unsigned type in conditional expression [-Wsign-compare]
return ret < 0 ? ret : orig_size - size;
^
lklfuse.c: In function ‘lklfuse_write’:
lklfuse.c:253:23: warning: signed and unsigned type in conditional expression [-Wsign-compare]
return ret < 0 ? ret : orig_size - size;
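
One way to address the -Wsign-compare class of warnings above is to
normalise the value to an unsigned type before comparing it with a
sizeof expression. This is a sketch only: demo_err_strings and
demo_strerror() are hypothetical and do not reproduce the actual
lkl_strerror() change:

```c
#include <stddef.h>

static const char *const demo_err_strings[] = {
	"Success",
	"Operation not permitted",
	"No such file or directory",
};

/* Look up an error string for err or -err, with a bounds check. */
static const char *demo_strerror(int err)
{
	/* Unsigned negation is well defined, even for the minimum int. */
	unsigned int idx = err < 0 ? 0u - (unsigned int)err
				   : (unsigned int)err;
	size_t count = sizeof(demo_err_strings) / sizeof(demo_err_strings[0]);

	if (idx >= count)
		return "Bad error code";
	return demo_err_strings[idx];
}
```

Both operands of the comparison are unsigned here, so the compiler has
nothing to warn about and negative inputs cannot index out of bounds.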
Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
[Octavian: change lkl_mount_dev to take unsigned int size arg, squashed into single commit]
Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>