Implement
- userland build in nix, with cross-platform support and
non-privileged disk image generation
- qemu start command in nix
- nix develop environment for make kernel
- document build environment defined by nix
Next Steps
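- Reduce the space the rootfs occupies in the nix store
- Helper functions for deb package packaging compatibility
- More flexible build dependency injection
- Preserve modifications previously made inside the system
- Adapt the nix rootfs build and qemu startup to VNC mode
- Support riscv64 builds
- Development compatibility on Arm macOS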
* feat(filesystem): Enhance symlink handling and VFS behavior
- Updated tmpfs to require page cache for both regular files and symlinks to ensure proper read/write operations.
- Increased the maximum symlink follow count to 40, aligning with Linux 6.6 standards.
- Improved symlink handling in VFS to correctly follow symlinks based on path conditions and trailing slashes.
- Added validation for conflicting flags in vfs_statx to prevent invalid operations.
- Refined syscall implementations for symlink and lstat to adhere to Linux semantics, ensuring correct behavior for symlink creation and path resolution.
Signed-off-by: longjin <longjin@DragonOS.org>
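* fix(vfs): Correct the symlink follow count handling logic
- Change VFS_MAX_FOLLOW_SYMLINK_TIMES from 40 to 41, so that 0 keeps its "following disabled" meaning while
the Linux semantics of at most 40 follows is preserved
- Refactor path resolution to distinguish three cases for max_follow_times:
0 (following fully disabled), 1 (count exhausted), and >=2 (following may continue)
- Ensure ELOOP is returned when the count is exhausted (max_follow_times == 1) and a follow is still required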
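The three-way split on max_follow_times can be sketched as below. This is a minimal illustration, not the kernel's code: `FollowError`, `next_follow_budget`, and `follows_before_eloop` are hypothetical names, and only the constant name and its value come from this commit.

```rust
/// Hypothetical error type for this sketch; the kernel uses its own error enum.
#[derive(Debug, PartialEq, Eq)]
pub enum FollowError {
    /// Too many levels of symbolic links (ELOOP).
    Loop,
}

/// 41 = 40 real follows plus the reserved value 0, which means "never follow".
pub const VFS_MAX_FOLLOW_SYMLINK_TIMES: usize = 41;

/// Decide what happens when path resolution reaches a symlink.
/// - Ok(None): following is disabled, the symlink itself is the result.
/// - Ok(Some(n)): follow the link and continue with the reduced budget n.
/// - Err(Loop): the budget is exhausted, report ELOOP.
pub fn next_follow_budget(max_follow_times: usize) -> Result<Option<usize>, FollowError> {
    match max_follow_times {
        0 => Ok(None),
        1 => Err(FollowError::Loop),
        n => Ok(Some(n - 1)),
    }
}

/// Count how many links of an endless symlink chain are followed before
/// resolution stops, starting from the given budget.
pub fn follows_before_eloop(mut budget: usize) -> usize {
    let mut followed = 0;
    while let Ok(Some(rest)) = next_follow_budget(budget) {
        budget = rest;
        followed += 1;
    }
    followed
}
```

Starting from VFS_MAX_FOLLOW_SYMLINK_TIMES, this sketch follows exactly 40 links before the budget reaches 1 and ELOOP is reported.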
Signed-off-by: longjin <longjin@DragonOS.org>
---------
Signed-off-by: longjin <longjin@DragonOS.org>
- Set default umask to 0022 for new filesystem instances
- Add apply_umask_for_create() and chmod_preserve_type() helper functions
- Implement proper permission checks for file creation and chmod operations
- Fix fchmod syscall to work correctly and reject O_PATH file descriptors
- Add open_create_test to gvisor test suite
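A rough sketch of what the two helpers might compute, assuming they operate on raw `u32` mode values. The signatures below are assumptions for illustration; only the helper names and the 0022 default come from this commit.

```rust
/// File type mask and regular-file type bit, as in POSIX st_mode.
pub const S_IFMT: u32 = 0o170000;
pub const S_IFREG: u32 = 0o100000;
/// Default umask named in this commit.
pub const DEFAULT_UMASK: u32 = 0o022;

/// Clear the permission bits that the umask forbids.
/// Only the low nine rwx bits of the umask are honoured (assumed signature).
pub fn apply_umask_for_create(requested_mode: u32, umask: u32) -> u32 {
    requested_mode & !(umask & 0o777)
}

/// Replace the permission bits but keep the file type of the existing mode
/// (assumed signature).
pub fn chmod_preserve_type(old_mode: u32, new_mode: u32) -> u32 {
    (old_mode & S_IFMT) | (new_mode & 0o7777)
}
```

With the default umask, a file created with mode 0666 ends up as 0644, and a chmod that passes stray type bits cannot turn a regular file into something else.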
Signed-off-by: longjin <longjin@DragonOS.org>
* feat(filesystem): Implement EventFd filesystem and enhance VFS inode capabilities
- Introduced EventFdFs as a new pseudo-filesystem to support eventfd file descriptors, including methods for root inode retrieval and filesystem information.
- Enhanced IndexNode trait with is_stream, supports_seek, supports_pread, and supports_pwrite methods to streamline file operation semantics for stream-like files.
- Updated file handling in VFS to utilize new inode capabilities, ensuring correct behavior for pread, pwrite, and lseek operations.
- Added eventfd_test to the syscall whitelist for testing purposes.
This implementation aligns with Linux semantics for eventfd and improves the overall VFS design by consolidating stream behavior checks.
Signed-off-by: longjin <longjin@DragonOS.org>
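* fix(vfs): Fix the error-handling order for O_PATH and stream objects in pread/pwrite
- Reorder the error handling for O_PATH file descriptors so that EBADF is returned first
- Return ESPIPE for stream objects (FMODE_STREAM), so the permission check no longer produces a misleading error
- Separate the permission check logic so that error codes match Linux semantics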
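The check order can be illustrated with a small sketch. `FileState` and `check_pread` are hypothetical names, and the final EBADF for a descriptor not opened for reading follows Linux behaviour rather than anything stated in this commit.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum Errno {
    EBADF,
    ESPIPE,
}

/// Hypothetical summary of the open-file state relevant to pread.
pub struct FileState {
    pub opened_with_o_path: bool,
    pub is_stream: bool,
    pub readable: bool,
}

/// Checks run in the order listed above: O_PATH, then stream, then permission.
pub fn check_pread(file: &FileState) -> Result<(), Errno> {
    if file.opened_with_o_path {
        return Err(Errno::EBADF); // O_PATH descriptors never allow I/O
    }
    if file.is_stream {
        return Err(Errno::ESPIPE); // positional I/O makes no sense on a stream
    }
    if !file.readable {
        return Err(Errno::EBADF); // permission check comes last
    }
    Ok(())
}
```

A write-only pipe end, for example, reports ESPIPE rather than EBADF because the stream check runs before the permission check.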
Signed-off-by: longjin <longjin@DragonOS.org>
---------
Signed-off-by: longjin <longjin@DragonOS.org>
* feat(vfs): Implement append lock manager for file operations
- Introduced an `AppendLockManager` to ensure atomicity for append operations across filesystems, preventing data corruption in concurrent write scenarios.
- Updated file write methods to utilize the new append lock mechanism, ensuring that appending to files respects the latest end-of-file position.
- Enhanced `write_append` and `pwrite_append` methods to support forced append semantics, aligning with Linux behavior.
- Initialized the append lock manager during VFS initialization to ensure it is ready before any file write operations.
This addition improves the reliability of file operations in a multi-threaded environment, particularly for append scenarios.
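One way such a manager could be organised is a fixed array of locks selected by a hash of the inode id. The sketch below is an assumption about the design, not the kernel's implementation: the shard count, `SketchFile`, and the free function `write_append` are hypothetical, and the standard library's `DefaultHasher` stands in for whatever hash the kernel uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::sync::{Mutex, MutexGuard};

/// Number of lock shards (arbitrary value for this sketch).
const SHARD_COUNT: usize = 16;

pub struct AppendLockManager {
    shards: Vec<Mutex<()>>,
}

impl AppendLockManager {
    pub fn new() -> Self {
        Self { shards: (0..SHARD_COUNT).map(|_| Mutex::new(())).collect() }
    }

    /// Map an inode id to a shard; DefaultHasher is a stand-in hash.
    fn shard_index(inode_id: u64) -> usize {
        let mut hasher = DefaultHasher::new();
        inode_id.hash(&mut hasher);
        (hasher.finish() % SHARD_COUNT as u64) as usize
    }

    /// Take the append lock that covers the given inode.
    pub fn lock(&self, inode_id: u64) -> MutexGuard<'_, ()> {
        self.shards[Self::shard_index(inode_id)].lock().unwrap()
    }
}

/// Hypothetical in-memory file used only to demonstrate the locking pattern.
pub struct SketchFile {
    pub inode_id: u64,
    data: Mutex<Vec<u8>>,
}

impl SketchFile {
    pub fn new(inode_id: u64) -> Self {
        Self { inode_id, data: Mutex::new(Vec::new()) }
    }

    pub fn contents(&self) -> Vec<u8> {
        self.data.lock().unwrap().clone()
    }
}

/// Read the end-of-file position and write at it while holding the append
/// lock, so two appenders cannot pick the same offset. Returns that offset.
pub fn write_append(mgr: &AppendLockManager, file: &SketchFile, buf: &[u8]) -> usize {
    let _append_guard = mgr.lock(file.inode_id);
    let mut data = file.data.lock().unwrap();
    let offset = data.len();
    data.extend_from_slice(buf);
    offset
}
```

Sharding keeps the number of locks fixed regardless of how many inodes exist, at the cost of occasional false sharing between unrelated files that hash to the same shard.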
Signed-off-by: longjin <longjin@DragonOS.org>
* feat(kernel): Add a jhash library and use it for the append_lock hash computation
Signed-off-by: longjin <longjin@DragonOS.org>
---------
Signed-off-by: longjin <longjin@DragonOS.org>
* feat(procfs): Add /proc/[pid]/mountinfo and /proc/[pid]/maps support
- Introduced new ProcFileType variants for /proc/[pid]/mountinfo and /proc/[pid]/maps.
- Implemented content generation for /proc/[pid]/mountinfo and /proc/[pid]/maps to align with Linux semantics.
- Updated ProcFS inode creation to include these new files for each process.
- Enhanced path handling in the VFS to ensure correct resolution based on process-specific root and current working directory.
This addition improves the process filesystem's functionality and compatibility with Linux behavior.
* feat(filesystem): Enhance page cache management in tmpfs
- Added an unevictable flag to the PageCache structure, allowing pages to be marked as unevictable to prevent reclamation.
- Updated the TmpfsInode structure to integrate page cache management, replacing direct data manipulation with page cache operations for read and write methods.
- Refactored truncate and resize methods to utilize the new page cache functionality, ensuring consistency and improved memory management.
* feat(filesystem): Enhance tmpfs functionality and VFS constraints
- Implemented support for readahead in tmpfs, allowing for optimized data retrieval.
- Added checks for filename length across various VFS operations to prevent errors related to excessively long names.
- Updated the tmpfs implementation to handle read and write operations directly through the page cache, improving memory management.
- Enhanced rename functionality to ensure type compatibility and empty directory checks during operations.
- Increased maximum path length and defined maximum single filename length for better filesystem compliance.
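* refactor(fs): Refactor the tmpfs rename logic and fix the move_to delegation in MountFSInode
- Extract the tmpfs cross-directory move logic into a standalone function, `tmpfs_move_entry_between_dirs`
- Fix the lock order to avoid deadlock by locking directories in inode_id order
- Fix the target inode unwrapping in MountFSInode::move_to so that it delegates correctly to the underlying filesystem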
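Locking in inode_id order is a standard way to keep two concurrent cross-directory moves from deadlocking. A minimal sketch with `std::sync::Mutex` follows; `Dir` and `move_entry` are hypothetical stand-ins and do not reflect the real `tmpfs_move_entry_between_dirs` signature.

```rust
use std::sync::Mutex;

/// Hypothetical directory: an id plus a locked list of entry names.
pub struct Dir {
    pub inode_id: u64,
    entries: Mutex<Vec<String>>,
}

impl Dir {
    pub fn new(inode_id: u64, names: &[&str]) -> Self {
        let entries = names.iter().map(|n| n.to_string()).collect();
        Self { inode_id, entries: Mutex::new(entries) }
    }

    pub fn names(&self) -> Vec<String> {
        self.entries.lock().unwrap().clone()
    }
}

/// Move `name` from `src` to `dst`. Returns false if `src` has no such entry.
pub fn move_entry(src: &Dir, dst: &Dir, name: &str) -> bool {
    if src.inode_id == dst.inode_id {
        // Same directory: only one lock exists, so take it once.
        return src.entries.lock().unwrap().iter().any(|e| e.as_str() == name);
    }
    // Always lock the directory with the smaller inode_id first, so two
    // threads moving in opposite directions acquire locks in the same order.
    let (mut src_guard, mut dst_guard) = if src.inode_id < dst.inode_id {
        let s = src.entries.lock().unwrap();
        let d = dst.entries.lock().unwrap();
        (s, d)
    } else {
        let d = dst.entries.lock().unwrap();
        let s = src.entries.lock().unwrap();
        (s, d)
    };
    match src_guard.iter().position(|e| e.as_str() == name) {
        Some(index) => {
            let entry = src_guard.remove(index);
            dst_guard.push(entry);
            true
        }
        None => false,
    }
}
```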
Signed-off-by: longjin <longjin@DragonOS.org>
* feat(filesystem): Implement zero-page creation for tmpfs and enhance page fault handling
- Added `create_zero_pages` method to `InnerPageCache` for efficient zero-page creation, optimizing memory usage in tmpfs.
- Updated `Tmpfs` to utilize the new zero-page creation during read and write operations, ensuring seamless handling of page faults.
- Enhanced `PageFaultHandler` with `pagecache_fault_zero` to manage page faults specifically for tmpfs, allowing for direct page cache access without disk I/O.
This improves the performance and reliability of memory file systems by reducing unnecessary allocations and ensuring proper page management.
* refactor(syscall): Rename check_and_clone_cstr to vfs_check_and_clone_cstr for clarity
- Updated the user access module to introduce vfs_check_and_clone_cstr, enhancing clarity in its purpose for VFS operations.
- Refactored sys_openat and utimensat to utilize the new vfs_check_and_clone_cstr function, ensuring consistent handling of C string paths across the filesystem.
---------
Signed-off-by: longjin <longjin@DragonOS.org>
- Added `from_slice` method to `CacheBlock` for creating instances directly from slices, avoiding unnecessary allocations.
- Introduced `write_data` method in `CacheBlock` to allow in-place updates of block data.
- Updated `insert_one_block` and `immediate_write` methods in `BlockCache` to accept slices instead of vectors, improving performance and memory usage.
- Implemented error handling for block size validation in multiple locations to ensure data integrity.
- Change `FileMapInfo::page_cache` to `Weak<PageCache>` to fix a memory leak caused by reference cycles.
Signed-off-by: longjin <longjin@DragonOS.org>
* fix(net): udp getsockname/getpeername
* feat(ci): add test whitelist for new available inet syscall
---------
Co-authored-by: longjin <longjin@DragonOS.org>
* fix(tty): Enhance TTY driver and device management
- Updated the TTY driver to handle both master and slave types more effectively during close operations, ensuring proper cleanup of device entries in /dev/pts.
- Improved the handling of controlling TTY detachment for processes, adding support for the TIOCNOTTY command.
- Refactored the PTY device initialization to ensure correct metadata settings and device registration.
- Added a symlink for /dev/ptmx to point to the internal devpts node, preventing ENOENT errors during early access.
These changes enhance the robustness and compatibility of the TTY subsystem with Linux semantics.
Signed-off-by: longjin <longjin@DragonOS.org>
* refactor(tty): Improve PTY device management and cleanup logic
- Enhanced the `PtyDevPtsLink` structure to manage the lifecycle of PTY devices more effectively, including precise unlinking of directory entries and freeing of indices upon closure.
- Refactored the close operation in the TTY driver to utilize the new management logic, ensuring proper cleanup of master and slave PTY devices.
- Removed redundant code related to device entry removal, streamlining the cleanup process and aligning with Linux semantics.
Signed-off-by: longjin <longjin@DragonOS.org>
---------
Signed-off-by: longjin <longjin@DragonOS.org>
* feat(filesystem): Add tmpfs support and integrate with devfs
- Introduced a new tmpfs module for temporary file storage in memory.
- Updated devfs to mount /dev/shm as tmpfs, aligning with Linux semantics.
- Enhanced vfs module to include TMPFS_MAGIC for tmpfs identification.
- Added necessary methods and structures for tmpfs functionality, including inode management and file operations.
Signed-off-by: longjin <longjin@DragonOS.org>
* feat(filesystem): Implement atomic size management for tmpfs
- Added atomic operations to manage the current size of the tmpfs filesystem, including methods to increase and decrease size based on file operations.
- Integrated size management into inode operations, ensuring that size updates are thread-safe and adhere to specified limits.
- Enhanced the resize and truncate methods to adjust the filesystem size accordingly during file modifications.
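The size accounting can be sketched with a single atomic counter, assuming the filesystem has one byte limit. `TmpfsSize`, `NoSpace`, and the method names are hypothetical; the commit does not name the actual types.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical "no space left" error for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub struct NoSpace;

pub struct TmpfsSize {
    current: AtomicU64,
    limit: u64,
}

impl TmpfsSize {
    pub fn new(limit: u64) -> Self {
        Self { current: AtomicU64::new(0), limit }
    }

    /// Reserve `delta` bytes, failing without side effects if the limit
    /// would be exceeded. fetch_update retries on concurrent modification.
    pub fn try_increase(&self, delta: u64) -> Result<(), NoSpace> {
        self.current
            .fetch_update(Ordering::AcqRel, Ordering::Relaxed, |cur| {
                cur.checked_add(delta).filter(|next| *next <= self.limit)
            })
            .map(|_| ())
            .map_err(|_| NoSpace)
    }

    /// Release `delta` bytes, saturating at zero.
    pub fn decrease(&self, delta: u64) {
        let _ = self
            .current
            .fetch_update(Ordering::AcqRel, Ordering::Relaxed, |cur| {
                Some(cur.saturating_sub(delta))
            });
    }

    pub fn current(&self) -> u64 {
        self.current.load(Ordering::Acquire)
    }
}
```

In this sketch a resize that grows a file would call `try_increase` with the difference before allocating, and a truncate would call `decrease` after freeing.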
Signed-off-by: longjin <longjin@DragonOS.org>
---------
Signed-off-by: longjin <longjin@DragonOS.org>
* feat(filesystem): Add pwritev2 syscall implementation
- Introduced the pwritev2 syscall, allowing vectorized writes with offset and flags, enhancing compatibility with Linux semantics.
- Implemented validation for file descriptors and offsets, ensuring robust error handling.
- Reused core logic from pwritev for the new syscall, maintaining consistency in file writing operations.
This addition improves the VFS layer's functionality and aligns with Linux behavior for vectorized writing operations.
Signed-off-by: longjin <longjin@DragonOS.org>
* fmt
---------
Signed-off-by: longjin <longjin@DragonOS.org>
- Added validation for pwrite and pwritev syscalls to ensure offset and length conform to Linux semantics, returning EINVAL for negative offsets and invalid ranges.
- Updated offset extraction in sys_pwrite64 and sys_pwritev to use i64 for better compatibility.
- Included a new test case for pwrite64 in the gvisor whitelist to ensure proper functionality.
This change improves the robustness of file writing operations in the VFS layer.
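The offset and length validation could look roughly like the following. `validate_pwrite_range` is a hypothetical helper; the choice of EINVAL for both negative offsets and overflowing ranges follows the description above.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum Errno {
    EINVAL,
}

/// Validate a (offset, len) pair taken from a pwrite-style syscall.
/// The offset arrives as a signed 64-bit value so that negative inputs
/// are visible instead of wrapping into huge unsigned numbers.
pub fn validate_pwrite_range(offset: i64, len: usize) -> Result<u64, Errno> {
    if offset < 0 {
        return Err(Errno::EINVAL);
    }
    // The end of the written range must still fit in a signed offset.
    let len = i64::try_from(len).map_err(|_| Errno::EINVAL)?;
    offset.checked_add(len).ok_or(Errno::EINVAL)?;
    Ok(offset as u64)
}
```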
Signed-off-by: longjin <longjin@DragonOS.org>
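* refactor(mm): Refactor page cache reads and writes to resolve a deadlock and improve error handling
- Split page cache reads and writes into two phases so that no lock is held while a user page fault is handled
- Improve filesystem page fault handling to return SIGBUS instead of panicking
- Optimize the user buffer access checks in sys_read/sys_write
- Fix the argument alignment check in mprotect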
Signed-off-by: longjin <longjin@DragonOS.org>
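* feat(devfs): Add random device support
- Add the random device module random_dev, which provides random byte generation
- Register the /dev/random device in DevFS so the system can access random data
- Update related files to integrate the new device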
Signed-off-by: longjin <longjin@DragonOS.org>
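* feat(filesystem): Add mmap support to multiple filesystem nodes
- Implement the mmap method for LockedZeroInode, LockedExt4Inode, LockedFATInode, and LockedRamFSInode, allowing memory-mapped operations.
- Update related files to support mmap, making interaction with user space more flexible.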
Signed-off-by: longjin <longjin@DragonOS.org>
* fix(mm): Improve mmap error handling and validation
- Enhanced error handling in mmap implementation to return appropriate errors for unsupported operations.
- Added checks for MAP_PRIVATE and MAP_SHARED flags to ensure only one is set.
- Implemented page alignment validation for MAP_FIXED.
- Updated tests to reflect changes in mmap behavior.
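The two flag checks can be sketched as follows. The flag values are the Linux x86_64 ones, the 4 KiB page size is an assumption, and the sketch ignores `MAP_SHARED_VALIDATE`, which Linux encodes with both low bits set. `validate_mmap_args` is a hypothetical name.

```rust
pub const MAP_SHARED: u32 = 0x01;
pub const MAP_PRIVATE: u32 = 0x02;
pub const MAP_FIXED: u32 = 0x10;
/// Assumed page size for this sketch.
pub const PAGE_SIZE: usize = 4096;

#[derive(Debug, PartialEq, Eq)]
pub enum Errno {
    EINVAL,
}

pub fn validate_mmap_args(addr: usize, flags: u32) -> Result<(), Errno> {
    let shared = (flags & MAP_SHARED) != 0;
    let private = (flags & MAP_PRIVATE) != 0;
    // Exactly one of the two sharing modes must be requested.
    if shared == private {
        return Err(Errno::EINVAL);
    }
    // A fixed mapping must start on a page boundary.
    if (flags & MAP_FIXED) != 0 && addr % PAGE_SIZE != 0 {
        return Err(Errno::EINVAL);
    }
    Ok(())
}
```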
Signed-off-by: longjin <longjin@DragonOS.org>
* fix(mm): Enhance memory protection handling and validation
- Updated the `init_xd_rsvd` function to ensure NX support is enabled and correctly handle hardware limitations.
- Improved alignment checks in `sys_mprotect` to prevent overflow and ensure proper memory area verification.
- Removed outdated tests from `mmap_test` to streamline the test suite.
Signed-off-by: longjin <longjin@DragonOS.org>
* feat(procfs): Add support for /proc/[pid]/statm file
- Introduced the ProcStatm file type to the ProcFileType enum.
- Implemented the open_statm function to return a placeholder response for the statm file.
- Updated the ProcFS inode creation to include the statm file for each process.
- Enhanced the IndexNode implementation to handle the new ProcStatm file type.
Signed-off-by: longjin <longjin@DragonOS.org>
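* fix(mmap): Strengthen the offset checks and memory allocation logic of the mmap syscall
* fix(procfs): Improve the statm file open logic and add virtual memory page count calculation
* fix(syscall): Handle the len == 0 case so that the read and write syscalls follow the POSIX standard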
* refactor(mm): Optimize page reclamation process to prevent deadlocks
- Separated the page reclamation into two phases to avoid holding the reclaimer lock for extended periods, reducing the risk of lock order inversion with page_manager/page_cache.
- Updated the `shrink_list` method to handle victim page eviction without holding the reclaimer lock, ensuring safer memory management.
- Improved the `drain_lru` method to efficiently retrieve victim pages for reclamation.
Signed-off-by: longjin <longjin@DragonOS.org>
---------
Signed-off-by: longjin <longjin@DragonOS.org>
- Adjusted the link count for directories to ensure it starts at 2, accounting for the self-reference and parent directory link.
- Updated the logic for incrementing and decrementing link counts when creating and deleting directories.
- Enhanced the dynamic calculation of directory link counts in the VFS layer to ensure accuracy when metadata is unreliable.
This change improves the consistency of link count management across different filesystem implementations.
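The dynamic calculation mentioned above comes down to "2 plus one per child directory", since each subdirectory's `..` entry links back to its parent. A small sketch with hypothetical names:

```rust
/// Hypothetical file type tag for directory children.
#[derive(Clone, Copy, PartialEq, Eq)]
pub enum FileType {
    Dir,
    File,
    Symlink,
}

/// Link count of a directory derived from its children:
/// one link for its entry in the parent, one for its own ".",
/// and one for the ".." of every child directory.
pub fn directory_link_count(children: &[FileType]) -> usize {
    2 + children.iter().filter(|t| **t == FileType::Dir).count()
}
```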
Signed-off-by: longjin <longjin@DragonOS.org>
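* feat(filesystem): Add preadv2 syscall support
- Implement the preadv2 syscall, supporting vectorized reads with an offset and flags
- Use the current file offset when offset is -1; otherwise reuse the preadv logic
- Add RWF flag validation, following Linux compatibility requirements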
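The offset handling can be sketched as a small dispatch function. The RWF_* values are those of the Linux UAPI. Returning EOPNOTSUPP for unknown flag bits mirrors what Linux does and is an assumption about this implementation; `ReadPosition` and `resolve_preadv2` are hypothetical names.

```rust
pub const RWF_HIPRI: u32 = 0x01;
pub const RWF_DSYNC: u32 = 0x02;
pub const RWF_SYNC: u32 = 0x04;
pub const RWF_NOWAIT: u32 = 0x08;
pub const RWF_APPEND: u32 = 0x10;
const RWF_SUPPORTED: u32 = RWF_HIPRI | RWF_DSYNC | RWF_SYNC | RWF_NOWAIT | RWF_APPEND;

#[derive(Debug, PartialEq, Eq)]
pub enum Errno {
    EINVAL,
    EOPNOTSUPP,
}

/// Where the read should start.
#[derive(Debug, PartialEq, Eq)]
pub enum ReadPosition {
    /// Use (and advance) the file's current offset, like readv.
    Current,
    /// Read at an explicit offset without moving the file offset, like preadv.
    At(u64),
}

pub fn resolve_preadv2(offset: i64, flags: u32) -> Result<ReadPosition, Errno> {
    if (flags & !RWF_SUPPORTED) != 0 {
        return Err(Errno::EOPNOTSUPP); // assumed error code, as on Linux
    }
    match offset {
        -1 => Ok(ReadPosition::Current),
        o if o < 0 => Err(Errno::EINVAL),
        o => Ok(ReadPosition::At(o as u64)),
    }
}
```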
Signed-off-by: longjin <longjin@DragonOS.org>
* refactor(open): remove follow_symlink parameter from do_sys_open and related functions
- Simplified the do_sys_open and do_open functions by removing the follow_symlink parameter.
- Updated all calls to these functions to reflect the change, ensuring that symlink following behavior is determined by the O_NOFOLLOW flag in the file flags.
- Enhanced code readability and maintainability by streamlining function signatures.
Signed-off-by: longjin <longjin@DragonOS.org>
---------
Signed-off-by: longjin <longjin@DragonOS.org>
- Add rt_sigqueueinfo and rt_tgsigqueueinfo system calls for POSIX-compliant signal delivery
- Enhance kill process functionality with proper signal validation and permission checks
- Improve process exit handling with signal cleanup and parent notification
- Update fork implementation to handle signal inheritance properly
- Implement setresuid system call with proper privilege management
- Add comprehensive test coverage for signal-related syscalls
Signed-off-by: longjin <longjin@DragonOS.org>
- Add sched_getparam syscall to retrieve process scheduling parameters
- Add sched_getscheduler syscall to get process scheduling policy
- Refactor sched_yield into separate module with proper syscall handler
- Add utility functions for scheduling permission checks
- Remove old do_sched_yield implementation from main syscall module
Signed-off-by: longjin <longjin@DragonOS.org>
- Introduced a new module for managing kernel log levels, mimicking Linux behavior.
- Added support for dynamic log level configuration via command line and procfs interface.
- Created a new `/proc/sys/kernel/printk` file for reading and writing log level settings.
- Updated existing logging mechanisms to utilize the new log level management system.
- Enhanced the QEMU startup script to allow setting log levels through environment variables.
This implementation improves logging flexibility and aligns with expected Linux functionality.
Signed-off-by: longjin <longjin@DragonOS.org>
- Add proper error handling for concurrent namespace file creation
- Move namespace type validation earlier in the creation process
- Ensure child inodes are fully initialized before being visible in children map
- Handle EEXIST race condition by re-checking children map
Signed-off-by: longjin <longjin@DragonOS.org>
* feat(procfs): Add support for /proc/thread-self/ns namespace files
- Introduced a new module for handling namespace files under /proc/thread-self/ns, allowing applications to reference namespaces via symlinks.
- Implemented dynamic creation of namespace files and their corresponding IDs, ensuring compatibility with Linux behavior.
- Updated ProcFS to create necessary directory structures and files for thread-specific namespaces.
- Enhanced existing ProcFileType enum to include new types for thread self namespaces.
This addition improves the process namespace management and aligns with expected Linux functionality.
Signed-off-by: longjin <longjin@DragonOS.org>
* feat(namespace): Implement setns syscall for namespace management
- Added the `setns` syscall to allow processes to join existing namespaces using a file descriptor.
- Implemented kernel-level logic to handle namespace switching based on provided flags, supporting pidfd and namespace fd types.
- Enhanced the namespace management by integrating with the existing `NsProxy` structure, ensuring compatibility with Linux behavior.
- Updated related modules and added necessary error handling for invalid flags and file descriptors.
This addition improves the flexibility of namespace management in the kernel, aligning with expected Linux functionality.
Signed-off-by: longjin <longjin@DragonOS.org>
---------
Signed-off-by: longjin <longjin@DragonOS.org>
* feat(vfs): Implement mount propagation mechanism in VFS
- Added support for mount propagation types: Shared, Private, Slave, and Unbindable.
- Introduced a new module for managing mount propagation semantics, including peer group registration and event propagation.
- Updated existing mount functions to handle propagation logic during mount and unmount operations.
- Enhanced documentation to include details on the new mount propagation features and their usage.
- Added unit tests to verify the correctness of mount propagation behavior across different scenarios.
This implementation aligns with Linux semantics for mount propagation, ensuring compatibility and expected behavior in containerized environments.
Signed-off-by: longjin <longjin@DragonOS.org>
* refactor(filesystem): optimize mount propagation and logging
- Replace ID allocator with atomic counter for propagation groups
- Refactor peer group registry into structured class with better APIs
- Remove verbose debug logs to reduce noise
Signed-off-by: longjin <longjin@DragonOS.org>
* fix(namespace): correct mount propagation peer group handling
- Fix peer group registration when changing propagation type from shared
- Ensure propagated mounts join source child's peer group instead of target
parent's group
- Add proper peer group cleanup when transitioning from shared propagation
Signed-off-by: longjin <longjin@DragonOS.org>
* feat(vfs): implement recursive bind mount support
- Add recursive bind mount functionality with MS_BIND | MS_REC flags
- Implement BFS traversal for copying submounts in do_recursive_bind_mount
- Fix mount registration order to prevent dangling registrations on failure
- Add comprehensive test cases for recursive and non-recursive bind mounts
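A breadth-first walk visits every mount before its submounts, so each copied submount already has its parent in place when it is attached. The sketch below shows only the traversal order; `MountNode` and `bfs_mount_order` are hypothetical and do not mirror the real `do_recursive_bind_mount`.

```rust
use std::collections::VecDeque;

/// Hypothetical mount tree node: a mount point path and its submounts.
pub struct MountNode {
    pub path: String,
    pub children: Vec<MountNode>,
}

impl MountNode {
    pub fn new(path: &str, children: Vec<MountNode>) -> Self {
        Self { path: path.to_string(), children }
    }
}

/// Return mount point paths in the order a BFS copy would process them.
pub fn bfs_mount_order(root: &MountNode) -> Vec<&str> {
    let mut order = Vec::new();
    let mut queue = VecDeque::new();
    queue.push_back(root);
    while let Some(node) = queue.pop_front() {
        order.push(node.path.as_str());
        // Children are queued behind all mounts of the current depth.
        for child in &node.children {
            queue.push_back(child);
        }
    }
    order
}
```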
Signed-off-by: longjin <longjin@DragonOS.org>
---------
Signed-off-by: longjin <longjin@DragonOS.org>
- Replace from_bits_truncate with from_bits to properly validate clone flags
- Remove CloneTest.Clone3UnknownFlag from test blocklist as it's now handled
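In the bitflags crate, `from_bits` returns `None` when the raw value contains bits that no flag defines, while `from_bits_truncate` silently drops them. The dependency-free sketch below reproduces that difference by hand with three of the Linux clone flag values; the free functions are illustrative stand-ins, not the crate's generated methods.

```rust
pub const CLONE_VM: u64 = 0x0000_0100;
pub const CLONE_FS: u64 = 0x0000_0200;
pub const CLONE_FILES: u64 = 0x0000_0400;
/// Only these three flags are "known" in this sketch.
const KNOWN_FLAGS: u64 = CLONE_VM | CLONE_FS | CLONE_FILES;

/// Strict parsing: any unknown bit rejects the whole value,
/// which lets the syscall return an error to the caller.
pub fn from_bits(raw: u64) -> Option<u64> {
    if (raw & !KNOWN_FLAGS) == 0 {
        Some(raw)
    } else {
        None
    }
}

/// Lenient parsing: unknown bits are silently discarded,
/// so an invalid request looks valid.
pub fn from_bits_truncate(raw: u64) -> u64 {
    raw & KNOWN_FLAGS
}
```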
Signed-off-by: longjin <longjin@DragonOS.org>
- Support embedding initram and using Ramfs as the file system into which initram is extracted
- Support the kexec family of system calls, including the load variants and reboot
- Support booting with u-root, which is written in Go, as the root file system
- Add sysfs entries such as boot_params and memmap
- Add a series of auxiliary system calls related to the above
Signed-off-by: JensenWei007 <jensenwei007@gmail.com>
https://github.com/DragonOS-Community/DragonOS/pull/1304
* fix rename error in fat32
add a fake link implementation for fat32 (it will be removed in the
future).
Signed-off-by: Godones <chenlinfeng25@outlook.com>
* feat: add new syscalls and fix the fcntl error
add sendfile syscall.
add rt_sigsuspend syscall.
add sendfile test.
add F_SETOWN/F_GETOWN commands for fcntl.
Signed-off-by: Godones <chenlinfeng25@outlook.com>
---------
Signed-off-by: Godones <chenlinfeng25@outlook.com>
- Add necessary platform driver support
- Modify some startup processes and assertions
- Fixed some issues
Signed-off-by: JensenWei007 <jensenwei007@gmail.com>
1. Return ok instead of error for tty devices.
2. Fix packet sending and receiving issues in the network.
3. Fix file descriptor duplication issue.
4. Fix readlink error.