Add a new "Limitations on System Calls" section to the book

Tate, Hongliang Tian 2025-08-07 17:44:27 +08:00 committed by Tate, Hongliang Tian
parent f6478d62cc
commit 5f47febe42
15 changed files with 964 additions and 381 deletions

@ -8,7 +8,20 @@
* [Advanced Build and Test Instructions](kernel/advanced-instructions.md)
* [Intel TDX](kernel/intel_tdx.md)
* [The Framekernel Architecture](kernel/the-framekernel-architecture.md)
* [Linux Compatibility](kernel/linux-compatibility.md)
* [Linux Compatibility](kernel/linux-compatibility/README.md)
* [Limitations on System Calls](kernel/linux-compatibility/limitations-on-system-calls/README.md)
* [System Call Matching Language (SCML)](kernel/linux-compatibility/limitations-on-system-calls/system-call-matching-language.md)
* [Process and thread management](kernel/linux-compatibility/limitations-on-system-calls/process-and-thread-management.md)
* [Memory management](kernel/linux-compatibility/limitations-on-system-calls/memory-management.md)
* [File & directory operations](kernel/linux-compatibility/limitations-on-system-calls/file-and-directory-operations.md)
* [File systems & mount control](kernel/linux-compatibility/limitations-on-system-calls/file-systems-and-mount-control.md)
* [File descriptor & I/O control](kernel/linux-compatibility/limitations-on-system-calls/file-descriptor-and-io-control.md)
* [Inter-process communication](kernel/linux-compatibility/limitations-on-system-calls/inter-process-communication.md)
* [Networking & sockets](kernel/linux-compatibility/limitations-on-system-calls/networking-and-sockets.md)
* [Signals & timers](kernel/linux-compatibility/limitations-on-system-calls/signals-and-timers.md)
* [Namespaces, cgroups & security](kernel/linux-compatibility/limitations-on-system-calls/namespaces-cgroups-and-security.md)
* [System information & misc](kernel/linux-compatibility/limitations-on-system-calls/system-information-and-misc.md)
* [Limitations on File Systems]()
* [Roadmap](kernel/roadmap.md)
# Asterinas OSTD

@ -1,380 +0,0 @@
# Linux Compatibility
> "We don't break user space."
>
> --- Linus Torvalds
Asterinas is dedicated to maintaining compatibility with the Linux ABI,
ensuring that applications and administrative tools
designed for Linux can seamlessly operate within Asterinas.
While we prioritize compatibility,
it is important to note that Asterinas does not,
nor will it in the future,
support the loading of Linux kernel modules.
## System Calls
At the time of writing,
Asterinas implements 219 out of the 336 system calls
provided by Linux on x86-64 architecture.
| Numbers | Names | Is Implemented |
| ------- | ---------------- | --------------- |
| 0 | read | ✅ |
| 1 | write | ✅ |
| 2 | open | ✅ |
| 3 | close | ✅ |
| 4 | stat | ✅ |
| 5 | fstat | ✅ |
| 6 | lstat | ✅ |
| 7 | poll | ✅ |
| 8 | lseek | ✅ |
| 9 | mmap | ✅ |
| 10 | mprotect | ✅ |
| 11 | munmap | ✅ |
| 12 | brk | ✅ |
| 13 | rt_sigaction | ✅ |
| 14 | rt_sigprocmask | ✅ |
| 15 | rt_sigreturn | ✅ |
| 16 | ioctl | ✅ |
| 17 | pread64 | ✅ |
| 18 | pwrite64 | ✅ |
| 19 | readv | ✅ |
| 20 | writev | ✅ |
| 21 | access | ✅ |
| 22 | pipe | ✅ |
| 23 | select | ✅ |
| 24 | sched_yield | ✅ |
| 25 | mremap | ✅ |
| 26 | msync | ✅ |
| 27 | mincore | ❌ |
| 28 | madvise | ✅ |
| 29 | shmget | ❌ |
| 30 | shmat | ❌ |
| 31 | shmctl | ❌ |
| 32 | dup | ✅ |
| 33 | dup2 | ✅ |
| 34 | pause | ✅ |
| 35 | nanosleep | ✅ |
| 36 | getitimer | ✅ |
| 37 | alarm | ✅ |
| 38 | setitimer | ✅ |
| 39 | getpid | ✅ |
| 40 | sendfile | ✅ |
| 41 | socket | ✅ |
| 42 | connect | ✅ |
| 43 | accept | ✅ |
| 44 | sendto | ✅ |
| 45 | recvfrom | ✅ |
| 46 | sendmsg | ✅ |
| 47 | recvmsg | ✅ |
| 48 | shutdown | ✅ |
| 49 | bind | ✅ |
| 50 | listen | ✅ |
| 51 | getsockname | ✅ |
| 52 | getpeername | ✅ |
| 53 | socketpair | ✅ |
| 54 | setsockopt | ✅ |
| 55 | getsockopt | ✅ |
| 56 | clone | ✅ |
| 57 | fork | ✅ |
| 58 | vfork | ❌ |
| 59 | execve | ✅ |
| 60 | exit | ✅ |
| 61 | wait4 | ✅ |
| 62 | kill | ✅ |
| 63 | uname | ✅ |
| 64 | semget | ✅ |
| 65 | semop | ✅ |
| 66 | semctl | ✅ |
| 67 | shmdt | ❌ |
| 68 | msgget | ❌ |
| 69 | msgsnd | ❌ |
| 70 | msgrcv | ❌ |
| 71 | msgctl | ❌ |
| 72 | fcntl | ✅ |
| 73 | flock | ✅ |
| 74 | fsync | ✅ |
| 75 | fdatasync | ✅ |
| 76 | truncate | ✅ |
| 77 | ftruncate | ✅ |
| 78 | getdents | ✅ |
| 79 | getcwd | ✅ |
| 80 | chdir | ✅ |
| 81 | fchdir | ✅ |
| 82 | rename | ✅ |
| 83 | mkdir | ✅ |
| 84 | rmdir | ✅ |
| 85 | creat | ✅ |
| 86 | link | ✅ |
| 87 | unlink | ✅ |
| 88 | symlink | ✅ |
| 89 | readlink | ✅ |
| 90 | chmod | ✅ |
| 91 | fchmod | ✅ |
| 92 | chown | ✅ |
| 93 | fchown | ✅ |
| 94 | lchown | ✅ |
| 95 | umask | ✅ |
| 96 | gettimeofday | ✅ |
| 97 | getrlimit | ✅ |
| 98 | getrusage | ✅ |
| 99 | sysinfo | ✅ |
| 100 | times | ❌ |
| 101 | ptrace | ❌ |
| 102 | getuid | ✅ |
| 103 | syslog | ❌ |
| 104 | getgid | ✅ |
| 105 | setuid | ✅ |
| 106 | setgid | ✅ |
| 107 | geteuid | ✅ |
| 108 | getegid | ✅ |
| 109 | setpgid | ✅ |
| 110 | getppid | ✅ |
| 111 | getpgrp | ✅ |
| 112 | setsid | ✅ |
| 113 | setreuid | ✅ |
| 114 | setregid | ✅ |
| 115 | getgroups | ✅ |
| 116 | setgroups | ✅ |
| 117 | setresuid | ✅ |
| 118 | getresuid | ✅ |
| 119 | setresgid | ✅ |
| 120 | getresgid | ✅ |
| 121 | getpgid | ✅ |
| 122 | setfsuid | ✅ |
| 123 | setfsgid | ✅ |
| 124 | getsid | ✅ |
| 125 | capget | ✅ |
| 126 | capset | ✅ |
| 127 | rt_sigpending | ✅ |
| 128 | rt_sigtimedwait | ❌ |
| 129 | rt_sigqueueinfo | ❌ |
| 130 | rt_sigsuspend | ✅ |
| 131 | sigaltstack | ✅ |
| 132 | utime | ✅ |
| 133 | mknod | ✅ |
| 134 | uselib | ❌ |
| 135 | personality | ❌ |
| 136 | ustat | ❌ |
| 137 | statfs | ✅ |
| 138 | fstatfs | ✅ |
| 139 | sysfs | ❌ |
| 140 | getpriority | ✅ |
| 141 | setpriority | ✅ |
| 142 | sched_setparam | ✅ |
| 143 | sched_getparam | ✅ |
| 144 | sched_setscheduler | ✅ |
| 145 | sched_getscheduler | ✅ |
| 146 | sched_get_priority_max | ✅ |
| 147 | sched_get_priority_min | ✅ |
| 148 | sched_rr_get_interval | ❌ |
| 149 | mlock | ❌ |
| 150 | munlock | ❌ |
| 151 | mlockall | ❌ |
| 152 | munlockall | ❌ |
| 153 | vhangup | ❌ |
| 154 | modify_ldt | ❌ |
| 155 | pivot_root | ❌ |
| 156 | _sysctl | ❌ |
| 157 | prctl | ✅ |
| 158 | arch_prctl | ✅ |
| 159 | adjtimex | ❌ |
| 160 | setrlimit | ✅ |
| 161 | chroot | ✅ |
| 162 | sync | ✅ |
| 163 | acct | ❌ |
| 164 | settimeofday | ❌ |
| 165 | mount | ✅ |
| 166 | umount2 | ✅ |
| 167 | swapon | ❌ |
| 168 | swapoff | ❌ |
| 169 | reboot | ❌ |
| 170 | sethostname | ❌ |
| 171 | setdomainname | ❌ |
| 172 | iopl | ❌ |
| 173 | ioperm | ❌ |
| 174 | create_module | ❌ |
| 175 | init_module | ❌ |
| 176 | delete_module | ❌ |
| 177 | get_kernel_syms | ❌ |
| 178 | query_module | ❌ |
| 179 | quotactl | ❌ |
| 180 | nfsservctl | ❌ |
| 181 | getpmsg | ❌ |
| 182 | putpmsg | ❌ |
| 183 | afs_syscall | ❌ |
| 184 | tuxcall | ❌ |
| 185 | security | ❌ |
| 186 | gettid | ✅ |
| 187 | readahead | ❌ |
| 188 | setxattr | ✅ |
| 189 | lsetxattr | ✅ |
| 190 | fsetxattr | ✅ |
| 191 | getxattr | ✅ |
| 192 | lgetxattr | ✅ |
| 193 | fgetxattr | ✅ |
| 194 | listxattr | ✅ |
| 195 | llistxattr | ✅ |
| 196 | flistxattr | ✅ |
| 197 | removexattr | ✅ |
| 198 | lremovexattr | ✅ |
| 199 | fremovexattr | ✅ |
| 200 | tkill | ❌ |
| 201 | time | ✅ |
| 202 | futex | ✅ |
| 203 | sched_setaffinity | ✅ |
| 204 | sched_getaffinity | ✅ |
| 205 | set_thread_area | ❌ |
| 206 | io_setup | ❌ |
| 207 | io_destroy | ❌ |
| 208 | io_getevents | ❌ |
| 209 | io_submit | ❌ |
| 210 | io_cancel | ❌ |
| 211 | get_thread_area | ❌ |
| 212 | lookup_dcookie | ❌ |
| 213 | epoll_create | ✅ |
| 214 | epoll_ctl_old | ❌ |
| 215 | epoll_wait_old | ❌ |
| 216 | remap_file_pages | ❌ |
| 217 | getdents64 | ✅ |
| 218 | set_tid_address | ✅ |
| 219 | restart_syscall | ❌ |
| 220 | semtimedop | ✅ |
| 221 | fadvise64 | ✅ |
| 222 | timer_create | ✅ |
| 223 | timer_settime | ✅ |
| 224 | timer_gettime | ✅ |
| 225 | timer_getoverrun | ❌ |
| 226 | timer_delete | ✅ |
| 227 | clock_settime | ❌ |
| 228 | clock_gettime | ✅ |
| 229 | clock_getres | ❌ |
| 230 | clock_nanosleep | ✅ |
| 231 | exit_group | ✅ |
| 232 | epoll_wait | ✅ |
| 233 | epoll_ctl | ✅ |
| 234 | tgkill | ✅ |
| 235 | utimes | ✅ |
| 236 | vserver | ❌ |
| 237 | mbind | ❌ |
| 238 | set_mempolicy | ❌ |
| 239 | get_mempolicy | ❌ |
| 240 | mq_open | ❌ |
| 241 | mq_unlink | ❌ |
| 242 | mq_timedsend | ❌ |
| 243 | mq_timedreceive | ❌ |
| 244 | mq_notify | ❌ |
| 245 | mq_getsetattr | ❌ |
| 246 | kexec_load | ❌ |
| 247 | waitid | ✅ |
| 248 | add_key | ❌ |
| 249 | request_key | ❌ |
| 250 | keyctl | ❌ |
| 251 | ioprio_set | ✅ |
| 252 | ioprio_get | ✅ |
| 253 | inotify_init | ❌ |
| 254 | inotify_add_watch | ❌ |
| 255 | inotify_rm_watch | ❌ |
| 256 | migrate_pages | ❌ |
| 257 | openat | ✅ |
| 258 | mkdirat | ✅ |
| 259 | mknodat | ✅ |
| 260 | fchownat | ✅ |
| 261 | futimesat | ✅ |
| 262 | newfstatat | ✅ |
| 263 | unlinkat | ✅ |
| 264 | renameat | ✅ |
| 265 | linkat | ✅ |
| 266 | symlinkat | ✅ |
| 267 | readlinkat | ✅ |
| 268 | fchmodat | ✅ |
| 269 | faccessat | ✅ |
| 270 | pselect6 | ✅ |
| 271 | ppoll | ✅ |
| 272 | unshare | ❌ |
| 273 | set_robust_list | ✅ |
| 274 | get_robust_list | ❌ |
| 275 | splice | ❌ |
| 276 | tee | ❌ |
| 277 | sync_file_range | ❌ |
| 278 | vmsplice | ❌ |
| 279 | move_pages | ❌ |
| 280 | utimensat | ✅ |
| 281 | epoll_pwait | ✅ |
| 282 | signalfd | ✅ |
| 283 | timerfd_create | ✅ |
| 284 | eventfd | ✅ |
| 285 | fallocate | ✅ |
| 286 | timerfd_settime | ✅ |
| 287 | timerfd_gettime | ✅ |
| 288 | accept4 | ✅ |
| 289 | signalfd4 | ✅ |
| 290 | eventfd2 | ✅ |
| 291 | epoll_create1 | ✅ |
| 292 | dup3 | ✅ |
| 293 | pipe2 | ✅ |
| 294 | inotify_init1 | ❌ |
| 295 | preadv | ✅ |
| 296 | pwritev | ✅ |
| 297 | rt_tgsigqueueinfo | ❌ |
| 298 | perf_event_open | ❌ |
| 299 | recvmmsg | ❌ |
| 300 | fanotify_init | ❌ |
| 301 | fanotify_mark | ❌ |
| 302 | prlimit64 | ✅ |
| 303 | name_to_handle_at | ❌ |
| 304 | open_by_handle_at | ❌ |
| 305 | clock_adjtime | ❌ |
| 306 | syncfs | ❌ |
| 307 | sendmmsg | ❌ |
| 308 | setns | ❌ |
| 309 | getcpu | ✅ |
| 310 | process_vm_readv | ❌ |
| 311 | process_vm_writev | ❌ |
| 312 | kcmp | ❌ |
| 313 | finit_module | ❌ |
| 314 | sched_setattr | ✅ |
| 315 | sched_getattr | ✅ |
| 318 | getrandom | ✅ |
| 319 | memfd_create | ✅ |
| 322 | execveat | ✅ |
| 327 | preadv2 | ✅ |
| 328 | pwritev2 | ✅ |
| 332 | statx | ✅ |
| 434 | pidfd_open | ✅ |
| 435 | clone3 | ✅ |
| 436 | close_range | ✅ |
| 439 | faccessat2 | ✅ |
| 441 | epoll_pwait2 | ✅ |
## File Systems
Here is the list of supported file systems:
* Devfs
* Devpts
* Ext2
* Procfs
* Ramfs
## Sockets
Here is the list of supported socket types:
* TCP sockets over IPv4
* UDP sockets over IPv4
* Unix sockets
## vDSO
Here is the list of supported symbols in vDSO:
* `__vdso_clock_gettime`
* `__vdso_gettimeofday`
* `__vdso_time`
## Boot Protocols
Here is the list of supported boot protocols:
* [Multiboot](https://www.gnu.org/software/grub/manual/multiboot/multiboot.html)
* [Multiboot2](https://www.gnu.org/software/grub/manual/multiboot2/multiboot.html)
* [Linux 32-bit boot protocol](https://www.kernel.org/doc/html/v5.4/x86/boot.html#bit-boot-protocol)
* [Linux EFI handover](https://www.kernel.org/doc/html/v5.4/x86/boot.html#efi-handover-protocol)

@ -0,0 +1,380 @@
# Linux Compatibility
> "We don't break user space."
>
> --- Linus Torvalds
Asterinas is dedicated to maintaining compatibility with the Linux ABI,
ensuring that applications and administrative tools
designed for Linux can seamlessly operate within Asterinas.
While we prioritize compatibility,
it is important to note that Asterinas does not,
nor will it in the future,
support the loading of Linux kernel modules.
## System Calls
At the time of writing,
Asterinas implements 219 out of the 336 system calls
provided by Linux on the x86-64 architecture.
| Numbers | Names | Supported | Limitations |
| ------- | ---------------------- | -------------- | --- |
| 0 | read | ✅ | |
| 1 | write | ✅ | |
| 2 | open | ✅ | [⚠️](limitations-on-system-calls/file-and-directory-operations.md#open-and-openat) |
| 3 | close | ✅ | |
| 4 | stat | ✅ | |
| 5 | fstat | ✅ | |
| 6 | lstat | ✅ | |
| 7 | poll | ✅ | |
| 8 | lseek | ✅ | |
| 9 | mmap | ✅ | [⚠️](limitations-on-system-calls/memory-management.md#mmap) |
| 10 | mprotect | ✅ | |
| 11 | munmap | ✅ | |
| 12 | brk | ✅ | |
| 13 | rt_sigaction | ✅ | |
| 14 | rt_sigprocmask | ✅ | |
| 15 | rt_sigreturn | ✅ | |
| 16 | ioctl | ✅ | |
| 17 | pread64 | ✅ | |
| 18 | pwrite64 | ✅ | |
| 19 | readv | ✅ | |
| 20 | writev | ✅ | |
| 21 | access | ✅ | |
| 22 | pipe | ✅ | |
| 23 | select | ✅ | |
| 24 | sched_yield | ✅ | |
| 25 | mremap | ✅ | |
| 26 | msync | ✅ | |
| 27 | mincore | ❌ | |
| 28 | madvise | ✅ | |
| 29 | shmget | ❌ | |
| 30 | shmat | ❌ | |
| 31 | shmctl | ❌ | |
| 32 | dup | ✅ | |
| 33 | dup2 | ✅ | |
| 34 | pause | ✅ | |
| 35 | nanosleep | ✅ | |
| 36 | getitimer | ✅ | |
| 37 | alarm | ✅ | |
| 38 | setitimer | ✅ | |
| 39 | getpid | ✅ | |
| 40 | sendfile | ✅ | |
| 41 | socket | ✅ | [⚠️](limitations-on-system-calls/networking-and-sockets.md#socket) |
| 42 | connect | ✅ | |
| 43 | accept | ✅ | |
| 44 | sendto | ✅ | |
| 45 | recvfrom | ✅ | |
| 46 | sendmsg | ✅ | |
| 47 | recvmsg | ✅ | |
| 48 | shutdown | ✅ | |
| 49 | bind | ✅ | |
| 50 | listen | ✅ | |
| 51 | getsockname | ✅ | |
| 52 | getpeername | ✅ | |
| 53 | socketpair | ✅ | |
| 54 | setsockopt | ✅ | |
| 55 | getsockopt | ✅ | |
| 56 | clone | ✅ | |
| 57 | fork | ✅ | |
| 58 | vfork | ❌ | |
| 59 | execve | ✅ | |
| 60 | exit | ✅ | |
| 61 | wait4 | ✅ | |
| 62 | kill | ✅ | |
| 63 | uname | ✅ | |
| 64 | semget | ✅ | |
| 65 | semop | ✅ | |
| 66 | semctl | ✅ | |
| 67 | shmdt | ❌ | |
| 68 | msgget | ❌ | |
| 69 | msgsnd | ❌ | |
| 70 | msgrcv | ❌ | |
| 71 | msgctl | ❌ | |
| 72 | fcntl | ✅ | |
| 73 | flock | ✅ | |
| 74 | fsync | ✅ | |
| 75 | fdatasync | ✅ | |
| 76 | truncate | ✅ | |
| 77 | ftruncate | ✅ | |
| 78 | getdents | ✅ | |
| 79 | getcwd | ✅ | |
| 80 | chdir | ✅ | |
| 81 | fchdir | ✅ | |
| 82 | rename | ✅ | |
| 83 | mkdir | ✅ | |
| 84 | rmdir | ✅ | |
| 85 | creat | ✅ | |
| 86 | link | ✅ | |
| 87 | unlink | ✅ | |
| 88 | symlink | ✅ | |
| 89 | readlink | ✅ | |
| 90 | chmod | ✅ | |
| 91 | fchmod | ✅ | |
| 92 | chown | ✅ | |
| 93 | fchown | ✅ | |
| 94 | lchown | ✅ | |
| 95 | umask | ✅ | |
| 96 | gettimeofday | ✅ | |
| 97 | getrlimit | ✅ | |
| 98 | getrusage | ✅ | |
| 99 | sysinfo | ✅ | |
| 100 | times | ❌ | |
| 101 | ptrace | ❌ | |
| 102 | getuid | ✅ | |
| 103 | syslog | ❌ | |
| 104 | getgid | ✅ | |
| 105 | setuid | ✅ | |
| 106 | setgid | ✅ | |
| 107 | geteuid | ✅ | |
| 108 | getegid | ✅ | |
| 109 | setpgid | ✅ | |
| 110 | getppid | ✅ | |
| 111 | getpgrp | ✅ | |
| 112 | setsid | ✅ | |
| 113 | setreuid | ✅ | |
| 114 | setregid | ✅ | |
| 115 | getgroups | ✅ | |
| 116 | setgroups | ✅ | |
| 117 | setresuid | ✅ | |
| 118 | getresuid | ✅ | |
| 119 | setresgid | ✅ | |
| 120 | getresgid | ✅ | |
| 121 | getpgid | ✅ | |
| 122 | setfsuid | ✅ | |
| 123 | setfsgid | ✅ | |
| 124 | getsid | ✅ | |
| 125 | capget | ✅ | |
| 126 | capset | ✅ | |
| 127 | rt_sigpending | ✅ | |
| 128 | rt_sigtimedwait | ❌ | |
| 129 | rt_sigqueueinfo | ❌ | |
| 130 | rt_sigsuspend | ✅ | |
| 131 | sigaltstack | ✅ | |
| 132 | utime | ✅ | |
| 133 | mknod | ✅ | |
| 134 | uselib | ❌ | |
| 135 | personality | ❌ | |
| 136 | ustat | ❌ | |
| 137 | statfs | ✅ | |
| 138 | fstatfs | ✅ | |
| 139 | sysfs | ❌ | |
| 140 | getpriority | ✅ | |
| 141 | setpriority | ✅ | |
| 142 | sched_setparam | ✅ | |
| 143 | sched_getparam | ✅ | |
| 144 | sched_setscheduler | ✅ | |
| 145 | sched_getscheduler | ✅ | |
| 146 | sched_get_priority_max | ✅ | |
| 147 | sched_get_priority_min | ✅ | |
| 148 | sched_rr_get_interval | ❌ | |
| 149 | mlock | ❌ | |
| 150 | munlock | ❌ | |
| 151 | mlockall | ❌ | |
| 152 | munlockall | ❌ | |
| 153 | vhangup | ❌ | |
| 154 | modify_ldt | ❌ | |
| 155 | pivot_root | ❌ | |
| 156 | _sysctl | ❌ | |
| 157 | prctl | ✅ | |
| 158 | arch_prctl | ✅ | |
| 159 | adjtimex | ❌ | |
| 160 | setrlimit | ✅ | |
| 161 | chroot | ✅ | |
| 162 | sync | ✅ | |
| 163 | acct | ❌ | |
| 164 | settimeofday | ❌ | |
| 165 | mount | ✅ | |
| 166 | umount2 | ✅ | |
| 167 | swapon | ❌ | |
| 168 | swapoff | ❌ | |
| 169 | reboot | ❌ | |
| 170 | sethostname | ❌ | |
| 171 | setdomainname | ❌ | |
| 172 | iopl | ❌ | |
| 173 | ioperm | ❌ | |
| 174 | create_module | ❌ | |
| 175 | init_module | ❌ | |
| 176 | delete_module | ❌ | |
| 177 | get_kernel_syms | ❌ | |
| 178 | query_module | ❌ | |
| 179 | quotactl | ❌ | |
| 180 | nfsservctl | ❌ | |
| 181 | getpmsg | ❌ | |
| 182 | putpmsg | ❌ | |
| 183 | afs_syscall | ❌ | |
| 184 | tuxcall | ❌ | |
| 185 | security | ❌ | |
| 186 | gettid | ✅ | |
| 187 | readahead | ❌ | |
| 188 | setxattr | ✅ | |
| 189 | lsetxattr | ✅ | |
| 190 | fsetxattr | ✅ | |
| 191 | getxattr | ✅ | |
| 192 | lgetxattr | ✅ | |
| 193 | fgetxattr | ✅ | |
| 194 | listxattr | ✅ | |
| 195 | llistxattr | ✅ | |
| 196 | flistxattr | ✅ | |
| 197 | removexattr | ✅ | |
| 198 | lremovexattr | ✅ | |
| 199 | fremovexattr | ✅ | |
| 200 | tkill | ❌ | |
| 201 | time | ✅ | |
| 202 | futex | ✅ | |
| 203 | sched_setaffinity | ✅ | |
| 204 | sched_getaffinity | ✅ | |
| 205 | set_thread_area | ❌ | |
| 206 | io_setup | ❌ | |
| 207 | io_destroy | ❌ | |
| 208 | io_getevents | ❌ | |
| 209 | io_submit | ❌ | |
| 210 | io_cancel | ❌ | |
| 211 | get_thread_area | ❌ | |
| 212 | lookup_dcookie | ❌ | |
| 213 | epoll_create | ✅ | |
| 214 | epoll_ctl_old | ❌ | |
| 215 | epoll_wait_old | ❌ | |
| 216 | remap_file_pages | ❌ | |
| 217 | getdents64 | ✅ | |
| 218 | set_tid_address | ✅ | |
| 219 | restart_syscall | ❌ | |
| 220 | semtimedop | ✅ | |
| 221 | fadvise64 | ✅ | |
| 222 | timer_create | ✅ | |
| 223 | timer_settime | ✅ | |
| 224 | timer_gettime | ✅ | |
| 225 | timer_getoverrun | ❌ | |
| 226 | timer_delete | ✅ | |
| 227 | clock_settime | ❌ | |
| 228 | clock_gettime | ✅ | |
| 229 | clock_getres | ❌ | |
| 230 | clock_nanosleep | ✅ | |
| 231 | exit_group | ✅ | |
| 232 | epoll_wait | ✅ | |
| 233 | epoll_ctl | ✅ | |
| 234 | tgkill | ✅ | |
| 235 | utimes | ✅ | |
| 236 | vserver | ❌ | |
| 237 | mbind | ❌ | |
| 238 | set_mempolicy | ❌ | |
| 239 | get_mempolicy | ❌ | |
| 240 | mq_open | ❌ | |
| 241 | mq_unlink | ❌ | |
| 242 | mq_timedsend | ❌ | |
| 243 | mq_timedreceive | ❌ | |
| 244 | mq_notify | ❌ | |
| 245 | mq_getsetattr | ❌ | |
| 246 | kexec_load | ❌ | |
| 247 | waitid | ✅ | |
| 248 | add_key | ❌ | |
| 249 | request_key | ❌ | |
| 250 | keyctl | ❌ | |
| 251 | ioprio_set | ✅ | |
| 252 | ioprio_get | ✅ | |
| 253 | inotify_init | ❌ | |
| 254 | inotify_add_watch | ❌ | |
| 255 | inotify_rm_watch | ❌ | |
| 256 | migrate_pages | ❌ | |
| 257 | openat | ✅ | [⚠️](limitations-on-system-calls/file-and-directory-operations.md#open-and-openat) |
| 258 | mkdirat | ✅ | |
| 259 | mknodat | ✅ | |
| 260 | fchownat | ✅ | |
| 261 | futimesat | ✅ | |
| 262 | newfstatat | ✅ | |
| 263 | unlinkat | ✅ | |
| 264 | renameat | ✅ | |
| 265 | linkat | ✅ | |
| 266 | symlinkat | ✅ | |
| 267 | readlinkat | ✅ | |
| 268 | fchmodat | ✅ | |
| 269 | faccessat | ✅ | |
| 270 | pselect6 | ✅ | |
| 271 | ppoll | ✅ | |
| 272 | unshare | ❌ | |
| 273 | set_robust_list | ✅ | |
| 274 | get_robust_list | ❌ | |
| 275 | splice | ❌ | |
| 276 | tee | ❌ | |
| 277 | sync_file_range | ❌ | |
| 278 | vmsplice | ❌ | |
| 279 | move_pages | ❌ | |
| 280 | utimensat | ✅ | |
| 281 | epoll_pwait | ✅ | |
| 282 | signalfd | ✅ | |
| 283 | timerfd_create | ✅ | |
| 284 | eventfd | ✅ | |
| 285 | fallocate | ✅ | |
| 286 | timerfd_settime | ✅ | |
| 287 | timerfd_gettime | ✅ | |
| 288 | accept4 | ✅ | |
| 289 | signalfd4 | ✅ | |
| 290 | eventfd2 | ✅ | |
| 291 | epoll_create1 | ✅ | |
| 292 | dup3 | ✅ | |
| 293 | pipe2 | ✅ | |
| 294 | inotify_init1 | ❌ | |
| 295 | preadv | ✅ | |
| 296 | pwritev | ✅ | |
| 297 | rt_tgsigqueueinfo | ❌ | |
| 298 | perf_event_open | ❌ | |
| 299 | recvmmsg | ❌ | |
| 300 | fanotify_init | ❌ | |
| 301 | fanotify_mark | ❌ | |
| 302 | prlimit64 | ✅ | |
| 303 | name_to_handle_at | ❌ | |
| 304 | open_by_handle_at | ❌ | |
| 305 | clock_adjtime | ❌ | |
| 306 | syncfs | ❌ | |
| 307 | sendmmsg | ❌ | |
| 308 | setns | ❌ | |
| 309 | getcpu | ✅ | |
| 310 | process_vm_readv | ❌ | |
| 311 | process_vm_writev | ❌ | |
| 312 | kcmp | ❌ | |
| 313 | finit_module | ❌ | |
| 314 | sched_setattr | ✅ | [⚠️](limitations-on-system-calls/process-and-thread-management.md#sched_getattr-and-sched_setattr) |
| 315 | sched_getattr | ✅ | [⚠️](limitations-on-system-calls/process-and-thread-management.md#sched_getattr-and-sched_setattr) |
| 318 | getrandom | ✅ | |
| 319 | memfd_create | ✅ | |
| 322 | execveat | ✅ | |
| 327 | preadv2 | ✅ | |
| 328 | pwritev2 | ✅ | |
| 332 | statx | ✅ | |
| 434 | pidfd_open | ✅ | |
| 435 | clone3 | ✅ | |
| 436 | close_range | ✅ | |
| 439 | faccessat2 | ✅ | |
| 441 | epoll_pwait2 | ✅ | |
## File Systems
Here is the list of supported file systems:
* Devfs
* Devpts
* Ext2
* Procfs
* Ramfs
## Sockets
Here is the list of supported socket types:
* TCP sockets over IPv4
* UDP sockets over IPv4
* Unix sockets
## vDSO
Here is the list of supported symbols in vDSO:
* `__vdso_clock_gettime`
* `__vdso_gettimeofday`
* `__vdso_time`
## Boot Protocols
Here is the list of supported boot protocols:
* [Multiboot](https://www.gnu.org/software/grub/manual/multiboot/multiboot.html)
* [Multiboot2](https://www.gnu.org/software/grub/manual/multiboot2/multiboot.html)
* [Linux 32-bit boot protocol](https://www.kernel.org/doc/html/v5.4/x86/boot.html#bit-boot-protocol)
* [Linux EFI handover](https://www.kernel.org/doc/html/v5.4/x86/boot.html#efi-handover-protocol)

@ -0,0 +1,21 @@
# Limitations on System Calls
This section documents known limitations of Asterinas's implementation of Linux system calls.
It introduces [**System Call Matching Language (SCML)**](system-call-matching-language.md),
a lightweight domain-specific language for
specifying allowed and disallowed patterns of system-call invocations.
The rest of this section uses SCML
to accurately and concisely describe
both supported and unsupported functionality of system calls,
which are divided into the following categories:
* [Process and thread management](process-and-thread-management.md)
* [Memory management](memory-management.md)
* [File & directory operations](file-and-directory-operations.md)
* [File systems & mount control](file-systems-and-mount-control.md)
* [File descriptor & I/O control](file-descriptor-and-io-control.md)
* [Inter-process communication](inter-process-communication.md)
* [Networking & sockets](networking-and-sockets.md)
* [Signals & timers](signals-and-timers.md)
* [Namespaces, cgroups & security](namespaces-cgroups-and-security.md)
* [System information & misc](system-information-and-misc.md)

@ -0,0 +1,82 @@
# File and Directory Operations
<!--
Put system calls such as
open, openat, creat, close, read, write, readv, writev, pread64,
pwrite64, lseek, stat, fstat, lstat, statx, mkdir, rmdir, link,
unlink, rename, symlink, readlink, chmod, fchmod, chown, fchown,
utime, and utimensat
under this category.
-->
## `open` and `openat`
Supported functionality of `open` in SCML:
```c
access_mode =
    O_RDONLY |
    O_WRONLY |
    O_RDWR;
creation_flags =
    O_CLOEXEC |
    O_DIRECTORY |
    O_EXCL |
    O_NOCTTY |
    O_NOFOLLOW |
    O_TRUNC;
status_flags =
    O_APPEND |
    O_ASYNC |
    O_DIRECT |
    O_LARGEFILE |
    O_NOATIME |
    O_NONBLOCK |
    O_SYNC;
// Open an existing file
open(
    path,
    flags = <access_mode> | <creation_flags> | <status_flags>
);
// Create a new file
open(
    path,
    flags = O_CREAT | <access_mode> | <creation_flags> | <status_flags>,
    mode
);
// Flags that are meaningful with O_PATH
opath_valid_flags = O_CLOEXEC | O_DIRECTORY | O_NOFOLLOW;
// All other flags are ignored with O_PATH
opath_ignored_flags = O_CREAT | <creation_flags> | <status_flags>;
// Obtain a file descriptor to indicate a location in FS
open(
    path,
    flags = O_PATH | <opath_valid_flags> | <opath_ignored_flags>
);
// Create an unnamed file
// open(path, flags = O_TMPFILE | <creation_flags> | <status_flags>)
```
Silently-ignored flags:
* `O_NOCTTY`
* `O_DSYNC`
* `O_SYNC`
* `O_LARGEFILE`
* `O_NOATIME`
Partially-supported flags:
* `O_PATH`
Unsupported flags:
* `O_TMPFILE`
The supported and unsupported functionality of `openat` is the same as that of `open`.
The SCML rules are omitted for brevity.
For more information,
see [the man page](https://man7.org/linux/man-pages/man2/openat.2.html).
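To make the first two rules concrete, here is a minimal Python sketch of calls that have the shape of "Create a new file" and "Open an existing file". It is an illustration only: it assumes a Linux-like host where Python's `os.open` forwards these flags to the underlying `open`/`openat` call, the file name `demo.txt` is made up, and it has not been checked against Asterinas itself.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "demo.txt")  # hypothetical file name

    # Shape of "Create a new file":
    # O_CREAT | <access_mode> | <creation_flags>, plus a mode argument.
    fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_EXCL | os.O_CLOEXEC, 0o644)
    os.write(fd, b"hello\n")
    os.close(fd)

    # Shape of "Open an existing file":
    # <access_mode> | <creation_flags>, with no mode argument.
    fd = os.open(path, os.O_RDONLY | os.O_CLOEXEC | os.O_NOFOLLOW)
    data = os.read(fd, 16)
    os.close(fd)
```

Both calls use only flags that appear in `access_mode` and `creation_flags` above, so they should fall within the documented rules.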

@ -0,0 +1,8 @@
# File Descriptor and I/O Control
<!--
Put system calls such as
dup, dup2, dup3, fcntl, ioctl, pipe, pipe2, splice, tee, vmsplice, sendfile,
eventfd, eventfd2, inotify_init, inotify_init1, inotify_add_watch, and inotify_rm_watch
under this category.
-->

@ -0,0 +1,9 @@
# File Systems & Mount Control
<!--
Put system calls such as
mount, umount2, pivot_root, statfs, fstatfs, truncate, ftruncate, fsync,
fdatasync, sync, syncfs, sync_file_range, open_tree, move_mount, fsopen,
fsconfig, fsmount, and fspick
under this category.
-->

@ -0,0 +1,8 @@
# Inter-Process Communication
<!--
Put system calls such as
msgget, msgsnd, msgrcv, msgctl, semget, semop, semctl, shmget, shmat, shmctl,
futex, set_robust_list, and get_robust_list
under this category.
-->

@ -0,0 +1,67 @@
# Memory Management
<!--
Put system calls such as
brk, mmap, munmap, mprotect, mremap, msync, mincore, madvise,
shmget, shmat, shmctl, mlock, munlock, mbind, and set_mempolicy
under this part.
-->
## `mmap`
Supported functionality in SCML:
```c
prot =
    PROT_NONE |
    PROT_EXEC |
    PROT_READ |
    PROT_WRITE;
opt_flags =
    MAP_ANONYMOUS |
    MAP_FIXED |
    MAP_FIXED_NOREPLACE |
    MAP_GROWSDOWN |
    MAP_HUGETLB |
    MAP_LOCKED |
    MAP_NONBLOCK |
    MAP_NORESERVE |
    MAP_POPULATE |
    MAP_SYNC;
// Create a private memory mapping
mmap(
    addr, length,
    prot = <prot>,
    flags = MAP_PRIVATE | <opt_flags>,
    fd, offset
);
// Create a shared memory mapping
mmap(
    addr, length,
    prot = <prot>,
    flags = MAP_SHARED | MAP_SHARED_VALIDATE | <opt_flags>,
    fd, offset
);
```
Silently-ignored flags:
* `MAP_HUGETLB`
* `MAP_GROWSDOWN`
* `MAP_LOCKED`
* `MAP_NONBLOCK`
* `MAP_NORESERVE`
* `MAP_POPULATE`
* `MAP_SYNC`
Partially-supported flags:
* `MAP_FIXED_NOREPLACE` is treated as `MAP_FIXED`
Unsupported flags:
* `MAP_32BIT`
* `MAP_HUGE_1GB`
* `MAP_HUGE_2MB`
* `MAP_UNINITIALIZED`
For more information,
see [the man page](https://man7.org/linux/man-pages/man2/mmap.2.html).
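As an illustration of the "private memory mapping" rule, the following Python sketch requests a private anonymous mapping through the standard `mmap` module. It assumes a Unix-like host where `mmap.MAP_PRIVATE` and `mmap.MAP_ANONYMOUS` are available; the 8192-byte length is arbitrary, and the sketch has not been verified on Asterinas.

```python
import mmap

# Shape of "Create a private memory mapping":
# prot = PROT_READ | PROT_WRITE, flags = MAP_PRIVATE | MAP_ANONYMOUS, fd = -1.
region = mmap.mmap(
    -1,
    8192,
    flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
    prot=mmap.PROT_READ | mmap.PROT_WRITE,
)
region[:5] = b"hello"      # write through the mapping
first_bytes = region[:5]   # read the bytes back
region.close()
```

The call combines `MAP_PRIVATE` with a flag from `opt_flags`, so it has the form that the first rule describes.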

@ -0,0 +1,9 @@
# Namespaces, Cgroups & Security
<!--
Put system calls such as
unshare, setns, clone (with namespace flags), chroot, pivot_root, prctl,
capset, seccomp, landlock_create_ruleset, landlock_add_rule,
landlock_restrict_self, and bpf
under this category.
-->

@ -0,0 +1,49 @@
# Networking & Sockets
<!--
Put system calls such as
socket, socketpair, bind, listen, accept, connect, getsockname, getpeername,
sendto, recvfrom, sendmsg, recvmsg, shutdown, setsockopt, getsockopt,
sendmmsg, recvmmsg, accept4, and socketcall
under this category.
-->
## `socket`
```c
// Optional flags for socket type
opt_type_flags = SOCK_NONBLOCK | SOCK_CLOEXEC;
// Create a UNIX socket
socket(
    family = AF_UNIX,
    type = SOCK_STREAM | SOCK_SEQPACKET | <opt_type_flags>,
    protocol = 0
);
// Create an IPv4 socket (TCP or UDP)
socket(
    family = AF_INET,
    type = SOCK_STREAM | SOCK_DGRAM | <opt_type_flags>,
    protocol = IPPROTO_IP | IPPROTO_TCP | IPPROTO_UDP
);
// Create a netlink socket
socket(
    family = AF_NETLINK,
    type = SOCK_RAW | SOCK_DGRAM | <opt_type_flags>,
    protocol = NETLINK_ROUTE | NETLINK_KOBJECT_UEVENT
);
// Create a VSOCK socket
socket(
    family = AF_VSOCK,
    type = SOCK_STREAM | <opt_type_flags>,
    protocol = 0
);
```
For more information,
see [the man page](https://man7.org/linux/man-pages/man2/socket.2.html).
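The following Python sketch creates two sockets whose arguments have the shape of the UNIX and IPv4 rules above. It assumes a Linux host, where the `socket` module exposes `SOCK_NONBLOCK` and `SOCK_CLOEXEC`; the variable names are illustrative, and the behavior on Asterinas has not been verified here.

```python
import socket

# Same role as opt_type_flags in the rules above.
opt_type_flags = socket.SOCK_NONBLOCK | socket.SOCK_CLOEXEC

# Shape of "Create a UNIX socket": AF_UNIX, SOCK_STREAM | <opt_type_flags>, protocol 0.
unix_sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM | opt_type_flags, 0)

# Shape of "Create an IPv4 socket": AF_INET, SOCK_DGRAM | <opt_type_flags>, IPPROTO_UDP.
udp_sock = socket.socket(
    socket.AF_INET, socket.SOCK_DGRAM | opt_type_flags, socket.IPPROTO_UDP
)

unix_is_nonblocking = not unix_sock.getblocking()
udp_family = udp_sock.family

unix_sock.close()
udp_sock.close()
```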

@ -0,0 +1,66 @@
# Process & Thread Management
<!--
Put system calls such as
fork, vfork, clone, execve, exit, exit_group, wait4, waitid,
getpid, getppid, gettid, setuid, setgid, getuid, getgid, and prctl
under this category.
-->
## `sched_getattr` and `sched_setattr`
Supported functionality in SCML:
```c
// Get the scheduling policy of a "normal" thread
sched_getattr(
    pid,
    attr = {
        sched_policy = SCHED_OTHER | SCHED_BATCH | SCHED_IDLE,
        sched_flags = 0,
        ..
    },
    flags = 0
);
// Set the scheduling policy of a "normal" thread
sched_setattr(
    pid,
    attr = {
        sched_policy = SCHED_OTHER | SCHED_BATCH | SCHED_IDLE,
        sched_flags = 0,
        ..
    },
    flags = 0
);
// Get the scheduling policy of a real-time thread
sched_getattr(
    pid,
    attr = {
        sched_policy = SCHED_FIFO | SCHED_RR,
        sched_flags = 0,
        ..
    },
    flags = 0
);
// Set the scheduling policy of a real-time thread
sched_setattr(
    pid,
    attr = {
        sched_policy = SCHED_FIFO | SCHED_RR,
        sched_flags = 0,
        ..
    },
    flags = 0
);
```
Unsupported scheduling policies:
* `SCHED_DEADLINE`
Unsupported scheduling flags:
* `SCHED_FLAG_RESET_ON_FORK`
* `SCHED_FLAG_RECLAIM`
* `SCHED_FLAG_DL_OVERRUN`
* `SCHED_FLAG_UTIL_CLAMP_MIN`
* `SCHED_FLAG_UTIL_CLAMP_MAX`
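The rules and lists above reduce to a small check, sketched below in Python. This is a reading aid rather than Asterinas code: `sched_attr_supported` is a hypothetical helper, and it assumes the `sched_attr` struct has already been decoded into a dict of symbolic names, roughly as strace would print it.

```python
# Policies taken from the SCML rules above.
NORMAL_POLICIES = {"SCHED_OTHER", "SCHED_BATCH", "SCHED_IDLE"}
REALTIME_POLICIES = {"SCHED_FIFO", "SCHED_RR"}


def sched_attr_supported(attr, flags=0):
    """Return True if a decoded sched_attr dict fits the rules above."""
    policy = attr.get("sched_policy")
    return (
        policy in NORMAL_POLICIES | REALTIME_POLICIES
        and attr.get("sched_flags", 0) == 0  # no SCHED_FLAG_* bits are accepted
        and flags == 0                       # the trailing flags argument must be 0
    )
```

Fields other than `sched_policy` and `sched_flags` are ignored, mirroring the `..` wildcard in the rules.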

@ -0,0 +1,10 @@
# Signals & Timers
<!--
Put system calls such as
rt_sigaction, rt_sigprocmask, rt_sigpending, rt_sigqueueinfo, rt_tgsigqueueinfo,
rt_sigreturn, kill, tkill, tgkill, alarm, setitimer, getitimer, nanosleep,
timer_create, timer_settime, timer_gettime, and timer_delete
under this category.
-->

@ -0,0 +1,232 @@
# System Call Matching Language (SCML)
SCML specifies matching patterns for system-call invocations.
Asterinas developers can easily write SCML rules to describe supported patterns.
Likewise, users and developers can intuitively read these rules
to understand which system calls and features are available.
SCML is designed to integrate seamlessly with
[strace](https://man7.org/linux/man-pages/man1/strace.1.html),
the standard Linux system-call tracer.
Strace emits each invocation in a C-style syntax;
given a set of SCML rules,
a tool can automatically determine
whether a strace log entry conforms to the supported patterns.
This paves the way for an SCML-based analyzer
that reports unsupported calls in any application's trace.
## Strace: A Quick Example
To illustrate, run strace on a simple "Hello, World!" program:
```bash
$ strace ./hello_world
```
A typical trace might look like this:
```shell
execve("./hello_world", ["./hello_world"], 0xffffffd3f710 /* 4 vars */) = 0
brk(NULL) = 0xaaaabdc1b000
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xffff890f4000
openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0\267\0\1\0\0\0\360\206\2\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1722920, ...}) = 0
write(1, "Hello, World!\n", 14) = 14
exit_group(0) = ?
```
Key points of this output:
* System calls are rendered as `name(arg1, …, argN)`.
* Flags appear as `FLAG1|FLAG2|…|FLAGN`.
* Structs use `{field1=value1, …}`.
* Arrays are shown as `[value1, …]`.
SCML's syntax draws directly from these conventions.
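These conventions are regular enough to split mechanically. The Python sketch below is not part of strace or SCML; `split_top_level` and `parse_strace_line` are hypothetical helpers, and they only handle simple single-line entries like the ones in the trace above.

```python
import re


def split_top_level(text):
    """Split an argument list on commas that are outside strings, [], {} and ()."""
    parts, current, depth = [], [], 0
    in_string, escaped = False, False
    for ch in text:
        if in_string:
            current.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch in "[{(":
            depth += 1
        elif ch in "]})":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    if current:
        parts.append("".join(current).strip())
    return parts


def parse_strace_line(line):
    """Return (name, args, result) for one strace line, or None if it does not parse."""
    match = re.match(r"(\w+)\((.*)\)\s*=\s*(.+)$", line.strip())
    if match is None:
        return None
    name, arg_text, result = match.groups()
    return name, split_top_level(arg_text), result
```

A flag argument such as `O_RDONLY|O_CLOEXEC` can then be split on `|` to obtain the individual flag names.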
## SCML by Example
SCML is intentionally simple:
most Linux system-call semantics hinge on bitflags.
SCML rules act as templates:
you define a rule once,
and a human or an analyzer uses it to check whether a syscall invocation matches.
Imagine you're developing a Linux-compatible OS (like Asterinas)
that supports just a restricted subset of syscalls and their options.
We will use SCML to describe the restricted functionality.
### Matching Rules for System Calls
For example,
suppose your OS supports the [`open`](https://man7.org/linux/man-pages/man2/openat.2.html) system call
with one or more of the four flags `O_RDONLY`, `O_WRONLY`, `O_RDWR`, and `O_CLOEXEC`.
This constraint can be expressed in the following system call matching rule:
```c
open(path, flags = O_RDONLY | O_WRONLY | O_RDWR | O_CLOEXEC);
```
To allow file creation,
you add another matching rule that
includes the `O_CREAT` flag and requires a `mode` argument:
```c
open(path, flags = O_CREAT | O_RDONLY | O_WRONLY | O_RDWR | O_CLOEXEC, mode);
```
To support the `O_PATH` flag
(only valid with `O_CLOEXEC`, not with `O_RDONLY`, `O_WRONLY`, or `O_RDWR`),
you add a third matching rule:
```c
open(path, flags = O_PATH | O_CLOEXEC);
```
SCML rules constrain only the flagged arguments;
other parameters (like `path` and `mode`) accept any value.
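To show how an analyzer might apply these three rules, here is a minimal Python sketch. It assumes that a rule matches when the invocation has the same number of arguments and its flags are a subset of the flags the rule lists; the `RULES` table and `is_supported` are hypothetical names, not part of any existing tool.

```python
# Hypothetical rule table: (syscall name, argument count, allowed flag names).
RULES = [
    ("open", 2, {"O_RDONLY", "O_WRONLY", "O_RDWR", "O_CLOEXEC"}),
    ("open", 3, {"O_CREAT", "O_RDONLY", "O_WRONLY", "O_RDWR", "O_CLOEXEC"}),
    ("open", 2, {"O_PATH", "O_CLOEXEC"}),
]


def is_supported(name, args, flags_index=1):
    """Return True if the invocation matches at least one rule.

    `args` holds the arguments as strace-style strings; the argument at
    `flags_index` is a `|`-separated list of flag names. Unflagged
    arguments such as `path` and `mode` are not inspected.
    """
    flags = set(args[flags_index].split("|"))
    return any(
        name == rule_name and len(args) == arity and flags <= allowed
        for rule_name, arity, allowed in RULES
    )
```

With this table, `O_PATH|O_RDWR` is rejected because no single rule lists both flags, which is the behavior the third rule is meant to capture.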
### C-Style Comments
SCML also supports C-style comments:
```c
// All matching rules for the open syscall.
// A supported invocation of the open syscall must match at least one of the rules.
open(path, flags = O_RDONLY | O_WRONLY | O_RDWR | O_CLOEXEC);
open(path, flags = O_CREAT | O_RDONLY | O_WRONLY | O_RDWR | O_CLOEXEC, mode);
open(path, flags = O_PATH | O_CLOEXEC);
```
### Matching Rules for Bitflags
Above, we embedded flag combinations directly within individual system-call rules,
which can lead to duplication and make maintenance harder.
SCML allows you to define named bitflag rules that
can be reused across multiple rules.
This reduces repetition and centralizes your flag definitions.
For example:
```c
// Define a reusable bitflags rule
access_mode = O_RDONLY | O_WRONLY | O_RDWR;
open(path, flags = <access_mode> | O_CLOEXEC);
open(path, flags = O_CREAT | <access_mode> | O_CLOEXEC, mode);
open(path, flags = O_PATH | O_CLOEXEC);
```
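One way to picture a named bitflags rule is as a substitution that happens before matching. The Python sketch below expands `<name>` references into a flat set of flag names; `NAMED_RULES` and `expand_flags` are hypothetical, and the sketch assumes that references never form a cycle.

```python
# Hypothetical store of named bitflags rules, keyed by rule name.
NAMED_RULES = {
    "access_mode": "O_RDONLY | O_WRONLY | O_RDWR",
}


def expand_flags(expr, named=NAMED_RULES):
    """Expand an SCML flag expression into a flat set of flag names."""
    flags = set()
    for term in expr.split("|"):
        term = term.strip()
        if term.startswith("<") and term.endswith(">"):
            # A `<name>` term refers to another bitflags rule; expand it recursively.
            flags |= expand_flags(named[term[1:-1]], named)
        else:
            flags.add(term)
    return flags
```

After expansion, a rule that uses `<access_mode>` behaves exactly like the longer rule that spells out the three access-mode flags.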
### Matching Rules for Structs
SCML can match flags inside struct fields.
Consider [`sigaction`](https://man7.org/linux/man-pages/man2/sigaction.2.html):
```c
struct sigaction = {
    sa_flags = SA_NOCLDSTOP | SA_NOCLDWAIT,
    ..
};
```
Here, `..` is a wildcard for the remaining fields that we do not care about.
Then, we can write a system call rule that
refers to the struct rule using the `<struct_rule>` syntax.
```c
sigaction(signum, act = <sigaction>, oldact = <sigaction>);
```
### Matching Rules for Arrays
SCML can describe how to match flags embedded inside the struct values of an array.
This is the case for the [`poll`](https://man7.org/linux/man-pages/man2/poll.2.html) system call.
It takes an array of values of `struct pollfd`,
whose `events` and `revents` fields are bitflags.
```c
// Support all but the POLLPRI flag
events = POLLIN | POLLOUT | POLLRDHUP | POLLERR | POLLHUP | POLLNVAL;
struct pollfd = {
    events = <events>,
    revents = <events>,
    ..
};
poll(fds = [ <pollfd> ], nfds, timeout);
```
Notice how SCML denotes an array with the `[ <struct_rule> ]` syntax.
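Array matching can be read as "every element must match the struct rule". The Python sketch below illustrates that reading for `poll`; it assumes each `struct pollfd` has already been decoded into a dict of strace-style strings, and `flag_names`, `struct_matches`, and `array_matches` are hypothetical helpers.

```python
POLL_EVENTS = {"POLLIN", "POLLOUT", "POLLRDHUP", "POLLERR", "POLLHUP", "POLLNVAL"}

# Struct rule: constrained fields map to their allowed flags.
# Fields that are not listed (such as fd) fall under the `..` wildcard.
POLLFD_RULE = {"events": POLL_EVENTS, "revents": POLL_EVENTS}


def flag_names(text):
    """Split a `|`-separated flag string into a set of names."""
    return {flag for flag in text.split("|") if flag}


def struct_matches(value, rule):
    """A struct matches if every constrained field uses only allowed flags."""
    return all(
        flag_names(value.get(field, "")) <= allowed
        for field, allowed in rule.items()
    )


def array_matches(values, rule):
    """An array matches if each of its elements matches the struct rule."""
    return all(struct_matches(value, rule) for value in values)
```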
### Advanced Usage
Just as you can write multiple rules for the same system call,
you may define multiple rules for the same struct:
```c
// Rules for control message header
struct cmsghdr = {
    cmsg_level = SOL_SOCKET,
    cmsg_type = SO_TIMESTAMP_OLD | SCM_RIGHTS | SCM_CREDENTIALS,
    ..
};
struct cmsghdr = {
    cmsg_level = SOL_IP,
    cmsg_type = IP_TTL,
    ..
};
```
A `cmsghdr` value matches if it satisfies any one rule.
Struct rules may also be nested:
```c
// Rule for message header, which refers to the rules for control message header
struct msghdr = {
    msg_control = [ <cmsghdr> ],
    ..
};
recvmsg(socket, message = <msghdr>, flags);
```
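The two ideas in this subsection, alternative rules for one struct and rules nested inside other rules, can be sketched in Python as follows. The sketch assumes decoded structs are dicts whose `cmsg_level` and `cmsg_type` fields each hold a single symbolic name; all helper names are hypothetical.

```python
# Two alternative rules for struct cmsghdr; a value needs to satisfy only one.
CMSGHDR_RULES = [
    {
        "cmsg_level": {"SOL_SOCKET"},
        "cmsg_type": {"SO_TIMESTAMP_OLD", "SCM_RIGHTS", "SCM_CREDENTIALS"},
    },
    {"cmsg_level": {"SOL_IP"}, "cmsg_type": {"IP_TTL"}},
]


def matches_rule(value, rule):
    """Fields absent from the rule are covered by the `..` wildcard."""
    return all(value.get(field) in allowed for field, allowed in rule.items())


def cmsghdr_supported(cmsg):
    """A control message header matches if any one cmsghdr rule accepts it."""
    return any(matches_rule(cmsg, rule) for rule in CMSGHDR_RULES)


def msghdr_supported(msg):
    """Nested rule: every element of msg_control must be a supported cmsghdr."""
    return all(cmsghdr_supported(cmsg) for cmsg in msg.get("msg_control", []))
```

Note that the level and type are checked together within one rule, so `SOL_IP` paired with `SCM_RIGHTS` is rejected even though each name appears in some rule.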
## Formal Syntax
Below is the formal syntax of SCML,
expressed in Extended Backus-Naur Form (EBNF).
Nonterminals are in angle brackets, terminals in quotes.
```
<scml>          ::= { <rule> }
<rule>          ::= <syscall-rule> ';'
                  | <struct-rule> ';'
                  | <bitflags-rule> ';'
<syscall-rule>  ::= <identifier> '(' [ <param-list> ] ')'
<param-list>    ::= <param> { ',' <param> }
<param>         ::= <identifier> '=' <expr>
                  | <identifier>
<expr>          ::= <expr> '|' <expr>
                  | <term>
<term>          ::= <identifier>
                  | '<' <identifier> '>'
<array>         ::= '[' '<' <identifier> '>' ']'
<struct-rule>   ::= 'struct' <identifier> '=' '{' <field-list> [ ',' '..' ] '}'
<field-list>    ::= <field> { ',' <field> }
<field>         ::= <identifier>
                  | <identifier> '=' <expr>
                  | <identifier> '=' <array>
<bitflags-rule> ::= <identifier> '=' <expr>
<identifier>    ::= letter { letter | digit | '_' }
comment         ::= '//' { any-char }
```
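As a small worked instance of the grammar, the Python sketch below recognizes only the `<bitflags-rule> ';'` production, including `//` comments. It is a regex-based illustration under that narrow scope, not a full SCML parser; `parse_bitflags_rule` is a hypothetical name.

```python
import re

# <identifier> ::= letter { letter | digit | '_' }
IDENT = r"[A-Za-z][A-Za-z0-9_]*"
# <term> ::= <identifier> | '<' <identifier> '>'
TERM = rf"(?:{IDENT}|<{IDENT}>)"
# <bitflags-rule> ';' where <expr> is a '|'-separated sequence of terms.
BITFLAGS_RULE = re.compile(
    rf"\s*({IDENT})\s*=\s*({TERM}(?:\s*\|\s*{TERM})*)\s*;\s*"
)


def parse_bitflags_rule(text):
    """Return (name, [terms]) for a single bitflags rule, or None if it does not parse."""
    text = re.sub(r"//[^\n]*", "", text)  # drop C-style line comments
    match = BITFLAGS_RULE.fullmatch(text)
    if match is None:
        return None
    name, expr = match.groups()
    return name, [term.strip() for term in expr.split("|")]
```

Struct rules and system-call rules would need a real recursive parser; the regex approach is used here only because a bitflags rule has no nesting.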

@ -0,0 +1,9 @@
# System Information & Misc.
<!--
Put system calls such as
uname, getrlimit, setrlimit, sysinfo, times, gettimeofday, clock_gettime,
clock_settime, getrusage, getdents, getdents64, personality, syslog,
arch_prctl, set_tid_address, and getrandom
under this category.
-->