System Calls: Overview#

System Call? Function?#

  • The kernel is not a library ⟶ no direct function calls, but rather System Calls.

  • Entry points into the kernel

  • Every system call has a unique number and a fixed set of parameters, passed in defined registers (ABI)

  • Changes context from user mode to kernel mode

  • Implementation is CPU specific (software interrupt …)

  • Numbers, parameters, etc. are Linux specific (the numbers also differ between CPU architectures)

  • ⟶ “kernel acts on behalf of a process”, but with a different address space

[Figure: syscall.svg]

Syscall Wrappers (C-Library)#

  • Most system calls are wrapped by thin functions provided by the C-library (libc.so)

#include <fcntl.h>
#include <unistd.h>

int main()
{
    int fd = open("/etc/passwd", O_RDONLY);
    // ... do something with fd
    close(fd);
    return 0;
}
  • Where the C library provides no wrapper ⟶ call the syscall() function directly

#include <unistd.h>
#include <sys/syscall.h>                               // <-- SYS_* macros

int main()
{
    pid_t tid = syscall(SYS_gettid);
    (void)tid; // ... or do something with it
    return 0;
}

Return Value, And Errors#

  • Return type is mostly int (mmap() is one exception: it returns a pointer, and MAP_FAILED on error)

  • Return value >= 0 ⟶ no error

  • Return value == -1 ⟶ error

    • errno (declared in <errno.h>, one instance per thread) is set to indicate the cause

#include <fcntl.h>
#include <unistd.h>
#include <cerrno>                                      // <-- errno
#include <print>

int main()
{
    int fd = open("/etc/passwd", O_WRONLY);            // <-- only root can do this
    if (fd == -1) {                                    // <-- "open failed"
        std::println("open failed with errno {}", 
                     errno);                           // <-- here's how it failed
        return 1;
    }
    return 0;
}
$ ./sysprog-syscall-error
open failed with errno 13
From <errno.h>
...
#define      EACCES          13      /* Permission denied */
...

strace: System Call Tracer#

  • Traces system calls (and, in the opposite direction, signals delivered to the process)

  • ⟶ “sniffer” at the user/kernel boundary

  • Very informative, especially when something goes wrong

$ strace ./sysprog-syscall-open-example
...
brk(NULL)                               = 0x2da92000   # <--
brk(0x2dab3000)                         = 0x2dab3000   # <-- end of program loading
openat(AT_FDCWD, "/etc/passwd", O_RDONLY) = 3          # <-- open("/etc/passwd", O_RDONLY)
close(3)                                = 0            # <-- close(fd)
exit_group(0)                           = ?            # <-- program teardown
+++ exited with 0 +++

System Call Wrappers, And “Normal” Library Functions#

  • For example, what is the difference between open() and fopen()?

  • Answer:

    • open() is documented in section 2 of the manual ⟶ system call wrapper

    • fopen() is documented in section 3 of the manual ⟶ library function

  • open() shows up in strace output (as openat()); fopen() itself doesn’t, only the system calls it makes internally

From man -s 1 man

1   Executable programs or shell commands
2   System calls (functions provided by the kernel)
3   Library calls (functions within program libraries)
4   Special files (usually found in /dev)
5   File formats and conventions, e.g. /etc/passwd
6   Games
7   Miscellaneous (including macro packages and conventions), e.g. man(7), groff(7), man-pages(7)
8   System administration commands (usually only for root)
9   Kernel routines [Non standard]