Unveiling the Mechanics of Linux System Calls: A Deep Dive

Published On Tue Oct 15 2024

In the first article of this two-part series, published in the August 2024 issue of OSFY, we discussed the role of the C library in system call execution. We saw how the C library loads the system call arguments into architecture-specific registers and executes the syscall instruction, which switches the processor from user mode to kernel mode. In this final part, we discuss what happens next: how the kernel handles and executes the system call on behalf of the user space application, and how it sends the return value back to that application.
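As a brief recap of the user space side, the sketch below shows roughly what a C library wrapper does on x86-64. The helper name raw_syscall3 is hypothetical and is not a glibc function; the register assignments (number in RAX, arguments in RDI, RSI and RDX) follow the x86-64 Linux convention, and other architectures fall back to the generic syscall() wrapper.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>   /* SYS_getpid and other system call numbers */
#include <unistd.h>        /* syscall(), getpid() */

/*
 * raw_syscall3 is a hypothetical helper written for illustration.
 * On x86-64 it puts the system call number in RAX and up to three
 * arguments in RDI, RSI and RDX, then executes the syscall instruction.
 * The CPU overwrites RCX and R11 while executing syscall, so both are
 * listed as clobbered.
 */
static long raw_syscall3(long nr, long a1, long a2, long a3)
{
#if defined(__x86_64__)
    long ret;
    __asm__ volatile("syscall"
                     : "=a"(ret)
                     : "a"(nr), "D"(a1), "S"(a2), "d"(a3)
                     : "rcx", "r11", "memory");
    return ret; /* the kernel leaves its result in RAX */
#else
    /* Other architectures use different registers; defer to the C library. */
    return syscall(nr, a1, a2, a3);
#endif
}
```

Calling raw_syscall3(SYS_getpid, 0, 0, 0) should return the same process ID as getpid(), since both end up issuing the same system call.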

Understanding System Call Execution in the Kernel

Once the syscall instruction is executed, the processor switches to kernel mode and control passes to a kernel entry handler. But how does the processor know which entry handler to invoke for the syscall instruction, and how does that handler know which system call handler to invoke? The Linux kernel contains a table called the system call table, represented by the sys_call_table array. In the kernel source tree, this array is defined in arch/x86/kernel/syscall_64.c in older kernels and in arch/x86/entry/syscall_64.c in more recent ones.

A few points are worth noting about this table. The kernel maintains the sys_call_table array, which holds the address of each system call handler function, indexed by system call number. These handler functions are invoked when a user space application executes the syscall instruction.
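The idea of a table of handler addresses indexed by system call number can be modelled in user space. The following is a toy sketch, not kernel code: every name in it (toy_call_table, toy_dispatch, toy_add, toy_negate) is hypothetical, and the real kernel handlers have a different signature. The one detail borrowed from the kernel is that an unknown system call number yields -ENOSYS.

```c
#include <errno.h>   /* ENOSYS */
#include <stddef.h>  /* NULL */

/* Hypothetical handler type for this toy model only. */
typedef long (*toy_syscall_fn)(long a1, long a2, long a3);

/* Two made-up "system calls" that stand in for real handlers. */
static long toy_add(long a, long b, long c)    { (void)c; return a + b; }
static long toy_negate(long a, long b, long c) { (void)b; (void)c; return -a; }

enum { TOY_NR_ADD = 0, TOY_NR_NEGATE = 1 };

/* The table: slot N holds the address of the handler for call number N. */
static const toy_syscall_fn toy_call_table[] = {
    [TOY_NR_ADD]    = toy_add,
    [TOY_NR_NEGATE] = toy_negate,
};

#define TOY_NR_SYSCALLS (sizeof(toy_call_table) / sizeof(toy_call_table[0]))

/* Look up the handler by number and call it; reject unknown numbers. */
static long toy_dispatch(unsigned long nr, long a1, long a2, long a3)
{
    if (nr >= TOY_NR_SYSCALLS || toy_call_table[nr] == NULL)
        return -ENOSYS;
    return toy_call_table[nr](a1, a2, a3);
}
```

For example, toy_dispatch(TOY_NR_ADD, 2, 3, 0) returns 5, while toy_dispatch(99, 0, 0, 0) returns -ENOSYS because no handler is registered for that number.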

According to the Intel manual, SYSCALL invokes an OS system-call handler at privilege level 0. It does so by loading RIP from the IA32_LSTAR MSR. IA32_LSTAR is a model-specific register (MSR), one of a family of registers used for various control purposes in the x86 architecture.

The kernel writes to the IA32_LSTAR MSR the address of the kernel entry code that must run when a user space application issues a system call. This address is written during kernel initialization. When a system call arrives, the kernel executes this entry code, which saves the user space program context onto the kernel stack, prepares the stack frame for the system call handler, and finally invokes the system call handler.
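On a Linux machine, the value the kernel stored in this register can be inspected from user space. The sketch below assumes the msr kernel module is loaded and the program runs with sufficient privileges; it relies on the /dev/cpu/N/msr device, where the read offset selects the MSR number. The helper name read_msr is hypothetical, and 0xC0000082 is the architectural address of IA32_LSTAR.

```c
#define _GNU_SOURCE
#define _FILE_OFFSET_BITS 64   /* keep off_t wide enough for the MSR number */
#include <fcntl.h>     /* open(), O_RDONLY */
#include <stdint.h>    /* uint32_t, uint64_t */
#include <stdio.h>     /* snprintf() */
#include <unistd.h>    /* pread(), close() */

#define MSR_LSTAR 0xC0000082u  /* IA32_LSTAR: target RIP for 64-bit SYSCALL */

/*
 * read_msr is a hypothetical helper. It reads one 8-byte MSR on the given
 * CPU through /dev/cpu/<cpu>/msr and returns 0 on success, or -1 if the
 * device is missing, access is denied, or the read is short.
 */
static int read_msr(unsigned cpu, uint32_t msr, uint64_t *value)
{
    char path[64];
    snprintf(path, sizeof(path), "/dev/cpu/%u/msr", cpu);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* The file offset selects which MSR to read. */
    ssize_t n = pread(fd, value, sizeof(*value), (off_t)msr);
    close(fd);
    return n == (ssize_t)sizeof(*value) ? 0 : -1;
}
```

When read_msr(0, MSR_LSTAR, &value) succeeds, the value is a kernel virtual address; on many kernels it can be compared against the system call entry symbol listed in /proc/kallsyms. Without root or the msr module the helper simply returns -1.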

Handling Return Value and Resuming Execution

Once the system call handler finishes, control returns to the entry code in arch/x86/entry/entry_64.S, at the point just after the call into the system call handling path. The kernel then passes the return value of the system call back to the user program and resumes execution in user mode.
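From the application's point of view, the return path can be observed through the C library's error convention. A minimal sketch follows, assuming the usual Linux behaviour that the kernel reports failure as a negative error number in the range -4095 to -1, which the C library turns into a return value of -1 with errno set. Both helper names are hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>         /* errno, EBADF */
#include <sys/syscall.h>   /* SYS_close */
#include <unistd.h>        /* syscall() */

/*
 * libc_style_result is a hypothetical helper that models what a C library
 * wrapper does with the raw value handed back by the kernel: values in
 * [-4095, -1] are error numbers, anything else is a successful result.
 */
static long libc_style_result(long raw)
{
    if (raw < 0 && raw >= -4095) {
        errno = (int)-raw;
        return -1;
    }
    return raw;
}

/*
 * errno_from_bad_close issues close() on an invalid descriptor through the
 * generic syscall() wrapper and returns the errno the caller observes,
 * or 0 if the call unexpectedly succeeds.
 */
static int errno_from_bad_close(void)
{
    errno = 0;
    long rc = syscall(SYS_close, -1);
    return rc == -1 ? errno : 0;
}
```

Here libc_style_result(-EBADF) returns -1 and sets errno to EBADF, while libc_style_result(3) returns 3 unchanged. errno_from_bad_close() should report EBADF, since descriptor -1 is never valid.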

In conclusion, we have walked through the Linux system call execution model end to end: from the user space application and the syscall instruction issued by the C library to the handling of the system call inside the kernel.