Ftrace is a Linux kernel framework for tracing kernel functions. While building system activity monitoring that could block suspicious processes, our team found a new way to use it: ftrace lets you install hooks from a loadable GPL module without rebuilding the kernel. This approach works for Linux kernel versions 3.19 and higher on the x86_64 architecture.
This is the second part of our three-part series on hooking Linux kernel function calls. In this article, we explain how you can use ftrace to hook critical function calls in the Linux kernel. We also describe and test two theories for protecting a Linux kernel module from ftrace hooks. Read Hooking Linux Kernel Functions, Part 1: Looking for the Perfect Solution to learn more about other approaches that can be used for accomplishing this task.
A new approach: Using ftrace for Linux kernel hooking
What is ftrace? Basically, ftrace is a framework used for tracing the kernel on the function level. This framework has been in development since 2008 and has quite an impressive feature set. What data can you usually get when you trace your kernel functions with ftrace? Linux ftrace displays call graphs, tracks the frequency and length of function calls, filters particular functions by templates, and so on. Later in this article you’ll find references to official documents and sources you can use to learn more about the capabilities of ftrace.
The implementation of ftrace is based on the compiler options -pg and -mfentry. These compiler options insert a call to a special tracing function, mcount() or __fentry__(), at the beginning of every function. In user programs, profilers use this compiler capability for tracking calls of all functions. In the kernel, however, these functions are used for implementing the ftrace framework.
Calling ftrace from every function is, of course, pretty costly. This is why there’s an optimization available for popular architectures: dynamic ftrace. If ftrace isn’t in use, it has almost no effect on the system, because the kernel knows where the calls to mcount() or __fentry__() are located and replaces their machine code with nop (an instruction that does nothing) at an early stage. When Linux kernel tracing is turned on, the ftrace calls are added back to the functions that need them.
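The patching idea can be sketched in plain user-space C. This is only a model working on a byte buffer, not kernel code: the helper names are hypothetical, 0xe8 is the x86 near call opcode, and the five-byte nop shown is one common x86_64 encoding. Which nop the kernel picks and how it patches live code safely are outside this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* A call to __fentry__() occupies five bytes: opcode 0xe8 + rel32. */
#define CALL_SITE_SIZE 5

/* One common five-byte nop encoding on x86_64 (assumption: the kernel
 * may choose a different variant depending on the CPU). */
static const uint8_t nop5[CALL_SITE_SIZE] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };

static int is_call(const uint8_t *site)
{
    return site[0] == 0xe8;
}

static int is_nop(const uint8_t *site)
{
    return memcmp(site, nop5, CALL_SITE_SIZE) == 0;
}

/* Tracing off: overwrite the call with a nop of the same length. */
static void disable_site(uint8_t *site)
{
    memcpy(site, nop5, CALL_SITE_SIZE);
}

/* Tracing on: put the call back, with its relative target offset. */
static void enable_site(uint8_t *site, int32_t rel32)
{
    site[0] = 0xe8;
    memcpy(site + 1, &rel32, sizeof(rel32));
}
```

Because the nop has the same length as the call, a site can be switched back and forth without moving any of the surrounding code.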
Description of necessary functions
The following structure can be used for describing each hooked function:
There are only three fields that the user needs to fill in: name, function, and original. The rest of the fields are considered to be implementation details. You can put the description of all hooked functions together and use macros to make the code more compact:
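A minimal sketch of such a descriptor and macro, written as user-space C so that it compiles outside the kernel. Several things here are assumptions made for illustration: struct ftrace_ops_model stands in for the kernel's struct ftrace_ops, all hooks share one hook_fn signature (real handlers have different signatures), and the fh_* and real_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the kernel's struct ftrace_ops (assumption). */
struct ftrace_ops_model {
    void *func;
    unsigned long flags;
};

/* Simplification: every hooked function has the same signature here. */
typedef long (*hook_fn)(const char *arg);

struct ftrace_hook {
    const char *name;      /* set by the user: symbol to hook            */
    hook_fn function;      /* set by the user: our wrapper               */
    hook_fn *original;     /* set by the user: where the address of the
                              original function will be stored           */
    unsigned long address; /* implementation detail: resolved address    */
    struct ftrace_ops_model ops; /* implementation detail                */
};

/* Fills in the three user-visible fields; the rest stay zeroed. */
#define HOOK(_name, _function, _original) \
    { .name = (_name), .function = (_function), .original = (_original) }

/* Pointers to the original handlers, filled in during hook setup. */
static hook_fn real_sys_clone;
static hook_fn real_sys_execve;

/* Hypothetical wrappers: they only forward to the original here. */
static long fh_sys_clone(const char *arg)
{
    return real_sys_clone ? real_sys_clone(arg) : -1;
}

static long fh_sys_execve(const char *arg)
{
    return real_sys_execve ? real_sys_execve(arg) : -1;
}

static struct ftrace_hook hooked_functions[] = {
    HOOK("sys_clone",  fh_sys_clone,  &real_sys_clone),
    HOOK("sys_execve", fh_sys_execve, &real_sys_execve),
};
```

Adding a hook then takes one line in the table plus a wrapper and a pointer for the original function.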
This is what the hooked function wrapper looks like:
Now, hooked functions have a minimum of extra code. The only thing requiring special attention is the function signatures. They must be completely identical; otherwise, the arguments will be passed on incorrectly and everything will go wrong. This isn’t as important for hooking system calls, though, since their handlers are pretty stable and, for performance reasons, the system call ABI and function call ABI use the same layout of arguments in registers. However, if you’re going to hook other functions, remember that the kernel has no stable interfaces.
Our first step is finding and saving the hooked function address. As you probably know, when using ftrace, Linux kernel tracing can be performed by the function name. However, we still need to know the address of the original function in order to call it.
You can use kallsyms — a list of all kernel symbols — to get the address of the needed function. This list includes not only symbols exported for the modules but actually all symbols. This is what the process of getting the hooked function address can look like:
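Inside a module, the lookup is typically a call to kallsyms_lookup_name() on kernels that make it available to modules. The same symbol list is visible from user space as /proc/kallsyms, so the idea can be sketched there. The sketch below is only an illustration: the helper name is hypothetical, and the sample address in the usage note is made up.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one line in the "address type name" layout used by
 * /proc/kallsyms. Returns 1 and stores the address if the line
 * describes the symbol `wanted`, and 0 otherwise.
 */
static int kallsyms_line_matches(const char *line, const char *wanted,
                                 unsigned long long *address)
{
    unsigned long long addr;
    char type;
    char name[256];

    /* All three fields must be present for the line to count. */
    if (sscanf(line, "%llx %c %255s", &addr, &type, name) != 3)
        return 0;
    if (strcmp(name, wanted) != 0)
        return 0;

    *address = addr;
    return 1;
}
```

For example, feeding it the line "ffffffff81234560 T sys_execve" while looking for sys_execve stores 0xffffffff81234560 in the output argument. Reading the real file line by line with fgets() and passing each line to this helper gives the address of any listed symbol, provided the reader has enough privileges to see non-zero addresses.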
Next, we need to initialize the ftrace_ops structure. Here we have one necessary field, func, pointing to the callback. However, some critical flags are needed:
The fh_ftrace_thunk() function is our callback that ftrace will call when tracing the function. We’ll talk about this callback later. The flags are needed for hooking: they tell ftrace to save and restore the processor registers, whose contents we’ll be able to change in the callback.
Now we’re ready to turn on the hook. First, we use ftrace_set_filter_ip() to turn on the ftrace utility for the needed function. Second, we use register_ftrace_function() to give ftrace permission to call our callback:
To turn off the hook, we repeat the same actions in reverse:
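The ordering of the two steps, and of their reversal, can be modeled in user space. In this sketch the three model_* functions are stand-ins (assumptions) for ftrace_set_filter_ip(), register_ftrace_function(), and unregister_ftrace_function(); they only record that they were called. Rolling back the filter when registration fails is a design choice of the sketch.

```c
#include <assert.h>
#include <string.h>

/* Call log, so the order of operations can be inspected. */
static char call_log[128];
static int fail_register; /* set to 1 to simulate a failed registration */

/* Stand-in for ftrace_set_filter_ip(): add or remove the filter. */
static int model_set_filter_ip(int remove)
{
    strcat(call_log, remove ? "filter-;" : "filter+;");
    return 0;
}

/* Stand-in for register_ftrace_function(). */
static int model_register(void)
{
    strcat(call_log, "register;");
    return fail_register ? -1 : 0;
}

/* Stand-in for unregister_ftrace_function(). */
static int model_unregister(void)
{
    strcat(call_log, "unregister;");
    return 0;
}

/* Turn the hook on: set the filter first, then register the callback. */
static int install_hook(void)
{
    int err = model_set_filter_ip(0);

    if (err)
        return err;

    err = model_register();
    if (err) {
        /* Undo the filter so that no half-installed state remains. */
        model_set_filter_ip(1);
        return err;
    }
    return 0;
}

/* Turn the hook off: the same steps in reverse order. */
static void remove_hook(void)
{
    model_unregister();
    model_set_filter_ip(1);
}
```

Unregistering before clearing the filter means the callback is never left attached to a function it is no longer supposed to see.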
When the unregister_ftrace_function() call is over, it’s guaranteed that there won’t be any activations of the installed callback or our wrapper in the system. We can unload the hook module without worrying that our functions are still being executed somewhere in the system. Next, we provide a detailed description of the function hooking process.
Hooking functions with ftrace
So how can you configure kernel function hooking? The process is pretty simple: ftrace is able to alter the register state after exiting the callback. By changing the register %rip — a pointer to the next executed instruction — we can change the function executed by the processor. In other words, we can force the processor to make an unconditional jump from the current function to ours and take over control.
This is what the ftrace callback looks like:
We get the address of struct ftrace_hook for our function by applying the container_of() macro to the address of the struct ftrace_ops embedded in struct ftrace_hook. Next, we substitute the value of the register %rip in the struct pt_regs structure with our handler’s address. For architectures other than x86_64, this register can have a different name (like PC or IP). The basic idea, however, still applies.
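The container_of() step can be tried out in user space. This is a minimal sketch with stand-in types: struct ftrace_ops_model and struct pt_regs_model are assumptions that keep only the members needed here, and the macro is a simplified version of the kernel one without its type checking.

```c
#include <assert.h>
#include <stddef.h>

/* Step back from a pointer to a member to the structure embedding it. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct ftrace_ops_model { unsigned long flags; }; /* stand-in */
struct pt_regs_model { unsigned long ip; };       /* stand-in: %rip only */

struct ftrace_hook {
    const char *name;
    unsigned long function;      /* address of our wrapper */
    struct ftrace_ops_model ops; /* embedded, as in the article */
};

/* Same argument order as the callback: ftrace hands us only `ops`. */
static void fh_ftrace_thunk_model(unsigned long ip, unsigned long parent_ip,
                                  struct ftrace_ops_model *ops,
                                  struct pt_regs_model *regs)
{
    struct ftrace_hook *hook = container_of(ops, struct ftrace_hook, ops);

    (void)ip;
    (void)parent_ip;

    /* Redirect execution to the wrapper once the callback returns. */
    regs->ip = hook->function;
}
```

Embedding the ops structure in the hook descriptor is what makes this work: no lookup table is needed to find out which hook a callback invocation belongs to.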
Note that the notrace specifier added for the callback requires special attention. This specifier can be used for marking functions that are prohibited for Linux kernel tracing with ftrace. For instance, you can mark ftrace functions that are used in the tracing process. By using this specifier, you can prevent the system from hanging if you accidentally call a function from your ftrace callback that’s currently being traced by ftrace.
The ftrace callback is usually called with preemption disabled (just like kprobes), although there might be some exceptions. In our case, this limitation wasn’t important, since we only needed to replace the eight bytes of the %rip value in the pt_regs structure.
Since the wrapper function and the original are executed in the same context, both functions have the same restrictions. For instance, if you hook an interrupt handler, then sleeping in the wrapper is still out of the question.
Protection from recursive calls
There’s one catch in the code we gave you before: when the wrapper calls the original function, the original function will be traced by ftrace again, thus causing an endless recursion. We came up with a pretty neat way of breaking this cycle by using parent_ip — one of the ftrace callback arguments that contains the return address to the function that called the hooked one. Usually, this argument is used for building function call graphs. However, we can use this argument to distinguish the first traced function call from the repeated calls.
The difference is significant: during the first call, the argument parent_ip will point to some place in the kernel, while during the repeated call it will only point inside our wrapper. You should pass control only during the first function call. All other calls must let the original function be executed.
We can run the entry test by comparing the address to the boundaries of the current module with our functions. However, this approach works only if the module doesn’t contain anything other than the wrapper that calls the hooked function. Otherwise, you’ll need a more selective check.
So this is what a correct ftrace callback looks like:
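The boundary test can be sketched in user-space C. In a real module the check would usually be the kernel's within_module() helper applied to THIS_MODULE. Here struct module_model, struct pt_regs_model, and the function names are stand-ins (assumptions), and the addresses in the usage note are made up.

```c
#include <assert.h>
#include <stddef.h>

struct module_model {           /* stand-in for the module's code range */
    unsigned long base;
    unsigned long size;
};

struct pt_regs_model { unsigned long ip; }; /* stand-in: %rip only */

/* Is addr inside [base, base + size)? Written with a subtraction so
 * that base + size cannot overflow. */
static int within_module_model(unsigned long addr,
                               const struct module_model *mod)
{
    return addr >= mod->base && addr - mod->base < mod->size;
}

/* Redirect only when the caller is outside our module. A call coming
 * from our own wrapper is left alone, so the original function runs. */
static void fh_callback_model(unsigned long parent_ip, unsigned long wrapper,
                              const struct module_model *self,
                              struct pt_regs_model *regs)
{
    if (!within_module_model(parent_ip, self))
        regs->ip = wrapper;
}
```

With a module occupying 0xa000 to 0xafff, a parent_ip of 0x7000 (somewhere in the kernel) causes the redirect, while a parent_ip of 0xa180 (inside the wrapper) leaves the registers untouched.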
This approach has three main advantages:
- Low overhead costs. You need to perform only several comparisons and subtractions without grabbing any spinlocks or iterating through lists.
- It doesn’t have to be global. Since there’s no synchronization, this approach is compatible with preemption and isn’t tied to the global process list. As a result, you can trace even interrupt handlers.
- There are no limitations for functions. This approach doesn’t have the main kretprobes drawback and can support any number of trace function activations (including recursive) out of the box. During recursive calls, the return address is still located outside of our module, so the callback test works correctly.
In the next section, we take a more detailed look at the hooking process and describe how ftrace works.
The scheme of the hooking process
So, how does ftrace work? Let’s take a look at a simple example: you’ve typed the command ls in the terminal to see the list of files in the current directory. The command-line interpreter (say, Bash) launches a new process using the common functions fork() plus execve() from the standard C library. Inside the system, these functions are implemented through the system calls clone() and execve() respectively. Let’s suppose that we hook the execve() system call to gain control over launching new processes.
Figure 1 below gives an ftrace example and illustrates the process of hooking a handler function.
In this image, we can see how a user process (blue) executes a system call to the kernel (red) where the ftrace framework (violet) calls functions from our module (green).
Below, we give a more detailed description of each step of the process:
- The SYSCALL instruction is executed by the user process. This instruction allows switching to the kernel mode and puts the low-level system call handler entry_SYSCALL_64() in charge. This handler is responsible for all system calls of 64-bit programs on 64-bit kernels.
- A specific handler receives control. The kernel quickly completes the low-level tasks implemented in assembly and hands over control to the high-level do_syscall_64() function, which is written in C. This function reaches the system call handler table sys_call_table and calls a particular handler by the system call number. In our case, it’s the function sys_execve().
- Calling ftrace. There’s an __fentry__() function call at the beginning of every kernel function. This function is implemented by the ftrace framework. In the functions that don’t need to be traced, this call is replaced with the nop instruction. That isn’t the case for sys_execve(), however: since we’re hooking it, the call stays in place.
- Ftrace calls our callback. Ftrace calls all registered trace callbacks, including ours. Other callbacks won’t interfere since, at each particular place, only one callback can be installed that changes the value of the %rip register.
- The callback performs the hooking. The callback looks at the value of parent_ip leading inside the do_syscall_64() function — since it’s the particular function that called the sys_execve() handler — and decides to hook the function, changing the values of the register %rip in the pt_regs structure.
- Ftrace restores the state of the registers. Following the FTRACE_OPS_FL_SAVE_REGS flag, the framework saves the register state in the pt_regs structure before it calls the handlers. When the handling is over, the registers are restored from the same structure. Our handler changes the register %rip (a pointer to the next executed instruction), which leads to passing control to a new address.
- Wrapper function receives control. An unconditional jump makes it look like the activation of the sys_execve() function has been terminated. Instead of this function, control goes to our function, fh_sys_execve(). Meanwhile, the state of both processor and memory remains the same, so our function receives the arguments of the original handler and, when it finishes, returns control to the do_syscall_64() function.
- The original function is called by our wrapper. Now, the system call is under our control. After analyzing the context and arguments of the system call, the fh_sys_execve() function can either permit or prohibit execution. If execution is prohibited, the function returns an error code. Otherwise, the function needs to repeat the call to the original handler and sys_execve() is called again through the real_sys_execve pointer that was saved during the hook setup.
- The callback gets control. Just like during the first call of sys_execve(), control goes through ftrace to our callback. But this time, the process ends differently.
- The callback does nothing. The sys_execve() function was called not by the kernel from do_syscall_64() but by our fh_sys_execve() function. Therefore, the registers remain unchanged and the sys_execve() function is executed as usual. The only problem is that ftrace sees the entry to sys_execve() twice.
- The wrapper gets back control. The system call handler sys_execve() gives control to our fh_sys_execve() function for the second time. Now, the launch of a new process is nearly finished. We can see if the execve() call finished with an error, study the new process, make some notes to the log file, and so on.
- The kernel receives control. Finally, the fh_sys_execve() function is finished and control returns to the do_syscall_64() function. The function sees the call as one that was completed normally, and the kernel proceeds as usual.
- Control goes to the user process. In the end, the kernel executes the IRET instruction (or SYSRET, but for execve() there can be only IRET), installing the registers for a new user process and switching the processor into user code execution mode. The system call is over and so is the launch of the new process.
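The steps above can be condensed into a small user-space simulation. It isn't kernel code: the caller enum replaces the parent_ip check, the counters exist only to make the control flow observable, and the rule that prohibits one particular path is a hypothetical policy.

```c
#include <assert.h>
#include <string.h>

/* Who called the traced function (replaces the parent_ip check). */
enum caller { CALLER_KERNEL, CALLER_WRAPPER };

static int callback_entries; /* times the ftrace callback was entered */
static int wrapper_runs;     /* times our wrapper ran                 */
static int original_runs;    /* times the original handler body ran   */

static long traced_sys_execve(enum caller from, const char *filename);

/* Our wrapper: decide, then call the original if execution is allowed. */
static long fh_sys_execve(const char *filename)
{
    wrapper_runs++;

    /* Hypothetical policy: prohibit one particular path. */
    if (strcmp(filename, "/tmp/blocked") == 0)
        return -1; /* error code; the original never runs */

    /* Repeat the call to the original handler. */
    return traced_sys_execve(CALLER_WRAPPER, filename);
}

/* The hooked function together with the tracing call at its entry. */
static long traced_sys_execve(enum caller from, const char *filename)
{
    /* The __fentry__() call: ftrace invokes our callback. */
    callback_entries++;

    /* First entry: the callback redirects control to the wrapper. */
    if (from != CALLER_WRAPPER)
        return fh_sys_execve(filename);

    /* Repeated entry: the callback does nothing and the body runs. */
    original_runs++;
    return 0;
}
```

Calling traced_sys_execve(CALLER_KERNEL, "/bin/ls") enters the callback twice but runs the wrapper and the original body once each, which matches the observation that ftrace sees the entry to sys_execve() twice.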
As you can see, the process of hooking Linux kernel function calls with ftrace isn’t that complex. Now, it's time to focus on the ways you can protect your Linux kernel modules from ftrace hooks.
Function hooks can be used for different purposes, from monitoring the performance of the system to patching a specific bug. But if you want to make sure that kernel module functionality remains unchanged, you need to be able to prevent the installation of any hooks.
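So how can you protect a Linux kernel module from ftrace hooks? When thinking about possible solutions, we came up with two ideas: hooking the ftrace functions that set hooks, and modifying the ftrace structs that describe our module. Let’s consider these two approaches more closely: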
- Hooking ftrace functions. In this case, we would need to hook the ftrace function that can set hooks, such as ftrace_set_filter_ip or ftrace_set_hash. Then, theoretically, once the framework tried to hook a function from our module, we would be able to block it. We could use the addresses of our module functions from the .text section to distinguish our kernel module functions from other functions.
- Modifying ftrace structs. For this, we would need to delete all the information about our module functions that’s stored in the records and structs of the ftrace framework. Then, to make the blocking of ftrace hooks possible, we’d also need to fill the mcount records with nop instructions.
Let’s see which of these theories proves effective.
We’ll start with a simple kernel module named TestModule. The name of the function that ftrace wants to hook is HookMe:
Let’s try out the theory that seems to be the most logical: use ftrace hooks against themselves.
Hooking an ftrace hooking function
First, we need to set a hook for one of the ftrace functions responsible for hooking function calls. In our example, we try to hook the ftrace_set_hash function:
Our first step is getting the address of the ftrace_set_hash function:
Then, we can try to register a hook for this ftrace function:
Unfortunately, this wasn’t successful. We got a message about an error that occurred when we tried to hook the ftrace_set_hash function:
The reason why this error occurred is quite simple: apparently, ftrace can’t be hooked with its own methods. The framework is protected from setting hooks in its critical functions. You can see it here:
As a result, even though this option seemed the most obvious and logical solution to our problem, we can’t use it to protect our kernel module from ftrace function hooks.
Clearing ftrace records
Our next approach is a bit more cunning, since we need to delete some information from the ftrace records. The framework keeps all data about installed hooks in special ftrace pages. Each page is described by the ftrace_page struct.
The size field of an ftrace page describes the capacity of that page. The index field holds the number of dyn_ftrace structs that the page currently contains, sorted by dyn_ftrace.ip.
The dyn_ftrace struct keeps the address (IP) of the mcount entry needed for setting a hook for a selected function. So to block the setting of a function hook, we need to delete the dyn_ftrace struct related to a specific function and then fill its mcount entry with a nop.
Once the dyn_ftrace struct is deleted, we’ll also need to shift other entries by one for the current ftrace page.
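The delete-and-shift step can be modeled in user space. This is a minimal sketch, assuming simplified stand-ins for the kernel structs (struct dyn_ftrace_model and struct ftrace_page_model keep only the fields discussed here, with size modeled as a record count). The helper name is hypothetical, and filling the mcount entry with a nop is not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct dyn_ftrace_model {     /* stand-in for struct dyn_ftrace */
    unsigned long ip;         /* address of the mcount entry */
    unsigned long flags;
};

struct ftrace_page_model {    /* stand-in for struct ftrace_page */
    struct dyn_ftrace_model *records;
    int index;                /* records in use, sorted by ip */
    int size;                 /* capacity (a record count in this model) */
};

/* Remove the record for `ip` and close the gap by shifting the records
 * that follow it down by one. Returns 1 if a record was removed. */
static int remove_record(struct ftrace_page_model *pg, unsigned long ip)
{
    int i;

    for (i = 0; i < pg->index; i++) {
        if (pg->records[i].ip != ip)
            continue;

        /* Shift the tail; the sorted order is preserved. */
        memmove(&pg->records[i], &pg->records[i + 1],
                (size_t)(pg->index - i - 1) * sizeof(pg->records[0]));
        pg->index--;
        return 1;
    }
    return 0;
}
```

Since the remaining records stay sorted by ip, any lookup that relies on that ordering keeps working for them, while the removed function can no longer be found.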
We can use the following script to check if this approach works for protecting our kernel module from ftrace hooks:
Here’s what we get when we don’t use this type of protection against ftrace hooks:
However, when using this type of Linux kernel module protection, we get the following result:
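We’ll also receive the exact same error, sh: echo: I/O error, if the module containing the HookMe function isn’t loaded at all. In other words, ftrace treats the protected function as if it didn’t exist, so this method works and can be used for protecting a Linux kernel module from ftrace hooks.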
Ftrace is a helpful framework that can be used for solving different tasks, from tracing kernel functions to setting function hooks. Apriorit developers use this tool regularly and keep expanding their knowledge of its capabilities. But in some cases, when you need to keep kernel functionality unchanged, you might want to protect your kernel module from ftrace hooks.
Even though the main purpose of ftrace is to trace Linux kernel function calls rather than hook them, our innovative approach turned out to be both simple and effective. However, the approach we describe above works only for kernel versions 3.19 and higher and only for the x86_64 architecture.
Since ftrace functions are securely protected from being hooked with ftrace methods, you’ll need a different technique for hooking ftrace functions and thereby protecting your kernel module. Nevertheless, we managed to find a solution.
All we had to do was delete the information about a specific function from the ftrace records and then fill its mcount entry with a nop instruction. Once this was done, our kernel module was effectively protected from being hooked with the help of the ftrace framework.
In the third and final part of our series, we’ll tell you about the main ftrace pros and cons and some unexpected surprises that might be waiting for you if you decide to implement this approach. Meanwhile, you can read about another unusual solution for installing hooks — by using the GCC attribute constructor with LD_PRELOAD.