ApriorIT
Redirection of Imported Functions in Mach-O

In the previous article, we examined how to dynamically link functions in Mach-O libraries. Now let’s move on to practice.

We have a macOS program that uses a number of third-party dynamically linked libraries, which, in turn, call each other's functions.

The task is as follows: we need to use a handler to intercept a function call made by one library to another and then call the original function. This article will be useful for Mac software developers who need to redirect imported functions in Mach-O libraries.

Contents

Test example

Redirection algorithm

Implementing redirection

Testing our solution

Conclusion

Test example

Let’s suppose we have a program called test that’s written in C (test.c) and a shared library, libtest.dylib, that’s compiled beforehand from libtest.c. This library implements a single function, libtest.

Both the program and the library use the puts function from the standard C library that’s provided with macOS and is contained in libSystem.B.dylib. Let’s visualize this:

The task is the following:

  1. Replace the call to the puts function in the libtest.dylib library with a call to the hooked_puts function that’s implemented in the main program (test.c). The hooked_puts function will then call the original puts function.
  2. Cancel the previously made changes so that the next call to libtest leads to a call to the original puts function.

We cannot change the code of the libraries or recompile them; we can only change and recompile the main program. The redirection itself should apply only to a specific library and happen on the fly, without restarting the program.

Redirection algorithm

Let’s describe all of the redirect actions in words, as the code can be hard to follow despite the many comments:

  1. Find the symbol table and table of strings using data from the LC_SYMTAB loader command.
  2. From the LC_DYSYMTAB loader command, find out from which element of the symbol table a subset of undefined symbols (the iundefsym field) begins.
  3. Find the target symbol by name among the subset of undefined symbols in the symbol table.
  4. Save the index of the target symbol from the beginning of the symbol table.
  5. Find the table of indirect symbols (the indirectsymoff field) using data from the LC_DYSYMTAB loader command.
  6. Find the index in the table of indirect symbols at which the entries for the import table begin (the reserved1 field of the __DATA, __la_symbol_ptr or __IMPORT, __jump_table section).
  7. Starting from this index, look through the table of indirect symbols and search for the value that corresponds to the index of the target symbol in the symbol table.
  8. Save the position of the matching entry, counted from the start of the import table's entries in the table of indirect symbols. This saved value is the index of the required element in the import table.
  9. Find the import table (the offset field) using data from the __la_symbol_ptr section or __jump_table.
  10. Using the saved index to locate the target element, rewrite the address stored in __la_symbol_ptr to the required value. For __jump_table, change the CALL/JMP instruction to a JMP whose operand targets the required function.
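
The two searches in steps 3-4 and 7-8 can be sketched on simplified in-memory tables. This is only an illustration under stated assumptions: struct demo_symbol is a hypothetical stand-in for the real struct nlist, all demo_ names are invented for this example, and names are compared exactly as stored (C symbols in a Mach-O symbol table carry a leading underscore, for example _puts).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct nlist:
   only the offset of the name in the string table is modelled. */
struct demo_symbol {
    uint32_t name_offset;
};

#define DEMO_NOT_FOUND UINT32_MAX

/* Steps 3-4: search the subset of undefined symbols for a name and
   return its index counted from the beginning of the symbol table. */
uint32_t demo_find_symbol(const struct demo_symbol *symbols, const char *strings,
                          uint32_t undef_index, uint32_t undef_count,
                          const char *name)
{
    for (uint32_t i = undef_index; i < undef_index + undef_count; ++i)
        if (strcmp(strings + symbols[i].name_offset, name) == 0)
            return i;
    return DEMO_NOT_FOUND;
}

/* Steps 7-8: scan the part of the indirect symbol table that maps the
   import section (it starts at `first` and has `count` entries) and
   return the position relative to `first`, i.e. the import table index. */
uint32_t demo_find_import_slot(const uint32_t *indirect, uint32_t first,
                               uint32_t count, uint32_t symbol_index)
{
    for (uint32_t i = 0; i < count; ++i)
        if (indirect[first + i] == symbol_index)
            return i;
    return DEMO_NOT_FOUND;
}
```

In a real implementation, undef_index and undef_count would presumably come from the iundefsym and nundefsym fields of LC_DYSYMTAB, and first from the reserved1 field of the import section.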

Note that you should work with the tables of symbols, strings, and indirect symbols only after loading them from the Mach-O file on disk, whereas you should read the contents of the import table sections, and perform the redirection itself, in memory. The reason is that the symbol and string tables may be absent from the loaded image or may not reflect their real state there: the dynamic loader has already extracted all the symbol data it needs without necessarily keeping the tables themselves in memory.
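
Loading a table from the file can be sketched with a small helper. This is a minimal sketch, not the article's code: demo_load_table is a hypothetical name, and the offset and size arguments are assumed to be derived from the load commands (for example symoff and nsyms, or stroff and strsize, from LC_SYMTAB).

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads `size` bytes located at `offset` in an open file into a freshly
   allocated buffer. Returns NULL on any failure; the caller frees the result. */
void *demo_load_table(FILE *file, long offset, size_t size)
{
    void *buffer;

    if (size == 0 || fseek(file, offset, SEEK_SET) != 0)
        return NULL;

    buffer = malloc(size);
    if (buffer && fread(buffer, 1, size, file) != size) {
        free(buffer);  /* short read: the table is truncated or absent */
        buffer = NULL;
    }
    return buffer;
}
```

The same helper could serve all three tables, since each is described by an offset and an element count in its load command.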

Implementing redirection

Now it’s time to turn our thoughts to the code. Let’s divide all operations into three stages to optimize the search for required Mach-O elements:

1. void *mach_hook_init(char const *library_filename, void const *library_address);

Based on the Mach-O file and its image in memory, this function returns an opaque descriptor. Behind this descriptor are the offset of the import table, the symbol table, the string table, and the table of indirect symbols from the dynamic symbol table, as well as a number of useful indexes for this module. The descriptor is defined as follows:

struct mach_hook_handle
{
    void const *library_address;  //base address of a library in memory
    char const *string_table;  //buffer to read string_table table from file
    struct nlist const *symbol_table;  //buffer to read symbol table from file
    uint32_t const *indirect_table;  //buffer to read the indirect symbol table in dynamic symbol table from file
    uint32_t undefined_symbols_count;  //number of undefined symbols in the symbol table
    uint32_t undefined_symbols_index;  //position of undefined symbols in the symbol table
    uint32_t indirect_symbols_count;  //number of indirect symbols in the indirect symbol table of DYSYMTAB
    uint32_t indirect_symbols_index;  //index of the first imported symbol in the indirect symbol table of DYSYMTAB
    uint32_t import_table_offset;  //the offset of (__DATA, __la_symbol_ptr) or (__IMPORT, __jump_table)
    uint32_t jump_table_present;  //special flag to show if we work with (__IMPORT, __jump_table)
};

2. mach_substitution mach_hook(void const *handle, char const *function_name, mach_substitution substitution);

This function performs the redirection using the algorithm described above and using the existing library descriptor, the name of the target symbol, and the address of the interceptor.

3. void mach_hook_free(void *handle);

This function cleans up any descriptor returned by mach_hook_init.
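
The way mach_hook returns the previous value so that a second call can restore it, along with the two patching variants from step 10 of the algorithm, can be sketched as follows. These are assumptions for illustration only: demo_substitution is a hypothetical stand-in for mach_substitution, the pointer table models __la_symbol_ptr, and the jump variant assumes 5-byte __jump_table entries holding an x86 JMP with a 32-bit relative displacement.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the article's mach_substitution type. */
typedef void (*demo_substitution)(void);

/* Pointer-table variant: replaces entry `index` and returns the previous
   entry, so calling again with the returned value restores the original. */
demo_substitution demo_swap_slot(demo_substitution *table, size_t index,
                                 demo_substitution substitution)
{
    demo_substitution previous = table[index];
    table[index] = substitution;
    return previous;
}

/* Jump-table variant: writes a 5-byte `JMP rel32` into `entry`, as if the
   entry were located at `entry_address` in the 32-bit address space. */
void demo_write_jump(uint8_t entry[5], uint32_t entry_address,
                     uint32_t target_address)
{
    /* The displacement is relative to the address of the next instruction. */
    uint32_t displacement = target_address - (entry_address + 5);

    entry[0] = 0xE9;  /* opcode of JMP rel32 */
    entry[1] = (uint8_t)(displacement & 0xFF);  /* little-endian operand */
    entry[2] = (uint8_t)((displacement >> 8) & 0xFF);
    entry[3] = (uint8_t)((displacement >> 16) & 0xFF);
    entry[4] = (uint8_t)((displacement >> 24) & 0xFF);
}
```

Returning the previous value keeps the interface symmetric: the same function both installs a hook and removes it, which matches how the test program below passes the saved original back to mach_hook.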

Taking into account these prototypes, we need to rewrite the test program:

#include <stdio.h>
#include <dlfcn.h>
#include "mach_hook.h"
#define LIBTEST_PATH "libtest.dylib"
void libtest(void);  //from libtest.dylib
int hooked_puts(char const *s)
{
    puts(s);  //calls the original puts() from libSystem.B.dylib because our main executable module called "test" remains intact
    return puts("HOOKED!");
}
int main()
{
    void *handle = 0;  //handle to store hook-related info
    mach_substitution original;  //original data for restoration
    Dl_info info;
    if (!dladdr((void const *)libtest, &info))  //gets the base address of the library that contains the libtest() function
    {
        fprintf(stderr, "Failed to get the base address of %s!\n", LIBTEST_PATH);
        goto end;
    }
    handle = mach_hook_init(LIBTEST_PATH, info.dli_fbase);
    if (!handle)
    {
        fprintf(stderr, "Redirection init failed!\n");
        goto end;
    }
    libtest();  //calls puts() from libSystem.B.dylib
    puts("-----------------------------");
    original = mach_hook(handle, "puts", (mach_substitution)hooked_puts);
    if (!original)
    {
        fprintf(stderr, "Redirection failed!\n");
        goto end;
    }
    libtest();  //calls hooked_puts()
    puts("-----------------------------");
    original = mach_hook(handle, "puts", original);  //restores the original relocation
    if (!original)
    {
        fprintf(stderr, "Restoration failed!\n");
        goto end;
    }
    libtest();  //again calls puts() from libSystem.B.dylib
end:
    mach_hook_free(handle);
    handle = 0;  //no effect here but advisable to prevent double freeing
    return 0;
}

Testing our solution

Let's run the test as follows:

$ arch -i386 ./test
libtest: calls the original puts()
-----------------------------
libtest: calls the original puts()
HOOKED!
-----------------------------
libtest: calls the original puts()
$ arch -x86_64 ./test
libtest: calls the original puts()
-----------------------------
libtest: calls the original puts()
HOOKED!
-----------------------------
libtest: calls the original puts()

The program output shows that both parts of the task formulated at the beginning were accomplished: the call to puts from libtest was redirected to hooked_puts and then restored.

Conclusion

In this article, we provided a practical example of how to redirect an imported function for a macOS program using Mach-O functions. You can download the test example together with the redirection algorithm and the project file at the link below.
