Principle of Dynamic Linking of Imported Functions in Mach-O

By knowing the principles by which imported functions are linked in Mach-O libraries, we can achieve a rather interesting effect: We can redirect the calls of imported functions to our code, in which we can then use the original function. To do this, it’s enough to pretend to be a dynamic loader and to correct the import table of the target library in memory. Let’s look at the Mach-O file format and learn how the dynamic loader relocates its import table.

Mach-O in brief

Developed by Apple, Mach object (Mach-O) is a file format for executables, core dumps, shared libraries, dynamically loaded code, and object code. This file format is used for ordering code and data in a binary file so it can be properly read into the memory of Apple-developed systems like macOS, iOS, Mach kernel, and NeXTSTEP.

The best way to understand Mach-O is to look at the image below:

At first glance, the format is simple. A Mach-O file consists of the following parts:

Header – Stores information on the target architecture and different options for interpreting further contents of the file.
Load commands — Dictate how and where to load Mach-O parts (segments, symbol tables) and which libraries this Mach-O file depends on so they can be loaded first.
Segments — Describe regions of memory to which code or data should be loaded.

To take a closer look, we’ll need some parser utilities:

otool — A console program that ships with the system. It can display the contents of different parts of the file: headers, load commands, segments, sections, etc. It’s especially useful to call it with the -v (verbose) option.
MachOView — Distributed under the GPL, has its own GUI, and works only on Mac OS X 10.6 and higher. MachOView includes a viewer that displays the full contents of a Mach-O file and adds information to some partitions on the basis of data from other partitions, which is very handy.

Exploring different examples in MachOView is enough to get familiar with the format. But that’s not enough for Mach-O development, as MachOView doesn’t show the exact structures of headers, load commands, segments, sections, and symbol tables, nor does it describe their fields. This isn’t a big problem, since a specification is available on Apple’s official site. In addition, after installing the Xcode development tools, we can view the header files in /usr/include/mach-o (especially loader.h).

It’s also worth remembering that although the file contents are laid out in memory in the same order as on disk, the dynamic linker may drop some parts of the Mach-O symbol table and the whole string table during loading. It can also fill in real in-memory offsets where necessary, whereas the corresponding values in the file may be zero or refer to offsets on disk.

The header structure is simple (an example is provided for a 32-bit architecture, though a 64-bit architecture does not differ much):

struct mach_header
{
  uint32_t magic;
  cpu_type_t cputype;
  cpu_subtype_t cpusubtype;
  uint32_t filetype;
  uint32_t ncmds;
  uint32_t sizeofcmds;
  uint32_t flags;
};

The header begins with a magic value (0xFEEDFACE, or the byte-swapped 0xCEFAEDFE, depending on the byte order of machine words). It’s followed by the processor type and subtype, the file type, the number and total size of the load commands, and flags that describe other specifics.
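The magic check can be sketched in a few lines of C. This is only an illustration: the four magic constants are the ones defined in loader.h, while the macho_kind enum and the classify_magic() function are hypothetical names introduced here.

```c
#include <stdint.h>

/* Magic values as defined in <mach-o/loader.h>. */
#define MH_MAGIC    0xFEEDFACEu  /* 32-bit file, same byte order as the reader */
#define MH_CIGAM    0xCEFAEDFEu  /* 32-bit file, opposite byte order */
#define MH_MAGIC_64 0xFEEDFACFu  /* 64-bit file, same byte order as the reader */
#define MH_CIGAM_64 0xCFFAEDFEu  /* 64-bit file, opposite byte order */

/* Hypothetical result type for this sketch. */
enum macho_kind {
    MACHO_NONE,
    MACHO_32,
    MACHO_32_SWAPPED,
    MACHO_64,
    MACHO_64_SWAPPED
};

/* Classifies the first 32-bit word of a file, read in the reader's byte order. */
enum macho_kind classify_magic(uint32_t magic)
{
    switch (magic) {
    case MH_MAGIC:    return MACHO_32;
    case MH_CIGAM:    return MACHO_32_SWAPPED;
    case MH_MAGIC_64: return MACHO_64;
    case MH_CIGAM_64: return MACHO_64_SWAPPED;
    default:          return MACHO_NONE;
    }
}
```

A "swapped" result means that every multi-byte field of the header and load commands has to be byte-swapped before use.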

For example:

The most important load commands are listed below:

LC_SEGMENT — Defines a segment of the file, together with its sections, that should be mapped into memory
LC_SYMTAB — Loads the table of symbols and strings
LC_DYSYMTAB — Creates an import table; data on symbols is taken from the symbol table
LC_LOAD_DYLIB — Defines the dependency on a certain third-party library

For example (for 32- and 64-bit versions, respectively):

The most important segments of a Mach-O file are the following:

__TEXT — The executed code and other read-only data
__DATA — Data available for writing, including import tables that can be changed by the dynamic loader during lazy binding
__OBJC — Information which is necessary for Objective-C runtime: class definitions, method selectors, constants etc.
__IMPORT — Import table only for a 32-bit architecture (we managed to generate it only on Mac OS X 10.5)
__LINKEDIT — Data used by the dynamic loader for loaded modules (symbol tables, string tables, etc.)

Any load command starts with the following fields:

struct load_command
{
  uint32_t cmd;  //command numeric code
  uint32_t cmdsize;  //size of the current command in bytes
};

After these fields, there can be many different fields depending on the type of command.
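Because every command starts with the same two fields, a parser can walk the whole list without understanding each command. The sketch below assumes the commands are already in memory in the reader's byte order. The LC_* values are the ones from loader.h; the load_command_hdr struct and the find_load_command() function are hypothetical names used only for this illustration.

```c
#include <stdint.h>
#include <string.h>

/* Command codes as defined in <mach-o/loader.h>. */
#define LC_SEGMENT    0x1u
#define LC_SYMTAB     0x2u
#define LC_DYSYMTAB   0xBu
#define LC_LOAD_DYLIB 0xCu

/* Same layout as struct load_command above. */
struct load_command_hdr {
    uint32_t cmd;
    uint32_t cmdsize;
};

/*
 * Returns the byte offset (from the first load command) of the first command
 * whose code equals `wanted`, or -1 if there is none or the list is malformed.
 * `ncmds` and `sizeofcmds` come from the Mach-O header.
 */
long find_load_command(const uint8_t *cmds, uint32_t ncmds,
                       uint32_t sizeofcmds, uint32_t wanted)
{
    uint32_t offset = 0;

    for (uint32_t i = 0; i < ncmds; i++) {
        struct load_command_hdr lc;

        if (sizeofcmds - offset < sizeof lc)
            return -1;                          /* truncated list */
        memcpy(&lc, cmds + offset, sizeof lc);  /* avoids unaligned reads */
        if (lc.cmdsize < sizeof lc || lc.cmdsize > sizeofcmds - offset)
            return -1;                          /* bogus command size */
        if (lc.cmd == wanted)
            return (long)offset;
        offset += lc.cmdsize;                   /* step to the next command */
    }
    return -1;
}
```

In a real 32-bit file, the load commands start right after struct mach_header, so the returned offset would be relative to that point.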

For example:

The most interesting sections in the listed segments are the following:

__TEXT,__text — The code itself
__TEXT,__cstring — Constant strings (in double quotes)
__TEXT,__const — Different constants
__DATA,__data — Initialized variables (strings and arrays)
__DATA,__la_symbol_ptr — Table of pointers to imported functions
__DATA,__bss — Non-initialized static variables
__IMPORT,__jump_table — Stubs for calls of imported functions

Note that the import table in a Mach-O file is either __IMPORT,__jump_table (32-bit, Mac OS X 10.5) or __DATA,__la_symbol_ptr (64-bit, or Mac OS X 10.6 and later).

Sections in segments have the following structure:

struct section
{
  char sectname[16];
  char segname[16];
  uint32_t addr;
  uint32_t size;
  uint32_t offset;
  uint32_t align;
  uint32_t reloff;
  uint32_t nreloc;
  uint32_t flags;
  uint32_t reserved1;
  uint32_t reserved2;
};

The structure holds the names of the section and of its segment, the section size, the section offset in the file, and the address in memory at which the dynamic loader placed it. Other section-specific information is described in the /usr/include/mach-o/loader.h file.
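As a small illustration, here is one way to look up a section by its segment and section names. The section32 struct repeats the layout shown above under a different name, and find_section() is a hypothetical helper. The sketch assumes the array of section structures has already been located (in a file, it follows its LC_SEGMENT command).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same layout as the 32-bit struct section shown above. */
struct section32 {
    char     sectname[16];
    char     segname[16];
    uint32_t addr;
    uint32_t size;
    uint32_t offset;
    uint32_t align;
    uint32_t reloff;
    uint32_t nreloc;
    uint32_t flags;
    uint32_t reserved1;
    uint32_t reserved2;
};

/*
 * Returns the first section whose names match, or NULL.
 * The name fields are fixed 16-byte arrays that are not NUL-terminated
 * when the name is exactly 16 characters long, hence strncmp with 16.
 */
const struct section32 *find_section(const struct section32 *sects,
                                     uint32_t nsects,
                                     const char *segname,
                                     const char *sectname)
{
    for (uint32_t i = 0; i < nsects; i++) {
        if (strncmp(sects[i].segname, segname, 16) == 0 &&
            strncmp(sects[i].sectname, sectname, 16) == 0)
            return &sects[i];
    }
    return NULL;
}
```

For example, find_section(sects, nsects, "__IMPORT", "__jump_table") would return the import table section of a 32-bit Leopard binary, if it has one.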

For example:

Fat binary

It’s also worth mentioning that executable files and libraries have “learned” to store several variants of the executable code at once. This is due to Apple’s repeated, gradual changes of target architecture (Motorola –> IBM –> Intel). Such executables are called fat binaries. A fat binary is a container that gathers multiple Mach-O files in one file and starts with a special header. This header contains the number and types of supported architectures and the offset to each of them. An ordinary Mach-O file with the structure described above is located at each such offset.

The header of a fat binary looks as follows in C:

struct fat_header
{
  uint32_t magic;
  uint32_t nfat_arch;
};

Here magic is 0xCAFEBABE (it reads as 0xBEBAFECA on a little-endian processor, so keep in mind the different byte order of machine words on different processors). It’s followed by exactly nfat_arch structures of the following type:

struct fat_arch
{
  cpu_type_t cputype;
  cpu_subtype_t cpusubtype;
  uint32_t offset;
  uint32_t size;
  uint32_t align;
};

In other words, fat_arch describes the offset at which a Mach-O file is located within the fat binary, the size of that Mach-O file, and the CPU architecture it’s intended for.
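Locating the slice for a given architecture can be sketched as follows. The fields of the fat header are stored in big-endian order, so the sketch reads them byte by byte instead of casting to the structures above. The CPU type constants match those in mach/machine.h; read_be32() and fat_slice_offset() are hypothetical helper names.

```c
#include <stddef.h>
#include <stdint.h>

#define FAT_MAGIC       0xCAFEBABEu
#define CPU_TYPE_X86    7u
#define CPU_TYPE_X86_64 (7u | 0x01000000u)  /* CPU_TYPE_X86 | CPU_ARCH_ABI64 */

/* Reads a big-endian 32-bit value regardless of the host byte order. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Returns the file offset of the Mach-O slice for `cputype`,
 * or 0 if the buffer is not a fat binary or has no such slice.
 */
uint32_t fat_slice_offset(const uint8_t *file, size_t len, uint32_t cputype)
{
    const size_t header_size = 8;   /* sizeof(struct fat_header) */
    const size_t arch_size = 20;    /* sizeof(struct fat_arch) */

    if (len < header_size || read_be32(file) != FAT_MAGIC)
        return 0;

    uint32_t nfat_arch = read_be32(file + 4);
    size_t pos = header_size;

    for (uint32_t i = 0; i < nfat_arch; i++) {
        if (len - pos < arch_size)
            return 0;                        /* truncated header */
        if (read_be32(file + pos) == cputype)
            return read_be32(file + pos + 8); /* fat_arch.offset */
        pos += arch_size;
    }
    return 0;
}
```

The Mach-O header of the selected slice then starts at file + offset and can be parsed as described in the previous section.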

Experimental program

Let’s take the following files written in C to investigate how an imported function call works:

//File test.c
void libtest();  //from libtest.dylib
int main()
{
    libtest();  //calls puts() from libSystem.B.dylib
    return 0;
}

//File libtest.c
#include <stdio.h>
void libtest()  //just a simple library function
{
    puts("libtest: calls the original puts()");
}

Investigation of dynamic linking

We’ll confine ourselves to Intel processors. Let’s suppose that we have Mac OS X 10.5 (Leopard). Let’s add these files to a new Xcode project, compile a 32-bit version, and start it in debug mode. We stop on the line where the call of the puts() function is performed in the libtest() function of the libtest.dylib library. Here is the assembler listing for libtest():

Let’s perform one more instruction:

And look at it in the memory:

This is the cell of the import table (in this case, a cell of __IMPORT, __jump_table). If lazy binding is used, it serves as a springboard into the dynamic loader (the __dyld_stub_binding_helper_interface function); once the symbol is bound, it jumps directly to the target function. The latter is confirmed by the next puts() call:

And in the memory:

We see that the dynamic loader replaced the relative call instruction, CALL (0xE8), with a relative jump, JMP (0xE9). This means that to redirect a __jump_table element, it’s enough to write into it a jump instruction that leads to the beginning of the substitute function; no inline patch of the target function is needed.
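The five bytes of such a jump can be computed as in the sketch below. On 32-bit x86, the 0xE9 opcode is followed by a 32-bit little-endian displacement counted from the end of the instruction. The function name encode_jmp_rel32() and the addresses in the usage note are hypothetical, and the sketch only builds the bytes; it does not write them into a loaded image.

```c
#include <stdint.h>

#define JUMP_TABLE_ENTRY_SIZE 5  /* one opcode byte + 32-bit displacement */

/*
 * Builds "JMP rel32" for a __jump_table entry located at entry_addr
 * so that execution continues at target_addr.
 */
void encode_jmp_rel32(uint8_t entry[JUMP_TABLE_ENTRY_SIZE],
                      uint32_t entry_addr, uint32_t target_addr)
{
    /* The displacement is relative to the address of the next instruction. */
    uint32_t rel = target_addr - (entry_addr + JUMP_TABLE_ENTRY_SIZE);

    entry[0] = 0xE9;                   /* JMP rel32 opcode */
    entry[1] = (uint8_t)(rel);         /* displacement, little-endian */
    entry[2] = (uint8_t)(rel >> 8);
    entry[3] = (uint8_t)(rel >> 16);
    entry[4] = (uint8_t)(rel >> 24);
}
```

For an entry at 0x3000 and a substitute function at 0x5000, the displacement is 0x5000 - 0x3005 = 0x1FFB, so the entry becomes E9 FB 1F 00 00. Unsigned wraparound makes the same formula work for backward jumps.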

Here is one more interesting point. Why isn’t JMP used for the jump to the dynamic loader (linker)? Because CALL saves the return address on the stack, and this address tells the linker which element of the import table called it. The linker can then determine which symbol is needed and resolve it by replacing the CALL to itself with a JMP to the required function.

Now let’s move the project to Mac OS X 10.6 (Snow Leopard) and compile a fat binary for 32- and 64-bit architectures. Just in case, you can do this as follows in Xcode:

First, we compile and then start the 64-bit variant (the import table in Snow Leopard will be the same for the 32-bit version) and stop once again at the puts() call:

Here is a simple CALL again. Let’s look at the following:

Here we can see the difference with a simple __IMPORT, __jump_table.

Welcome to __TEXT, __symbol_stub1. This table is a set of JMP instructions, one for each imported function. In our case, we have only one such instruction, which is presented above. Each of these instructions performs a jump to the address stored in the corresponding cell of the __DATA, __la_symbol_ptr table. The latter is the import table for this Mach-O file.

But let’s continue our investigation. If we look at the address to which the jump is going to be performed:

We will see the following:

We’ll jump to the __TEXT, __stub_helper section. Actually, this is a Procedure Linkage Table (PLT) for Mach-O. By means of the first instruction (in our case, LEA in connection with R11, but it could also be a simple PUSH), the dynamic linker remembers which symbol needs to be relocated. The second instruction always leads to the same address: to the beginning of the function, __dyld_stub_binding_helper, which will perform the linking:

After the dynamic linker performs relocations for puts(), the corresponding cell in __DATA, __la_symbol_ptr will look like the following:

And this is the address of the puts() function from the libSystem.B.dylib module. It means that we’ll receive the required effect of the call redirection by replacing the address with our own.
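In C terms, a __la_symbol_ptr cell behaves like a function pointer variable, so the redirection boils down to swapping a pointer while keeping the old value. The sketch below models this on an ordinary variable rather than on a real import table. All names in it (redirect_slot, stand_in_puts, hooked_puts, saved_original, hook_calls) are hypothetical, and stand_in_puts() merely plays the role of the function the loader has bound.

```c
#include <string.h>

typedef int (*puts_fn)(const char *);

/* Stores `replacement` in the cell and returns the address it held before. */
puts_fn redirect_slot(puts_fn *slot, puts_fn replacement)
{
    puts_fn original = *slot;
    *slot = replacement;
    return original;
}

puts_fn saved_original;  /* filled in when the hook is installed */
int hook_calls;          /* counts calls that went through the hook */

/* Plays the role of the function the loader originally bound. */
int stand_in_puts(const char *s)
{
    return (int)strlen(s);
}

/* The substitute: does its own work, then forwards to the original. */
int hooked_puts(const char *s)
{
    hook_calls++;
    return saved_original(s);
}
```

Usage would look like `saved_original = redirect_slot(&slot, hooked_puts);`, after which every call through the slot reaches hooked_puts() first. In a real process, the cell first has to be located in the loaded image, which is the subject of the rest of this article.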

So with the help of this specific example, we found out how Mach-O dynamic linking is performed, what the import tables in Mach-O are, and what elements they consist of. Now let’s move to the analysis of Mach-O.

Searching for an element in the import table

We need to find the corresponding cell in the import table by symbol name. The algorithm for doing this is nontrivial.

First, we need to find the symbol itself in the symbol table. The table is an array of the following structures:

struct nlist
{
  union
  {
     int32_t n_strx;
  } n_un;
  uint8_t n_type;
  uint8_t n_sect;
  int16_t n_desc;
  uint32_t n_value;
};

where n_un.n_strx is the offset of the name of this symbol, in bytes, from the beginning of the string table. The rest concerns the type of symbol, the section where it’s placed, etc. In short, here are several of the last elements for our experimental Mach-O dynamic library called libtest.dylib (32-bit version):

A string table is a chain of names, each of which ends with a zero. But it’s worth mentioning that the compiler adds the underscore character “_” to the beginning of each name. That’s why the name “puts” will look like “_puts” in the string table.

Here is an example:

We can find out the location of the symbol table and string table with the help of the corresponding loader command (LC_SYMTAB):

But the symbol table isn’t uniform; it’s divided into several parts. One of them is of special interest to us: it contains undefined symbols, i.e. those that are linked dynamically. MachOView highlights these symbols with a blue background. To determine which part of the symbol table holds the undefined symbols, we need to look at the load command for dynamic symbols (LC_DYSYMTAB):

Here is its representation in the C language:

struct dysymtab_command
{
    uint32_t cmd;
    uint32_t cmdsize;
    uint32_t ilocalsym;
    uint32_t nlocalsym;
    uint32_t iextdefsym;
    uint32_t nextdefsym;
    uint32_t iundefsym;
    uint32_t nundefsym;
    uint32_t tocoff;
    uint32_t ntoc;
    uint32_t modtaboff;
    uint32_t nmodtab;
    uint32_t extrefsymoff;
    uint32_t nextrefsyms;
    uint32_t indirectsymoff;
    uint32_t nindirectsyms;
    uint32_t extreloff;
    uint32_t nextrel;
    uint32_t locreloff;
    uint32_t nlocrel;
};

Here, dysymtab_command.iundefsym is an index in the symbol table from which the subset of undefined symbols starts. dysymtab_command.nundefsym is the number of undefined symbols. Since we know that we’re looking for an undefined symbol, we should look for it only in this subset in the symbol table.
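A name lookup restricted to this subset might look like the sketch below. It assumes that the symbol table and the string table have already been located through LC_SYMTAB and are in the reader's byte order. The nlist32 struct repeats the layout of struct nlist shown above, and find_undefined_symbol() is a hypothetical helper.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same layout as the 32-bit struct nlist shown above. */
struct nlist32 {
    union {
        int32_t n_strx;   /* offset of the name in the string table */
    } n_un;
    uint8_t  n_type;
    uint8_t  n_sect;
    int16_t  n_desc;
    uint32_t n_value;
};

/*
 * Searches only the undefined-symbol range [iundefsym, iundefsym + nundefsym).
 * Returns the index counted from the beginning of the symbol table, or -1.
 * `name` must include the leading underscore, e.g. "_puts".
 */
long find_undefined_symbol(const struct nlist32 *symtab,
                           const char *strtab, uint32_t strsize,
                           uint32_t iundefsym, uint32_t nundefsym,
                           const char *name)
{
    size_t name_len = strlen(name);

    for (uint32_t k = 0; k < nundefsym; k++) {
        uint32_t index = iundefsym + k;
        uint32_t strx = (uint32_t)symtab[index].n_un.n_strx;

        if (strx >= strsize || name_len >= strsize - strx)
            continue;  /* the name would not fit inside the string table */
        /* Compare including the terminating zero byte. */
        if (memcmp(strtab + strx, name, name_len + 1) == 0)
            return (long)index;
    }
    return -1;
}
```

The returned value is the index from the beginning of the whole symbol table, not the position within the undefined subset.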

And now one very important point: when we find a symbol by its name, the most important thing to remember is its index counted from the beginning of the symbol table. This is because another important table, the table of indirect symbols, consists of these indexes. We can find this table by the value of dysymtab_command.indirectsymoff; dysymtab_command.nindirectsyms gives the number of indexes.

This table consists of only one element in our case (there are many more elements in real life):

And finally, let’s look at the section __IMPORT, __jump_table, the element of which we need to find. The section looks like the following:

The section.reserved1 field of this section is very important (MachOView calls it Indirect Sym Index). It’s the index in the table of indirect symbols from which the one-to-one correspondence with __jump_table elements begins. And as you may recall, elements of the table of indirect symbols are indexes in the symbol table. Do you see what we’re getting at?
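The correspondence just described can be expressed as a short search. This is a sketch under stated assumptions, not a complete algorithm: the table of indirect symbols is assumed to be already in memory as an array of 32-bit indexes, entry_count is assumed to be the number of elements in the import table section, and find_import_entry() is a hypothetical name.

```c
#include <stdint.h>

/*
 * Element i of the import table section corresponds to
 * indirect_table[reserved1 + i], which holds an index into the symbol table.
 * Returns the position i of the element bound to symbol_index, or -1.
 */
long find_import_entry(const uint32_t *indirect_table, uint32_t nindirectsyms,
                       uint32_t reserved1, uint32_t entry_count,
                       uint32_t symbol_index)
{
    if (reserved1 >= nindirectsyms)
        return -1;  /* the section points outside the table */

    for (uint32_t i = 0; i < entry_count; i++) {
        if (i >= nindirectsyms - reserved1)
            break;  /* ran past the end of the table */
        if (indirect_table[reserved1 + i] == symbol_index)
            return (long)i;
    }
    return -1;
}
```

Given the position i, the address of the element would be section.addr + i * entry size, where the entry size is 5 bytes for a __jump_table stub and the size of a pointer for other import table layouts.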

But before putting everything together, let’s glance at the situation in Snow Leopard to complete the picture. Here, __DATA, __la_symbol_ptr plays the role of the import table. The differences are minor.

Here is the command for loading symbols:

And here are the last elements of the symbol table that this command refers to:

There are two undefined symbols on the blue background. This corresponds to data from the loader command of dynamic symbols (LC_DYSYMTAB):

Also, there are four elements instead of one in the table of indirect symbols:

But if we look at the reserved1 field of the required __la_symbol_ptr section, we’ll discover that the one-to-one mapping of its elements onto the table of indirect symbols starts not from the beginning of that table but from its fourth element (index 3):

The contents of the import table that the __la_symbol_ptr section describes will be as follows:

Knowing all these subtleties of Mach-O, we can formulate a search algorithm to find the required element in the import table. That’s a matter for the next article.

Conclusion

In this article, we discussed how you can arrange dynamic linking of imported functions in Mach-O. Hopefully, our tips and tricks will be useful for software developers who are working on Mac-related projects. At Apriorit, we have a team of dedicated experts with extensive experience programming for Mac OS X / macOS. They’ll gladly assist you in implementing your Mac project. Get in touch with us to discuss the details.

Useful links

After reading some tips on creating software for OS X, learn more about how to reverse engineer OS X software.
