How to Run a Binary File That Is Mach-O Executable I386 on Linux

What is required for a Mach-O executable to load?

Since 10.10.5 Yosemite, the executable file must be at least 4096 bytes long (PAGE_SIZE), or it will be killed immediately. The relevant code was found by @Siguza in the XNU kernel's exec_activate_image function: https://github.com/apple/darwin-xnu/blob/0a798f6738bc1db01281fc08ae024145e84df927/bsd/kern/kern_exec.c#L1456

Without dyld

Assuming you want a 64-bit macOS executable using only system calls, you need:

Mach-O 64-bit Header
LC_SEGMENT_64 __PAGEZERO (with nonzero size, name can be anything)
LC_SEGMENT_64 __TEXT (name can be anything; must be readable and executable; sections are optional)
LC_UNIXTHREAD

Here is my example for this case.
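As a rough cross-check of the list above (a sketch written for illustration, not the example the author refers to), the following C function walks the load commands of an in-memory image and tests for the ingredients named so far. The constants are the values from Apple's <mach-o/loader.h>, copied in so the code also builds on Linux; the function name looks_like_static_macho64 is made up here. It assumes a little-endian host and does not check segment permissions or sizes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Values from <mach-o/loader.h>, repeated here because that header
   does not exist on Linux. */
#define MH_MAGIC_64    0xfeedfacfu
#define LC_SEGMENT_64  0x19u
#define LC_UNIXTHREAD  0x5u
#define MIN_FILE_SIZE  4096u   /* PAGE_SIZE limit described above */

struct mach_header_64 {
    uint32_t magic, cputype, cpusubtype, filetype;
    uint32_t ncmds, sizeofcmds, flags, reserved;
};

struct load_command {
    uint32_t cmd, cmdsize;
};

/* Returns 1 if buf is long enough, carries the 64-bit magic, and has at
   least two LC_SEGMENT_64 commands plus exactly one LC_UNIXTHREAD.
   Hypothetical helper: assumes a little-endian host. */
int looks_like_static_macho64(const unsigned char *buf, size_t len)
{
    struct mach_header_64 hdr;
    struct load_command lc;
    size_t off = sizeof hdr;
    unsigned segments = 0, threads = 0;

    assert(buf != NULL);
    if (len < MIN_FILE_SIZE)
        return 0;
    memcpy(&hdr, buf, sizeof hdr);
    if (hdr.magic != MH_MAGIC_64)
        return 0;

    for (uint32_t i = 0; i < hdr.ncmds; i++) {
        if (len - off < sizeof lc)
            return 0;                 /* command table is truncated */
        memcpy(&lc, buf + off, sizeof lc);
        if (lc.cmdsize < sizeof lc || lc.cmdsize > len - off)
            return 0;                 /* malformed cmdsize */
        if (lc.cmd == LC_SEGMENT_64)
            segments++;
        else if (lc.cmd == LC_UNIXTHREAD)
            threads++;
        off += lc.cmdsize;
    }
    return segments >= 2 && threads == 1;
}
```

To use it, read the whole file into a buffer and pass the buffer and its length; a result of 0 only means the file does not match this particular checklist, not that the kernel would necessarily reject it.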

With dyld

You can't do much without dyld though, so if you want to use it the minimal set is:

Mach-O 64-bit Header
LC_SEGMENT_64 __PAGEZERO (with nonzero size)
LC_SEGMENT_64 __TEXT (name can be anything; must be readable and executable; sections are optional)
LC_SEGMENT_64 __LINKEDIT (must be writable because dyld requires a writable segment; in an ld-linked binary the writable segment would typically be __DATA)
LC_DYLD_INFO_ONLY (specifies where the actual dyld load commands physically are in the executable; typically they are found in __LINKEDIT, but there is no restriction on this). Interestingly, LC_SYMTAB is accepted instead, but without LC_DYLD_INFO_ONLY dyld cannot actually be used.
LC_DYSYMTAB (this can be empty)
LC_LOAD_DYLINKER
LC_MAIN or LC_UNIXTHREAD
LC_LOAD_DYLIB (at least one actual dylib to load that eventually depends on libSystem or libSystem itself for LC_MAIN to work)

Additionally, since macOS Monterey 12.3:

LC_SYMTAB (which is now required if LC_DYSYMTAB is used)

`LC_UNIXTHREAD` and `LC_MAIN`

In modern executables (since 10.8 Mountain Lion), LC_UNIXTHREAD is replaced by LC_MAIN, which requires dyld. LC_UNIXTHREAD is still supported for any executable as of 10.12 Sierra, and it should remain supported in future macOS versions, because the dyld executable itself uses it to start.
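The dependency of LC_MAIN on dyld is visible in the command number itself. A minimal sketch, using the constant values from <mach-o/loader.h> (the enum and function names are made up for illustration): LC_MAIN is defined with the LC_REQ_DYLD bit set, which marks commands the dynamic linker has to understand, while LC_UNIXTHREAD is not.

```c
#include <assert.h>
#include <stdint.h>

/* Values from <mach-o/loader.h>. */
#define LC_REQ_DYLD   0x80000000u
#define LC_UNIXTHREAD 0x5u
#define LC_MAIN       (0x28u | LC_REQ_DYLD)

static_assert(LC_MAIN == 0x80000028u, "LC_MAIN carries the LC_REQ_DYLD bit");

/* Hypothetical names, introduced only for this sketch. */
enum entry_kind { ENTRY_NONE, ENTRY_UNIXTHREAD, ENTRY_MAIN };

/* Maps a load command number to the kind of entry point it declares. */
enum entry_kind classify_entry(uint32_t cmd)
{
    if (cmd == LC_MAIN)
        return ENTRY_MAIN;
    if (cmd == LC_UNIXTHREAD)
        return ENTRY_UNIXTHREAD;
    return ENTRY_NONE;
}

/* 1 if the command is flagged as one dyld must understand. */
int must_be_understood_by_dyld(uint32_t cmd)
{
    return (cmd & LC_REQ_DYLD) != 0;
}
```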

For dyld to work, the extra steps depend on the type of binding:

Bind at load is the least-effort approach: an LC_DYLD_INFO_ONLY command pointing to valid dyld load commands that target a writable segment will do the trick.

Lazy binding additionally requires platform-specific code in __TEXT, which uses dyld_stub_binder (itself bound at load time) to lazily resolve the address of a dyld-loaded function.

There are other types of dyld binding which I don't cover here.

Further details can be found here: https://github.com/opensource-apple/dyld/blob/master/src/ImageLoaderMachO.cpp

Fat Mach-O Executable Multi-purpose?

Yes it is possible, but hardly useful. Before I get to why, here's how to create one:

Take this C file:

#ifdef __LP64__
int main(void)
#else
int derp(void)
#endif
{
    return 123;
}

Compile it as a 64-bit executable and a 32-bit shared library:

gcc -o t t.c -Wall
gcc -m32 -o t.dylib -Wall t.c -shared

And smash them together:

lipo -create -output t.fat t t.dylib

Now, why is that supposed to be not useful?

Because you're limited to one binary per architecture, and you have little to no control over which slice is used.

In theory, you can have slices for all these architectures in the same fat binary:

i386
x86_64
x86_64h
armv6
armv6m
armv7
armv7s
armv7k
armv7m
arm64

So you could smash an executable, a dylib, a linker and a kernel extension into one fat binary, but you'd have a hard time getting anything useful out of it.

The biggest problem is that the OS chooses which slice to load. For executables, that will always be the closest match for the processor you're running on. For dylibs, dylinkers and kexts, it will first be determined whether the process they're gonna be loaded into is 32- or 64-bit, but once that distinction has been made, there too you will get the slice most closely matching your CPU's capabilities.
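To see which slices the OS has to choose from, the fat header can be read directly. The sketch below is an illustration under stated assumptions, not a replacement for lipo -info: it handles only the classic 32-bit fat format (magic 0xcafebabe, 20-byte fat_arch entries, all fields big-endian), and the function name fat_slice_cputypes is made up here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values from <mach-o/fat.h> and <mach/machine.h>. */
#define FAT_MAGIC       0xcafebabeu
#define CPU_ARCH_ABI64  0x01000000u   /* set in cputype for 64-bit slices */
#define FAT_ARCH_SIZE   20u           /* cputype, cpusubtype, offset, size, align */

/* Fat headers are stored big-endian regardless of the host CPU. */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Stores the cputype of up to max slices in cputypes and returns the
   total slice count, or -1 if buf does not start with a fat header. */
int fat_slice_cputypes(const unsigned char *buf, size_t len,
                       uint32_t *cputypes, int max)
{
    assert(buf != NULL && cputypes != NULL);
    if (len < 8 || read_be32(buf) != FAT_MAGIC)
        return -1;

    uint32_t n = read_be32(buf + 4);
    if (n > (len - 8) / FAT_ARCH_SIZE)
        return -1;                    /* arch table runs past the buffer */

    for (uint32_t i = 0; i < n && (int)i < max; i++)
        cputypes[i] = read_be32(buf + 8 + i * FAT_ARCH_SIZE);
    return (int)n;
}
```

For the t.fat file built above, one would expect two entries: 0x01000007 (x86_64, with CPU_ARCH_ABI64 set) and 7 (i386).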

I imagine back on Mac OS X 10.5 you could've had a 64-bit binary bundled with a 32-bit kext that it could try and load. However, outside of that I cannot think of a use case for this.

How do I embed the contents of a binary file in an executable on Mac OS X?

As evidenced in this other question about objcopy, another way to include a binary file into an executable is to use the .incbin assembler directive. This solution has two main advantages over objcopy: the developer is in control of the symbol names (objcopy appears to have a fixed scheme to name them), and, well, it doesn't require objcopy.

The solution also has advantages over the linker-based -sectcreate solution. It's cross-platform and accessing the data is much, much more straightforward.

I'm using this Xcode build rule script to generate the file to be included and an assembly file with the .incbin directive:

my_generation_tool -o $DERIVED_FILE_DIR/$INPUT_FILE_NAME.out $INPUT_FILE_PATH

export AS_PATH=$DERIVED_FILE_DIR/$INPUT_FILE_NAME.out.s

echo "\t.global _data_start_$INPUT_FILE_BASE" > $AS_PATH
echo "\t.global _data_end_$INPUT_FILE_BASE" >> $AS_PATH
echo "_data_start_$INPUT_FILE_BASE:" >> $AS_PATH
echo "\t.incbin \"$INPUT_FILE_NAME.out\"" >> $AS_PATH
echo "_data_end_$INPUT_FILE_BASE:" >> $AS_PATH

Then, given a file "somefile.gen" that is processed with this rule, the assembly will look like:

    .global _data_start_somefile
    .global _data_end_somefile
_data_start_somefile:
    .incbin "somefile.gen.out"
_data_end_somefile:

The data can be accessed in C using the data_start_somefile and data_end_somefile symbols (on macOS the compiler prefixes C symbol names with an underscore, which is why the assembly file has them):

extern char data_start_somefile, data_end_somefile;

for (const char* c = &data_start_somefile; c != &data_end_somefile; ++c)
{
    // do something with character
}

The answer on the other thread has more bells and whistles that some people may find useful (for instance, a length symbol).

Linux Mach-O Disassembler

AFAIK, the native Darwin binary tools are part of the cctools package. They don't have the same command line syntax or output as the GNU binutils. Later binutils versions (e.g., 2.22) do support the Mach-O format, however. You can get these prebuilt, with a 'g' prefix on the tool names, as mentioned here. Alternatively, you can compile binutils yourself, with something like:

> ./configure --prefix=$CROSSTOOLDIR --target=x86_64-apple-darwin \
--enable-64-bit-bfd --disable-nls --disable-werror

Installation will yield a bin/ directory where the utilities are prefixed with x86_64-apple-darwin. It should handle i386 Mach-O format (and FAT binaries) fine.

Corrupt Binary Executable?

The file is not corrupt: Mach-O is the executable format for Mac OS X. To run the program on Linux, you need a binary compiled for Linux.
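The distinction can be made from the first four bytes of the file, which is essentially what the file utility reports. A minimal sketch (the function name classify_magic and the returned strings are made up for illustration; only little-endian Mach-O and the classic fat magic are recognised):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Classifies a file by its leading magic bytes. Hypothetical helper. */
const char *classify_magic(const unsigned char *buf, size_t len)
{
    /* Mach-O magics 0xfeedface / 0xfeedfacf as stored on little-endian CPUs. */
    static const unsigned char elf[4]     = {0x7f, 'E', 'L', 'F'};
    static const unsigned char macho32[4] = {0xce, 0xfa, 0xed, 0xfe};
    static const unsigned char macho64[4] = {0xcf, 0xfa, 0xed, 0xfe};
    static const unsigned char fat[4]     = {0xca, 0xfe, 0xba, 0xbe};

    assert(buf != NULL);
    if (len < 4)
        return "unknown";
    if (memcmp(buf, elf, 4) == 0)
        return "ELF";
    if (memcmp(buf, macho32, 4) == 0)
        return "Mach-O 32-bit";
    if (memcmp(buf, macho64, 4) == 0)
        return "Mach-O 64-bit";
    if (memcmp(buf, fat, 4) == 0)
        return "Mach-O fat";
    return "unknown";
}
```

A "Mach-O executable i386" file falls into the "Mach-O 32-bit" case; the Linux kernel only loads the "ELF" case natively.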

cannot execute binary file error that I scpd to a different computer

You cannot compile a program on OS X and run the result on Ubuntu unless you use a cross compiler. Here is the answer that I found:
Compiling C Program on OS X to Run on Ubuntu
