Advanced Apple Debugging & Reverse Engineering

18
Hello, Mach-O
Written by Derek Selander

Mach-O is the file format used for a compiled program running on any of your Apple operating systems. Knowledge of the format is important for both debugging and reverse engineering, since the layout Mach-O defines is applicable to how the executable is stored on disk as well as how the executable is loaded into memory.

Knowing which area of memory an instruction is referencing is useful on the reverse engineering side, but there are a number of useful hidden treasures on the debugging front when exploring Mach-O. For example:

  • You can introspect an external function call at runtime.
  • You can quickly find the reference to a singleton’s memory address without having to trip a breakpoint.
  • You can inspect and modify variables in your own app or other frameworks.
  • You can perform security audits and make sure no internal, secret messages are being sent out into production in the form of strings or methods.

This chapter introduces the concepts of Mach-O, while the next chapter, Mach-O Fun, will show the amusing things that are possible with this knowledge. Make sure you have that caffeine on board for this chapter, since the theory comes first and the fun follows in the next chapter.

Terminology

Before diving into the weeds with all the different C structs you’re about to view, it would be best to take a high-level, bird’s-eye view of the Mach-O layout.

This is the layout of every compiled executable; every main program, every framework, every kernel extension, everything that’s compiled on an Apple platform.

At the start of every compiled Apple program is the Mach-O header that gives information about the CPU this program can run on, the type of executable it is (A framework? A standalone program?) as well as how many load commands immediately follow it.

Load commands are instructions on how to load the program and are made up of C structs, which vary in size depending on the type of load command.

Some of the load commands provide instructions about how to load segments. Think of segments as areas of memory that have a specific type of memory protection. For example, executable code should only have read and execute permissions; it doesn’t need write permissions.

Other parts of the program, such as global variables or singletons, need read and write permissions, but not executable permissions. This means that executable code and the address to global variables will live in separate segments.

Segments can have 0 or more subcomponents called sections. These are more finely-grained areas bound by the same memory protections given by their parent segment.
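
To make the idea of per-segment memory protection concrete, here is a minimal C sketch that renders a protection bitmask in the familiar r/w/x form. The three constants mirror VM_PROT_READ, VM_PROT_WRITE and VM_PROT_EXECUTE from mach/vm_prot.h; the SKETCH_ prefix and the helper name prot_to_string are made up for this illustration.

```c
#include <stdint.h>

/* Values mirror VM_PROT_READ/WRITE/EXECUTE from <mach/vm_prot.h>.
 * They are redefined here (assumption: same values) so the sketch
 * compiles on any platform. */
#define SKETCH_VM_PROT_READ    0x1
#define SKETCH_VM_PROT_WRITE   0x2
#define SKETCH_VM_PROT_EXECUTE 0x4

/* Hypothetical helper: writes a string such as "r-x" into out,
 * which must have room for 3 characters plus the terminating NUL. */
void prot_to_string(int32_t prot, char out[4]) {
  out[0] = (prot & SKETCH_VM_PROT_READ)    ? 'r' : '-';
  out[1] = (prot & SKETCH_VM_PROT_WRITE)   ? 'w' : '-';
  out[2] = (prot & SKETCH_VM_PROT_EXECUTE) ? 'x' : '-';
  out[3] = '\0';
}
```

With this helper, a protection value of 5 (read plus execute) comes out as r-x, the kind of protection you would expect on a segment holding code, while 3 (read plus write) comes out as rw-.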

Take another look at the above diagram. Segment Command 1 points to an offset in the executable that contains four section commands, while Segment Command 2 points to an offset that contains 0 section commands. Finally, Segment Command 3 doesn’t point to any offset in the executable.

It’s these sections that can be of profound interest to developers and reverse engineers, since each serves a unique purpose in the program. For example, there’s a specific section to store hard-coded UTF-8 strings, there’s a specific section to store references to statically defined variables and so on.

The ultimate goal of these two Mach-O chapters is to show you some interesting load commands in this chapter, and reveal some interesting sections in the next chapter.

In this chapter, you’ll be seeing a lot of references to system headers. If you see something like mach-o/stab.h, you can view it via the Open Quickly menu in Xcode by pressing ⌘ + Shift + O (the default), then typing in /usr/include/mach-o/stab.h.

I’d recommend adding /usr/include/ to the search query since Xcode isn’t all that smart at times.

If you want to view this header without Xcode, then the physical location will be at:

${PATH_TO_XCODE}/Contents/Developer/Platforms/${SYSTEM_PLATFORM}.platform/Developer/SDKs/${SYSTEM_PLATFORM}.sdk/usr/include/mach-o/stab.h 

Where ${SYSTEM_PLATFORM} can be MacOSX, iPhoneOS, iPhoneSimulator, WatchOS, etc.

Now that you’ve gotten a bird’s-eye overview, it’s time to drop down into the weeds and view all the lovely C structs.

The Mach-O header

At the beginning of every compiled Apple executable is a special struct that indicates if it’s a Mach-O executable. This struct can be found in mach-o/loader.h.

struct mach_header_64 {
  uint32_t  magic;    /* mach magic number identifier */
  cpu_type_t  cputype;  /* cpu specifier */
  cpu_subtype_t cpusubtype; /* machine specifier */
  uint32_t  filetype; /* type of file */
  uint32_t  ncmds;    /* number of load commands */
  uint32_t  sizeofcmds; /* the size of all the load commands */
  uint32_t  flags;    /* flags */
  uint32_t  reserved; /* reserved */
};
/* Constant for the magic field of the mach_header_64 (64-bit architectures) */
#define MH_MAGIC_64 0xfeedfacf /*the 64-bit mach magic number*/
#define MH_CIGAM_64 0xcffaedfe /*NXSwapInt(MH_MAGIC_64)*/
#define MH_OBJECT 0x1   /* relocatable object file */
#define MH_EXECUTE  0x2 /* demand paged executable file */
#define MH_FVMLIB 0x3   /* fixed VM shared library file */
#define MH_CORE   0x4   /* core file */
... // there’s way more below, but omitting for brevity...
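
As a rough illustration of how these constants are used, the following C sketch classifies a magic value and names a filetype. The constants are copied from the excerpt above; the function names describe_magic and describe_filetype are hypothetical and the returned strings simply reuse the header’s own comments.

```c
#include <stdint.h>

/* Constants copied from the mach-o/loader.h excerpt above. */
#define MH_MAGIC_64 0xfeedfacfu
#define MH_CIGAM_64 0xcffaedfeu
#define MH_OBJECT   0x1u
#define MH_EXECUTE  0x2u
#define MH_FVMLIB   0x3u
#define MH_CORE     0x4u

/* Hypothetical helper: says what a 32-bit magic value means when it
 * has been read in the host's native byte order. */
const char *describe_magic(uint32_t magic) {
  switch (magic) {
    case MH_MAGIC_64: return "64-bit Mach-O, same byte order as host";
    case MH_CIGAM_64: return "64-bit Mach-O, byte-swapped";
    default:          return "not a 64-bit Mach-O magic";
  }
}

/* Hypothetical helper: maps the first few filetype values to the
 * descriptions used in the header comments. Later values are elided
 * in the excerpt, so they fall through to "other". */
const char *describe_filetype(uint32_t filetype) {
  switch (filetype) {
    case MH_OBJECT:  return "relocatable object file";
    case MH_EXECUTE: return "demand paged executable file";
    case MH_FVMLIB:  return "fixed VM shared library file";
    case MH_CORE:    return "core file";
    default:         return "other";
  }
}
```

A tool would typically check the magic first and only interpret the remaining header fields once it knows the byte order.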

Mach-O header in grep

Open up a Terminal window. I’ll pick on the grep executable command, but you can pick on any Terminal command that suits your interests. Type the following:

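xxd -l 32 $(which grep)

This dumps the first 32 bytes of the file, which is exactly the size of a mach_header_64. The output will look similar to this:

00000000: cffa edfe 0700 0001 0300 0080 0200 0000  ................
00000010: 1300 0000 4007 0000 8500 2000 0000 0000  ....@..... .....

The first four bytes are the magic number:

cffa edfe

Split into individual bytes:

cf fa ed fe

The value is stored in little-endian byte order, so reverse the bytes to read it:

fe ed fa cf

That’s MH_MAGIC_64. To skip the mental byte swapping, pass xxd the -e option, which prints the values as little-endian:

xxd -e -l 32 $(which grep)

00000000: feedfacf 01000007 80000003 00000002  ................
00000010: 00000013 00000740 00200085 00000000  ....@..... .....

Mapping these values onto the struct gives:

struct mach_header_64 {
  uint32_t      magic      = 0xfeedfacf
  cpu_type_t    cputype    = 0x01000007
  cpu_subtype_t cpusubtype = 0x80000003
  uint32_t      filetype   = 0x00000002
  uint32_t      ncmds      = 0x00000013
  uint32_t      sizeofcmds = 0x00000740
  uint32_t      flags      = 0x00200085
  uint32_t      reserved   = 0x00000000
};

The filetype of 0x2 is MH_EXECUTE, and the cputype of 0x01000007 is CPU_TYPE_X86 ORed with CPU_ARCH_ABI64, both defined in mach/machine.h:

#define CPU_ARCH_ABI64    0x01000000  /* 64 bit ABI */
...
#define CPU_TYPE_X86    ((cpu_type_t) 7)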
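
The same decoding can be sketched in C. This is a minimal example, not how any Apple tool does it: the struct header64, the helpers read_le32, parse_header64 and is_x86_64, and the array grep_header_bytes are names invented here. The byte values are the 32 bytes from the xxd dump of grep shown above, and CPU_ARCH_MASK is assumed to match the definition in mach/machine.h.

```c
#include <stdint.h>

#define MH_MAGIC_64    0xfeedfacfu
#define CPU_ARCH_MASK  0xff000000u /* assumed to match mach/machine.h */
#define CPU_ARCH_ABI64 0x01000000u /* 64 bit ABI */
#define CPU_TYPE_X86   7u

/* Local mirror of mach_header_64. The real header uses signed typedefs
 * for cputype and cpusubtype; unsigned fields keep the masking simple. */
struct header64 {
  uint32_t magic, cputype, cpusubtype, filetype;
  uint32_t ncmds, sizeofcmds, flags, reserved;
};

/* The 32 bytes from the xxd dump of grep shown above. */
const uint8_t grep_header_bytes[32] = {
  0xcf, 0xfa, 0xed, 0xfe, 0x07, 0x00, 0x00, 0x01,
  0x03, 0x00, 0x00, 0x80, 0x02, 0x00, 0x00, 0x00,
  0x13, 0x00, 0x00, 0x00, 0x40, 0x07, 0x00, 0x00,
  0x85, 0x00, 0x20, 0x00, 0x00, 0x00, 0x00, 0x00
};

/* Reads a little-endian 32-bit value regardless of host byte order. */
static uint32_t read_le32(const uint8_t *p) {
  return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
         ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Fills out from 32 little-endian bytes. Returns 0 on success, or -1
 * if the magic is not MH_MAGIC_64. */
int parse_header64(const uint8_t bytes[32], struct header64 *out) {
  if (read_le32(bytes) != MH_MAGIC_64) return -1;
  out->magic      = read_le32(bytes);
  out->cputype    = read_le32(bytes + 4);
  out->cpusubtype = read_le32(bytes + 8);
  out->filetype   = read_le32(bytes + 12);
  out->ncmds      = read_le32(bytes + 16);
  out->sizeofcmds = read_le32(bytes + 20);
  out->flags      = read_le32(bytes + 24);
  out->reserved   = read_le32(bytes + 28);
  return 0;
}

/* True when cputype is CPU_TYPE_X86 with the 64-bit ABI bit set. */
int is_x86_64(const struct header64 *h) {
  return (h->cputype & CPU_ARCH_ABI64) != 0 &&
         (h->cputype & ~CPU_ARCH_MASK) == CPU_TYPE_X86;
}
```

Reading each field byte by byte, rather than casting the buffer to a struct pointer, keeps the sketch independent of the host’s byte order and alignment rules.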

The fat header

Some executables are actually a group of one or more executables “glued” together. For example, many apps compile both a 32-bit and 64-bit executable and place them into a “fat” executable. This “gluing together” of multiple executables is indicated by a fat header, which also has a unique magic value differentiating it from a Mach-O header.

The relevant definitions live in mach-o/fat.h:

#define FAT_MAGIC 0xcafebabe
#define FAT_CIGAM 0xbebafeca  /* NXSwapLong(FAT_MAGIC) */

struct fat_header {
  uint32_t  magic;      /* FAT_MAGIC or FAT_MAGIC_64 */
  uint32_t  nfat_arch;  /* number of structs that follow */
};

...
#define FAT_MAGIC_64  0xcafebabf
#define FAT_CIGAM_64  0xbfbafeca  /* NXSwapLong(FAT_MAGIC_64) */

The fat header is followed by nfat_arch architecture structs, one per embedded executable. A magic of FAT_MAGIC_64 means they are fat_arch_64 structs:

struct fat_arch_64 {
  cpu_type_t    cputype;    /* cpu specifier (int) */
  cpu_subtype_t cpusubtype; /* machine specifier (int) */
  uint64_t      offset;     /* file offset to this object file */
  uint64_t      size;       /* size of this object file */
  uint32_t      align;      /* alignment as a power of 2 */
  uint32_t      reserved;   /* reserved */
};

A magic of FAT_MAGIC means they are fat_arch structs:

struct fat_arch {
  cpu_type_t    cputype;    /* cpu specifier (int) */
  cpu_subtype_t cpusubtype; /* machine specifier (int) */
  uint32_t      offset;     /* file offset to this object file */
  uint32_t      size;       /* size of this object file */
  uint32_t      align;      /* alignment as a power of 2 */
};

The file command reports the architectures found in a fat executable:

file /System/Library/Frameworks/CoreFoundation.framework/CoreFoundation 

/System/Library/Frameworks/CoreFoundation.framework/CoreFoundation: Mach-O universal binary with 3 architectures: [x86_64:Mach-O 64-bit dynamically linked shared library x86_64] [x86_64h]
/System/Library/Frameworks/CoreFoundation.framework/CoreFoundation (for architecture x86_64): Mach-O 64-bit dynamically linked shared library x86_64
/System/Library/Frameworks/CoreFoundation.framework/CoreFoundation (for architecture i386): Mach-O dynamically linked shared library i386
/System/Library/Frameworks/CoreFoundation.framework/CoreFoundation (for architecture x86_64h):  Mach-O 64-bit dynamically linked shared library x86_64h

To see which of these is actually loaded at runtime, launch a process that links CoreFoundation and dump the header at the module’s load address:

lldb $(which plutil)
(lldb) run
Process 946 launched: '/usr/bin/plutil' (x86_64)
No files specified.
plutil: [command_option] [other_options] file...
... etc ...
(lldb) image list -h CoreFoundation
[  0] 0x00007fff33cf6000
(lldb) x/8wx 0x00007fff33cf6000
0x7fff33cf6000: 0xfeedfacf 0x01000007 0x00000008 0x00000006
0x7fff33cf6010: 0x00000013 0x00001100 0xc2100085 0x00000000

The cpusubtype of 0x00000008 matches the Haswell variant defined in mach/machine.h:

#define CPU_SUBTYPE_X86_64_H    ((cpu_subtype_t)8)  /* Haswell feature subset */

Now look at the fat header on disk. With the -e option, the values come out byte-swapped:

xxd -l 68 -e /System/Library/Frameworks/CoreFoundation.framework/CoreFoundation

00000000: bebafeca 03000000 07000001 03000000  ................
00000010: 00100000 30767400 0c000000 07000000  .....tv0........
00000020: 03000000 00907400 e0ca6700 0c000000  .....t...g......
00000030: 07000001 08000000 0060dc00 d0e67400  ..........`..t..
00000040: 0c000000                             ....

That’s because the fat header and its fat_arch structs are stored in big-endian order. Group the bytes in fours without swapping instead:

xxd -l 68 -g 4 /System/Library/Frameworks/CoreFoundation.framework/CoreFoundation

00000000: cafebabe 00000003 01000007 00000003  ................
00000010: 00001000 00747630 0000000c 00000007  .....tv0........
00000020: 00000003 00749000 0067cae0 0000000c  .....t...g......
00000030: 01000007 00000008 00dc6000 0074e6d0  ..........`..t..
00000040: 0000000c                             ....

The first fat_arch has an offset of 0x00001000, or 4096 in decimal. Skipping to that offset reveals the Mach-O header of the first x86_64 executable:

xxd -l 32 -e -s 4096 /System/Library/Frameworks/CoreFoundation.framework/CoreFoundation 

00001000: feedfacf 01000007 00000003 00000006  ................
00001010: 00000015 00001120 02100085 00000000  .... ...........
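
The fat header walk can also be sketched in C. The example below is an illustration under stated assumptions: it handles only the 32-bit FAT_MAGIC form, reads every field as big-endian, and uses invented names (fat_arch_info, read_be32, parse_fat, corefoundation_fat_bytes). The bytes are the 68 bytes from the xxd -g 4 dump above.

```c
#include <stddef.h>
#include <stdint.h>

#define FAT_MAGIC 0xcafebabeu

/* Local mirror of struct fat_arch with fixed-width unsigned fields. */
struct fat_arch_info {
  uint32_t cputype, cpusubtype, offset, size, align;
};

/* The 68 bytes from the xxd -g 4 dump of CoreFoundation shown above. */
const uint8_t corefoundation_fat_bytes[68] = {
  0xca, 0xfe, 0xba, 0xbe, 0x00, 0x00, 0x00, 0x03,
  0x01, 0x00, 0x00, 0x07, 0x00, 0x00, 0x00, 0x03, 0x00, 0x00, 0x10, 0x00,
  0x00, 0x74, 0x76, 0x30, 0x00, 0x00, 0x00, 0x0c,
  0x00, 0x00, 0x00, 0x07, 0x00, 0x00, 0x00, 0x03, 0x00, 0x74, 0x90, 0x00,
  0x00, 0x67, 0xca, 0xe0, 0x00, 0x00, 0x00, 0x0c,
  0x01, 0x00, 0x00, 0x07, 0x00, 0x00, 0x00, 0x08, 0x00, 0xdc, 0x60, 0x00,
  0x00, 0x74, 0xe6, 0xd0, 0x00, 0x00, 0x00, 0x0c
};

/* Reads a big-endian 32-bit value regardless of host byte order. */
static uint32_t read_be32(const uint8_t *p) {
  return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
         ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Parses a FAT_MAGIC header and its fat_arch entries from buf.
 * Returns the number of entries written to out, or -1 if the magic is
 * wrong, the buffer is too short, or out is too small. */
int parse_fat(const uint8_t *buf, size_t len,
              struct fat_arch_info *out, size_t max_out) {
  if (len < 8 || read_be32(buf) != FAT_MAGIC) return -1;
  const uint32_t n = read_be32(buf + 4);
  if (n > max_out || n > (len - 8) / 20) return -1;
  for (uint32_t i = 0; i < n; i++) {
    const uint8_t *p = buf + 8 + (size_t)i * 20; /* 20 = sizeof fat_arch */
    out[i].cputype    = read_be32(p);
    out[i].cpusubtype = read_be32(p + 4);
    out[i].offset     = read_be32(p + 8);
    out[i].size       = read_be32(p + 12);
    out[i].align      = read_be32(p + 16);
  }
  return (int)n;
}
```

Each parsed entry’s offset tells you where in the file that architecture’s own Mach-O header begins, which is the value passed to xxd with -s in the last command above.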

The load commands

Immediately following the Mach-O header are the load commands providing instructions on how an executable should be loaded into memory, as well as other miscellaneous details. This is where it gets interesting. Each load command consists of a series of structs, each varying in struct size and arguments.

struct load_command {
  uint32_t cmd;     /* type of load command */
  uint32_t cmdsize; /* total size of command in bytes */
};

The cmd field identifies the type of load command. A few of the possible values:

#define LC_SEGMENT_64 0x19  /* 64-bit segment of this file to be mapped */
#define LC_ROUTINES_64  0x1a  /* 64-bit image routines */
#define LC_UUID   0x1b  /* the uuid */

Each type of load command has its own struct, which begins with the same cmd and cmdsize fields. Here’s the one for LC_UUID:

/*
 * The uuid load command contains a single 128-bit unique random number that
 * identifies an object produced by the static link editor.
 */
struct uuid_command {
    uint32_t  cmd;      /* LC_UUID */
    uint32_t  cmdsize;  /* sizeof(struct uuid_command) */
    uint8_t   uuid[16]; /* the 128-bit uuid */
};

You can dump this load command with otool:

otool -l $(which grep) | grep LC_UUID -A2

     cmd LC_UUID
 cmdsize 24
    uuid 3B067B3F-4F1F-39A3-A4B9-CFDD595F9289

The cmdsize of 24 matches the struct: two 4-byte fields followed by the 16-byte UUID.
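
As a small illustration of how the 16 raw bytes relate to the string otool prints, here is a hedged C sketch. The struct uuid_command_mirror and the function format_uuid are invented for this example; the 8-4-4-4-12 grouping and uppercase hex are chosen to match the otool output above.

```c
#include <stddef.h>
#include <stdint.h>

/* Local mirror of struct uuid_command from the excerpt above. */
struct uuid_command_mirror {
  uint32_t cmd;      /* LC_UUID */
  uint32_t cmdsize;  /* sizeof(struct uuid_command) */
  uint8_t  uuid[16]; /* the 128-bit uuid */
};

/* 4 + 4 + 16 bytes, matching the cmdsize of 24 reported by otool. */
_Static_assert(sizeof(struct uuid_command_mirror) == 24,
               "uuid_command should be 24 bytes");

/* Hypothetical helper: formats 16 raw bytes as an uppercase
 * 8-4-4-4-12 string. out needs 36 characters plus the NUL. */
void format_uuid(const uint8_t uuid[16], char out[37]) {
  static const char hex[] = "0123456789ABCDEF";
  size_t pos = 0;
  for (size_t i = 0; i < 16; i++) {
    /* Dashes go before bytes 4, 6, 8 and 10. */
    if (i == 4 || i == 6 || i == 8 || i == 10) out[pos++] = '-';
    out[pos++] = hex[uuid[i] >> 4];
    out[pos++] = hex[uuid[i] & 0x0f];
  }
  out[pos] = '\0';
}
```

The UUID bytes are stored in the order they are printed, so no byte swapping is involved here, unlike the 32-bit fields of the header.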

Segments

The LC_UUID is a simple load command since it’s self-contained and doesn’t provide offsets into the executable’s segments/sections. It’s now time to turn your attention to segments.

lldb -n SpringBoard

(lldb) image dump sections SpringBoard
Sections for '/Applications/Xcode-beta.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/Library/CoreSimulator/Profiles/Runtimes/iOS.simruntime/Contents/Resources/RuntimeRoot/System/Library/CoreServices/SpringBoard.app/SpringBoard' (x86_64):
  SectID     Type             Load Address                             Perm File Off.  File Size  Flags      Section Name
  ---------- ---------------- ---------------------------------------  ---- ---------- ---------- ---------- ----------------------------
  0x00000100 container        [0x0000000000000000-0x0000000100000000)* ---  0x00000000 0x00000000 0x00000000 SpringBoard.__PAGEZERO
  0x00000200 container        [0x000000010c0da000-0x000000010c8b4000)  r-x  0x00000000 0x007da000 0x00000000 SpringBoard.__TEXT
  0x00000001 code             [0x000000010c0de69c-0x000000010c70d663)  r-x  0x0000469c 0x0062efc7 0x80000400 SpringBoard.__TEXT.__text
... etc ...

(lldb) image dump objfile SpringBoard

Programmatically finding segments and sections

For the demo part of this chapter, you’ll build a macOS executable that iterates through the loaded modules and prints all the segments and sections found in each module.

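import Foundation
import MachO // exposes the dyld APIs and the Mach-O structs

// Iterate over every image (module) loaded into this process
for i in 0..<_dyld_image_count() {
  // Full path of the image on disk
  let imagePath =
    String(validatingUTF8: _dyld_get_image_name(i))!
  let imageName = (imagePath as NSString).lastPathComponent
  // Load address of the image, which is where its Mach-O header lives
  let header = _dyld_get_image_header(i)!
  print("\(i) \(imageName) \(header)")
}

CFRunLoopRun() // keep the process alive so it can be inspected

Each line of output shows the image index, name and load address, for example:

8 CoreFoundation 0x00007fff33cf6000

That address is the image’s Mach-O header in memory, which you can confirm in LLDB:

(lldb) x/8wx 0x00007fff33cf6000
0x7fff33cf6000: 0xfeedfacf 0x01000007 0x00000008 0x00000006
0x7fff33cf6010: 0x00000013 0x00001100 0xc2100085 0x00000000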
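
To walk the load commands of each image, add the following inside the for loop, right after the print call:

// The load commands begin immediately after the Mach-O header
var curLoadCommandIterator = Int(bitPattern: header) + 
  MemoryLayout<mach_header_64>.size
for _ in 0..<header.pointee.ncmds {
  // Read the generic load_command to learn its type and size
  let loadCommand = 
    UnsafePointer<load_command>(
      bitPattern: curLoadCommandIterator)!.pointee

  if loadCommand.cmd == LC_SEGMENT_64 {
    // Reinterpret the same address as a segment_command_64
    let segmentCommand = 
      UnsafePointer<segment_command_64>(
        bitPattern: curLoadCommandIterator)!.pointee

    print("\t\(segmentCommand.segname)")
  }

  // Advance to the next load command
  curLoadCommandIterator = 
    curLoadCommandIterator + Int(loadCommand.cmdsize)
}

The output now includes each segment’s name, although only as raw bytes:

0 MachOPOC 0x0000000100000000
  (95, 95, 80, 65, 71, 69, 90, 69, 82, 79, 0, 0, 0, 0, 0, 0)
  (95, 95, 84, 69, 88, 84, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
  (95, 95, 68, 65, 84, 65, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
  (95, 95, 76, 73, 78, 75, 69, 68, 73, 84, 0, 0, 0, 0, 0, 0)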
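
The Swift loop above can trust ncmds and cmdsize because dyld has already loaded the image. When reading load commands from an arbitrary file instead, a bounds check on every step seems prudent. The following C sketch shows one way to do that; count_load_commands and read_le32 are invented names, and the buffer is assumed to hold little-endian load commands starting right after the header.

```c
#include <stddef.h>
#include <stdint.h>

/* Reads a little-endian 32-bit value regardless of host byte order. */
static uint32_t read_le32(const uint8_t *p) {
  return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
         ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: walks ncmds load commands stored in
 * cmds[0..sizeofcmds) and counts those whose cmd equals wanted.
 * Returns -1 if a command would run past the end of the buffer or
 * has a cmdsize smaller than the 8-byte load_command prefix. */
int count_load_commands(const uint8_t *cmds, size_t sizeofcmds,
                        uint32_t ncmds, uint32_t wanted) {
  size_t offset = 0; /* invariant: offset <= sizeofcmds */
  int count = 0;
  for (uint32_t i = 0; i < ncmds; i++) {
    if (sizeofcmds - offset < 8) return -1;
    const uint32_t cmd = read_le32(cmds + offset);
    const uint32_t cmdsize = read_le32(cmds + offset + 4);
    if (cmdsize < 8 || cmdsize > sizeofcmds - offset) return -1;
    if (cmd == wanted) count++;
    offset += cmdsize; /* cmdsize covers the whole command */
  }
  return count;
}
```

Rejecting a cmdsize below 8 also guards against a zero-sized command, which would otherwise keep the loop reading the same bytes over and over.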
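
The segname field is imported into Swift as a tuple of 16 Int8 values, which is why the raw bytes are printed. A helper function can convert such a tuple into a String:

func convertIntTupleToString(name : Any) -> String {
  var returnString = ""
  // Mirror lets you iterate over the elements of the tuple
  let mirror = Mirror(reflecting: name)
  for child in mirror.children {
    // Stop at the first NUL byte, or at anything that isn't an Int8
    guard let val = child.value as? Int8,
      val != 0 else {
        break
    }
    returnString.append(Character(UnicodeScalar(UInt8(val))))
  }

  return returnString
}

Then replace the print call for the segment name with:

let segName = convertIntTupleToString(
  name: segmentCommand.segname)
print("\t\(segName)")

The segment names are now readable:
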
0 MachOPOC 0x0000000100000000
  __PAGEZERO
  __TEXT
  __DATA
  __LINKEDIT
1 libBacktraceRecording.dylib 0x0000000100ac7000
  __TEXT
  __DATA
  __LINKEDIT
2 libMainThreadChecker.dylib 0x0000000100ad7000
  __TEXT
  __DATA
  __LINKEDIT
  ...
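
For comparison, the same conversion in C is a bounded copy. The point worth illustrating is that the 16-byte name fields are only NUL-padded, not NUL-terminated: a name that uses all 16 characters, such as __objc_classlist, leaves no room for a terminator. The function name name16_to_cstring is made up for this sketch.

```c
#include <stddef.h>

/* Hypothetical helper: copies a fixed 16-byte segname or sectname
 * field into out as a NUL-terminated C string. out must have room for
 * 17 bytes, since the field itself may contain no NUL at all. */
void name16_to_cstring(const char name[16], char out[17]) {
  size_t i = 0;
  /* Stop at the first NUL or at the end of the field,
   * whichever comes first. */
  while (i < 16 && name[i] != '\0') {
    out[i] = name[i];
    i++;
  }
  out[i] = '\0';
}
```

Calling a function like strlen or printf with %s directly on the field would read past its end for a full-length name, which is why the copy is capped at 16 bytes.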
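
To list the sections of each segment, add this loop after the print call inside the LC_SEGMENT_64 check:

// nsects is the number of section_64 structs in this segment
for j in 0..<segmentCommand.nsects {
  // The section structs start right after the segment_command_64
  let sectionOffset = curLoadCommandIterator +
    MemoryLayout<segment_command_64>.size
  // Each section_64 has a fixed size, so index j is a simple multiple
  let offset = MemoryLayout<section_64>.size * Int(j)
  let sectionCommand = 
    UnsafePointer<section_64>(
      bitPattern: sectionOffset + offset)!.pointee

  // sectname is a 16-element tuple, just like segname
  let sectionName = 
    convertIntTupleToString(name: sectionCommand.sectname)
  print("\t\t\(sectionName)") 
}

Each segment is now followed by its sections:
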
0 MachOPOC 0x0000000100000000
  __PAGEZERO
  __TEXT
    __text
    __stubs
    __stub_helper
    __cstring
    __objc_methname
    __const
    __swift4_types
    __swift4_typeref
    __swift4_reflstr
    __swift4_fieldmd
    __swift4_capture
    __swift4_assocty
    __swift4_proto
    __swift4_builtin
    __objc_classname
    __objc_methtype
    __swift4_protos
    __ustring
    __gcc_except_tab
    __unwind_info
    __eh_frame
  __DATA
    __nl_symbol_ptr
    __got
    __la_symbol_ptr
    __mod_init_func
    __const
    __cfstring
    __objc_classlist
    __objc_nlclslist
    __objc_catlist
    __objc_protolist
    __objc_imageinfo
    __objc_const
    __objc_selrefs
    __objc_protorefs
    __objc_classrefs
    __objc_superrefs
    __objc_ivar
    __objc_data
    __data
    __crash_info
    __thread_vars
    __thread_bss
    __bss
    __common
  __LINKEDIT
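
The offset arithmetic used to locate each section can be checked with a short C sketch. The two structs below are local mirrors of segment_command_64 and section_64 as declared in mach-o/loader.h, written with fixed-width types on the assumption that the field order and sizes match; the _mirror names and both helper functions are invented for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Local mirror of segment_command_64 (assumed layout). */
struct segment_command_64_mirror {
  uint32_t cmd, cmdsize;
  char     segname[16];
  uint64_t vmaddr, vmsize, fileoff, filesize;
  int32_t  maxprot, initprot;
  uint32_t nsects, flags;
};

/* Local mirror of section_64 (assumed layout). */
struct section_64_mirror {
  char     sectname[16];
  char     segname[16];
  uint64_t addr, size;
  uint32_t offset, align, reloff, nreloc, flags;
  uint32_t reserved1, reserved2, reserved3;
};

_Static_assert(sizeof(struct segment_command_64_mirror) == 72,
               "segment command should be 72 bytes");
_Static_assert(sizeof(struct section_64_mirror) == 80,
               "section header should be 80 bytes");

/* Offset of the index-th section header, given the offset of its
 * parent segment load command. */
size_t section_header_offset(size_t segment_cmd_offset, uint32_t index) {
  return segment_cmd_offset +
         sizeof(struct segment_command_64_mirror) +
         (size_t)index * sizeof(struct section_64_mirror);
}

/* cmdsize a segment command needs in order to hold nsects sections. */
size_t expected_segment_cmdsize(uint32_t nsects) {
  return sizeof(struct segment_command_64_mirror) +
         (size_t)nsects * sizeof(struct section_64_mirror);
}
```

Under these assumptions, the __TEXT segment in the output above, with its 21 sections, would occupy 72 + 21 × 80 = 1752 bytes of load command space, while a segment without sections such as __PAGEZERO needs only the 72 bytes of the segment command itself.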

Where to go from here?

If I haven’t hinted at it enough already, go check out mach-o/loader.h. I’ve read that header many times myself, and each time I read it I still learn something new. There’s a lot there, so don’t get frustrated if this chapter knocked you back into your chair.

© 2021 Razeware LLC
