Advanced Apple Debugging & Reverse Engineering

Third Edition · iOS 12 · Swift 4.2 · Xcode 10

12. Assembly & Memory
Written by Derek Selander


In the previous chapter, you began the journey and learned the dark arts of the x64 calling convention. You now know how parameters are passed to a function when it’s called, and how return values come back. What you haven’t learned yet is how code executes once it’s loaded into memory.

In this chapter, you’ll explore how a program executes. You’ll look at a special register used to tell the processor where it should read the next instruction from, as well as how different sizes and groupings of memory can produce very different results.

Setting up the Intel-Flavored Assembly Experience™

As mentioned in the previous chapter, there are two main ways to display assembly. One of them, AT&T assembly, is the default flavor in LLDB. This flavor has the following format:

opcode  source  destination

Take a look at a concrete example:

movq  $0x78, %rax

This will move the hexadecimal value 0x78 into the RAX register. Although this assembly flavor is nice for some, you’ll use the Intel flavor instead from here on out.

Why opt for Intel over AT&T? The answer can be best explained by this simple tweet…

Note: In all seriousness, the choice of assembly flavor is somewhat of a flame war. Check out this discussion on Stack Overflow: https://stackoverflow.com/questions/972602/att-vs-intel-syntax-and-limitations.

The choice of Intel is based on the admittedly loose consensus that Intel is better for reading but, at times, worse for writing. Since you’re learning about debugging, you’ll spend most of your time reading assembly rather than writing it.

Add the following lines to the bottom of your ~/.lldbinit file:

settings set target.x86-disassembly-flavor intel
settings set target.skip-prologue false

The first line tells LLDB to display x86 assembly (both 32-bit and 64-bit) in the Intel flavor.

The second line tells LLDB to not skip the function prologue. You came across this earlier in this book, and from now on it’s prudent to not skip the prologue since you’ll be inspecting assembly right from the first instruction in a function.

Note: When editing your ~/.lldbinit file, make sure you don’t use a program like TextEdit for this, as it will add unnecessary characters into the file that could result in LLDB not correctly parsing the file. An easy (although dangerous) way to add this is through a Terminal command like so: echo "settings set target.x86-disassembly-flavor intel" >> ~/.lldbinit.

Make sure you use the ‘>>’ append operator (two angle brackets); a single ‘>’ will overwrite all the previous content in your ~/.lldbinit file. If you’re not comfortable with the Terminal, editors like nano (which you’ve used earlier) are your best bet.

The Intel flavor swaps the source and destination values, removes the ‘%’ and ‘$’ characters and makes many other changes. Since you’re not using the AT&T syntax, there’s little value in cataloguing every difference between the two flavors; you’ll simply learn the Intel format instead.

Take a look at the previous example, now shown in the Intel flavor and see how much cleaner it looks:

mov  rax, 0x78

Again, this will move the hexadecimal value 0x78 into the RAX register.

Compared to the AT&T flavor shown earlier, the destination operand now precedes the source operand. When working with assembly, always identify which flavor you’re reading, since misreading the operand order means misreading what the instruction does.

From here on out, the Intel flavor will be the path forward. If you ever see a numeric hexadecimal constant that begins with a $ character, or a register that begins with %, know that you’re in the wrong assembly flavor and should change it using the process described above.

Creating the cpx command

First, you’ll create your own LLDB command, cpx, which prints a value in hexadecimal using the Objective-C context. You’ll use it later in this chapter.

command alias -H "Print value in ObjC context in hexadecimal" -h "Print in hex" -- cpx expression -f x -l objc -- 

Bits, bytes, and other terminology

Before you begin exploring memory, you need to be aware of some vocabulary about how memory is grouped. A value that can contain either a 1 or a 0 is known as a bit. You can say there are 64 bits per address in a 64-bit architecture. Simple enough.

(lldb) p sizeof('A')
(unsigned long) $0 = 1
(lldb) p/t 'A'
(char) $1 = 0b01000001
(lldb) p/x 'A'
(char) $2 = 0x41
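
The same relationship between a byte, its bits and its hexadecimal form can be checked outside of LLDB. The following is a minimal Swift sketch, not code from the chapter’s sample project; the constant names (a, paddedBinary, hexDigits and so on) are made up for illustration.

```swift
// The ASCII character 'A' as a single byte (decimal 65).
let a = UInt8(ascii: "A")

// A UInt8 occupies one byte, which is eight bits.
let sizeInBytes = MemoryLayout<UInt8>.size
let sizeInBits = UInt8.bitWidth

// Render the same value in binary and in hexadecimal.
let binaryDigits = String(a, radix: 2)
// Left-pad to a full byte so all eight bits are visible.
let paddedBinary = String(repeating: "0", count: sizeInBits - binaryDigits.count) + binaryDigits
let hexDigits = String(a, radix: 16)

print("bytes: \(sizeInBytes), bits: \(sizeInBits)")
print("0b\(paddedBinary)")
print("0x\(hexDigits)")
```

The binary and hexadecimal strings line up with what p/t and p/x report above: one byte, eight bits, two hexadecimal digits.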

The RIP register

Ah, the exact register to put on your gravestone.

@NSApplicationMain
class AppDelegate: NSObject, NSApplicationDelegate {

  func applicationWillBecomeActive(
    _ notification: Notification) {
      print("\(#function)")
      self.aBadMethod()
  }

  func aBadMethod() {
    print("\(#function)")
  }
  
  func aGoodMethod() {
    print("\(#function)")
  }
}

(lldb) cpx $rip
(unsigned long) $1 = 0x0000000100007c20
(lldb) image lookup -vrn ^Registers.*aGoodMethod

(lldb) register write rip 0x0000000100003a10

Registers and breaking up the bits

As mentioned in the previous chapter, x64 has 16 general purpose registers: RDI, RSI, RAX, RDX, RBP, RSP, RCX, RBX, R8, R9, R10, R11, R12, R13, R14 and R15.

(lldb) register write rdx 0x0123456789ABCDEF
(lldb) p/x $rdx 
(lldb) p/x $edx 
0x89abcdef
(lldb) p/x $dx
0xcdef
(lldb) p/x $dl
0xef
(lldb) p/x $dh  
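
The smaller register names are views onto the low bits of the same 64-bit value. As a rough Swift sketch of that idea (it models the bit slicing only and doesn’t touch real registers; the constants are simply named after the registers they imitate):

```swift
// The 64-bit value written to RDX in the session above.
let rdx: UInt64 = 0x0123456789ABCDEF

// Each smaller name keeps only the low bits of the full register.
let edx = UInt32(truncatingIfNeeded: rdx)      // low 32 bits
let dx = UInt16(truncatingIfNeeded: rdx)       // low 16 bits
let dl = UInt8(truncatingIfNeeded: rdx)        // low 8 bits
// DH is the upper half of DX: bits 8 through 15.
let dh = UInt8(truncatingIfNeeded: rdx >> 8)

print("edx = 0x\(String(edx, radix: 16))")
print("dx  = 0x\(String(dx, radix: 16))")
print("dl  = 0x\(String(dl, radix: 16))")
print("dh  = 0x\(String(dh, radix: 16))")
```

truncatingIfNeeded discards the high bits rather than trapping on overflow, which is what makes it a reasonable stand-in for reading a sub-register.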

Registers R8 to R15

Since the R8 to R15 family of registers was created only for 64-bit architectures, these registers use a completely different format for naming their smaller counterparts.

(lldb) register write $r9 0x0123456789abcdef
(lldb) p/x $r9
(lldb) p/x $r9d
(lldb) p/x $r9w
(lldb) p/x $r9l

Breaking down the memory

Now that you’ve taken a look at the instruction pointer, it’s time to further explore the memory behind it.

(lldb) cpx $rip
(lldb) memory read -fi -c1 0x100007c20
->  0x100007c20:  55  push   rbp
(lldb) expression -f i -l objc -- 0x55
(int) $0 = 55  push   rbp

(lldb) p/i 0x55
(lldb) memory read -fi -c4 0x100007c20
(lldb) x/4i 0x100007c20
0x100007c20: 55                      push rbp
0x100007c21: 48 89 e5                mov  rbp, rsp
0x100007c24: 41 55                   push r13
0x100007c26: 48 81 ec a8 00 00 00    sub  rsp, 0xa8
(lldb) p/i 0x4889e5
e5 89  inl    $0x89, %eax

Endianness… this stuff is reversed?

Both x64 and the ARM devices Apple ships use little-endian byte order, which means that data is stored in memory with the least significant byte first. If you were to store the number 0xabcd in memory, the 0xcd byte would be stored first, followed by the 0xab byte.
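
To see the byte order for yourself, here’s a small Swift sketch that lays out 0xabcd in both orders and reads the bytes back in memory order. It’s illustrative only; littleBytes and bigBytes are hypothetical names, and the explicit littleEndian and bigEndian conversions are there so the result doesn’t depend on the machine running it.

```swift
let value: UInt16 = 0xabcd

// Bytes in memory order when the value is stored little-endian.
let littleBytes = withUnsafeBytes(of: value.littleEndian) { Array($0) }

// Bytes in memory order when the same value is stored big-endian.
let bigBytes = withUnsafeBytes(of: value.bigEndian) { Array($0) }

print(littleBytes.map { String($0, radix: 16) })
print(bigBytes.map { String($0, radix: 16) })
```

In the little-endian layout the 0xcd byte comes first, followed by 0xab; the big-endian layout is the reverse.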

(lldb) p/i 0xe58948
(Int) $R1 = 48 89 e5  mov    rbp, rsp
(lldb) memory read -s1 -c20 -fx 0x100003840
0x100003840: 0x55 0x48 0x89 0xe5 0x48 0x83 0xec 0x60
0x100003848: 0xb8 0x01 0x00 0x00 0x00 0x89 0xc1 0x48
0x100003850: 0x89 0x7d 0xf8 0x48
(lldb) memory read -s2 -c10 -fx 0x100003840
0x100003840: 0x4855 0xe589 0x8348 0x60ec 0x01b8 0x0000 0x8900 0x48c1
0x100003850: 0x7d89 0x48f8
(lldb) memory read -s4 -c5 -fx 0x100003840
0x100003840: 0xe5894855 0x60ec8348 0x000001b8 0x48c18900
0x100003850: 0x48f87d89
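
The change in output between the -s1, -s2 and -s4 reads comes purely from how the same bytes are grouped. The sketch below reproduces that grouping in Swift for the first eight bytes of the dump above; the group(_:size:) helper is a hypothetical function written for this illustration, not an LLDB or Swift API.

```swift
// The first eight bytes from the -s1 read above, in memory order.
let bytes: [UInt8] = [0x55, 0x48, 0x89, 0xe5, 0x48, 0x83, 0xec, 0x60]

/// Combines every `size` consecutive bytes into one value,
/// treating the first byte of each group as the least significant.
func group(_ bytes: [UInt8], size: Int) -> [UInt64] {
    precondition(size >= 1 && size <= 8, "size must fit in a UInt64")
    var result: [UInt64] = []
    var index = 0
    while index + size <= bytes.count {
        var value: UInt64 = 0
        for offset in 0..<size {
            // Later bytes in memory land in higher bit positions.
            value |= UInt64(bytes[index + offset]) << (8 * offset)
        }
        result.append(value)
        index += size
    }
    return result
}

let words = group(bytes, size: 2)
let doubleWords = group(bytes, size: 4)

print(words.map { "0x" + String($0, radix: 16) })
print(doubleWords.map { "0x" + String($0, radix: 16) })
```

The two-byte groups begin 0x4855, 0xe589 and the four-byte groups begin 0xe5894855, matching the -s2 and -s4 output: the bytes in memory never change, only the way they’re interpreted.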

Where to go from here?

Good job getting through this one. Memory layout can be a confusing topic. Try exploring memory on other devices to make sure you have a solid understanding of the little-endian architecture and how assembly is grouped together.

© 2024 Kodeco Inc.
