Advanced Apple Debugging & Reverse Engineering

Chapter 13: Assembly & the Stack
Written by Derek Selander

In x86_64, when a function takes more than six parameters, the excess parameters are passed on the stack (there are situations where this isn’t true, but one thing at a time, young grasshopper). But what does being passed on the stack mean, exactly? It’s time to take a deeper dive into what happens when a function is called, from an assembly standpoint, by exploring some stack-related registers as well as the contents of the stack.

Understanding how the stack works is useful when you’re reverse engineering programs, since it helps you deduce which parameters a function is manipulating when no debugging symbols are available.

Let’s begin.

The stack, revisited

As discussed previously in Chapter 6, “Thread, Frame & Stepping Around”, when a program executes, the memory is laid out so the stack starts at a “high address” and grows downward, towards a lower address; that is, towards the heap.

Note: In some architectures, the stack grows upwards. But for x64 and ARM on iOS devices, the two you care about, the stack grows downwards.

Confused? Here’s an image to help clarify how the stack moves.

The stack starts at a high address. How high, exactly, is determined by the operating system’s kernel. The kernel gives stack space to each running program (well, each thread).

The stack is finite in size and grows downwards in the memory address space. As space on the stack is used up, the pointer to the “top” of the stack moves from a higher address toward a lower one.

Once the stack exceeds the finite size given by the kernel, or if it crosses the bounds of the heap, the stack is said to overflow. This fatal error is known as a stack overflow. Now you know where your favorite website gets its name from!
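To see the direction of growth for yourself, here is a small sketch that records the address of a local variable at each level of a nested call. The helper name recordLocalAddress is hypothetical, and the exact addresses depend on the platform and compiler settings, so treat the printed values as illustrative only. On x64 and ARM you would typically expect each deeper call to report a lower address than its caller, though an optimizing compiler is free to rearrange things.

```swift
// Illustrative sketch: `recordLocalAddress` is a hypothetical helper, not from the chapter.
@inline(never)
func recordLocalAddress(depth: Int, into addresses: inout [UInt]) {
    var marker = depth  // a local variable stored in this call's stack frame
    // Capture the address of the local as a plain integer.
    let address = withUnsafePointer(to: &marker) { UInt(bitPattern: $0) }
    addresses.append(address)
    if depth > 0 {
        // Each nested call gets its own, newer stack frame.
        recordLocalAddress(depth: depth - 1, into: &addresses)
    }
}

var localAddresses: [UInt] = []
recordLocalAddress(depth: 3, into: &localAddresses)
for (level, address) in localAddresses.enumerated() {
    print("nesting level \(level): 0x" + String(address, radix: 16))
}
```

Nesting level 0 is the outermost call. Comparing the printed addresses from one level to the next shows which way the stack moved.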

Stack pointer & base pointer registers

Two very important registers you’ve yet to learn about are RSP and RBP. The stack pointer register, RSP, always points to the head of the stack for a particular thread. Because the stack grows downwards, RSP is decremented when items are added to the stack. The base pointer register, RBP, typically marks the base of the current function’s stack frame, so a function can find its local variables and stack-passed parameters at fixed offsets from RBP even while RSP moves.

Stack related opcodes

So far, you’ve learned about the calling convention and how memory is laid out, but you haven’t really explored what the many opcodes actually do in x64 assembly. It’s time to focus on several stack-related opcodes in more detail.

The ‘push’ opcode

When anything such as an int, Objective-C instance, Swift class or a reference needs to be saved onto the stack, the push opcode is used. push decrements the stack pointer (remember, the stack grows downward), then stores the given value at the memory address pointed to by the new RSP value.

push 0x5
RSP = RSP - 0x8 
*RSP = 0x5

The ‘pop’ opcode

The pop opcode is the exact opposite of the push opcode. pop reads the value at the memory address RSP points to and stores it in a destination. Next, RSP is incremented by 0x8 because, again, as the stack shrinks, its head moves to a higher address.

pop rdx
RDX = *RSP
RSP = RSP + 0x8
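The push and pop pseudocode above can be modeled in a few lines of Swift. This is a toy sketch, not how the CPU works internally: ToyStack is a hypothetical type, memory is a plain dictionary from address to 8-byte value, and the starting RSP is borrowed from the register dump later in this chapter purely for illustration.

```swift
// Hypothetical toy model of push/pop; every name here is illustrative.
struct ToyStack {
    var rsp: UInt64
    var memory: [UInt64: UInt64] = [:]  // address -> 8-byte value

    mutating func push(_ value: UInt64) {
        rsp -= 0x8           // RSP = RSP - 0x8
        memory[rsp] = value  // *RSP = value
    }

    mutating func pop() -> UInt64 {
        let value = memory[rsp] ?? 0  // destination = *RSP
        rsp += 0x8                    // RSP = RSP + 0x8
        return value
    }
}

var stack = ToyStack(rsp: 0x7fff5fbfe820)
stack.push(0x5)
print(String(stack.rsp, radix: 16))  // 7fff5fbfe818

let rdx = stack.pop()
print(rdx, String(stack.rsp, radix: 16))  // 5 7fff5fbfe820
```

Notice that pop doesn't erase anything in this model: the old value stays in memory below the head of the stack until a later push overwrites it.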

The ‘call’ opcode

The call opcode is responsible for executing a function. call pushes the address to return to after the called function completes, then jumps to the function.

0x7fffb34de913 <+227>: call   0x7fffb34df410            
0x7fffb34de918 <+232>: mov    edx, eax
RIP = 0x7fffb34de918
RSP = RSP - 0x8
*RSP = RIP
RIP = 0x7fffb34df410

The ‘ret’ opcode

The ret opcode is the opposite of the call opcode, in that it pops the top value off the stack (which will be the return address pushed on by the call opcode, provided the assembly’s pushes and pops match), then sets the RIP register to this address. Thus, execution goes back to where the function was called from.
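Here is a rough sketch of how call and ret cooperate, using the addresses from the call listing above. ToyCPU is a hypothetical type. The starting RSP value is an assumption made for illustration, and the instruction length of 5 bytes is simply the gap between the two addresses in the listing.

```swift
// Hypothetical toy model of call/ret; not a real CPU emulator.
struct ToyCPU {
    var rip: UInt64
    var rsp: UInt64
    var memory: [UInt64: UInt64] = [:]  // address -> 8-byte value

    mutating func call(target: UInt64, instructionLength: UInt64) {
        rip += instructionLength  // RIP already points past the call instruction
        rsp -= 0x8                // RSP = RSP - 0x8
        memory[rsp] = rip         // *RSP = RIP (the return address)
        rip = target              // RIP = address of the called function
    }

    mutating func ret() {
        rip = memory[rsp] ?? 0    // RIP = *RSP
        rsp += 0x8                // RSP = RSP + 0x8
    }
}

// RSP below is an assumed starting value, chosen only for the example.
var cpu = ToyCPU(rip: 0x7fffb34de913, rsp: 0x7fff5fbfe820)

cpu.call(target: 0x7fffb34df410, instructionLength: 5)
let savedReturnAddress = cpu.memory[cpu.rsp] ?? 0
let ripInsideFunction = cpu.rip
print(String(savedReturnAddress, radix: 16))  // 7fffb34de918

cpu.ret()
print(String(cpu.rip, radix: 16))  // 7fffb34de918
```

If a function pushes a value and forgets to pop it before ret, the value at the head of the stack is no longer the return address, which is why the pushes and pops have to match.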

Observing RBP & RSP in action

Now that you have an understanding of the RBP and RSP registers, as well as the four opcodes that manipulate the stack, it’s time to see it all in action.

override func awakeFromNib() {
  super.awakeFromNib()
  StackWalkthrough(5)
}

push  %rbp       ; Push contents of RBP onto the stack (RSP decreases, then *RSP = RBP)

movq  %rsp, %rbp ; RBP = RSP
movq  $0x0, %rdx ; RDX = 0
movq  %rdi, %rdx ; RDX = RDI
push  %rdx       ; Push contents of RDX onto the stack (RSP decreases, then *RSP = RDX)

movq  $0x0, %rdx ; RDX = 0
pop   %rdx       ; Pop top of stack into RDX (RDX = *RSP, RSP increases)

pop   %rbp       ; Pop top of stack into RBP (RBP = *RSP, RSP increases)

ret              ; Return from function (RIP = *RSP, RSP increases)
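If you'd like to check your mental model before stepping through the function in LLDB, the listing above can be traced with a tiny interpreter. This is only a sketch under stated assumptions: Register, Instruction and Machine are hypothetical names, the final ret is left out because the model has no RIP, and the starting register values are simply the ones from the dumpreg output in this chapter.

```swift
// Hypothetical toy interpreter for the StackWalkthrough listing.
enum Register: Hashable { case rsp, rbp, rdi, rdx }

enum Instruction {
    case push(Register)
    case pop(Register)
    case movRegister(from: Register, to: Register)
    case movImmediate(UInt64, to: Register)
}

struct Machine {
    var registers: [Register: UInt64]
    var memory: [UInt64: UInt64] = [:]  // address -> 8-byte value

    mutating func execute(_ instruction: Instruction) {
        switch instruction {
        case .push(let source):
            let newHead = registers[.rsp, default: 0] - 8
            registers[.rsp] = newHead                          // RSP decreases
            memory[newHead] = registers[source, default: 0]    // *RSP = source
        case .pop(let destination):
            let head = registers[.rsp, default: 0]
            registers[destination] = memory[head, default: 0]  // destination = *RSP
            registers[.rsp] = head + 8                         // RSP increases
        case .movRegister(let from, let to):
            registers[to] = registers[from, default: 0]
        case .movImmediate(let value, let to):
            registers[to] = value
        }
    }
}

var machine = Machine(registers: [
    .rsp: 0x7fff5fbfe820, .rbp: 0x7fff5fbfe850,
    .rdi: 0x5, .rdx: 0x0040000000000000,
])

let program: [Instruction] = [
    .push(.rbp),
    .movRegister(from: .rsp, to: .rbp),
    .movImmediate(0x0, to: .rdx),
    .movRegister(from: .rdi, to: .rdx),
    .push(.rdx),
    .movImmediate(0x0, to: .rdx),
    .pop(.rdx),
    .pop(.rbp),
]
for instruction in program { machine.execute(instruction) }

print(String(machine.registers[.rsp] ?? 0, radix: 16))  // 7fff5fbfe820
print(String(machine.registers[.rbp] ?? 0, radix: 16))  // 7fff5fbfe850
print(machine.registers[.rdx] ?? 0)                     // 5
```

In this model, RSP and RBP end up exactly where they started, and RDX holds the 5 that arrived in RDI and made a round trip through the stack.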

(lldb) command alias dumpreg register read rsp rbp rdi rdx
(lldb) dumpreg
rsp = 0x00007fff5fbfe820
rbp = 0x00007fff5fbfe850
rdi = 0x0000000000000005
rdx = 0x0040000000000000

(lldb) si

(lldb) x/gx $rsp 

(lldb) x/gx $rsp 
(lldb) p/x $rbp

(lldb) p (BOOL)($rbp == $rsp)

(lldb) p/x $rsp 
(lldb) x/gx $rsp 

The stack and 7+ parameters

As described in Chapter 11, the calling convention for x86_64 will use the following registers for function parameters in order: RDI, RSI, RDX, RCX, R8, R9. When a function requires more than six parameters, the stack needs to be used.

_ = self.executeLotsOfArguments(one: 1, two: 2, three: 3,
                                four: 4, five: 5, six: 6,
                                seven: 7, eight: 8, nine: 9,
                                ten: 10)

0x1000013e2 <+178>: mov    qword ptr [rsp], 0x7
0x1000013ea <+186>: mov    qword ptr [rsp + 0x8], 0x8
0x1000013f3 <+195>: mov    qword ptr [rsp + 0x10], 0x9
0x1000013fc <+204>: mov    qword ptr [rsp + 0x18], 0xa
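The pattern in the assembly above can be summarized with a small helper. This is a simplified sketch: location(ofArgument:) is a hypothetical function, and it only models integer- or pointer-sized parameters as seen at the call site, just before the call opcode pushes the return address. It ignores floating-point values, large structs and anything else the real calling convention treats differently.

```swift
// Hypothetical helper; only models integer/pointer-sized parameters.
func location(ofArgument index: Int) -> String {
    let registers = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]
    if index < registers.count {
        return registers[index]  // the first six parameters travel in registers
    }
    // Each remaining parameter takes the next 8-byte slot above RSP.
    let offset = (index - registers.count) * 8
    return offset == 0 ? "[rsp]" : "[rsp + 0x" + String(offset, radix: 16) + "]"
}

let labels = ["one", "two", "three", "four", "five",
              "six", "seven", "eight", "nine", "ten"]
for (index, label) in labels.enumerated() {
    print(label, "->", location(ofArgument: index))
}
```

For seven through ten, the helper produces [rsp], [rsp + 0x8], [rsp + 0x10] and [rsp + 0x18], matching the four mov instructions in the listing.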

The stack and debugging info

The stack is not only used when calling functions, but it’s also used as a scratch space for a function’s local variables. Speaking of which, how does the debugger know which addresses to reference when printing out the names of variables that belong to that function?

(lldb) image dump symfile Registers
Swift.String, type_uid = 0x300000222
0x7f9b4633a988:     Block{0x300000222}, ranges = [0x1000035e0-0x100003e7f)
0x7f9b48171a20:       Variable{0x30000023f}, name = "one", type = {d50e000003000000} 0x00007f9b4828d2a0 (Swift.Int), scope = parameter, decl = ViewController.swift:39, location =  DW_OP_fbreg(-32)
mov    qword ptr [rbp - 0x20], rdi
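The link between DW_OP_fbreg(-32) and [rbp - 0x20] is plain arithmetic: -32 in decimal is -0x20 in hexadecimal. The sketch below works it through, assuming the frame base for this function is RBP, as the mov instruction above suggests. variableAddress is a hypothetical helper, and the RBP value is just the one from the earlier register dump, reused for illustration.

```swift
// Hypothetical helper: frame base plus a signed DW_OP_fbreg offset.
func variableAddress(frameBase: UInt64, fbregOffset: Int64) -> UInt64 {
    // Wrapping add of the two's-complement offset handles negative values.
    return frameBase &+ UInt64(bitPattern: fbregOffset)
}

let assumedRBP: UInt64 = 0x7fff5fbfe850  // illustrative value only
let addressOfOne = variableAddress(frameBase: assumedRBP, fbregOffset: -32)
print(String(addressOfOne, radix: 16))  // 7fff5fbfe830
```

The debugger only knows where to look. It has no way to tell whether the function has stored the parameter there yet.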

(lldb) po one
(lldb) si
(lldb) po one

Stack exploration takeaways

Don’t worry, this chapter is almost done. But there are some very important takeaways you should remember from your stack explorations.

Where to go from here?

Now that you’re familiar with the RBP and RSP registers, you’ve got a homework assignment!

(lldb) f 0 
push   rbp
mov    rbp, rsp
(lldb) p uintptr_t $Previous_RBP = *(uintptr_t *)$rsp
(lldb) x/gx '$Previous_RBP + 0x8'
0x7fff5fbfd718: 0x00007fffa83ed11b
(lldb) f 2
frame #2: 0x00007fffa83ed11b AppKit`-[NSWindow _setFrameCommon:display:stashSize:] + 3234
AppKit`-[NSWindow _setFrameCommon:display:stashSize:]:
    0x7fffa83ed11b <+3234>: xor    ebx, ebx
    0x7fffa83ed11d <+3236>: mov    rsi, qword ptr [rip + 0x1c5a9d8c] ; "_bindingAdaptor"
    0x7fffa83ed124 <+3243>: mov    rdi, r12
    0x7fffa83ed127 <+3246>: call   qword ptr [rip + 0x1c319f53] ; (void *)0x00007fffbee77b40: objc_msgSend
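The LLDB session above relies on a convention: when a function begins with push rbp and mov rbp, rsp, the saved RBP of the caller sits at the address in RBP, and the return address sits 0x8 above it. Following that chain by hand can be sketched as below. returnAddresses(startingAt:in:) is a hypothetical helper, and the memory contents are made up for the example, except for the entry at 0x7fff5fbfd718, which is taken from the output above.

```swift
// Hypothetical frame walker over a toy memory; assumes every function
// in the chain set up RBP with the standard prologue.
func returnAddresses(startingAt rbp: UInt64, in memory: [UInt64: UInt64]) -> [UInt64] {
    var addresses: [UInt64] = []
    var frame = rbp
    while let returnAddress = memory[frame + 0x8] {
        addresses.append(returnAddress)
        let previousFrame = memory[frame] ?? 0
        // Callers live at higher addresses; stop on anything else.
        guard previousFrame > frame else { break }
        frame = previousFrame
    }
    return addresses
}

let toyMemory: [UInt64: UInt64] = [
    0x7fff5fbfd6e0: 0x7fff5fbfd710,  // assumed: frame 0's saved RBP
    0x7fff5fbfd6e8: 0x1000011a0,     // assumed: return address into frame 1
    0x7fff5fbfd710: 0x0,             // assumed: end of the toy chain
    0x7fff5fbfd718: 0x7fffa83ed11b,  // from the LLDB output: return into frame 2
]

let chain = returnAddresses(startingAt: 0x7fff5fbfd6e0, in: toyMemory)
for address in chain {
    print("0x" + String(address, radix: 16))
}
```

The walk is only as reliable as the assumption behind it, so a function that never saves RBP would break the chain.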

© 2021 Razeware LLC
