
28
SB Examples, Malloc Logging
Written by Derek Selander

Heads up... You're reading this book for free, with parts of this chapter shown beyond this point as scrambled text.

You can unlock the rest of this book, and our entire catalogue of books and videos, with a raywenderlich.com Professional subscription.

For the final chapter in this section, you’ll go through the same steps I myself took to understand how the MallocStackLogging environment variable is used to get the stack trace when an object is created.

From there, you’ll create a custom LLDB command which gives you the stack trace of when an object was allocated or deallocated in memory — even after the stack trace is long gone from the debugger.

Knowing the stack trace of where an object was created in your program is not only useful for reverse engineering, but also has great use cases in your typical day-to-day debugging. When a process crashes, it’s incredibly helpful to know the history of that memory and any allocation or deallocation events that occurred before your process went off the deep end.

This is another example of a script using stack-related logic, but this chapter will focus on the complete cycle of how to explore, learn, then implement a rather powerful custom command.

Setting up the scripts

You have a couple of scripts to use (and implement!) for this chapter. Let’s go through each one of them and how you’ll use them:

  • msl.py: This command (msl is an abbreviation for MallocStackLogging) is the script you’ll be working on in this chapter. It comes with a basic skeleton of the logic.

  • lookup.py: Wait — you already made this command, right? Yes, but I’ll give you my own version of the lookup command that adds a couple of additional options at the price of uglier code. You’ll use one of the options to filter your searches to specific modules within a process.

  • sbt.py: This command will take a backtrace with unsymbolicated symbols, and symbolicate it. You made this in the previous chapter, and you’ll need it at the very end of this chapter. And in case you didn’t work through the previous chapter, it’s included in this chapter’s resources for you to install.

  • search.py: This command will enumerate all objects in the heap and search for a particular subclass. This is a very convenient command for quickly grabbing references to instances of a particular class.

Note: These scripts come from https://github.com/DerekSelander/lldb. If I need a tool that I don’t have, I’ll build it and stick it in that repo. Check it out for some other novel ideas for LLDB scripts. Keep in mind that many scripts in the repo depend on other files in the repo, so if you download only one script, it might fail to load until the full set of files is present.

Now for the usual setup. Take all the Python files found in the starter directory for this chapter and copy them into your ~/lldb directory. I am assuming you have the lldbinit.py file already set up, found in Chapter 26, “SB Examples, Improved Lookup.”

Launch an LLDB session in Terminal and go through all the help commands to make sure each script has loaded successfully:

(lldb) help msl
(lldb) help lookup
(lldb) help sbt
(lldb) help search

MallocStackLogging explained

In case you’re unfamiliar with the MallocStackLogging environment variable, I’ll describe it and show how it’s typically used.

ShadesOfRay(12911,0x104e663c0) malloc: stack logs being written into /tmp/stack-logs.12911.10d42a000.ShadesOfRay.gjehFY.index

ShadesOfRay(12911,0x104e663c0) malloc: recording malloc and VM allocation stacks to disk using standard recorder

ShadesOfRay(12911,0x104e663c0) malloc: process 12673 no longer exists, stack logs deleted from /tmp/stack-logs.12673.11b51d000.ShadesOfRay.GVo3li.index

Plan of attack

You know it’s possible to grab a stack trace for an instantiated object, but you’re going to do one better than Apple.

Hunting in getenv

MallocStackLogging is an environment variable passed into the process. This means the C getenv function is likely used to check whether the variable is set and to perform additional logic if it is.
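
This is the usual pattern for environment-driven features: read the variable during initialization, then branch on it. As a rough Python analogy (this is not the logic inside libsystem_malloc, and the function name is made up for illustration):

```python
import os


def stack_logging_requested(environ=None):
    """Return True if MallocStackLogging is set in the given environment."""
    # Hypothetical helper: it only checks for presence and does not
    # interpret the value.
    if environ is None:
        environ = os.environ
    return 'MallocStackLogging' in environ


print(stack_logging_requested({'MallocStackLogging': '1'}))
print(stack_logging_requested({}))
```

A check like this has to happen early, which is why a breakpoint on getenv is a reasonable place to start hunting.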

  * frame #0: 0x0000000112b4da26 libsystem_c.dylib`getenv
    frame #1: 0x0000000112c7dd53 libsystem_malloc.dylib`_malloc_initialize + 466
    frame #2: 0x0000000112ddcac1 libsystem_platform.dylib`_os_once + 36
    frame #3: 0x0000000112c7d849 libsystem_malloc.dylib`default_zone_malloc + 77
    frame #4: 0x0000000112c7d259 libsystem_malloc.dylib`malloc_zone_malloc + 103
    frame #5: 0x0000000112c7f44a libsystem_malloc.dylib`malloc + 24
    frame #6: 0x0000000112aa2947 libdyld.dylib`tlv_load_notification + 286
    frame #7: 0x000000010e0f68a9 dyld_sim`dyld::registerAddCallback(void (*)(mach_header const*, long)) + 134
    frame #8: 0x0000000112aa1a0d libdyld.dylib`_dyld_register_func_for_add_image + 61
    frame #9: 0x0000000112aa1be7 libdyld.dylib`_dyld_initializer + 47
frame #1: 0x0000000112c7dd53 libsystem_malloc.dylib`_malloc_initialize + 466
(lldb) lookup . -m libsystem_malloc.dylib
(lldb) lookup (?i)log -m libsystem_malloc.dylib
create_log_file

open_log_file_from_directory

__mach_stack_logging_get_frames

turn_off_stack_logging

turn_on_stack_logging

Googling JIT function candidates

Google for any code pertaining to turn_on_stack_logging. Take a look at this search query:

typedef enum {
  stack_logging_mode_none = 0,
  stack_logging_mode_all,
  stack_logging_mode_malloc,
  stack_logging_mode_vm,
  stack_logging_mode_lite
} stack_logging_mode_type;

extern boolean_t turn_on_stack_logging(stack_logging_mode_type mode);
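
Since the header gives only the first enumerator an explicit value, the rest count up from it. The sketch below mirrors the enum in Python so the integer you pass to turn_on_stack_logging from LLDB is easier to read. The StackLoggingMode class and the turn_on_expression helper are illustrative names, not part of any Apple or LLDB API.

```python
from enum import IntEnum


class StackLoggingMode(IntEnum):
    """Python mirror of stack_logging_mode_type from the header above."""
    NONE = 0
    ALL = 1
    MALLOC = 2
    VM = 3
    LITE = 4


def turn_on_expression(mode):
    """Build the expression text to evaluate in LLDB for a given mode."""
    return 'turn_on_stack_logging({})'.format(int(mode))


print(turn_on_expression(StackLoggingMode.ALL))
```

So a call such as turn_on_stack_logging(1) asks for stack_logging_mode_all.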

Exploring __mach_stack_logging_get_frames

Fortunately, for your exploration efforts, __mach_stack_logging_get_frames can also be found in the same header file. This function signature looks like the following:

extern kern_return_t __mach_stack_logging_get_frames(
                                        task_t task,   
                          mach_vm_address_t address, 
             mach_vm_address_t *stack_frames_buffer,
                          uint32_t max_stack_frames,
                                   uint32_t *count);
    /* Gets the last allocation record (malloc, realloc, or free) about address */
task_t task = mach_task_self();
/* Omitted code.... */
    stack_entry->address = addr;
    stack_entry->type_flags = stack_logging_type_alloc;
    stack_entry->argument = 0;
    stack_entry->num_frames = 0;
    stack_entry->frames[0] = 0;

    err = __mach_stack_logging_get_frames(task, 
                       (mach_vm_address_t)addr,
                           stack_entry->frames, 
                                    MAX_FRAMES,
                      &stack_entry->num_frames);

    if (err == 0 && stack_entry->num_frames > 0) {
      // Terminate the frames with zero if there is room
      if (stack_entry->num_frames < MAX_FRAMES)
        stack_entry->frames[stack_entry->num_frames] = 0;
    } else {
      g_malloc_stack_history.clear();
    }
  }
}

Testing the functions

To prevent you from getting bored to tears, I’ve already implemented the logic for calling __mach_stack_logging_get_frames inside the app.

void trace_address(mach_vm_address_t addr) {

  typedef struct LLDBStackAddress {
    mach_vm_address_t *addresses;
    uint32_t count = 0;
  } LLDBStackAddress;   // 1 
  
  LLDBStackAddress stackaddress; // 2
  __unused mach_vm_address_t address = (mach_vm_address_t)addr;
  __unused task_t task = mach_task_self_;  // 3

  stackaddress.addresses = (mach_vm_address_t *)calloc(100,
                                sizeof(mach_vm_address_t)); // 4

  __mach_stack_logging_get_frames(task, 
                               address, 
                stackaddress.addresses, 
                                   100, 
                  &stackaddress.count); // 5
  
  // 6
  for (int i = 0; i < stackaddress.count; i++) {
    
    printf("[%d] %llu\n", i, stackaddress.addresses[i]);
  }
  
  free(stackaddress.addresses); // 7
}

LLDB testing

Make sure the app is running, then tap the Generate a Ray! button. Pause execution and enter the following into LLDB:

(lldb) search RayView -b
RayView * [0x00007fa838414330]
RayView * [0x00007fa8384125f0]
RayView * [0x00007fa83860c000]
(lldb) po trace_address(0x00007fa838414330)
[0] 4533269637
[1] 4460190625
[2] 4460232164
[3] 4454012240
[4] 4478307618
[5] 4482741703
[6] 4478307618
[7] 4479898204
[8] 4479898999
[9] 4479899371
...
(lldb) image lookup -a 4533269637
Address: libsystem_malloc.dylib[0x000000000000f485] (libsystem_malloc.dylib.__TEXT.__text + 56217)
Summary: libsystem_malloc.dylib`calloc + 30
(lldb) script print(lldb.SBAddress(4454012240, lldb.target))
ShadesOfRay`-[ViewController generateRayViewTapped:] + 64 at ViewController.m:38
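
The addresses come out in decimal because trace_address prints them with %llu. If you'd rather compare them against hex addresses elsewhere in LLDB, a small helper can convert the output. This is only a convenience sketch. The parse_trace_output name is made up, and the sample text is copied from the first four frames above.

```python
import re


def parse_trace_output(text):
    """Return frame addresses from '[index] decimal' lines, in order."""
    addresses = []
    for line in text.splitlines():
        match = re.match(r'\[(\d+)\]\s+(\d+)$', line.strip())
        if match:
            addresses.append(int(match.group(2)))
    return addresses


sample = """[0] 4533269637
[1] 4460190625
[2] 4460232164
[3] 4454012240"""

for index, address in enumerate(parse_trace_output(sample)):
    # {:#x} renders the integer as lowercase hex with a 0x prefix.
    print('frame #{}: {:#x}'.format(index, address))
```

The first frame, 4533269637, becomes 0x10e343485, whose low bits line up with the 0xf485 offset that image lookup reported for calloc + 30.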

Navigating a C array with lldb.value

You’ll again use the lldb.value class to parse the C struct that was generated inline while executing this function.

(lldb) e -lobjc++ -O -i0 -- trace_address(0x00007fa838414330)
(lldb) script print(lldb.frame.FindVariable('stackaddress'))
(LLDBStackAddress) stackaddress = {
  addresses = 0x00007fa838515cd0
  count = 25
}
(lldb) script a = lldb.value(lldb.frame.FindVariable('stackaddress'))
(lldb) script print(a)
(lldb) script print(a.count)
(uint32_t) count = 25
(lldb) script print(a.addresses[0])
(lldb) script print(a.addresses[3])
(mach_vm_address_t) [3] = 4454012240

Turning numbers into stack frames

Included within the starter directory for this chapter is the msl.py script for malloc stack logging. You already installed this script in the “Setting up the scripts” section.

command_args = shlex.split(command)
parser = generateOptionParser()
try:
    (options, args) = parser.parse_args(command_args)
except:
    result.SetError(parser.usage)
    return

cleanCommand = args[0]
process = debugger.GetSelectedTarget().GetProcess()
frame = process.GetSelectedThread().GetSelectedFrame()
target = debugger.GetSelectedTarget()
# 1
script = generateScript(cleanCommand, options)

# 2
sbval = frame.EvaluateExpression(script, generateOptions())

# 3
if sbval.error.fail: 
    result.AppendMessage(str(sbval.error))
    return

val = lldb.value(sbval)
addresses = []

# 4
for i in range(val.count.sbvalue.unsigned):
    address = val.addresses[i].sbvalue.unsigned
    sbaddr = target.ResolveLoadAddress(address)
    loadAddr = sbaddr.GetLoadAddress(target)
    addresses.append(loadAddr)

# 5
retString = processStackTraceStringFromAddresses(
                                        addresses, 
                                           target)

# 6
freeExpr = 'free('+str(val.addresses.sbvalue.unsigned)+')'
frame.EvaluateExpression(freeExpr, generateOptions())
result.AppendMessage(retString)
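
The generateScript helper called in step 1 isn't shown in this excerpt. One plausible shape for it, assuming it simply wraps the body of trace_address in expression text and ends with the struct so LLDB returns it, is sketched below. The generate_script_sketch name and the max_frames parameter are hypothetical, and whether LLDB accepts the text unmodified depends on the declarations visible in the process.

```python
def generate_script_sketch(address, max_frames=100):
    """Return expression text that fills an LLDBStackAddress for `address`.

    Hypothetical stand-in for generateScript; the real helper may differ.
    """
    script = 'mach_vm_address_t addr = (mach_vm_address_t){};\n'.format(
        address)
    # The C++ body mirrors trace_address, minus the printing loop.
    script += r'''
typedef struct LLDBStackAddress {
  mach_vm_address_t *addresses;
  uint32_t count = 0;
} LLDBStackAddress;

LLDBStackAddress stackaddress;
stackaddress.addresses = (mach_vm_address_t *)calloc(%d, sizeof(mach_vm_address_t));
__mach_stack_logging_get_frames(mach_task_self_, addr, stackaddress.addresses, %d, &stackaddress.count);
stackaddress
''' % (max_frames, max_frames)
    return script


print(generate_script_sketch('0x00007fa838414330'))
```

Because the calloc happens inside the debugged process, the Python side has to free the buffer afterwards, which is what step 6 does.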
(lldb) reload_script
(lldb) search RayView -b
(lldb) search UIView -m ShadesOfRay -b
(lldb) msl 0x00007fa838414330
frame #0 : 0x11197d485 libsystem_malloc.dylib`calloc + 30
frame #1 : 0x10d3cbba1 libobjc.A.dylib`class_createInstance + 85
frame #2 : 0x10d3d5de4 libobjc.A.dylib`_objc_rootAlloc + 42
frame #3 : 0x10cde7550 ShadesOfRay`-[ViewController generateRayViewTapped:] + 64
frame #4 : 0x10e512d22 UIKit`-[UIApplication sendAction:to:from:forEvent:] + 83
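
The lines above are produced by processStackTraceStringFromAddresses, whose body isn't part of this excerpt. As a minimal sketch of how such a formatter could work, assuming each address resolves to a symbol, the function below uses SBTarget.ResolveLoadAddress together with the symbol, module, file and addr properties of the SB classes. The format_stack_trace name is made up for illustration.

```python
def format_stack_trace(addresses, target):
    """Render load addresses as 'frame #N : 0x... module`symbol + offset'."""
    lines = []
    for index, load_addr in enumerate(addresses):
        sbaddr = target.ResolveLoadAddress(load_addr)
        symbol = sbaddr.symbol
        module_name = sbaddr.module.file.basename
        # Distance from the start of the symbol to the frame's address.
        offset = load_addr - symbol.addr.GetLoadAddress(target)
        lines.append('frame #{} : {:#x} {}`{} + {}'.format(
            index, load_addr, module_name, symbol.name, offset))
    return '\n'.join(lines)
```

This sketch does no error handling. An address that falls outside every loaded module would need a fallback, which is where a resymbolication step such as sbt comes in for stripped binaries.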

Stack trace from a Swift object

OK — I know you want me to talk about Swift code. You’ll cover a Swift example as well.

public final class SomeSwiftCode {
  private init() {}
  static let shared = SomeSwiftCode()
}
(lldb) e -lswift -O -- import SomeSwiftModule
(lldb) e -lswift -O -- SomeSwiftCode.shared
<SomeSwiftCode: 0x600000033640>
(lldb) search SwiftObject
<__NSArrayM 0x6000004578b0>(
SomeSwiftModule.SomeSwiftCode
)
(lldb) search SwiftObject -b
_TtC15SomeSwiftModule13SomeSwiftCode * [0x0000600000033640]
(lldb) msl 0x0000600000033640
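
The class name that search -b printed, _TtC15SomeSwiftModule13SomeSwiftCode, is the name the Objective-C runtime sees for the Swift class: a _TtC prefix followed by the module and class names, each preceded by its length. Here's a small sketch that splits such a name apart. It handles only this simple two-part shape, and parse_swift_class_name is an illustrative helper rather than an LLDB API.

```python
import re


def parse_swift_class_name(mangled):
    """Split a '_TtC<len><module><len><class>' runtime name into its parts.

    Returns (module, class_name), or None if the name has another shape.
    """
    prefix = '_TtC'
    if not mangled.startswith(prefix):
        return None
    rest = mangled[len(prefix):]
    parts = []
    while rest:
        match = re.match(r'\d+', rest)
        if not match:
            return None
        length = int(match.group())
        start = match.end()
        if length == 0 or start + length > len(rest):
            return None
        parts.append(rest[start:start + length])
        rest = rest[start + length:]
    if len(parts) != 2:
        return None
    return parts[0], parts[1]


print(parse_swift_class_name('_TtC15SomeSwiftModule13SomeSwiftCode'))
```

That accounts for why the plain search output shows SomeSwiftModule.SomeSwiftCode while the -b output shows the prefixed form.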

DRY Python code

Stop the app! In the schemes, select the Stripped 50 Shades of Ray Xcode scheme.

(lldb) search RayView -b
RayView * [0x00007fc23eb00620]
(lldb) msl 0x00007fc23eb00620
(lldb) po turn_on_stack_logging(1)

(lldb) msl 0x00007f8250f0a170

import sbt
retString = sbt.processStackTraceStringFromAddresses(
                                            addresses, 
                                               target)
if options.resymbolicate:
    retString = sbt.processStackTraceStringFromAddresses(
                                                addresses, 
                                                   target)
else:
    retString = processStackTraceStringFromAddresses(
                                        addresses, 
                                           target)
debugger.HandleCommand('command alias enable_logging expression -lobjc -O -- extern void turn_on_stack_logging(int); turn_on_stack_logging(1);')
(lldb) reload_script
(lldb) msl 0x00007f8250f0a170 -r
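
For the -r flag to reach options.resymbolicate, the option parser has to declare it. The parser in msl.py isn't shown in this excerpt, so the following is only a sketch of how it could be declared with optparse. The function name, usage string and help text are assumptions.

```python
import optparse
import shlex


def generate_option_parser_sketch():
    """Build a parser that accepts an address plus an optional -r flag."""
    usage = 'usage: %prog [options] <address>'
    parser = optparse.OptionParser(usage=usage, prog='msl')
    parser.add_option('-r', '--resymbolicate',
                      action='store_true',
                      default=False,
                      dest='resymbolicate',
                      help='Resymbolicate stripped frames using sbt')
    return parser


# Parse the same arguments used in the msl command above.
options, args = generate_option_parser_sketch().parse_args(
    shlex.split('0x00007f8250f0a170 -r'))
print(options.resymbolicate, args)
```

optparse allows options to follow positional arguments by default, so the address still ends up in args[0] even though -r comes after it.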

Where to go from here?

Hopefully, this full circle of idea, research & implementation has proven useful and even inspired you to create your own scripts. There’s a lot of power hidden quietly away in the many frameworks that already exist on your [i|mac|tv|watch]OS device.


© 2021 Razeware LLC
