Unsafe Swift: Using Pointers and Interacting With C

In this tutorial, you’ll learn how to use unsafe Swift to directly access memory through a variety of pointer types. By Brody Eller.


Don’t Return the Pointer From withUnsafeBytes!

// Rule #1
do {
  print("1. Don't return the pointer from withUnsafeBytes!")
  
  var sampleStruct = SampleStruct(number: 25, flag: true)
  
  let bytes = withUnsafeBytes(of: &sampleStruct) { bytes in
    return bytes // strange bugs here we come ☠️☠️☠️
  }
  
  print("Horse is out of the barn!", bytes) // undefined!!!
}

You should never let the pointer escape the withUnsafeBytes(of:) closure. The pointer is only guaranteed to be valid while the closure runs. Even if your code works today, it may cause strange bugs in the future.
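If you need the bytes after the closure ends, one option is to copy them into a regular Swift value inside the closure and return the copy. The sketch below assumes SampleStruct has the layout shown here (it's defined earlier in the tutorial, so treat this definition as a stand-in), and copyBytes(of:) is a hypothetical helper name:

```swift
// Assumed stand-in for the SampleStruct defined earlier in the tutorial.
struct SampleStruct {
  let number: UInt32
  let flag: Bool
}

// Hypothetical helper: returns a copy of the value's bytes.
// Only the copied array leaves the closure, never the pointer.
func copyBytes(of value: SampleStruct) -> [UInt8] {
  var value = value
  return withUnsafeBytes(of: &value) { rawBuffer in
    Array(rawBuffer) // copies the bytes into safe, owned storage
  }
}

let copiedBytes = copyBytes(of: SampleStruct(number: 25, flag: true))
print(copiedBytes.count)
```

The array owns its storage, so it stays valid after the closure returns.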

Only Bind to One Type at a Time!

// Rule #2
do {
  print("2. Only bind to one type at a time!")
  
  let count = 3
  let stride = MemoryLayout<Int16>.stride
  let alignment = MemoryLayout<Int16>.alignment
  let byteCount = count * stride
  
  let pointer = UnsafeMutableRawPointer.allocate(
    byteCount: byteCount,
    alignment: alignment)
  defer { pointer.deallocate() } // free the allocation when the scope ends
  
  let typedPointer1 = pointer.bindMemory(to: UInt16.self, capacity: count)
  
  // Breakin' the Law... Breakin' the Law (Undefined behavior)
  let typedPointer2 = pointer.bindMemory(to: Bool.self, capacity: count * 2)
  
  // If you must, do it this way:
  typedPointer1.withMemoryRebound(to: Bool.self, capacity: count * 2) {
    (boolPointer: UnsafeMutablePointer<Bool>) in
    print(boolPointer.pointee) // See Rule #1, don't return the pointer
  }
}

Never bind memory to two unrelated types at once. This is called type punning, and Swift does not like puns. :]

Instead, temporarily rebind memory with a method like withMemoryRebound(to:capacity:).

Also, it is illegal to rebind from a trivial type, such as an Int, to a non-trivial type, such as a class. Don’t do it.
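As a rough illustration of a temporary rebind, the sketch below views UInt16 memory as Int16, two trivial types with the same size and stride. The helper name reinterpretAsSigned(_:) is hypothetical and not part of the tutorial's project. It returns copied values, not the pointer, in keeping with Rule #1:

```swift
// Hypothetical helper: stores UInt16 values in raw memory, then reads
// the same bits back as Int16 through a temporary rebind.
func reinterpretAsSigned(_ values: [UInt16]) -> [Int16] {
  let count = values.count
  guard count > 0 else { return [] }

  let rawPointer = UnsafeMutableRawPointer.allocate(
    byteCount: count * MemoryLayout<UInt16>.stride,
    alignment: MemoryLayout<UInt16>.alignment)
  defer { rawPointer.deallocate() }

  // Bind the memory to exactly one type and initialize it.
  let unsignedPointer = rawPointer.bindMemory(to: UInt16.self, capacity: count)
  unsignedPointer.initialize(from: values, count: count)
  defer { unsignedPointer.deinitialize(count: count) }

  // The rebind only lasts for the duration of the closure.
  return unsignedPointer.withMemoryRebound(to: Int16.self, capacity: count) { signedPointer in
    Array(UnsafeBufferPointer(start: signedPointer, count: count))
  }
}

print(reinterpretAsSigned([1, 2, 65535]))
```

Once the closure returns, the memory is bound to UInt16 again, so only one binding is ever in effect.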

Don’t Walk Off the End… Whoops!

// Rule #3... wait
do {
  print("3. Don't walk off the end... whoops!")
  
  let count = 3
  let stride = MemoryLayout<Int16>.stride
  let alignment = MemoryLayout<Int16>.alignment
  let byteCount = count * stride
  
  let pointer = UnsafeMutableRawPointer.allocate(
    byteCount: byteCount,
    alignment: alignment)
  defer { pointer.deallocate() } // free the allocation when the scope ends
  // OMG +1????
  let bufferPointer = UnsafeRawBufferPointer(start: pointer, count: byteCount + 1)
  
  for byte in bufferPointer {
    print(byte) // pawing through memory like an animal
  }
}

The ever-present problem of off-by-one errors becomes even worse with unsafe code. Be careful, review and test!
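One way to lower the risk is to let the buffer carry its own count instead of passing a separately computed length. The sketch below is only an illustration and uses a hypothetical helper name, visitAllocatedBytes(elementCount:). It allocates an UnsafeMutableRawBufferPointer, so the loop bounds come from the allocation itself:

```swift
// Hypothetical helper: allocates room for `elementCount` Int16 values,
// fills every byte with 1, then walks the buffer using its own bounds.
func visitAllocatedBytes(elementCount: Int) -> (visited: Int, sum: Int) {
  let byteCount = elementCount * MemoryLayout<Int16>.stride
  let buffer = UnsafeMutableRawBufferPointer.allocate(
    byteCount: byteCount,
    alignment: MemoryLayout<Int16>.alignment)
  defer { buffer.deallocate() }

  // Initialize the memory before reading it.
  for index in buffer.indices {
    buffer[index] = 1
  }

  var visited = 0
  var sum = 0
  for byte in buffer { // iterates exactly buffer.count bytes
    visited += 1
    sum += Int(byte)
  }
  return (visited, sum)
}

let result = visitAllocatedBytes(elementCount: 3)
print(result.visited, result.sum)
```

Because the count is never retyped by hand, there's no second number to get wrong.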

Unsafe Swift Example 1: Compression

Time to take all your knowledge and use it to wrap a C API. Apple’s platforms include a C module named Compression that implements some common data compression algorithms. These include:

  • LZ4 for when speed is critical.
  • LZMA for when you need the highest compression ratio and don’t care about speed.
  • ZLIB, which balances space and speed.
  • The new, open-source LZFSE, which does an even better job balancing space and speed.

Now, open the Compression playground in the begin project.

First, you’ll define a pure Swift API using Data by replacing the contents of your playground with the following code:

import Foundation
import Compression

enum CompressionAlgorithm {
  case lz4   // speed is critical
  case lz4a  // space is critical (maps to LZMA)
  case zlib  // reasonable speed and space
  case lzfse // better speed and space
}

enum CompressionOperation {
  case compression, decompression
}

/// Returns compressed or uncompressed data depending on the operation.
func perform(
  _ operation: CompressionOperation,
  on input: Data,
  using algorithm: CompressionAlgorithm,
  workingBufferSize: Int = 2000
) -> Data? {
  return nil
}

The function that does the compression and decompression is perform, which is currently stubbed out to return nil. You’ll add some unsafe code to it shortly.

Next, add the following code to the end of the playground:

/// Compressed keeps the compressed data and the algorithm
/// together as one unit, so you never forget how the data was
/// compressed.
struct Compressed {
  let data: Data
  let algorithm: CompressionAlgorithm
  
  init(data: Data, algorithm: CompressionAlgorithm) {
    self.data = data
    self.algorithm = algorithm
  }
  
  /// Compresses the input with the specified algorithm. Returns nil if it fails.
  static func compress(
    input: Data, with algorithm: CompressionAlgorithm
  ) -> Compressed? {
    guard let data = perform(.compression, on: input, using: algorithm) else {
      return nil
    }
    return Compressed(data: data, algorithm: algorithm)
  }
  
  /// Uncompressed data. Returns nil if the data cannot be decompressed.
  func decompressed() -> Data? {
    return perform(.decompression, on: data, using: algorithm)
  }
}

The Compressed structure stores both the compressed data and the algorithm used to create it. Keeping the two together means you can’t accidentally decompress with the wrong algorithm.

Next, add the following code to the end of the playground:

/// For discoverability, adds a compressed method to Data
extension Data {
  /// Returns compressed data or nil if compression fails.
  func compressed(with algorithm: CompressionAlgorithm) -> Compressed? {
    return Compressed.compress(input: self, with: algorithm)
  }
}

// Example usage:

let input = Data(Array(repeating: UInt8(123), count: 10000))

let compressed = input.compressed(with: .lzfse)
compressed?.data.count // in most cases much less than original input count

let restoredInput = compressed?.decompressed()
input == restoredInput // true

The main entry point is an extension on the Data type. You’ve added a method called compressed(with:) which returns an optional Compressed struct. This method simply calls the static method compress(input:with:) on Compressed.

There’s an example at the end, but it’s currently not working. Time to fix that!

Scroll up to the first block of code you entered and begin the implementation of perform(_:on:using:workingBufferSize:) by inserting the following before return nil:

// set the algorithm
let streamAlgorithm: compression_algorithm
switch algorithm {
case .lz4:   streamAlgorithm = COMPRESSION_LZ4
case .lz4a:  streamAlgorithm = COMPRESSION_LZMA
case .zlib:  streamAlgorithm = COMPRESSION_ZLIB
case .lzfse: streamAlgorithm = COMPRESSION_LZFSE
}
  
// set the stream operation and flags
let streamOperation: compression_stream_operation
let flags: Int32
switch operation {
case .compression:
  streamOperation = COMPRESSION_STREAM_ENCODE
  flags = Int32(COMPRESSION_STREAM_FINALIZE.rawValue)
case .decompression:
  streamOperation = COMPRESSION_STREAM_DECODE
  flags = 0
}

This converts your Swift types to the C types required for the compression algorithm.

Next, replace return nil with:

// 1: create a stream
let streamPointer = UnsafeMutablePointer<compression_stream>.allocate(capacity: 1)
defer {
  streamPointer.deallocate()
}

// 2: initialize the stream
var stream = streamPointer.pointee
var status = compression_stream_init(&stream, streamOperation, streamAlgorithm)
guard status != COMPRESSION_STATUS_ERROR else {
  return nil
}
defer {
  compression_stream_destroy(&stream)
}

// 3: set up a destination buffer
let dstSize = workingBufferSize
let dstPointer = UnsafeMutablePointer<UInt8>.allocate(capacity: dstSize)
defer {
  dstPointer.deallocate()
}

return nil // To be continued

Here’s what’s happening:

  1. Allocate a compression_stream and schedule it for deallocation with the defer block.
  2. Then, using the pointee property, you get the stream and pass it to the compression_stream_init function.
  3. Finally, you create a destination buffer to act as your working buffer.

In step 2, the compiler is doing something special: It’s using the in-out & marker to take your compression_stream and turn it into an UnsafeMutablePointer<compression_stream>. Note that stream is a copy of the pointee, so the rest of the function works on that local copy. Alternatively, you could have passed streamPointer. Then you wouldn’t need this special conversion.
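You can try the in-out & conversion without the Compression framework. In the sketch below, Counter and increment(_:) are hypothetical stand-ins for a C struct and a C function that takes a pointer to it. Neither is part of any real API. The code shows both ways of supplying the pointer argument:

```swift
// Hypothetical stand-in for a C struct such as compression_stream.
struct Counter {
  var value: Int32 = 0
}

// Hypothetical stand-in for a C function declared as `void increment(Counter *)`.
func increment(_ counter: UnsafeMutablePointer<Counter>) {
  counter.pointee.value += 1
}

func demonstratePointerArguments() -> (viaInOut: Int32, viaPointer: Int32) {
  // Option 1: pass a local variable with &. The compiler creates a
  // pointer that is valid only for the duration of each call.
  var local = Counter()
  increment(&local)
  increment(&local)

  // Option 2: allocate the memory yourself and pass the pointer directly.
  let pointer = UnsafeMutablePointer<Counter>.allocate(capacity: 1)
  pointer.initialize(to: Counter())
  defer {
    pointer.deinitialize(count: 1)
    pointer.deallocate()
  }
  increment(pointer)

  return (local.value, pointer.pointee.value)
}

let counts = demonstratePointerArguments()
print(counts.viaInOut, counts.viaPointer)
```

The first option is shorter. The second gives you a pointer that stays valid across calls, at the cost of managing the allocation yourself.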

Next, finish perform by replacing the final return nil with:

// process the input
return input.withUnsafeBytes { srcRawBufferPointer in
  // 1
  var output = Data()
  
  // 2
  let srcBufferPointer = srcRawBufferPointer.bindMemory(to: UInt8.self)
  guard let srcPointer = srcBufferPointer.baseAddress else {
    return nil
  }
  stream.src_ptr = srcPointer
  stream.src_size = input.count
  stream.dst_ptr = dstPointer
  stream.dst_size = dstSize
  
  // 3
  while status == COMPRESSION_STATUS_OK {
    // process the stream
    status = compression_stream_process(&stream, flags)
    
    // collect bytes from the stream and reset
    switch status {
      
    case COMPRESSION_STATUS_OK:
      // 4
      // append only the bytes written so far (dst_size is the space left)
      output.append(dstPointer, count: dstSize - stream.dst_size)
      stream.dst_ptr = dstPointer
      stream.dst_size = dstSize
      
    case COMPRESSION_STATUS_ERROR:
      return nil
      
    case COMPRESSION_STATUS_END:
      // 5
      output.append(dstPointer, count: stream.dst_ptr - dstPointer)
      
    default:
      fatalError()
    }
  }
  return output
}

This is where the work really happens. And here’s what it’s doing:

  1. Create a Data object which will contain the output. This is the compressed or decompressed data, depending on the operation.
  2. Point the stream at the source bytes borrowed from input and at the destination buffer you allocated, along with their sizes.
  3. Here, you keep calling compression_stream_process as long as it returns COMPRESSION_STATUS_OK.
  4. You then copy the destination buffer into output that’s eventually returned from this function.
  5. When the last packet comes in, marked with COMPRESSION_STATUS_END, you potentially only need to copy part of the destination buffer.
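The pointer subtraction in step 5 can be tried in isolation. The sketch below doesn't use the Compression framework. It uses a hypothetical helper, collectPartialBuffer(written:capacity:), that fills part of a working buffer by advancing a cursor pointer, then uses cursor - base to work out how many bytes to copy:

```swift
import Foundation

// Hypothetical helper: writes `written` bytes into a working buffer of
// `capacity` bytes, then copies only the part that was actually filled.
func collectPartialBuffer(written: Int, capacity: Int = 8) -> Data {
  precondition(capacity > 0 && written >= 0 && written <= capacity)

  let base = UnsafeMutablePointer<UInt8>.allocate(capacity: capacity)
  defer { base.deallocate() }

  // Advance a cursor as bytes are produced, the way dst_ptr moves
  // forward while the stream writes into the destination buffer.
  var cursor = base
  for value in 0..<written {
    cursor.initialize(to: UInt8(truncatingIfNeeded: value))
    cursor += 1
  }

  // Subtracting two typed pointers gives the distance in elements.
  var output = Data()
  output.append(base, count: cursor - base)
  return output
}

let partial = collectPartialBuffer(written: 3)
print(partial.count)
```

Copying the full capacity here would read bytes that were never written, which is why the final packet needs the computed count.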

In this example, you can see that the 10,000-element array gets compressed down to 153 bytes. Not too shabby.