
Detect Hand & Body Poses with Vision in iOS

Jun 1 2021 · Video Course (37 mins) · Intermediate

Learn how to detect hand and body landmarks in live video with help from the Vision framework. Explore the kind of data Vision can provide, and use it to evaluate hand and body poses for simple gestures.


  • Swift 5.3, iOS 14, Xcode 12.4


Learn a bit about what the Vision framework is, how it relates to other Apple frameworks, and how we’ll use it in this course.


Get a short tour of the sample app, and prepare a live camera feed with AVFoundation.
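A minimal live-feed setup with AVFoundation might look like the sketch below. The class name `CameraViewController` and the delegate wiring are assumptions for illustration; the sample app's actual structure may differ.

```swift
import AVFoundation
import UIKit

final class CameraViewController: UIViewController,
                                  AVCaptureVideoDataOutputSampleBufferDelegate {
  private let session = AVCaptureSession()
  private let videoQueue = DispatchQueue(label: "camera.video.queue")

  override func viewDidLoad() {
    super.viewDidLoad()
    // Use the front camera so users can watch their own hands on screen.
    guard let device = AVCaptureDevice.default(.builtInWideAngleCamera,
                                               for: .video,
                                               position: .front),
          let input = try? AVCaptureDeviceInput(device: device),
          session.canAddInput(input) else { return }
    session.addInput(input)

    // Deliver raw frames to a delegate so Vision can process them.
    let output = AVCaptureVideoDataOutput()
    output.setSampleBufferDelegate(self, queue: videoQueue)
    if session.canAddOutput(output) { session.addOutput(output) }

    session.startRunning()
  }

  func captureOutput(_ output: AVCaptureOutput,
                     didOutput sampleBuffer: CMSampleBuffer,
                     from connection: AVCaptureConnection) {
    // Each camera frame arrives here; this buffer is what you hand to Vision.
  }
}
```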


Start detecting hands in live video with VNDetectHumanHandPoseRequest, and get back data about each visible hand's joints via VNHumanHandPoseObservation.
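Running the request on a camera frame could be sketched roughly as follows; the helper name `detectHands(in:)` is an assumption, and the `.up` orientation is a simplification of the real rotation handling a camera feed needs.

```swift
import AVFoundation
import Vision

// Hypothetical helper: run a hand-pose request on one camera frame.
func detectHands(in sampleBuffer: CMSampleBuffer) throws -> [VNHumanHandPoseObservation] {
  let request = VNDetectHumanHandPoseRequest()
  request.maximumHandCount = 2  // Only track up to two hands.

  let handler = VNImageRequestHandler(cmSampleBuffer: sampleBuffer,
                                      orientation: .up,
                                      options: [:])
  try handler.perform([request])
  return request.results as? [VNHumanHandPoseObservation] ?? []
}
```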


Learn about how all your hand bones are connected together in this quick anatomy lesson!


Explore and organize the hand landmark data received from Vision’s VNHumanHandPoseObservations.
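One way to organize the observation's data is to pull out just the points you care about and drop anything Vision isn't confident about. A sketch, where the 0.5 confidence threshold is an assumed value rather than one from the course:

```swift
import Vision

// Extract the five fingertip points, filtering out low-confidence results.
func fingertipPoints(from observation: VNHumanHandPoseObservation)
    throws -> [VNHumanHandPoseObservation.JointName: VNRecognizedPoint] {
  let tips: [VNHumanHandPoseObservation.JointName] =
      [.thumbTip, .indexTip, .middleTip, .ringTip, .littleTip]
  let all = try observation.recognizedPoints(.all)
  return all.filter { tips.contains($0.key) && $0.value.confidence > 0.5 }
}
```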


Convert Vision’s landmarks for use in a SwiftUI view to display fingertip locations.
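Vision returns normalized coordinates with the origin at the lower left, while SwiftUI's origin is at the upper left, so the conversion scales by the view size and flips the y axis. A sketch, with a hypothetical overlay view:

```swift
import SwiftUI
import Vision

// Convert a normalized Vision point into a SwiftUI view's coordinate space.
// Vision's y axis points up; SwiftUI's points down, so flip y.
func viewPoint(for recognized: VNRecognizedPoint, in size: CGSize) -> CGPoint {
  CGPoint(x: recognized.location.x * size.width,
          y: (1 - recognized.location.y) * size.height)
}

// Hypothetical overlay: draw a dot at each converted fingertip location.
struct FingertipOverlay: View {
  let points: [CGPoint]  // Already converted to view coordinates.

  var body: some View {
    ForEach(points.indices, id: \.self) { i in
      Circle()
        .fill(Color.orange)
        .frame(width: 16, height: 16)
        .position(points[i])
    }
  }
}
```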


Collect specific hand landmarks and evaluate their positions for potential hand gesture matches.
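As one example of this kind of evaluation, a "pinch" can be read from the distance between the thumb tip and index fingertip. This is a sketch: the 0.05 distance and 0.3 confidence thresholds (in normalized image coordinates) are assumptions, not values from the course.

```swift
import Foundation
import Vision

// Thumb tip close to index tip reads as a pinch gesture.
func isPinching(_ observation: VNHumanHandPoseObservation) -> Bool {
  guard let thumb = try? observation.recognizedPoints(.thumb)[.thumbTip],
        let index = try? observation.recognizedPoints(.indexFinger)[.indexTip],
        thumb.confidence > 0.3, index.confidence > 0.3 else { return false }
  let distance = hypot(thumb.location.x - index.location.x,
                       thumb.location.y - index.location.y)
  return distance < 0.05
}
```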


The thumb is a bit different from the other fingers! Learn how to handle thumb data.
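That difference shows up directly in Vision's joint names: the thumb has CMC, MP, and IP joints rather than the MCP/PIP/DIP chain the other four fingers use.

```swift
import Vision

// The thumb's joint chain (carpometacarpal → metacarpophalangeal →
// interphalangeal → tip) differs from the other fingers'.
let thumbJoints: [VNHumanHandPoseObservation.JointName] =
    [.thumbCMC, .thumbMP, .thumbIP, .thumbTip]

// Compare with the index finger's chain.
let indexJoints: [VNHumanHandPoseObservation.JointName] =
    [.indexMCP, .indexPIP, .indexDIP, .indexTip]
```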


Detecting whole bodies with Vision is quite similar to the hand detection you’ve already done. Try it out!
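The code, too, mirrors the hand version: swap in the body request and observation types. As before, the helper name is an assumption and `.up` simplifies real orientation handling.

```swift
import AVFoundation
import Vision

// Hypothetical helper: run a body-pose request on one camera frame.
func detectBodies(in sampleBuffer: CMSampleBuffer) throws -> [VNHumanBodyPoseObservation] {
  let request = VNDetectHumanBodyPoseRequest()
  let handler = VNImageRequestHandler(cmSampleBuffer: sampleBuffer,
                                      orientation: .up,
                                      options: [:])
  try handler.perform([request])
  return request.results as? [VNHumanBodyPoseObservation] ?? []
}
```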


Collect specific body landmarks and evaluate their positions for potential body pose matches.
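For instance, "both hands up" can be read as each wrist sitting above its shoulder. A sketch, keeping in mind that Vision's y axis points up, so "above" means a larger y value:

```swift
import Vision

// Both wrists above their shoulders reads as a "hands up" pose.
func isHandsUp(_ observation: VNHumanBodyPoseObservation) -> Bool {
  guard let points = try? observation.recognizedPoints(.all),
        let leftWrist = points[.leftWrist],
        let rightWrist = points[.rightWrist],
        let leftShoulder = points[.leftShoulder],
        let rightShoulder = points[.rightShoulder] else { return false }
  return leftWrist.location.y > leftShoulder.location.y
      && rightWrist.location.y > rightShoulder.location.y
}
```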


Who is this for?

This course is best suited to experienced iOS developers who are comfortable with Swift. It focuses exclusively on using the Vision framework to detect and evaluate hand and body poses in live video capture. The UI is not a focus of this course, but the project does use SwiftUI and UIKit.

Covered concepts

  • Vision
  • AVFoundation
  • Hand Pose Request & Observation
  • Hand Anatomy
  • VNRecognizedPoint