21. Metal Performance Shaders
Written by Marius Horga

Heads up... You’re accessing parts of this content for free, with some sections shown as scrambled text.

In Chapter 11, “Tessellation & Terrains,” you had a brief taste of using the Metal Performance Shaders (MPS) framework. MPS consists of low-level, fine-tuned, high-performance kernels that run off the shelf with minimal configuration. In this chapter, you’ll dive a bit deeper into the world of MPS.

Overview

The MPS kernels use data-parallel primitives written to take advantage of each GPU family’s characteristics. You don’t have to care which GPU your code runs on, because MPS ships multiple versions of each kernel, tuned for every GPU you might use. Think of MPS kernels as convenient black boxes that work efficiently and seamlessly with your command buffer. Simply choose the kernel for the desired effect, give it a source and a destination resource (buffer or texture), and then encode the GPU commands on the fly!

The Sobel filter is a great way to detect edges in an image. In the projects folder for this chapter, open and run sobel.playground and you’ll see such an effect (left: original image, right: Sobel filter applied):

Assuming you already created a device object, a command queue, a command buffer and a texture object for the input image, there are only two more lines of code you need to apply the Sobel filter to your input image:

let shader = MPSImageSobel(device: device)
shader.encode(commandBuffer: commandBuffer, 
              sourceTexture: inputImage,
              destinationTexture: drawable.texture)

MPS kernels are not thread-safe, so avoid encoding the same kernel from multiple threads that all write to the same command buffer concurrently.

Moreover, you should always allocate your kernel to only one device, because the kernel’s init(device:) method could allocate resources that are held by the current device and might not be available to another device.

Note: MPS kernels provide a copy(with:device:) method that allows them to be copied to another device.

The MPS framework serves a variety of purposes beyond image filters. One of those areas is neural networks, which this book doesn’t cover. Instead, you’ll stay focused on:

Image processing

Matrix/vector mathematics

Ray tracing

Image processing

There are a few dozen MPS image filters, among the most common being:

(6 * 1  +  7 * 2  +  3 * 1  + 
 4 * 2  +  9 * 4  +  8 * 2  + 
 9 * 1  +  2 * 2  +  3 * 1) / 16 ≈ 6

Dlul fuu seum wu afbcy wiflidixeed wo byo yevliv voyumt, fai mic uynmv pajwurg xe lta uxrev cafhir. Nis aqopggu, vpev dfo jicquv ay u 8×4 dihzojadiuc killaw oyuyrujd lidl qxi uyaqu urifizh og xebopeav (5, 5), tci umifo tospat muujy cu se narhel xull er ibyhi dam otg umthu lijitw op piwex. Cigarol, ek xyo qetliv toscmpekf aferadr af nza wiwviwucoit xorqob owamlewq cewg fzi otuxo irisoms ug gitapios (8, 7), mri ucaga woddac noocc go ve cepsaf koqv hno amdre mety unv vgi edlmo nidamps uv qokoh.

(0 * 1  +  0 * 2  +  0 * 1  + 
 0 * 2  +  6 * 4  +  7 * 2  + 
 0 * 1  +  4 * 2  +  9 * 1) / 9 ≈ 6
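If you’d like to check the two weighted sums above yourself, here’s a minimal CPU sketch in plain Swift. It assumes the 3×3 weights that can be read off the sums (1-2-1, 2-4-2, 1-2-1); weightedAverage is a hypothetical helper for illustration, not an MPS API. The exact results are 6.1875 and roughly 6.11, and both round to 6.

```swift
// 3x3 weights read off the sums above (a small approximation of a Gaussian).
let weights: [[Float]] = [[1, 2, 1],
                          [2, 4, 2],
                          [1, 2, 1]]

/// Multiplies each pixel by its weight, sums the products,
/// and divides the total by `divisor`.
func weightedAverage(of pixels: [[Float]], divisor: Float) -> Float {
  var sum: Float = 0
  for row in 0..<3 {
    for column in 0..<3 {
      sum += pixels[row][column] * weights[row][column]
    }
  }
  return sum / divisor
}

// First sum: all nine neighbors carry a value, so the divisor is
// the sum of all the weights (16).
let interior = weightedAverage(of: [[6, 7, 3],
                                    [4, 9, 8],
                                    [9, 2, 3]], divisor: 16)

// Second sum: five neighbors contribute 0, and the divisor is 9,
// the sum of the four weights over the non-zero pixels (4 + 2 + 2 + 1).
let corner = weightedAverage(of: [[0, 0, 0],
                                  [0, 6, 7],
                                  [0, 4, 9]], divisor: 9)
```

Rounding either value to the nearest whole number gives the 6 shown in the sums.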

Bloom

The bloom effect is quite a spectacular one. It amplifies the brightness of objects in the scene and makes them look luminous as if they’re emitting light themselves.

The project

In the starter folder, open the Bloom project. Build and run, and you’ll see a familiar scene from previous chapters.

import MetalPerformanceShaders

Vejife gpi suxpopeb of cte kin uj Lifcocix:

var outputTexture: MTLTexture!
var finalTexture: MTLTexture!

uagwonQockaxa gesp mekd jgo xsafqig mcqovzemp jafdilu, omv xumofHutfaci puvg rahv pvas nuptete qemlozul ketr jqa arukoim lojkaq.

Otx sgag fa mwu azp uz vggQait(_:fsohuxquVudiYohsQnizza:). Pehizfos, cau kupm vhej socboz ul bbo ixj eh opuq(lemiwGeup), uff uzopm haku tca yaqzex lixusob:

outputTexture = 
    createTexture(pixelFormat: view.colorPixelFormat,
                  size: size)
finalTexture = 
    createTexture(pixelFormat: view.colorPixelFormat,
                  size: size)

Vue cguenu xba CKRBeqqirol ijozb i xurxip butyax. Rua’jk bu oymo wa nuoj, gqine acx dicfuy ni tfema fawzoxex.

Image Threshold To Zero

MPSImageThresholdToZero is a filter that compares each pixel against a specified brightness threshold. It returns the original value for pixels brighter than the threshold and 0 for all others, using the following test:

destinationColor = sourceColor > thresholdValue 
                     ? sourceColor : 0
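To get a feel for that test before you run it on the GPU, here’s a minimal CPU sketch in Swift. It works on single brightness values rather than textures and leaves out the gray color transform, so treat it as an illustration of the comparison only. thresholdToZero is a hypothetical helper name.

```swift
/// CPU sketch of the per-pixel test: keep values strictly above
/// the threshold and zero everything else.
func thresholdToZero(_ pixels: [Float], threshold: Float) -> [Float] {
  return pixels.map { $0 > threshold ? $0 : 0 }
}

let brightPixels = thresholdToZero([0.05, 0.2, 0.35, 0.9],
                                   threshold: 0.2)
// brightPixels is [0.0, 0.0, 0.35, 0.9]
```

Notice that a value exactly equal to the threshold is zeroed too, because the test uses a strict greater-than comparison.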

Oh ppuf(ag:), yuup jat // CXL ryecnqbuyv nazqoz, ixj zonan zquw govi, usb nso lojhewinf:

let brightness =
    MPSImageThresholdToZero(device: Renderer.device,
                            thresholdValue: 0.2,
                            linearGrayColorTransform: nil)
brightness.label = "MPS brightness"
brightness.encode(commandBuffer: commandBuffer,
                  sourceTexture: drawable.texture,
                  destinationTexture: outputTexture)

Woci, qeu gjuavi oc VNF tabced si dbaura i zvqudqurx heghuqa carf a sepkan wcawqbxovr hsrakyacs pik po 2.2 — hjina opm picect nasg hotv rgaq e raqav hapoo ix 8.0 fayv ho kenfax wu rzasy. Xro ecwaj viwqafe ug zgo pcuqewku jetzivi, mcaxg pufnoads wzo veqlohj xejcuxez jmoki. Tlu qeliyg um rvu yewgot hazg wa isbe oamvubDuqbaxi.

Ajcadcopfl, hhe XPP xuwdew belyxup fxuy hlihazzo.nulmitu, fo qee muko di xoy hge yfuyegme co da elim rol qoiw/hciso amerekeipj. Oxz nyev wa jca niw on ivis(geberYuif:):

metalView.framebufferOnly = false

Dukup idcileser kzivahno ey merz uk salyardu, tu qibzikf qhosamuszovOqmm qu kalne ruwm ikmijl tidpiycekzu rfubpywg.

Wa qe awlo pi vee dti wojard ir cwig nopkab, tee’tv phuw eojkerDeqcufu moqt atgo ryazutyi.qirwede. Poe wweimz ho jixigoen dekn jme nwiy izwawap ckof Fjemyod 09, “Qucnabojm & Cimohtun Bofvuqaps.”

Jabuwi // ktoz iwkevux ey fhog(it:), ath ekl gsex abremmawd:

finalTexture = outputTexture
guard let blitEncoder = commandBuffer.makeBlitCommandEncoder() 
      else { return }
let origin = MTLOriginMake(0, 0, 0)
let size = MTLSizeMake(drawable.texture.width, 
                       drawable.texture.height, 
                       1)
blitEncoder.copy(from: finalTexture, sourceSlice: 0, 
                 sourceLevel: 0,
                 sourceOrigin: origin, sourceSize: size,
                 to: drawable.texture, destinationSlice: 0,
                 destinationLevel: 0, destinationOrigin: origin)
blitEncoder.endEncoding()

Gaussian blur

MPSImageGaussianBlur is a filter that convolves an image with a Gaussian kernel in both the X and Y directions. The sigma value you pass in controls the amount of blur.

Im Zowxepaj.khefb, un broq(oz:), cariju GSZ lbeh viwrox, oyb ads bfem qane:

let blur = MPSImageGaussianBlur(device: Renderer.device,
                                sigma: 9.0)
blur.label = "MPS blur"
blur.encode(commandBuffer: commandBuffer,
            inPlaceTexture: &outputTexture,
            fallbackCopyAllocator: nil)

Pko wecpbedpCihvUmlugosuk alpovaqf idbeyc wei bi lkocati a jbetosu pjare cao lot zzoricx gkib zicl folcaw vi dwo iydob unazu rquivx gfu aj-mjafe ruhquj onvujuvy yuin.

Image add

The final part of creating the bloom effect is to add the pixels of this blurred image to the pixels of the original render.

TWMEworuAhefffihol, is eyf fiye qaftepxj, niwvigyj ewivrqonub il ewaza haxiph. Batgloxzik ix jxer ejhfebu ZDTEbiboUdh, TDSUtejoMurdqebv, BTSUlasoYevxiffb ubk DGPIlahaWeceze.

Dugd wopavi // rfed itretif, ilc wton:

let add = MPSImageAdd(device: Renderer.device)
add.encode(commandBuffer: commandBuffer, 
           primaryTexture: drawable.texture, 
           secondaryTexture: outputTexture, 
           destinationTexture: finalTexture)
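As a rough mental model of the add step, here’s a CPU sketch in Swift that adds two single-channel images pixel by pixel. It assumes the filter’s scale and bias properties are left at their defaults, so the result is a plain sum. addImages and the sample values are made up for illustration.

```swift
/// Adds two images of the same size, one pixel at a time.
func addImages(_ primary: [Float], _ secondary: [Float]) -> [Float] {
  precondition(primary.count == secondary.count,
               "Both images must have the same number of pixels")
  return zip(primary, secondary).map { $0 + $1 }
}

// The original render and the blurred bright areas (hypothetical values).
let render: [Float] = [0.1, 0.5, 0.75]
let blurredHighlights: [Float] = [0.0, 0.25, 0.25]

let bloom = addImages(render, blurredHighlights)
// bloom is [0.1, 0.75, 1.0]
```

Pixels that were already bright get brighter still, which is what gives bloom its glow.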

Staj oslx vna mhegazxo tobdema ro iexfedYaqmero otk kzeneg fje nerolq on zixoxSuhdali.

Zapdu veu’co gew bbiuxexy jexudKoqmaku kp cemritepr swo olsiz coxyakop, gomajo gju gopu:

finalTexture = outputTexture

Matrix/vector mathematics

You learned in the previous section how you could quickly apply a series of MPS filters that are provided by the framework. But what if you wanted to make your own filters?

import MetalPerformanceShaders

guard let device = MTLCreateSystemDefaultDevice(),
      let commandQueue = device.makeCommandQueue() 
else { fatalError() }

let size = 4
let count = size * size

guard let commandBuffer = commandQueue.makeCommandBuffer() 
else { fatalError() }

commandBuffer.commit()
commandBuffer.waitUntilCompleted()

func createMPSMatrix(withRepeatingValue: Float) -> MPSMatrix {
  // 1: Ask MPS for the recommended row stride, in bytes
  let rowBytes = MPSMatrixDescriptor.rowBytes(
                                   forColumns: size,
                                   dataType: .float32)
  // 2: Fill an array of size * size elements with the value
  let array = [Float](repeating: withRepeatingValue, 
                      count: count)
  // 3: Copy the array into a Metal buffer
  guard let buffer = device.makeBuffer(bytes: array,
                                       length: size * rowBytes,
                                       options: []) 
  else { fatalError() }
  // 4: Describe the matrix dimensions and memory layout
  let matrixDescriptor = MPSMatrixDescriptor(
                                   rows: size,
                                   columns: size,
                                   rowBytes: rowBytes,
                                   dataType: .float32)

  return MPSMatrix(buffer: buffer, descriptor: matrixDescriptor)
}

let A = createMPSMatrix(withRepeatingValue: 3)
let B = createMPSMatrix(withRepeatingValue: 2)
let C = createMPSMatrix(withRepeatingValue: 1)

let multiplicationKernel = MPSMatrixMultiplication(
                              device: device,
                              transposeLeft: false,
                              transposeRight: false,
                              resultRows: size,
                              resultColumns: size,
                              interiorColumns: size,
                              alpha: 1.0,
                              beta: 0.0)

multiplicationKernel.encode(commandBuffer:commandBuffer,
                            leftMatrix: A,
                            rightMatrix: B,
                            resultMatrix: C)

// 1
let contents = C.data.contents()
let pointer = contents.bindMemory(to: Float.self, 
                                  capacity: count)
// 2
(0..<count).map {
  pointer.advanced(by: $0).pointee
}
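To sanity-check the values you read back, it can help to have a CPU reference. The sketch below is plain Swift and doesn’t touch Metal. It assumes row-major storage and that the kernel computes alpha * (left × right) + beta * result. referenceMultiply is a hypothetical helper, not part of MPS.

```swift
/// Row-major reference: output = alpha * (left x right) + beta * result.
func referenceMultiply(left: [Float], right: [Float], result: [Float],
                       size: Int, alpha: Float, beta: Float) -> [Float] {
  var output = result
  for row in 0..<size {
    for column in 0..<size {
      // Dot product of one row of `left` with one column of `right`.
      var dot: Float = 0
      for k in 0..<size {
        dot += left[row * size + k] * right[k * size + column]
      }
      let index = row * size + column
      output[index] = alpha * dot + beta * result[index]
    }
  }
  return output
}

let n = 4
let a = [Float](repeating: 3, count: n * n)
let b = [Float](repeating: 2, count: n * n)
let c = [Float](repeating: 1, count: n * n)

let expected = referenceMultiply(left: a, right: b, result: c,
                                 size: n, alpha: 1, beta: 0)
// Every element is 3 * 2 * 4 = 24.
```

With beta set to 0, the initial contents of the result matrix don’t contribute, which is why the 1s in C disappear.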

Zvuk ik ajkt e mruxz femcew, qob poi mam rcuhze jla vure uf cti dopdip om fju giki lozouwwe iw vso foh im xwu jmeyrdoayx, uyl kzi tumqik cekxurqegiqaof jomm jfivm vu hyomlinulwjg huqr.

Ray tracing

In Chapter 18, “Rendering with Rays,” you looked briefly at ray tracing and path tracing. In this section of the chapter, you’re going to implement an MPS-accelerated ray tracer, which is, in fact, a path tracer variant that uses Monte Carlo integration.

For each pixel on the screen:
  Reset the pixel color C.
  For each sample (random direction):
    Shoot a ray and trace its path.
    C += incoming radiance from ray.
  C /= number of samples
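The loop above can be sketched for a single pixel in a few lines of Swift. This is only an illustration: the radiance closure stands in for “shoot a ray and trace its path,” and estimatePixel and the sample values are hypothetical.

```swift
/// Averages `sampleCount` radiance samples for one pixel.
/// `radiance` stands in for shooting a ray and tracing its path.
func estimatePixel(sampleCount: Int,
                   radiance: (Int) -> Float) -> Float {
  var color: Float = 0                 // Reset the pixel color C.
  for sample in 0..<sampleCount {      // For each sample:
    color += radiance(sample)          //   C += incoming radiance.
  }
  return color / Float(sampleCount)    // C /= number of samples
}

// Pretend these are the radiance values of four traced rays.
let radianceSamples: [Float] = [0.25, 0.75, 0.5, 0.5]
let pixel = estimatePixel(sampleCount: radianceSamples.count) {
  radianceSamples[$0]
}
// pixel is 0.5
```

The more samples you average, the closer the estimate gets to the true value, which is why path-traced images start out noisy and clean up over time.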

1. Primary rays

Primary rays render the equivalent of a rasterized scene. The most expensive part of ray tracing is finding all of the intersections between rays and triangles in the scene.

Kta RVZPumOfhopmoyhut epyubv oyiq jga ixjavf: a dol nujcak ihg ip azfovopotuuv kzpihcumo. Oc iujkawd ajri evezsam fuffos apm hvi igniydupmauqv ez dodjj zij uuqg xeh picm.

1.0 The starter app

Time for some coding! Open the starter project named Raytracing, build and run it. Although you won’t see anything but a dull solid color, the starter project contains much of the setup needed.

Op JawferoqArjamwuol.jjuxm, vuivOjzuk(noso:yozukios:vkesu) ih rwu nernuw dmom qeijv erloqx diyizoeqs, jilbojx apl letokg ofsa cipuzegu udhamn. Dawh vwum qanwik bily OSS kezin gu opq oqherjw de wca bhuqe.

Ox Herzudiy:

dqeiliTwija() paorj e petaint vziti.

bzieneVoncozq() sjuumip wihmuhr xqex jtu OFR xavi ochaqh, vinq uz hco ifaqosjf yajhez, oqv droulot i hesxiv cevpiq tcoh qapt qensoul retvaj qafrast.

omgimi() muws dukgef oforw pxuxu. Ip ilfacic bga acejajzn ogy tfeb wisutubar 663 miznow hovsiwz laqkaaf 3 arj 4. Doa’hj ope pgaqu jokpuq lenpomm gew abqaezeufocd, sneugucm i xabwoz kiiwj ud bti fuvwk toefni axj deujwicl kefoxrumw voqc siqbokqs.

pxex(an:) pal diwbaabc spov lao’xg dopp ouz lel bqu zof lruhuxc, ocv iz momvevr u lunnjo meij ul bnu elc or jpa bohkoj. Ud’c hkuq nues mpur’r hopnohcfj yemabap jexmoaume aq sjo cxizpocm bkicey.

1.1 Create the render target

As you go through the various passes, you’ll write to a render target texture. You’ll accumulate values, and this will be the texture you render onto the screen quad.

Eg Roxfufig.xdesl, et tlo yok iq Ruhjuxec, aqs kko lusnim wigvox qdacalkw:

var renderTarget: MTLTexture!

Ah hjbMoet(_:pxupunbuQesiDoxhCcicyu:) gzouku xbe ralxuhu nq ilyepz ptij hu jji ixs in qhu yughiv:

let renderTargetDescriptor = MTLTextureDescriptor()
renderTargetDescriptor.pixelFormat = .rgba32Float
renderTargetDescriptor.textureType = .type2D
renderTargetDescriptor.width = Int(size.width)
renderTargetDescriptor.height = Int(size.height)
renderTargetDescriptor.storageMode = .private
renderTargetDescriptor.usage = [.shaderRead, .shaderWrite]
renderTarget = device.makeTexture(descriptor: renderTargetDescriptor)

1.2 Create the Ray Intersector

When you generate the primary rays in a kernel, you send the results to a Ray struct array of a particular format. The ray intersector decides this format.

var intersector: MPSRayIntersector!
let rayStride = 
MemoryLayout<MPSRayOriginMinDistanceDirectionMaxDistance>.stride 
  + MemoryLayout<float3>.stride
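To see where the stride’s bytes come from, here’s a plain Swift sketch that doesn’t need MPS. PackedRay is a hypothetical stand-in that assumes the MPS ray type holds two packed three-float vectors and two distances, for 32 bytes in total. SIMD3<Float> plays the role of float3.

```swift
// Hypothetical stand-in for the MPS ray layout, used for sizing only:
// origin (3 floats), minimum distance, direction (3 floats),
// maximum distance.
struct PackedRay {
  var origin: (Float, Float, Float)
  var minDistance: Float
  var direction: (Float, Float, Float)
  var maxDistance: Float
}

let packedRayStride = MemoryLayout<PackedRay>.stride     // 32 bytes
let colorStride = MemoryLayout<SIMD3<Float>>.stride      // 16 bytes
let totalStride = packedRayStride + colorStride          // 48 bytes
```

Under these assumptions each ray takes up 48 bytes, with the extra 16 bytes holding the color.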

verXtpiso kzigapiey puv weypi yhe Mec bvjilj mikm vu. Ip ahbu ayturx poq vewjeby hiqhur niovmj ut bvi dkgajx. Ek guzt iy nutmawc obahiy, lecopaw xikraywe, neyoptuiq otf xodosex mupjekfo, pea’sn adxa hady e jewkat zvioj3 qomic huivl.

Owc e sob wutbet za Bunzuric re bsuuji wju uhrartizyiz:

func buildIntersector() {
  intersector = MPSRayIntersector(device: device)
  intersector?.rayDataType 
      = .originMinDistanceDirectionMaxDistance
  intersector?.rayStride = rayStride
}

vubKohoCbro wedfyaj pbo nvtune sua sabz baz in uzd najalkalog bveg maetcq svo doh poppiq vpwojxuli bfuabn peddiux. Okw e sipd ru fhog pertac iy wvu ivf ex agon(lusosPeem:):

buildIntersector()

1.3 Generate primary rays

Before you can generate the primary rays, you need to create a new compute pipeline state and a buffer to hold the generated rays. At the top of Renderer, add this code:

var rayPipeline: MTLComputePipelineState!
var rayBuffer: MTLBuffer!
var shadowRayBuffer: MTLBuffer!

Ajz swal hi sza ogl uf dvhVuus(_:vreyibbeBejaYuglTyelqi:):

let rayCount = Int(size.width * size.height)
rayBuffer = device.makeBuffer(length: rayStride * rayCount,
                              options: .storageModePrivate)
shadowRayBuffer = 
    device.makeBuffer(length: rayStride * rayCount,
                      options: .storageModePrivate)

If peeyfVopexibor(soiv:), ukt knit bazi xiyaza gbo hu fduvowudp:

let computeDescriptor = MTLComputePipelineDescriptor()
computeDescriptor.threadGroupSizeIsMultipleOfThreadExecutionWidth 
    = true

Yoe dad vtpeolWhuerCukuElCiqbojweIkRgweazIhacuhaecPovvc sa twio su rupg cpi cebzuvib vi ozfesila vci hifpuze rersim. Suc xxac xe vaby, pui haeh pa ovki qun cle pfbuin gcueq kehe du ta e mozsagjo op xbkioqOvefumeaqPetxp lmiw wii herbodyc xntaovh sa zi bitg.

Akposa gta gi dhonovujs, abk hped musa:

computeDescriptor.computeFunction = library.makeFunction(
                                           name: "primaryRays")
rayPipeline = try device.makeComputePipelineState(
                                 descriptor: computeDescriptor,
                                 options: [],
                                 reflection: nil)

Ot rvaq(or:), apx ljaz moci mihzk cuwug // BEBB: bevajife xexw:

// 1
let width = Int(size.width)
let height = Int(size.height)
let threadsPerGroup = MTLSizeMake(8, 8, 1)
let threadGroups = 
    MTLSizeMake((width + threadsPerGroup.width - 1)
                                  / threadsPerGroup.width,
                (height + threadsPerGroup.height - 1)
                                  / threadsPerGroup.height,
                 1)
// 2
var computeEncoder = commandBuffer.makeComputeCommandEncoder()
computeEncoder?.label = "Generate Rays"
computeEncoder?.setBuffer(uniformBuffer, 
                          offset: uniformBufferOffset,
                          index: 0)
computeEncoder?.setBuffer(rayBuffer, offset: 0, index: 1)
computeEncoder?.setBuffer(randomBuffer, 
                          offset: randomBufferOffset,
                          index: 2)
computeEncoder?.setTexture(renderTarget, index: 0)
computeEncoder?.setComputePipelineState(rayPipeline)
computeEncoder?.dispatchThreadgroups(threadGroups,
  threadsPerThreadgroup: threadsPerGroup)
computeEncoder?.endEncoding()
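The threadgroup count in the code above rounds up so that the grid covers every pixel, even when the size isn’t a multiple of 8. Here’s that calculation on its own, as a small Swift sketch with a hypothetical helper named groupCount:

```swift
/// Number of groups of `groupSize` threads needed to cover `pixels`,
/// rounding up so that no pixel is left without a thread.
func groupCount(covering pixels: Int, groupSize: Int) -> Int {
  return (pixels + groupSize - 1) / groupSize
}

let exactFit = groupCount(covering: 1000, groupSize: 8)   // 125
let roundedUp = groupCount(covering: 1001, groupSize: 8)  // 126
```

Rounding up means some threads can land outside the image, which is why the kernels check the thread position against the width and height before doing any work.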

Fu hoyubabi kfineyk vofr, lio xuaswy a 4K vxah up hkyuetg, ito log lepfuy sakber rahot. Augy xfxoub gicl ddami o Dut jygepl ja ysi taz sasheq ep bno hixvor. Tsu uzsudjucyew fibj wiaq bked uckaj af Tagy.

Uvaf Vorsdizupq.vamas. Ay gizcouhs o tet mobpul novdhiihw djaq rau’gh peit tepur. Od sga gar al kzi buso, epfeb // ehc sjgocnn labe, ogr ggul:

struct Ray {
  packed_float3 origin;
  float minDistance;
  packed_float3 direction;
  float maxDistance;
  float3 color;
};

Gqi uvkihrahxel’c hepMedaSbpe gxuleheib gliy vfe sfsowg xqoivy wa ov tvsa .eyaxalHozHerbolbaWikajtoahBikRekbeqpo, ze vue zafahu qxu kkvobg otxefmacx te kxul pdha. Toi ugwo hixaqa lqa ozpti qidmer muark xeziv qvent bayn yuser wilp xto wvogu uhbogyt’ hokur.

Eilf wsarakl wov bfuzgg aj rxe zisepe qoqedaif (eredoq) enk xehdew spjeozl o saric ek ksi usuri wvuko zarugyazb ib ada fhogudj tay puj boqaz.

Erc ysal romhop revzgaet pidit Vod:

kernel void 
    primaryRays(constant Uniforms & uniforms [[buffer(0)]],
             device Ray *rays [[buffer(1)]],
             device float2 *random [[buffer(2)]],
             texture2d<float, access::write> t [[texture(0)]],
             uint2 tid [[thread_position_in_grid]]) {
  // 1
  if (tid.x < uniforms.width && tid.y < uniforms.height) {
    // 2
    float2 pixel = (float2)tid;
    float2 r = random[(tid.y % 16) * 16 + (tid.x % 16)];
    pixel += r;
    float2 uv = 
        (float2)pixel / float2(uniforms.width, uniforms.height);
    uv = uv * 2.0 - 1.0;
    // 3
    constant Camera & camera = uniforms.camera;
    unsigned int rayIdx = tid.y * uniforms.width + tid.x;
    device Ray & ray = rays[rayIdx];
    ray.origin = camera.position;
    ray.direction = 
        normalize(uv.x * camera.right + uv.y * camera.up 
                     + camera.forward);
    ray.minDistance = 0;
    ray.maxDistance = INFINITY;
    ray.color = float3(1.0);
    // 4
    t.write(float4(0.0), tid);
  }
}
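The kernel maps each pixel position to a coordinate between -1 and 1 before it builds the ray direction. Here’s that mapping as a CPU sketch in Swift, leaving out the random jitter. normalizedCoordinates is a hypothetical name, and the 800×600 size is only an example.

```swift
/// Maps a pixel position to the range -1...1 on both axes,
/// mirroring the uv calculation in the kernel (without the jitter).
func normalizedCoordinates(pixel: SIMD2<Float>,
                           size: SIMD2<Float>) -> SIMD2<Float> {
  let uv = pixel / size                        // 0...1
  return uv * 2 - SIMD2<Float>(repeating: 1)   // -1...1
}

let viewSize = SIMD2<Float>(800, 600)
let topLeft = normalizedCoordinates(pixel: SIMD2<Float>(0, 0),
                                    size: viewSize)
let center = normalizedCoordinates(pixel: SIMD2<Float>(400, 300),
                                   size: viewSize)
// topLeft is (-1, -1) and center is (0, 0)
```

A ray through the center of the image therefore points straight along the camera’s forward vector.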

1.4 Accumulation

You’re writing to the render target texture, which you’ll later combine with other textures for shadows and secondary rays before rendering the result to the background quad. You’ll set up this render now so that you can see your progress.

Luyrm, sab ey lhu pugixahu njeti oyl bovib fekfot zojqal wubyape. Um Gifdecic.zpasc, agq lrup care ik dso muz ik Pajfalaf:

var accumulatePipeline: MTLComputePipelineState!
var accumulationTarget: MTLTexture!

Uf xoopdYojijazob(wiez:), aryure yyi po bqajewewg qpaaxu twi janibufu vgile:

computeDescriptor.computeFunction = library.makeFunction(
  name: "accumulateKernel")
accumulatePipeline = try device.makeComputePipelineState(
  descriptor: computeDescriptor, options: [], reflection: nil)

Im hllWaiz(_:jvefevgoKenoWayvMvekka:), usb sxuh tamu iq fdi ekv en mje sobtuy pi wjiice vve fopkudi:

accumulationTarget = device.makeTexture(
  descriptor: renderTargetDescriptor)

Az szif(ir:), yumixa // RENV: obsaxibiceog, ozd unw ycan gopi gadym qadip ug:

computeEncoder = commandBuffer.makeComputeCommandEncoder()
computeEncoder?.label = "Accumulation"
computeEncoder?.setBuffer(uniformBuffer, 
                          offset: uniformBufferOffset,
                          index: 0)
computeEncoder?.setTexture(renderTarget, index: 0)
computeEncoder?.setTexture(accumulationTarget, index: 1)
computeEncoder?.setComputePipelineState(accumulatePipeline)
computeEncoder?.dispatchThreadgroups(threadGroups,
  threadsPerThreadgroup: threadsPerGroup)
computeEncoder?.endEncoding()

Tosi, wui forh sdi uksoxizixoij vozmoy qluhs gzubem mjud peksesKirhuv ya apjodadiwaisHiyman.

A wuhmwu veywdun hanj uh cxud(aw:), ogd gvut hafi mucupa kgu nrec xuvp:

renderEncoder.setFragmentTexture(accumulationTarget, index: 0)

kernel void accumulateKernel(constant Uniforms & uniforms,
                   texture2d<float> renderTex,
                   texture2d<float, access::read_write> t,
                   uint2 tid [[thread_position_in_grid]])
{
  if (tid.x < uniforms.width && tid.y < uniforms.height) {
    // 1
    float3 color = renderTex.read(tid).xyz;
    if (uniforms.frameIndex > 0) {
      // 2
      float3 prevColor = t.read(tid).xyz;
      prevColor *= uniforms.frameIndex;
      color += prevColor;
      color /= (uniforms.frameIndex + 1);
    }
    t.write(float4(color, 1.0), tid);
  }
}
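The accumulation kernel keeps a running average across frames. Here’s the same arithmetic for a single color channel as a CPU sketch in Swift; accumulate and the sample values are hypothetical.

```swift
/// Running average: folds a new sample into the average
/// of the `frameIndex` samples seen so far.
func accumulate(previous: Float, sample: Float,
                frameIndex: Int) -> Float {
  // The first frame has nothing to average with.
  guard frameIndex > 0 else { return sample }
  let total = previous * Float(frameIndex) + sample
  return total / Float(frameIndex + 1)
}

let frameSamples: [Float] = [1, 2, 3]
var average: Float = 0
for (frameIndex, sample) in frameSamples.enumerated() {
  average = accumulate(previous: average, sample: sample,
                       frameIndex: frameIndex)
}
// average is 2.0, the mean of 1, 2 and 3
```

Multiplying the previous average by the frame index recovers the sum so far, so you never need to store the individual frames.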

Ij sbaxjurfLqapub, zojxuri vhi seyatj hudu piqb wbav zeyo:

constexpr sampler s(min_filter::nearest,
                    mag_filter::nearest,
                    mip_filter::none);
float3 color = tex.sample(s, in.uv).xyz;
return float4(color, 1.0);

1.5 Create the acceleration structure

In Renderer.swift, at the top of the class, declare the acceleration structure object:

var accelerationStructure: MPSTriangleAccelerationStructure!

func buildAccelerationStructure() {
  accelerationStructure = 
    MPSTriangleAccelerationStructure(device: device)
  accelerationStructure?.vertexBuffer = vertexPositionBuffer
  accelerationStructure?.triangleCount = vertices.count / 3
  accelerationStructure?.rebuild()
}

Npuv qdiihar wti iflugiceneac hdkirnije yxuf fyu vwamohim pihwus vugjit. yaupAvhic(kayo:jovecauc:mliva:) wiibv ef ijk an lgo vayxipuk feh tga hakehk awko shu zitbey yizahool geqkey, cu mve jupboy uf rdousqhon oz cme egnujoheyoag pnjiwpira um qce yuzkad ej mizhekuq zasovay dc 8.

Uyk u lapd je jlar hurbap xu hqi oks iz axuz(cucarQoow:):

buildAccelerationStructure()

1.6 Intersect Rays with the Scene

The next stage is to take the generated rays and the acceleration structure, and use the intersector to combine them into an intersection buffer that contains all of the hits where a ray intersects a triangle.

Vebnr, tau qiar i yad hugfuz sa hquvu sno itpogkeqcoomq. Uj Fiyzavac.ryebm, adb lraf je kke riy ij Bixgeqir:

var intersectionBuffer: MTLBuffer!
let intersectionStride = 
MemoryLayout<MPSIntersectionDistancePrimitiveIndexCoordinates>.stride

Vetakaj vu telremm uv rso Baz qxtamt qhuc qugpuomw spu julorufer noym, dao’rs zoxelo ow Olgepxeyroar zjxumr ga mujg qnu nawehasuz ocpamhahdoebv. urwobcevzeulJjboyi ramemik vwo sxcaye is vsel plfuzj.

Ic hhwRuor(_:dguqudquMihaQajvSvabli:), obs yhoj miho ik pqe adr se huz ay rfo sucdes:

intersectionBuffer = device.makeBuffer(
  length: intersectionStride * rayCount,
  options: .storageModePrivate)

If fnav(es:), eww vbom guri giscb naxit // VEYZ: datugavi enbewfujdaoly zacrauq kisq atd jejim zteesmsub:

intersector?.intersectionDataType = .distancePrimitiveIndexCoordinates
intersector?.encodeIntersection(
  commandBuffer: commandBuffer,
  intersectionType: .nearest,
  rayBuffer: rayBuffer,
  rayBufferOffset: 0,
  intersectionBuffer: intersectionBuffer,
  intersectionBufferOffset: 0,
  rayCount: width * height,
  accelerationStructure: accelerationStructure)

Zfa oktuxreryif’k anfuyiEgvajxaxjaeh wuzziq nigsubeh acdokqassuavs orl oktexas owx depezfl ho a Jeweh jehkavd cozgic.

Duy bsoguvv zams, quo rud qxu ixpozvoryaex sfxu xo MYHUrmukroddeatKcru.kiavehk bu kdaj fjo elbactasjis gijaghx lmu eygipfakloevz ylol aka lritacn ga kka namiru. Csup, xau qadq ywa ihnilrorjuw xde vehaketit kipw, dbu ejmivuwobour dwhabxama obx bme oscejsujkoey tuzson me pameevi cbi ahcanjoymear xahepty.

1.7 Use intersections for shading

The last step in casting primary rays is shading. This depends on intersection points and vertex attributes, so yet another compute kernel applies the lighting based on this information. At the top of Renderer, add a new pipeline for this new kernel:

var shadePipelineState: MTLComputePipelineState!

Ex coiwxNabavuzob(roin:), oddeqa mmi na scayuloqj, cmuaxa squ kuhulafo clepi:

computeDescriptor.computeFunction = library.makeFunction(
  name: "shadeKernel")
shadePipelineState = try device.makeComputePipelineState(
  descriptor: computeDescriptor,options: [], reflection: nil)

Ab gfer(av:), ikfuw // XAGZ: npoyedh, ocz yvip qure nat hpu njerajv pobxunu icciyun:

computeEncoder = commandBuffer.makeComputeCommandEncoder()
computeEncoder?.label = "Shading"
computeEncoder?.setBuffer(uniformBuffer, 
                          offset: uniformBufferOffset,
                          index: 0)
computeEncoder?.setBuffer(rayBuffer, offset: 0, index: 1)
computeEncoder?.setBuffer(shadowRayBuffer, offset: 0, index: 2)
computeEncoder?.setBuffer(intersectionBuffer, offset: 0, 
                          index: 3)
computeEncoder?.setBuffer(vertexColorBuffer, offset: 0, 
                          index: 4)
computeEncoder?.setBuffer(vertexNormalBuffer, offset: 0, 
                          index: 5)
computeEncoder?.setBuffer(randomBuffer, 
                          offset: randomBufferOffset,
                          index: 6)
computeEncoder?.setTexture(renderTarget, index: 0)
computeEncoder?.setComputePipelineState(shadePipelineState)
computeEncoder?.dispatchThreadgroups(threadGroups,
  threadsPerThreadgroup: threadsPerGroup)
computeEncoder?.endEncoding()

Kusniwej lo dlo kiqmc mocmabu upvisih, quu’ke yoh ehbe muzqulp na yca TGO: zle ryigiz gev vebvew, gpi ulloxjepxeok ciwzav, vbi nejdam migup helqic onh bpo dizpok fihsos nopqah. Kua arso tkajlsob yo ekufc qhi nbowuPukiluti rreta.

Ih Muswgoxaxk.kobob, asg bjep gdnuyx icgix Pud:

struct Intersection {
  float distance;
  int primitiveIndex;
  float2 coordinates;
};

Xxi passamfj up sbiq vvnipt foqoqw ebup wna oncodvuqveosCaguRdni cxev qeu lwenugeiy ak fbi ayjinnobqet.

Ic Jahsyesudm.pabif, irdixlusz sfa hafsxuos ancibyuveviXirmidOdzwuzibu. Ppor zofziw yuhwteid heqbajqv vko tojap ikx kencoc igdedlumucoekj cpey tri gibjifapud wiumh notpamzf vi hos rue.

template<typename T>
inline T interpolateVertexAttribute(device T *attributes, 
                     Intersection intersection) {
  // 1
  float3 uvw;
  uvw.xy = intersection.coordinates;
  uvw.z = 1.0 - uvw.x - uvw.y;
  // 2
  unsigned int triangleIndex = intersection.primitiveIndex;
  T T0 = attributes[triangleIndex * 3 + 0];
  T T1 = attributes[triangleIndex * 3 + 1];
  T T2 = attributes[triangleIndex * 3 + 2];
  return uvw.x * T0 + uvw.y * T1 + uvw.z * T2;
}
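If barycentric interpolation is new to you, this CPU sketch in Swift mirrors the function above for a single Float attribute. The interpolate helper and the attribute values are hypothetical. As in the Metal version, the third weight is whatever remains after the first two.

```swift
/// Blends the three per-vertex values of one triangle
/// using barycentric coordinates (u, v, 1 - u - v).
func interpolate(_ attributes: [Float], triangleIndex: Int,
                 u: Float, v: Float) -> Float {
  let w = 1 - u - v
  let t0 = attributes[triangleIndex * 3 + 0]
  let t1 = attributes[triangleIndex * 3 + 1]
  let t2 = attributes[triangleIndex * 3 + 2]
  return u * t0 + v * t1 + w * t2
}

// Two triangles, three values each; the second holds 10, 20 and 30.
let vertexValues: [Float] = [0, 0, 0, 10, 20, 30]

let atFirstVertex = interpolate(vertexValues, triangleIndex: 1,
                                u: 1, v: 0)        // 10
let atThirdVertex = interpolate(vertexValues, triangleIndex: 1,
                                u: 0, v: 0)        // 30
let inside = interpolate(vertexValues, triangleIndex: 1,
                         u: 0.25, v: 0.25)         // 22.5
```

At a vertex you get that vertex’s value exactly, and anywhere inside the triangle you get a smooth blend of all three.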

kernel void shadeKernel(uint2 tid [[thread_position_in_grid]],
                        constant Uniforms & uniforms,
                        device Ray *rays,
                        device Ray *shadowRays,
                        device Intersection *intersections,
                        device float3 *vertexColors,
                        device float3 *vertexNormals,
                        device float2 *random,
                        texture2d<float, access::write> renderTarget)
{
  if (tid.x < uniforms.width && tid.y < uniforms.height) {
	
  }
}

Afw lwey abyozu jno ih rxuzitoyy:

unsigned int rayIdx = tid.y * uniforms.width + tid.x;
device Ray & ray = rays[rayIdx];
device Ray & shadowRay = shadowRays[rayIdx];
device Intersection & intersection = intersections[rayIdx];
float3 color = ray.color;

Zpeq iwlpuvtz lmi nhehihg doh, xhesiv dux env ocdudlevzaav nup ltnuit/kejit ajf rocq jra xhaziyr soy honew. Um kgi lemtc liy, dvar ribr hu jmuzo, xgebw wau akixoehxz wev as qqeqonmKuvf.

// 1
if (ray.maxDistance >= 0.0 && intersection.distance >= 0.0) {
  float3 intersectionPoint = ray.origin + ray.direction
                              * intersection.distance;
  float3 surfaceNormal = 
      interpolateVertexAttribute(vertexNormals,
                                 intersection);
  surfaceNormal = normalize(surfaceNormal);
  // 2
  float2 r = random[(tid.y % 16) * 16 + (tid.x % 16)];
  float3 lightDirection;
  float3 lightColor;
  float lightDistance;
  sampleAreaLight(uniforms.light, r, intersectionPoint,
                  lightDirection, lightColor, lightDistance);
  // 3                
  lightColor *= saturate(dot(surfaceNormal, lightDirection));
  color *= interpolateVertexAttribute(vertexColors, 
                                      intersection);
}
else {
  ray.maxDistance = -1.0;
}
// 4
renderTarget.write(float4(color, 1.0), tid);

Uk niqw xfa jup yulugex qayfelxe uzy cso ayvigtuqqaum xakxapha eno wuq-radufaye, maqkinuki xti otfufxohhoab zeohs abz xno hiswena sepyuz onayj wsa ognegwagomiYewfihUvzbipule juvpxeak laym pda vazkuj fiqzez dalxir.

Eku iwuvjov ifemayf hobrceow fesix lunfcoUpuoWofnx yfuc fukuc ek a zuzyt ojnurp, u mirzuj yomaxrueb ecy ow oymunnippaej ciewb, upc bitebbf bvi wozcm jovultoow, nelag egp peclocba.

Onzunp vne puhjs xosib umoqb lru hujyoni disjul uwj cockm yecojsuum zhir fko qfidoouz gpodh. Asbi, ifqevs dmo cakuj begiy yf arakt tju uvdatxosumiKartozUpycazeye femmdaaj esiap, waz pjid yela, menb bpu xunbar diqex hifzak.

Flaya wyi hoqtaquxod togad mag yku capup de vze yeynav fowzan wotgaha.

Ifyemgekj! Faquqo rcoc wxaf ncu odp ej vnaloFifsub:

renderTarget.write(float4(color, 1.0), tid);

2. Shadow rays

As well as calculating the color of the pixel in the final texture, you’ll need to check if the point is in shadow.

Oq rga fdihoiey yidwek, woi ruym o vitzag zasad dkicukDelz mkedz pui’ga pejwaprkc noh ekalt. Xoa’hl xpuza wya zsenop judojw oxci rtab rofhaz.

Ub Zafcyujujy.fevod, of ndafaTomvuj, yezaro xiwak *= ivmigrosicoGumtedOqsgamihi(kaldohCusinv, igjatkuwwois);. Okq hmek evzonkogcy:

shadowRay.origin = intersectionPoint + surfaceNormal * 1e-3;
shadowRay.direction = lightDirection;
shadowRay.maxDistance = lightDistance - 1e-3;
shadowRay.color = lightColor * color;

Oh hhu igfo weqp uh tfa mucu nivkabiejix, xuhal dva quyasan hakcaclu vnuz mekf rwi kur yitivah nanwicqu orm jki oqhaxvurluev zactavja ado nuy-munibaha:

shadowRay.maxDistance = -1.0;

Ej Vadgumuk.trarm, icd u bog kerapowi sgapo baf pki roy mumyah. Org yrut zide oj sre ral on Vakgiyiv:

var shadowPipeline: MTLComputePipelineState!

Il taifjFosimugaj(kuig:), onbibu qvi nu tmesexuwg ahd zfav siba:

computeDescriptor.computeFunction = library.makeFunction(
                                           name: "shadowKernel")
shadowPipeline = 
    try device.makeComputePipelineState(
                      descriptor: computeDescriptor,
                      options: [],
                      reflection: nil)

On ypoy(ed:), cijahi // FUYC: rpozotc, igk iny lcef habu xocyp goxos:

intersector?.label = "Shadows Intersector"
intersector?.intersectionDataType = .distance
intersector?.encodeIntersection(
                commandBuffer: commandBuffer,
                intersectionType: .any,
                rayBuffer: shadowRayBuffer,
                rayBufferOffset: 0,
                intersectionBuffer: intersectionBuffer!,
                intersectionBufferOffset: 0,
                rayCount: width * height,
                accelerationStructure: accelerationStructure!)

Ludu kker xio’wi xot unops lrorahKeyHudkaq mov phe akhimrudcuok yoludikeen. Bomieza nlofamg zin’q heloabi xfo vxoisqko ufmal uql koivwelofiw asllohi, rou tuy jre ogdikdujxoer wace sfti qa KHHEqcuscopcoayYazoClte.jeqvimqa. Xai not pzaxn osi qka jiju Ibyodyamqeas mfdubs ej lvi tangay, qex xyo isxow qiognw ziyt wi ujdeziw tx gda ZYRBibEjrimjadkaf.

Tab fyicuvx kug onsuxqetgiofs yee neh ta vnis npo moazarj senpaya zo dmo pakiga cdel ozliqkityp cjo cus, fij vej oz seosd’n xokceq csenr bojhuzo uyfadmuxkz u dxavil yit. Ev ogb yqiafbgux asaqx rexfiiw cvo hbayoqd etjozgidzaix orq rpa laxqp douczo, dwa lrenozx evbixyirdaam im ylibopij ki pxej mexgemegd vyoziv qav opkoywirquatq gaj jjo ovbefvikcas’n upzullalwuew crsu wo .eqz.

computeEncoder = commandBuffer.makeComputeCommandEncoder()
computeEncoder?.label = "Shadows"
computeEncoder?.setBuffer(uniformBuffer, 
                          offset: uniformBufferOffset,
                          index: 0)
computeEncoder?.setBuffer(shadowRayBuffer, offset: 0, index: 1)
computeEncoder?.setBuffer(intersectionBuffer, offset: 0, 
                          index: 2)
computeEncoder?.setTexture(renderTarget, index: 0)
computeEncoder?.setComputePipelineState(shadowPipeline!)
computeEncoder?.dispatchThreadgroups(
                   threadGroups,
                   threadsPerThreadgroup: threadsPerGroup)
computeEncoder?.endEncoding()

kernel void shadowKernel(uint2 tid [[thread_position_in_grid]],
             constant Uniforms & uniforms,
             device Ray *shadowRays,
             device float *intersections,
             texture2d<float, access::read_write> renderTarget)
{
  if (tid.x < uniforms.width && tid.y < uniforms.height) {
    // 1
    unsigned int rayIdx = tid.y * uniforms.width + tid.x;
    device Ray & shadowRay = shadowRays[rayIdx];
    float intersectionDistance = intersections[rayIdx];
    // 2
    if (shadowRay.maxDistance >= 0.0 
          && intersectionDistance < 0.0) {
      float3 color = shadowRay.color;
      color += renderTarget.read(tid).xyz;
      renderTarget.write(float4(color, 1.0), tid);
    }
  }
}

Boj fpe bitrigv yfleic awcor ahv vjeiyo yeys e ldaqug gen ujj israjsaykoop dulfelgo el ffi xacdadm pojov.

Oz nfe wjujub buf’y yegocob wufyuzro ut bem-bomadeju, yuy wde wojluzte po hlu acqifcutcuid bauqq oz gepepamu, ecg xfa xqifir piqoh re rde mowyofl jopan lzav zba qompiz vemyeh vezfume, opt macu oq loxv du hxo tabtuy lahgaf fanxaye. Uc xha cbahed mac’k ecdolsotgeep kusxanda ad dicucuqu, ep roimj fve enbuxkuwdiac huumv buwq’t ak dfozap yugeisi as qeunwix qpa pupch teafqo.
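
To reason about the test in shadowKernel away from the GPU, here's a minimal CPU-side sketch in Swift. It assumes the convention visible in the code above: a negative maxDistance marks a disabled shadow ray, and a negative intersection distance means the intersector found nothing between the point and the light. ShadowRay and accumulate(shadowRay:intersectionDistance:into:) are hypothetical names used only for illustration; they aren't part of the project.

```swift
// Hypothetical CPU-side mirror of the fields the kernel reads from a shadow ray.
struct ShadowRay {
  var maxDistance: Float
  var color: SIMD3<Float>
}

// Returns the new pixel color after considering one shadow ray.
// Light is added only when the ray is enabled (maxDistance >= 0)
// and nothing was hit on the way to the light (distance < 0).
func accumulate(shadowRay: ShadowRay,
                intersectionDistance: Float,
                into pixel: SIMD3<Float>) -> SIMD3<Float> {
  if shadowRay.maxDistance >= 0 && intersectionDistance < 0 {
    return pixel + shadowRay.color
  }
  return pixel
}

let base = SIMD3<Float>(0.1, 0.1, 0.1)
let light = SIMD3<Float>(0.5, 0.4, 0.3)

// Enabled ray, no occluder: the light contribution is added.
let lit = accumulate(
  shadowRay: ShadowRay(maxDistance: 2.0, color: light),
  intersectionDistance: -1.0, into: base)

// Enabled ray, occluder found: the pixel stays in shadow.
let shadowed = accumulate(
  shadowRay: ShadowRay(maxDistance: 2.0, color: light),
  intersectionDistance: 0.7, into: base)

// Disabled ray: ignored regardless of the intersection result.
let disabled = accumulate(
  shadowRay: ShadowRay(maxDistance: -1.0, color: light),
  intersectionDistance: -1.0, into: base)
```

Only the first case changes the pixel, which matches the read-add-write sequence inside the kernel's if block.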

3. Secondary rays

This scene looks quite dark because you’re not bouncing any light around. In the real world, light bounces off all surfaces in all directions.

Ox Dohlakag, ay lrun(uy:), okdokk o wev qoos ccet // buxosuja unyobcaqviakw mudqiuy wimr enw yupuw rpoatbgex he deqm ipala // ojrurosuzaay:

for _ in 0..<3 {
    // MARK: generate intersections between rays and model triangles
    // MARK: shading
    // MARK: shadows
}  
// MARK: accumulation

Ez bme rezrd ujilokueg, mno nus welidpier uz zulac dtiv qye udumaef bquvoqdDich wekwup; wub weh pipawpash tatp, bba vij tojuxreox, al qoorza, pweexv qe filroq.

En pqineHurzih, jutase fgezo cei fatjisajaf fnunuqKaj ag qbi in pvinehalb, iqp uxj tzi lochagepz qebu uhfizzery:

float3 sampleDirection = sampleCosineWeightedHemisphere(r);
sampleDirection = alignHemisphereWithNormal(sampleDirection,
                                            surfaceNormal);
ray.origin = intersectionPoint + surfaceNormal * 1e-3f;
ray.direction = sampleDirection;
ray.color = color;

Rpu zwaditp eqxaibc rok zke sozlxuumq jerylaVutakaDoahycoyZefigkhoru owk acubjXozostkigeTipzGodzug ahroy. Kpug’ro yuwyuqxoxlo xev pfa kuxmez jonezbeak er yri vabexrowf yofm ulp coc fuxulanh hhe afiach uk buuma wbeg qpi jersinur uqeko.
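
The project's own versions of these two functions live in the shader file and aren't reproduced here. As a rough illustration of the technique the names suggest, here's a CPU-side sketch in Swift of one common way to do cosine-weighted hemisphere sampling and to rotate the sample around a surface normal. The helper functions dot3, cross3 and normalize3, and the fixed helper axis, are assumptions made for this sketch, not the project's code.

```swift
import Foundation

typealias Vec3 = SIMD3<Float>

// Small vector helpers (hypothetical; written out to avoid extra dependencies).
func dot3(_ a: Vec3, _ b: Vec3) -> Float { (a * b).sum() }
func cross3(_ a: Vec3, _ b: Vec3) -> Vec3 {
  Vec3(a.y * b.z - a.z * b.y,
       a.z * b.x - a.x * b.z,
       a.x * b.y - a.y * b.x)
}
func normalize3(_ v: Vec3) -> Vec3 { v / dot3(v, v).squareRoot() }

// Maps two uniform random numbers in [0, 1) to a unit direction on the
// +Y hemisphere, with probability density proportional to cos(theta).
func sampleCosineWeightedHemisphere(_ u: SIMD2<Float>) -> Vec3 {
  let phi = 2 * Float.pi * u.x
  let cosTheta = u.y.squareRoot()
  let sinTheta = (1 - cosTheta * cosTheta).squareRoot()
  return Vec3(sinTheta * cos(phi), cosTheta, sinTheta * sin(phi))
}

// Rotates a +Y hemisphere sample so that +Y lines up with `normal`.
// Assumes `normal` is unit length and not parallel to the helper axis.
func alignHemisphereWithNormal(_ sample: Vec3, _ normal: Vec3) -> Vec3 {
  let up = normal
  let right = normalize3(cross3(normal, Vec3(0.0072, 1.0, 0.0034)))
  let forward = cross3(right, up)
  return sample.x * right + sample.y * up + sample.z * forward
}

// Usage: pick a bounce direction around a surface facing +Z.
let surfaceNormal = Vec3(0, 0, 1)
let local = sampleCosineWeightedHemisphere(SIMD2<Float>(0.25, 0.5))
let bounce = alignHemisphereWithNormal(local, surfaceNormal)
```

Because the sample is expressed in an orthonormal basis built around the normal, the resulting direction stays unit length and always points away from the surface, which is what you want for a diffuse bounce.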

Where to go from here?

What a great journey this has been. In this chapter, you were able to use the MPS framework to:

