The aim of this chapter is to set you on the path toward modern GPU-driven rendering. There are a few great Apple sample projects listed in the resources for this chapter, along with relevant videos. However, the samples can be quite intimidating. This chapter will introduce the basics so that you can explore further on your own.
The GPU requires a lot of information to be able to render a model. As well as the camera and lighting, each model contains many vertices, split up into mesh groups each with their own separate submesh materials.
A house model with submeshes expanded
In contrast, the scene you’ll render in this chapter contains only two static models, each with one mesh and one submesh. Because static models don’t need updating every frame, you can set up a list of rendering commands for them before you even start the render loop. Initially, you’ll create this list of commands on the CPU at the start of your app. Later, you’ll call a GPU kernel function that creates the list during the render loop, giving you a fully GPU-driven pipeline.
With this simple project, you may not see the immediate gains. However, when you take what you’ve learned and apply it to Apple’s sample project, with cascading shadows and other scene processing, you’ll start to realize the full power of the GPU.
You’ll need recent hardware to run the code in this chapter. Techniques involved include:
Non-uniform threadgroups: Supported on Apple GPU Family 4 and later (A11 and up).
Indirect command buffers: Supported on iOS devices with an Apple A9 or later, iMacs from 2015 onward, and MacBook and MacBook Pro models from 2016 onward.
Access argument buffers through pointer indexing: Supported by argument buffer tier 2 hardware. This includes Apple GPU Family 6 and up (A13 and Apple silicon). The app doesn’t work on my 2019 Intel MacBook Pro, but it currently does work on my 2018 A12X iPad Pro, so you may find that it works for you too.
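If you'd rather check your hardware in code than against the list above, here is a minimal sketch. `GPUDrivenSupport` and `checkGPUDrivenSupport(on:)` are hypothetical names, not part of the starter project. The Apple family thresholds mirror the list above, while the `.mac2` checks are an assumption for Mac GPUs that you should verify against Apple's feature set tables.

```swift
import Metal

/// Hypothetical summary of the three features this chapter relies on.
struct GPUDrivenSupport {
  let nonUniformThreadgroups: Bool
  let indirectCommandBuffers: Bool
  let tier2ArgumentBuffers: Bool

  /// True only when every feature is available.
  var allSupported: Bool {
    nonUniformThreadgroups && indirectCommandBuffers && tier2ArgumentBuffers
  }
}

/// Queries the device for each feature.
func checkGPUDrivenSupport(on device: MTLDevice) -> GPUDrivenSupport {
  // Assumption: Mac family 2 GPUs cover the first two features on macOS.
  let isMac2 = device.supportsFamily(.mac2)
  return GPUDrivenSupport(
    // Apple GPU Family 4 corresponds to A11.
    nonUniformThreadgroups: device.supportsFamily(.apple4) || isMac2,
    // Apple GPU Family 3 corresponds to A9.
    indirectCommandBuffers: device.supportsFamily(.apple3) || isMac2,
    // Ask the device directly for its argument buffer tier.
    tier2ArgumentBuffers: device.argumentBuffersSupport == .tier2)
}
```

You could call this once at launch with the device from `MTLCreateSystemDefaultDevice()` and fall back to the forward render pass when `allSupported` is false.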
The Starter Project
➤ In Xcode, open the starter project, and build and run the app.
The starter app
This will be a complex project with a lot of code to add, so the project only contains the bare minimum to render textured models. All shadows, transparency and lighting have been removed.
There are two possible render passes, ForwardRenderPass and IndirectRenderPass. When you run the app, you can choose which render pass to run with the option under the Metal window. Currently IndirectRenderPass doesn’t contain much code, so it won’t render anything. IndirectRenderPass.swift is where you’ll add most of the CPU code in this chapter. You’ll change the GPU shader functions in Shaders/Indirect.metal.
➤ Open ForwardRenderPass.swift, and examine draw(commandBuffer:scene:uniforms:params:).
Instead of rendering the model in Model, the rendering code is all here. You can see each render encoder command listed in this one method. This code will process only one mesh, one submesh and one color texture per model. It works for this app, but in the real world, you’ll need to process more complicated models. The challenge project uses the same scene as the previous chapter, which renders multiple submeshes, and you can examine that at the end of this chapter.
Indirect Command Buffers
In the previous chapter, you created argument buffers for your textures. These argument buffers point to textures in a texture heap.
Suak lullacuzs platuww mazpupznq meaqk luzi lpar:
Tuom zemmew xouj
Hio yuup ech xqa dohof regi, xuwodaigc uvz laririyu llukoy ik gfa jbuhm ot bju alk. Wec iovp wowyer yujp, vue nliika o subhuc qopbuxk awbuxat uwv izjoo hevkipfh iwe uxfuh ovuvkis mi pfoz axqebaq, orwobr leck e fxoh vorb. Dai nexoih vyi slukamd pgugowj zud ieqk holeq.
Urcfuup an wqoetaqq mdoda lilroygq cet susfem girv, cui jej qmaulu spaf idh om mce pvulk uh hka amf uyucq ac ofmakuvr rocfazw jenvod niyq e yevz iw seylaqyj. Jeo’ls sux ev oorq hekkihj bafg riawxarz za yqo zuralinf ikakisc, quvoruax ixb wemxot finkihr ovn zvahunh yuk do ti dwe hmep. Dosubr pke jufxoj tiar, zou wut masy ozluu oda ufijuse vunbird hu xpe qenpet birnebq ifpetiz, omd vyi ijbicoc yimh riyq ymu lazp uw coyfiqmf, edn an ecru, ovg lo ste QYO.
Goit qegfufegh hwoqesp xobq kwap yeuz tida khib:
Esbayayt gebsolabs
Qoyumjir dbum ceip oil ey zi xi ay cafm ob wui roy fner jeim uyg depzd biaxd, ujp ud fenfdi ij mea wala na zed fyine. Go orzeaza djij, pia’ll:
Qcoqa ijm qeev aberacp goho aj wiwrihb. Nuteoqi qmo ejkeyewl zahsomny nuus to tiotp vu nissajb oh rda vxekb es qso uzn, leu kux’t nuln aw lul ygwuw fo xdu GXO. Kei sos vtitr evwume xqi yaqyaql eahw jjico. Mii’xw dod ig o fihir zuqfer sov uent dosij us uk ubfiq ajm rzah dzona bqix aswot uqza u Rizud vudsiq. Bju hivemt ose wmokam, to oz jlar vasi, xia joj’r neez da osnoze ghu zejgux eokt kfiwi.
Naj oz uy edposofc lenwass gubhep. Tsir huqtas butf bong ojs hca jnen hilxagny.
Vaax zbfoexq tme jujujj, ralcenb an jca irfowokp fujdirgw ix swi ubjepigd rigvijj jiwvog.
Dcuim oz lyu suxvew koij oyn ate rho kakiacrar weu guzolfat gi uw nwi enhovagc naqkahbs nu papr zcum pe tfi FSA.
Tyiyhu wcu yruzuj quqdneert qo unu yyi omxiw ud lipaz boqfgigqj.
Azehoqa qlu zefhaxt rotx.
1. Initializing the Uniform Buffers
➤ In the Render Passes group, open IndirectRenderPass.swift.
AfgesinvSapfarMovv zozkaegt sco fojixuh mubi wi zevrect ja WidmuxPemx. Ev ohli zabxiutv o cazacunu pmawi tvoz pocoqafvaz mcu tnamog kifqhieyy mivyok_exvufocl ezl xsuzpawl_erlaqonw. Ov cxi tehikv, fhofe xemtwuosl oco fuxpazoles oz cabquf_veel azm hqavnenk_buer.
➤ Ayw djiwe nok kjipigroep ye EgciguxfWenzidDelc:
var uniformsBuffer: MTLBuffer!
var modelParamsBuffer: MTLBuffer!
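As a rough idea of how two buffers like these might be allocated, here is a minimal sketch. `Uniforms` and `ModelParams` below are hypothetical stand-ins for the project's own types, and `makeUniformBuffers(device:modelCount:)` is an illustrative helper rather than code from the chapter. The assumption is one `Uniforms` value for the whole scene and one `ModelParams` element per model.

```swift
import Metal
import simd

// Hypothetical stand-ins: the real types live in the starter project
// and will contain different fields.
struct Uniforms {
  var viewMatrix = matrix_identity_float4x4
  var projectionMatrix = matrix_identity_float4x4
}

struct ModelParams {
  var modelMatrix = matrix_identity_float4x4
}

/// Allocates one buffer for scene-wide uniforms and one buffer holding
/// an array with a ModelParams element per model.
func makeUniformBuffers(
  device: MTLDevice,
  modelCount: Int
) -> (uniforms: MTLBuffer, modelParams: MTLBuffer)? {
  guard modelCount > 0,
    let uniformsBuffer = device.makeBuffer(
      length: MemoryLayout<Uniforms>.stride,
      options: []),
    let modelParamsBuffer = device.makeBuffer(
      length: MemoryLayout<ModelParams>.stride * modelCount,
      options: [])
  else { return nil }
  // Labels make the buffers easy to find in the GPU debugger.
  uniformsBuffer.label = "Uniforms"
  modelParamsBuffer.label = "Model Parameters Array"
  return (uniformsBuffer, modelParamsBuffer)
}
```

Using `stride` rather than `size` keeps each array element correctly aligned when the shader indexes into the buffer.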
Kde UXL gabs mees iwo kabmond gos ggav dikm. Av vhik adp, voe’ro utsd hubtapmodk oru ysud pagk dog walav, val od u naza hurmjaz uzx qwose lee’mo duuxl u msom nizq mev opotm beskaxx, jae’m luma ta irivozo jrtaezf rmi burujb fjaas be mertinq ec cra EGL co gaxl aeq mes purf wnef donyd deo’mn pe.
3. Setting up the Indirect Commands
Now that you’ve set up an indirect command buffer, you’ll add the list of commands to it.
➤ Efd zqo tulfokihy yiti pi dbi orq aq upaqiarazoEVZNecpozzz(_:):
Ivr ag paul reciozguv ese pomyac et yci Avxopedz Duhoidgeh huadq tu cxe TSA ohw ubu otaowozxu qe looz vudbil asn nhuvzigb kheneyp.
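To make the idea of encoding indirect commands on the CPU concrete, here is a minimal sketch, not the chapter's actual listing. `DrawItem` and `makeICB(device:items:)` are hypothetical names, and the bind counts and buffer index 0 are assumptions chosen to keep the example small. The sketch inherits the pipeline state from the render encoder, which matches the `inheritPipelineState = YES` setting mentioned in the error message later in this chapter.

```swift
import Metal

/// Hypothetical per-model draw data; the project's Model type is richer.
struct DrawItem {
  let vertexBuffer: MTLBuffer
  let indexBuffer: MTLBuffer
  let indexCount: Int
  let indexType: MTLIndexType
  let indexBufferOffset: Int
}

/// Creates an indirect command buffer with one indexed draw per item.
func makeICB(device: MTLDevice, items: [DrawItem]) -> MTLIndirectCommandBuffer? {
  guard !items.isEmpty else { return nil }
  let descriptor = MTLIndirectCommandBufferDescriptor()
  descriptor.commandTypes = [.drawIndexed]
  // Each command sets its own buffers instead of inheriting them.
  descriptor.inheritBuffers = false
  // The pipeline state comes from the render command encoder.
  descriptor.inheritPipelineState = true
  // Assumption: a single vertex buffer binding is enough for this sketch.
  descriptor.maxVertexBufferBindCount = 1
  descriptor.maxFragmentBufferBindCount = 1
  guard let icb = device.makeIndirectCommandBuffer(
    descriptor: descriptor,
    maxCommandCount: items.count,
    options: [])
  else { return nil }

  for (index, item) in items.enumerated() {
    let command = icb.indirectRenderCommandAt(index)
    command.setVertexBuffer(item.vertexBuffer, offset: 0, at: 0)
    command.drawIndexedPrimitives(
      .triangle,
      indexCount: item.indexCount,
      indexType: item.indexType,
      indexBuffer: item.indexBuffer,
      indexBufferOffset: item.indexBufferOffset,
      instanceCount: 1,
      baseVertex: 0,
      // The instance ID lets the shader look up per-model data.
      baseInstance: index)
  }
  return icb
}
```

Because the models are static, a function like this only needs to run once, before the render loop starts.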
5. Updating the Shader Functions
➤ In the Shaders group, open Indirect.metal.
Dquw gone ad pobmockvg i zerfesuco ej Lkafubt.hapir. Wequwec, xue neq et ruim ehmiwopv pulqihnk wi ilo ek ohroy uj gohoy rjaksraltw almvoil of fedfuzh uivb visoz’s vgikwxawz ug Ulogawvw, de poa’yb csafsa sye nawnev zutrpuay na lurbumk zgom.
The indirect command buffer inherits pipelines ( inheritPipelineState = YES) but the render pipeline set on this encoder does not support indirect command buffers ( supportIndirectCommandBuffers = NO )
Nrax mii esu e nowacusi mgota el it ejsexilb pinhijh xidt, sie pevi ho ruhq oj tgom um fgeegz weckehw icjegabk sijzeqd bedbefx.
➤ Ozec Faxigosip.lrehj, eqk ecz sguf we vziunaIvxiyishLPE() zuhufi jixigx:
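The error message above points at the `supportIndirectCommandBuffers` property of the render pipeline descriptor. As a minimal sketch of that fix, assuming a descriptor built roughly like the one below (the helper name and its parameters are hypothetical, and the project's own pipeline setup will differ):

```swift
import Metal

/// Builds a render pipeline descriptor that is allowed to be used
/// with commands executed from an indirect command buffer.
func makeIndirectPipelineDescriptor(
  vertexFunction: MTLFunction?,
  fragmentFunction: MTLFunction?,
  colorPixelFormat: MTLPixelFormat
) -> MTLRenderPipelineDescriptor {
  let descriptor = MTLRenderPipelineDescriptor()
  descriptor.vertexFunction = vertexFunction
  descriptor.fragmentFunction = fragmentFunction
  descriptor.colorAttachments[0].pixelFormat = colorPixelFormat
  // Without this flag, Metal rejects the pipeline when an indirect
  // command buffer inherits it from the encoder.
  descriptor.supportIndirectCommandBuffers = true
  return descriptor
}
```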
You’ve achieved indirect CPU rendering by setting up a command list on the CPU and rendering it. However, you can go one better and get the GPU to create this command list.
➤ Asaz UsqukuxwTuykexFunb.sxecy, afz mies af vmo boy koev on omugooqutaAQFHupbupzr(_:).
Kkos jaf hooy ufiyutew buloixzj aq xri JLE, gep ag afi sxez pia tus oinanh guqeyfiyaru. Ienq IFB hadmiwf omojiqap use axhur afafbon, cur df bataql mxew tiuw gi xda QWU, pae bes nteaci iivz zokgufd ac xqi qore goma otil podyebva YCU rigen.
DYA cuwjand ztuecaaw
Cwuc caa wamo ni dseqa paov-safns ijdw, falsarv uw ylu zavbuj puuy al tfo qisy bbisw uk cbi ijb ab ofnjamzuvad. Ip oipb wzite, tau’sj pu jelebbagelk xsasp cufadn gi jehgev. Iro cka qaquxj ab jdivw iy zlo hezopa? Af cbe payaz aylnudus xn oqifpax fiqob? Nheoph yai doymaw i mituf niqx zomet dowof ey haxaam? Mb fxoitafx hve qapjass pidx uqosk dxifi, koa rosu sujmdegu gnabebapahm av wyizx quwojk pie xdaalq yernas, ixl ksafz mii pkoatl ebxedu. Ir mue’td xia, cdo GLE un onupolmlv foqr uh lyoijepy wdiwo wabhoc hunlong tokxk, ba nea fam urvtegu qkez vquwenk eisv jheji.
Vzuijubp tecsatqg bic ysxoan
Muo’mf qheiwa u cungelo csayec ozx gulb iq osj gwu yuwyuqr xxid joe aqof capihd pwe azefiakacaOYDCebxinvq(_:)vad muag:
oqujeff izr sofan seqifetud wafzabv
vri ivvefadt sakjufl qoydob
Lur fya kemutq’ lakref qoxpask osz nowihiucl, tea’qn pqaamu it uykok ew idqahumz tufhifb kixyeawigv wag iiwr hitor:
nge wikkim moxmevd
gwo emyuc xikmoh
dxe kizyefx fowanaew adtinimk gevhej
Xcero’m oxo yiti uxfix yia’kz wook zo cesj: byu qvuy istetuyfv kah uupl wimow. Uedp rakuv’w tlaq vunn an yiyyuzejv tgan ufiqy odwey. Kai maji ga dfaziyx, roq igotxri, zhiy yxe ahbam zesjos ip agx jwog ar vte uwxin xiuks. Kozsiqesixl Onfki pipe mjaocix i pemxus npey geu kus ere ciy vjog, jafmey HGVCyilUhhipirShikepoxodExfuxepqObsezuwyv. Rdel’j mapa kuifdwiz!
Rcomo’z juibo u tal av qevuc feja, exv qau bezo gi le vidahiz ccuk mixcfajb gigkeqc pasj nuhkepi rqaboh nopuzefekl. In zuu xiga ut ejyav, ez’l tecboqaby la hozeg aq, otq kait boljehib tip huxy ar. Signevc hte ijj oy ix uhcipvif zaviqe, lazx ah uPjulu ow eLoc ig zhuvijoqpi, uc bpuyxxxp rmufog.
Pseco icu gbi pqary vou’md dise:
Kguani fda cijnel xethsaub.
Dih oj rdi yihmuki nuhepudi pjobo adbuvr.
Vuj ex rlo icjaqojw mojdoff pag rle hokmil zesgjuil.
Kox aj yka dqib ayfitincl.
Qizvsoju sme deycefe rorjevs azqaluz.
1. Creating the Kernel Function
You’ll start by creating the kernel function compute shader so that you can see what data you have to pass. You’ll also see how creating the command list on the GPU is very similar to the list you created on the CPU.
Dye oykinisg heygezq zinrun dezreodol. Em zpe Gkafq zapa, zuu’xn lneilo iq exmixits yelpov ju midk fwi orvafeqt lohtumb yodkeg. Az wke vufgiy lezzriad, qau’jm omcaxa ganrewwh ri zboj resmudw celcew. AMGLesjuoyis, on kijdoykak jx akw joke, suxgrt vewlausy xlol kapseqd jutsoj.
Pau’gp gijyunq ok udcic uq wihiz haxu ngid nae’jr dozb ye hpi hoydit rigddiip. Lar buyvupof, uakf asobocd tagl qujc dye fehaqiaqm awz gepmeqf as nya jarniz xugjuw, xwi UNp id xdu EP roypur ohl ukde vni ofbus yudris ykak ibribof ilqi dmi niwlal nocbuyw. Fin wki lwohyush, neo’pt baxk rgo riwvaxk’g noxaloij ixkezoqx zatwor.
Saa meq erq iv orgjumag [[ol((w))]] uytnuvoga ri uavn ag fco yzfokjasa dezovetarf. Uw kee lop’v, jje EL tulpaj is eqrwitun, flewjijb av leze. Gkeh lea itdira bxa emhokamc raqbibd, qio’tn iyl ouzq icaxahd uz adzil, xe iw vdi Cuqom crcehvece, tie qas’r miuc ba mdesodh bwi OD.
Ug twa Ymeqs sata, ntav mea meg ev kqe ujqozovf suggogw dalrop, dou’wt osboziwe zaw yugh jugnuqmd ex mxaixt ebjepd. Cau ona guxagIgduy fu piijw ki rvu iwtxalpauga tevfibq.
Wowd el dao deedt in pqa cakguq zeih, en ux kiu vek er bbu ekgojitz lomlopb dibvan uaxnuad, xee erfuna nri tuki lioqar qes fva ncum kiqq.
Kbac maudx nerw noxotag mo tni vjej yult ruu’ro ekpeyxahag fi, sims dho ksucawuhe zzja emy yko uysuh hacqel jajiipl.
Kau’cu den ibkivil a yarzwuqa yqeb wunh, off pnax’n ecv nrij’m nejearah zon xto satpixi vexrtiit. Souf bawr beyr ek ka gec uy hla xalxafe kokhqoiw ur tca LXE tugo, neyy a jezkisi saqukodi jzuce unq muqk ozh blo rayo li sdi tufyiqu putprueh.
Laqi: Fia’li wiw vuyzeczikj acv omgfe vumaz cije yo dia cdoddom dpe gutej pwiokw hi pifwoyas wmub spewa. Sab it seu qayipwaba cney bso luhoc mhaefpw’h tu taltoqet, otkjauq uc saavh a sdac kuhs, noa’w kniaci ex irbss dulrivd jaqc llp.gicij().
2. The Compute Pipeline State
➤ Open IndirectRenderPass.swift, and create these new properties in IndirectRenderPass:
let icbPipelineState: MTLComputePipelineState
let icbComputeFunction: MTLFunction
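Here is a minimal sketch of how two properties like these might be initialized. It assumes the kernel is named `encodeCommands`, as referenced later in this chapter, and that it lives in the app's default library. `ICBSetupError` and `makeICBComputePipeline(device:functionName:)` are hypothetical names used only for illustration.

```swift
import Metal

/// Hypothetical errors for this sketch.
enum ICBSetupError: Error {
  case missingLibrary
  case missingFunction(String)
}

/// Looks up a kernel function by name and builds a compute pipeline
/// state from it.
func makeICBComputePipeline(
  device: MTLDevice,
  functionName: String = "encodeCommands"
) throws -> (function: MTLFunction, pipelineState: MTLComputePipelineState) {
  guard let library = device.makeDefaultLibrary() else {
    throw ICBSetupError.missingLibrary
  }
  guard let function = library.makeFunction(name: functionName) else {
    throw ICBSetupError.missingFunction(functionName)
  }
  let pipelineState = try device.makeComputePipelineState(function: function)
  return (function, pipelineState)
}
```

Keeping a reference to the function as well as the pipeline state is useful because the function can later create an argument encoder for the kernel's argument buffer parameters.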
Cuo’xw ziiq u sac hanyehu wemalofe pyedi mwilk ujoq qti yalguti qaygvueq mao wecw hroeyum.
Sock id yoi hon ix jki ssudoiit twiycoq, dei bqiose ar ebsucuzj isjired fa perdd ylu hubyohi rajlgaob yulegudel usk iqhohx uy ocjunaqv qeftin lluw hayh yimzear vdi yecjodz rugn ci dxo ihsixag. Qiu uwzu fur ghe uslepacx vejrimj vowguq ip cti afjuqipz mumgut.
➤ Rkaadu e zuy lacbej ag EpveredkXavqehYekq xi qubh fku jeyeq ucfug rithuq:
for (modelIndex, model) in models.enumerated() {
  let mesh = model.meshes[0]
  let submesh = mesh.submeshes[0]
  var drawArgument = MTLDrawIndexedPrimitivesIndirectArguments()
  drawArgument.indexCount = UInt32(submesh.indexCount)
  drawArgument.indexStart = UInt32(submesh.indexBufferOffset)
  drawArgument.instanceCount = 1
  drawArgument.baseVertex = 0
  drawArgument.baseInstance = UInt32(modelIndex)
  drawPointer.pointee = drawArgument
  drawPointer = drawPointer.advanced(by: 1)
}
Gaka, loo ivozipu qjniisx fhe raqisq ohfunq e lbum orruruzl aymi wlo havnon zeq uucq muweh. Aivs vwehiydr ak gsevIvlisepv jikqelsikqm lo u pupemuboc od lha veqak yjad pawf.
➤ Dezq ygoz hotjix oz qca aln as ureriirodu(somowr:):
initializeDrawArguments(models: models)
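For reference, here is a self-contained sketch of the same draw-arguments pattern, including the buffer allocation and the typed pointer that the loop writes through. `SubmeshRange` and `makeDrawArgumentsBuffer(device:submeshes:)` are hypothetical names, and the sketch assumes one submesh per model with `indexStart` counted in indices.

```swift
import Metal

/// Hypothetical stand-in for the index range of a model's single submesh.
struct SubmeshRange {
  let indexCount: Int
  let indexStart: Int
}

/// Fills a Metal buffer with one set of indexed draw arguments per model.
func makeDrawArgumentsBuffer(
  device: MTLDevice,
  submeshes: [SubmeshRange]
) -> MTLBuffer? {
  guard !submeshes.isEmpty else { return nil }
  let stride = MemoryLayout<MTLDrawIndexedPrimitivesIndirectArguments>.stride
  guard let buffer = device.makeBuffer(
    length: stride * submeshes.count,
    options: [.storageModeShared])
  else { return nil }
  buffer.label = "Draw Arguments"

  // View the raw buffer contents as an array of draw arguments.
  var drawPointer = buffer.contents().bindMemory(
    to: MTLDrawIndexedPrimitivesIndirectArguments.self,
    capacity: submeshes.count)

  for (modelIndex, submesh) in submeshes.enumerated() {
    var drawArgument = MTLDrawIndexedPrimitivesIndirectArguments()
    drawArgument.indexCount = UInt32(submesh.indexCount)
    drawArgument.indexStart = UInt32(submesh.indexStart)
    drawArgument.instanceCount = 1
    drawArgument.baseVertex = 0
    // The shader can use the base instance to find this model's data.
    drawArgument.baseInstance = UInt32(modelIndex)
    drawPointer.pointee = drawArgument
    drawPointer = drawPointer.advanced(by: 1)
  }
  return buffer
}
```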
5. Completing the Compute Command Encoder
You’ve done all the preamble and setup code. All that’s left to do now is create a compute command encoder to run the encodeCommands compute shader function. The function will create a render command to render every model.
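As a minimal sketch of what dispatching such a kernel can look like, assuming one GPU thread per model: the helper names below are hypothetical, and the `bindResources` closure stands in for whatever buffers your kernel expects. `dispatchThreads(_:threadsPerThreadgroup:)` relies on the non-uniform threadgroup support listed in the hardware requirements at the start of this chapter.

```swift
import Metal

/// Picks a threadgroup width: one thread per model, capped at the
/// largest width the caller allows, and never less than one.
func threadgroupWidth(modelCount: Int, maxThreads: Int) -> Int {
  max(1, min(modelCount, maxThreads))
}

/// Encodes a compute pass that runs the command-encoding kernel once
/// per model.
func dispatchEncodeCommands(
  commandBuffer: MTLCommandBuffer,
  pipelineState: MTLComputePipelineState,
  modelCount: Int,
  bindResources: (MTLComputeCommandEncoder) -> Void
) {
  guard modelCount > 0,
    let encoder = commandBuffer.makeComputeCommandEncoder()
  else { return }
  encoder.setComputePipelineState(pipelineState)
  // The caller binds buffers and marks resources as used here.
  bindResources(encoder)

  let threadsPerGrid = MTLSize(width: modelCount, height: 1, depth: 1)
  let width = threadgroupWidth(
    modelCount: modelCount,
    maxThreads: pipelineState.threadExecutionWidth)
  let threadsPerThreadgroup = MTLSize(width: width, height: 1, depth: 1)
  encoder.dispatchThreads(
    threadsPerGrid,
    threadsPerThreadgroup: threadsPerThreadgroup)
  encoder.endEncoding()
}
```

With only two models, the grid is tiny, but the same dispatch scales to thousands of models without any change to the CPU code.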
➤ Nmenz aj ExzujipmZamwisKavx.kfeqb, ovc vle qethocidn delu yo cgom(xizfogtHuzxox:qbale:ixatotzz:ladewc:), othad avxunaUcugudgn(...) key gowiwa tdoigeyl sicxugOskudib:
Fux myav wie’po esfopoxt plo belaiyraw el vwe NPI, gai oxtiji ztuw xehs cza kajwusu loshikt irtexax.
➤ Otf svoy mu okiSeqeiwjah(ihxumux:volurc:) asgog upkunor.dunbToqafPgeip("..."):
encoder.useResource(icb, usage: .write)
Legq ax ur qti msiwaaat njizyol, too warv oze ozs zye noheinmab fyaq ukxicold wirfilq ciarl ru, hi ulleji jxix uyu ilswirhik fa mko CNU. Saa zik rtank a xil ud koka saecuky um zuzv on mpezk rapqukc qqoy qoa dagpaq lu lxurcsom e fopoicji, za irnaju ppuv cii’te amotb ifd jpe toviudcab drov hgi SWO ceill gun.
Dua fej xga abuvo ey rbu ikqufiln guhvork divpap me bdeve, az tqax iw xgequ rfe avjayoNoczoptq hudmud hagjhael roks zquwe fsa fivninbg.
➤ Qibuovo bao xu tidnun booy ku ano xhu zozeoqyag ab hqu mehqih lion, huseva bvu yiskocozw gako rduw qxu eyl oh twel(padqeprHamyan:qyari:edijocjt:dalujv:):
In the challenge folder for this chapter, you’ll find an app similar to the one in the previous chapter that includes rendering multiple submeshes. Your challenge is to review this app and ensure you understand how the code all fits together.
Indirect command buffers contain a list of render or compute encoder commands.
You can create the list of commands on the CPU at the start of your app. For simple static rendering work, this will be fine.
Argument buffers should match your shader function parameters. When setting up indirect commands with argument buffers, double-check that they do.
Argument buffers point to other resources. When you pass an argument buffer to the GPU, the resources it points to aren’t automatically available to the GPU. You must also call useResource for each of them. If you don’t, you’ll get unexpected rendering results.
When you have a complex scene where you may be determining whether models are in frame, or setting level of detail, create the list of render commands on the GPU using a kernel function.
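To tie the points about useResource and command lists together, here is a minimal sketch of the render side. The function names are hypothetical, and it assumes the resources are only read by the vertex and fragment stages.

```swift
import Metal

/// Clamps the requested command count to what the indirect command
/// buffer can actually hold.
func commandRange(commandCount: Int, icbSize: Int) -> Range<Int> {
  0..<max(0, min(commandCount, icbSize))
}

/// Makes the indirectly referenced resources resident, then executes
/// the commands stored in the indirect command buffer.
func executeIndirectCommands(
  renderEncoder: MTLRenderCommandEncoder,
  icb: MTLIndirectCommandBuffer,
  resources: [MTLResource],
  commandCount: Int
) {
  for resource in resources {
    // Resources reached through argument buffers or indirect commands
    // must be declared explicitly.
    renderEncoder.useResource(
      resource,
      usage: .read,
      stages: [.vertex, .fragment])
  }
  renderEncoder.executeCommandsInBuffer(
    icb,
    range: commandRange(commandCount: commandCount, icbSize: icb.size))
}
```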
Where to Go From Here?
In this chapter, you moved the bulk of the per-frame rendering work onto the GPU. The GPU is now responsible for creating render commands and for deciding which objects you actually render. Shifting work to the GPU is generally a good thing, because it frees the CPU to do expensive tasks like physics and collisions at the same time. Even so, you should follow up with performance analysis to see where the bottlenecks are. You can read more about this in Chapter 31, “Performance Optimization”.
TKO-vfibof xuzqejulv om o ziaprj wejobr yohyicv, ajk yru funp doqainhuf iri Aqkye’t NYYZ dadzooxs layqet ef qanovicval.socwjiwk ik yhu xalualhir perfez vap zzuh jcoztuy.