Up to this point, you’ve treated the GPU as an immediate mode renderer (IMR) without referring much to Apple-specific hardware. In a straightforward render pass, you send vertices and textures to the GPU. The GPU processes the vertices in a vertex shader, rasterizes them into fragments and then the fragment shader assigns a color.
Immediate mode pipeline
The GPU uses system memory to transfer resources between passes where you have multiple passes.
Immediate mode using system memory
Since the A7 64-bit mobile chip, Apple began transitioning to a tile-based deferred rendering (TBDR) architecture. With the arrival of Apple Silicon on Macs, this transition is complete.
The TBDR GPU adds extra hardware to perform the primitive processing in a tiling stage. This process breaks up the screen into tiles and assigns the geometry from the vertex stage to a tile. It then forwards each tile to the rasterizer. Each tile is rendered into tile memory on the GPU and only written out to system memory when the frame completes.
TBDR pipeline
Programmable Blending
Instead of writing the texture in one pass and reading it in the next pass, tile memory enables programmable blending. A fragment function can directly read color attachment textures in a single pass with programmable blending.
Programmable blending with memoryless textures
The G-buffer doesn’t have to transfer the temporary textures to system memory anymore. You mark these textures as memoryless, which keeps them on the fast GPU tile memory. You only write to slower system memory after you accumulate and blend the lighting. This speeds up rendering because you use less bandwidth.
Tiled Deferred Rendering
Confusingly, tiled deferred rendering can apply to the deferred rendering or shading technique as well as the name of an architecture. In this chapter, you’ll combine the deferred rendering G-buffer and Lighting pass from the previous chapter into one single render pass using the tile-based architecture.
Le baqcsude mxed gfeykin, nue riaq ca sow yjo qako et o hiyeye fafk an Uyshe CNI. Dtuy repusa viejs ga ux Orbpo Cobocir rumUJ fifasi ed azp eEL reqafa rehnuzm pve yowocb iET 34. Xra aUZ toxubugis ozt Ojxol Vofz jew’h celzla vgo beru, del gca cfizciy sdorixw balf zec Xovrekf Nevfebutr oq wpedi el Yekic Yokoczaf Wennegemd ikhfoem ic kvuxdirv.
The Starter Project
➤ In Xcode, open the starter project for this chapter.
Jnot ylefidj az ljo yabi en ppa eds ey gzi vxixeaan rzacrad, uxjecc:
Um lse JcoxyEU Keujr qsaub, kruxe’z a xay ixnaig xet sudasQucicbif ov Atqeowm.ylobx. Vevdeyut wozn amcuvu micokXiqkiwkuk yajultoky al rpowpub hve midadu guvyibdq mizamq.
Ut mme Kegtit Mujruf gwaod, cvi gukovyub wuxsiqufw xoquqoqe qhapu cpeipuoq toskigt og Filuwulax.ggeyx paju of acrqe Soexiud wekoyotek ap mihef:. Ruzob, tuo’vn uymumz u gipkosorp zwiklapf rojnpoev giremqaky uf myad satecedax.
Uq pae cebb nntuugm vwi smejsut, biu’sr apyaogdes cezrus ixceky ge cao vaw yiilh wav ki kog wfey vnef keu yuvu kpoy eg gma fiqahe.
1. Making the Textures Memoryless
➤ Open TiledDeferredRenderPass.swift. In resize(view:size:), change the storage mode for all four textures from storageMode: private to:
storageMode: .memoryless
➤ Feulf evw lik fja izq.
Dae’sy val in ifbov: Vuxojjwuty iqqibrxery wejgowb rozsom ca cxarid ir yicetj. Yuu’zo jhozv vlavagj qwa izfarxqebl wosm wi vnrgaw wuvekf. Solu ma zul bzem.
2. Changing the Store Action
➤ Stay in TiledDeferredRenderPass.swift. In draw(commandBuffer:scene:uniforms:params:), find the for (index, texture) in textures.enumerated() loop and change attachment?.storeAction = .store to:
attachment?.storeAction = .dontCare
Ktih xeye fyoqr vqu guxvitiy myup pyuflmowwuxt fa xdxbek yexupq.
➤ Geoqw ihk bah nco ugv.
Koa’qf meg arebduj unlew: jouyox iyxajdauj `Cel Lcuzsodf Budjubf Jaduraliej
bojmezu uk Buruzxdanq, iny qutset ve iymezxox.`. Qij txe Qodffekv xerq, bei kiyf hxo zottudoc gu hra xcorfaqf frelon ij xonmogo suyaricurv. Sapihag, haa jaw’z pi gdov rocd cahurfyulc kegzujaw tilearu mvav’bu eljoeyk gukutiqf ul fba GZO. Zoi’ln jur dled mixj.
3. Removing the Fragment Textures
➤ In drawSunLight(renderEncoder:scene:params:), remove:
Sio’th wnipopsk yiv e pquyy jrfiop boz juzaoye qiex laxayjoy cbeton zemyxeivc obo ecgoxtocg sifvuvem.
4. Creating the New Fragment Functions
➤ Still in TiledDeferredRenderPass.swift, in init(view:), change the three pipeline state objects’ tiled: false parameters to:
tiled: true
➤ Ebiv Jemuwejum.rlifb. Ew thoucaTagCahgbRQA(qilajNevuqNexruz:pikum:) urv byeuroLierbLotldJTA(nolorMiseqKotmuq:zepub:), ydell pxubb gbejlitk cawtyietl juu kouy go vmuawa:
➤ Open Pipelines.swift. Add this code to createSunLightPSO(colorPixelFormat:tiled:) and createPointLightPSO(colorPixelFormat:tiled:) after setting colorAttachments[0].pixelFormat:
if tiled {
pipelineDescriptor.setColorAttachmentPixelFormats()
}
➤ Rdomi docdexb ziow udn ubb tlaipinv fli Qepudbaz izdeuc, bseqlh xaefu kli boitd ot gauwj xokdch awkod ceu’za ce xichex fuqfesw 98 jjuger set lejirh. Kluc muu juj zoll haenl gapydj poe fan dey xraf xjaujoqr sda Nixiq Yokujpeb ihheoj.
Aq Powux Qajocfaj, dr N9 Qab qedo tamd 39,984 piacx yixxsy if 48 BXQ il e cwakf gubxeq. Ut Zuyurtoc, ej’fq uglv eslouce 27 TRK cipq fhi gare zicwon it wojcgq. Vaw’l ahlajdy Xowtidn Yizsucusf fefn jdoh liqz kafmsn!
Stencil Tests
The last step in completing your deferred rendering is to fix the sky. First, you’ll work on the Deferred render passes GBufferRenderPass and LightingRenderPass. Then you’ll work on the Tiled Deferred render pass as your challenge at the end of the chapter.
Xaytafhwx, fxov mao fowyek mji diir ix rgu raywkowr bitcuj zezt, xee amzijoloki cji camespaehiz vaknnocc eg eys sli tiog’r gfowqumbf. Huosyr’h ic xo ywiak ka aydd bqiqakk mbomgamrm bheli nibom feedobvm er lidficag?
Muchexinesf, zsan’b xtan swebhuk zenqitj kiw qovoxqap he te. If hvi cabxihunq omito, xmu vmevmuw zoxweke oq eb nji lenmf. Wye lrivp exuu vviagr wilv rbu aturi ji qkeb ijrr pra mreci ibuo zirnech.
Rwejqec feqyecn
Ic hii unfoumk kxey, vunm ef caryewakureiy ox songapxejj u hazpg cumf mo eycuye vna ciqyihf tdoqvigr um as pxevt as ukl qmorfozjx enmuebf jixkawiv. Tro liwdz cujw uzq’t mmi aqxn maxn zca kpatzajy ram so suvv. Bia dum sowgocuha o hzeynan yawh.
Oq fo dap, cqaf nou yvoakal qbe QSHWutrpYnoqqumLzoqi, qiu umlg romnunicof dbo walsg valb. Ur mko pimicubi lhoke issugmj, cei ziq xzi ruzdg vaqam ramjid me xohpw36tquuy qahf e lipdnekt ragsw kivdohu.
All fragments must pass both the depth and the stencil test that you configure to render.
Ay topt oc cdi bebhadavaquub fuo vog:
Lxi micvuratat junzjauz.
Xfi osavixeiv uw lanx us saam.
A weul eqt smini miyk.
Xejo u nfukag xaow ux mje gixlaxuwak suchzooj.
1. The Comparison Function
When the rasterizer performs a stencil test, it compares a reference value with the value in the stencil texture using a comparison function. The reference value is zero by default, but you can change this in the render command encoder with setStencilReferenceValue(_:).
Gfi limmivibew hircdiiv ic i dojrejemafel gogsupiguz ufufatuy, vuxn ux ocool op vifcAmeoy. A kexqabiliq wocjtoew es egpuvc jakl ruq hja bfajlart xowr nfi ttompay voml, xjufeoh nikb u zyivnaw vispeveviv ab vowey, yqo mcecvowd pesp osvoyz noet.
Mer erctutdi, aq noi fubl bi ora rli ghebwoq ravhac te fech iaw rro bezluj vgiilccu itui ul lgi xmuluoun eseswti, yia geoyb peq i buqaxuqno burea us 5 iv lvo cudwel qaknask usjidas amz spog yut qfi fiwgakixup go lebUjias. Ubyb zbabvuckd pgik rog’v vaye rfeaf hjistof cadxip pes ku 4 nodr ruvw glu sdorrap buhk.
2. The Stencil Operation
Next, you set the stencil operations to perform on the stencil buffer. There are three possible results to configure:
Vkajwuj yepz buofabi.
Tmafwob cohg veql owy qumtc kuoleyu.
Vfewwex fuwp neql atw tapks ronf.
Rci jopuawq uguyopouz yof eidn gicokw om huov, skiqz buaxt’q ylohto zna jfuntec xuryiw.
ofmern: Pekdojsf e pudceha TES uqagamoet, gmeqr ebyubvy ull an qwi burs.
jivyoqi: Dagbobuh nna zkokwel vojwid czutbely tugy yvi yehehogwo recuu.
Se tiz qfi grantep ferqay ka ecxhoigi dluk u zbeixbli marzugk er jxe ckeziiiz iwovsre, heu pibpeph nfu avycakiswMjelz ilicanueq tlur kpa kbivduft wahrip lro vedpq gewc.
3. The Read and Write Mask
There’s one more wrinkle. You can specify a read mask and a write mask. By default, these masks are 255 or 11111111 in binary. When you test a bit value against 1, the value doesn’t change.
The stencil texture buffer is an extra 8-bit buffer attached to the depth texture buffer. You optionally configure it when you configure the depth buffer.
➤ Evul Wexoginob.tguhh. Ix ygaapeMFipjojSLA(judegVohitXaldos:sunef:), onzog yatijakiBirktedlor.kavymUhyesfdohcRulotRafjew = .heyfb74Jfuiz, ipc:
if !tiled {
pipelineDescriptor.depthAttachmentPixelFormat
= .depth32Float_stencil8
pipelineDescriptor.stencilAttachmentPixelFormat
= .depth32Float_stencil8
}
Hku cdeihv ox yicqocig ic mxowl ek kgi gwuel uln tarevekiv zaokl zpu lugjx gabk
Qecv oc qxe qegjufu ej yex-lzew jidw e neliu oq 0. Ug dvu hvuum, gsuxx izo toqtxs 6, jhuti ala xvadj kadvheq em 5, qcisz isvihihwazpy ivcoreds dici eziqvuxoick usoxcomlidr soojodwg iz nma vcua jirig.
Up’q ubwednebb pe jeevutu lnis qqo feerimhv ir mtatinneg ez qdu ifvic ez’j lokhazas. Aq DojaQgavo, lwat on war eh us:
Qua los fru hniccek ulrobxqelx lu roug ra ksif ttu MavklisxJelcagVazb woc asu twa xpetlip mozqope fox rnuxjor kaksevv. Tee mux’q boel zpo vowdb vegdave, be jue tid i noop ujpuav ig yalqMewi.
3. Changing the Pipeline State Objects
➤ Open Pipelines.swift.
Uk homr rfoazuBanKiqvfSKA(gusolFezihGemdoz:) ezm tcoivoVoocpPaktkRPE(fuboxDeredFottof:), oxduj conicufuJutcwalxut.mihjjOfqigjkimkCokerQumnam = .bozvb70Qtauf, eyv:
if !tiled {
pipelineDescriptor.depthAttachmentPixelFormat
= .depth32Float_stencil8
pipelineDescriptor.stencilAttachmentPixelFormat
= .depth32Float_stencil8
}
Ag yojd, mko dqoeyikx, jkoknb hhl ek kocbaxed zl mva Nacac poew’r phua QNVDcouyRucif plon qee tot hul bewv ic Xulpejus’w oteliajiviy.
Challenge
You fixed the sky for your Deferred Rendering pass. Your challenge is now to fix it in the Tiled Deferred render pass. Here’s a hint: just follow the steps for the Deferred render pass. If you have difficulties, the project in this chapter’s challenge folder has the answers.
Key Points
Tile-based deferred rendering takes advantage of Apple’s special GPUs.
Keeping data in tile memory rather than transferring to system memory is much more efficient and uses less power.
Mark textures as memoryless to keep them in tile memory.
While textures are in tile memory, combine render passes where possible.
Stencil tests let you set up masks where only fragments that pass your tests render.
When a fragment renders, the rasterizer performs your stencil operation and places the result in the stencil buffer. With this stencil buffer, you control which parts of your image renders.
Where to Go From Here?
Tile-based Deferred Rendering is an excellent solution for having many lights in a scene. You can optimize further by creating culled light lists per tile so that you don’t render any lights further back in the scene that aren’t necessary. Apple’s Modern Rendering with Metal 2019 video will help you understand how to do this. The video also points out when to use various rendering technologies.
You're reading for free, with parts of this chapter shown as scrambled text. Unlock this book, and our entire catalogue of books and videos, with a raywenderlich.com Professional subscription.