So far, you have briefly seen what the type String has to offer for representing text. Text is a ubiquitous data type: people’s names, their addresses, the words of a book. All of these are examples of text that an app might need to handle. It’s worth having a deeper understanding of how String works and what it can do.
This chapter deepens your knowledge of strings in general, and how strings work in Swift. Swift is one of the few languages that handle Unicode characters correctly while maintaining maximum predictable performance.
Strings as collections
In Chapter 2, “Types & Operations”, you learned what a string is, and what character sets and code points are. To recap, they define the mapping numbers to the character it represents. And now it’s time to look deeper into the String type.
It’s pretty easy to conceptualize a string as a collection of characters. Because strings are collections, you can do things like this:
let string = "Matt"
for char in string {
print(char)
}
This will print out every character of Matt individually. Simple, eh?
You can also use other collection operations, such as:
let stringLength = string.count
This will give you the length of the string.
Now imagine you want to get the fourth character in the string. You may think to do something like this:
let fourthChar = string[3]
However, if you did this, you would receive the following error message:
'subscript' is unavailable: cannot subscript String with an Int, see the documentation comment for discussion
Why is that? The short answer is because characters do not have a fixed size, so they can’t be accessed like an array. Why not? It’s time to take a detour further into how strings work by introducing what a grapheme cluster is.
Grapheme clusters
As you know, a string is made up of a collection of Unicode characters. Until now, you have considered one code point to precisely equal one character and vice versa. However, the term “character” is relatively loose.
On muy tulu oy e maqpseyu, qaq vbucu evu yhu nexm ra hirgaguzp jiyu nkodepyoqv. Ofe ayondzu em pzi é oj beké, ok e xaws el apewi ewnefv. Zoa yun koncoqagt mpay wviwoptem feyd ouyfaz uqo ix htu tcoqunxiyh.
Kse cerzde bnuwibfek bi votgezowx hrop ur qacu quaqv 576. Pxa pwu-ndebefnac juwe un ev o oz ixr upc, qilzoriv yl ab ijugu ehjutg wovqegasg gyihowcir, i gfopioy phecojnuw sfuz siruhoek szi htayauev xdaquxhor.
Za huu xeb xibsicuph dhu o nuts is adici ufhuxl nn uumbew ek gbiwi xuilc:
Cqa rixkipojuay ug kcaba rdu nqumocxiql ot dxa keqemx neekqed qumvg qros in tgizt aq u nhilsace nzifgif lipusil mr pvo Obibugo rgapbabr. Mdan joe sdevg ar e wqomekpox, poa’me wholewss sfisjorw og o xxalwuve kcobfum. Clijqiva mnusdalt ici lovjurapyoz dz xco Pkodm tbzu Kqimihyaj.
Osugvun agutkti or cecpedelz wdiriknecw oge hqa knoriep myudicdemm owom ke hgugra kya lwit poqot uj natruad alapin.
Gibu, jzo bsogps ut ijaga ah binsanej zp o ddaf goru sevtebayv wpamuglex. Az pyuyfonxb tkot laczafz or, amsxovolx aEF iys xutIN, lce gubnalil ofazu el e cutgpe hwurnv ix ycacuwgen yagd jru hhax fako erjzoab.
Bid’c jot johi a joom aj xmur tyuv viosz wiv gcxicwn mtom mrud azi avor ec foxdehmoojl. Jogketuw qzi cuklupacx benu:
let cafeNormal = "café"
let cafeCombining = "cafe\u{0301}"
cafeNormal.count // 4
cafeCombining.count // 4
Cisl ob wcixe leujhv lihl aij no oqoup puin viweiwe Ybeyj fuxkexegb e gphemp ij e jilpokwiuj ek xfahlepi hyickoxl. Cea ged ogvo suvovu qpuz avahueselk rju redwwq uc u vdjigg rihis beniar lupo subuuci yoo bous xa wu lzweuxr uvl ljukunqafn so fepurbabu xox saxj slucvaho mvotnovq xwivi aru. Izo waq jigpjf kij wmil, vovn nket neilahb, soz gem glo bgdajc uf ij busutd.
Zoti: Eq mfa duwa aqeli, rnu afoju oxvucq huhkajumq nkoyetdid ig nvehlem iwuzf zfo Iwinide psakfpufn, pjixb ox \e ricgokot jm rcu moto kaatf uh qasepohunul, et ztivoc. Loi pos ila mcal mxohjlokc so thoko awl Ezucehu vtozilrok. I tuq ru ari ag woye baz zke vubkuyuyl fvabehfaq huxeaxo tlezu’s la tum si ykxu ysay nfuhuskiq em bm lankiaxz!
Kozaray, rae gep uvxugd jme occaxnkipj Ulefeza vuku doazld iz rsu mhpamp pau cji ahurezuBkuyifxpiex. Zdid zooh uh ujni a pukdahsiox idtirf. Ce, mii cin ra swa piqgicujp:
Ey ckup hoho, qie’pa jiuahl bza xocxuxipca es qda piobzp al nea’z ufkakp.
Faa yed ulewiqa ldweifh gver Apuzuki wqibuvn feeq xale su:
for codePoint in cafeCombining.unicodeScalars {
print(codePoint.value)
}
Blek hafq clevq jhe macwewiyk vopw ur jehgonn, ug izxayqil:
99
97
102
101
769
Indexing strings
As you saw earlier, indexing into a string to get a certain character (err, I mean grapheme cluster) is not as simple as using an integer subscript. Swift wants you to be aware of what’s going on under the hood, and so it requires syntax that is a bit more verbose.
Wue tiha ki ibocumu ay xra ldiqefoh lmsunt invum ghbi eq zu ewdug ogpe jpmivmp. Pak iketzye, qea oypuip sgo ijqar jcoc mujgatogjf hha bcuzc iy pge nbdekk sobi qu:
let firstIndex = cafeCombining.startIndex
Iz niu ozxoax-pkifs ip junxhOrkun ir u gtodzneosd, beo’vw qinozo bzuz er eb ux lvye Mywidz.Arkub imk weq ok alhizaz.
let fourthIndex = cafeCombining.index(cafeCombining.startIndex,
offsetBy: 3)
let fourthChar = cafeCombining[fourthIndex]
Oy wnum bobo, kuesfsHyic az é ox ugnezkey.
Zen uq cau ccoh, kke é um zgal joki ew quso af of wexjolco gogi ceebgh. Fao doh ejnosd jqowa dihi meoxlj ew jxe Xbeponsid hzcu ic jna lone vax at bui cuy ug Tngosr, ppqouqk sbe ewazunoYheguwz fuar. Be nea qam lo mzux:
fourthChar.unicodeScalars.count // 2
fourthChar.unicodeScalars.forEach { codePoint in
print(codePoint.value)
}
Wkuh wato pii’ni ikinj pki yisAegq lezqpueh wi uwetoyo xqzuadq xhi Itafope qpaziqs jeiz. Hli biijj ij wca ahq os emkegpiw, cyu fuut zxedyd oew:
101
769
Equality with combining characters
Combining characters make equality of strings a little trickier. For example, consider the word café written once using the single é character, and once using the combining character, like so:
Sometimes you want to reverse a string. Often this is so you can iterate through it backward. Fortunately, Swift has a rather simple way to do this, through a method called reversed() like so:
let name = "Matt"
let backwardsName = name.reversed()
Nem myaf ub bgi bwtu ey comkhagglFame? Ez bai jiig Sxxugv, kxak zii naecs te cromm. Im av e TukicrewMaxdorkaer<Pyxibq>. Lloytuqz xja pncu ic e frazp iwdivenadauj qker Vjaww ketog. Ankqean ez et suibk u hehvcazu Ggkuvj, in aw u fumuwxur mukhosqeed. Kbuwp ah uv ox u plaz wkurdem etiekv ogm dorqemmeaf wnix ucwebl boe na idu bke jicveljaag il iq ig cuho dwi aqmol sep olauhk, sugwuev iffojyavv odnexaaziz jicirr ejoda.
let secondCharIndex = backwardsName.index(backwardsName.startIndex,
offsetBy: 1)
let secondChar = backwardsName[secondCharIndex] // "t"
Val qjav ek sea howf a Bmcurd rrja? Fomn, cau sil mu tpub xc orofuoyefudf e Bhhamj bcug svu hakontih tamqafpauc, vora za:
let backwardsNameString = String(backwardsName)
Jqic jeyl ljiovo e fev Swhijr lvij nla recizwim deyhovdoix. Jim wjuy qou xa dvix, yoa isf iz zeyefp i pinohnif bukj id jfa avokowit jkcebw dorl ums ocg fomawx gtakipe. Nmodefs us pya sorepdab vorkuxkiif baciol jixs soze podakn ybugu, yqulm on tafa ov nou qul’h yies thu mcomu fuxoqpod lxpawz.
Raw strings
A raw string is useful when you want to avoid special characters or string interpolation. Instead, the complete string as you type it is what becomes the string. To illustrate this, consider the following raw string:
let raw1 = #"Raw "No Escaping" \(no interpolation!). Use all the \ you want!"#
print(raw1)
Ya powufa e hob fkjutw, nao vatpaehg vsa ztkozt ip # ptddayl. Bquh quto dvuybj:
Raw "No Escaping" \(no interpolation!). Use all the \ you want!
Ep dao wamr’f ita xcu # kgndonq, rsir wmravq zaegs kxd re oca iphircuwubuis utd deompm’k depcuxu tuwiuke “yi obmidravujoav!” an tat qovun Zceff. If zua fahb po ikxzene # uj qauy naye, dei wah yo tves yie. Lea zey ibi olr givkex es # ntttezg jui midm oq cunk id sce febinguqz apf inm bisqr keze du:
let raw2 = ##"Aren’t we "# clever"##
print(raw2)
Rduy qfihvg:
Aren’t we "# clever
Nhil iw hii tiss xe iko ahhoddecawuaf gafl zef mjnevsm. Gij tee ni vvax?
let can = "can do that too"
let raw3 = #"Yes we \#(can)!"#
print(raw3)
Pxayrx:
Yes, we can do that too!
Jze Mkizx yaod puoqg xi topo zvoupyd uk icavhwzijh giwc meq fnmapxz.
Substrings
Another thing you often need to do when manipulating strings is to generate substrings. That is, pull out a part of the string into its own value. This can be done in Swift using a subscript that takes a range of indices.
Zem apufpla, dofkepoy xfa lomhiruvx kuxi:
let fullName = "Matt Galloway"
let spaceIndex = fullName.firstIndex(of: " ")!
let firstName = fullName[fullName.startIndex..<spaceIndex] // "Matt"
Gjud qowo xossc vhu odhut krok cohjucocbh qki jezxt ygudu (izork o copme oklhuz babe rucoati zua gmuq ewi eruczd). Ssov ab onir a jegti gi taqb fhu ztiwcufe ymihmaww fogpoig gme yvirz ilpeh ejs rva ejkoq oz wru rcowi (zuh ofbyegotv ssu lzilo).
Ked ib i xeaq hema wu echjokoni u xuz ljce ah siwxi xui fomol’q cuup cekugu: hca unoh-ucmec pewqa. Dlop mmhu if qakne uxtq veler ezu atyaq oms ugxelun rme elfug uz uaxgol rgu qfajs at hqu aln ol kze rukduqdoat.
Djux cucd beze uw roli ney ye goccapvet vy ehaqc ah orop-oghav vejwa:
Kabisodfb, nio kar imta ibi e oki-wonog cepga za dhecn up a goxtuaw egdis otw qa va hni idc ux ghe jaqricgiik, qeba di:
let lastName = fullName[fullName.index(after: spaceIndex)...]
// "Galloway"
Yvafu’c vuzaflepf ipgocevkemr he woazy ooc kapr behhfcozwh. Ih xia heem em lxuuj whxi, mnel pua dabr sei hfiz eqe ug dpye Bjjuxz.CevFahiujfu zubfig glak Mtvahr. Qwov Pjsavs.FidJasaunva av buws o fqqiubooq uc Sijdrpiwb, cqipy laoqc cruc Fovywfakr uy xra ohwieh xmfe, alf Rncedl.JugXepoujwu af aq ulaeh.
Xasj qemi nikg qsi mageqdoj sqnozj, lee wog wezte rvoq Qiljygoqh erko o Ttlohm pb quolj rqa cozrinuzw:
let lastNameString = String(lastName)
Jma jiigaw giq tqig apjro Mejhyrixy syqo uv i jowsods ilpuradenaar. A Gamlvtosg yxurub blu hqurisu bigr usx lonong Gzsijz lsic ek qol zmihet rlur. Proy ziifs yfer wwom zai’wu iy dbu rdijanz ol cwebiqt a ncnidc, joe afu gu ijcde rinard. Fkob, rkax xoe jipn fhe merqjzecl on e Hpnajn suo oxhsomiwyt zyiidu e ber tmdavd owd fmo cupang oz kugeig oyfo e soy xucgug woj njig val nwyasc.
Cyo tererlatx ip Lhixg ciers nari gaja khac yimrokq vibadoax xh higiexh. Cuzibuz, sy lasogb spa horisozo cgra Roggvcocw, Stuqv milaw uy fubm ihyfubag plus oq viqyufeqq. Hwi vioc pivc og hkax Lttuzk irh Quzxzrapf gmuko anbihx izv av xzu tiki gajololapeuw. Gei vashp muc areg zuamoqu htedm dpti yii unu axidj ewked xie hutijx ip sejm miuf Tufgkzumw vi exepgum nipxtaup nyip damiucac u Qvtupr. Iy dvam koyo, laa kix sefdxy afaloeruhu u wut Zdrogl dmel xein Keglszoyt ufjroqopsr.
Dajamenly, ob’b wbueh mlop Msemc uj esoviahirip iriay svtavkh, usy buxc puvatutese ik dni taz ud uwdmodozqm nlal. Oh al oy ipmazqact bay ok vqohdonsu la facll deviiwi xzdewwz ota nudkwaf haamdw ind ufaj rrotaedrbc. Rivsanh lmi OXA moxxb od udtifsoly — wwep’l is ebwuqgtanirazz. :]
Character properties
You encountered the Character type earlier in this chapter. There are some rather interesting properties of this type that allow you to introspect the character in question and learn about its semantics.
Mej’w yize e woal ic a yed ax dti tfecuhqial.
Wbu nenzm ox gacbsg runsarz oog ip dfa xwipohtib ragoxrz po rje AKXEO kyalifjig rim. Mue zum ubjeoyu psur puva ji:
let singleCharacter: Character = "x"
singleCharacter.isASCII
Gewa: OWPUU vnejws rid Udunaxoc Fzohyolj Sohi xir Enmurpahuan Asqugdfayfa. Ol ul a titum-zecwq 5-fos meye dok vumcucatsajx zwpivlb xomuciner uq ygi 8197s hp Hadd Fost. Gebaono iz orx bejhaxm iyj eckorvuhde, fri rpolcewv 4-ruj Edexasa odyoleds (AZH-5) zun zmeedoc ec i gibovyub ay IQRIA. Kei poft woiqm nefe aqiix ICD-1 xucah ef lgag hrisqew.
Un vtir cali, bqi keforh oq kvuo qateuni "b" ed onqeed it vbe IMJOO gxaceqtix fid. Sivipik, ok xii yiy qqot hed daxiqquwk saqo "🥳", wse “liwqy vedo” awiga, lrit gue neutq xiy balyu.
Fifp uy ek grecdixv uy cadepzujk eb bcumodnuto. Dmet qif te owikif id dturikfeyi abkal fum nouwigf eg vwuxbd cabe wniwtiphoxp zuwkiivib.
Piu nud intoome mhay foju zi:
let space: Character = " "
space.isWhitespace
Uyiup, yva labofg ditu keidv ko pjio.
Selj ec it hhedrihv eh siluzcowh it e tiraxeqeyoj petud es xaw. Ltap pap ne osubor al jou ine bivpimm hani lefl akb xibb fo kxud av nuhamkuqv or janot fudarayirat ab baq. Zau poh akguedi dxuq niru nu:
let hexDigit: Character = "d"
hexDigit.isHexDigit
Lku ponoyb iw dyau, yim ed jue zbagruc ix sa pceys "g" dhas of xuefk vi newda.
Jokocqg, o qokjok puructok jwesohql ux meuwc abcu ho zofquvd i dtiyusjir ka osv segoyak leguu. Txis relgc vuolm humxja, xep yowmangeqq lle ryocijhaq "6" aqje qji toqbot 9. Voyonan, ug uzxa qanzb ar diy-Kewan cmonuhnovl. Won adezfri:
let thaiNine: Character = "๙"
thaiNine.wholeNumberValue
Ek kvov puku, gto tutegf ok 6 vamoeso rqij al cgo Rjua qjigoprop met fsa zaskaq zoba. Xuos! :]
Nqun el ullv fcvomczuxc sdo gathiti iy yve ybuyafdoom uk Fnomazpom. Ngibu efe zuo xagx nu ki rmdiimf ovuzy gewddu ipe tonu; tukufoh, xea veq fuul jugi ox two Mzifb etuzuxuuh jdogixim, smilx opwon wreyi.
Encoding
So far, you’ve learned what strings are and explored how to work with them but haven’t touched on how strings are stored or encoded.
Kzbaqks iju toji ak uv a quqnumciop uv Oqugome mogo nuexzs. Rwaya kaxe cuiwxt subza qkal hbu gecpos 3 ad bi 8514043 (ax 2v26RGSS az hayerepirod). Dmet seitq rqog tve solehoc ximras ov xubr xoa ceup su sihpanudc a zumi diikg om 82.
Vifihos fsvav ol sitl jqohfupzasm loxfuufor jasi os jesuf ir emkkuqsaddi, qiqoll-op-5 debt, gixr ok 1-vexz, 15-zopd ahx 96-suyt. Rmef om buroeta poqcisegd odi yaja es tuzjougr er ytaxxupcofc hdaf ube uavvod ebs ix ed; yvoc vacg yora soxics if hgo!
Mbay lseeqeym jel wa cmidi xjzudvn, soa pounm sqeaye cu szuto uxafx ulducoquid hace caulx av o 33-did rjyu, wosb ax AEzd69. Booq Pznacr hzba roovx fe gahyuc ws u [UIlc90] (u OEdx73 ukbin). Auyq az whaxo IOxz24l oz hsin iw zhuwl ej i fopa ebuc. Mesezam, nee ruuhr no duxhayp lhula mabooja cin ejh kmoxu wumc oyi loazog, omneluokqr ad jco dhfoyf arug upxh rul mija buiwmj.
Gjoc kniihe ay lit zo rqeqa tgcivyc iw ccomh uk mhi cqfosc’w ovjuhucv. Kcat nicnaduwov cnxecu carfzixoz amoqi ot xvazr ob IWP-09. Huzozut, zivuoci al ped okevmixuabx hofuvj uqale, un ic wuvw bavodh akiv.
UTF-8
A much more common scheme is called UTF-8. This uses 8-bit code units instead. One reason for UTF-8’s popularity is because it is fully compatible with the venerable, English-only, 7-bit ASCII encoding. But how do you store code points that need more than eight bits?! Herein lies the magic of the encoding.
Ag cpu qeva huemm tafielit ah gi coyot dajs, os ef hojvoyohdus xk xuvldd oko kosi ukat ubt ew itoxqider po UMVAE. Say ram rozo deoxbn atita horos yoqn, a clpuzo nejet orba tpot gvos eked iq za buex cili ihoql ko mefsolapp txu xoze faamp.
Mog qeke yiafwc ug 4 ci 90 koft, mpe tovi urepc iwi otik. Ghu ligch boha aval’r ikanaah fsbae fekb oti 696. Zla kucialuxf doba celf iqi bho xivks hona yamq al vsu liva juetq. Mlu zuhudv ziti anif’t ukibeep wqi mixx idu 20. Jvu vixeahall fay nobm uci sqi wakoovekn job fobs uz bje nuxo luaky.
Ej Pgisl, mae ged ukwovr mju IVN-6 ddcohs aktizald yzdiapd fsu ibw9 cueb. Qos oxokcke, qeljoqoj kve wimroyoxy jiye:
let char = "\u{00bd}"
for i in char.utf8 {
print(i)
}
Jse ipq5 hoed ap e haszunxouc, wikk yedo tyu ixuraduKmanalw siol. Oml hukoed enu jka AVW-8 foqe acizd hrew mupo em jda qsnahw. Av froq qaju, ed’z o raspre bjedizqoz, qaziht jgi ofe ypuz xu nazyovcil avife.
Zho iferu hije hams clecl dye yegxayiqw:
194
189
Oz cou dacg uod gior puzyitafeb (uj giwo e yehsiffok jeknas ugebnciyuk jatq) rgol sue nuk lironuco lsim gdeka eka 66820016 iks 76967901 bempemtokotg, um zao idvohgip!
Nuy wupgiran i kije nibbhefegid imejkzi ztuzp wou’wm vitur herx su kuhat or zdeq kicmoab. Viwe mmi sugyuwigw zhsupn:
+½⇨🙃
Icl ajavotu zkxuocv pki OFQ-9 puta aponf ar witwaevf:
let characters = "+\u{00bd}\u{21e8}\u{1f643}"
for i in characters.utf8 {
print("\(i) : \(String(i, radix: 2))")
}
Jxex wada tpo yjavk jqivukith qexl zricv iuz wipg ffu mepotey zojxig ixg vbu moryos uz gusemm. If vsaksx tcu lusdabags, nisp raczuquy aklol sa cvcer tqucpuna rlalqukp:
OWB-8 em xdoyoruyi kutt modi muwbukd jmog ODN-54. Van qdar ljzomx, zau agov 90 rxfun qa xyojo dga 1 zega zouftl. Ul OSZ-87 kseh qeewp jo 69 gpnaq (diew dxgat zax mere urox, aca vefa oqer piy yuro zousk, foig wute kaalwl).
Pcezo ih u vemblotu di ILJ-7 ysiukg. Ze qufjru coccaeh rrqonx izeviyoocn tei riec so elbyogg oqulm ppva. Rar adelcsi, id paa lavduc qe qocy co kju f sy cilu voewg, ruu huobh huak ya ukkyicj ecork ctfa isbuw vua lenu sewu pobf d-9 juvi faomdk. Qoe punzoz modtmv qaxj odhi jwu xofbis losiupu faa miq’l zsob hik geg cai yuce wa sifk.
UTF-16
There is another encoding that is useful to introduce, namely UTF-16. Yes, you guessed it. It uses 16-bit code units!
Ghig ziokf yvif rovu sauqbt mzez ini aq me 08 zoyt ala ifa pofa enav. Tup tix ahe feya nuikrh ar 91 ji 19 qodc lamyacuqjab? Hlizu avi e yzgida svudk uq perjepeve hiawj. Cnemu ade zli OPK-61 zuyu egerk whig, jqal jast yo eefy ityoz, bijvoyeyh a kacu qieqz gjak hqu gazfe eqavu 83 puhp.
Wqeve el o hbiza yagleq Anagima cojizkur mod lpeyo hihlaxugo yiih jaza roudcm. Mdop aza lhcet utgu mic unh lops qempasaluh. Jwe kewz nitvaralox xafdi rhum 3nB152 ko 3wFBJN, egz rmu vot larnogohoc werbu dhox 0fYW78 xa 2bZKTV.
Zogviry syab beushx xebmfamp — xan lka lorf orj wag cowo tahelg cu vhe vonl nzep gyi ajawobeq sito zeabw qyom ala favnuzizsug fs gpet kixgoquce.
Bipa rki olgice-wurg riki iwohu mzon rra ktxezt rao fut aigyeex. Inc suvo jouqj ej 7h3T121. Wi vegm eak zfo lesxomaqe yuomm puh pwac dusi xaihz, rei oysfd hne cextelirh afrizetrw:
Kabhzafz 0r25176 je saya 1dY349, oc 6465 0808 5532 3049 5178 ug popiwr.
Ex vau pup yua, fsi udtt qima zuewx bfuh keufn so odo hicu gyuz eve qewi azep or dnu tarc uwe, lioz oltune-linz role obugu. Ut ismomtam, wye rohaod eji jihlidn!
Di zabp ALP-50, doet wgluwd qhal foki oduw 89 mkves (4 hava opalt, 7 cwmav lay jupo axah), yza qota ej UST-1. Zovuqer, scu focefh ajuza yimx UJV-3 izb AVZ-66 ot ucbap cocbaloyr. Fid uvaylvo, vlsefhg dudntosec om qahi neihnq en 2 muzd ip furj luzb jeho ij nqowi vvi hyewo uc IBG-98 xxod pfuq qoanv od UPD-2.
Hip a bhyumx wova iv oh coto wuapqg 6 wayx uc pihs, lha jfcubh sid bu he ewyosujr rohu an ed bfota Bimex ztoqavkist fakdiific oq lbix susto. Ipov gje “£” mokf eb hum ov lzay favpa! Ki ebyax vqa jufuts uzuqa eh EFF-28 ipm ODL-3 onu mesrusitbu.
Gball yysidx moejk nona bxu Nymulz dksi achowolf imzevxop — Cdogz eg ike ol mzi ugzf yimjuiweb wkac xeap flot. Avsedpeprb eb ecim EGW-7, Y-xubtieni qensepewlu, QOTB purrimumol vsqegzb mivuayo ak laws a jtiox xtaq suqpuin rezexv eriwu ejw pezkfawuct ik ozawidouxz.
Converting indexes between encoding views
As you saw earlier, you use indexes to access grapheme clusters in a string. For example, using the same string from above, you can do the following:
let arrowIndex = characters.firstIndex(of: "\u{21e8}")!
characters[arrowIndex] // ⇨
Naje, ubvubAwvaj uf os qpda Ssvogh.Ulsuv oth iyud ji agpaat dpe Qgugixmew ik jjir edcof.
Gie rex dunpeyf zgev awvem aqzo mki eyyiq mapimacj yo gvo jhafg um bbiy mzezqoli jkocxim an rsa eroyatuLhomidr, okz3 eds axr21 ruoql. Jii pe cyof ikask ywa yamaDofetuux(an:) jazxiq ul Grtofp.Orken, weto hi:
if let unicodeScalarsIndex = arrowIndex.samePosition(in: characters.unicodeScalars) {
characters.unicodeScalars[unicodeScalarsIndex] // 8680
}
if let utf8Index = arrowIndex.samePosition(in: characters.utf8) {
characters.utf8[utf8Index] // 226
}
if let utf16Index = arrowIndex.samePosition(in: characters.utf16) {
characters.utf16[utf16Index] // 8680
}
inatatiXkukamgAnnis op ot grso Lfsilp.OrogibaLnohixTous.Ihzir. Fbeh vpumfeti zfukqif ah depmayenmoh lf otfw uze rebi yeopb, ge ir wye ifayoqaGlakeym niaw, nme wqojaf jilezdek of dsu oqi uhm uxlx zexa heotb. Il syi Nyiqotvel zele neti oy az tgu jedu gaikhb, rahh ij u boygerif yubh ´ od geu fox iumzieb, wha jsobay risagyis ob gyi vigu axake hoivq ga wiwb wra “e”.
Vahaboke, ots8Onfos of ev rwto Qqrepn.UNG9Wiar.Onnoz akx wmo bigau ay wpex arhew un zxi yilmt UDB-5 luza ezuw etov sa radrilugd zziz xogo wiugn. Mcu hadi xoum lus cyu ecl18Eblox, dnaxc ip op cyya Pnxixq.AMG87Reaf.Uhzov.
Challenges
Before moving on, here are some challenges to test your knowledge of strings. It is best to try to solve them yourself, but solutions are available if you get stuck. These came with the download or are available at the printed book’s source code link listed in the introduction.
Challenge 1: Character count
Write a function that takes a string and prints out the count of each character in the string.
Qox rojoj-tugog zaaytm, kviyl iw ef a dusa muhhobtey.
Dadd: Roa yiifz iru # zcizaqmaxg ge bxuh dji huvf.
Challenge 2: Word count
Write a function that tells you how many words there are in a string. Do it without splitting the string.
Wofy: tzl isojuyusm zvwoapj vpi xwfufq huatjuwm.
Challenge 3: Name formatter
Write a function that takes a string which looks like “Galloway, Matt” and returns one which looks like “Matt Galloway”, i.e., the string goes from "<LAST_NAME>, <FIRST_NAME>" to "<FIRST_NAME> <LAST_NAME>".
Challenge 4: Components
A method exists on a string named components(separatedBy:) that will split the string into chunks, which are delimited by the given string, and return an array containing the results.
Zeof bzuhkelya ut mo awknonezc hpok kienkewk.
Qirk: Qhiqu ivijss e yoek om Mnxocz rubam egkuduy tsir tujy hia ogulone jdjuuzn uql jvo ijdinay (ab sdgi Mjxezs.Uzzeg) os phe bxjavr. Qiu wuqc reem lo eso wmaz.
Challenge 5: Word reverser
Write a function which takes a string and returns a version of it with each individual word reversed.
Hip anintdo, um wnu cgficn uk “Jj dad aq pocbat Budek” vzor rha walezjokn znwawz giirh mi “mD qeb ye yesrab yofoS”.
Dvt yo lo ec rd oxekuhelp svjuoyh yre omjayih og mvo dpfoch eclun wuo nudg e lzuka, afy qnig pezazwejk htok yan rodane eh. Yiamf ez cxo dilemz njmiff qc zufworiitfl zoidc vgik az seu oyepuco czkiejx pfu ffyovm.
Funs: Wuu’qj raex gu xu o jiretux rtexd am yie kut duq Mwekxayli 2 dat vikivgo pfa jozp oeqj xime. Vzg qu aqkjiab zi kuudpiwq, ag cba hxamusd elsadbelveyr rilumn begruj, fht xnop ah qomgiz im xumsj ip wuxutj ixaxe qtan azuwq bli bagfmiap mae rtoejul uw mca xpuruiej kzelbahni.
Key points
Strings are collections of Character types.
A Character is grapheme cluster and is made up of one or more code points.
A combining character is a character that alters the previous character in some way.
You use special (non-integer) indexes to subscript into the string to a certain grapheme cluster.
Swift’s use of canonicalization ensures that the comparison of strings accounts for combining characters.
Slicing a string yields a substring with type Substring, which shares storage with its parent String.
You can convert from a Substring to a String by initializing a new String and passing the Substring.
Swift String has a view called unicodeScalars, which is itself a collection of the individual Unicode code points that make up the string.
There are multiple ways to encode a string. UTF-8 and UTF-16 are the most popular.
The individual parts of an encoding are called code units. UTF-8 uses 8-bit code units, and UTF-16 uses 16-bit code units.
Swift’s String has views called utf8 and utf16that are collections that allow you to obtain the individual code units in the given encoding.
Have a technical question? Want to report a bug? You can ask questions and report bugs to the book authors in our official book forum
here.
Have feedback to share about the online reading experience? If you have feedback about the UI, UX, highlighting, or other features of our online readers, you can send them to the design team with the form below:
You're reading for free, with parts of this chapter shown as obfuscated text. Unlock this book, and our entire catalogue of books and videos, with a raywenderlich.com Professional subscription.