NSRegularExpression Tutorial and Cheat Sheet

Soheil Azarpour
NSRegularExpression tutorial and cheat sheet!


A regular expression (commonly known as a “regex”) is a string or a sequence of characters that specifies a pattern. Think of it as a search string — but with super powers!

A plain old search in a text editor or word processor will allow you to find simple matches. A regular expression can also perform these simple searches, but it takes things a step further and lets you search for patterns, such as two digits followed by a letter, or three letters followed by a hyphen.

This pattern matching allows you to do useful things like validate fields (phone numbers, email addresses), check user input, perform advanced text manipulation and much much more.

If you have been eager to know more about using regular expressions in iOS, look no further than this tutorial — no previous experience required!

By the end of this NSRegularExpression tutorial you will have implemented code to search for patterns in text, replace those matches with whatever you wish, validate user input information and find and highlight some complex strings in a block of text.

In addition, I’ll give you a handy NSRegularExpression Cheat Sheet PDF that you can print out and use as reference as you’re developing!

Without further ado, it’s time to start crunching some regular expressions.

/The (Basics|Introduction)/

Note: If you’re already familiar with regular expressions, feel free to skip ahead to the next section.

If you are new to regular expressions and are wondering what all the hype is about, here’s a simple definition: a regular expression is a simple string that can describe a large number of possibilities in a concise notation. There are many awesome books and tutorials written about regular expressions – you’ll find a short list of them at the end of this tutorial.

Examples

Let’s start with a few brief examples to show you what regular expressions look like.

Here’s an example of a regular expression that matches the phrase “NSRegularExpression”:

NSRegularExpression

That’s about as simple as regular expressions get. You can use some APIs that are available in iOS to search a string of text for any part that matches this regular expression – and once you find a match, you can find where it is, or replace the text, etc.

Here’s a slightly more complicated example – this one matches the phrase “NSRegularExpression” or “NSRegularExpressions”:

NSRegularExpression(s)?

This is an example of using some special characters that are available in regular expressions. The parentheses create a group, and the question mark says “match the previous element (the group in this case) 0 or 1 times”.
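To try the optional group from code, here is a minimal sketch. The helper name RWCountMatches is hypothetical (it is not part of the tutorial's starter project); it relies only on NSRegularExpression's numberOfMatchesInString:options:range:.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: counts how many times `pattern` matches in `text`.
// Returns 0 if the pattern cannot be compiled.
static NSUInteger RWCountMatches(NSString *pattern, NSString *text)
{
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:&error];
    if (!regex) {
        NSLog(@"Invalid pattern %@: %@", pattern, error);
        return 0;
    }
    return [regex numberOfMatchesInString:text options:0 range:NSMakeRange(0, text.length)];
}
```

Calling RWCountMatches(@"NSRegularExpression(s)?", @"NSRegularExpression tutorial or NSRegularExpressions tutorial.") should report two matches: one for the singular form and one for the plural.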

Now let’s go for a really complex example. This one matches any HTML or XML tag:

<([a-z][a-z0-9]*)\b[^>]*>(.*?)

Wow, looks complicated, eh? :] Don’t worry, you’ll be learning about all the special characters in this regular expression in the rest of this tutorial, and by the time you’re done you should be able to understand how this works! :]

Testing Regular Expressions

In this tutorial, you’ll be creating a lot of regular expressions. If you want to try them out visually as you’re working with them, check out regexpal, a web-based regular expression parser. Enter a regular expression in the top field, enter some text in the bottom field, and the matches in the searched text will automatically highlight.

Load up regexpal and try out the above example expressions one at a time. Here’s some good sample text to use:

NSRegularExpression tutorial or NSRegularExpressions tutorial. And here's an html tag.

Pretty handy, eh? It’s great to see regular expressions in action, so you can test out your own regular expressions as you’re working with them.

Overall Concepts

Before you go any further, it’s important to understand a few core concepts about regular expressions.

Literal characters are the simplest kind of regular expression. They’re similar to a “find” operation in a word processor or text editor. For example, the single-character regular expression t will find all occurrences of the letter “t”, and the regular expression hello will find all appearances of “hello”. Pretty straightforward!

Just like a programming language, there are some “reserved” characters in regular expression syntax, as follows:

It's easy to get carried away with regular expressions!


  • [
  • ( and )
  • \
  • *
  • +
  • ?
  • { and }
  • ^
  • $
  • .
  • | (pipe)
  • /

These characters are used for advanced pattern matching. If you want to search for one of these characters, you need to escape it with a backslash. For example, to search for all periods in a block of text, the pattern is not . but rather \..

As an extra complication, the backslash is also the escape character in Objective-C string literals, so it must itself be escaped when you write a pattern in code for NSString and NSRegularExpression. That means the standard regular expression \. will be written as \\. in your code.

To clarify the above concept in point form:

  • The literal @"\\." defines a string that looks like this: \.
  • The regular expression \. will then match a single period character
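The double escaping can be checked with a short sketch. Both helper names below are hypothetical. The first writes the escaped pattern by hand; the second lets NSRegularExpression's escapedPatternForString: add the backslashes, which is one way to search for arbitrary literal text.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: counts period characters using the hand-escaped pattern.
// The literal @"\\." is the two-character string \. once compiled.
static NSUInteger RWCountPeriods(NSString *text)
{
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"\\." options:0 error:NULL];
    return [regex numberOfMatchesInString:text options:0 range:NSMakeRange(0, text.length)];
}

// Hypothetical helper: counts literal occurrences of `needle`, escaping any
// reserved characters in it first so they lose their special meaning.
static NSUInteger RWCountLiteral(NSString *needle, NSString *text)
{
    NSString *pattern = [NSRegularExpression escapedPatternForString:needle];
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:NULL];
    return [regex numberOfMatchesInString:text options:0 range:NSMakeRange(0, text.length)];
}
```

For the text "a.b.c", both helpers should count two periods, whereas the unescaped pattern . would match all five characters.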

Capturing parentheses are used to group part of a pattern. For example, 3 (pm|am) would match the text “3 pm” as well as the text “3 am”. The pipe character here (|) acts like an OR operator. You can include as many pipe characters in your regular expression as you would like. As an example, (Tom|Dick|Harry) is a valid pattern.

Grouping with parentheses comes in handy when you need to optionally match a certain text string. Say you are looking for “November” in some text, but the user may or may not have abbreviated the month as “Nov”. You can define the pattern as Nov(ember)? where the question mark after the capturing parentheses means that whatever is inside the parentheses is optional.

These parentheses are termed “capturing” because they capture the matched content and allow you to reference it in other places in your regular expression.

As an example, assume you have the string “Say hi to Harry”. If you created a search-and-replace regular expression to replace any occurrences of (Tom|Dick|Harry) with that guy $1, the result would be “Say hi to that guy Harry”. The $1 refers to the text captured by the first group in the pattern.
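The replacement described above can be sketched as follows. The function name RWReplaceNames is hypothetical; the pattern and template are the ones from the example.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: replaces each of the three names with "that guy <name>".
// $1 in the template expands to whatever the first capturing group matched.
static NSString *RWReplaceNames(NSString *text)
{
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"(Tom|Dick|Harry)"
                                                  options:0
                                                    error:NULL];
    return [regex stringByReplacingMatchesInString:text
                                           options:0
                                             range:NSMakeRange(0, text.length)
                                      withTemplate:@"that guy $1"];
}
```

RWReplaceNames(@"Say hi to Harry") yields "Say hi to that guy Harry".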

Capturing and non-capturing groups are somewhat advanced topics. You’ll encounter examples of capturing and non-capturing groups later on in the tutorial.

Character classes represent a set of possible single-character matches. Character classes appear between square brackets ([ and ]).

As an example, the regular expression t[aeiou] will match “ta”, “te”, “ti”, “to”, or “tu”. You can have as many character possibilities inside the square brackets as you like, but remember that any single character in the set will match. [aeiou] looks like five characters, but it actually means “a” or “e” or “i” or “o” or “u”.

You can also define a range in a character class if the characters appear consecutively. For example, to search for a number from 100 to 109, the pattern would be 10[0-9]. This returns the same results as 10[0123456789], but using ranges makes your regular expressions much cleaner and easier to understand.

But character classes aren’t limited to numbers — you can do the same thing with characters. For instance, [a-f] will match “a”, “b”, “c”, “d”, “e”, or “f”.

Character classes usually contain the characters you want to match, but what if you want to explicitly not match a character? You can also define negated character classes, which use the ^ character. For example, the pattern t[^o] will match “t” followed by any single character other than “o”, so it matches “ta” and “t!” but not “to”.
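Here is a small sketch for experimenting with character classes in code. The helper RWMatchedStrings is hypothetical; it simply collects the text of every match so you can inspect what a class accepts and rejects.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the text of every match of `pattern` in `text`.
static NSArray<NSString *> *RWMatchedStrings(NSString *pattern, NSString *text)
{
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:NULL];
    NSMutableArray<NSString *> *found = [NSMutableArray array];
    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:text options:0 range:NSMakeRange(0, text.length)];
    for (NSTextCheckingResult *match in matches) {
        // match.range is the range of the whole match within `text`
        [found addObject:[text substringWithRange:match.range]];
    }
    return found;
}
```

For instance, RWMatchedStrings(@"t[aeiou]", @"ta te tx to") returns "ta", "te" and "to", while RWMatchedStrings(@"t[^o]", @"to ta t!") returns "ta" and "t!".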

NSRegularExpressions Cheat Sheet

Regular expressions are a great example of a simple syntax that can end up with some very complicated arrangements! Even the best regular expression wranglers keep a cheat sheet handy for those odd corner cases.

To help you out, we have put together an official raywenderlich.com NSRegularExpression Cheat Sheet PDF for you! Please download it and check it out.

In addition, here’s an abbreviated form of the cheat sheet below with some additional explanations to get you started:

  • . matches any character. p.p matches pop, pup, pmp, p@p, and so on.
  • \w matches any “word-like” character, which includes letters, digits, and the underscore, but not punctuation or other symbols. hello\w will match “hello_” in “hello_9” and “helloo”, but nothing in “hello!”.
  • \d matches a numeric digit, which in most cases means [0-9]. \d\d?:\d\d will match strings in time format, such as “9:30” and “12:45”.
  • \b matches a word boundary: the zero-width position between a word character and a non-word character (such as a space or punctuation), or the start or end of the text. to\b will match the “to” in “to the moon” and “to!”, but it will not match “tomorrow”. \b is handy for “whole word” type matching.
  • \s matches whitespace characters such as spaces, tabs, and newlines. hello\s will match “hello ” in “Well, hello there!”.
  • ^ matches at the beginning of a line. Note that this particular ^ is different from ^ inside of the square brackets! For example, ^Hello will match against the string “Hello there”, but not “He said Hello”. By default, NSRegularExpression treats the whole string as a single line; pass NSRegularExpressionAnchorsMatchLines to make ^ and $ match at every line break.
  • $ matches at the end of a line. For example, the end$ will match against “It was the end” but not “the end was near”.
  • * matches the previous element 0 or more times. 12*3 will match 13, 123, 1223, 122223, and 1222222223
  • + matches the previous element 1 or more times. 12+3 will match 123, 1223, 122223, 1222222223, but not 13.
  • Curly braces {} contain the minimum and maximum number of matches. For example, 10{1,2}1 will match both “101” and “1001” but not “10001” as the minimum number of matches is 1 and the maximum number of matches is 2. He[Ll]{2,}o will match “HeLLo” and “HellLLLllo” and any such silly variation of “hello” with lots of L’s, since the minimum number of matches is 2 but the maximum number of matches is not set — and therefore unlimited!
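The cheat sheet entries above can be spot-checked in code with a sketch like the following. RWHasMatch is a hypothetical helper that reports whether a pattern matches anywhere in a string.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: YES if `pattern` matches anywhere in `text`.
static BOOL RWHasMatch(NSString *pattern, NSString *text)
{
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:NULL];
    NSRange first = [regex rangeOfFirstMatchInString:text
                                             options:0
                                               range:NSMakeRange(0, text.length)];
    // rangeOfFirstMatchInString: returns {NSNotFound, 0} when nothing matches
    return first.location != NSNotFound;
}
```

A few checks taken from the list: RWHasMatch(@"to\\b", @"tomorrow") is NO, RWHasMatch(@"12*3", @"13") is YES, RWHasMatch(@"12+3", @"13") is NO, and RWHasMatch(@"10{1,2}1", @"10001") is NO. Remember that each backslash is doubled inside the string literal.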

That’s enough to get you started!

If you would like more background on regular expressions, hit up your favourite search engine to find some online regular expression references. If you’re more of a dead-tree type of person, there are also plenty of thick paper books that explain all the details of regular expressions.

Or – you can just load up regexpal again and start experimenting with some of the syntax you’ve learned so far!

Implementing RegEx in iOS

Now that you know the basics, it’s time to get into iOS-specific implementations of regular expressions.

Start by downloading the starter project for this NSRegularExpression tutorial. Once you’ve downloaded it, open up the project in Xcode and run it.

The UI for the app is mostly complete, but the core functionality of the app relies on regular expressions, which it doesn’t have…yet! Your job in this tutorial is to add the required regular expressions into this app to make it shine.

A few sample screenshots demonstrating the content of the application are shown below:

The sample application covers three common use-cases with regular expressions:

  1. Performing text search, as well as search & replace
  2. Validating user input
  3. Auto-formatting user input

You’ll start by implementing the most straightforward use of regular expressions: text search.

/Search( and replace)?/

Here’s the basic overview of the search-and-replace functionality of the app:

  • The first view controller RWFirstViewController has a read-only UITextView that contains some pre-filled content.
  • The navigation bar contains a search button that will present RWSearchViewController in a modal fashion.
  • The user will then type some information into the field and tap “Search”.
  • The app will then dismiss the search view and highlight all matches in the text view.
  • If the user selected the “Replace” option in RWSearchViewController, the app will perform a search-and-replace for all matches in the text, instead of highlighting the results.

Note: Your app uses the attributedText property of UITextView (an NSAttributedString, available since iOS 6) to highlight the search results. You can read more about it in iOS 6 by Tutorials – Chapter 15, “What’s New with Attributed Strings”.

There’s also a “Bookmark” button that allows the user to highlight any date, time or location in the text. For simplicity’s sake, you won’t cover every possible format of date, time and location strings that can appear in your text. You’ll implement the bookmarking functionality at the very end of the tutorial.

Your first step to getting the search functionality working is to turn standard strings representing regular expressions into NSRegularExpression objects.

Open up RWFirstViewController.m and replace the stub implementation of regularExpressionWithString:options: with the following:

// Create a regular expression with given string and options
- (NSRegularExpression *)regularExpressionWithString:(NSString *)string options:(NSDictionary *)options
{
    // Create a regular expression
    BOOL isCaseSensitive = [[options objectForKey:kRWSearchCaseSensitiveKey] boolValue];
    BOOL isWholeWords = [[options objectForKey:kRWSearchWholeWordsKey] boolValue];
 
    NSError *error = nil;
    NSRegularExpressionOptions regexOptions = isCaseSensitive ? 0 : NSRegularExpressionCaseInsensitive;
 
    NSString *placeholder = isWholeWords ? @"\\b%@\\b" : @"%@";
    NSString *pattern = [NSString stringWithFormat:placeholder, string];
 
    NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:pattern options:regexOptions error:&error];
    // Check the return value rather than the error pointer; regex is nil on failure
    if (!regex)
    {
        NSLog(@"Couldn't create regex with given string and options: %@", error);
    }
 
    return regex;
}

The workhorse in this method is the call to regularExpressionWithPattern:options:error:. This turns a string into an NSRegularExpression object.

However, before that happens, there are two things to consider:

  • If a case-insensitive search is requested, then pass the NSRegularExpressionCaseInsensitive option. The default behavior of NSRegularExpression is to perform case-sensitive searches, but in this case you’re using the user-friendly option of case-insensitive searches.
  • If a whole word search is requested, then wrap the regular expression pattern in \b on both sides. Recall that \b matches a word boundary, so putting \b before and after the search pattern will turn it into a whole word search.
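Note that the method above places the user's text into the pattern as-is, so any reserved characters the user types are interpreted as regular expression syntax. If you would rather treat the input as literal text, one option is to escape it first. The following is a sketch under that assumption; RWLiteralRegex is a hypothetical function, not part of the starter project.

```objc
#import <Foundation/Foundation.h>

// Hypothetical variant: builds a regex that searches for `searchText` literally.
static NSRegularExpression *RWLiteralRegex(NSString *searchText, BOOL wholeWords, BOOL caseSensitive)
{
    // Escape reserved characters so that, for example, "." only matches a period
    NSString *escaped = [NSRegularExpression escapedPatternForString:searchText];
    NSString *pattern = wholeWords
        ? [NSString stringWithFormat:@"\\b%@\\b", escaped]
        : escaped;

    NSRegularExpressionOptions options =
        caseSensitive ? 0 : NSRegularExpressionCaseInsensitive;

    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern options:options error:&error];
    if (!regex) {
        NSLog(@"Couldn't create regex for %@: %@", searchText, error);
    }
    return regex;
}
```

With this variant, a whole-word, case-insensitive search for "to" in "To the moon, tomorrow, to!" should find two matches, and a search for "3.14" no longer matches "3x14". The trade-off is that users can no longer type their own patterns into the search field.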

Now that you have the NSRegularExpression object, you can use it for matching text along with many other operations.

Inside RWFirstViewController.m, find searchAndReplaceText and modify it as follows:

// Search for a searchString and replace it with the replacementString in the given text view with search options
- (void)searchAndReplaceText:(NSString *)searchString withText:(NSString *)replacementString inTextView:(UITextView *)textView options:(NSDictionary *)options
{
    // Text before replacement
    NSString *beforeText = textView.text;
 
    // Create a range for it. We do the replacement on the whole
    // range of the text view, not only a portion of it.
    NSRange range = NSMakeRange(0, beforeText.length);
 
    // Call the convenient method to create a regex for us with the options we have
    NSRegularExpression *regex = [self regularExpressionWithString:searchString options:options];
 
    // Call the NSRegularExpression method to do the replacement for us
    NSString *afterText = [regex stringByReplacingMatchesInString:beforeText options:0 range:range withTemplate:replacementString];
 
    // Update UI
    textView.text = afterText;
}

First, this method captures the old text in the UITextView and calculates its length. It’s possible to apply a regular expression to just a subset of your text, which is why you need to specify the range. In this case, you’re using the entire length of the string, so the regular expression is applied to all of your text.

The real magic happens in stringByReplacingMatchesInString:options:range:withTemplate:. It returns a new string without mutating the old one. This new string is then set on the UITextView so the user can see the results.
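One detail worth keeping in mind: the last argument is a template, so $ and \ in the replacement text have special meaning (as with $1 earlier). If the replacement should be inserted literally, escapedTemplateForString: can protect those characters. A minimal sketch, with RWReplaceLiteral as a hypothetical helper:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: replaces every match of `pattern` in `text` with
// `replacement` taken literally, even if it contains "$" or "\".
static NSString *RWReplaceLiteral(NSString *text, NSString *pattern, NSString *replacement)
{
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:NULL];
    // Escape template metacharacters so "$100" is not read as a group reference
    NSString *template = [NSRegularExpression escapedTemplateForString:replacement];
    return [regex stringByReplacingMatchesInString:text
                                           options:0
                                             range:NSMakeRange(0, text.length)
                                      withTemplate:template];
}
```

For example, RWReplaceLiteral(@"the price", @"price", @"$100") is intended to produce "the $100".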

Add the following method to RWFirstViewController.m:

// Search for a searchString in the given text view with search options
- (void)searchText:(NSString *)searchString inTextView:(UITextView *)textView options:(NSDictionary *)options
{
    // 1: Range of visible text
    NSRange visibleRange = [self visibleRangeOfTextView:self.textView];
 
    // 2: Get a mutable sub-range of attributed string of the text view that is visible
    NSMutableAttributedString *visibleAttributedText = [textView.attributedText attributedSubstringFromRange:visibleRange].mutableCopy;
 
    // Get the string of the attributed text
    NSString *visibleText = visibleAttributedText.string;
 
    // 3: Create a new range for the visible text. This is different
    // from visibleRange. VisibleRange is a portion of all textView that is visible, but
    // visibleTextRange is only for visibleText, so it starts at 0 and its length is
    // the length of visibleText
    NSRange visibleTextRange = NSMakeRange(0, visibleText.length);
 
    // 4: Call the convenient method to create a regex for us with the options we have
    NSRegularExpression *regex = [self regularExpressionWithString:searchString options:options];
 
    // 5: Find matches (no special matching options are needed here)
    NSArray *matches = [regex matchesInString:visibleText options:0 range:visibleTextRange];
 
    // 6: Iterate through the matches and highlight them
    for (NSTextCheckingResult *match in matches)
    {
        NSRange matchRange = match.range;
        [visibleAttributedText addAttribute:NSBackgroundColorAttributeName value:[UIColor yellowColor] range:matchRange];
    }
 
    // 7: Replace the range of the attributed string that we just highlighted
    // First, create a CFRange from the NSRange of the visible range
    CFRange visibleRange_CF = CFRangeMake(visibleRange.location, visibleRange.length);
 
    // Get a mutable copy of the attributed text of the text view
    NSMutableAttributedString *textViewAttributedString = self.textView.attributedText.mutableCopy;
 
    // Replace the visible range
    CFAttributedStringReplaceAttributedString((__bridge CFMutableAttributedStringRef)(textViewAttributedString), visibleRange_CF, (__bridge CFAttributedStringRef)(visibleAttributedText));
 
    // 8: Update UI
    textView.attributedText = textViewAttributedString;
}

Here’s a step-by-step explanation of the above code:

  1. In order to be as efficient as possible, get the range of the textview that is visible on the screen. You only need to search and highlight what’s visible on the screen — not the text that is off-screen. The helper method visibleRangeOfTextView: returns an NSRange of the currently visible text in a UITextView.
  2. Get a mutable sub-range of attributed string of the visible portion of the textview that you can use in the next step.
  3. You now create a new NSRange for the visible text. This is different from visibleRange: visibleRange identifies which portion of the text view’s full text is visible, whereas visibleTextRange applies only to the substring of attributed text you pulled out, so it starts at 0 and its length is the length of visibleText. This is the range you will pass into the regex engine.
  4. Call the convenience method regularExpressionWithString to create a regex for you with the provided options.
  5. Find matches and store them in an array. These matches are all instances of NSTextCheckingResult.
  6. Iterate through the matches and highlight them.
  7. Replace the range of the attributed string that you just highlighted. To do this, create a CFRange from the NSRange of the visible range. Get a mutable copy of the attributed text of the text view and perform the replace.
  8. Update the UITextView with the highlighted results.
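As an alternative to extracting a substring first, NSRegularExpression can search a sub-range of the full string directly; the match ranges it reports are then expressed in the full string's coordinates. The sketch below illustrates this with the block-based enumerateMatchesInString:options:range:usingBlock:. RWMatchRanges is a hypothetical helper and leaves out the highlighting itself.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the ranges (boxed in NSValue) of all matches of
// `pattern` that lie inside `searchRange` of `text`.
static NSArray<NSValue *> *RWMatchRanges(NSString *pattern, NSString *text, NSRange searchRange)
{
    NSMutableArray<NSValue *> *ranges = [NSMutableArray array];
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:NULL];
    [regex enumerateMatchesInString:text
                            options:0
                              range:searchRange
                         usingBlock:^(NSTextCheckingResult *match, NSMatchingFlags flags, BOOL *stop) {
        if (match) {
            // match.range is relative to the start of `text`, not to searchRange
            [ranges addObject:[NSValue valueWithRange:match.range]];
        }
    }];
    return ranges;
}
```

Searching for "the" in "the cat and the hat" with a search range of {4, 15} skips the first "the" and reports a single match at location 12.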

While searching, highlighting, and rendering attributed strings is a pretty cool feature to have in your app, it comes at a performance cost — especially if your text is very long.

To handle this in the most efficient way possible, you perform the find-and-highlight on only the visible portion of the text view.

However, when your user scrolls the text they’ll see results that aren’t highlighted. Whoops!

Updating Searches While Scrolling

You need a mechanism to update the UI before they see that. To do this, you’ll implement two UIScrollView delegate methods to update the view as the user scrolls.

Create the following two methods in RWFirstViewController.m:

// Called when the user finishes scrolling the content
- (void)scrollViewWillEndDragging:(UIScrollView *)scrollView withVelocity:(CGPoint)velocity targetContentOffset:(inout CGPoint *)targetContentOffset
{
    if (CGPointEqualToPoint(velocity, CGPointZero))
    {
        if (self.lastSearchString && self.lastSearchOptions && !self.lastReplacementString)
            [self searchText:self.lastSearchString inTextView:self.textView options:self.lastSearchOptions];
    }
}
 
// Called when the scroll view has ended decelerating the scrolling movement
- (void)scrollViewDidEndDecelerating:(UIScrollView *)scrollView
{
    if (self.lastSearchString && self.lastSearchOptions && !self.lastReplacementString)
        [self searchText:self.lastSearchString inTextView:self.textView options:self.lastSearchOptions];
}

Now every time the user scrolls, you perform a fresh search and update the UI accordingly.

Build and run your app and try searching for various words and groups of words! You’ll see the search term highlighted throughout your text, as shown in the image below:

regex-search

Scroll through the text and see that your UI updates are keeping pace. As well, test out the search and replace functionality to see that your text strings are replaced as expected.

Highlighting and replacing are great user-facing functions. But what about for you, the programmer? How can regular expressions be used to your advantage in your apps?

Data Validation

Many apps will have some kind of user input, such as a user entering their email address or phone number. You’ll want to perform some level of data validation on this user input, both to ensure data integrity — and to inform the user if they’ve made a mistake entering their data.

Regular expressions are perfect for this kind of data validation, since they are excellent at parsing string patterns.

There are two things you need to add to your app: the validation patterns themselves, and a mechanism to validate the user’s input with those patterns. To make things easy for the user, all of the validations in your app will be case-insensitive, so you can just use lower-case in your patterns.

As an exercise, try to come up with the regular expressions to validate the following text strings (don’t worry about case insensitivity):

  • First name — should be composed of standard English letters and between one and ten characters in length.
  • Middle initial — should be composed of standard English letters and be only one character in length.
  • Last name — should be composed of standard English letters plus the apostrophe (for names such as O’Brien) and between two and ten characters in length.
  • Date of birth – should fall between 1/1/1900 and 12/31/2099, and should be one of the following date formats: mm/dd/yyyy, mm-dd-yyyy, or mm.dd.yyyy.

Of course, you can use regexpal to try out your expressions as you develop them.

How did you do with coming up with the required regular expressions? If you’re stuck, just go back to the cheat sheet above and look for the bits that will help you in the scenarios above.

The spoiler below shows the regular expressions in code, and where to add them in your code — but try it yourself first, without looking!

Solution Inside

How did you do?

To create the regular expression to validate the first name, you first match from the beginning of the string, then you match a range of characters from a-z, and then finally match the end of the string, ensuring that the name is between 1 and 10 characters in length.

The next two patterns, middle initial and last name, follow the same logic. In the case of the middle initial, you don’t need to specify the length ({1}), since ^[a-z]$ already matches exactly one character.

Note that you’re not worrying about case insensitivity here — you’ll take care of that when instantiating the regular expression.

For the date of birth, you have a little more work to do. You match on the start of the string, then for the month portion you have a capturing group to match one of 01, 02, 03, 04, 05, 06, 07, 08, 09, 10, 11 or 12, followed by another capturing group to match one of the separators: a hyphen (-), a slash (/) or a period (.).

For the day portion, you then have another capturing group to match one of 01, 02, … 29, 30, or 31, followed by a capturing group to match the separator again: -, / or a period.

Finally, there is a capturing group to match either 19 or 20, followed by any two numeric characters.
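The patterns themselves do not appear in the text above, so the following is a reconstruction from the prose description rather than the starter project's actual code. The constant names and the helper RWMatchesPattern are hypothetical.

```objc
#import <Foundation/Foundation.h>

// Assumed patterns, rebuilt from the description above (not the original listing)
static NSString * const kRWFirstNamePattern     = @"^[a-z]{1,10}$";
static NSString * const kRWMiddleInitialPattern = @"^[a-z]$";
static NSString * const kRWLastNamePattern      = @"^[a-z']{2,10}$";
// month, separator, day, separator, then 19 or 20 followed by two digits
static NSString * const kRWDateOfBirthPattern   =
    @"^(0[1-9]|1[012])([-/.])(0[1-9]|[12][0-9]|3[01])([-/.])(19|20)[0-9][0-9]$";

// Hypothetical helper: case-insensitive check of `string` against `pattern`.
static BOOL RWMatchesPattern(NSString *string, NSString *pattern)
{
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:NULL];
    NSRange match = [regex rangeOfFirstMatchInString:string
                                             options:0
                                               range:NSMakeRange(0, string.length)];
    return match.location != NSNotFound;
}
```

Under these assumptions, "12/31/2099" validates while "13/01/2000" and "1/1/1900" do not (the pattern requires two-digit months and days). The date pattern only checks the shape of the input: "02.30.1950" would pass even though that date does not exist, and the last-name pattern expects a straight apostrophe.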

You can get very creative with regular expressions. There are other ways to solve the above problem, such as using \d instead of [0-9]. However, any solution is perfectly fine as long as it works!

Now that you have the patterns, you need to validate the entered text in each text field.

At the very end of RWSecondViewController.m, find the validateString:withPattern: method and replace the implementation with the following:

// Validate the input string with the given pattern and
// return the result as a boolean
- (BOOL)validateString:(NSString *)string withPattern:(NSString *)pattern
{
    NSError *error = nil;
    NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:pattern options:NSRegularExpressionCaseInsensitive error:&error];
 
    NSAssert(regex, @"Unable to create regular expression");
 
    NSRange textRange = NSMakeRange(0, string.length);
    NSRange matchRange = [regex rangeOfFirstMatchInString:string options:NSMatchingReportProgress range:textRange];
 
    BOOL didValidate = NO;
 
    // Did we find a matching range
    if (matchRange.location != NSNotFound)
        didValidate = YES;
 
    return didValidate;
}

This is very similar to what you did in RWFirstViewController.m. You create a regular expression with the given pattern, and since case sensitivity is not required, you use NSRegularExpressionCaseInsensitive.

To actually check for a match, the result of rangeOfFirstMatchInString:options:range: is tested. This is probably the most efficient way to check for a match, since this call exits early when it finds the first match. However, there are other alternatives such as numberOfMatchesInString:options:range: if you need to know the total number of matches.
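If you need the captured pieces of the input rather than a yes/no answer, firstMatchInString:options:range: returns an NSTextCheckingResult whose rangeAtIndex: exposes each capturing group. The sketch below assumes a month-day-year date pattern similar to the one described earlier; RWDateParts and the exact pattern are illustrative, not taken from the starter project.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns @[month, day, year] for a valid date string,
// or nil if the input does not match the assumed pattern.
static NSArray<NSString *> *RWDateParts(NSString *input)
{
    NSString *pattern =
        @"^(0[1-9]|1[012])[-/.](0[1-9]|[12][0-9]|3[01])[-/.]((?:19|20)[0-9]{2})$";
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:NULL];
    NSTextCheckingResult *match =
        [regex firstMatchInString:input options:0 range:NSMakeRange(0, input.length)];
    if (!match) {
        return nil;
    }

    NSMutableArray<NSString *> *parts = [NSMutableArray array];
    // Index 0 is the whole match; capturing groups start at index 1
    for (NSUInteger i = 1; i < match.numberOfRanges; i++) {
        [parts addObject:[input substringWithRange:[match rangeAtIndex:i]]];
    }
    return parts;
}
```

RWDateParts(@"12/31/2099") returns the three strings "12", "31" and "2099", while RWDateParts(@"31/12/2099") returns nil because 31 is not a valid month.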

There’s a small problem with the patterns above. Did you notice what it was?

If the user provides leading or trailing spaces in their fields, then the patterns above will fail to validate. In a case like this, you have two options: either update the patterns to account for leading and trailing spaces, or create and apply another pattern to take out leading and trailing spaces before applying those validation patterns.

The starter project uses the second approach since it keeps the validation patterns simple and concise, and can be refactored into its own method. This is done in stringTrimmedForLeadingAndTrailingWhiteSpacesFromString:.

regex-whitespace

Take a look at the following method in RWSecondViewController.m:

// Trim the input string by removing leading and trailing white spaces
// and return the result
- (NSString *)stringTrimmedForLeadingAndTrailingWhiteSpacesFromString:(NSString *)string
{
    return string;
}

Right now, this method doesn’t do anything — it simply returns the input.

Can you think of a regular expression to find spaces at the start and end of a string? Try it yourself before checking the spoiler below:

Solution Inside

Here’s how to put it into code. In RWSecondViewController.m, replace the implementation of stringTrimmedForLeadingAndTrailingWhiteSpacesFromString with the following:

// Trim the input string by removing leading and trailing white spaces
// and return the result
- (NSString *)stringTrimmedForLeadingAndTrailingWhiteSpacesFromString:(NSString *)string
{
    NSString *leadingTrailingWhiteSpacesPattern = @"(?:^\\s+)|(?:\\s+$)";
 
    NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:leadingTrailingWhiteSpacesPattern options:NSRegularExpressionCaseInsensitive error:NULL];
 
    NSRange stringRange = NSMakeRange(0, string.length);
    NSString *trimmedString = [regex stringByReplacingMatchesInString:string options:NSMatchingReportProgress range:stringRange withTemplate:@"$1"];
 
    return trimmedString;
}

The pattern ^\s+ will find leading whitespace and \s+$ will find trailing whitespace.

That’s clear enough, and the call to stringByReplacingMatchesInString:options:range:withTemplate: is the same as in RWFirstViewController.m.

But why is the replacement string $1? Won’t that just replace the whitespace with itself — since it’s a captured group — which effectively does nothing?

The ?: inside the parentheses tells the regular expression engine to create a non-capturing group. This means that the matched text isn’t stored in a buffer as it normally would be.

The replacement template here is a back-reference to the first capturing group: $1. It tells the engine to replace the matched text with what was matched in the first capturing group.

Since both groups in this pattern are non-capturing, there is no captured text for $1 to refer to, so the back-reference comes out empty. Thus the engine ends up matching white spaces and replacing them with nothing, which effectively removes the leading and trailing spaces.
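A more direct way to express the same intent is to use an empty template, which does not depend on how the engine handles a reference to a group that captured nothing. The sketch below uses a hypothetical helper, RWTrim, and is an alternative to the tutorial's approach rather than a copy of it.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: removes leading and trailing whitespace with a regex.
static NSString *RWTrim(NSString *string)
{
    // ^\s+ matches leading whitespace, \s+$ matches trailing whitespace
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"^\\s+|\\s+$"
                                                  options:0
                                                    error:NULL];
    // An empty template replaces each match with nothing
    return [regex stringByReplacingMatchesInString:string
                                           options:0
                                             range:NSMakeRange(0, string.length)
                                      withTemplate:@""];
}
```

RWTrim(@"  hi there \n") returns "hi there"; the space between the two words is untouched because it is neither at the start nor at the end of the string. For plain trimming outside of a regex exercise, Foundation's stringByTrimmingCharactersInSet: with whitespaceAndNewlineCharacterSet gives the same result here.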

Build and run your app and switch to the second tab. As you fill in the fields, the regular expressions will check what you entered, and your app will inform you if your text passed or failed validation, as shown in the screenshot below:

regex-second

If you are still struggling to make sense of non-capturing groups, capturing groups and back-references, try out the following scenarios to see what the results are. For example:

  • replace the pattern above with @"(^\\s+)|(\\s+$)" and template with @"BOO"
  • replace the pattern above with @"(?:^\\s+)|(\\s+$)" and template with @"$1BOO"
  • replace the pattern above with @"(?:^\\s+)|(\\s+$)" and template with @"$2BOO"

Dealing With Multiple Format Patterns

In the previous section, you implemented a regular expression pattern to match a date. To do this, you made some assumptions and gave the user some flexibility to enter a date in mm-dd-yyyy, mm/dd/yyyy or mm.dd.yyyy format.

But the user could have entered a date in the format mm\dd\yyyy, and it would have failed validation. What if you’re in a situation where you want to enforce a specific pattern?

In RWThirdViewController.m, the user will enter a Social Security Number (SSN) into the app. In the United States, a Social Security Number has a specific format: xxx-xx-xxxx. There are only 11 possible characters in an SSN: 0 1 2 3 4 5 6 7 8 9 and - (hyphen).

How can you enforce this format in your app? What are your options?

You could:

  • Display the default keyboard and intercept user input; if there is an invalid character typed, ignore it. Problem: not very user-friendly. User types something and nothing happens!
  • Display a customized keyboard? Problem: too much work for such a simple task.
  • Display the number pad keyboard, but also intercept user input and insert dashes where necessary. Problem: what if user copy + pastes text from somewhere else?

This tutorial implements the last option, as it requires the least effort, it is more user-friendly, and the problem of copy-pasting from another place can be easily fixed with the help of (surprise surprise!) regular expressions.

Open RWThirdViewController.m and take a look at textField:shouldChangeCharactersInRange:replacementString: to see how the app intercepts user inputs and inserts or removes dashes.

All that’s left for you to do is to come up with a pattern to match a Social Security Number (xxx-xx-xxxx). You need a pattern that matches exactly three digits, followed by a dash, then two digits, followed by another dash, followed by the final four digits.

Note:

In the real world, Social Security Numbers have more detailed criteria than just matching a certain format. However, for the purposes of this tutorial, you’ll keep it simple and just match the xxx-xx-xxxx format. As homework, try modifying the regex pattern for a Social Security Number so that it matches only acceptable ones, as detailed here: http://en.wikipedia.org/wiki/Social_Security_number#Structure

Rather than [0-9], you can use the more concise digit class \d in your regular expression. A pattern like @"^\\d{3}[-]\\d{2}[-]\\d{4}$" should do the trick!

Replace the definition of kSocialSecuityNumberPattern at the top of RWThirdViewController.m with the following:

#define kSocialSecuityNumberPattern @"^\\d{3}[-]\\d{2}[-]\\d{4}$"
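
To see the pattern at work in isolation, here is a small sketch of a validation function. RWIsValidSSNFormat is a hypothetical name chosen for this example; the sample project performs its own check inside the view controller, which may be structured differently.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of the sample project): returns YES only if
// `candidate` has the xxx-xx-xxxx shape described above.
static BOOL RWIsValidSSNFormat(NSString *candidate)
{
    if (candidate == nil) {
        return NO;
    }
    NSError *error = nil;
    // ^ and $ anchor the pattern so extra characters before or after are rejected.
    NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"^\\d{3}[-]\\d{2}[-]\\d{4}$"
                                                                           options:0
                                                                             error:&error];
    if (regex == nil) {
        return NO;
    }
    NSUInteger count = [regex numberOfMatchesInString:candidate
                                              options:0
                                                range:NSMakeRange(0, candidate.length)];
    return count == 1;
}
```

With this sketch, RWIsValidSSNFormat(@"123-45-6789") returns YES, while @"123456789" and @"12a-45-6789" are both rejected.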

Build and run your app, and switch to the third tab. Whether you paste in numbers, characters, or use the number pad, the validation will kick in and auto-format, as seen in the screenshot below:

regex-third
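
The project handles pasted text inside textField:shouldChangeCharactersInRange:replacementString:. As a separate illustration of the idea, and not a copy of the project's code, here is one way pasted text could be normalized with two regular expressions: strip everything that is not a digit, then use capture groups to reinsert the dashes. The name RWFormattedSSN is an assumption made for this sketch.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not the project's implementation): removes every
// non-digit from `pasted` and, if exactly nine digits remain, returns them
// formatted as xxx-xx-xxxx. Returns nil otherwise.
static NSString *RWFormattedSSN(NSString *pasted)
{
    if (pasted == nil) {
        return nil;
    }
    NSError *error = nil;
    NSRegularExpression *nonDigits = [NSRegularExpression regularExpressionWithPattern:@"\\D"
                                                                               options:0
                                                                                 error:&error];
    NSRegularExpression *nineDigits = [NSRegularExpression regularExpressionWithPattern:@"^(\\d{3})(\\d{2})(\\d{4})$"
                                                                                options:0
                                                                                  error:&error];
    if (nonDigits == nil || nineDigits == nil) {
        return nil;
    }

    // Step 1: drop spaces, dashes, letters and anything else that is not a digit.
    NSString *digits = [nonDigits stringByReplacingMatchesInString:pasted
                                                           options:0
                                                             range:NSMakeRange(0, pasted.length)
                                                      withTemplate:@""];

    // Step 2: require exactly nine digits, then rebuild the string from the
    // three capture groups with dashes in between.
    NSRange all = NSMakeRange(0, digits.length);
    if ([nineDigits numberOfMatchesInString:digits options:0 range:all] == 0) {
        return nil;
    }
    return [nineDigits stringByReplacingMatchesInString:digits
                                                options:0
                                                  range:all
                                           withTemplate:@"$1-$2-$3"];
}
```

For example, RWFormattedSSN(@"123 45 6789") returns 123-45-6789, and RWFormattedSSN(@"12345") returns nil because only five digits remain.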

Handling Multiple Search Results

You haven’t yet implemented the Bookmark button found on the navigation bar. When the user taps on it, the app should highlight any date, time or location strings in the text.

Open up RWFirstViewController.m in Xcode, and find the following implementation for the Bookmark bar button item:

#pragma mark
#pragma mark - IBActions
 
- (IBAction)findInterestingData:(id)sender
{
    [self underlineAllDates];
    [self underlineAllTimes];
    [self underlineAllLocations];
}

The method above calls three other helper methods to underline dates, times and locations in the text. If you look at the implementation of each of the helper methods above, you will see they are empty!

Guess it’s your job to flesh out these three methods!

Here are the requirements for the helper methods:

Date Requirements:

  • xx/xx/xx or xx.xx.xx or xx-xx-xx format. Day, month and year placement is not important since they will just be highlighted. Example: 10-05-12.
  • Full or abbreviated month name (e.g. Jan or January, Feb or February, etc.), followed by a one- or two-digit number (e.g. x or xx). The day of the month can be ordinal (e.g. 1st, 2nd, 10th, 21st, etc.), followed by a comma as separator, and then a four-digit number (e.g. xxxx). There can be zero or more white spaces between the name of the month, day and year. Example: March 13th, 2001

Time requirements:

  • Find simple times like “9am” or “11 pm”: One or two digits followed by zero or more white spaces, followed by either lowercase “am” or “pm”.

Location requirements:

  • Any word at least one character long, immediately followed by a comma, followed by zero or more white spaces followed by any capitalized English letter combination that is exactly 2 characters long. For example “Boston, MA”.

Give it a try and see if you can sketch out the needed regular expressions!

For your convenience, there is already a helper method in the project that takes the array of matches returned by the regex engine and highlights them for you. It is - (void)highlightMatches:(NSArray *)matches, and you can find its implementation at the very end of RWFirstViewController.m.

Here are three samples for you to try. Replace the empty implementation of underlineAllDates with the following code:

- (void)underlineAllDates
{
    NSError *error = nil;
    // Either a numeric date (xx-xx-xx, xx/xx/xx or xx.xx.xx) or a month name, day and four-digit year.
    NSString *pattern = @"(\\d{1,2}[-/.]\\d{1,2}[-/.]\\d{1,2})|(Jan(uary)?|Feb(ruary)?|Mar(ch)?|Apr(il)?|May|Jun(e)?|Jul(y)?|Aug(ust)?|Sep(tember)?|Oct(ober)?|Nov(ember)?|Dec(ember)?)\\s*\\d{1,2}(st|nd|rd|th)?+[,]\\s*\\d{4}";
    NSString *string = self.textView.text;
    NSRange range = NSMakeRange(0, string.length);
    NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:pattern options:NSRegularExpressionCaseInsensitive error:&error];
    // No special matching options are needed here, so pass 0.
    NSArray *matches = [regex matchesInString:string options:0 range:range];
    [self highlightMatches:matches];
}

This pattern has two parts separated by the | (OR) character. That means either the first part or the second part will match.

The first part reads: (\d{1,2}[-/.]\d{1,2}[-/.]\d{1,2}). That means one or two digits followed by one of - or / or . followed by one or two digits, followed by - or / or ., followed by a final one or two digits.

The second part starts with (Jan(uary)?|Feb(ruary)?|Mar(ch)?|Apr(il)?|May|Jun(e)?|Jul(y)?|Aug(ust)?|Sep(tember)?|Oct(ober)?|Nov(ember)?|Dec(ember)?), which will match a full or abbreviated month name.

Next up is \s*\d{1,2}(st|nd|rd|th)?+ which will match zero or more whitespace characters, followed by one or two digits, followed by an optional ordinal suffix. As an example, this will match both “1” and “1st”. (The ?+ is the possessive form of the optional quantifier; in this pattern it matches the same text as a plain ? would.)

Finally [,]\s*\d{4} will match a comma followed by zero or more whitespace characters, followed by a four-digit number for the year.

That’s quite the intimidating regular expression! However, you can see how regular expressions are concise and pack a lot of information — and power! — into a seemingly cryptic string.
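
To check the pattern without running the whole app, you can pull the matched text out of each NSTextCheckingResult. The following is a minimal sketch; RWDateStringsInText is a hypothetical function written for this illustration, and it reuses the exact pattern from underlineAllDates.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of the sample project): returns every
// substring of `text` that the date pattern from underlineAllDates matches.
static NSArray *RWDateStringsInText(NSString *text)
{
    NSMutableArray *found = [NSMutableArray array];
    if (text == nil) {
        return found;
    }
    NSString *pattern = @"(\\d{1,2}[-/.]\\d{1,2}[-/.]\\d{1,2})|(Jan(uary)?|Feb(ruary)?|Mar(ch)?|Apr(il)?|May|Jun(e)?|Jul(y)?|Aug(ust)?|Sep(tember)?|Oct(ober)?|Nov(ember)?|Dec(ember)?)\\s*\\d{1,2}(st|nd|rd|th)?+[,]\\s*\\d{4}";
    NSError *error = nil;
    NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:pattern
                                                                           options:NSRegularExpressionCaseInsensitive
                                                                             error:&error];
    if (regex == nil) {
        return found;
    }
    // Each result's range covers the whole match; use it to cut out the text.
    NSArray *matches = [regex matchesInString:text options:0 range:NSMakeRange(0, text.length)];
    for (NSTextCheckingResult *match in matches) {
        [found addObject:[text substringWithRange:match.range]];
    }
    return found;
}
```

For the text "Meet on 10-05-12 or March 13th, 2001." this returns two strings: 10-05-12 and March 13th, 2001.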

Next up are the implementations of underlineAllTimes and underlineAllLocations.

Replace the empty implementations of these two methods in RWFirstViewController.m with the following code:

- (void)underlineAllTimes
{
    NSError *error = nil;
    // One or two digits, optional whitespace, then "am" or "pm".
    NSString *pattern = @"\\d{1,2}\\s*(pm|am)";
    NSString *string = self.textView.text;
    NSRange range = NSMakeRange(0, string.length);
    NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:pattern options:NSRegularExpressionCaseInsensitive error:&error];
    // No special matching options are needed here, so pass 0.
    NSArray *matches = [regex matchesInString:string options:0 range:range];
    [self highlightMatches:matches];
}
 
- (void)underlineAllLocations
{
    NSError *error = nil;
    // A word, a comma, optional whitespace, then exactly two capital letters.
    NSString *pattern = @"[a-zA-Z]+[,]\\s*([A-Z]{2})";
    NSString *string = self.textView.text;
    NSRange range = NSMakeRange(0, string.length);
    // Case-sensitive on purpose: the state code must be upper case.
    NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:&error];
    NSArray *matches = [regex matchesInString:string options:0 range:range];
    [self highlightMatches:matches];
}

As an exercise, see if you can explain the regular expression patterns based on the specifications above.
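
The location pattern wraps the two-letter code in a capturing group, so you can read that part on its own. Here is a small sketch that walks the matches with enumerateMatchesInString:options:range:usingBlock: instead of building an array first. RWStateCodesInText is a hypothetical name used only for this example.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of the sample project): collects the text of
// capture group 1 (the two capital letters) from every location match.
static NSArray *RWStateCodesInText(NSString *text)
{
    NSMutableArray *codes = [NSMutableArray array];
    if (text == nil) {
        return codes;
    }
    NSError *error = nil;
    NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"[a-zA-Z]+[,]\\s*([A-Z]{2})"
                                                                           options:0
                                                                             error:&error];
    if (regex == nil) {
        return codes;
    }
    [regex enumerateMatchesInString:text
                            options:0
                              range:NSMakeRange(0, text.length)
                         usingBlock:^(NSTextCheckingResult *result, NSMatchingFlags flags, BOOL *stop) {
        // Index 0 is the whole match; index 1 is the first capturing group.
        NSRange codeRange = [result rangeAtIndex:1];
        if (codeRange.location != NSNotFound) {
            [codes addObject:[text substringWithRange:codeRange]];
        }
    }];
    return codes;
}
```

For the text "I flew from Boston, MA to Austin,TX at 9am." this returns MA and TX. The block form is handy when you want to stop early: set *stop = YES inside the block once you have what you need.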

Build and run the app and tap on the Bookmark icon. You should see the link-style highlighting for dates, times, and locations, as shown below:

regex-bookmark

Where To Go From Here?

Here is the final example project that you developed in the above tutorial.

Congratulations! You now have some practical experience with using regular expressions.

Regular expressions are powerful and fun to work with — they’re a lot like solving a math problem. The flexibility of regular expressions gives you many ways to create a pattern to fit your needs, such as filtering input strings for white spaces, stripping out HTML or XML tags before parsing, or finding particular XML or HTML tags — and much more!

There are a lot of real-world examples of strings that can be validated with regular expressions.
As a final exercise, try to untangle the following regular expression that validates an email address:

[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?

It looks like a jumble of characters at first glance, but with your new-found knowledge (and the helpful links below) you’re one step closer to understanding it and becoming a master of regular expressions!

Here is a short list of some useful resources about regular expressions:

I hope you enjoyed this NSRegularExpression tutorial and cheat sheet, and if you have any comments or questions, please join the forum discussion below!

Soheil Azarpour

Soheil Moayedi Azarpour is an independent iOS developer. He’s worked on iOS applications for clients as well as his own personal apps. You can find him on Twitter, GitHub, Stack Overflow and connect on LinkedIn.

User Comments

5 Comments

  • This is the most helpful tutorial I've read on Regex. Great job!
    ressy
  • Good tutorial. Very useful . Thanks!
    mzds
  • Finished reading and learned a lot .Thank you!
    fogisland
  • Thanks for this article, it's helped me a ton. Shouldn't " be listed as one of the reserved characters also? Only because if you don't put \ in front of it, your string will end prematurely.
    RDSpinz
  • RDSpinz wrote:Shouldn't " be listed as one of the reserved characters also? Only because if you don't put \ in front of it, your string will end prematurely.


    You escape " (quotation mark) because of NSString literal, not because it is a special character to regex engine.
    Canopus
