SoFunction
Updated on 2025-04-11

RegexBuilder Study Guide in Swift

Preface

In day-to-day project development, we often need to work with regular expressions. For example, a user password is usually required to contain lowercase letters, uppercase letters, and digits, and to be at least 8 characters long, to improve security.

In Swift, this rule can be written as a regex literal.

Regex literal

Regex literal implementation code:

let regex = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)[a-zA-Z\d]{8,}$/
let text = "Aa11111111"
print(text.matches(of: regex).first?.output) // Optional("Aa11111111")

As the code above shows, a regex literal is written between two slashes (/.../). The literal form keeps the code very concise, but the price of that conciseness is that the pattern is hard to understand, which also makes the code much harder to maintain later.

Just like a joke circulating on the Internet: "I have a problem, so I wrote a regular expression. Now, I have two problems." 😂

To address the fact that regexes are hard to understand and maintain, the Swift team introduced RegexBuilder.
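As a rough sketch of what this looks like for the password rule above, the same check can be spelled out with RegexBuilder components. The function name isValidPassword is hypothetical and not part of the original article; Lookahead, Repeat, and CharacterClass are RegexBuilder types, and wholeMatch(of:) takes the place of the ^ and $ anchors.

```swift
import RegexBuilder

// Hypothetical helper: a builder-style version of the password literal.
func isValidPassword(_ candidate: String) -> Bool {
    let passwordRegex = Regex {
        // (?=.*[a-z]): a lowercase letter appears somewhere ahead
        Lookahead {
            ZeroOrMore(.any)
            One("a"..."z")
        }
        // (?=.*[A-Z]): an uppercase letter appears somewhere ahead
        Lookahead {
            ZeroOrMore(.any)
            One("A"..."Z")
        }
        // (?=.*\d): a digit appears somewhere ahead
        Lookahead {
            ZeroOrMore(.any)
            One(.digit)
        }
        // [a-zA-Z\d]{8,}: at least 8 letters or digits
        Repeat(8...) {
            CharacterClass(.digit, "a"..."z", "A"..."Z")
        }
    }
    // wholeMatch succeeds only if the entire string is consumed
    return candidate.wholeMatch(of: passwordRegex) != nil
}

print(isValidPassword("Aa11111111"))
print(isValidPassword("aaaaaaaa"))
```

The builder form is longer than the literal, but each requirement sits on its own labeled lines, which is the readability trade-off the article is pointing at.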

RegexBuilder - Write regexes like code

Suppose we have a string "name: John Appleseed, user_id: 100", and want to extract the value of user_id. First, import RegexBuilder:

import RegexBuilder

Then use the Regex structure to build the expression:

let regex = Regex {
    "user_id:" // 1
    OneOrMore(.whitespace) // 2
    Capture(.localizedInteger(locale: Locale(identifier: "zh-CN"))) // 3
}

The first line matches the fixed string "user_id:", the second line matches one or more whitespace characters, and the third line matches an integer and captures it.

localizedInteger automatically converts the matched digits into an integer, as in the following example:

let input = "user_id:  100.11"
let regex = Regex {
    Capture(.localizedInteger(locale: Locale(identifier: "zh-CN")))
}
if let match = input.firstMatch(of: regex) {
    print("Matched: \(match.0)") // Matched:  100.11
    print("User ID: \(match.1)") // User ID: 100
}

Although the matched text is 100.11, the captured output is still the integer 100.

Finally, the captured data can be read from the match returned for the original string:

let input = "name: John Appleseed, user_id: 100"
if let match = input.firstMatch(of: regex) {
    print("Matched: \(match.0)")
    print("User ID: \(match.1)")
}
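When locale-aware parsing is not needed, a similar typed capture can be sketched with TryCapture and a transform closure. This is an alternative shown for comparison, not the approach used in this article. The name userIDRegex is illustrative, and the match fails if the transform returns nil.

```swift
import RegexBuilder

let input = "name: John Appleseed, user_id: 100"

// Illustrative name; same shape as the regex above, without Foundation.
let userIDRegex = Regex {
    "user_id:"
    OneOrMore(.whitespace)
    TryCapture {
        OneOrMore(.digit)
    } transform: { digits in
        // Convert the captured Substring; nil makes the match fail
        Int(digits)
    }
}

if let match = input.firstMatch(of: userIDRegex) {
    let userID: Int = match.1   // the capture is already an Int
    print("User ID: \(userID)") // User ID: 100
}
```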

RegexRepetitionBehavior

This structure defines how a repetition behaves while matching. It has three values:

  • eager: matches as many characters as possible and backtracks when necessary. This is the default.
  • reluctant: matches as few characters as possible, extending the match one step at a time only as needed to complete the overall match.
  • possessive: matches as many characters as possible and never backtracks.

For example, the following example:

let testSuiteTestInputs = [
    "2022-06-06 09:41:00.001",
    "2022-06-06 09:41:00.001.",
    "2022-06-06 09:41:00.001."
]
let regex = Regex {
    Capture(OneOrMore(.any))
    Optionally(".")
}
for line in testSuiteTestInputs {
    // wholeMatch requires the regex to cover the entire line
    if let (_, dateTime) = line.wholeMatch(of: regex)?.output {
        print("Matched: \"\(dateTime)\"")
    }
}

Because the trailing "." is not always present in the data, the regex ends with Optionally("."). Yet the captured dateTime still includes the trailing ".": the eager OneOrMore(.any) consumes every character, including the final dot, so Optionally(".") never gets a chance to match it.

Changing the capture to Capture(OneOrMore(.any, .reluctant)) fixes the problem. Because reluctant matches as little input as possible, the final Optionally(".") gets to consume the trailing dot.
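The difference can be checked side by side with a minimal sketch. The names eager and reluctant are illustrative, and wholeMatch(of:) is assumed so that the regex has to cover the whole line.

```swift
import RegexBuilder

let line = "2022-06-06 09:41:00.001."

// Default (eager) repetition: the capture swallows the final dot.
let eager = Regex {
    Capture(OneOrMore(.any))
    Optionally(".")
}

// Reluctant repetition: the capture stops as early as the whole match allows.
let reluctant = Regex {
    Capture(OneOrMore(.any, .reluctant))
    Optionally(".")
}

if let match = line.wholeMatch(of: eager) {
    print(match.1) // 2022-06-06 09:41:00.001.
}
if let match = line.wholeMatch(of: reluctant) {
    print(match.1) // 2022-06-06 09:41:00.001
}
```

With the reluctant version the capture grows one character at a time until the rest of the pattern can finish the line, which leaves the trailing dot for Optionally(".").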

Since Swift 5.7, the Foundation framework also supports RegexBuilder, so for types such as Date and URL we can rely on Foundation's own parsers.

Foundation support

Suppose we are building a finance app and, to stay compatible with legacy data, need to convert some string data into structs.

This is our string data:

var statement = """
CREDIT    2022/03/03    Zhang San     ¥2,000,000.00
DEBIT     03/03/2022    Tom      $2,000,000.00
DEBIT     2022/03/05    Li San     ¥33.27
"""

This is the struct we want to convert it into:

struct Trade {
    let type: String
    let date: Date
    let name: String
    let count: Decimal
}

The following is the Regex we need to write:

let regex = Regex {
    Capture {
        /CREDIT|DEBIT/
    }
    OneOrMore(.whitespace)
    Capture {
        One(.date(.numeric, locale: Locale(identifier: "zh_CN"), timeZone: .gmt))
    }
    OneOrMore(.whitespace)
    Capture {
        // Reluctant .any lets a name contain spaces, e.g. "Zhang San"
        OneOrMore(.any, .reluctant)
    }
    OneOrMore(.whitespace)
    Capture {
        One(.localizedCurrency(code: "CNY", locale: Locale(identifier: "zh_CN")))
    }
}

First, we match one of the fixed strings CREDIT or DEBIT, followed by one or more whitespace characters.

Next comes the highlight: Foundation. For date strings we do not need to write our own rules for matching year, month, and day; we can use the parsers built into Foundation. This saves the time of writing them ourselves and, more importantly, the official implementation is more likely to be correct than a hand-written one.

Note that Apple recommends specifying the locale explicitly rather than relying on the system settings, as in the following form:

One(.date(.numeric, locale: Locale.current, timeZone: TimeZone.current))

With this form the result depends on the device's settings, so the same input can be parsed in more than one way and the outcome is not deterministic.
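To make the ambiguity concrete, here is a small sketch that parses one string with two explicit locales. The sample string and variable names are made up for illustration; the strategy is the same .date(.numeric, locale:timeZone:) used in this article, applied through Date(_:strategy:).

```swift
import Foundation

let raw = "03/04/2022"

// Same numeric style, two different explicit locales.
let usStrategy: Date.ParseStrategy = .date(.numeric, locale: Locale(identifier: "en_US"), timeZone: .gmt)
let gbStrategy: Date.ParseStrategy = .date(.numeric, locale: Locale(identifier: "en_GB"), timeZone: .gmt)

// try? yields nil instead of crashing if a strategy rejects the input.
let usDate = try? Date(raw, strategy: usStrategy)
let gbDate = try? Date(raw, strategy: gbStrategy)

// en_US reads numeric dates as month/day/year, en_GB as day/month/year,
// so the two results are expected to fall on different days.
print(usDate?.formatted(.iso8601.year().month().day()) ?? "not parsed")
print(gbDate?.formatted(.iso8601.year().month().day()) ?? "not parsed")
```

If the locale came from Locale.current instead, which of the two readings applies would depend on the device running the code.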

After the date come the whitespace and the user name. The last field is the transaction amount, which is also matched by a parser provided by Foundation.

Test code:

let result = statement.matches(of: regex)
var trades = [Trade]()
result.forEach { match in
    let (_, type, date, name, count) = match.output
    trades.append(Trade(type: String(type), date: date, name: String(name), count: count))
}
print(trades)
// [(type: "CREDIT", date: 2022-03-03 00:00:00 +0000, name: "Zhang San", count: 2000000), (type: "DEBIT", date: 2022-03-05 00:00:00 +0000, name: "Li San", count: 33.27)]

The printed output shows that the result does not meet expectations: the Tom record is missing. The reason is easy to spot in the code: we explicitly specified the Chinese format for both the date and the amount, and 03/03/2022 does not follow the year/month/day order. This also illustrates a benefit of specifying the format explicitly: problems are easy to track down.

To make the regex match, we only need to convert the date to year/month/day order and replace $ with ¥.

First, we need to return the correct date parse strategy for each currency:

func pickStrategy(_ currency: Substring) -> Date.ParseStrategy {
  switch currency {
  case "$": return .date(.numeric, locale: Locale(identifier: "en_US"), timeZone: .gmt)
  case "¥": return .date(.numeric, locale: Locale(identifier: "zh_CN"), timeZone: .gmt)
  default: fatalError("We found another one!")
  }
}

Next, write a regular expression that extracts the relevant fields:

let regex1 = #/
  (?<date>     \d{2} / \d{2} / \d{4})
  (?<name>   \P{currencySymbol}+)
  (?<currency> \p{currencySymbol})
/#

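Note: #/.../# is Swift's extended regex literal delimiter. When the opening #/ is followed by a newline, the literal can span multiple lines, and whitespace inside the pattern is ignored.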
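To see what the named groups in such a literal yield, here is a small sketch that applies the same pattern to a single line. The names line and fieldRegex are illustrative only.

```swift
let line = "DEBIT     03/03/2022    Tom      $2,000,000.00"

// Multi-line extended literal: whitespace inside the pattern is ignored.
let fieldRegex = #/
  (?<date>     \d{2} / \d{2} / \d{4})
  (?<name>     \P{currencySymbol}+)
  (?<currency> \p{currencySymbol})
/#

if let match = line.firstMatch(of: fieldRegex) {
    // Named groups are available as properties on the match.
    print(match.date)           // 03/03/2022
    print("[\(match.name)]")    // the name plus its surrounding whitespace
    print(match.currency)       // $
}
```

Because name is defined as "everything that is not a currency symbol", it also captures the padding around Tom, which is why the replacement step can reuse it unchanged.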

Finally, call the replace function to rewrite the matched text:

statement.replace(regex1) { match -> String in
    let date = try! Date(String(match.date), strategy: pickStrategy(match.currency))
    // ISO 8601, it's the only way to be sure
    let newDate = date.formatted(.iso8601.year().month().day())
    return newDate + match.name + "¥"
}
statement = statement.replacingOccurrences(of: "-", with: "/")

In this way, we can parse out Trade values that meet our needs.

Summary

  • RegexBuilder makes regex code easier to read and maintain.
  • RegexRepetitionBehavior has three values (eager, reluctant, possessive) that differ in how much they match and whether they backtrack.
  • Use the parsers provided by Foundation wherever possible.
  • When using Foundation, specify the format (locale, time zone) explicitly so that the data is parsed unambiguously.
