SoFunction
Updated on 2025-03-02

A comparison of the main POSIX-style and Perl-compatible regular expression functions in PHP (ereg, ereg_replace, preg_match, preg_replace)

First, let’s take a look at the two main functions of POSIX-style regular expressions:

ereg function: (regular expression matching)

Format: int ereg ( string pattern, string string [, array &regs] )
Note: preg_match(), which uses Perl-compatible regular expression syntax, is usually a faster alternative to ereg(), and in most cases it is the better choice. The ereg family was deprecated in PHP 5.3.0 and removed in PHP 7.0.0.

Searches string, case-sensitively, for a substring that matches the regular expression given in pattern. If pattern contains parenthesized subpatterns and the third parameter regs is passed, the matches are stored in the regs array: $regs[1] contains the substring that starts at the first left parenthesis, $regs[2] the one that starts at the second, and so on. $regs[0] contains the entire matched string.

Return value: Returns the length of the matched string if a match for pattern is found in string, or FALSE if no match is found or an error occurs. If the optional parameter regs is not passed, or the length of the matched string is 0, this function returns 1.

Let’s take a look at the example of the ereg() function:

The following code snippet takes a date in ISO format (YYYY-MM-DD) and prints it in DD.MM.YYYY format:

<?php
if (ereg ("([0-9]{4})-([0-9]{1,2})-([0-9]{1,2})", $date, $regs)) {
echo "$regs[3].$regs[2].$regs[1]";
} else {
echo "Invalid date format: $date";
}
?>
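
Because ereg() is not available in PHP 7 and later, the same date check can be sketched with preg_match(). This is only an illustration: the function name formatIsoDate is made up for the sketch, and the ^ and $ anchors are an added assumption so that the whole input has to be a date.

```php
<?php
// Hypothetical helper: converts YYYY-MM-DD to DD.MM.YYYY, or returns null
// when the input does not look like an ISO date.
function formatIsoDate(string $date): ?string
{
    // PCRE patterns need delimiters (here "/"); ereg patterns did not.
    if (preg_match('/^([0-9]{4})-([0-9]{1,2})-([0-9]{1,2})$/', $date, $regs)) {
        return "$regs[3].$regs[2].$regs[1]";
    }
    return null;
}

echo formatIsoDate('2025-03-02') ?? 'Invalid date format', "\n";
```

The capture groups are numbered exactly as in the ereg() version, so $regs can be used the same way.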

-----------------------------------------------------------------------------------
ereg_replace function: (regular expression replacement)

Format: string ereg_replace ( string pattern, string replacement, string string )
Function description:
This function scans string for matches to pattern and replaces the matched text with replacement.
It returns the modified string. If no match is found, string is returned unchanged.
If pattern contains parenthesized substrings, replacement may contain substrings of the form \\digit, which are replaced by the text matching the digit-th parenthesized substring; \\0 produces the entire matched text. Up to nine substrings may be used. Parentheses may be nested, in which case they are counted by their opening parenthesis.
Let's take a look at some examples of this function:
1. The following code snippet outputs "This was a test" three times:

<?php
$string = "This is a test";
echo str_replace(" is", " was", $string);
echo ereg_replace("( )is", "\\1was", $string);
echo ereg_replace("(( )is)", "\\2was", $string);
?>
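
For PHP versions without ereg_replace(), the two regular expression calls above can be approximated with preg_replace(). The sketch below assumes nothing beyond the original $string; only the delimiters and the $n backreference spelling are new.

```php
<?php
$string = "This is a test";

// $1 and $2 are the preg_replace() spellings of \\1 and \\2.
$a = preg_replace('/( )is/', '$1was', $string);
$b = preg_replace('/(( )is)/', '$2was', $string);

echo $a, "\n", $b, "\n";
```

In the second call the outer parentheses are group 1 and the inner space is group 2, which is why $2 restores the space.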

One thing to note: if you pass an integer as the replacement parameter, you may not get the result you expect. This is because ereg_replace() interprets the number as the ordinal value of a character and inserts that character. For example:
2. Example when the replacement parameter is an integer:

<?php
/* Cannot produce the desired result */
$num = 4;
$string = "This string has four words.";
$string = ereg_replace('four', $num, $string);
echo $string; /* Output: 'This string has words.' */
/* This example works fine */
$num = '4';
$string = "This string has four words.";
$string = ereg_replace('four', $num, $string);
echo $string; /* Output: 'This string has 4 words.' */
?>
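
When porting such code to preg_replace(), one defensive habit is to cast the number to a string explicitly so the intent is visible. A minimal sketch, reusing the variables from the example above:

```php
<?php
$num = 4;
$string = "This string has four words.";

// The explicit cast makes it clear that the digit "4" is wanted,
// not a character code.
$result = preg_replace('/four/', (string) $num, $string);

echo $result, "\n";
```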

3. Replace URLs with hyperlinks:
$text = ereg_replace("[[:alpha:]]+://[^<>[:space:]]+[[:alnum:]/]",
"<a href=\"\\0\">\\0</a>", $text);
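
The same idea can be sketched with preg_replace(), since PCRE also accepts POSIX character classes such as [:alpha:] inside a bracket expression. The sample text and the example.com URL are made up for the illustration, and a real page would also need to HTML-escape the matched URL.

```php
<?php
// "~" is used as the delimiter so the slashes in the pattern need no escaping.
$pattern = '~[[:alpha:]]+://[^<>[:space:]]+[[:alnum:]/]~';

$text = 'Docs are at http://example.com/manual today.';
$linked = preg_replace($pattern, '<a href="$0">$0</a>', $text);

echo $linked, "\n";
```

$0 plays the role of \\0 here: it stands for the whole matched URL.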

Tip: preg_replace(), which uses Perl-compatible regular expression syntax, is usually a faster alternative to ereg_replace().
Now let's take a look at the two main Perl-compatible regular expression functions:
preg_match function: (perform a regular expression match)
Format: int preg_match ( string pattern, string subject [, array matches [, int flags]] )
Function description:
Searches subject for a match to the regular expression given in pattern.
If matches is provided, it is filled with the results of the search. $matches[0] contains the text that matched the whole pattern, $matches[1] contains the text that matched the first parenthesized subpattern, and so on.
flags can be the following flag:
PREG_OFFSET_CAPTURE
If this flag is set, the string offset of every match is returned along with the match. Note that this changes the returned array so that every element is itself an array, whose first element is the matched string and whose second element is its offset in subject.
The flags parameter, and with it this flag, is available from PHP 4.3.0.
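
A short sketch of what PREG_OFFSET_CAPTURE does to the matches array; the pattern and subject are arbitrary sample values:

```php
<?php
preg_match('/(foo)(bar)(baz)/', 'xfoobarbaz', $matches, PREG_OFFSET_CAPTURE);

// Each element is now [matched text, byte offset in the subject].
foreach ($matches as $i => [$text, $offset]) {
    echo "$i: '$text' at offset $offset\n";
}
```

Here the whole match starts at offset 1 because of the leading "x", and the second group starts at offset 4.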
preg_match() returns the number of times pattern matches. That is either 0 (no match) or 1, because preg_match() stops searching after the first match; preg_match_all(), by contrast, continues until it reaches the end of subject. If an error occurs, preg_match() returns FALSE.
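
The difference in return values can be seen in a small sketch with a sample subject:

```php
<?php
$subject = 'banana';

$first = preg_match('/an/', $subject);      // stops at the first match
$all   = preg_match_all('/an/', $subject);  // counts every match
$none  = preg_match('/xyz/', $subject);     // no match at all

var_dump($first, $all, $none);
```

Because 0 and FALSE are both falsy, code that needs to tell "no match" from "error" should compare the result with === false.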
Tip: If you only want to check whether one string is contained in another, do not use preg_match(). Use strpos() or strstr() instead, as they are much faster.
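
A minimal sketch of the strpos() approach, using the sentence from the examples in this section as sample input:

```php
<?php
$haystack = 'PHP is the web scripting language of choice.';

// strpos() returns the position (which may be 0) or false, so compare
// with !== false rather than relying on truthiness.
$containsWeb   = strpos($haystack, 'web') !== false;
$startsWithPhp = strpos($haystack, 'PHP') === 0;

var_dump($containsWeb, $startsWithPhp);
```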
Let's take a look at some examples:
Example 1. Search for "php" in the text:

<?php
// "i" after the pattern delimiter indicates a case-insensitive search
if (preg_match ("/php/i", "PHP is the web scripting language of choice.")) {
print "A match was found.";
} else {
print "A match was not found.";
}
?>

Example 2. Search for the word "web":

<?php
/* The \b in the pattern marks a word boundary, so only the standalone word "web" is matched,
 * not part of a longer word such as "webbing" or "cobweb" */
if (preg_match ("/\bweb\b/i", "PHP is the web scripting language of choice.")) {
print "A match was found.";
} else {
print "A match was not found.";
}
if (preg_match ("/\bweb\b/i", "PHP is the website scripting language of choice.")) {
print "A match was found.";
} else {
print "A match was not found.";
}
?>

Example 3. Extract the domain name from a URL:

<?php
// Get the host name from the URL
// (example.com is a placeholder URL used for illustration)
preg_match("/^(http:\/\/)?([^\/]+)/i",
"http://www.example.com/index.html", $matches);
$host = $matches[2];
// Get the last two segments of the host name
preg_match("/[^\.\/]+\.[^\.\/]+$/", $host, $matches);
echo "domain name is: {$matches[0]}\n";
?>

This example will output:
domain name is: example.com
-----------------------------------------------------------------------------------
preg_replace function: (perform a regular expression search and replace)
Format: mixed preg_replace ( mixed pattern, mixed replacement, mixed subject [, int limit] )
Function description:
Searches subject for matches to pattern and replaces them with replacement. If limit is specified, at most limit matches are replaced. If limit is omitted or is -1, all matches are replaced.
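
The effect of limit can be sketched with a sample subject:

```php
<?php
$subject = 'a-b-c-d';

$all      = preg_replace('/-/', '+', $subject);      // limit omitted
$firstTwo = preg_replace('/-/', '+', $subject, 2);   // at most two replacements
$explicit = preg_replace('/-/', '+', $subject, -1);  // -1 means no limit

echo "$all\n$firstTwo\n$explicit\n";
```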
replacement may contain backreferences of the form \\n or (as of PHP 4.0.4) $n, the latter being preferred. Each such reference is replaced by the text captured by the n-th parenthesized subpattern. n can range from 0 to 99, and \\0 or $0 refers to the text matched by the whole pattern. Subpatterns are numbered by counting their opening parentheses from left to right, starting at 1.
When a backreference is immediately followed by a digit in the replacement, the familiar \\1 notation cannot be used. For example, \\11 leaves preg_replace() unable to tell whether you want the \\1 backreference followed by a literal 1 or the \\11 backreference. The solution is to write \${1}1, which creates an isolated $1 backreference and leaves the trailing 1 as a literal.
Let's take a look at some examples:
Example 1. Using a backreference followed by a digit:

<?php
$string = "April 15, 2003";
$pattern = "/(\w+) (\d+), (\d+)/i";
$replacement = "\${1}1,\$3";
print preg_replace($pattern, $replacement, $string);
/* Output
======
April1,2003
*/
?>

If matches are found, the new subject is returned; otherwise subject is returned unchanged.
Every parameter of preg_replace() except limit can be an array. If pattern and replacement are both arrays, they are processed in the order in which their keys appear in the array, which is not necessarily the numerical order of the indices. If you use indices to pair each pattern with its replacement, sort both arrays with ksort() before calling preg_replace().
Example 2. Using indexed arrays with preg_replace():

<?php
$string = "The quick brown fox jumped over the lazy dog.";
$patterns[0] = "/quick/";
$patterns[1] = "/brown/";
$patterns[2] = "/fox/";
$replacements[2] = "bear";
$replacements[1] = "black";
$replacements[0] = "slow";
print preg_replace($patterns, $replacements, $string);
/* Output
======
The bear black slow jumped over the lazy dog.
*/
/* By ksorting patterns and replacements,
we should get what we wanted. */
ksort($patterns);
ksort($replacements);
print preg_replace($patterns, $replacements, $string);
/* Output
======
The slow black bear jumped over the lazy dog.
*/
?>

If subject is an array, the search and replace is performed on every element of subject, and an array is returned.
If pattern and replacement are both arrays, preg_replace() takes a value from each in turn and uses the pair to search and replace on subject. If replacement has fewer values than pattern, an empty string is used for the remaining replacements. If pattern is an array and replacement is a string, that string is used as the replacement for every pattern. The converse (a string pattern with an array replacement) would not make sense.
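
These array rules can be sketched as follows; all patterns and subjects are sample values chosen for the illustration:

```php
<?php
// Fewer replacements than patterns: the missing one becomes ''.
$short = preg_replace(['/quick /', '/brown /'], ['slow '], 'The quick brown fox');

// One string replacement shared by every pattern.
$shared = preg_replace(['/cat/', '/dog/'], 'pet', 'cat and dog');

// Array subject: each element is processed and an array comes back.
$each = preg_replace('/\d+/', '#', ['room 12', 'floor 3']);

var_dump($short, $shared, $each);
```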
The /e modifier makes preg_replace() treat the replacement parameter as PHP code after the backreferences have been substituted. Tip: make sure replacement forms a valid PHP code string, otherwise PHP reports a parse error on the line containing preg_replace(). Note that /e was deprecated in PHP 5.5.0 and removed in PHP 7.0.0; preg_replace_callback() is the replacement.
Example 3. Replace several values:

<?php
$patterns = array ("/(19|20)(\d{2})-(\d{1,2})-(\d{1,2})/",
"/^\s*{(\w+)}\s*=/");
$replace = array ("\\3/\\4/\\1\\2", "$\\1 =");
print preg_replace ($patterns, $replace, "{startDate} = 1999-5-27");
?>

This example will output:
$startDate = 5/27/1999
Example 4. Using the /e modifier:

<?php
preg_replace ("/(<\/?)(\w+)([^>]*>)/e",
"'\\1'.strtoupper('\\2').'\\3'",
$html_body);
?>

This converts all HTML tags in the input text to uppercase.
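
On PHP 7 and later, where /e is unavailable, the same transformation can be sketched with preg_replace_callback(). The sample $html_body value is an assumption made for the illustration.

```php
<?php
$html_body = '<p class="intro">Hello <b>world</b></p>';

// The callback receives the match array and returns the replacement text,
// so no string of PHP code has to be built and evaluated.
$upper = preg_replace_callback(
    '/(<\/?)(\w+)([^>]*>)/',
    function (array $m): string {
        return $m[1] . strtoupper($m[2]) . $m[3];
    },
    $html_body
);

echo $upper, "\n";
```

Only the tag names are uppercased; attributes and text content are passed through unchanged.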
Example 5. Convert HTML to text:

<?php
// $document should contain an HTML document.
// This example removes HTML tags, javascript code
// and whitespace characters. It also converts some
// common HTML entities into their text equivalents.
$search = array ("'<script[^>]*?>.*?</script>'si", // Remove javascript
"'<[\/\!]*?[^<>]*?>'si", // Remove HTML tags
"'([\r\n])[\s]+'", // Remove whitespace characters
"'&(quot|#34);'i", // Replace HTML entity
"'&(amp|#38);'i",
"'&(lt|#60);'i",
"'&(gt|#62);'i",
"'&(nbsp|#160);'i",
"'&(iexcl|#161);'i",
"'&(cent|#162);'i",
"'&(pound|#163);'i",
"'&(copy|#169);'i",
"'&#(\d+);'e"); // Run as PHP code
$replace = array ("",
"",
"\\1",
"\"",
"&",
"<",
">",
" ",
chr(161),
chr(162),
chr(163),
chr(169),
"chr(\\1)");
$text = preg_replace ($search, $replace, $document);
?>
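
The last rule in the list relies on the /e modifier, which no longer exists in PHP 7 and later. One way to port it, sketched here with only two of the static rules and a made-up $document, is to run the plain replacements through preg_replace() and move the numeric-entity rule into preg_replace_callback(). Like the original, the sketch uses chr(), so it is only suited to single-byte character codes.

```php
<?php
$document = '<p>Tom &amp; Jerry, &#65;&#66;C</p>';

// Static replacements go through preg_replace() as before.
$search  = ["'<[\/\!]*?[^<>]*?>'si", "'&(amp|#38);'i"];
$replace = ['', '&'];
$text = preg_replace($search, $replace, $document);

// The former "chr(\\1)" /e rule becomes a callback.
$text = preg_replace_callback(
    "'&#(\d+);'",
    function (array $m): string {
        return chr((int) $m[1]); // single-byte output, as in the original
    },
    $text
);

echo $text, "\n";
```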

The End…