Regular expression replacement techniques in one article

1. Regular expression application - replace the specified content to the end of the line

The original text consists of the following two lines:

abc aaaaa
123 abc 444

The goal: wherever "abc" occurs, replace "abc" and everything after it up to the end of the line with "abc efg".

That is, the text above should become:

abc efg
123 abc efg

Solution:

① In the Replace dialog box, enter "abc.*" in the search field and "abc efg" in the replacement field.

② Check the "Regular expression" check box, then click the "Replace All" button.

The symbols have the following meanings:

"." = matches any single character

"*" = matches the preceding item 0 or more times

Note: this is ordinary regular expression replacement. Only a few commonly raised problems are collected here; regular expressions themselves can be extended to cover countless other cases.
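The same replacement can be sketched outside the editor. The snippet below is a minimal illustration using Python's re module as a stand-in for the EditPlus dialog (an assumption; the article itself only uses EditPlus). Like most engines, Python's "." does not match a line break by default, so ".*" stops at the end of each line.

```python
import re

# The two sample lines from the text above.
text = "abc aaaaa\n123 abc 444"

# "abc.*" matches "abc" plus everything up to the end of the line,
# because "." does not match the newline character by default.
result = re.sub(r"abc.*", "abc efg", text)

print(result)
```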

2. Regular expression application - number replacement

We want to change

asdadas123asdasdas456asdasdasd789asdasd

into:

asdadas[123]asdasdas[456]asdasdasd[789]asdasd

In the Replace dialog box, select the "Regular expression" check box.

In the search field, enter "([0-9])([0-9])([0-9])" without the quotation marks.

In the "Replace with" field, enter "[\1\2\3]" without the quotation marks.

Note: searching for "([0-9]+)" and replacing with "[\1]" is simpler and more general.

Set the scope to the range you want to operate on, then click Replace.

In fact, this is also a specific use of regular expressions. "[0-9]" matches any single character from 0 to 9, and "[a-z]" matches any single character from a to z.

Repeating "[0-9]" three times, as above, matches three consecutive digits.

Parentheses capture the text matched inside them as a group, so that it can be reused in the replacement.

"\1" stands for the text captured by the first "([0-9])", "\2" for the text captured by the second, and so on.

"[" and "]" in the replacement are plain characters, meaning that "[" and "]" are added literally. If you enter "other\1\2\3other" as the replacement instead, the result is:

asdadasother123otherasdasdasother456otherasdasdasdother789otherasdasd
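The group references described above can be tried in a few lines. This is a sketch in Python's re module (an assumption; EditPlus is the tool the article uses), where "\1", "\2", and "\3" in the replacement string refer to the captured groups in the same way.

```python
import re

s = "asdadas123asdasdas456asdasdasd789asdasd"

# Three separate groups, each capturing one digit.
three_groups = re.sub(r"([0-9])([0-9])([0-9])", r"[\1\2\3]", s)

# One group capturing a whole run of digits, as suggested in the note.
one_group = re.sub(r"([0-9]+)", r"[\1]", s)

print(three_groups)
print(one_group)
```

On this input both calls produce the same bracketed text; the single-group form also handles digit runs that are not exactly three characters long.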

Extension:

If you change the search content "[0-9][0-9][0-9]" to "[0-9]*[0-9]", it matches a number of any length, such as 1, 123, or 12345.

Adapt it as needed.

There is much more to the syntax; consult a regular expression reference and study it carefully.
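To see how "[0-9]*[0-9]" differs from three fixed digits, the following sketch runs both against a made-up sample string (the string "a1b22c12345" is hypothetical, and Python's re module stands in for the editor).

```python
import re

# "[0-9]*[0-9]" means: any number of digits, followed by one more digit.
pattern = re.compile(r"[0-9]*[0-9]")
samples = ["1", "123", "12345"]
matches_all = all(pattern.fullmatch(sample) for sample in samples)

text = "a1b22c12345"  # hypothetical input with digit runs of different lengths

# Exactly three digits: shorter runs are skipped, longer runs are split.
fixed_three = re.sub(r"([0-9])([0-9])([0-9])", r"[\1\2\3]", text)

# Variable length: every run of digits is wrapped as a whole.
any_length = re.sub(r"([0-9]*[0-9])", r"[\1]", text)

print(matches_all)
print(fixed_three)
print(any_length)
```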

3. Regular expression application - delete the specified characters at the end of each line

Because these characters also appear elsewhere in the line, a plain text replacement will not work.

For example:

12345 1265345
2345

We need to delete the "345" at the end of each line.

This is also a basic use of regular expressions and is simple once you have read the syntax carefully. Since the question was raised, though, it is worth walking through. The solution is as follows.

Solution:

In the Replace dialog box, enable the "Regular expression" check box.

Enter "345$" in the search field and leave the replacement field empty.

Here "$" anchors the match to the end of the line.
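A short sketch of the end-of-line anchor, again using Python's re module as an assumed stand-in for EditPlus. In Python, "$" only matches at the end of every line when the re.MULTILINE flag is set; an editor working line by line behaves this way already.

```python
import re

# The sample lines from the text above.
text = "12345 1265345\n2345"

# With re.MULTILINE, "$" matches at the end of each line, so only the
# trailing "345" is removed; the "345" inside "12345" is left alone.
result = re.sub(r"345$", "", text, flags=re.MULTILINE)

print(result)
```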

To match at the beginning of a line, use "^" instead. EditPlus also has another feature that simply deletes a string at the beginning of lines:

a. Select the lines to operate on

b. Edit - Format - Delete Line Comments

c. In the dialog box that pops up, enter the leading characters to remove from each line.
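The "^" anchor works the same way at the other end of the line. The sketch below removes a leading "// " prefix; the prefix and the sample text are hypothetical, and Python's re module is used in place of the EditPlus menu command.

```python
import re

# Hypothetical input: two lines start with "// ", one contains it mid-line.
text = "// first\n// second\nthird // not at the start"

# With re.MULTILINE, "^" matches at the start of each line, so only the
# prefixes in the first two lines are removed.
result = re.sub(r"^// ", "", text, flags=re.MULTILINE)

print(result)
```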

4. Regular expression application - replace multiple lines containing half-width parentheses

The following code appears in hundreds of web pages:

<script LANGUAGE=JavaScript1.1>
<!--
htmlAdWH('93163607', '728', '90');
//-->
</SCRIPT>

I wanted to remove it from all of them, but the many search-and-replace tools I tried could only operate on a single line.

EditPlus opens hundreds of web page files fairly smoothly, so it is fully up to this job.

Specific solution: use regular expressions in EditPlus. Because "(" and ")" mark subexpressions (groups), searching for

"<script LANGUAGE=JavaScript1.1>\n<!--\nhtmlAdWH('93163607', '728', '90');\n//-->\n</SCRIPT>\n"

reports that nothing was found, so nothing can be replaced. One workaround is to replace "(" and ")" with the any-character marker, the half-width period ".". Change the search content to

<script LANGUAGE=JavaScript1.1>\n<!--\nhtmlAdWH.'93163607', '728', '90'.;\n//-->\n</SCRIPT>\n

Enable the "Regular expression" option in the Replace dialog box, leave the replacement empty, and the replacement can be completed.

Addendum:

Special symbols such as ( and ) should be escaped as \( and \). This is standard regular expression syntax, so the search content can be written as

<script LANGUAGE=JavaScript1.1>\n<!--\nhtmlAdWH\('93163607', '728', '90'\);\n//-->\n</SCRIPT>\n
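The three variants above (unescaped, period workaround, escaped) can be compared in a small sketch. It uses Python's re module rather than EditPlus, and the surrounding page text is hypothetical. re.escape is a Python helper that escapes every metacharacter in a literal string, which has the same effect as writing \( and \) by hand.

```python
import re

# The script block from the article.
block = (
    "<script LANGUAGE=JavaScript1.1>\n"
    "<!--\n"
    "htmlAdWH('93163607', '728', '90');\n"
    "//-->\n"
    "</SCRIPT>\n"
)
# Hypothetical page that contains the block.
page = "<p>before</p>\n" + block + "<p>after</p>\n"

# Unescaped, "(" and ")" form a group instead of matching themselves,
# so the literal text is not found.
unescaped_hit = re.search(block, page)

# Workaround from the text: let "." stand in for each parenthesis.
dot_pattern = block.replace("(", ".").replace(")", ".")
cleaned_with_dots = re.sub(dot_pattern, "", page)

# Standard approach: escape the metacharacters in the literal text.
cleaned_with_escape = re.sub(re.escape(block), "", page)

print(unescaped_hit)
print(cleaned_with_escape)
```

The period workaround is looser than escaping, since "." would also accept any other character in those two positions.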

5. Regular expression application - delete empty lines

Start EditPlus and open the text file to be processed.

①. Select the "Replace" command in the "Find" menu to open the text replacement dialog box. Select the "Regular expression" check box to indicate that regular expressions are used in the search and replace. Then select "Current File" under "Replace Scope" to operate on the current file.

②. Click the button to the right of the "Find Content" combo box; a drop-down menu appears.

③. Build a regular expression that represents the blank line to be found. (Tip: a blank line contains only space characters, tab characters, and a line break; it starts with one of these three and ends with a line break. The key to finding blank lines is constructing a regular expression that represents one.)

You can also type the regular expression "^[ \t]*\n" directly into the search field; note the space character before \t. To build it from the drop-down menu instead:

(1) Select "Match from the beginning of the line"; the character "^" appears in the "Find Content" combo box, indicating that the string to be found must be at the beginning of a line.

(2) Select "Character in range"; a pair of square brackets "[]" is added after "^", with the insertion point inside the brackets. In regular expressions, square brackets define a character set: a character in the text that matches any character inside the brackets satisfies the search.

(3) Press the space bar to add a space character. The space character is one component of a blank line.

(4) Select "Tab" to add "\t", which represents the tab character.

(5) Move the insertion point to just after "]", then select "Match 0 times or more". This adds the asterisk "*", which means the space or tab characters in the preceding "[]" may appear zero or more times.

(6) Select "Line break" to insert "\n", which represents the line break.

④. Leave the "Replace with" combo box empty so that the found content is deleted. Click "Replace" to delete blank lines one at a time, or click "Replace All" to delete them all. (Note: EditPlus sometimes fails to delete every blank line with a single "Replace All". This may be a program bug; press the button a few more times.)
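The blank-line expression built in the steps above can be checked with a short sketch. Python's re module is assumed here in place of EditPlus, and the sample text is hypothetical; re.MULTILINE makes "^" match at the start of every line.

```python
import re

# Hypothetical text with an empty line, a line of spaces and a tab,
# and another empty line.
text = "a\n\n  \t\nb\n\nc\n"

# "^"      start of a line
# "[ \t]*" zero or more spaces or tabs
# "\n"     the line break that ends the blank line
result = re.sub(r"^[ \t]*\n", "", text, flags=re.MULTILINE)

print(result)
```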

6. Regular expression application - practical examples

1. Verify user name and password: (^[a-zA-Z]\w{5,15}$) Correct format: composed of [A-Z], [a-z], _ and [0-9]; the first character must be a letter and the total length is 6 to 16 characters;

2. Verify phone number: (^(\d{3,4}-)\d{7,8}$) Correct format: xxx/xxxx-xxxxxxx/xxxxxxxx;

3. Verify mobile phone number: ^1[34578]\d{9}$;

4. Verify the ID number (15 or 18 digits): ^(\d{15}|\d{17}[0-9xX])$;

5. Verify the email address: (^\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*$);

6. Only strings composed of numbers and 26 English letters can be entered: (^[A-Za-z0-9]+$);

7. Integer or decimal: ^[0-9]+([.][0-9]+){0,1}$

8. Only enter numbers: ^[0-9]*$.

9. Only n-digit numbers can be entered: ^\d{n}$.

10. Only numbers with at least n digits can be entered: ^\d{n,}$.

11. Only numbers with m to n digits can be entered: ^\d{m,n}$.

12. Only zero or numbers that do not start with zero can be entered: ^(0|[1-9][0-9]*)$.

13. Only positive real numbers with two decimal places can be entered: ^[0-9]+(\.[0-9]{2})?$.

14. Only positive real numbers with 1 to 3 decimal places can be entered: ^[0-9]+(\.[0-9]{1,3})?$.

15. Only non-zero positive integers can be entered: ^\+?[1-9][0-9]*$.

16. Only non-zero negative integers can be entered: ^\-[1-9][0-9]*$.

17. Only characters with length 3 can be entered: ^.{3}$.

18. Only strings composed of 26 English letters can be entered: ^[A-Za-z]+$.

19. Only strings composed of 26 capital English letters can be entered: ^[A-Z]+$.

20. Only strings composed of 26 lowercase English letters can be entered: ^[a-z]+$.

21. Verify whether it contains characters such as ^%&',;=?$\: [%&',;=?$\\^]+.

22. Only Chinese characters can be entered: ^[\u4e00-\u9fa5]{0,}$.

23. Verify a URL: ^http://([\w-]+\.)+[\w-]+(/[\w./?%&=-]*)?$.

24. Verify the 12 months of one year: ^(0?[1-9]|1[0-2])$ The correct format is: 01~09 and 10~12.

25. Verify the 31 days of a month: ^((0?[1-9])|((1|2)[0-9])|30|31)$ The correct format is: 01~09, 10~29, and 30~31.

26. Date regular expression: \d{4}(年|-|\.)\d{1,2}(月|-|\.)\d{1,2}日?

Comment: Can be used to match most year, month, and day information.

27. Match double-byte characters (including Chinese characters): [^\x00-\xff]

Comment: It can be used to calculate the length of a string (a double-byte character counts as 2, an ASCII character counts as 1)

28. Regular expression matching blank lines: \n\s*\r

Comment: Can be used to delete blank lines

29. Regular expression matching HTML tags: <(\S*?)[^>]*>.*?</\1>|<.*? />

Comment: The versions circulating online are poor. The one above matches only some cases and still cannot handle complex nested tags.

30. Regular expression matching the beginning and end whitespace characters: ^\s*|\s*$

Comment: It can be used to delete whitespace characters at the beginning and end of a line (including spaces, tabs, form feeds, etc.), a very useful expression

31. Regular expression matching URL: [a-zA-Z]+://[^\s]*

Comment: The versions circulating online are very limited; the one above basically meets the need

32. Check whether an account name is legal (starts with a letter, 5 to 16 characters, letters, digits, and underscores allowed): ^[a-zA-Z][a-zA-Z0-9_]{4,15}$

Comment: It is very practical when verifying the form

33. Match Tencent QQ number: [1-9][0-9]{4,}

Comment: Tencent QQ number starts at 10,000

34. Match the Chinese postal code: [1-9]\d{5}(?!\d)

Comment: China's postal code is 6 digits

35. Match IP address: (\d{1,3}\.){3}\d{1,3}

Comment: It is useful when extracting IP addresses; it does not check that each part is in the range 0 to 255

36. Match MAC address: ([A-Fa-f0-9]{2}:){5}[A-Fa-f0-9]{2}
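Validation patterns like the ones in this list are usually applied to a whole input string. The sketch below shows one possible way to do that in Python, using four of the expressions above (email, integer or decimal, month, account name). The PATTERNS table and the is_valid helper are hypothetical names introduced only for illustration.

```python
import re

# A few patterns copied from the list above.
PATTERNS = {
    "email": r"^\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*$",
    "integer_or_decimal": r"^[0-9]+([.][0-9]+){0,1}$",
    "month": r"^(0?[1-9]|1[0-2])$",
    "account": r"^[a-zA-Z][a-zA-Z0-9_]{4,15}$",
}


def is_valid(kind: str, value: str) -> bool:
    """Return True if the whole value matches the pattern registered for kind."""
    # fullmatch requires the entire string to match, which also rejects
    # a trailing newline that "$" alone would tolerate in Python.
    return re.fullmatch(PATTERNS[kind], value) is not None


print(is_valid("email", "user.name@example.com"))
print(is_valid("month", "13"))
print(is_valid("account", "a1234"))
```

Keeping the expressions in one table makes it easy to adjust a single rule without touching the checking code.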

Summary

That concludes this article on regular expression replacement techniques.