How to use C# regular expressions to check whether a string is a valid file or folder path

The short version

// Requires: using System.Text.RegularExpressions;

/// <summary>
/// Checks whether the string is a valid file or folder path.
/// </summary>
/// <param name="val">The path to check.</param>
/// <returns>true if the path is valid; otherwise false.</returns>
public bool IsValidFolderPath(string val)
{
    Regex regex = new Regex(@"^([a-zA-Z]:\\)([-\u4e00-\u9fa5\w\s.()~!@#$%^&()\[\]{}+=]+\\?)*$");
    Match result = regex.Match(val);
    return result.Success;
}

// "F:\\TotalClient Project\\Client Project\\2017-01-09 Client\\(aa)\\V1.3.4\\New_1.2\\V13&V14\\.()~!@#$%^&()-+="; // Match result: true
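As a quick way to try the expression, here is a minimal sketch that wraps it in a static helper. The class name PathCheckDemo and the null check are assumptions added for illustration; the pattern itself is the one shown above.

```csharp
using System.Text.RegularExpressions;

public static class PathCheckDemo
{
    // Same expression as above, created once and reused for every call.
    private static readonly Regex FolderPath = new Regex(
        @"^([a-zA-Z]:\\)([-\u4e00-\u9fa5\w\s.()~!@#$%^&()\[\]{}+=]+\\?)*$");

    public static bool IsValidFolderPath(string val)
    {
        // Assumption: a null input is treated as invalid instead of throwing.
        return val != null && FolderPath.IsMatch(val);
    }
}
```

With this sketch, PathCheckDemo.IsValidFolderPath(@"F:\Client Project\V1.3.4") returns true, while @"Client Project\V1.3.4" (no drive letter) and @"F:\a?b" (a character outside the allowed set) return false.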

Explanation:

The expression has 2 main parts: one matches the drive letter, the other matches the file and folder path that follows.

^([a-zA-Z]:\\): The string must start with a drive letter.

^ anchors the match at the start of the string. [a-zA-Z] means the first character must be a letter, a~z or A~Z. :\\ means that letter must be followed by :\ (the \\ is the regex escape for a literal \).

([-\u4e00-\u9fa5\w\s.()~!@#$%^&()\[\]{}+=]+\\?)*$: What follows is built from a fixed set of allowed characters.

Look at the [] first. \u4e00-\u9fa5 matches Chinese characters; \w and \s are metacharacters, each with its own matching range. The remaining characters -.()~!@#$%^&()\[\]{}+= stand for themselves, where \[ is the escape for [ and \] is the escape for ].

[~]+ means the content of [] must appear at least once. \\? means that after those characters a single \ may or may not follow. (~)* means the content of () may repeat any number of times, including zero. $ anchors the match at the end of the string; together with the ^ at the start, it means the whole input string must conform to the expression.

A few notes:

Above, [~] stands for the entire content of the [] in the expression, and (~) for the entire content of the (), which should be easy to follow.
I wrote this explanation mainly as a summary for myself, so it may be hard to follow. You can use the expression as-is or learn the basics properly; the basics of regular expressions are enough to understand it.
Whether Chinese characters are matched differs between environments. For example, in C# \w matches Chinese characters, but in JavaScript \w does not.
The regex's own escapes and the escapes of the string it is placed in are easy to confuse, so be careful when writing them.
In my view, given only a path, it is impossible to tell whether it refers to a file or a folder, because a folder name can look like a file name with an extension, and a file name can have no extension. Windows file naming rules forbid 9 characters: / \ ? * : " < > |. Everything else is allowed.
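The 9 forbidden characters listed in the last note can themselves be written as one character class. The sketch below is only an illustration of that rule; WindowsNameCheck and IsValidName are hypothetical names, and the check covers a single file or folder name (not a full path) and nothing beyond the 9 characters.

```csharp
using System.Text.RegularExpressions;

public static class WindowsNameCheck
{
    // Character class with the 9 characters / \ ? * : " < > |
    // In a verbatim string, \\ is the regex escape for one backslash
    // and "" is the C# escape for one double quote.
    private static readonly Regex Forbidden = new Regex(@"[/\\?*:""<>|]");

    // Hypothetical helper: true when the name is non-empty and contains
    // none of the 9 forbidden characters.
    public static bool IsValidName(string name)
    {
        return !string.IsNullOrEmpty(name) && !Forbidden.IsMatch(name);
    }
}
```

For example, IsValidName("report.final.txt") is true, while IsValidName("a:b") and IsValidName("what?") are false.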

Learning how to write the validation

The several expressions I found online were poor: they did not work well and it was unclear what they were actually checking, so I had to work it out myself.

Learned from https:///tools/

Metacharacters

Character	Explanation	Equivalent
\b	Matches the beginning or end of a word, that is, a word boundary. Used to find a whole word exactly.
.	Matches any character except a newline.
*	The preceding item may repeat any number of times, including zero.
+	The preceding item must appear 1 or more times in a row, that is, at least once.
?	The preceding item may appear 0 or 1 times.
{x}	x is a number. The preceding item must repeat exactly x times.
{x,}	x is a number. The preceding item must repeat at least x times.
{x,y}	x and y are numbers. The preceding item must repeat between x and y times, inclusive.
(xxx)	Grouping.
[xyz]	Single-character match: one position that must be one of the listed characters.
\d	Matches a decimal digit, that is, 0~9.	[0-9]
\s	Matches any whitespace: space, tab, line break, Chinese full-width space, etc.
\w	Matches digits, letters, and underscores (in C#, also Chinese characters).	[a-z0-9A-Z_]
^	Matches the start position of the string.
$	Matches the end position of the string.
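The \w row is the one that behaves differently between environments, so a small check may help. This sketch assumes the .NET regex engine: by default \w also matches Chinese characters, and RegexOptions.ECMAScript restricts it to [a-zA-Z_0-9], which is the JavaScript-like behavior. MetacharDemo and its method names are hypothetical.

```csharp
using System.Text.RegularExpressions;

public static class MetacharDemo
{
    // True when every character of s is matched by \w (default .NET behavior).
    public static bool AllWordChars(string s) =>
        Regex.IsMatch(s, @"^\w+$");

    // Same check with RegexOptions.ECMAScript, where \w means [a-zA-Z_0-9] only.
    public static bool AllAsciiWordChars(string s) =>
        Regex.IsMatch(s, @"^\w+$", RegexOptions.ECMAScript);
}
```

For the input "\u4e2d\u6587_abc1" (two Chinese characters followed by _abc1), AllWordChars returns true and AllAsciiWordChars returns false.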

How to find the string 'hi' from a string?

Regex regex = new Regex("hi");
// Note: the hi inside words such as history and high will also be matched.

How to find the word hi exactly? Use \b

Regex regex = new Regex(@"\bhi\b");

This way, the word ‘hi’ can be found accurately.
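To see the difference between the two patterns, the following sketch counts the matches of each in the same text. WordBoundaryDemo and its method names are made up for this example.

```csharp
using System.Text.RegularExpressions;

public static class WordBoundaryDemo
{
    // Counts every occurrence of the letters "hi", even inside other words.
    public static int CountPlain(string text) =>
        Regex.Matches(text, "hi").Count;

    // Counts only "hi" standing as a whole word.
    public static int CountWholeWord(string text) =>
        Regex.Matches(text, @"\bhi\b").Count;
}
```

For the text "hi, this is history", CountPlain returns 3 (hi, this, history) and CountWholeWord returns 1.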

How to find hi, followed by any text, followed by Lucy?

Regex regex = new Regex(@"\bhi\b.*\bLucy\b");
// `.*` matches any number of characters that are not newlines, so the match cannot span lines.
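The newline restriction is easy to verify. The sketch below compares the default behavior with RegexOptions.Singleline, a .NET option that lets . match a newline as well; DotStarDemo and its method names are hypothetical.

```csharp
using System.Text.RegularExpressions;

public static class DotStarDemo
{
    // Default: . does not match \n, so hi and Lucy must be on the same line.
    public static bool SameLine(string text) =>
        Regex.IsMatch(text, @"\bhi\b.*\bLucy\b");

    // RegexOptions.Singleline: . also matches \n, so the match may span lines.
    public static bool AcrossLines(string text) =>
        Regex.IsMatch(text, @"\bhi\b.*\bLucy\b", RegexOptions.Singleline);
}
```

SameLine("hi, this is Lucy") is true, SameLine("hi there\nLucy") is false, and AcrossLines("hi there\nLucy") is true.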

How to match a Chinese phone number? The format used here is: xx-xxxxxxxxxxxxxxx (2 digits, a hyphen, then 15 digits)

Regex regex = new Regex(@"\d\d-\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d");

How to match a phone number starting with 188?

Regex regex = new Regex(@"\d\d-188\d\d\d\d\d\d\d\d\d\d\d\d");

If you had to match a 100-digit number, writing \d a hundred times would be impractical, so there has to be a shorter way to write it.

Regex regex = new Regex(@"\d{2}-188\d{12}");

Can \d{2} match inside 85555, the same way hi matches inside high?

Without any boundary, it will match. For an exact match, you must put the metacharacter \b before and after the pattern.

What if you want to match both long and short numbers with one expression?

Regex regex = new Regex(@"\d{2,3}-188\d{6,12}");
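Combining the range quantifiers with the \b advice above gives a stricter check. This is a sketch only; QuantifierDemo and the sample inputs are invented for illustration.

```csharp
using System.Text.RegularExpressions;

public static class QuantifierDemo
{
    // Matches anywhere, even inside a longer run of digits.
    public static bool ContainsNumber(string text) =>
        Regex.IsMatch(text, @"\d{2,3}-188\d{6,12}");

    // \b on both sides: the number must not be part of a longer digit run.
    public static bool ContainsExactNumber(string text) =>
        Regex.IsMatch(text, @"\b\d{2,3}-188\d{6,12}\b");
}
```

For "12345-188123456789", ContainsNumber is true (it matches 345-188123456789 inside the text), while ContainsExactNumber is false because the part before the hyphen has 5 digits. For "Call 010-188123456 now", both are true.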

Breaking down a phone-number expression: ^\(?0\d{2}[) -]?\d{8}

^: the string being checked must start with ( or 0.
\(?: \( is the escape for (, and ? means it appears once or not at all.
0\d{2}: a 0 followed by 2 digits.
[) -]?: after those digits comes either nothing or exactly 1 character, which must be one of ), space, or -.
\d{8}: 8 digits.

string phone = "(010)88886666";
string phone1 = "(010x88886666";
string phone2 = "011)-888866660000";
string phone3 = "011-888866660000";
string phone4 = "99011-88886666";
Regex regex30 = new Regex(@"^\(?0\d{2}[) -]?\d{8}");
Match resultphone = regex30.Match(phone);   // Match succeeds
Match resultphone1 = regex30.Match(phone1); // Match fails: after 010 comes a character that is not in []
Match resultphone2 = regex30.Match(phone2); // Match fails: the character matched by [] is not followed by 8 digits. The - is also in [], but [] only ever matches one position
Match resultphone3 = regex30.Match(phone3); // Match succeeds, because the expression has no $ to anchor the end
Match resultphone4 = regex30.Match(phone4); // Match fails, because ^ anchors the start, which must be ( or 0
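The phone3 case shows what the missing $ allows. As a sketch of the alternative, the version below adds $ so that trailing digits are rejected; PhoneAnchorDemo and its members are hypothetical names, and the second pattern is a variation for illustration, not the expression analyzed above.

```csharp
using System.Text.RegularExpressions;

public static class PhoneAnchorDemo
{
    // The expression analyzed above: anchored at the start only.
    private static readonly Regex Prefix = new Regex(@"^\(?0\d{2}[) -]?\d{8}");

    // Variation with $ added: the whole string must be the phone number.
    private static readonly Regex Whole = new Regex(@"^\(?0\d{2}[) -]?\d{8}$");

    public static bool MatchesPrefix(string phone) => Prefix.IsMatch(phone);

    public static bool MatchesWhole(string phone) => Whole.IsMatch(phone);
}
```

MatchesPrefix("011-888866660000") is true, MatchesWhole("011-888866660000") is false, and MatchesWhole("(010)88886666") is true.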

About ^ and $

I used to not understand what these two are for; in fact they simply restrict where a match may occur. Take the numeric string 123456789987645312 as an example.

Normally, 9\d{2} is used for matching. The condition can be read as "the string contains a 9 followed by any 2 digits". The match succeeds; it finds 998.

But if you add ^ and use ^9\d{2}, the condition becomes "the input string must start with a 9 followed by any 2 digits". The match fails.

Remove the 9 and use ^\d{2}, and the condition becomes "the input string must start with any 2 digits". The match succeeds.

If you use $ as in \d{2}$, the condition becomes "the input string must end with any 2 digits". The match succeeds.

If changed to 58$, the condition becomes "the input string must end with 58". The match fails.

If you use both ^ and $ as in ^\d{2}$, the condition becomes "the input string must consist of exactly 2 digits". The match fails.

Change it to ^\d{18}$, and the condition becomes "the input string must consist of exactly 18 digits". The match succeeds.

The code is as follows:

string numberStr2 = "123456789987645312";
Regex regex2 = new Regex(@"9\d{2}");
Match result2 = regex2.Match(numberStr2); // Match succeeds: 998

Regex regex3 = new Regex(@"^9\d{2}");
Match result3 = regex3.Match(numberStr2); // Match fails

Regex regex4 = new Regex(@"\d{2}$");
Match result4 = regex4.Match(numberStr2); // Match succeeds: 12

Regex regex5 = new Regex(@"58$");
Match result5 = regex5.Match(numberStr2); // Match fails

Regex regex6 = new Regex(@"^\d{2}$");
Match result6 = regex6.Match(numberStr2); // Match fails

Regex regex7 = new Regex(@"^\d{18}$");
Match result7 = regex7.Match(numberStr2); // Match succeeds: the whole string
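To inspect what each of these patterns actually captures, a small helper can return the matched text. This is a sketch: AnchorDemo, FirstMatch, and the "(no match)" placeholder are assumptions made for the example.

```csharp
using System.Text.RegularExpressions;

public static class AnchorDemo
{
    public const string Input = "123456789987645312";

    // Returns the text matched by pattern in Input,
    // or the placeholder "(no match)" when the match fails.
    public static string FirstMatch(string pattern)
    {
        Match m = Regex.Match(Input, pattern);
        return m.Success ? m.Value : "(no match)";
    }
}
```

FirstMatch(@"9\d{2}") returns "998" and FirstMatch(@"\d{2}$") returns "12", while FirstMatch(@"^9\d{2}"), FirstMatch(@"58$"), and FirstMatch(@"^\d{2}$") return the placeholder.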

About (), [], and {}

First, {}: there is not much to say about it. It specifies the number of repetitions, as in {2}, {2,}, {2,5}.

Second, [] is a single-character match. It stands for exactly 1 position, and the character at that position must be one of the options inside [].

I had a few questions about this description.

What's the use of [] alone?

// Str8 is the input string being searched.
Regex regex8 = new Regex(@"[beat]");
Match result8 = regex8.Match(Str8);

Regex regex9 = new Regex(@"[s]");
Match result9 = regex9.Match(Str8);

Regex regex10 = new Regex(@"[Blow S Black]");
Match result10 = regex10.Match(Str8);

Regex regex11 = new Regex(@"[black]");
Match result11 = regex11.Match(Str8);

Regex regex12 = new Regex(@"beat");
Match result12 = regex12.Match(Str8);

Regex regex13 = new Regex(@"Black Strike");
Match result13 = regex13.Match(Str8); // Match fails

Regex regex14 = new Regex(@"[Black S fight]");
Match result14 = regex14.Match(Str8); // Match succeeds: the first character of Str8 that appears in [] is found

// Used alone, [] scans the input string from start to end and finds the first character that matches any character in [].
// If [] contains only 1 character, is it exactly the same as writing that character without []? With several characters in [], it is different.
// I cannot think of a use case for [] alone; in practice it is mostly combined with other constructs.
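The difference between a character class and a literal sequence can be shown with the position of the first match. The sketch below uses an invented input, "black book"; BracketDemo and FirstIndex are hypothetical names.

```csharp
using System.Text.RegularExpressions;

public static class BracketDemo
{
    // Index of the first match of pattern in input, or -1 when there is none.
    public static int FirstIndex(string input, string pattern)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Index : -1;
    }
}
```

With the input "black book", the patterns "[k]" and "k" both return 4, so a one-character class behaves like the plain character. "[ko]" also returns 4, because the k is the first character that is either k or o, while the literal sequence "ko" returns -1 because those two letters never appear next to each other in that order.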

What if there are meta characters in []?

The solution is simple: escape.

But exactly how to escape is still a bit confusing. Not because it is difficult, but because you need to be aware of it and recognize it when you run into it.

What I find easy to confuse is the mix of C#'s own string escaping and regular-expression escaping.

First, let's clarify escaping in C#. There are two ways to escape in a C# string:

// Original text:
// aI'll call[]{}aa\bb"cc''dd^ee/dff
// another row
// '[', ']', '{', '}', ''', '^', '/' need no escaping; what must be escaped is '\', '"', and the line break.
// Option 1: put '\' in front of each character that needs escaping.
string stringStr1 = "aI'll call[]{}aa\\bb\"cc''dd^ee/dff\r\nanother row";
// Option 2: prefix the whole string with '@'. Then '\' and line breaks need no escaping.
// But '"' still has to be escaped, otherwise the string would end early; two double quotes "" stand for one literal ".
string stringStr2 = @"aI'll call[]{}aa\bb""cc''dd^ee/dff
another row";
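A short check confirms that the two escaping styles produce the same string. The sketch uses a shortened piece of the text above; StringEscapeDemo and its fields are hypothetical names.

```csharp
public static class StringEscapeDemo
{
    // Regular string: \\ is one backslash, \" is one double quote.
    public static readonly string Regular = "aa\\bb\"cc";

    // Verbatim string: \ stands for itself, "" is one double quote.
    public static readonly string Verbatim = @"aa\bb""cc";
}
```

Both fields hold the same 8 characters, aa\bb"cc, so StringEscapeDemo.Regular == StringEscapeDemo.Verbatim is true.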

Many symbols need escaping in regular expressions: every metacharacter must be escaped when it is meant literally. The good news is that there is only one way to escape, which is to put \ in front of the symbol.

Once these expressions are placed in a C# string, it gets hard to tell a string escape from a regex escape, or whether a regex escape has to be escaped again for the string, and it is easy to get lost.

Metacharacters include: ( ) [ ] { } \ ^ $ | ? * . + /. Whether / needs escaping is debatable. After checking, I found that different environments treat it differently (it matters where / delimits the expression, as in JavaScript regex literals, while in a C# string it has no special meaning), and I found no authoritative rule, so it depends on the situation.

// The example above matches a 9 followed by 2 digits.
// Regex regex15 = new Regex("9\d{2}"); // Does not compile: the '\' of the regex '\d' is also the escape character of C# strings. Written like this, C# treats '\d' as a string escape sequence, does not recognize it, and reports CS1009: Unrecognized escape sequence.
// The '\' has to be escaped, as above, with @ or with the other escape method:
Regex regex16 = new Regex(@"9\d{2}");
Regex regex17 = new Regex("9\\d{2}");

// Another example worth remembering: one position that may only be the character '\' or the character 'd'.
// It looks simple, just use []:
// Regex regex18 = new Regex("[\d]"); // Same error as above: CS1009: Unrecognized escape sequence.
// An escape is missing, so change it to:
Regex regex19 = new Regex("[\\d]"); // Compiles.
string str10 = @"aI'll call[]{}aa\bb""cc''dd^ee/dff\\";
Match result19 = regex19.Match(str10); // Match fails
// However, result19 is a failed match.
// This is where string escaping and regex escaping get mixed up.
// "\\d" only handles the '\' at the string level; it does not handle the fact that the '\' inside the regex [\d] must itself be escaped.
// For one position that may only be '\' or 'd', the correct regular expression is [\\d].
// When that is put into a C# string, each '\' must be escaped again, that is:
Regex regex20 = new Regex("[\\\\d]");
Match result20 = regex20.Match(str10);
// or
Regex regex21 = new Regex(@"[\\d]");
Match result21 = regex21.Match(str10);
// I was curious what happens if [] contains duplicate characters.
Regex regex22 = new Regex(@"[\\\\d]");
Match result22 = regex22.Match(str10);
// The result shows no difference; duplicate characters inside [] do not matter.
// I found out a bit more in later use, so I am adding it here. As mentioned above:
Regex regex19 = new Regex("[\\d]");
string str10 = @"aI'll call[]{}aa\bb""cc''dd^ee/dff\\";
Match result19 = regex19.Match(str10);
// It compiles, but the match fails. One question was left open: what exactly does new Regex("[\\d]") match?
// The answer is that it matches '\d', that is, any digit 0~9.
string str101 = @"aI'll call[]{}aa\bb""cc''dd^ee/dff\\";
string str102 = @"aI'll call[]{}aa\bb""cc''d9d^ee/dff\\"; // A 9 added in the middle
Regex regex191 = new Regex("[\\d]");
Match result191 = regex191.Match(str101); // Match fails
Match result192 = regex191.Match(str102); // Match succeeds: the 9 is found
// So [] matches more than the literal characters written inside it; it can also hold metacharacters such as \d and match everything they cover.
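The two escape levels can be made visible by looking at the pattern text the regex engine actually receives. This is a sketch with hypothetical names (EscapeLevelDemo, DigitClass, BackslashOrD); Regex.ToString() returns the pattern that was passed to the constructor.

```csharp
using System.Text.RegularExpressions;

public static class EscapeLevelDemo
{
    // The C# string "[\\d]" holds 4 characters: [\d]
    // The regex is a class containing \d, that is, any digit.
    public static readonly Regex DigitClass = new Regex("[\\d]");

    // The verbatim string @"[\\d]" holds 5 characters: [\\d]
    // The regex is a class containing a literal \ and the letter d.
    public static readonly Regex BackslashOrD = new Regex(@"[\\d]");
}
```

DigitClass matches "d9d" (it finds the 9) but not @"aa\bb dd", while BackslashOrD finds the \ in @"aa\bb" and does not match "123".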

[] combined with - represents a range of consecutive characters

Regex regex22 = new Regex("[0-3]"); // One position matches 0~3, that is 0, 1, 2, or 3

[] combined with ^ represents exclusion

Regex regex23 = new Regex("[^0-3]"); // One position matches any character except 0, 1, 2, 3

As a review: what if you want to match -, ^, or even [ and ] themselves?

Regex regex24 = new Regex("[\\^\\[\\]\\-]");
// or
Regex regex25 = new Regex(@"[\^\[\]\-]");

I actually tried these special characters here, and they can match even without escaping, which is puzzling.
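One likely explanation for the unescaped behavior is position: inside [], ^ only means exclusion when it is the first character, and - only forms a range when it sits between two characters. The sketch below checks this for ^ and - only (it makes no claim about [ and ]); ClassLiteralDemo and its fields are hypothetical names.

```csharp
using System.Text.RegularExpressions;

public static class ClassLiteralDemo
{
    // Escaped form, in the style of the examples above.
    public static readonly Regex Escaped = new Regex(@"[\^\-]");

    // Unescaped form: ^ is not first and - is last, so both are literal.
    public static readonly Regex Unescaped = new Regex("[+^-]");

    // Here ^ is first, so it negates: any character except -.
    public static readonly Regex NotHyphen = new Regex("[^-]");
}
```

Escaped and Unescaped both match "^" and "-", and neither matches "b". NotHyphen matches "x" but not "-".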

Finally, () has many uses, including limiting the scope of alternation, grouping, capturing text, lookaround, and special mode modifiers.

I feel the most basic uses are limiting alternation and grouping.

// Matches abc, bcd, or cde.
Regex regex26 = new Regex("(abc|bcd|cde)");

// Requires abc repeated 2 times in a row, that is, abcabc; abcaabc does not match.
Regex regex27 = new Regex("(abc){2}");
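The two patterns above, plus the capturing role of (), can be tried with a few inputs. This is a sketch; GroupDemo and its method names are invented, and the ^ and $ in the first method are added here so that the whole string must be one of the three alternatives.

```csharp
using System.Text.RegularExpressions;

public static class GroupDemo
{
    // () limits the alternation, so ^ and $ apply to all three options.
    public static bool IsOneOfThree(string s) =>
        Regex.IsMatch(s, "^(abc|bcd|cde)$");

    // () groups abc, so {2} repeats the whole group.
    public static bool HasDoubleAbc(string s) =>
        Regex.IsMatch(s, "(abc){2}");

    // () also captures: Groups[1] holds the text matched by the first group.
    public static string FirstAlternative(string s) =>
        Regex.Match(s, "(abc|bcd|cde)").Groups[1].Value;
}
```

IsOneOfThree("bcd") is true and IsOneOfThree("abcd") is false; HasDoubleAbc("xabcabcx") is true and HasDoubleAbc("abcaabc") is false; FirstAlternative("xxcdeyy") returns "cde".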

Summary

That is all for this article on using C# regular expressions to check whether a string is a valid file or folder path.