SoFunction
Updated on 2025-03-04

Extracting URLs, titles, images, etc. with regular expressions (.NET, ASP, JavaScript)

In tasks such as crawling and filtering, the advantages of regular expressions are obvious.
For example, there is a string like this:

<li><a href="http://localhost/Z-Blog18/article/" title="FCKEditor highlight code plug-in test"><span class="article-date">[09/11]</span>FCKEditor highlight code plug-in test</a></li>

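The goal is to extract three things: the URL in the href attribute, the date inside the square brackets, and the link text.
Implementations in C#, ASP, and JavaScript follow. All three use the same pattern, whose first group captures the URL without its "http://" prefix.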
Implementation of C#

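// Requires: using System; using System.Text.RegularExpressions;
string strHTML = "<li><a href=\"http://localhost/Z-Blog18/article/\" title=\"FCKEditor highlight code plug-in test\"><span class=\"article-date\">[09/11]</span>FCKEditor highlight code plug-in test</a></li>";
string pattern = "http://([^\\s]+)\".+?span.+?\\[(.+?)\\].+?>(.+?)<";
// The option was lost from the original listing; IgnoreCase is assumed here
// to mirror the case-insensitive ASP and JavaScript versions.
Regex reg = new Regex(pattern, RegexOptions.IgnoreCase);
MatchCollection mc = reg.Matches(strHTML);
if (mc.Count > 0)
{
    foreach (Match m in mc)
    {
        // In an ASP.NET page, Response.Write can be used instead of Console.WriteLine.
        Console.WriteLine(m.Groups[1].Value); // URL without the "http://" prefix
        Console.WriteLine(m.Groups[2].Value); // date inside [ ]
        Console.WriteLine(m.Groups[3].Value); // link text
    }
}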

Implementation of ASP

<%
Dim str, reg, objMatches
str = "<li><a href=""http://localhost/Z-Blog18/article/"" title=""FCKEditor highlight code plug-in test""><span class=""article-date"">[09/11]</span>FCKEditor highlight code plug-in test</a></li>"
Set reg = new RegExp
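reg.IgnoreCase = True
reg.Global = True
reg.Pattern = "http://([^\s]+)"".+?span.+?\[(.+?)\].+?>(.+?)<"
Set objMatches = reg.Execute(str)
If objMatches.Count > 0 Then
Response.Write("Website:")
Response.Write(objMatches(0).SubMatches(0))
Response.Write("<br>")
Response.Write("date:")
Response.Write(objMatches(0).SubMatches(1))
Response.Write("<br>")
Response.Write("title:")
Response.Write(objMatches(0).SubMatches(2))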
End If
%>

Implementation of JavaScript

<script type="text/javascript">
var str = '<li><a href="http://localhost/Z-Blog18/article/" title="FCKEditor highlight code plug-in test"><span class="article-date">[09/11]</span>FCKEditor highlight code plug-in test</a></li>';
var pattern = /http:\/\/([^\s]+)".+?span.+?\[(.+?)\].+?>(.+?)</gi;
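var mts = pattern.exec(str);
if (mts != null)
{
// mts[0] is the whole match; the pattern has only three capture groups.
alert(mts[1]); // URL without the "http://" prefix
alert(mts[2]); // date inside [ ]
alert(mts[3]); // link text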
}
</script>
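
Each example above reads only one match. When a page contains several such list items, the same pattern can be applied repeatedly. The following is a minimal sketch, not part of the original examples: the function name extractLinks, the sample string, and the object fields url, date and title are made up for illustration. It relies on the g flag, which lets exec() resume from where the previous match ended.

```javascript
// Hypothetical helper: collects every link that follows the
// <li><a href="http://..."><span>[date]</span>text</a></li> shape.
function extractLinks(html) {
  // Same pattern as in the examples above; g makes exec() advance through the input.
  var pattern = /http:\/\/([^\s]+)".+?span.+?\[(.+?)\].+?>(.+?)</gi;
  var results = [];
  var m;
  while ((m = pattern.exec(html)) !== null) {
    results.push({
      url: 'http://' + m[1], // group 1 excludes the prefix, so add it back
      date: m[2],            // text inside [ ]
      title: m[3]            // link text
    });
  }
  return results;
}

// Illustrative input with two items (assumed data).
var sample =
  '<li><a href="http://localhost/Z-Blog18/article/" title="First"><span class="article-date">[09/11]</span>First post</a></li>' +
  '<li><a href="http://localhost/Z-Blog18/other/" title="Second"><span class="article-date">[09/12]</span>Second post</a></li>';

var links = extractLinks(sample);
```

Note that the dot does not match line breaks in JavaScript unless the s flag is set, so this sketch assumes each list item sits on a single line.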