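Regex question from a beginner, hoping the experts can help
I'm a beginner looking for a regular expression for the HTML below. Experts, please help.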
<TR>
<TD>
<DIV class=dccss>
<a title="第1章抓周上" href="3530.html" target="_blank" >第1章抓周上</a>
</TD>
<TD>
<DIV class=dccss>
<a title="第2章抓周中" HrEf='3531.html'>第2章抓周中</a>
</DIV></TD>
<TD>
<DIV class=dccss>
<a target="_blank" hRef="3532.html">第3章抓周下</a>
</DIV></TD>
<TD>
<DIV class=dccss>
<a HreF='3533.html' target="_blank" title="第4章心思上">第4章心思上</a>
</DIV></TD></TR>
<TR>
<TD>
<DIV class=dccss>
<a title="第5章心思中" href="3534.html" target="_blank" >第5章心思中</a>
</TD>
<TD>
<DIV class=dccss>
<a title="第6章心思下" HrEf='3535.html'>第6章心思下</a>
</DIV></TD>
<TD>
<DIV class=dccss>
<a target="_blank" hRef="3536.html">第7章盘算</a>
</DIV></TD>
<TD>
<DIV class=dccss>
<a HreF='3537.html' target="_blank" title="第8章旧事">第8章旧事</a>
</DIV></TD></TR>
------Solution--------------------
I hadn't tested it earlier; I've now run it and it works.
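To try a pattern of this kind without downloading the page, here is a minimal sketch that runs it over four of the sample rows from the question. The `Program` class and the inlined `html` string are assumptions made for illustration only; the pattern is the one used in this reply, with group 1 capturing the href value and group 2 the link text.

```csharp
using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // Sample anchors copied from the question: mixed-case "href",
        // single or double quotes, attributes in varying order.
        const string html =
            "<a title=\"第1章抓周上\" href=\"3530.html\" target=\"_blank\" >第1章抓周上</a>\n" +
            "<a title=\"第2章抓周中\" HrEf='3531.html'>第2章抓周中</a>\n" +
            "<a target=\"_blank\" hRef=\"3532.html\">第3章抓周下</a>\n" +
            "<a HreF='3533.html' target=\"_blank\" title=\"第4章心思上\">第4章心思上</a>";

        // Group 1: href value; group 2: text between <a ...> and </a>.
        const string pattern = "<a[^>]*?href=['\"]([^<\\s]*)['\"][^>]*?>([^<]*)</a>";

        // IgnoreCase lets "href" match HrEf, hRef, HreF, and so on.
        foreach (Match m in Regex.Matches(html, pattern, RegexOptions.IgnoreCase))
        {
            Console.WriteLine(m.Groups[1].Value + " -> " + m.Groups[2].Value);
        }
    }
}
```

`Regex.Matches` returns every match at once, so this form needs no manual `NextMatch()` loop. For pages less regular than this sample, an HTML parser is usually a safer choice than a regular expression.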
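// Requires: using System; using System.Net; using System.Text.RegularExpressions;
var httpClient = new WebClient();
var page = httpClient.DownloadString("http://www.lwxs.org/books/0/20/index.html");
// Narrow the page to the chapter table so unrelated links are not matched.
// Note: the slice ends where the last <DIV class=dccss> starts, so an anchor inside that final cell falls outside it.
var lastIndex = page.LastIndexOf("<DIV class=dccss>", StringComparison.Ordinal);
var firstIndex = page.IndexOf("<DIV class=dccss>", StringComparison.Ordinal);
page = page.Substring(firstIndex, lastIndex - firstIndex);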
const string pattern = "<a[^>]*?href=['\"]([^<\\s]*)['\"][^>]*?>([^<]*)</a>";
var myRegex = new Regex(pattern, RegexOptions.IgnoreCase);
var myMatch = myRegex.Match(page);
while (myMatch.Success)
{
var link = "http://www.lwxs.org/books/0/20/" + myMatch.Groups[1].Value;
var title = myMatch.Groups[2].Value;