Select all paragraphs that start with a specific text value using HTML Agility Pack

Problem description:

I am learning to use the Html Agility Pack.

I have a series of paragraph elements that look like this (code split for clarity):

<p class="rvps2">
    <img alt="New Version Icon" 
         style="vertical-align: middle; padding : 1px; margin : 0px 5px;"
         src="lib/IMG_NewVersion.png">
    <span class="rvts16">Version 21.1.0 - 2021 Edition</span>
    <span class="rvts15"> (22nd March 2021)</span>
</p>

I am only interested in the paragraphs that start with the text "Version". At the moment I am doing it like this:

// Select all paragraph elements that are direct children of nodeRevHist
var nodesParagraph = nodeRevHist.SelectNodes("p");

int iRevisionCount = 0;
foreach (HtmlNode itemParagraph in nodesParagraph)
{
    string text = itemParagraph.InnerText;
    // Count only paragraphs whose inner text begins with "Version"
    if (text.Length > 7 && text.Substring(0, 7) == "Version")
    {
        iRevisionCount++;
    }
}

Is it possible for nodesParagraph to be filtered to all paragraphs where the inner text starts with "Version"?

This would make my code cleaner if it is possible. Side question, I am also only interested in the first 5 of these paragraph elements.

Can this filtering be done?

You can get the first 5 paragraphs where the inner text starts with "Version" like this:

var nodesParagraph = nodeRevHist
    .Elements("p")
    .Where(p => p.InnerText.Trim().StartsWith("Version"))
    .Take(5);

Working demo here: https://dotnetfiddle.net/uvwcUN
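
For anyone who wants to try the filter outside the fiddle, here is a minimal self-contained sketch of the same LINQ approach. The sample markup, the `revhist` id, and the `Demo.FirstVersionTexts` helper are assumptions made up for illustration; they are not part of the original page or of Html Agility Pack. It also assumes the `p` elements are direct children of the container, as in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class Demo
{
    // Hypothetical sample markup modelled on the paragraph shown in the question.
    public const string SampleHtml = @"
<div id=""revhist"">
  <p class=""rvps2"">
    <span class=""rvts16"">Version 21.1.0 - 2021 Edition</span>
    <span class=""rvts15""> (22nd March 2021)</span>
  </p>
  <p class=""rvps2"">Some unrelated paragraph</p>
  <p class=""rvps2""><span class=""rvts16"">Version 20.0.0</span></p>
</div>";

    // Hypothetical helper: returns the trimmed inner text of the first `max`
    // <p> children of the container whose text starts with "Version".
    public static List<string> FirstVersionTexts(string html, string containerId, int max)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Assumption: the container is looked up by id; in the question it is
        // the existing nodeRevHist variable.
        HtmlNode nodeRevHist = doc.GetElementbyId(containerId);

        return nodeRevHist
            .Elements("p")                       // direct <p> children only
            .Select(p => p.InnerText.Trim())     // Trim drops the leading whitespace
            .Where(t => t.StartsWith("Version", StringComparison.Ordinal))
            .Take(max)
            .ToList();
    }

    public static void Main()
    {
        foreach (string text in FirstVersionTexts(SampleHtml, "revhist", 5))
        {
            Console.WriteLine(text);
        }
    }
}
```

The `Trim()` call matters here: in the markup from the question the paragraph's inner text begins with a line break and indentation before the span, so an untrimmed comparison against "Version" would not match. `Take(5)` is applied after `Where`, so it limits the matching paragraphs rather than the first five paragraphs overall.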