XPath to find all following siblings up until the next sibling of a particular type
Given this XML/HTML:
<dl>
<dt>Label1</dt><dd>Value1</dd>
<dt>Label2</dt><dd>Value2</dd>
<dt>Label3</dt><dd>Value3a</dd><dd>Value3b</dd>
<dt>Label4</dt><dd>Value4</dd>
</dl>
I want to find all <dt> elements and then, for each one, find the following <dd> elements up until the next <dt>.
Using Ruby's Nokogiri I am able to accomplish this like so:
dl.xpath('dt').each do |dt|
  ct  = dt.xpath('count(following-sibling::dt)')
  dds = dt.xpath("following-sibling::dd[count(following-sibling::dt)=#{ct}]")
  puts "#{dt.text}: #{dds.map(&:text).join(', ')}"
end
#=> Label1: Value1
#=> Label2: Value2
#=> Label3: Value3a, Value3b
#=> Label4: Value4
However, as you can see, I'm creating a variable in Ruby and then composing an XPath expression with it. How can I write a single XPath expression that does the equivalent?
My guess was:
following-sibling::dd[count(following-sibling::dt)=count(self/following-sibling::dt)]
but apparently I don't understand what self means there.
This question is similar to "XPath : select all following siblings until another sibling", except that there is no unique identifier for the 'stop' node.
This question is almost the same as "xpath to find all following sibling adjacent nodes up til another type", except that I'm asking for an XPath-only solution.
One possible solution:
dl.xpath('dt').each_with_index do |dt, i|
  dds = dt.xpath("following-sibling::dd[not(../dt[#{i + 2}]) or " +
                 "following-sibling::dt[1]=../dt[#{i + 2}]]")
  puts "#{dt.text}: #{dds.map(&:text).join(', ')}"
end
This relies on a value comparison of dt elements and will fail when two dt elements have the same text. The following (much more complicated) expression does not depend on unique dt values; $n stands for the position of the next dt (the #{i + 2} interpolated above):
following-sibling::dd[not(../dt[$n]) or
(following-sibling::dt[1] and count(following-sibling::dt[1]|../dt[$n])=1)]
Note: Your use of self fails because you're not using it as an axis (self::). Written as a bare name, self is treated as a test for a child element called "self", which matches nothing here. Also, the self axis always contains just the context node, so inside the predicate it would refer to each dd inspected by the expression, not back to the original dt.