Extracting content from an HTML document with XPath in pure Java
Problem description:
I want to extract content from an HTML document using XPath expressions in Java. In Ruby, I can do this with Nokogiri, as shown here:
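require 'nokogiri'

xpath = '/html/body/div/div[2]/div[2]/div/div[2]/div[3]/p'
doc = Nokogiri::HTML(File.open('test_001_html64.html'))
# Print the text content of every node matched by the XPath expression
doc.xpath(xpath).each do |link|
  puts link.content
end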
I want to do the same thing in pure Java. I looked at Jsoup, but I couldn't find any documentation or examples that use XPath for this. Can someone suggest a way?
Thanks
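For reference, here is a rough sketch of what the Java equivalent of the Nokogiri loop might look like using only the JDK's built-in `javax.xml.xpath` and DOM APIs. It rests on a big assumption: the input must be well-formed XML (XHTML), because the JDK parser rejects the unclosed tags and other loose markup common in real HTML, so such pages would need to be cleaned up by a tolerant parser first. The class name `XPathExtract`, the helper `extract`, and the sample markup are hypothetical and for illustration only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathExtract {

    // Hypothetical helper: parses well-formed XHTML and returns the text
    // content of every node matched by the XPath expression.
    public static List<String> extract(String xhtml, String expression) {
        try {
            // Assumption: the markup is well-formed XML; ordinary "tag soup" HTML will fail here.
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xhtml)));

            // Evaluate the expression against the document and ask for a node set.
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);

            List<String> texts = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                texts.add(nodes.item(i).getTextContent());
            }
            return texts;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse input or evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        // Sample markup made up for this sketch, not the file from the question.
        String xhtml = "<html><body><div><p>first</p><p>second</p></div></body></html>";
        for (String text : extract(xhtml, "/html/body/div/p")) {
            System.out.println(text);
        }
    }
}
```

The loop over `NodeList` plays the role of `doc.xpath(xpath).each`, and `getTextContent()` corresponds to `link.content`.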