Wikipedia API: get the parsed introduction only
Question:
Using PHP, is there a nice way to get only the (parsed) introduction from a Wikipedia page?
I currently have two methods:
- The first is to call the API for the page text, then call the wiki parser on the introduction I have pulled from that first response (two requests, and extracting the intro from the text isn't pretty either).
- The second is to call the parser on the entire page and use XPath to retrieve every <p> tag before the table of contents.
With both methods I then have to re-parse the HTML to ensure that the relevant links inside the introduction point to Wikipedia.
Neither is really ideal. Is there a better way?
Answer:
The action=parse API module accepts a section number parameter (section). The lead is section number 0.
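A minimal PHP sketch of that single-request approach follows. The English Wikipedia endpoint, the function names, and the User-Agent string are assumptions made for illustration, not part of the original answer. It also assumes allow_url_fopen is enabled and that internal links in the returned HTML start with href="/wiki/", which is rewritten to an absolute URL with a plain string replace.

```php
<?php
// Build the API URL for the parsed lead section of a page.
// Assumption: the English Wikipedia endpoint; change it for other wikis.
function buildLeadUrl(string $title, string $endpoint = 'https://en.wikipedia.org/w/api.php'): string
{
    $params = [
        'action'  => 'parse',
        'page'    => $title,
        'section' => 0,       // the lead is section 0
        'prop'    => 'text',  // ask only for the parsed HTML
        'format'  => 'json',
    ];
    return $endpoint . '?' . http_build_query($params, '', '&');
}

// Pull the HTML out of the JSON response. The text is either a string
// or an array with a "*" key, depending on the response format version.
function extractLeadHtml(string $json): ?string
{
    $data = json_decode($json, true);
    $text = $data['parse']['text'] ?? null;
    if (is_array($text)) {
        $text = $text['*'] ?? null;
    }
    return is_string($text) ? $text : null;
}

// Turn relative /wiki/ links into absolute links (simple string replace,
// assuming double-quoted href attributes).
function absolutizeWikiLinks(string $html, string $base = 'https://en.wikipedia.org'): string
{
    return str_replace('href="/wiki/', 'href="' . $base . '/wiki/', $html);
}

// Fetch the lead in one request. Returns null on any failure.
// The User-Agent value is a hypothetical placeholder.
function fetchLeadHtml(string $title): ?string
{
    $context = stream_context_create([
        'http' => ['header' => "User-Agent: LeadFetcherExample/0.1 (example script)\r\n"],
    ]);
    $json = file_get_contents(buildLeadUrl($title), false, $context);
    if ($json === false) {
        return null;
    }
    $html = extractLeadHtml($json);
    return $html === null ? null : absolutizeWikiLinks($html);
}
```

Calling fetchLeadHtml('Albert Einstein') should then return the introduction as HTML, or null if the request or decoding fails. If the string replace proves too crude for the markup you get back, DOMDocument could be used for the link rewriting instead.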