Scraping data with PHP Simple HTML DOM
I have a structure like this:
<tr>
<td>
<strong>Tel. nr.:</strong>
+370 000 000
<strong>Faksas:</strong>
+370 5 0000
</td>
</tr>
I'm new to Simple HTML DOM. I need to extract the values +370 000 000 and +370 5 0000. As far as I can see, this library does not support XPath, so how can I write a query that extracts the text following the <strong>Tel. nr.:</strong> tag?
The only way I have found is to take the HTML and use a regex to capture the text from </strong> up to the next <strong>, but maybe Simple HTML DOM has its own method for this?
Try it like this:
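<?php
// Path to the Simple HTML DOM library file; adjust it to wherever you saved it
// (the stock distribution names it simple_html_dom.php).
require('simple_parser.php');
$html = str_get_html('
<tr>
<td>
<strong>Tel. nr.:</strong>
+370 000 000
<strong>Faksas:</strong>
+370 5 0000
</td>
</tr>');
// Grab the first <td> and print all of its text, labels included.
$td = $html->find('td', 0);
echo $td->plaintext;
?>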
Post your full code to get a clearer answer.
You could use ->find('text') to get the text nodes:
$sample_html = '
<table>
<tr>
<td>
<strong>Tel. nr.:</strong>
+370 000 000
<strong>Faksas:</strong>
+370 5 0000
</td>
</tr>
</table>
';
$html = str_get_html($sample_html);
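foreach ($html->find('tr') as $row) {
    $first_td = $row->find('td', 0);
    // Text nodes are indexed in document order, label text included;
    // with this exact markup, index 2 is the phone and index 4 the fax.
    echo $first_td->find('text', 2);
    echo $first_td->find('text', 4);
}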
But this solution is rather clunky. Removing the newlines between the elements would shift the text-node indexes and yield a different result.
I suggest using DOMDocument with XPath instead:
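$dom = new DOMDocument;
$dom->loadHTML($sample_html);
$xpath = new DOMXPath($dom);
// Select only the text nodes that are direct children of the first cell.
$elements = $xpath->query('//tr[1]/td[1]/text()');
foreach ($elements as $e) {
    $text = trim($e->textContent);
    // Skip whitespace-only nodes (e.g. the newline before the first <strong>).
    if ($text !== '') {
        echo $text . '<br/>';
    }
}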
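If you want the value that follows one specific label, such as Tel. nr.:, rather than every text node in the cell, one option is to start from each <strong> and take its first following text sibling. This is only a minimal sketch under a few assumptions: the helper name extractLabelledValues() is made up for illustration, the markup is the sample from the question wrapped in a <table>, and every label is expected to be followed directly by its value.

```php
<?php
// Hypothetical helper (not part of any library): maps each <strong> label
// inside a <td> to the text node that directly follows it.
function extractLabelledValues(string $html): array
{
    // Collect libxml parse warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $dom = new DOMDocument;
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $values = [];
    foreach ($xpath->query('//td/strong') as $label) {
        // First text node after the label, evaluated relative to the label.
        $text = $xpath->query('following-sibling::text()[1]', $label)->item(0);
        if ($text !== null) {
            $values[trim($label->textContent)] = trim($text->textContent);
        }
    }
    return $values;
}

// Sample markup taken from the question.
$sample_html = '
<table>
<tr>
<td>
<strong>Tel. nr.:</strong>
+370 000 000
<strong>Faksas:</strong>
+370 5 0000
</td>
</tr>
</table>
';

$values = extractLabelledValues($sample_html);
echo $values['Tel. nr.:'], "\n";
echo $values['Faksas:'], "\n";
```

Because the lookup is keyed on the label text instead of a node index, it should keep working if the whitespace between the elements changes. If a label has no value after it, the query would pick up a later text node, so real pages may need an extra check.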