PHP retrieves only one string from an external div on a URL

Question:

I am trying to link my page to another website so that I can use their div tags to keep my site up to date.

After some research I have some code, but it echoes only one string, whereas the page contains multiple divs with that class and I would like to echo them all. Is this possible?

Here is the current code:

<?php
$url = 'http://www.domain.com';
$content = file_get_contents($url);
$activity = explode( '<div class="class">' , $content );
$activity_second = explode("</div>" , $activity );

echo $activity_second[0];
?>

I can echo $activity_second[0] which will display the first line and $activity_second[1] which will display the second line.

However, I am looking to expand this over to allow all of the div classes on the same page to be put into an array which can then be echo'd out into different parts of a table.

Thank you for your help in advance.

Answer

Let me see if I get it straight, you have something like this:

<div id="another-class"><div class="class">some text 1</div></div>
<div class="class">some text 2</div>
<div class="class">some text 3</div>
<div class="class">some text 4</div>
<div class="class">some text 5</div>
<div class="class">some text 6</div>

And you need the text contained in the div elements. If this is correct, replace:

$activity = explode( '<div class="class">' , $content );
$activity_second = explode("</div>" , $activity );

with this:

preg_match_all('#<div class="class">(.+?)</div>#', $content, $matches);

In this example, after the function call $matches will have the following:

Array
(
    [0] => Array
        (
            [0] => <div class="class">some text 1</div>
            [1] => <div class="class">some text 2</div>
            [2] => <div class="class">some text 3</div>
            [3] => <div class="class">some text 4</div>
            [4] => <div class="class">some text 5</div>
            [5] => <div class="class">some text 6</div>
        )

    [1] => Array
        (
            [0] => some text 1
            [1] => some text 2
            [2] => some text 3
            [3] => some text 4
            [4] => some text 5
            [5] => some text 6
        )

)

The data you need is in $matches[1].
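
Since the goal is to put the results into a table, here is a minimal sketch of one way to do that. The sample $content string and the table markup are assumptions made for illustration; in practice $content would come from file_get_contents($url) as in the question.

```php
<?php
// Hypothetical sample standing in for the fetched page.
$content = '<div id="another-class"><div class="class">some text 1</div></div>'
         . "\n" . '<div class="class">some text 2</div>'
         . "\n" . '<div class="class">some text 3</div>';

preg_match_all('#<div class="class">(.+?)</div>#', $content, $matches);

// $matches[1] holds only the captured text of each div.
$rows = '';
foreach ($matches[1] as $text) {
    // Escape the text before placing it into our own markup.
    $rows .= '<tr><td>' . htmlspecialchars($text) . "</td></tr>\n";
}

echo "<table>\n" . $rows . "</table>\n";
```

Each element of $matches[1] can equally be placed in a different cell or a different part of the page by indexing it directly, for example $matches[1][2].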

Answer

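The problem is probably that the second explode() is given the whole $activity array instead of one string at a time. Try the following after $activity:

$result = array();

// Skip the first chunk: it is everything before the first matching div.
foreach (array_slice($activity, 1) as $div) {
    // Keep only the text before the first closing tag.
    $handle = explode("</div>", $div);
    $result[] = $handle[0];
}
foreach ($result as $text) {
    echo $text;
}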

I'm sorry for the original reply, I misunderstood your question.

The regex way will work as well.

Answer

If your intention is to get the contents of all divs with that class name, you can use a regex that captures the strings between the tags of those divs:

preg_match_all('/<div class="class">([^<]+)<\/div>/', $content, $m);

print_r($m[1]);

Now $m[1] will be an array containing the inner text of each of those divs.
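
One caveat may be worth checking against the real page: [^<]+ cannot cross a "<", so a div that contains any nested tag is skipped entirely. The sketch below compares it with a lazy .+? pattern using the "s" modifier; the sample markup is made up for illustration, and neither pattern copes with divs nested inside the target div.

```php
<?php
// Hypothetical sample: the second div contains a nested <b> tag and line breaks.
$content = "<div class=\"class\">plain text</div>\n"
         . "<div class=\"class\">\n<b>bold text</b>\n</div>";

// [^<]+ stops at the first "<", so the second div is not matched at all.
preg_match_all('/<div class="class">([^<]+)<\/div>/', $content, $strict);

// .+? with the "s" modifier lets "." match newlines, so both divs match.
preg_match_all('/<div class="class">(.+?)<\/div>/s', $content, $loose);

print_r($strict[1]);
print_r(array_map('trim', $loose[1]));
```

If the target divs hold only plain text, the stricter pattern is fine; otherwise the lazy pattern or a parser is likely the safer choice.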

Answer

The rule is: when working with HTML, use a parser.

Assuming you have an HTML document like this:

$html = '<html>
<head><title>Untitled</title></head>
<body>
    <div class="class">
        <b>My Content 1</b>
    </div>
    <div class="class">
        <b>My Content 2</b>
    </div>
    <div class="class">
        <b>My Content 3</b>
    </div>
</body>
</html>';

load it into a DOMDocument object and initialize a DOMXPath object from the loaded HTML:

$dom = new DOMDocument();
libxml_use_internal_errors(true);  // suppress warnings caused by imperfect HTML
$dom->formatOutput = true;
$dom->loadHTML( $html );
$xpath = new DOMXPath( $dom );

and with this command you can access all <div class="class"> elements:

foreach( $xpath->query( '//div[@class="class"]' ) as $node )
{
    echo trim( $node->nodeValue ) . '<br>';
}

Your output:

My Content 1
My Content 2
My Content 3

If you want to echo each node as HTML, replace the echo statement with:

echo $dom->saveHTML( $node );

This will output:

<div class="class">
    <b>My Content 1</b>
</div>
<div class="class">
    <b>My Content 2</b>
</div>
<div class="class">
    <b>My Content 3</b>
</div>

Finally, if you want to echo only the innerHTML of the nodes, you can write something like this:

foreach( $xpath->query( '//div[@class="class"]' ) as $node )
{
    foreach ($node->childNodes as $child) 
    { 
        echo $dom->saveHTML( $child );
    }
}

and your output will be:

<b>My Content 1</b>
<b>My Content 2</b>
<b>My Content 3</b>
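
To get the results into an array, as the question asks, the pieces above could be wrapped up roughly as follows. This is only a sketch: divTextsByClass is a hypothetical helper name, the sample markup is invented, and the XPath expression assumes the class name contains no quotes or spaces. Unlike @class="class", it also matches divs whose class attribute lists several classes.

```php
<?php
/**
 * Hypothetical helper: returns the trimmed text of every div carrying the
 * given class, even when the class attribute lists several classes.
 */
function divTextsByClass(string $html, string $class): array
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true);   // real-world pages are rarely valid
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    // Match the class as a whole word inside the attribute value.
    $query = sprintf(
        '//div[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $class
    );

    $texts = [];
    foreach ($xpath->query($query) as $node) {
        $texts[] = trim($node->nodeValue);
    }
    return $texts;
}

// Sample markup; for the real page, pass file_get_contents($url) instead.
$html = '<html><body>
    <div class="class"><b>My Content 1</b></div>
    <div class="class highlighted"><b>My Content 2</b></div>
    <div class="other">ignored</div>
</body></html>';

$texts = divTextsByClass($html, 'class');
print_r($texts);
```

The returned array can then be indexed or looped over to fill the different parts of a table.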

Read more about DOMDocument
Read more about DOMXPath
Read why you can't parse [X]HTML with regex
