Problem removing meta tags with the PHP DOM API
Problem description:
$html = new DOMDocument();
$html->loadHTMLFile($filename);
$meta = $html->getElementsByTagName("meta");
foreach($meta as $oldmeta_tags)
{
$parent = $oldmeta_tags->parentNode;
$parent->removeChild($oldmeta_tags);
}
echo "<br>Number of bytes stored = ".$html->saveHTMLFile($filename);
$result[] = file_get_contents($filename);
Some of the meta tags are removed and some are not. What am I doing wrong?
Answer
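The DOMNodeList returned by getElementsByTagName is live: it reflects the current state of the document. When you use foreach to iterate over it and remove an element, the list shrinks and the remaining nodes shift down one position, so some nodes are skipped. Iterate backwards instead: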
$nodes = $dom->getElementsByTagName('meta');
for ($i = $nodes->length - 1; $i >= 0; $i--) {
$nodes->item($i)->parentNode->removeChild($nodes->item($i));
}
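To see the difference for yourself, here is a minimal sketch that runs both loops against the same markup and reports how many meta tags survive each one. The sample markup and the helper name countMetaAfter are made up for illustration; they are not from the question.

```php
<?php
// Hypothetical sample markup; any document with several <meta> tags would do.
$markup = '<html><head>'
    . '<meta name="a" content="1"><meta name="b" content="2">'
    . '<meta name="c" content="3"><meta name="d" content="4">'
    . '<title>t</title></head><body></body></html>';

// Hypothetical helper: parse the markup, run the given removal strategy on
// the live list of <meta> elements, then count how many are still present.
function countMetaAfter(string $markup, callable $remove): int
{
    $dom = new DOMDocument();
    $dom->loadHTML($markup);
    $remove($dom->getElementsByTagName('meta'));
    return $dom->getElementsByTagName('meta')->length;
}

// Forward foreach over the live list (the approach from the question).
$forward = countMetaAfter($markup, function (DOMNodeList $nodes): void {
    foreach ($nodes as $node) {
        $node->parentNode->removeChild($node);
    }
});

// Backward index loop: removing the last match never shifts the ones before it.
$backward = countMetaAfter($markup, function (DOMNodeList $nodes): void {
    for ($i = $nodes->length - 1; $i >= 0; $i--) {
        $node = $nodes->item($i);
        $node->parentNode->removeChild($node);
    }
});

echo "left after foreach: $forward\n";
echo "left after backward loop: $backward\n";
```

The backward loop should report zero tags left. The foreach count is the one to watch: if it is above zero, you are hitting the skipping described above.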
Answer
You're looping over the node list and removing from it at the same time. The list is not a plain array; it is a live DOMNodeList that updates whenever the document changes.
Unfortunately, this means that every time you remove a child inside the loop, the next iteration skips a node. foreach does not compensate for the list changing underneath it.
Instead of foreach, loop by index, from the end:
$meta = $html->getElementsByTagName("meta");
for ($i = $meta->length - 1; $i >= 0; $i--) { // go backwards so removals don't shift the nodes still to be visited
$child = $meta->item($i);
$child->parentNode->removeChild($child);
}
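If you would rather keep foreach, another option is to copy the matches into a plain PHP array first and loop over the copy. This is only a sketch with made-up sample markup, and it swaps the backward loop for a different technique (taking a snapshot with iterator_to_array); the array does not change when nodes are removed from the document.

```php
<?php
// Hypothetical sample markup standing in for the file from the question.
$html = new DOMDocument();
$html->loadHTML(
    '<html><head><meta name="a" content="1"><meta name="b" content="2">'
    . '<title>t</title></head><body></body></html>'
);

// Snapshot the live DOMNodeList into an ordinary array of DOMElement objects.
$snapshot = iterator_to_array($html->getElementsByTagName('meta'));

// Looping over the array is safe: removing a node from the document
// does not alter the array being iterated.
foreach ($snapshot as $node) {
    $node->parentNode->removeChild($node);
}

$remaining = $html->getElementsByTagName('meta')->length;
echo "meta tags left: $remaining\n";
```

The snapshot costs one extra array, which is negligible for a handful of meta tags, and it keeps the loop body identical to the original code.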