Problem removing meta tags with the PHP DOM API


Problem description:

$html = new DOMDocument();
$html->loadHTMLFile($filename);

$meta = $html->getElementsByTagName("meta");

foreach ($meta as $oldmeta_tags)
{
    $parent = $oldmeta_tags->parentNode;
    $parent->removeChild($oldmeta_tags);
}
echo "<br>Number of bytes stored = ".$html->saveHTMLFile($filename);
$result[] = file_get_contents($filename);

Some of the meta tags are removed and some are not. Please help: what am I doing wrong?


The DOMNodeList returned by getElementsByTagName is live: when you remove an element inside foreach, you change the list while you are iterating over it, so some nodes are skipped. Iterate backwards by index instead:

$nodes = $dom->getElementsByTagName('meta');
for ($i = $nodes->length - 1; $i >= 0; $i--) {
    $nodes->item($i)->parentNode->removeChild($nodes->item($i));
}

You're looping over the node list and removing from it at the same time.

Unfortunately, this means that every time you remove a child inside the loop, the next loop iteration skips a node. foreach is not "clever enough" in conjunction with DOMDocument to do this intelligently.
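To see the effect in isolation, here is a minimal sketch that loads markup from a string instead of a file. The sample markup and the variable names ($markup, $doc, $before, $afterForeach) are made up for illustration and are not from the question; it only counts how many meta elements survive the foreach loop.

```php
<?php
// Hypothetical sample markup; any document with several <meta> tags would do.
$markup = '<html><head>'
    . '<meta name="a" content="1">'
    . '<meta name="b" content="2">'
    . '<meta name="c" content="3">'
    . '<meta name="d" content="4">'
    . '<title>Demo</title></head><body><p>Hello</p></body></html>';

$doc = new DOMDocument();
$doc->loadHTML($markup);

$metas  = $doc->getElementsByTagName('meta');
$before = $metas->length;

// Same pattern as in the question: remove nodes while iterating the live list.
foreach ($metas as $node) {
    $node->parentNode->removeChild($node);
}

// Query again and count what is still in the document.
$afterForeach = $doc->getElementsByTagName('meta')->length;

echo "meta tags before: $before\n";
echo "meta tags left after foreach: $afterForeach\n";
```

If the second number printed is not zero, those are the nodes the loop skipped.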

Instead of foreach, use indexes:

$meta = $html->getElementsByTagName("meta");
for ($i = $meta->length - 1; $i >= 0; $i--) { // go backwards so removals do not shift the items still to visit
   $child = $meta->item($i);
   $child->parentNode->removeChild($child);
}
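If you would rather keep foreach, one possible alternative is to copy the live list into a plain array first and loop over that copy. The sketch below assumes a helper named removeElementsByTagName, which is hypothetical and not part of the DOM extension, and uses made-up sample markup; iterator_to_array is a standard PHP function.

```php
<?php
// Hypothetical helper (not part of the DOM extension): removes every element
// with the given tag name and returns how many were removed.
function removeElementsByTagName(DOMDocument $doc, string $tagName): int
{
    // Copy the live DOMNodeList into a plain array; the array does not
    // change when nodes are removed from the document.
    $snapshot = iterator_to_array($doc->getElementsByTagName($tagName));

    foreach ($snapshot as $node) {
        $node->parentNode->removeChild($node);
    }

    return count($snapshot);
}

// Made-up sample markup for illustration only.
$doc = new DOMDocument();
$doc->loadHTML(
    '<html><head>'
    . '<meta name="a" content="1">'
    . '<meta name="b" content="2">'
    . '<meta name="c" content="3">'
    . '<title>Demo</title></head><body><p>Hello</p></body></html>'
);

$removed   = removeElementsByTagName($doc, 'meta');
$remaining = $doc->getElementsByTagName('meta')->length;

echo "removed: $removed, remaining: $remaining\n";
```

The backwards index loop avoids the extra array; the snapshot version trades a little memory for keeping the foreach syntax.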