Remove anchors from text
Question:
I need to remove anchor tags from some text, and I can't seem to do it with a regex.
Just the anchor tags, not their content.
For instance, <a href="http://www.google.com/" target="_blank">google</a> would become google.
Answer:
Exactly, it cannot be done properly using a regular expression.
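To illustrate the kind of problem a regex runs into, here is a minimal sketch. The function name naiveStrip and the pattern are hypothetical, picked only for illustration; they are not taken from the question, and other patterns fail in other ways.

```php
<?php
// Hypothetical naive attempt: delete anything that looks like <a ...> or </a>.
function naiveStrip(string $html): string
{
    return preg_replace('/<\/?a[^>]*>/i', '', $html);
}

// Problem 1: the pattern also matches other tags whose name starts with "a".
echo naiveStrip('<abbr>HTML</abbr>'), "\n";
// The <abbr> tags are removed as well, which was not intended.

// Problem 2: a ">" inside an attribute value ends the match too early.
echo naiveStrip('<a title="1 > 0" href="#">x</a>'), "\n";
// Part of the opening tag is left behind in the output.
```

Each such case can be patched with a more elaborate pattern, but the pattern then has to re-implement more and more of what an HTML parser already does.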
Here is an example using DOM:
// $html holds the markup to clean
$xml = new DOMDocument();
$xml->loadHTML($html);
$links = $xml->getElementsByTagName('a');
// Loop through the <a> tags, last to first, and replace each one with its text content
for ($i = $links->length - 1; $i >= 0; $i--) {
    $linkNode = $links->item($i);
    $lnkText = $linkNode->textContent;
    $newTxtNode = $xml->createTextNode($lnkText);
    $linkNode->parentNode->replaceChild($newTxtNode, $linkNode);
}
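It's important to loop backward whenever changes will be made to the DOM inside the loop. The list returned by getElementsByTagName() is live, so replacing a node while iterating forward shifts the remaining items and some links get skipped.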
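As a variation on the example above, here is a sketch that keeps any markup nested inside the link (textContent flattens it to plain text) and returns only the cleaned fragment rather than a whole document. The function name stripAnchors and the wrapper <div> are assumptions made for this illustration, not part of the original answer. It also assumes ASCII input; other encodings need extra handling when loading.

```php
<?php
// Hypothetical helper: removes <a> tags but keeps everything inside them.
function stripAnchors(string $html): string
{
    $doc = new DOMDocument();
    // Wrap the fragment in a <div> so its contents can be found again after
    // loadHTML() adds <html> and <body>. The @ silences parser warnings.
    @$doc->loadHTML('<div>' . $html . '</div>');
    $wrapper = $doc->getElementsByTagName('div')->item(0);

    $links = $doc->getElementsByTagName('a');
    // Loop backward because the node list is live.
    for ($i = $links->length - 1; $i >= 0; $i--) {
        $link = $links->item($i);
        // Move each child out in front of the link, then drop the empty link.
        while ($link->firstChild !== null) {
            $link->parentNode->insertBefore($link->firstChild, $link);
        }
        $link->parentNode->removeChild($link);
    }

    // Serialize only the children of the wrapper, not the whole document.
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo stripAnchors('<a href="http://www.google.com/" target="_blank">google</a>');
```

Moving the child nodes instead of copying their text is what preserves nested tags such as <b> inside a link.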