Find all hrefs in a page and replace each with a link that preserves the original link - PHP
Problem description:
I'm trying to find all href links on a webpage and replace the link with my own proxy link.
For example,
<a href="http://www.google.com">Google</a>
needs to become
<a href="http://www.example.com/?loadpage=http://www.google.com">Google</a>
Answer
Use PHP's DOMDocument to parse the page:
$doc = new DOMDocument();
// load the string into the DOM (this is your page's HTML), see below for more info
$doc->loadHTML('<a href="http://www.google.com">Google</a>');
// Loop through each <a> tag in the DOM and change its href attribute
foreach ($doc->getElementsByTagName('a') as $anchor) {
    $link = $anchor->getAttribute('href');
    $link = 'http://www.example.com/?loadpage=' . urlencode($link);
    $anchor->setAttribute('href', $link);
}
echo $doc->saveHTML();
See it running here: http://codepad.org/9enqx3Rv
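The loop above can be wrapped in a small reusable function. The following is only a minimal sketch and not part of the original answer: the name rewriteLinks and the $proxyPrefix parameter are hypothetical, and it assumes you want to skip <a> tags that have no href and to silence the warnings libxml raises on imperfect real-world markup.

```php
<?php
// Hypothetical helper (not from the original answer): rewrites every
// <a href="..."> in $html so that it points at $proxyPrefix . urlencode(original).
function rewriteLinks(string $html, string $proxyPrefix): string
{
    if (trim($html) === '') {
        return $html; // loadHTML() rejects an empty string, so return early
    }

    $doc = new DOMDocument();

    // Assumption: real pages are rarely valid HTML, so collect libxml
    // warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('a') as $anchor) {
        if (!$anchor->hasAttribute('href')) {
            continue; // named anchors such as <a name="top"> have nothing to rewrite
        }
        $original = $anchor->getAttribute('href');
        $anchor->setAttribute('href', $proxyPrefix . urlencode($original));
    }

    return $doc->saveHTML();
}
```

It could be called as rewriteLinks($html, 'http://www.example.com/?loadpage='). Note that loadHTML, when called without extra flags, wraps a fragment in a doctype and html/body tags, so the returned string is a full document rather than the bare fragment.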
If you don't have the HTML as a string, you can use cURL (see Documentation below) to fetch it, or you can use the loadHTMLFile method of DOMDocument.
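Both options might look roughly like the sketch below. The helper names fetchHtml and loadDocument are hypothetical, the timeout and redirect settings are assumptions rather than anything the answer prescribes, and passing a URL to loadHTMLFile is assumed to depend on allow_url_fopen being enabled on your PHP installation.

```php
<?php
// Option 1 (hypothetical helper): fetch the page's HTML as a string with cURL.
// Returns null if the request fails or the body is empty.
function fetchHtml(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow HTTP redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // assumed limit: give up after 10 seconds
    $html = curl_exec($ch);

    return (is_string($html) && $html !== '') ? $html : null;
}

// Option 2 (hypothetical helper): let DOMDocument read the file (or URL) itself.
// Returns null if the document could not be loaded.
function loadDocument(string $pathOrUrl): ?DOMDocument
{
    $doc = new DOMDocument();

    // Collect libxml parse warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $ok = $doc->loadHTMLFile($pathOrUrl);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $ok ? $doc : null;
}
```

With option 1, the returned string goes to $doc->loadHTML() as in the answer's code; with option 2, the returned DOMDocument can be looped over directly with getElementsByTagName('a').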
Documentation
- DOMDocument: http://php.net/manual/en/class.domdocument.php
- DOMElement: http://www.php.net/manual/en/class.domelement.php
- DOMElement::getAttribute: http://www.php.net/manual/en/domelement.getattribute.php
- DOMElement::setAttribute: http://www.php.net/manual/en/domelement.setattribute.php
- urlencode: http://php.net/manual/en/function.urlencode.php
- DOMDocument::loadHTMLFile: http://www.php.net/manual/en/domdocument.loadhtmlfile.php
- cURL: http://php.net/manual/en/book.curl.php