Reading HTML content from a UIWebView

Question:

Is it possible to read the raw HTML content of a web page that has been loaded into a UIWebView?

If not, is there another way to pull raw HTML content from a web page in the iPhone SDK (such as an equivalent of the .NET WebClient::openRead)?

The second question is actually easier to answer. Look at the stringWithContentsOfURL:encoding:error: method of NSString - it lets you pass in a URL as an instance of NSURL (which can easily be instantiated from NSString) and returns a string with the complete contents of the page at that URL. For example:

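NSString *googleString = @"http://www.google.com";
NSURL *googleURL = [NSURL URLWithString:googleString];
NSError *error = nil;  // start from nil so a stale pointer is never read
// Synchronous call: blocks the current thread until the fetch completes.
NSString *googlePage = [NSString stringWithContentsOfURL:googleURL
                                                encoding:NSASCIIStringEncoding
                                                   error:&error];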

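After running this code, googlePage will contain the HTML for www.google.com, or nil if the fetch failed. (Check whether googlePage is nil first; only in that case does error describe what went wrong.) Note that NSASCIIStringEncoding is only appropriate for pages that are pure ASCII; for most pages NSUTF8StringEncoding is the safer choice.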
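
The nil check described above could be wrapped up roughly as follows. This is only a sketch: FetchPageSource is a hypothetical helper name, not part of any SDK, and it assumes the page is UTF-8 encoded.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not an SDK function): fetches the text at `url`.
// Assumes the content is UTF-8. Returns nil on failure; in that case, and
// only then, *outError (if outError is non-NULL) describes the problem.
static NSString *FetchPageSource(NSURL *url, NSError **outError) {
    // Synchronous: blocks the calling thread until the fetch completes.
    NSString *page = [NSString stringWithContentsOfURL:url
                                              encoding:NSUTF8StringEncoding
                                                 error:outError];
    return page;
}

// Example caller: decides success by the return value, not by the error.
static void LogPageSource(NSURL *url) {
    NSError *error = nil;
    NSString *page = FetchPageSource(url, &error);
    if (page == nil) {
        NSLog(@"Fetch failed: %@", error);
        return;
    }
    NSLog(@"Fetched %lu characters", (unsigned long)[page length]);
}
```

Calling LogPageSource([NSURL URLWithString:@"http://www.google.com"]) would log either the length of the page or the error. Because the call is synchronous, it is best kept off the main thread in a real app.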

Going the other way (from a UIWebView) is a bit trickier, but is basically the same concept. You'll have to pull the request from the view, then do the fetch as before:

NSURL *requestURL = [[yourWebView request] URL];
NSError *error = nil;
// Fetches the URL a second time; page is nil if the fetch fails.
NSString *page = [NSString stringWithContentsOfURL:requestURL
                                          encoding:NSASCIIStringEncoding
                                             error:&error];

Both these methods take a performance hit, however, since they make the request a second time. You can get around this by grabbing the content from a currently-loaded UIWebView using its stringByEvaluatingJavaScriptFromString: method, like this:

NSString *html = [yourWebView stringByEvaluatingJavaScriptFromString: 
                                         @"document.body.innerHTML"];

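This evaluates the JavaScript expression inside the loaded page, reads the view's current HTML through the Document Object Model, and gives it to you as an NSString* of HTML. Note that document.body.innerHTML covers only the contents of the <body> element, and that it reflects the DOM as it currently stands (including any changes made by scripts), not the exact bytes the server sent.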
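
If you want the whole document rather than just the body, one option is to evaluate document.documentElement.outerHTML once the page has finished loading. The following is a rough sketch: PageSourceReader is a made-up class name, and it assumes an instance has been set as the web view's delegate. The result still omits the doctype declaration.

```objc
#import <UIKit/UIKit.h>

// Hypothetical delegate class (the name is illustrative only).
@interface PageSourceReader : NSObject <UIWebViewDelegate>
@end

@implementation PageSourceReader

// Called by UIWebView after the page has finished loading, so the DOM
// is available to query at this point.
- (void)webViewDidFinishLoad:(UIWebView *)webView {
    // outerHTML of the root element includes <html>, <head> and <body>.
    NSString *html = [webView stringByEvaluatingJavaScriptFromString:
                                  @"document.documentElement.outerHTML"];
    NSLog(@"Loaded %lu characters of HTML", (unsigned long)[html length]);
}

@end
```

To use it, keep a strong reference to a PageSourceReader and assign it to yourWebView.delegate before starting the load.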

Another way is to do your request programmatically first, then load the UIWebView from what you requested. Let's say you take the second example above, where you have NSString *page as the result of a call to stringWithContentsOfURL:encoding:error:. You can then push that string into the web view using loadHTMLString:baseURL:, assuming you also held on to the NSURL you requested:

[yourWebView loadHTMLString:page baseURL:requestURL];

I'm not sure, however, if this will run JavaScript found in the page you load (the method name, loadHTMLString, is somewhat ambiguous, and the docs don't say much about it).

More information:

  • UIWebView class reference
  • NSString class reference
  • NSURL class reference