How to get page content using JavaScript or jQuery

Question:

I will have a widget on a remote page. In the widget, I want JavaScript or jQuery to get all the article content from the webpage and send it back to my website. I only need the article content, not the other information on the page. I would like the script to send the remote page's URL, the page content, the title text, and the h1 text. I do not want to receive any HTML tags. Is this possible?

The script I am making is similar to Google AdSense. Also, I'll be using C# for my backend server.

Will something like this work? http://blog.nparashuram.com/2009/08/screen-scraping-with-javascript-firebug.html

My suggestion, if it's not too much data, would be to use a beacon.

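// Create an image object; setting its src triggers a GET request to the server
var beac = new Image();
beac.onload = function () {
  // do something on completion
};
// Placeholder URL: replace with your own domain and endpoint
beac.src = "http://yourdomain/something.php?var=asdasd&key=someUniqueString";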

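This allows you to send a moderate amount of data to a server on another domain, provided you don't need anything back. The data travels in the query string of a GET request, so the payload is limited by the maximum URL length that browsers and servers accept.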
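
To tie this back to the question, here is a rough sketch of how the widget could gather the URL, title, h1 text, and article text and pack them into a beacon URL. It is only an illustration under some assumptions: the "article" selector, the endpoint "http://yourdomain/collect", the parameter names, and the helper names collectPageData, buildBeaconUrl, and sendPageData are all made up for this example, and the 1000-character content limit is an arbitrary guess at what fits in a URL. Reading textContent gives the text without any HTML tags.

```javascript
// Collect the fields the question asks for from a document-like object.
// Assumption: the article body is inside an <article> element; real pages
// will likely need a different selector per site.
function collectPageData(doc, articleSelector) {
  var textOf = function (el) {
    // textContent returns the text without HTML tags; collapse whitespace
    return el ? el.textContent.replace(/\s+/g, " ").trim() : "";
  };
  return {
    url: doc.location.href,
    title: doc.title,
    h1: textOf(doc.querySelector("h1")),
    content: textOf(doc.querySelector(articleSelector || "article"))
  };
}

// Build the beacon URL. Parameter names are hypothetical; the content is
// truncated because everything has to fit in a GET query string.
function buildBeaconUrl(endpoint, data, maxContentLength) {
  var limit = maxContentLength || 1000; // arbitrary default, tune as needed
  var keys = ["url", "title", "h1", "content"];
  var params = keys.map(function (key) {
    var value = String(data[key] || "");
    if (key === "content") {
      // Array.from splits by code point, so a surrogate pair is never cut in half
      value = Array.from(value).slice(0, limit).join("");
    }
    return key + "=" + encodeURIComponent(value);
  });
  return endpoint + "?" + params.join("&");
}

// Browser-only: fire the beacon for the current page.
function sendPageData(endpoint) {
  var beac = new Image();
  beac.src = buildBeaconUrl(endpoint, collectPageData(document));
  return beac;
}
```

In the widget you would call something like sendPageData("http://yourdomain/collect") once the page has loaded, and the C# backend would read the four values from the query string. If the article text is longer than a URL can carry, a beacon is probably not enough and you would need a different transport.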