How to extract data from a website using JavaScript

Problem description:

Hi, complete newbie here, so bear with me. This seems like a simple job, but I can't find an easy way to do it.

I need to extract a particular piece of text from the webpage "www.example.com/index.php". I know the text is in a p tag with a certain id. How do I extract it using JavaScript?

What I'm currently trying: I have a JavaScript file (trying.js) on my computer with the following code:

$(document).ready(function () {
    $.get("www.example.com/index.php", function(data) {
        console.log(data);
    });
});

and an HTML file that runs the JavaScript file.

When I open this HTML page in Firefox, nothing shows up in the console. How do I get the website's data? Am I on the right track? Is there a better way to do this?

What you're looking for is a page scraper. JavaScript running in a browser page can't pull this off, because the same-origin policy only lets it read data from the domain the page itself was loaded from, unless the other site explicitly allows cross-origin requests (CORS).
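
To make the restriction concrete, here is a small sketch of how a browser would likely treat the request in the question. It assumes the test page is served from http://localhost:8000/test.html (a made-up address; the question doesn't say how the page is opened), and describeRequest is a hypothetical helper, not a browser or jQuery API. It only uses the standard URL class, so it should run under Node.js or in a browser console.

```javascript
// Hypothetical helper: resolve a requested URL against the page's URL,
// the way a browser does, and report whether both share an origin
// (scheme + host + port).
function describeRequest(pageUrl, requestedUrl) {
  const page = new URL(pageUrl);
  // The second argument is the base used to resolve relative URLs.
  const target = new URL(requestedUrl, page);
  return {
    resolved: target.href,
    sameOrigin: target.origin === page.origin,
  };
}

// The URL from the question has no scheme, so it resolves as a relative path.
console.log(describeRequest("http://localhost:8000/test.html", "www.example.com/index.php"));

// With an explicit scheme it points at the other site, which is cross-origin.
console.log(describeRequest("http://localhost:8000/test.html", "http://www.example.com/index.php"));
```

Under those assumptions, the URL without a scheme is treated as a relative path on the page's own site, so the request never reaches www.example.com. The full http:// URL does reach it, but it is cross-origin, so the browser won't let the script read the response unless that site opts in.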

You could build it in Ruby, for example, and use one of the many existing gems for this sort of task, such as https://github.com/assaf/scrapi or http://nokogiri.org/
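
Whatever language you pick, the scraper boils down to two steps: download the page outside the browser, then pull the text out of the p tag with the id you want. Below is a rough sketch of that idea, written in JavaScript to match the question and meant to run under Node.js rather than in a web page, where the browser's same-origin restriction doesn't apply. It is not how the gems above work. The function names are made up for illustration, the global fetch is assumed to be available (Node.js 18 or later), and the regex is a shortcut that only handles simple markup; a real HTML parser is the safer choice.

```javascript
// Hypothetical helper: return the text inside <p id="..."> from an HTML
// string, or null if no such paragraph is found. Regex-based, so it is
// only a sketch for simple, well-formed markup.
function extractParagraphById(html, id) {
  // Escape characters in the id that have a special meaning in a regex.
  const escapedId = id.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(
    `<p\\b[^>]*\\sid\\s*=\\s*["']${escapedId}["'][^>]*>([\\s\\S]*?)</p>`,
    "i"
  );
  const match = pattern.exec(html);
  if (!match) return null;
  // Drop any nested tags (e.g. <b>) and surrounding whitespace.
  return match[1].replace(/<[^>]+>/g, "").trim();
}

// Hypothetical helper: download a page and extract the paragraph.
// Assumes a runtime with a global fetch (Node.js 18+).
async function scrapeParagraph(url, id) {
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  const html = await response.text();
  return extractParagraphById(html, id);
}
```

You would call it with something like scrapeParagraph("http://www.example.com/index.php", "some-id").then(console.log), where "some-id" stands in for the real id of the p tag.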