Is there a way to enumerate all Taobao account IDs
Is there a way to enumerate all Taobao account IDs?
http://www.taodake.com/HistoryList_all_4.html
The list at this link has 500,000 records, but it is paginated! It would be great to pull them all out in one go!
------Solution--------------------
jsoup makes this easy: download jsoup.jar and it is ready to use. The code below may need further optimization.
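// Taodake.java
package com.joup;

import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class Taodake {
    // Page URLs look like http://www.taodake.com/HistoryList_all_<n>.html
    private static String url = "http://www.taodake.com/HistoryList_all_";
    private static int start_page = 1;
    private static int last_page = 10;

    public static void main(String[] args) {
        // Fetch each page in turn, including last_page itself
        for (int i = start_page; i <= last_page; i++) {
            Document doc;
            try {
                doc = Jsoup.connect(url + i + ".html").get();
            } catch (IOException e) {
                System.out.println("Page " + i + " failed");
                e.printStackTrace();
                // Skip this page so no null or stale Document reaches a thread
                continue;
            }
            // Parse the fetched page on its own thread
            TaoThread tao = new TaoThread(doc);
            tao.start();
        }
    }
}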
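// TaoThread.java
package com.joup;

import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// Extracts the IDs from one already-fetched page on its own thread
public class TaoThread extends Thread {
    // Always set by the constructor, so run() never sees a null Document
    private final Document doc;

    public TaoThread(Document doc) {
        this.doc = doc;
    }

    @Override
    public void run() {
        // The code selects the elements with class "id_td" and prints their text
        Elements ids = doc.getElementsByClass("id_td");
        for (Element id : ids) {
            System.out.println("Extracted id: " + id.text());
        }
    }
}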
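
As for the "may need further optimization" part: the code above starts one thread per page, which probably will not scale to the number of pages behind 500,000 records. One possible direction, as a rough sketch only, is a fixed-size thread pool that collects every ID into a single list in page order. The names PagedCrawler, pageUrl, crawl and the fetchIds parameter are made up for this illustration and are not part of the original answer or of jsoup. In real use fetchIds would wrap the Jsoup.connect(...).get() and getElementsByClass("id_td") calls shown above; the main method here uses a stub and a placeholder URL so the sketch runs without network access.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper, not from the original answer.
class PagedCrawler {
    // Builds the URL of one page: prefix + page number + ".html".
    public static String pageUrl(String prefix, int page) {
        return prefix + page + ".html";
    }

    // Fetches pages firstPage..lastPage (inclusive) using at most `threads`
    // worker threads and returns all extracted IDs in page order.
    // fetchIds is an assumed callback that maps a page URL to the IDs on that page.
    public static List<String> crawl(String prefix, int firstPage, int lastPage,
                                     int threads,
                                     Function<String, List<String>> fetchIds) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Queue one task per page; the pool limits how many run at once.
            List<Future<List<String>>> pending = new ArrayList<>();
            for (int page = firstPage; page <= lastPage; page++) {
                String target = pageUrl(prefix, page);
                Callable<List<String>> task = () -> fetchIds.apply(target);
                pending.add(pool.submit(task));
            }
            // Collect results in submission order, skipping pages that failed.
            List<String> all = new ArrayList<>();
            for (Future<List<String>> result : pending) {
                try {
                    all.addAll(result.get());
                } catch (ExecutionException e) {
                    System.err.println("Page failed: " + e.getCause());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
            return all;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Stub fetcher: pretends every page holds exactly one ID.
        Function<String, List<String>> stub =
                u -> Collections.singletonList("id-from-" + u);
        // Placeholder prefix; no request is actually sent.
        List<String> ids = crawl("http://example.invalid/list_", 1, 3, 2, stub);
        for (String id : ids) {
            System.out.println("Extracted id: " + id);
        }
    }
}
```

Passing the fetch step in as a function keeps the paging and threading logic separate from the HTML parsing, so the pool size and page range can be tuned without touching the jsoup code. Whether the site tolerates many parallel requests is a separate question, so a small thread count is probably the safer starting point.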