How to download an HTTP directory with all files and sub-directories as they appear in the online files/folders list?

Problem description:

There is an online HTTP directory that I have access to. I have tried to download all sub-directories and files via wget. But the problem is that when wget downloads a sub-directory, it downloads the index.html file, which contains the list of files in that directory, without downloading the files themselves. Is there a way to download the sub-directories and files without a depth limit (as if the directory I want to download were just a folder that I want to copy to my computer)?
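
For context, the naive attempt the question describes looks roughly like this (hostname/aaa/bbb/ccc/ddd is a placeholder path, reused in the solution below):

wget -r http://hostname/aaa/bbb/ccc/ddd/

Run this way, wget saves everything under a ./hostname/aaa/bbb/ccc/ddd/ tree, keeps each directory listing's index.html, and, since Apache listings link back to the parent directory, may also crawl above ddd. The flags in the solution below address each of these points.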

Solution

wget -r -np -nH --cut-dirs=3 -R index.html http://hostname/aaa/bbb/ccc/ddd/

Explanation:

  • It will download all files and subfolders in the ddd directory:
  • recursively (-r),
  • not going to upper directories, like ccc/… (-np),
  • not saving files to a hostname folder (-nH),
  • but to ddd, by omitting the first 3 folders aaa, bbb, ccc (--cut-dirs=3),
  • excluding index.html files (-R index.html).
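
The same command can also be written with long options. This is only a sketch using the same placeholder URL, with two hedged details folded in: plain -r stops at wget's default recursion depth of 5, so --level=inf is added to match the "no depth limit" requirement, and Apache listings often emit sorted-view copies such as index.html?C=M;O=A, which a wildcard reject pattern also catches:

wget --recursive --level=inf --no-parent --no-host-directories --cut-dirs=3 --reject "index.html*" http://hostname/aaa/bbb/ccc/ddd/

With -nH and --cut-dirs=3, a file at .../aaa/bbb/ccc/ddd/x/y ends up at ./ddd/x/y. Note that wget still downloads each index.html in order to scan it for links, and only deletes it afterwards because of the reject rule.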

Reference: http://bmwieczorek.wordpress.com/2008/10/01/wget-recursively-download-all-files-from-certain-directory-listed-by-apache/