Using rest-client to download a file to disk without loading it all in memory first

Question:

I am using rest-client to download a large page (around 1.5 GB in size). The retrieved value is stored in memory rather than saved to a file. As a result, my program crashes with failed to allocate memory (NoMemoryError).

But there is no need to keep this data in memory; it could be saved directly to disk.

I found "You can: (...) manually handle the response (e.g. to operate on it as a stream rather than reading it all into memory) See RestClient::Request's documentation for more information." on https://github.com/rest-client/rest-client. Unfortunately, after reading http://www.rubydoc.info/gems/rest-client/1.7.3/RestClient/Request I have no idea how this can be accomplished.

I am also aware that I could use another library (see "Using WWW:Mechanize to download a file to disk without loading it all in memory first"), but my program is already using rest-client.

Simplified code:

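require 'rest-client'

# The whole response body is read into memory here, before anything is written to disk.
data = RestClient::Request.execute(:method => :get, :url => url, :timeout => 3600)
File.open(filename, 'w') { |file| file.write(data) }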

Code: https://github.com/mkoniecz/CartoCSSHelper/blob/395deab626209bcdafd675c2d8e08d0e3dd0c7f9/downloader.rb#L126

Another way is to use raw_response. This saves the response directly to a file, usually in /tmp, and handles redirects without a problem. See Streaming Responses. Here's their example:

>> raw = RestClient::Request.execute(
           method: :get,
           url: 'http://releases.ubuntu.com/16.04.2/ubuntu-16.04.2-desktop-amd64.iso',
           raw_response: true)
=> <RestClient::RawResponse @code=200, @file=#<Tempfile:/tmp/rest-client.20170522-5346-1pptjm1>, @request=<RestClient::Request @method="get", @url="http://releases.ubuntu.com/16.04.2/ubuntu-16.04.2-desktop-amd64.iso">>
>> raw.file.size
=> 1554186240
>> raw.file.path
=> "/tmp/rest-client.20170522-5346-1pptjm1"
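
Once the download has finished, the temp file still has to end up at the path the question calls filename. A minimal sketch of that last step follows, assuming the raw response exposes its Tempfile through file as in the session above. save_raw_response is a hypothetical helper name, not part of rest-client.

```ruby
require 'fileutils'

# Hypothetical helper (not part of rest-client): moves the Tempfile behind a
# raw response to its final location, so the body is never held in memory.
def save_raw_response(raw, destination)
  # Make sure the handle is flushed and closed before the file is moved.
  raw.file.close unless raw.file.closed?
  FileUtils.mv(raw.file.path, destination)
  destination
end

# Usage with rest-client (assumes the gem is installed and that `url` and
# `filename` are defined as in the question):
#   require 'rest-client'
#   raw = RestClient::Request.execute(method: :get, url: url,
#                                     raw_response: true, timeout: 3600)
#   save_raw_response(raw, filename)
```

FileUtils.mv renames the file when source and destination are on the same filesystem, and falls back to copying and deleting otherwise, so moving a large download out of /tmp to another disk can take a while.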