Using rest-client to download a file to disk without loading it all in memory first
I am using rest-client to download a large page (around 1.5 GB in size). The retrieved value is held in memory rather than saved to a file, and as a result my program crashes with failed to allocate memory (NoMemoryError).
But there is no need to keep this data in memory; it could be saved directly to disk.
I found "You can: (...) manually handle the response (e.g. to operate on it as a stream rather than reading it all into memory) See RestClient::Request's documentation for more information." on https://github.com/rest-client/rest-client. Unfortunately, after reading http://www.rubydoc.info/gems/rest-client/1.7.3/RestClient/Request I still have no idea how this can be accomplished.
I am also aware that I could use another library (see "Using WWW:Mechanize to download a file to disk without loading it all in memory first"), but my program already uses rest-client.
Simplified code:
data = RestClient::Request.execute(:method => :get, :url => url, :timeout => 3600)
file = File.new(filename, 'w')
file.write data
file.close
Another way is to use raw_response. This saves the body directly to a file, usually in /tmp, and handles redirects without a problem.
See Streaming Responses. Here's their example:
>> raw = RestClient::Request.execute(
method: :get,
url: 'http://releases.ubuntu.com/16.04.2/ubuntu-16.04.2-desktop-amd64.iso',
raw_response: true)
=> <RestClient::RawResponse @code=200, @file=#<Tempfile:/tmp/rest-client.20170522-5346-1pptjm1>, @request=<RestClient::Request @method="get", @url="http://releases.ubuntu.com/16.04.2/ubuntu-16.04.2-desktop-amd64.iso">>
>> raw.file.size
=> 1554186240
>> raw.file.path
=> "/tmp/rest-client.20170522-5346-1pptjm1"
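Once the download has finished, the temp file still has to end up at the filename the program wants. A minimal sketch of that last step, assuming the response object exposes the Tempfile through file as in the session above; the helper name persist_raw_response and its destination parameter are made up for illustration and are not part of rest-client:

```ruby
require 'fileutils'
require 'tempfile'

# Hypothetical helper (not part of rest-client): move the temp file
# behind a raw response to its final location without reading it into memory.
# `raw` is any object whose #file returns a Tempfile, such as the
# RestClient::RawResponse shown above.
def persist_raw_response(raw, destination)
  source = raw.file.path
  raw.file.close                     # make sure buffered data is flushed
  FileUtils.mv(source, destination)  # rename, or copy + delete across filesystems
  destination
end

# Intended use with rest-client (requires the gem and network access):
#   require 'rest-client'
#   raw = RestClient::Request.execute(method: :get, url: url,
#                                     timeout: 3600, raw_response: true)
#   persist_raw_response(raw, filename)
```

Moving the file rather than reading and rewriting it means that no more than a small copy buffer is held in memory, even when /tmp and the destination are on different filesystems.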