Scraping with Google Cloud Functions finishes with status code 304
I am trying out Google Cloud Functions. My function works, but it finishes with status code 304 and I'm not sure why. Here is the code:
//gcloud beta functions deploy scrapeGitCollection --trigger-http
var cheerio = require('cheerio');
var request = require('request');
function getDateTime() {
  var date = new Date();
  var hour = date.getHours();
  hour = (hour < 10 ? "0" : "") + hour;
  var min = date.getMinutes();
  min = (min < 10 ? "0" : "") + min;
  var sec = date.getSeconds();
  sec = (sec < 10 ? "0" : "") + sec;
  var year = date.getFullYear();
  var month = date.getMonth() + 1;
  month = (month < 10 ? "0" : "") + month;
  var day = date.getDate();
  day = (day < 10 ? "0" : "") + day;
  return year + "/" + month + "/" + day + " " + hour + ":" + min + ":" + sec;
}
var scrape = new Promise((resolve, reject) => {
  var text;
  var array = [];
  request({
    method: 'GET',
    url: 'https://github.com/collections'
  }, function(err, response, body) {
    if (err) return reject(err);
    // Tell Cheerio to load the HTML
    var $ = cheerio.load(body);
    $('.col-10 h2 a').each(function(i, element) {
      var node = $(this);
      text = node.text();
      //console.log(text);
      array.push(text);
    });
    text = JSON.stringify(array);
    console.log(text);
    resolve(text);
  });
});
// [START functions_helloworld_http]
/**
 * HTTP Cloud Function.
 * @param {Object} req Cloud Function request context.
 * @param {Object} res Cloud Function response context.
 *
 */
exports.scrapeGitCollection = (req, res) => {
  console.log('Triggered @ ' + getDateTime());
  scrape.then((data) => {
    res.send(`Hello ${data || 'World'}!`);
  }).catch((errorMessage) => {
    console.error(errorMessage);
  });
};
// [END functions_helloworld_http]
This is the log I see in Stackdriver:
2018-07-05 22:19:39.181 IST
scrapeGitCollection
x6mbdi17rdj7
Function execution took 7 ms, finished with status code: 304
{
insertId: "000000-4f8925ba-a5a5-4f0e-9c32-906e91374e32"
labels: {…}
logName: "projects/btd-in-16062018/logs/cloudfunctions.googleapis.com%2Fcloud-functions"
receiveTimestamp: "2018-07-05T16:49:45.425756526Z"
resource: {…}
severity: "DEBUG"
textPayload: "Function execution took 7 ms, finished with status code: 304"
timestamp: "2018-07-05T16:49:39.181731257Z"
}
For debugging purposes, I had added the line below inside scrape, but it did not appear in the log:
console.log(text);
If I explicitly set the status as below, I get the console.log output above along with the results:
res.status(200).send(`Hello ${data || 'World'}!`);
I don't want to force status(200). I would like to resolve the 304 that I was getting earlier.
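According to Mozilla's documentation, the 304 (Not Modified) status code means the requested resource does not need to be retransmitted; it is an implicit redirection to a cached resource. I believe this happens because the body of the response is always the same: the req and res objects in HTTP functions are Express-style, and Express attaches an ETag derived from the body, so a repeat request carrying a matching If-None-Match header is likely answered with 304. You can check this by changing the response line to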
res.send(`Hello ${data || 'World'}!` + getDateTime());
then you'll see that the response code is 200 every time.
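To make the caching behaviour concrete, here is a minimal sketch of how an ETag-based conditional GET decides between 200 and 304. The helpers makeEtag and respond are hypothetical names invented for this illustration; they are a simplification and not the actual Express or Cloud Functions implementation.

```javascript
// Simplified sketch of ETag-based conditional GET handling.
// NOTE: makeEtag and respond are hypothetical helpers for illustration only;
// this is not the code that Express or Cloud Functions actually runs.
const crypto = require('crypto');

// Derive a validator from the response body: same body, same ETag.
function makeEtag(body) {
  const hash = crypto.createHash('sha1').update(body, 'utf8').digest('base64');
  return '"' + hash + '"';
}

// Decide the status for a GET, given the request's If-None-Match header (if any).
function respond(body, ifNoneMatch) {
  const etag = makeEtag(body);
  if (ifNoneMatch === etag) {
    // The client already holds this exact body, so it is not sent again.
    return { status: 304, etag: etag, body: '' };
  }
  return { status: 200, etag: etag, body: body };
}

// First request: the client has nothing cached.
const first = respond('Hello ["a","b"]!');
// Second request: the client sends back the ETag it received earlier.
const second = respond('Hello ["a","b"]!', first.etag);
// Body changed (for example, a timestamp was appended): the validator no longer matches.
const third = respond('Hello ["a","b"]!2018/07/05 22:19:39', first.etag);

console.log(first.status, second.status, third.status); // prints: 200 304 200
```

In this sketch, appending getDateTime() to the body plays the role of the third request: the body differs on every call, so the cached validator never matches and the status stays 200.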
Note that 304 isn't an error. Error status codes are 400 and above.