Apache HttpClient未显示响应的Content-Length和Content-Encoding标头

Apache HttpClient未显示响应的Content-Length和Content-Encoding标头

问题描述:

我安装了 Apache httpcomponents-client-5.0.x ,并且在查看http响应的标头时,我很震惊,它没有显示 Content-Length Content-Encoding 标头,这是我使用的代码测试

I installed Apache httpcomponents-client-5.0.x and while reviewing the headers of the http response, I was shocked it doesn't show the Content-Length and Content-Encoding headers, this is the code I used for testing

import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.CloseableHttpResponse;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import com.sun.net.httpserver.Headers;

CloseableHttpClient httpclient = HttpClients.createDefault();
HttpGet request = new HttpGet(new URI("https://www.example.com"));
CloseableHttpResponse response = httpclient.execute(request);
Header[] responseHeaders = response.getHeaders();
for(Header header: responseHeaders) {               
    System.out.println(header.getName());
}
// this prints all the headers except 
// status code header
// Content-Length
// Content-Encoding

无论我尝试什么,我都会得到相同的结果

No matter what I try I get the same result, like this

Iterator<Header> headersItr = response.headerIterator();
while(headersItr.hasNext()) {
    Header header = headersItr.next();
    System.out.println(header.getName());
}

或者这个

HttpEntity entity = response.getEntity();
System.out.println(entity.getContentEncoding()); // NULL
System.out.println(entity.getContentLength());   // -1

根据6年前提出的这个问题,即使使用旧版本的Apache HttpClient,这似乎也是一个老问题.

According to this question that has been asked 6 years ago, it seems like an old issue even with older versions of Apache HttpClient.

当然,服务器实际上正在返回由Wireshark确认的标头,而Apache HttpClient会自行记录

Of-course the server is actually returning those headers as confirmed by Wireshark, and Apache HttpClient logs itself

2020-04-03 07:59:09,106 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << HTTP/1.1 200 OK
2020-04-03 07:59:09,106 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << Content-Encoding: gzip
2020-04-03 07:59:09,106 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << Accept-Ranges: bytes
2020-04-03 07:59:09,107 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << Age: 451956
2020-04-03 07:59:09,107 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << Cache-Control: max-age=604800
2020-04-03 07:59:09,107 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << Content-Type: text/html; charset=UTF-8
2020-04-03 07:59:09,107 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << Date: Fri, 03 Apr 2020 05:59:09 GMT
2020-04-03 07:59:09,108 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << Etag: "3147526947+gzip"
2020-04-03 07:59:09,108 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << Expires: Fri, 10 Apr 2020 05:59:09 GMT
2020-04-03 07:59:09,108 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << Last-Modified: Thu, 17 Oct 2019 07:18:26 GMT
2020-04-03 07:59:09,108 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << Server: ECS (dcb/7EEB)
2020-04-03 07:59:09,108 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << Vary: Accept-Encoding
2020-04-03 07:59:09,109 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << X-Cache: HIT
2020-04-03 07:59:09,109 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << Content-Length: 648

BTW, java.net.http 库称为

BTW, java.net.http library known as JDK HttpClient works great and show all the headers.

我做错了什么,还是应该报告一个已经存在多年的错误?

Is there something wrong I did, or should I report a bug that been there for years ?

HttpComponents提交者在这里...

HttpComponents committer here...

您没有密切注意Dave G所说的话.默认情况下, HttpClientBuilder 将启用透明解压缩,而您再也看不到某些标头的原因是

You did not closely pay attention what Dave G said. By default, HttpClientBuilder will enable transparent decompression and the reason why you don't see some headers anymore is here:

if (decoderFactory != null) {
  response.setEntity(new DecompressingEntity(response.getEntity(), decoderFactory));
  response.removeHeaders(HttpHeaders.CONTENT_LENGTH);
  response.removeHeaders(HttpHeaders.CONTENT_ENCODING);
  response.removeHeaders(HttpHeaders.CONTENT_MD5);
} ...

关于JDK HttpClient,它将不会执行任何透明的解压缩,因此您会看到压缩流的长度.您必须自行解压缩.

Regarding the JDK HttpClient, it will not perform any transparent decompression, therefore you see the length of the compressed stream. You have to decompress on your own.

卷曲提交者在这里...

curl committer here...

我也有提出了问题.