UTF-8 encoding a servlet form submission with Tomcat
I'm attempting to post a simple form that includes unicode characters to a servlet action. On Jetty, everything works without a snag. On a Tomcat server, utf-8 characters get mangled.
The simplest case I've got:
Form:
<form action="action" method="post">
    <input type="text" name="data" value="It’s fine">
</form>
Action:
class MyAction extends ActionSupport {
    public void setData(String data) {
        // data is already mangled here in Tomcat
    }
}
- I've got URIEncoding="UTF-8" on <Connector> in server.xml
- The first filter on the action calls request.setCharacterEncoding("UTF-8");
- The content type of the page that contains the form is "text/html; charset=UTF-8"
- Adding "accept-charset" to the form makes no difference
The only two ways I can make it work are to use Jetty or to switch it to method="get". Both of those cause the characters to come through without a problem.
I've got URIEncoding="UTF-8" on <Connector> in server.xml
That's only relevant for GET requests.
The first filter on the action calls request.setCharacterEncoding("UTF-8");
Fine, that should apply to POST requests. You only need to make sure that you haven't called getParameter(), getReader(), getInputStream() or anything else which would trigger parsing of the request body before calling setCharacterEncoding().
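To see why the encoding has to be known before the body is parsed, here is a minimal sketch using only the JDK. It assumes the container falls back to ISO-8859-1 when no request encoding has been set, which is the Servlet specification's default for request bodies. The class name FormDecodingDemo and its decode helper are made up for illustration and only roughly mimic what a container does when it parses a POST body.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class FormDecodingDemo {

    // Decodes a percent-encoded form value with the given charset,
    // roughly what happens when the container parses the POST body.
    static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String original = "It\u2019s fine";

        // What a browser sends for a form on a UTF-8 page.
        String wire = URLEncoder.encode(original, "UTF-8");
        System.out.println(wire); // It%E2%80%99s+fine

        // Decoded with the right charset, the value round-trips.
        System.out.println(decode(wire, "UTF-8").equals(original)); // true

        // Decoded with the assumed ISO-8859-1 fallback, each UTF-8 byte
        // becomes its own character and the value is mangled.
        System.out.println(decode(wire, "ISO-8859-1").equals(original)); // false
    }
}
```

Once the body has been decoded with the wrong charset, a later setCharacterEncoding() call cannot repair the parameter values, which is why the order of calls matters.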
The content type of the page that contains the form is "text/html; charset=UTF-8"
How exactly are you setting it? If it is done in a <meta> tag, then you need to understand that the browser ignores it when the page is served over HTTP and the HTTP Content-Type response header is present. The average webserver already sets that header by default. The <meta> content type will then only be used when the page is saved to local disk and viewed from there.
To set the response header charset properly, add the following to the top of your JSP:
<%@page pageEncoding="UTF-8" %>
By the way, this will also tell the server to send the response in the given charset.
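The page charset matters for the POST because browsers normally encode submitted form data in the encoding of the page that contains the form. A rough sketch of the effect, assuming a page that the browser ended up treating as windows-1252 instead of UTF-8 (the class name PageCharsetDemo and the helper encodeAs are hypothetical, and URLEncoder only approximates what a browser does):

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class PageCharsetDemo {

    // Percent-encodes a form value the way a browser would for a page
    // in the given charset (an approximation for illustration only).
    static String encodeAs(String value, String pageCharset) {
        try {
            return URLEncoder.encode(value, pageCharset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String value = "It\u2019s fine";

        // Page treated as UTF-8: the curly apostrophe becomes three bytes.
        System.out.println(encodeAs(value, "UTF-8"));        // It%E2%80%99s+fine

        // Page treated as windows-1252: the same character is one byte,
        // which a server expecting UTF-8 cannot decode correctly.
        System.out.println(encodeAs(value, "windows-1252")); // It%92s+fine
    }
}
```

So the server-side request encoding only helps if it matches the charset the browser actually used for the page.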
Adding "accept-charset" to the form makes no difference
It only makes a difference in MSIE, but even then it uses it wrongly. The whole attribute is worthless anyway. Forget it.
- Unicode - How to get the characters right?