Converting HTML character codes to characters in Java
Our XML feed gives us UTF-8 characters encoded as HTML character codes inside an ISO-8859-1 file. This is fed into the database, so the text is ISO-8859-1 encoded and contains the following:
&#37329;&#34701;&#24066;&#22330;
Is there a way to convert that into a normal Java string? Similar to:
String str = fromHtmlUtf8("&#37329;&#34701;&#24066;&#22330;");
where the resulting str would contain the normal UTF-8 characters. The text is Chinese in this case, but it can be quite mixed.
Thanks.
You can use StringEscapeUtils from Apache Commons Lang: http://commons.apache.org/lang/api-2.6/org/apache/commons/lang/StringEscapeUtils.html
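With Commons Lang 2.6 on the classpath, the call would be StringEscapeUtils.unescapeHtml(text). If adding the library is not an option, the following is a minimal sketch using only the JDK. It assumes the feed stores each character as a decimal or hexadecimal numeric character reference (for example &#37329; or &#x91D1;). The class name HtmlCharCodes and the method name decode are made up for illustration, and unlike the library method this sketch does not handle named entities such as &amp;.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
final class HtmlCharCodes {
    // Matches decimal (&#37329;) and hexadecimal (&#x91D1;) numeric character references.
    private static final Pattern NUMERIC_REF =
            Pattern.compile("&#(?:[xX]([0-9a-fA-F]{1,6})|([0-9]{1,7}));");

    static String decode(String input) {
        Matcher m = NUMERIC_REF.matcher(input);
        StringBuilder out = new StringBuilder(input.length());
        int last = 0;
        while (m.find()) {
            // Copy the literal text between the previous match and this one.
            out.append(input, last, m.start());
            int codePoint = (m.group(1) != null)
                    ? Integer.parseInt(m.group(1), 16)
                    : Integer.parseInt(m.group(2), 10);
            if (Character.isValidCodePoint(codePoint)) {
                // appendCodePoint also handles characters outside the BMP.
                out.appendCodePoint(codePoint);
            } else {
                // Leave out-of-range references untouched.
                out.append(m.group());
            }
            last = m.end();
        }
        out.append(input, last, input.length());
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints the four Chinese characters from the question,
        // provided the console encoding can display them.
        System.out.println(decode("&#37329;&#34701;&#24066;&#22330;"));
    }
}
```

Once decoded, the value is an ordinary Java String. The ISO-8859-1 versus UTF-8 question only matters again when the string is written back out to a file or database.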
Next time, search before asking: "How to convert HTML to UTF-8".