How to unescape HTML character entities in Java?

Question:

Basically, I would like to decode a given HTML document and replace all special characters, such as "&nbsp;" -> " " and "&gt;" -> ">".

In .NET we can make use of HttpUtility.HtmlDecode.

What's the equivalent function in Java?

Answer:

I have used the Apache Commons StringEscapeUtils.unescapeHtml4() for this:

Unescapes a string containing entity escapes to a string containing the actual Unicode characters corresponding to the escapes. Supports HTML 4.0 entities.
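
To make the behaviour described above concrete, here is a minimal, dependency-free sketch of what such an unescape step does. It is not the Apache Commons implementation: the class name HtmlEntitySketch, the five-entry entity table, and the fallback of leaving unknown references untouched are all assumptions made for illustration. For real documents, calling StringEscapeUtils.unescapeHtml4(input) is likely the better choice, since it covers the full HTML 4.0 entity set.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; requires Java 9 or later.
final class HtmlEntitySketch {

    // Deliberately tiny table (assumption): HTML 4.0 defines far more named entities.
    // Note that "nbsp" maps to U+00A0 (no-break space), not to an ordinary space.
    private static final Map<String, String> NAMED = Map.of(
            "nbsp", "\u00A0",
            "lt", "<",
            "gt", ">",
            "amp", "&",
            "quot", "\"");

    // Matches named (&gt;), decimal (&#65;) and hexadecimal (&#x41;) references.
    private static final Pattern ENTITY =
            Pattern.compile("&(#[xX][0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);");

    // Replaces every recognised reference in a single left-to-right pass.
    static String unescape(String input) {
        Matcher m = ENTITY.matcher(input);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String body = m.group(1);
            String replacement;
            if (body.startsWith("#x") || body.startsWith("#X")) {
                replacement = fromCodePoint(body.substring(2), 16, m.group());
            } else if (body.startsWith("#")) {
                replacement = fromCodePoint(body.substring(1), 10, m.group());
            } else {
                // Unknown names are left exactly as they appeared in the input.
                replacement = NAMED.getOrDefault(body, m.group());
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Converts the digits of a numeric reference to text, or returns the
    // original reference if the number is out of range or not a code point.
    private static String fromCodePoint(String digits, int radix, String fallback) {
        try {
            int codePoint = Integer.parseInt(digits, radix);
            return Character.isValidCodePoint(codePoint)
                    ? new String(Character.toChars(codePoint))
                    : fallback;
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(unescape("1 &lt; 2 &amp;&amp; 3 &gt; 2"));
    }
}
```

Because the replacement happens in one pass, an input such as "&amp;lt;" becomes "&lt;" rather than "<", which matches the usual expectation that unescaping undoes exactly one level of escaping.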
