Java or Scala: how to convert characters like \x22 into a string

Problem description:

I have a string that looks like this:

{\x22documentReferer\x22:\x22http:\x5C/\x5C/pikabu.ru\x5C/freshitems.php\x22}

How can I convert this into readable JSON?

I've found several slow solutions, like the regex-based one here.

Already tried:

URL.decode
StringEscapeUtils
JSON.parse // from different libraries 

For example, Python has a simple solution: decode from 'string_escape'.

The linked possible duplicate applies to Python; my question is about Java or Scala.

The working, but also very slow, solution I'm using now is from here:

def unescape(oldstr: String): String = {
  val newstr = new StringBuilder(oldstr.length)
  var saw_backslash = false
  var i = 0
  while (i < oldstr.length) {
    // charAt keeps surrogate pairs intact, because i advances one char at a time
    val c = oldstr.charAt(i)
    if (!saw_backslash) {
      if (c == '\\') saw_backslash = true
      else newstr.append(c)
    } else {
      if (c == '\\') {
        // an escaped backslash stays escaped
        newstr.append('\\')
        newstr.append('\\')
      } else if (c == 'x') {
        // two hex digits must follow the 'x'
        if (i + 3 > oldstr.length) die("string too short for \\x escape")
        val value =
          try Integer.parseInt(oldstr.substring(i + 1, i + 3), 16)
          catch {
            case _: NumberFormatException =>
              die("invalid hex value for \\x escape")
          }
        newstr.append(value.toChar)
        i += 2 // skip the two hex digits
      } else {
        // unknown escape: copy it through unchanged
        newstr.append('\\')
        newstr.append(c)
      }
      saw_backslash = false
    }
    i += 1
  }
  // a trailing lone backslash is kept as-is
  if (saw_backslash) newstr.append('\\')
  newstr.toString
}

private def die(msg: String): Nothing =
  throw new IllegalArgumentException(msg)

\x is used to escape ASCII characters in Python and other languages. In Scala and Java, you can use \u to escape Unicode characters. Since ASCII is a subset of Unicode (as explained here), we can use the unescapeJava method (in StringEscapeUtils) together with a simple replacement that turns each \x into \u00, that is, the \u escape prefix followed by 2 leading zeros:

import org.apache.commons.lang3.StringEscapeUtils

// x is the escaped input string
StringEscapeUtils.unescapeJava(x.replaceAll("""\\x""", """\\u00"""))

You can also use regex to find the escape sequences and replace them with the appropriate ASCII character:

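import scala.util.matching.Regex

// accept upper- and lowercase hex digits
val pattern = """\\x([0-9A-Fa-f]{2})""".r

// quoteReplacement escapes '\' and '$', which are special in replacement strings
pattern.replaceAllIn(x, m =>
  Regex.quoteReplacement(Integer.parseInt(m.group(1), 16).toChar.toString))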
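Since the question also asks about Java, the same regex idea can be sketched there with `java.util.regex`. This is only a minimal sketch: the class name `HexEscapeRegex` and the method name `unescape` are made up for illustration, and it assumes every escape has exactly two hex digits.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is an assumption, not from any library.
final class HexEscapeRegex {
    // Matches a backslash, an 'x', and exactly two hex digits (either case).
    private static final Pattern HEX_ESCAPE =
            Pattern.compile("\\\\x([0-9A-Fa-f]{2})");

    static String unescape(String input) {
        Matcher m = HEX_ESCAPE.matcher(input);
        // StringBuffer keeps this compatible with Java 8's appendReplacement.
        StringBuffer out = new StringBuffer(input.length());
        while (m.find()) {
            char decoded = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement escapes '\' and '$', which are special
            // characters in replacement text.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(decoded)));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Calling `HexEscapeRegex.unescape` on the string from the question should give `{"documentReferer":"http:\/\/pikabu.ru\/freshitems.php"}`, which is valid JSON because `\/` is a legal JSON escape.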

This appears to be faster and does not require an external library, although it may still be too slow for your needs. It probably does not cover some edge cases either, but it might cover simple needs.
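If the regex turns out to be the bottleneck, one alternative is a single pass over the characters with no regex at all. The sketch below is an illustration, not a measured improvement: `HexEscapeScanner`, `unescape` and `hexValue` are hypothetical names, and it assumes that malformed or truncated escapes (such as `\x2` or `\xZZ`) should be copied through unchanged rather than rejected. Like the regex version, it does not treat `\\` as an escaped backslash.

```java
// Hypothetical helper class; the names are assumptions, not from any library.
final class HexEscapeScanner {
    static String unescape(String input) {
        int n = input.length();
        StringBuilder out = new StringBuilder(n);
        int i = 0;
        while (i < n) {
            char c = input.charAt(i);
            // A \xHH escape needs four characters: '\', 'x', two hex digits.
            if (c == '\\' && i + 3 < n && input.charAt(i + 1) == 'x') {
                int hi = hexValue(input.charAt(i + 2));
                int lo = hexValue(input.charAt(i + 3));
                if (hi >= 0 && lo >= 0) {
                    out.append((char) (hi * 16 + lo));
                    i += 4; // consume the whole escape
                    continue;
                }
            }
            // Not a well-formed escape: copy the character as-is.
            out.append(c);
            i++;
        }
        return out.toString();
    }

    // Returns 0..15 for an ASCII hex digit, or -1 for anything else.
    private static int hexValue(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }
}
```

Whether this is actually faster depends on the input and the JVM, so it is worth benchmarking against the regex version on real data before switching.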

I am definitely not an expert on this, so there might be a better way to handle it.