Problem with Base64 encoding/decoding: decoded string is "?"
I am trying to read an image, Base64-encode it into a byte array, and then convert that to a string to send over the network. The problem is that when I decode the Base64-encoded string, I get incorrect data.
For example, I am facing an issue with the special character below.
I am using the following code for encoding:
byte[] b = Base64.encodeBase64(IOUtils.toByteArray(loInputStream));
String ab = new String(b);
IOUtils is org.apache.commons.io.IOUtils, and loInputStream is the InputStream for the image.
Code for decoding:
byte[] c = Base64.decodeBase64(ab.getBytes());
String ca = new String(c);
System.out.println(ca);
It prints ? for the decoded String.
Can anyone please let me know what the issue is?
As I've said elsewhere, in Java, String is for text, and byte[] is for binary data.
String ≠ byte[]
Text ≠ binary data
An image is binary data. Base64 is an encoding that allows binary data to be transmitted over US_ASCII-compatible text channels (there is a similar encoding for text that is a superset of ASCII: Quoted-Printable).
So, it goes like this:
image (binary data) → image (text, Base64-encoded binary data) → image (binary data)
where you would use String encodeBase64String(byte[]) to encode, and byte[] decode(String) to decode. These are the only sane APIs for Base64. byte[] encodeBase64(byte[]) is misleading, because the result is US_ASCII-compatible text (so, a String, not a byte[]).
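To make the binary → text → binary flow concrete, here is a minimal sketch. Note that it uses the JDK's own java.util.Base64 (Java 8+) rather than commons-codec, only so that it runs without extra dependencies; the class name Base64RoundTrip and the sample bytes are made up for illustration.

```java
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper class, not part of any library.
final class Base64RoundTrip {

    // Binary data in, Base64 text out: the result is a String, not a byte[].
    static String encode(byte[] binary) {
        return Base64.getEncoder().encodeToString(binary);
    }

    // Base64 text in, the original binary data out.
    static byte[] decode(String base64Text) {
        return Base64.getDecoder().decode(base64Text);
    }

    public static void main(String[] args) {
        // Stand-in for image bytes; several of these are not printable text.
        byte[] original = {(byte) 0x89, 0x50, 0x4E, 0x47, (byte) 0xFF, 0x00, (byte) 0xD8};

        String text = encode(original);   // safe to send over a text channel
        byte[] restored = decode(text);   // binary again on the receiving side

        System.out.println(text);
        System.out.println(Arrays.equals(original, restored));
    }
}
```

The point is in the method signatures: the encoded form is text, and the decoded form stays a byte[] and is never turned into a String.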
Now, text has a charset and an encoding. String uses a fixed Unicode/UTF-16 charset/encoding combination internally, and you have to specify a charset/encoding when converting something to or from a String, either explicitly or implicitly, in which case the platform's default encoding is used (which is what PrintStream.println() does). Base64 text is pure US_ASCII, so you need to use that, or a superset of US_ASCII. org.apache.commons.codec.binary.Base64 uses UTF8, which is a superset of US_ASCII, so all is well. (OTOH, the internal java.util.prefs.Base64 uses the platform's default encoding, so I guess it would break if you started your JVM with, say, a UTF-16 encoding.)
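A small sketch of what specifying the charset explicitly looks like, and of why arbitrary binary data does not survive a trip through String. The class and method names (CharsetDemo, textToWire, wireToText, survivesAsText) are invented for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of any library.
final class CharsetDemo {

    // Base64 text is pure ASCII, so naming US_ASCII explicitly avoids
    // any dependence on the platform's default encoding.
    static byte[] textToWire(String base64Text) {
        return base64Text.getBytes(StandardCharsets.US_ASCII);
    }

    static String wireToText(byte[] wire) {
        return new String(wire, StandardCharsets.US_ASCII);
    }

    // Interprets the bytes as UTF-8 text, converts back, and reports
    // whether the original bytes came out unchanged.
    static boolean survivesAsText(byte[] binary) {
        String asText = new String(binary, StandardCharsets.UTF_8);
        return Arrays.equals(binary, asText.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        byte[] imageLike = {(byte) 0x89, (byte) 0xFF};     // not valid UTF-8
        byte[] base64Bytes = textToWire("aGVsbG8=");       // plain ASCII

        System.out.println(survivesAsText(imageLike));     // bytes are replaced, so false
        System.out.println(survivesAsText(base64Bytes));   // ASCII round-trips, so true
        System.out.println(wireToText(base64Bytes));
    }
}
```

Invalid byte sequences are replaced while decoding to a String, which is the same kind of loss that shows up as ? when binary data is printed as text.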
Back on topic: you've tried to print the decoded image (binary data) as text, which obviously hasn't worked. PrintStream has write() methods that can write binary data, so you could use those, and you would get the same garbage as if you had written the original image. It would be much better to use a FileOutputStream and compare the resulting file with the original image file.
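A rough sketch of that check, again using java.util.Base64 from the JDK instead of commons-codec so it is self-contained. The class DecodeToFile, its methods, and the temp files standing in for the real image are all assumptions made for the example.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper class, not part of any library.
final class DecodeToFile {

    // Decodes the Base64 text and writes the raw bytes to the target file.
    static void writeDecoded(String base64Text, Path target) throws IOException {
        byte[] binary = Base64.getDecoder().decode(base64Text);
        try (OutputStream out = new FileOutputStream(target.toFile())) {
            out.write(binary);
        }
    }

    // Byte-for-byte comparison of two files (fine for small files).
    static boolean sameContent(Path a, Path b) throws IOException {
        return Arrays.equals(Files.readAllBytes(a), Files.readAllBytes(b));
    }

    public static void main(String[] args) throws IOException {
        // Temp files stand in for the original image and the received copy.
        Path original = Files.createTempFile("original", ".bin");
        Path decoded = Files.createTempFile("decoded", ".bin");
        try {
            Files.write(original, new byte[] {(byte) 0x89, 0x50, 0x4E, 0x47, (byte) 0xFF, 0x00});

            // Sender side: binary to Base64 text.
            String sent = Base64.getEncoder().encodeToString(Files.readAllBytes(original));

            // Receiver side: Base64 text back to a binary file.
            writeDecoded(sent, decoded);

            System.out.println(sameContent(original, decoded));
        } finally {
            Files.deleteIfExists(original);
            Files.deleteIfExists(decoded);
        }
    }
}
```

If the two files match, the encoding and decoding are fine and the only problem was printing binary data as text.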