Base64 encoding does not fail on invalid characters [closed]
I am trying to ensure that a string coming from an HTTP request is valid for use in a base64 URL param. I've been experimenting with base64.RawURLEncoding, as I assumed encoding an invalid string would return an err, or at least that decoding the result would fail. However, it quite happily encodes/decodes the string regardless of the input.
https://play.golang.org/p/3sHUfl2NSJK
I have created the above playground showing the issue I'm having (albeit an extreme example). Is there another way of ascertaining whether a string consists entirely of valid base64 characters?
To clarify, Base64 is an encoding scheme which allows you to take arbitrary binary data and safely encode it into ASCII characters which can later be decoded into the original binary string.
That means that the "Base64-encode" operation can take literally any input and produce valid, encoded data. However, the "Base64-decode" operation will fail if its input string contains characters outside the set of ASCII characters that the encoding uses, or if its length or padding is wrong (meaning that the given string was not produced by a valid Base64 encoder).
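The "any input is encodable" half of that statement can be checked with a short sketch. The helper name roundTrip is made up for this illustration; it only uses EncodeToString and DecodeString from the standard encoding/base64 package.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"fmt"
)

// roundTrip (hypothetical helper) encodes arbitrary bytes and decodes the
// result again. It returns the encoded text and whether the decoded bytes
// equal the original input.
func roundTrip(data []byte) (string, bool) {
	// EncodeToString has no error return: every byte sequence is encodable.
	encoded := base64.StdEncoding.EncodeToString(data)
	decoded, err := base64.StdEncoding.DecodeString(encoded)
	return encoded, err == nil && bytes.Equal(decoded, data)
}

func main() {
	inputs := [][]byte{
		[]byte("hello"),
		{0x00, 0xff, 0x01}, // not valid UTF-8, still encodable
		{},                 // empty input
	}
	for _, in := range inputs {
		encoded, ok := roundTrip(in)
		fmt.Printf("%q -> %q (round trip ok: %v)\n", in, encoded, ok)
	}
}
```

Because the encoder accepts every byte sequence, encoding a request value and decoding it again says nothing about whether the original value was itself Base64.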
To test whether a string contains a valid Base64-encoded sequence, call the DecodeString method of the encoding you expect (for example base64.StdEncoding.DecodeString(...)) and check whether the returned error is nil.
For example (Go Playground):
package main

import (
	"encoding/base64"
	"fmt"
)

// IsValidBase64 reports whether s decodes cleanly as standard, padded Base64.
func IsValidBase64(s string) bool {
	_, err := base64.StdEncoding.DecodeString(s)
	return err == nil
}

func main() {
	ss := []string{"ABBA", "T0sh", "Foo=", "Bogus\x01"}
	for _, s := range ss {
		if IsValidBase64(s) {
			fmt.Printf("OK: valid Base64 %q\n", s)
		} else {
			fmt.Printf("ERR: invalid Base64 %q\n", s)
		}
	}
	// OK: valid Base64 "ABBA"
	// OK: valid Base64 "T0sh"
	// OK: valid Base64 "Foo="
	// ERR: invalid Base64 "Bogus\x01"
}
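The example above checks against base64.StdEncoding, whereas the question is about a URL parameter and base64.RawURLEncoding. A minimal variant for that case might look like the following; the helper name isValidRawURLBase64 is an assumption made for this sketch, not something from the question.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// isValidRawURLBase64 (hypothetical helper) reports whether s decodes
// cleanly under the unpadded, URL-safe alphabet (A-Z, a-z, 0-9, '-', '_').
func isValidRawURLBase64(s string) bool {
	_, err := base64.RawURLEncoding.DecodeString(s)
	return err == nil
}

func main() {
	candidates := []string{
		"wrrCquKA", // only URL-safe alphabet characters
		"a-b_",     // '-' and '_' belong to the URL-safe alphabet
		"a+b/",     // '+' and '/' belong to the standard alphabet only
		"Zm9v=",    // the Raw encodings do not accept '=' padding
	}
	for _, s := range candidates {
		fmt.Printf("%q valid: %v\n", s, isValidRawURLBase64(s))
	}
}
```

One caveat: the package documentation states that the decoder ignores \r and \n, so a string with embedded newlines can still decode without error. If the value must be safe to place in a URL as-is, an extra check for those characters may be needed.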
Base64 encoding works by interpreting an arbitrary bit stream as a sequence of 6-bit integers, which are then mapped one-by-one to the chosen base64 alphabet.
Your example string starts with these 8-bit bytes:
11000010 10111010 11000010 10101010 11100010 10000000
Re-arrange them into 6-bit numbers:
110000 101011 101011 000010 101010 101110 001010 000000
And map them to a base64 alphabet (here URL encoding):
w r r C q u K A
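The regrouping above can be reproduced in a few lines of Go. This is only a sketch of the idea, not how encoding/base64 is implemented: the names sextets and manualEncode are made up here, and the code assumes the input length is a multiple of 3 so that no partial group has to be handled.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// The URL-safe base64 alphabet from RFC 4648.
const urlAlphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"

// sextets (hypothetical helper) regroups data into 6-bit values.
// Simplification: any trailing bytes beyond a multiple of 3 are ignored.
func sextets(data []byte) []byte {
	var out []byte
	for i := 0; i+3 <= len(data); i += 3 {
		// Pack three 8-bit bytes into one 24-bit number ...
		n := uint32(data[i])<<16 | uint32(data[i+1])<<8 | uint32(data[i+2])
		// ... and cut it into four 6-bit numbers.
		out = append(out, byte(n>>18&0x3f), byte(n>>12&0x3f), byte(n>>6&0x3f), byte(n&0x3f))
	}
	return out
}

// manualEncode (hypothetical helper) maps each 6-bit value to its character.
func manualEncode(data []byte) string {
	var out []byte
	for _, v := range sextets(data) {
		out = append(out, urlAlphabet[v])
	}
	return string(out)
}

func main() {
	// The six bytes from the example: 11000010 10111010 11000010 10101010 11100010 10000000
	data := []byte{0xC2, 0xBA, 0xC2, 0xAA, 0xE2, 0x80}
	for _, v := range sextets(data) {
		fmt.Printf("%06b -> %c\n", v, urlAlphabet[v])
	}
	fmt.Println(manualEncode(data))
	// Cross-check against the standard library.
	fmt.Println(base64.RawURLEncoding.EncodeToString(data))
}
```

Both of the final lines should print wrrCquKA, matching the characters worked out by hand.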
Since every 6-bit number can be mapped to a character in the alphabet (there are exactly 64 of them), there are no invalid inputs to base64. This is precisely what base64 is used for: turning arbitrary input into printable ASCII characters.
Decoding, on the other hand, can and will fail if the input contains bytes outside the base64 alphabet, because they can't be mapped back to a 6-bit integer.
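When decoding does fail on corrupt input, the standard library reports it as a base64.CorruptInputError, an integer type whose value is the byte offset where the problem was detected. A small sketch of inspecting it, with the helper name firstBadOffset invented for illustration:

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
)

// firstBadOffset (hypothetical helper) returns the byte offset at which
// decoding s as unpadded URL-safe base64 failed, or -1 if s decodes cleanly.
func firstBadOffset(s string) int {
	_, err := base64.RawURLEncoding.DecodeString(s)
	var corrupt base64.CorruptInputError
	if errors.As(err, &corrupt) {
		return int(corrupt)
	}
	return -1
}

func main() {
	for _, s := range []string{"wrrCquKA", "ab!d", "\u00ba\u00aa"} {
		if off := firstBadOffset(s); off >= 0 {
			fmt.Printf("%q: corrupt input at byte offset %d\n", s, off)
		} else {
			fmt.Printf("%q: decodes cleanly\n", s)
		}
	}
}
```

The offset can be useful for an error message sent back to the client. A plain err == nil check is enough if only a yes/no answer is needed.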