What is the correct way to check whether a rune is in the Basic Multilingual Plane?
I want to check whether a given rune is in the Basic Multilingual Plane or not.
That is, what should I put in this function? https://play.golang.org/p/3szTn8pP7xe
package main
import (
"fmt"
)
func isBMP(r rune) bool {
// ???
return false
}
func main() {
fmt.Println(isBMP(rune('պ'))) // expect true
fmt.Println(isBMP(rune('
The Basic Multilingual Plane has the following code point ranges allocated:
0000–0FFF 8000–8FFF
1000–1FFF 9000–9FFF
2000–2FFF A000–AFFF
3000–3FFF B000–BFFF
4000–4FFF C000–CFFF
5000–5FFF D000–DFFF
6000–6FFF E000–EFFF
7000–7FFF F000–FFFF
So to tell whether a rune falls in the Basic Multilingual Plane, just check whether it falls inside any of these ranges. Since these ranges cover all values between 0 and 0xffff (both inclusive), just check it like this:
func isBMP(r rune) bool {
return r >= 0 && r <= 0xffff
}
Note that since rune is an alias for int32, it may hold negative values, so checking that it is not negative is also important.
This will output your expected result. Try it on the Go Playground.
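For reference, here is a minimal self-contained sketch of the check above. The second test character in the question is cut off, so the sample runes below (an Armenian letter, an emoji outside the BMP, and a negative value) are my own choices, not the asker's.

```go
package main

import "fmt"

// isBMP reports whether r is a code point in the Basic Multilingual Plane
// (U+0000 through U+FFFF). rune is an alias for int32, so negative values
// are possible and are rejected here.
func isBMP(r rune) bool {
	return r >= 0 && r <= 0xffff
}

func main() {
	// Sample inputs are assumptions chosen for illustration.
	fmt.Println(isBMP('պ'))          // Armenian letter U+057A, inside the BMP
	fmt.Println(isBMP('\U0001F600')) // U+1F600, above 0xffff
	fmt.Println(isBMP(-1))           // negative, not a valid code point
}
```

The function only compares against the plane's bounds, so it needs no imports beyond fmt for printing.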
Note #2: when iterating over the runes of a string which contains invalid UTF-8 bytes, you will get the Unicode replacement character, 0xfffd, for each invalid byte. If you want to exclude those from your test, you could modify it like this:
func isBMP(r rune) bool {
return r >= 0 && r <= 0xffff && r != 0xfffd
}
I'm not that familiar with Go. However, a bit of Googling suggests that a rune is in fact an int32, so since anything in the Basic Multilingual Plane has a code point between 0 and 65535, you should be able to do this:
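func isBMP(r rune) bool {
	// rune is a signed int32, so reject negative values as well.
	// (An else on its own line after the closing brace does not compile in Go.)
	if r >= 0 && r <= 65535 {
		return true
	}
	return false
}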