How do you safely declare a 16-bit string literal in C?
I'm aware that there is already a standard method, prefixing the literal with L:
wchar_t *test_literal = L"Test";
The problem is that wchar_t is not guaranteed to be 16 bits wide, but for my project I need a 16-bit wchar_t. I'd also like to avoid the requirement of passing -fshort-wchar.
So, is there any prefix for C (not C++) that will allow me to declare a UTF-16 string literal?
Almost, but not quite. C2011 offers you these options:
- character string literals (elements of type char) - no prefix. Example: "Test"
- UTF-8 string literals (elements of type char) - 'u8' prefix. Example: u8"Test"
- wide string literals of three flavors:
  - wchar_t elements - 'L' prefix. Example: L"Test"
  - char16_t elements - 'u' prefix. Example: u"Test"
  - char32_t elements - 'U' prefix. Example: U"Test"
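As a rough illustration of the list above, here is a minimal sketch that declares one literal of each kind. The variable names and the wide16_count helper are made up for this example, and the static assertion encodes an assumption (char16_t being exactly 16 bits wide) that holds on mainstream platforms but is not something the standard promises.

```c
#include <limits.h>  /* CHAR_BIT */
#include <stddef.h>  /* size_t, wchar_t */
#include <uchar.h>   /* char16_t, char32_t (C11) */

/* One literal of each kind; all spell the same four characters. */
const char     lit_plain[] = "Test";    /* character string literal */
const char     lit_utf8[]  = u8"Test";  /* UTF-8 string literal */
const wchar_t  lit_wide[]  = L"Test";   /* element width is implementation-defined */
const char16_t lit_u16[]   = u"Test";   /* char16_t elements */
const char32_t lit_u32[]   = U"Test";   /* char32_t elements */

/* Assumption for this sketch: char16_t is exactly 16 bits wide here.
   The build fails on an implementation where that is not the case. */
_Static_assert(sizeof(char16_t) * CHAR_BIT == 16,
               "char16_t is wider than 16 bits on this implementation");

/* Hypothetical helper: element count of lit_u16, including the terminator. */
size_t wide16_count(void)
{
    return sizeof lit_u16 / sizeof lit_u16[0];
}
```

The u prefix is the one relevant to the question; the others are shown only for comparison.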
Note well, however, that although you can declare a wide string literal having elements of type char16_t, the standard does not guarantee that the UTF-16 encoding will be used for them, nor does it make any particular requirements on which characters outside the language's basic character set must be included in the execution character set. You can test the former at compile time, however: if char16_t represents UTF-16-encoded characters in a given conforming implementation, then that implementation will define the macro __STDC_UTF_16__ to 1.
Note also that you need to include the (C) uchar.h header to use the char16_t type name, but the u"..." syntax for literals does not depend on that. Take care, as this header name collides with one used by the C interface of the International Components for Unicode, a relatively widely-used package for Unicode support.
Finally, be aware that much of this was new in C2011. To make use of it, you need a conforming C2011 implementation. Those are certainly available, but so are a lot of implementations that conform only to earlier standards, or even to none. Standard C99 and earlier do not provide a string literal syntax that guarantees 16-bit elements.
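The compile-time test mentioned above could look something like the following sketch. It refuses to build when __STDC_UTF_16__ is not defined, then declares a literal whose code units are only predictable under UTF-16. The names utf16_sample and u16_length are invented for illustration, and the code assumes a C2011 implementation that ships uchar.h.

```c
#include <stddef.h>  /* size_t */
#include <uchar.h>   /* char16_t; not needed for the u"..." syntax itself */

/* Stop the build if u"..." literals are not guaranteed to be UTF-16. */
#ifndef __STDC_UTF_16__
#error "char16_t literals are not guaranteed to be UTF-16 on this implementation"
#endif

/* Under UTF-16 this is four code units:
   'A' (0x0041), U+20AC (0x20AC), and U+1F600 as the
   surrogate pair 0xD83D 0xDE00. */
const char16_t utf16_sample[] = u"A\u20AC\U0001F600";

/* Hypothetical helper: length in code units, excluding the terminator. */
size_t u16_length(const char16_t *s)
{
    size_t n = 0;
    while (s[n] != 0) {
        n++;
    }
    return n;
}
```

Using #error here is one design choice among several; a project that must also build on older compilers might instead fall back to an explicit array of uint16_t values.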