Converting a string from UTF-8 to ISO-8859-1

Problem description:

I'm trying to convert a UTF-8 string to an ISO-8859-1 char* for use in legacy code. The only way I can see to do this is with iconv.

I would definitely prefer a completely string-based C++ solution, and then just call .c_str() on the resulting string.

How do I do this? A code example would be appreciated, if possible. I'm fine using iconv if it is the only solution you know.

I'm going to modify my code from another answer to implement the suggestion from Alf.

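#include <string>

// Decodes UTF-8 and keeps only the code points that fit in ISO-8859-1 (U+0000 to U+00FF).
std::string UTF8toISO8859_1(const char * in)
{
    std::string out;
    if (in == NULL)
        return out;

    // Initialized so that a stray continuation byte at the start never reads an indeterminate value.
    unsigned int codepoint = 0;
    while (*in != 0)
    {
        unsigned char ch = static_cast<unsigned char>(*in);
        if (ch <= 0x7f)
            codepoint = ch;                              // single byte (ASCII)
        else if (ch <= 0xbf)
            codepoint = (codepoint << 6) | (ch & 0x3f);  // continuation byte: append 6 bits
        else if (ch <= 0xdf)
            codepoint = ch & 0x1f;                       // lead byte of a 2-byte sequence
        else if (ch <= 0xef)
            codepoint = ch & 0x0f;                       // lead byte of a 3-byte sequence
        else
            codepoint = ch & 0x07;                       // lead byte of a 4-byte sequence
        ++in;
        // The code point is complete once the next byte is not a continuation byte.
        if (((*in & 0xc0) != 0x80) && (codepoint <= 0x10ffff))
        {
            if (codepoint <= 255)
            {
                // ISO-8859-1 maps directly onto the first 256 Unicode code points.
                out.append(1, static_cast<char>(codepoint));
            }
            else
            {
                // do whatever you want for out-of-bounds characters
                // (code points above U+00FF have no ISO-8859-1 equivalent)
            }
        }
    }
    return out;
}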

The function does not validate its input: invalid UTF-8 results in dropped characters.
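
For what it's worth, here is a rough sketch of one way to fill in the out-of-bounds branch, using the same decoding logic. It takes a std::string instead of a char* and substitutes a replacement character for anything above U+00FF. The name utf8ToLatin1Lossy and the replacement parameter are made up for illustration and are not part of the code above. Like the original, it does not validate the input.

```cpp
#include <cstddef>
#include <string>

// Hypothetical variant of UTF8toISO8859_1: same decoding loop, but code points
// that do not exist in ISO-8859-1 are replaced instead of silently skipped.
std::string utf8ToLatin1Lossy(const std::string& in, char replacement = '?')
{
    std::string out;
    out.reserve(in.size());

    unsigned int codepoint = 0;
    for (std::size_t i = 0; i < in.size(); ++i)
    {
        unsigned char ch = static_cast<unsigned char>(in[i]);
        if (ch <= 0x7f)
            codepoint = ch;                              // single byte (ASCII)
        else if (ch <= 0xbf)
            codepoint = (codepoint << 6) | (ch & 0x3f);  // continuation byte
        else if (ch <= 0xdf)
            codepoint = ch & 0x1f;                       // 2-byte lead
        else if (ch <= 0xef)
            codepoint = ch & 0x0f;                       // 3-byte lead
        else
            codepoint = ch & 0x07;                       // 4-byte lead

        // The code point is complete when no continuation byte follows.
        bool nextIsContinuation = (i + 1 < in.size()) &&
            ((static_cast<unsigned char>(in[i + 1]) & 0xc0) == 0x80);
        if (!nextIsContinuation && codepoint <= 0x10ffff)
        {
            out.push_back(codepoint <= 0xff ? static_cast<char>(codepoint)
                                            : replacement);
        }
    }
    return out;
}
```

To hand the result to legacy code, keep the returned std::string in a named variable and pass its .c_str(); the pointer is only valid while that string is alive and unmodified.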