Converting Windows-1252 to UTF-8 with JS

Problem description:

I have some strings in Dutch. I know how to encode them using PHP:

$str = iconv( "Windows-1252", "UTF-8", $str );

What would be the equivalent in JavaScript?

Windows-1252 is a single-byte encoding, which is pretty convenient: you can just build a lookup table.

<?php
$s = '';

for ($i = 0; $i < 256; $i++) {
    $converted = iconv('Windows-1252', 'UTF-8', chr($i));

    if ($converted === false) {
        $s .= "\xef\xbf\xbd";  # UTF-8 replacement character
    } else {
        $s .= $converted;
    }
}

echo $s;
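
The PHP script only has to run once to produce the table. As a rough sketch, the same 256 entries could also be built directly in JavaScript, since Windows-1252 differs from code points 0-255 only for bytes 0x80-0x9f. The names `HIGH_RANGE`, `buildWindows1252Table` and `WINDOWS_1252_TABLE` are made up for this illustration, and the five bytes that Windows-1252 leaves undefined are assumed to map to U+FFFD, as in the PHP version.

```javascript
// Code points for bytes 0x80-0x9f, the only range where Windows-1252
// differs from the code points 0-255.
// 0xfffd (replacement character) marks the five undefined bytes (assumption:
// same fallback as the PHP script).
var HIGH_RANGE = [
    0x20ac, 0xfffd, 0x201a, 0x0192, 0x201e, 0x2026, 0x2020, 0x2021,
    0x02c6, 0x2030, 0x0160, 0x2039, 0x0152, 0xfffd, 0x017d, 0xfffd,
    0xfffd, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
    0x02dc, 0x2122, 0x0161, 0x203a, 0x0153, 0xfffd, 0x017e, 0x0178
];

// Hypothetical helper: returns a 256-character string whose index is the
// Windows-1252 byte value and whose character is the decoded result.
function buildWindows1252Table() {
    var table = '';

    for (var i = 0; i < 256; i++) {
        var inHighRange = i >= 0x80 && i <= 0x9f;
        table += String.fromCharCode(inHighRange ? HIGH_RANGE[i - 0x80] : i);
    }

    return table;
}

var WINDOWS_1252_TABLE = buildWindows1252Table();
```

Building the table from code points avoids pasting invisible characters (DEL, no-break space, soft hyphen) into a source file, at the cost of maintaining the 32-entry array by hand.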

Assuming you want a regular JavaScript string as the result (rather than UTF-8 bytes), and that the input is a string in which each character's code point is actually a Windows-1252 byte value, the table printed by the PHP script can be read as UTF-8 and pasted into a JavaScript string literal, and voilà:

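// Index = Windows-1252 byte value. \ufffd marks the five bytes the encoding leaves undefined;
// invisible characters (DEL, no-break space, soft hyphen) are written as escapes.
var WINDOWS_1252 = '\u0000\u0001\u0002\u0003\u0004\u0005\u0006\u0007\b\t\n\u000b\f\r\u000e\u000f\u0010\u0011\u0012\u0013\u0014\u0015\u0016\u0017\u0018\u0019\u001a\u001b\u001c\u001d\u001e\u001f !"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~\u007f€\ufffd‚ƒ„…†‡ˆ‰Š‹Œ\ufffdŽ\ufffd\ufffd‘’“”•–—˜™š›œ\ufffdžŸ\u00a0¡¢£¤¥¦§¨©ª«¬\u00ad®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ';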

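// Expects a "binary string": one character per input byte, code point = byte value (0-255).
// charAt() returns '' for out-of-range indexes, so code points above 255 are dropped silently.
function fromWindows1252(binaryString) {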
    var text = '';

    for (var i = 0; i < binaryString.length; i++) {
        text += WINDOWS_1252.charAt(binaryString.charCodeAt(i));
    }

    return text;
}
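
If the data arrives as raw bytes and actual UTF-8 bytes are needed at the end, something along these lines might work. It is only a sketch: `bytesToBinaryString` and `windows1252BytesToUtf8` are hypothetical helper names, the decoder is passed in as a parameter (for example `fromWindows1252`), and `TextEncoder` is assumed to be available, as it is in current browsers and Node.js.

```javascript
// Hypothetical helper: turn raw bytes into the "binary string" the decoder
// expects, one character per byte with the byte value as its code point.
function bytesToBinaryString(bytes) {
    var s = '';

    for (var i = 0; i < bytes.length; i++) {
        s += String.fromCharCode(bytes[i]);
    }

    return s;
}

// Hypothetical helper: decode Windows-1252 bytes with the supplied decoder,
// then re-encode the resulting text as UTF-8 bytes (a Uint8Array).
function windows1252BytesToUtf8(bytes, decode) {
    var text = decode(bytesToBinaryString(bytes));

    return new TextEncoder().encode(text); // TextEncoder always emits UTF-8
}

// Example call, assuming fromWindows1252 is in scope.
// 0xe9 0xe9 0x6e is the Dutch word "één" in Windows-1252:
// var utf8 = windows1252BytesToUtf8(new Uint8Array([0xe9, 0xe9, 0x6e]), fromWindows1252);
```

Inside JavaScript the decoded string is usually all that is needed; the final `TextEncoder` step only matters when the result has to leave the program as bytes, for example when writing a file or sending a request body.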