PHP: wrongly encoded characters in CSV when using utf8_encode
I am facing a strange issue when extracting data from a MySql database and inserting it in a CSV file. In the database, the field value is the following:
K Secure Connection 1 año 1 PC
When I `echo` it before writing it to the CSV file, I get the same as the above in my terminal.
I use the following code to write content to the CSV file:
fwrite($this->fileHandle, utf8_encode($lineContent . PHP_EOL));
Yet, when I open the CSV with LibreOffice Calc (and specify UTF-8 as the encoding format), the following is displayed:
K Secure Connection 1 aÃ±o 1 PC
I have no idea why this happens. Can someone explain how to solve this?
REM:
SELECT @@character_set_database;
returns
latin1
REM 2:
`var_dump($lineContent, bin2hex($lineContent))`
gives
string(39) "Kaspersky Secure Connection 1 año 1 PC"
string(78) "4b6173706572736b792053656375726520436f6e6e656374696f6e20312061c3b16f2031205043"
The `var_dump` shows that the string is already encoded in UTF-8. Using `utf8_encode` on it will garble it (the function attempts a conversion from Latin-1 to UTF-8, so each byte of the two-byte "ñ" is treated as a separate Latin-1 character). You're therefore actually writing "aÃ±o" encoded in UTF-8 into your file, which is then "correctly" picked up by LibreOffice.
Simply don't `utf8_encode`.
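To make the double encoding visible, here is a minimal sketch using the bytes from the `bin2hex` output in the question. It uses `mb_convert_encoding` from Latin-1 to UTF-8, which performs the same conversion as `utf8_encode` (that function is deprecated as of PHP 8.2). The `ensureUtf8` helper is hypothetical, not something from the question, and assumes the mbstring extension is available.

```php
<?php
// "año" as it appears in the bin2hex() output: already UTF-8 (c3 b1 = ñ).
$utf8 = hex2bin('61c3b16f');

// Same conversion utf8_encode() performs: read every byte as Latin-1, emit UTF-8.
$double = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

echo bin2hex($utf8), PHP_EOL;   // 61c3b16f
echo bin2hex($double), PHP_EOL; // 61c383c2b16f (c3 83 = Ã, c2 b1 = ±)

// Hypothetical helper: convert only when the input is not already valid UTF-8.
// This is a heuristic; a Latin-1 string can happen to be valid UTF-8 as well.
function ensureUtf8(string $s): string
{
    return mb_check_encoding($s, 'UTF-8')
        ? $s
        : mb_convert_encoding($s, 'UTF-8', 'ISO-8859-1');
}

$fromUtf8   = ensureUtf8($utf8);              // left untouched
$fromLatin1 = ensureUtf8(hex2bin('61f16f'));  // Latin-1 "año", gets converted

echo bin2hex($fromUtf8), PHP_EOL;   // 61c3b16f
echo bin2hex($fromLatin1), PHP_EOL; // 61c3b16f
```

In the question's case the simplest fix is still to write `$lineContent . PHP_EOL` directly. A guard like `ensureUtf8` only makes sense if the input encoding can really vary.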
I would try opening the CSV file with another editor, just to make sure the problem is not with LibreOffice.
You may be double-encoding the content if it is already in UTF-8.
I also prefer to always work with UTF-8, so I fetch the data from the database already in UTF-8 and no further conversion is needed. For that, I run this query right after opening the SQL connection:
"set names 'utf8'"
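If you connect through mysqli, a rough sketch of the same idea could look like the following. The function name and its parameters are placeholders I made up, not values from the question. I also swapped `utf8` for `utf8mb4`, since MySQL's `utf8` is an alias for a 3-byte subset; adjust if your setup needs otherwise.

```php
<?php
// Hypothetical helper: host, user, password and database name are placeholders
// supplied by the caller, not values from the question.
function openUtf8Connection(string $host, string $user, string $pass, string $db): mysqli
{
    $conn = new mysqli($host, $user, $pass, $db);
    // Sets the connection character set, like SET NAMES, and also tells the
    // client library which encoding is in use.
    $conn->set_charset('utf8mb4');
    return $conn;
}
```

With the connection set up this way, the strings PHP receives are UTF-8 and can be written to the CSV without any further conversion.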