从数据库转换乱码

从数据库转换乱码

问题描述:

I have a database which has a tabled which has a column called 'name'

In this name for example I have a field which has the value: 대한민국•KOREA

To get my table populated I had to do use an SQL file which was 99K lines big and it had the fields populated as such...

Now is there any way I can make my PHP script work so that it gives me the actual representation that I need for the characters which should be 대한민국•KOREA

I have checked my server config. Apache, PHP, mysql are all utf-8 set and ready...

Update

SET SQL_MODE="NO_AUTO_VALUE_ON_ZERO";
SET time_zone = "+00:00";

CREATE TABLE IF NOT EXISTS `activeclans` (
  `id` bigint(11) NOT NULL,
  `name` text CHARACTER SET utf8 NOT NULL,
  `location` text CHARACTER SET utf8 NOT NULL,
  `playercount` int(11) NOT NULL,
  `clanlevel` int(11) NOT NULL,
  `score` int(11) NOT NULL,
  `warswon` int(11) NOT NULL,
  `warslost` int(11) NOT NULL,
  `warstied` int(11) NOT NULL,
  UNIQUE KEY `id` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci;

--
-- Daten für Tabelle `activeclans`
--

INSERT INTO `activeclans` (`id`, `name`, `location`, `playercount`, `clanlevel`, `score`, `warswon`, `warslost`, `warstied`) VALUES
(1669110, '벽치는 골렘들 #.#', 'South Korea', 48, 6, 21589, 44, 18, 1),
(1707076, 'THE HURRIKANEZ', 'International', 29, 5, 20241, 39, 17, 3),
(1710048, '+WARLORDS+', 'India', 42, 3, 20700, 5, 49, 0),
(1748604, 'الحجايا', 'Jordan', 40, 4, 21218, 27, 13, 2),
(1749793, 'Vindica➕ed', 'International', 50, 5, 22212, 32, 18, 4),
(1755056, 'Hunter hogs 101', 'United States', 44, 3, 20501, 5, 23, 2),
(1761757, 'دعم وحرب', 'Saudi Arabia', 45, 1, 22373, 0, 7, 0),
(1764157, 'tiwi warriors', 'Morocco', 49, 5, 21663, 36, 13, 3),
(1785753, 'INCAUDA VENENUM', 'France', 41, 4, 20412, 25, 8, 1),
(1797817, 'QUEENofURAGONs', 'Philippines', 49, 5, 24104, 38, 8, 1),
(1829886, 'Max clasher', 'Philippines', 43, 5, 20076, 32, 12, 0),
(1837025, 'DUCATO ACCADEMY', 'Italy', 38, 3, 22227, 12, 12, 1),
(1845435, 'shoook tower', 'Iran', 44, 4, 21531, 30, 10, 0),
(1876715, '無言', 'Japan', 48, 4, 20178, 5, 25, 0),
(343597774219, 'CLASH O'' TRASH', 'United States', 46, 1, 20790, 0, 0, 0);

/*!40101 SET CHARACTER_SET_CLIENT=@OLD_CHARACTER_SET_CLIENT */;
/*!40101 SET CHARACTER_SET_RESULTS=@OLD_CHARACTER_SET_RESULTS */;
/*!40101 SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION */;

That is part of the Sql file.

My connection to the Db is as follows:

<?php
$con = new mysqli("localhost", "beep", "beep", "beep");

if ($con->connect_errno) {
    printf("Connect failed: %s
", $con->connect_error);
    exit();
}
?>

And I have this for displaying the data:

$table = "
<table border='1' align='center' cellspacing='3' cellpadding='3' style='border-collapse:collapse;'>
    <tr>
        <td>Clan ID</td>
        <td>Clan Name</td>
        <td>Location</td>
        <td>Player Count</td>
        <td>Clan Level</td>
        <td>Score</td>
        <td>Wars Won</td>
        <td>Wars Lost</td>
        <td>Wars Tied</td>
    </tr>";


$clanResultQuery = "SELECT * FROM activeclans ORDER BY score DESC LIMIT 100";

if($activeClansResult = mysqli_query($con,$clanResultQuery)){
    while($row = mysqli_fetch_array($activeClansResult)){
        $table .="
        <tr>
            <td>".$row['id']."</td>
            <td>".$row['name']."</td>
            <td>".$row['location']."</td>
            <td>".$row['playercount']."</td>
            <td>".$row['clanlevel']."</td>
            <td>".$row['score']."</td>
            <td>".$row['warswon']."</td>
            <td>".$row['warslost']."</td>
            <td>".$row['warstied']."</td>
        </tr>";
    }
}

$table .= "</table>";

Your character sets are messed up. Your data is in latin1 (Windows-1252) but MySQL thinks it's UTF-8.

It's more likely that in the past, a system using UTF-8 has connected to MySQL using the latin1 character set and this is how the mix up originally occurred.

Whenever connecting to a MySQL instance, it helps if the client informs MySQL what character set and collation it's using like this.

SET NAMES utf8;

Leave that out and you may be in a world of pain when you try to migrate your database.

To fix your data, you need to export, do some replaces and import the data again - always test on a backup before trying on your live server! This will work on most linux based systems.

mysqldump -u _USER_ -h _HOST_ -p --add-drop-table  --default-character-set=latin1 _DNNAME_  > mydata.sql

sed -i -e 's/CHARSET=latin1/CHARSET=utf8/ig' -e 's/latin1_bin/utf8_bin/ig' -e 's/latin1_swedish_ci/utf8_general_ci/ig' -e 's/SET NAMES latin1/SET NAMES utf8/ig' -e 's/CHARACTER SET latin1/CHARACTER SET utf8/ig' mydata.sql

mysql -u _USER_ -h _HOST_ -p  _DNNAME_< mydata.sql 

What the commands do is dump the database in it's existing form - latin1. Then usse sed to replace all statements regarding the character set of the data from latin1 to utf-8. When you import, MySQL will now see the data as utf-8 and display it as such. Remember to test on a copy first, character sets issues can be awkward to fully resolve.

When it's all done, make sure you use the set names command after connecting MySQL to inform the server what character set your client is working in.