How do I connect to MySQL from a Perl script using UTF-8?

Question:

Within a Perl script: how do I connect to MySQL so that the four-byte Unicode character U+1F61C ("😜") can be sent from the script to a MySQL table and stored there?

Using {mysql_enable_utf8 => 1} doesn't solve the problem.

I have exactly the same problem as described in the question "ERROR 1366 (HY000): Incorrect string value: '\xF0\x9F\x98\x9C' for column 'comment' at row 1", even with the same Unicode character (😜 = U+1F61C = FACE WITH STUCK-OUT TONGUE AND WINKING EYE), which produces the error message:

DBD::mysql::st execute failed: Incorrect string value: '\xF0\x9F\x98\x9C' for column ...

But I don't use PHP; I use Perl.

The accepted answer to the other question says:

  • Run MySQL 5.5 or later.
    I checked the version:

mysql> select version();
+-------------------------+
| version()               |
+-------------------------+
| 5.7.13-0ubuntu0.16.04.2 |
+-------------------------+

So it is 5.7, which is later than 5.5.
✅ checked

  • Set the table's character set to utf8mb4.
    I checked the character set of my database, my table, and even the reported column:

mysql> SELECT default_character_set_name FROM information_schema.SCHEMATA 
    -> WHERE schema_name = "myDatabase";
+----------------------------+
| default_character_set_name |
+----------------------------+
| utf8mb4                    |
+----------------------------+

mysql> SELECT CCSA.character_set_name FROM information_schema.`TABLES` T,
    ->        information_schema.`COLLATION_CHARACTER_SET_APPLICABILITY` CCSA
    -> WHERE CCSA.collation_name = T.table_collation
    ->   AND T.table_schema = "myDatabase"
    ->   AND T.table_name   = "myTable";
+--------------------+
| character_set_name |
+--------------------+
| utf8mb4            |
+--------------------+

mysql>   SELECT character_set_name FROM information_schema.`COLUMNS` 
    -> WHERE table_schema = "myDatabase"
    ->   AND table_name   = "myTable"
    ->   AND column_name  = "myColumn";
+--------------------+
| character_set_name |
+--------------------+
| utf8mb4            |
+--------------------+

So my database, my table, and the reported column all use the character set utf8mb4.
✅ checked

  • SET NAMES utf8, or use an option when connecting that similarly enables it.

I don't know how to run SET NAMES utf8 from within a Perl script, so I did what I have done for years. I think this counts as "an option when connecting that similarly enables it".
It's at the end of the long line that begins with my $dbh = DBI->connect:

#!/usr/bin/perl -w
use strict;
use warnings;
use utf8;
use Encode;
use DBI;
binmode STDOUT, ":utf8";

#Here I connect using the parameter mysql_enable_utf8 (create database handle):
my $dbh = DBI->connect('DBI:mysql:database=myDatabase;host=localhost','aUser','aPassword',{mysql_enable_utf8 => 1});

#Prepare the statement (create statement handle):
my $sth = $dbh->prepare('INSERT INTO `myTable` (`myColumn`) VALUES(?);');

#This doesn't work: 
$sth->execute('😜');

#This doesn't work either: 
$sth->execute(encode_utf8('😜'));

#end processing:
$dbh->disconnect();
exit(0);

Both execute calls throw the same error (only the line number at the end changes):

DBD::mysql::st execute failed: Incorrect string value: '\xF0\x9F\x98\x9C' for column 'myColumn' at row 1 at myTestScript.pl line 16.

What am I doing wrong?
How can I do it better?

Answer:

The problem is with the SET NAMES utf8 command. In MySQL, the utf8 character set is not true UTF-8: it supports characters of up to three bytes only, and the character in question takes four bytes:

The utf8 character set in MySQL has these characteristics:

• No support for supplementary characters (BMP characters only).

• A maximum of three bytes per multibyte character.
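
The byte count is easy to confirm from Perl itself. The following is a minimal sketch using only the core Encode module; the variable names are illustrative. It encodes U+1F61C as UTF-8 and prints the bytes in the same \xNN notation that MySQL uses in its error message.

```perl
use strict;
use warnings;
use Encode qw(encode);

# U+1F61C lies outside the BMP, so its UTF-8 encoding needs four bytes.
my $char  = "\x{1F61C}";
my $bytes = encode('UTF-8', $char);

# Render each byte as \xNN, the notation MySQL uses in the error message.
my $escaped = join '', map { sprintf '\\x%02X', ord($_) } split //, $bytes;

print length($bytes), " bytes: $escaped\n";
```

The four bytes printed are the ones quoted in the "Incorrect string value" error, which is why a connection limited to three bytes per character rejects the value.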

True UTF-8 in MySQL is utf8mb4, which you already use as the character set of the column itself. So use SET NAMES utf8mb4.

So from Perl you should use {mysql_enable_utf8mb4 => 1} instead of {mysql_enable_utf8 => 1}.
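
Applied to the script from the question, the change might look like the sketch below. It assumes an installed DBD::mysql release that supports the mysql_enable_utf8mb4 attribute; the helper names utf8mb4_attrs and insert_text are hypothetical, RaiseError is added only so that failures are reported immediately, and the database, table, column, and credentials are the placeholders from the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;    # allows UTF-8 string literals in this source file
use DBI;

# Hypothetical helper: connection attributes that request utf8mb4
# instead of MySQL's three-byte "utf8".
sub utf8mb4_attrs {
    return {
        mysql_enable_utf8mb4 => 1,    # assumes DBD::mysql supports this attribute
        RaiseError           => 1,    # die on errors instead of returning undef
    };
}

# Hypothetical helper: connect, insert one string, disconnect.
sub insert_text {
    my ($dsn, $user, $password, $text) = @_;
    my $dbh = DBI->connect($dsn, $user, $password, utf8mb4_attrs());
    my $sth = $dbh->prepare('INSERT INTO `myTable` (`myColumn`) VALUES (?)');
    $sth->execute($text);    # pass the decoded character string
    $dbh->disconnect;
    return 1;
}

# Usage (needs a reachable MySQL server, so it is left commented out):
# insert_text('DBI:mysql:database=myDatabase;host=localhost',
#             'aUser', 'aPassword', "\x{1F61C}");
```

On a DBD::mysql release without this attribute, running $dbh->do('SET NAMES utf8mb4') right after connecting is a commonly used alternative.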