Fixing "Incorrect string value" in an SQL script


I am trying to import a large amount of data into a MariaDB table. I have generated a large SQL script file that contains many special characters (mostly µ, ° and NBSP).

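I want to run my script to load the data into SQL and keep these characters. Every time I try to run the script, I either get an "Incorrect string value" error, or the characters show up as "?" when I query the data (see below). Is there a way to get this script to run and import these special characters?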

I provide a sample SQL script at the bottom and explain how I tried to run it.

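I tried running my script from the Linux terminal with:

mysql -e "source Test.sql"

This stopped after the first error, so I will use examples from the mysql terminal instead, since it shows every error I hit in the small sample script.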

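Note that I first tried running my script file from the SQL terminal. I got "Incorrect string value" for each of the characters mentioned above (\xB0 = °, \xB5 = µ, \xA0 = NBSP). Running the script did not work; however, when I copied and pasted the script into my mysql session, it ran fine. That gives me the output I want, but doing this with my full script would be very tedious.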

Any idea how to make this work by running the script?

#### Trying to run my script from SQL terminal
MariaDB [atlas]> source Test.sql
Query OK, 0 rows affected (0.017 sec)

Query OK, 0 rows affected (0.027 sec)

ERROR 1366 (22007) at line 12 in file: 'Test.sql': Incorrect string value: '\xB0)' for column `atlas`.`TestSQL`.`comment` at row 1
ERROR 1366 (22007) at line 14 in file: 'Test.sql': Incorrect string value: '\xB5)' for column `atlas`.`TestSQL`.`comment` at row 1
ERROR 1366 (22007) at line 16 in file: 'Test.sql': Incorrect string value: '\xA0' for column `atlas`.`TestSQL`.`comment` at row 1
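
The three bytes named in these errors can be checked outside the database. The following is a minimal Python sketch, assuming the script file was saved as latin1 (ISO-8859-1); the single-byte values suggest this, but the post does not confirm it. The helper name `describe` is hypothetical.

```python
# Bytes reported in the ERROR 1366 messages above.
reported = {"deg": b"\xb0", "mu": b"\xb5", "nbsp": b"\xa0"}


def describe(raw: bytes) -> dict:
    """Show how one byte reads as latin1 and whether it is valid UTF-8 on its own."""
    try:
        raw.decode("utf-8")
        valid_utf8 = True
    except UnicodeDecodeError:
        valid_utf8 = False
    # Assumption: the byte is meant to be read as latin1.
    as_latin1 = raw.decode("latin-1")
    return {
        "latin1_char": as_latin1,
        "valid_utf8": valid_utf8,
        # The same character once it is encoded as UTF-8.
        "utf8_hex": as_latin1.encode("utf-8").hex().upper(),
    }


for name, raw in reported.items():
    print(name, describe(raw))
```

Each byte is a valid latin1 character but not a valid UTF-8 sequence by itself, which is consistent with the server rejecting it when the connection is declared as utf8mb4.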

#### Copying the lines from the script into SQL works...
MariaDB [atlas]> INSERT INTO TestSQL (timestamp,comment)
    ->         VALUES ('2019-03-04 13:11:35','This one contains a deg symbol (°)');
Query OK, 1 row affected (0.005 sec)

MariaDB [atlas]> INSERT INTO TestSQL (timestamp,comment)
    ->         VALUES ('2020-02-04 13:48:04','This one has a mu (µ)');
Query OK, 1 row affected (0.004 sec)

MariaDB [atlas]> INSERT INTO TestSQL (timestamp,comment)
    ->         VALUES ('2022-06-12 16:07:50','This one has a line break >
    '> And it contains a NBSP > ');
Query OK, 1 row affected (0.005 sec)

This is what I want the output to look like after running my script:

MariaDB [atlas]> select * from TestSQL;
+---------------------+--------------------------------------------------------+
| timestamp           | comment                                                |
+---------------------+--------------------------------------------------------+
| 2019-03-04 13:11:35 | This one contains a deg symbol (°)                     |
| 2020-02-04 13:48:04 | This one has a mu (µ)                                  |
| 2022-06-12 16:07:50 | This one has a line break >
And it contains a NBSP >   |
+---------------------+--------------------------------------------------------+
3 rows in set (0.000 sec)

MariaDB [atlas]>

Here is my SQL script, TestSQL.sql, to reproduce the problem.

DROP TABLE IF EXISTS TestSQL;

--SET CHARACTER SET utf8mb4;

CREATE TABLE IF NOT EXISTS TestSQL (
        timestamp       timestamp DEFAULT CURRENT_TIMESTAMP,
        comment         varchar(256) COLLATE utf8mb4_unicode_ci,
        PRIMARY KEY (timestamp)
) CHARACTER SET 'utf8mb4';


INSERT INTO TestSQL (timestamp,comment)
        VALUES ('2019-03-04 13:11:35','This one contains a deg symbol (°)');
INSERT INTO TestSQL (timestamp,comment)
        VALUES ('2020-02-04 13:48:04','This one has a mu (µ)');
INSERT INTO TestSQL (timestamp,comment)
        VALUES ('2022-06-12 16:07:50','This one has a line break >
And it contains a NBSP > ');

Other things I have tried:

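  1. The script above contains the line "SET CHARACTER SET utf8mb4;". When the script is run with that line uncommented, it appears to work, but the data contains question marks when I query it: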
MariaDB [atlas]> select * from TestSQL;
+---------------------+-------------------------------------------------------+
| timestamp           | comment                                               |
+---------------------+-------------------------------------------------------+
| 2019-03-04 13:11:35 | This one contains a deg symbol (?)                    |
| 2020-02-04 13:48:04 | This one has a mu (?)                                 |
| 2022-06-12 16:07:50 | This one has a line break >
And it contains a NBSP >? |
+---------------------+-------------------------------------------------------+
3 rows in set (0.000 sec)

I get the same result when I issue that command in the SQL terminal before running the script.

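  2. I tried fiddling with the /etc/my.cnf file. I did not really see any change from this. I also restarted the mariadb server in case that made a difference. This is the config file I ended up with: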
[client]
database=atlas
default-character-set=utf8mb4

[mysqld]
bind-address=0.0.0.0
max_allowed_packet=64M
collation-server = utf8mb4_unicode_ci
character-set-client=utf8mb4
character-set-server=utf8mb4
init-connect = 'SET NAMES utf8mb4 COLLATE utf8mb4_unicode_ci'

[client-server]

Here is the output of the queries requested in a similar post (How to fix "Incorrect string value" errors?):

MariaDB [atlas]> show variables like '%colla%';
+----------------------+--------------------+
| Variable_name        | Value              |
+----------------------+--------------------+
| collation_connection | utf8mb4_general_ci |
| collation_database   | latin1_swedish_ci  |
| collation_server     | utf8mb4_general_ci |
+----------------------+--------------------+
3 rows in set (0.001 sec)

MariaDB [atlas]> show variables like '%charac%';
+--------------------------+------------------------------+
| Variable_name            | Value                        |
+--------------------------+------------------------------+
| character_set_client     | utf8mb4                      |
| character_set_connection | utf8mb4                      |
| character_set_database   | latin1                       |
| character_set_filesystem | binary                       |
| character_set_results    | utf8mb4                      |
| character_set_server     | utf8mb4                      |
| character_set_system     | utf8                         |
| character_sets_dir       | /usr/share/mariadb/charsets/ |
+--------------------------+------------------------------+
8 rows in set (0.001 sec)

Edit: Adding this information in response to the comments.

MariaDB [atlas]> select @@character_set_database, @@collation_database;
+--------------------------+----------------------+
| @@character_set_database | @@collation_database |
+--------------------------+----------------------+
| utf8mb4                  | utf8mb4_unicode_ci   |
+--------------------------+----------------------+
1 row in set (0.000 sec)

In reply to ysth's comment:

MariaDB [atlas]> select *,char_length(comment),hex(comment) from TestSQL;
+---------------------+-------------------------------------------------------+----------------------+------------------------------------------------------------------------------------------------------------+
| timestamp           | comment                                               | char_length(comment) | hex(comment)                                                                                               |
+---------------------+-------------------------------------------------------+----------------------+------------------------------------------------------------------------------------------------------------+
| 2019-03-04 13:11:35 | This one contains a deg symbol (▒)                     |                   34 | 54686973206F6E6520636F6E7461696E732061206465672073796D626F6C2028B029                                       |
| 2020-02-04 13:48:04 | This one has a mu (▒)                                  |                   21 | 54686973206F6E65206861732061206D752028B529                                                                 |
| 2022-06-12 16:07:50 | This one has a line break >
And it contains a NBSP >  |                   53 | 54686973206F6E65206861732061206C696E6520627265616B203E0A416E6420697420636F6E7461696E732061204E425350203E20 |
+---------------------+-------------------------------------------------------+----------------------+------------------------------------------------------------------------------------------------------------+
3 rows in set (0.000 sec)
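
The hex(comment) values can also be decoded offline to see which encoding the stored bytes are in. This is a minimal Python sketch that uses the first row's hex string copied from the output above; reading it as latin1 is an assumption that the decode itself supports.

```python
# hex(comment) for the first row, copied from the query output above.
row_hex = "54686973206F6E6520636F6E7461696E732061206465672073796D626F6C2028B029"
raw = bytes.fromhex(row_hex)

# latin-1 maps every byte to exactly one character, so this decode cannot fail.
text = raw.decode("latin-1")
print(len(raw), repr(text))

# The same bytes are not valid UTF-8: a lone 0xB0 is a continuation byte
# with no lead byte in front of it.
try:
    raw.decode("utf-8")
    is_utf8 = True
except UnicodeDecodeError:
    is_utf8 = False
print("valid UTF-8:", is_utf8)
```

The byte length equals the reported char_length of 34, so each character occupies one byte. Had the degree sign been stored as UTF-8, the hex would contain C2B0 rather than B0 and the value would be one byte longer.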

Information requested by Rick James: Note that I did have to add the NBSP back into the last query. Somehow I must have accidentally deleted the NBSP character!

MariaDB [atlas]> select *,char_length(comment),hex(comment) from TestSQL;
+---------------------+----------------------------------------------------------+----------------------+--------------------------------------------------------------------------------------------------------------+
| timestamp           | comment                                                  | char_length(comment) | hex(comment)                                                                                                 |
+---------------------+----------------------------------------------------------+----------------------+--------------------------------------------------------------------------------------------------------------+
| 2019-03-04 13:11:35 | This one contains a deg symbol (°)                       |                   34 | 54686973206F6E6520636F6E7461696E732061206465672073796D626F6C2028B029                                         |
| 2020-02-04 13:48:04 | This one has a mu (µ)                                    |                   21 | 54686973206F6E65206861732061206D752028B529                                                                   |
| 2022-06-12 16:07:50 | This one has a line break >
And it contains a NBSP >     |                   54 | 54686973206F6E65206861732061206C696E6520627265616B203E0A416E6420697420636F6E7461696E732061204E425350203EA0A0 |
+---------------------+----------------------------------------------------------+----------------------+--------------------------------------------------------------------------------------------------------------+
3 rows in set (0.000 sec)

MariaDB [atlas]> set names latin1;
Query OK, 0 rows affected (0.000 sec)

MariaDB [atlas]> select *,char_length(comment),hex(comment) from TestSQL;
+---------------------+--------------------------------------------------------+----------------------+--------------------------------------------------------------------------------------------------------------+
| timestamp           | comment                                                | char_length(comment) | hex(comment)                                                                                                 |
+---------------------+--------------------------------------------------------+----------------------+--------------------------------------------------------------------------------------------------------------+
| 2019-03-04 13:11:35 | This one contains a deg symbol (▒)                      |                   34 | 54686973206F6E6520636F6E7461696E732061206465672073796D626F6C2028B029                                         |
| 2020-02-04 13:48:04 | This one has a mu (▒)                                   |                   21 | 54686973206F6E65206861732061206D752028B529                                                                   |
| 2022-06-12 16:07:50 | This one has a line break >
And it contains a NBSP >▒▒   |                   54 | 54686973206F6E65206861732061206C696E6520627265616B203E0A416E6420697420636F6E7461696E732061204E425350203EA0A0 |
+---------------------+--------------------------------------------------------+----------------------+--------------------------------------------------------------------------------------------------------------+
3 rows in set (0.000 sec)

MariaDB [atlas]> show create table;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near '' at line 1
MariaDB [atlas]> show create table Saves \G
*************************** 1. row ***************************
       Table: Saves
Create Table: CREATE TABLE `Saves` (
  `expNum` varchar(16) DEFAULT NULL,
  `timestamp` timestamp NOT NULL DEFAULT current_timestamp(),
  `comment` varchar(256) DEFAULT NULL,
  PRIMARY KEY (`timestamp`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.000 sec)

Tags: mariadb collate character-set

1 Answer

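Since the bytes coming from the client are encoded in latin1 (0xB5, etc.), you must use latin1, not utf8mb4, for some of these settings. One way to do that is:

SET NAMES latin1;

This changes character_set_client, character_set_connection and character_set_results. The last one probably does not matter for loading; it is only relevant for converting back to latin1 during a SELECT.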
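
An alternative to changing the connection character set, which is not part of the answer above, is to convert the script file itself to UTF-8 so that it matches a utf8mb4 connection. The following is a minimal Python sketch, assuming the file really is latin1; the helper `reencode_latin1_to_utf8` and the file names are hypothetical, and the demo writes to a temporary directory rather than touching a real script.

```python
import tempfile
from pathlib import Path


def reencode_latin1_to_utf8(src: Path, dst: Path) -> None:
    """Rewrite a latin1-encoded text file as UTF-8 (assumes src is latin1)."""
    text = src.read_bytes().decode("latin-1")
    dst.write_bytes(text.encode("utf-8"))


# Demo on a throwaway file that contains one latin1 micro sign (0xB5).
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "Test.sql"
    dst = Path(tmp) / "Test_utf8.sql"
    src.write_bytes(b"INSERT INTO TestSQL (comment) VALUES ('mu (\xb5)');\n")
    reencode_latin1_to_utf8(src, dst)
    converted = dst.read_bytes()

print(converted)
```

The converted file would then be sourced with the connection left at utf8mb4. Running this on a file that is already UTF-8 would double-encode it, so it is worth confirming the file's encoding first.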
