Using LOAD DATA INFILE with Arabic data

℡╲_俬逩灬. submitted on 2019-12-24 00:54:28

Question


I am trying to import a .csv file into a table. I have figured out how to get the data inserted by using the following query:

LOAD DATA INFILE 'examplesofdata.csv' INTO TABLE coins FIELDS TERMINATED BY ',' 
ENCLOSED BY '' ESCAPED BY '\\'  IGNORE 1 LINES;

However, several of my fields contain Arabic content, which gets entered as a series of ? characters. I assume this is because I haven't set the database's character set or collation correctly, or because I don't fully understand the LOAD DATA INFILE query. Any advice would be greatly appreciated.

The SHOW CREATE TABLE coins; output is:

CREATE TABLE `coins` (
  `cat_num` int(11) NOT NULL,
  `reg_num` int(11) NOT NULL,
  `period` varchar(255) NOT NULL,
  `arb_period` varchar(255) character set utf8 collate utf8_unicode_ci NOT NULL,
  `ruler` varchar(255) NOT NULL,
  `arb_ruler` varchar(255) character set utf8 collate utf8_unicode_ci NOT NULL,
  `mint` varchar(255) NOT NULL,
  `arb_mint` varchar(255) character set utf8 collate utf8_unicode_ci NOT NULL,
  `date` varchar(255) NOT NULL,
  `weight` float NOT NULL,
  `diameter` float NOT NULL,
  `khedieval_num` varchar(255) NOT NULL,
  `ref` text NOT NULL,
 PRIMARY KEY  (`cat_num`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8

Answer 1:


LOAD DATA LOCAL INFILE 'filename' INTO TABLE tablename CHARACTER SET utf8 COLUMNS TERMINATED BY '\t' LINES TERMINATED BY '\n';

The CHARACTER SET utf8 clause does the trick.




Answer 2:


This is still a bug with MySQL. However, I found out that the database's default charset is the culprit. There are two possible workarounds:

  1. If you change your database's default charset to LATIN1 then it will work. You can keep your tables/columns UTF-8.
  2. Strangely, if you use CHARACTER SET latin1 in the LOAD DATA statement, it will work for both UTF-8 and Latin1 tables/columns. With this method, you can keep your database, table, and column charsets on UTF-8.



Answer 3:


So I ended up getting an answer from an old instructor for my Databases class. He told me that this problem is actually a reported bug with the current version of MySQL and that the only known solution at the time is to manually import the data through PHP or another scripting language.

The bug for this issue is at: http://bugs.mysql.com/bug.php?id=10195

It didn't help me too much, since I was only working on a prototype and managed a workaround in the meantime, but hopefully it can be of more use to you.
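For anyone who does take the scripted route, here is a minimal sketch in Python rather than PHP. It assumes the CSV file is UTF-8 encoded, has a header row, and lists its columns in the same order as the table. The helper names `read_rows` and `build_insert` are hypothetical, the two-column sample is a shortened stand-in for the real file, and the `%s` placeholder style is the one used by common MySQL drivers such as PyMySQL, so adjust it for your driver.

```python
import csv
import io

def read_rows(fileobj, skip_header=True):
    """Parse CSV text into a list of tuples, one per non-empty row."""
    reader = csv.reader(fileobj, delimiter=",", escapechar="\\")
    rows = [tuple(row) for row in reader if row]
    return rows[1:] if skip_header else rows

def build_insert(table, column_count):
    """Build a parameterized INSERT; the values are passed to the driver separately."""
    placeholders = ", ".join(["%s"] * column_count)
    return f"INSERT INTO `{table}` VALUES ({placeholders})"

# Shortened sample standing in for:
#   open("examplesofdata.csv", encoding="utf-8", newline="")
sample = io.StringIO("period,arb_period\nFatimid,الفاطميون\n")
rows = read_rows(sample)
sql = build_insert("coins", len(rows[0]))

# With a DB-API connection opened with a UTF-8 charset (an assumption here),
# the actual import would then be:
#   cursor.executemany(sql, rows)
#   connection.commit()
```

Decoding the file explicitly as UTF-8 in the script, and opening the connection with a matching charset, keeps the Arabic text intact end to end instead of relying on server defaults.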




Answer 4:


How about setting CHARACTER SET to utf8 (utf8_unicode_ci is a collation, not a character set), or to the character set that matches your locale?




Answer 5:


I also found out that your character_set_client needs to be UTF-8 as well:

mysql> show session variables like 'char%';
+--------------------------+----------------------------------------+
| Variable_name            | Value                                  |
+--------------------------+----------------------------------------+
| character_set_client     | latin1                        
...

Read the MySQL docs on how to change that for the whole server or for the current session only (for a single session, SET NAMES utf8 is one way).




Answer 6:


I also had that issue, but instead of a series of ?, I was getting truncated data.

For example, "aeióu" was being truncated to "aei".

The solution I came up with is to match the CSV file's charset with the charset given in the LOAD DATA INFILE statement.
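One way to find out which charset the CSV file actually uses is to test its raw bytes before writing the LOAD DATA statement. The sketch below is only a heuristic under stated assumptions: the function name `detect_load_data_charset` is hypothetical, and any file that is not valid UTF-8 is assumed to be latin1, which may be wrong for other single-byte encodings.

```python
def detect_load_data_charset(raw: bytes) -> str:
    """Suggest a charset name for the CHARACTER SET clause of LOAD DATA.

    Heuristic (assumption): bytes that decode cleanly as UTF-8 are treated
    as UTF-8; anything else is assumed to be latin1.
    """
    try:
        raw.decode("utf-8")
        return "utf8"
    except UnicodeDecodeError:
        return "latin1"

# In practice the bytes would come from: open("examplesofdata.csv", "rb").read()
utf8_bytes = "aeióu".encode("utf-8")
latin1_bytes = "aeióu".encode("latin-1")

print(detect_load_data_charset(utf8_bytes))    # utf8
print(detect_load_data_charset(latin1_bytes))  # latin1
```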

Cheers




Answer 7:


Adding CHARACTER SET utf8 to the LOAD DATA statement is the proximate answer. However, two other issues have been brought up...

When trying to use utf8/utf8mb4, if you see question marks (regular ones, not black diamonds):

  • The bytes to be stored are not encoded as utf8. Fix this.
  • Check that the column in the database is CHARACTER SET utf8 (or utf8mb4). If it is not, fix this.
  • Also, check that the connection during reading is utf8.

When trying to use utf8/utf8mb4, if you see truncated text:

  • The bytes to be stored are not encoded as utf8. Fix this.
  • Also, check that the connection during reading is utf8.
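The two symptoms can be reproduced outside the database. The sketch below uses Python's own codecs as a stand-in for the conversions involved, so it illustrates the mechanism rather than MySQL's actual code path. The sample strings are illustrative.

```python
# Question marks: the text cannot be represented in the target charset,
# so each unrepresentable character is replaced with "?".
arabic = "مصر"
as_latin1 = arabic.encode("latin-1", errors="replace")

# Truncation: latin1 bytes are read as if they were UTF-8, and the first
# invalid byte ends the usable text.
raw = "aeióu".encode("latin-1")  # the ó is the single byte 0xF3
try:
    kept = raw.decode("utf-8")
except UnicodeDecodeError as err:
    kept = raw[:err.start].decode("utf-8")

print(as_latin1)  # b'???'
print(kept)       # aei
```

In both cases the cure is the same as in the lists above: make the declared encoding agree with the real encoding of the bytes at every step (file, connection, column).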


Source: https://stackoverflow.com/questions/2137175/using-load-data-infile-with-arabic-data
