When I import Gujarati data using a CSV file, the data shows as symbols

江枫思渺然 submitted on 2020-01-05 08:35:00

Question


I am using a Db2 database, and when I import Gujarati data, it shows up as symbols. I tried setting UTF-8, but it still shows symbols. The Db2 server platform is Windows. How can I import Gujarati data?


Answer 1:


It is not clear from the problem description whether the issue is with the client or the database, so I will show general steps for troubleshooting an issue of this kind. I understand that your intention is to store the data as UTF-8, and the Db2 documentation says:

The following Indic scripts are supported through Unicode: Hindi, Gujarati, Kannada, Konkani, Marathi, Punjabi, Sanskrit, Tamil and Telugu.

i.e. we can use any UTF-8 database (code page 1208) for Gujarati. According to Wikipedia, Gujarati has 91 assigned code points, all within the Unicode block U+0A80 to U+0AFF. This implies that each character needs 3 bytes of storage when encoded as UTF-8 (which also means the first byte will always be 0xE0).
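
The 3-byte claim can be checked outside of Db2. The following is a minimal Python sketch, not part of the original answer, that encodes every code point in the Unicode Gujarati block (U+0A80 to U+0AFF) and collects the encoded lengths and lead bytes. The variable names are illustrative only.

```python
# Boundaries of the Unicode "Gujarati" block.
BLOCK_START, BLOCK_END = 0x0A80, 0x0AFF

# Encode every code point in the block (assigned or not) as UTF-8.
encoded = [chr(cp).encode("utf-8") for cp in range(BLOCK_START, BLOCK_END + 1)]

lengths = {len(b) for b in encoded}    # distinct encoded lengths
lead_bytes = {b[0] for b in encoded}   # distinct first bytes

print("encoded lengths:", lengths)
print("lead bytes:", {f"0x{b:02X}" for b in lead_bytes})
```

Every code point in the block should come out as exactly 3 bytes with a lead byte of 0xE0, which is what makes the hex dump of a Gujarati column easy to read by eye.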

Let's try to use "ગુજરાતી" (Gujarati) as an example. It consists of 7 characters:

U+0A97 GUJARATI LETTER GA       utf-8 0xE0AA97
U+0AC1 GUJARATI VOWEL SIGN U    utf-8 0xE0AB81
U+0A9C GUJARATI LETTER JA       utf-8 0xE0AA9C
U+0AB0 GUJARATI LETTER RA       utf-8 0xE0AAB0
U+0ABE GUJARATI VOWEL SIGN AA   utf-8 0xE0AABE
U+0AA4 GUJARATI LETTER TA       utf-8 0xE0AAA4
U+0AC0 GUJARATI VOWEL SIGN II   utf-8 0xE0AB80
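
The list above can be reproduced with a short Python sketch (an illustration added here, not something from the original answer). It uses only the standard `unicodedata` module. The word is written with `\u` escapes so the result does not depend on how the script file itself is encoded.

```python
import unicodedata

# "Gujarati" in Gujarati script, written with escapes so the example does
# not depend on the encoding of this source file.
word = "\u0A97\u0AC1\u0A9C\u0AB0\u0ABE\u0AA4\u0AC0"

# One (code point, Unicode name, UTF-8 hex) tuple per character.
rows = [
    (f"U+{ord(ch):04X}", unicodedata.name(ch), ch.encode("utf-8").hex().upper())
    for ch in word
]

for code_point, name, utf8_hex in rows:
    print(f"{code_point} {name:<26} utf-8 0x{utf8_hex}")
```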

Let's test:

db2 "create table gujarati_tab(c1 int, c2 varchar(10 codeunits32))"
db2 "insert into gujarati_tab values(1, 'ગુજરાતી')"

To make sure data is stored properly we can examine the binary structure of our column:

db2 "select hex(c2) from gujarati_tab"

1                                          
-------------------------------------------
E0AA97E0AB81E0AA9CE0AAB0E0AABEE0AAA4E0AB80 

Now you can split that into seven 3-byte sequences, each matching the expected bytes for the corresponding character:

E0AA97 E0AB81 E0AA9C E0AAB0 E0AABE E0AAA4 E0AB80
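
The same split can be done programmatically. As a sketch (the variable `hex_from_db2` is just a name chosen here for the string copied from the `hex(c2)` output), Python can cut the value into 3-byte groups and decode it back to text:

```python
# Hex string as printed by: db2 "select hex(c2) from gujarati_tab"
hex_from_db2 = "E0AA97E0AB81E0AA9CE0AAB0E0AABEE0AAA4E0AB80"

raw = bytes.fromhex(hex_from_db2)

# Split into 3-byte groups, one per Gujarati code point.
chunks = [raw[i:i + 3].hex().upper() for i in range(0, len(raw), 3)]

# Decoding raises UnicodeDecodeError if the bytes are not valid UTF-8.
decoded = raw.decode("utf-8")

print(" ".join(chunks))
print(decoded)
```

If the decode step succeeds and the result is the original word, the bytes stored in the column are valid UTF-8.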

which implies the data is stored correctly in the database. If there is still an issue on the client end, it is strictly a problem of the client application not correctly interpreting the UTF-8 data returned by the database.
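
To illustrate what such a client-side misinterpretation can look like, here is a hedged Python sketch. It assumes the client decodes the bytes with the Windows code page 1252, which is only one possible cause and is not confirmed by the question. The same correct bytes then turn into unrelated symbols.

```python
# Correct UTF-8 bytes, as stored in the database column.
stored = bytes.fromhex("E0AA97E0AB81E0AA9CE0AAB0E0AABEE0AAA4E0AB80")

# Interpreted as UTF-8: the original 7-character word.
correct = stored.decode("utf-8")

# Interpreted as code page 1252 (assumed client setting): every byte becomes
# its own character; bytes undefined in cp1252 become U+FFFD.
garbled = stored.decode("cp1252", errors="replace")

print(correct)
print(garbled)
print(len(correct), len(garbled))
```

The data is identical in both cases. Only the decoding step differs, which is why checking `hex()` on the server side separates storage problems from display problems.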



Source: https://stackoverflow.com/questions/58092941/when-i-importing-gujarati-data-using-csv-file-that-time-data-show-like
