utf-8 | 易学教程

How do you know what encoding the user is inputting into the browser?

Question: I read Joel's article about character sets, so I'm taking his advice to use UTF-8 on my web page and in my database. What I can't understand is what to do with user input. As Joel says, "It does not make sense to have a string without knowing what encoding it uses." But how do I know what encoding the user input string uses? If I have <input type="text" name="atextfield" > on my page, how do I know what encoding I'm getting from the user? What if the user puts in some special ASCII symbol,
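Browsers normally submit form data in the encoding of the page that contains the form, so a page served as UTF-8 should produce UTF-8 bytes. A server can still check this instead of trusting it. The following Python sketch decodes the raw bytes strictly and rejects anything that is not valid UTF-8. The helper name `decode_form_value` is hypothetical, and most web frameworks already perform this step.

```python
def decode_form_value(raw: bytes) -> str:
    """Decode a submitted form value, insisting on valid UTF-8.

    Hypothetical helper for illustration only.
    """
    try:
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        raise ValueError(f"input is not valid UTF-8: {exc}") from exc


# Valid UTF-8 passes through unchanged.
print(decode_form_value("Änderungen".encode("utf-8")))

# The same word in Latin-1 is not valid UTF-8 and is rejected.
try:
    decode_form_value("Änderungen".encode("latin-1"))
except ValueError as err:
    print("rejected:", err)
```

Rejecting bad input at the boundary means every string stored afterwards has a known encoding.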

Integrity constraint violation: 1062 Duplicate entry for utf8_unicode_ci collation

Question: I have a table called tag with a unique constraint on the name column: CREATE TABLE `tag` ( `id` int(11) NOT NULL AUTO_INCREMENT, `name` varchar(255) COLLATE utf8_unicode_ci NOT NULL, PRIMARY KEY (`id`), UNIQUE KEY `UNIQ_389B7835E237E06` (`name`) ) ENGINE=InnoDB AUTO_INCREMENT=13963 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci; The collation for this table is utf8_unicode_ci. When I try to insert the following two entries I get an "Integrity constraint violation" exception. SQL log: 130607 14:35
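utf8_unicode_ci is a case-insensitive and accent-insensitive collation, so a unique index on such a column treats strings that differ only in case or diacritics as the same key. The Python sketch below approximates that comparison by stripping combining marks and case-folding. It is not MySQL's actual collation algorithm, and the tag names are hypothetical because the SQL log above is cut off.

```python
import unicodedata


def approx_ci_ai_key(text: str) -> str:
    """Rough stand-in for a case- and accent-insensitive collation key.

    Assumption: this only approximates utf8_unicode_ci, which follows
    Unicode Collation Algorithm weight tables rather than this shortcut.
    """
    decomposed = unicodedata.normalize("NFD", text)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_marks.casefold()


# Hypothetical tag names, not taken from the truncated SQL log.
for a, b in [("Cafe", "café"), ("resume", "résumé"), ("tag", "tab")]:
    clash = approx_ci_ai_key(a) == approx_ci_ai_key(b)
    print(f"{a!r} vs {b!r}: would collide = {clash}")
```

If distinct accented forms must coexist under a unique key, a binary collation such as utf8_bin compares the stored bytes instead.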

Email subject MIME encoding in Perl.

Question: I am trying to send an email with non-ASCII characters in the subject line under Perl 5.8.5. My simple example uses the word "Änderungen" (German umlaut), but instead of correctly converting the "Ä", the subject line always turns out as "Ã?nderungen". #!/usr/bin/env perl use warnings; use strict; use Encode qw(decode encode); my $subject = "Änderungen"; my $subject_encoded = encode("MIME-Q", $subject); [...] open(MAIL, "| /usr/sbin/sendmail -n -t $recipient") || return "ERROR"; print MAIL
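Non-ASCII subject text travels in a mail header as an RFC 2047 encoded word. As a point of comparison with the Perl above, this Python sketch uses the standard library's `email.header` module to encode the same word and decode it back. It only illustrates the round trip and is not a fix for the Perl script.

```python
from email.header import Header, decode_header, make_header

subject = "Änderungen"

# Encode the subject as an RFC 2047 encoded word using UTF-8.
encoded = Header(subject, charset="utf-8").encode()

# Decode it again to confirm that nothing was lost on the way.
decoded = str(make_header(decode_header(encoded)))

print(encoded)  # pure ASCII, safe to place in a mail header
print(decoded)
```

If the decoded value differs from the original at this stage, the input string was already misinterpreted before it reached the encoder.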

Storing Swedish characters in a MySQL database

Question: I'm having problems storing Swedish characters in my MySQL database. I want to store them in my table called users with the collation utf8_bin. Even though I'm using utf8, the characters å ä ö get stored as Ã¥ Ã¤ Ã¶ and I don't know why. Retrieving the data and echoing it gives me the same output, with the weird characters instead of å ä ö. Any help is appreciated. Answer 1: Call mysql_set_charset("utf8"); after connecting and before making any queries. Your database charset is just for storage,
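The pattern Ã¥ Ã¤ Ã¶ is what the UTF-8 bytes of å ä ö look like when each byte is read as a separate Latin-1 character, which fits the answer's advice about the connection charset. The Python sketch below reproduces the garbling, assuming the mismatched side was using Latin-1.

```python
original = "å ä ö"

# The client sends UTF-8 bytes...
utf8_bytes = original.encode("utf-8")

# ...but a side that assumes Latin-1 reads each byte as one character.
garbled = utf8_bytes.decode("latin-1")

# Reversing the wrong step recovers the text, so no data was lost.
repaired = garbled.encode("latin-1").decode("utf-8")

print(garbled)   # Ã¥ Ã¤ Ã¶
print(repaired)  # å ä ö
```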

Emoji in R [UTF-8 encoding]

Question: I'm trying to do an emoji analysis in R. I have stored some tweets that contain emojis. Here is one of the tweets that I want to analyze: > tweetn2 [1] "Programme du week-end: \xed\xa0\xbd\xed\xb2\x83\xed\xa0\xbc \xed\xbe\xb6\xed\xa0\xbc \xed\xbd\xbb\xed\xa0\xbc\xed\xbd\xbb\xed\xa0\xbc \xed\xbd\xbb\xed\xa0\xbc\xed\xbd\xbb" To be sure that I have "UTF-8": > Encoding(tweetn2) [1] "UTF-8 " Now when I try to recognize some characters, it does not work: > grepl("\\xed",tweetn2) [1]
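Byte runs such as \xed\xa0\xbd\xed\xb2\x83 are not well-formed UTF-8. Each three-byte group encodes one half of a UTF-16 surrogate pair, which is why tools that expect real UTF-8 fail to match them. The Python sketch below decodes the first six bytes of the tweet and joins the pair into a single code point; it illustrates the byte layout and is not an R solution.

```python
# First six bytes of the escaped sequence shown in the question.
raw = b"\xed\xa0\xbd\xed\xb2\x83"

# "surrogatepass" lets Python decode the two surrogate halves individually.
surrogates = raw.decode("utf-8", errors="surrogatepass")

# Round-tripping through UTF-16 joins the surrogate pair into one code point.
emoji = surrogates.encode("utf-16", errors="surrogatepass").decode("utf-16")

print([hex(ord(ch)) for ch in surrogates])  # ['0xd83d', '0xdc83']
print(hex(ord(emoji)))                      # 0x1f483
```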

Can an XSD schema validate encoding, e.g. UTF-8?

Question: By using a schema, is there any simple way to validate the encoding of an XML message, assuming the first line of the XML is not trustworthy (e.g. ignoring <?xml version="1.0" encoding="UTF-8" ?>)? Answer 1: No, a schema can't dictate the encoding type except in terms of the binary data element types, but this encoding is still going to be encapsulated by the high-level encoding of the document itself. This makes sense if you realize that the schema is supposed to describe the information and not the transport

Could File::Find::Rule be patched to automatically handle filename character encoding/decoding?

Question: Suppose I have a file with the name æ (Unicode: 0xE6, UTF-8: 0xC3 0xA6) in the current directory. Then, I would like to use File::Find::Rule to locate it: use feature qw(say); use open qw( :std :utf8 ); use strict; use utf8; use warnings; use File::Find::Rule; my $fn = 'æ'; my @files = File::Find::Rule->new->name($fn)->in('.'); say $_ for @files; The output is empty, so apparently this did not work. If I try to encode the filename first: use Encode; my $fn = 'æ'; my $fn_utf8 = Encode::encode(
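On most Unix filesystems a filename is a byte sequence, so a decoded character string only matches a directory entry once one side has been converted. The Python sketch below shows that explicit step with a temporary directory. It assumes filenames are stored as UTF-8 and is an illustration of the idea, not a patch for File::Find::Rule.

```python
import os
import tempfile

name = "æ"
name_bytes = name.encode("utf-8")  # b"\xc3\xa6", as stated in the question

with tempfile.TemporaryDirectory() as tmp:
    tmp_bytes = os.fsencode(tmp)

    # Create the file through the bytes API so the name is exactly these bytes.
    open(os.path.join(tmp_bytes, name_bytes), "wb").close()

    # Listing with a bytes path returns undecoded names.
    raw_names = os.listdir(tmp_bytes)

    # Decode explicitly before comparing with a character string.
    matches = [n.decode("utf-8") for n in raw_names if n.decode("utf-8") == name]

print(raw_names)
print(matches)
```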

How to make SQL Developer display non-English characters correctly instead of displaying squares?

Question: In the SQL Developer preferences, Environment > Encoding is already set to 'UTF-8', Code Editor > Fonts is set to 'Verdana', and Database > NLS > Language is set to 'American'. The data in the database was written by Java in UTF-8 encoding (95% sure). What else do I need to do to make it display correctly? Note: the square characters are actually Chinese characters. Answer 1: Problem solved by using the font 'Microsoft YaHei'. Answer 2: SQL Developer uses the system fonts from the host machine. On my Win8 system there is a

Disable encoding of unicode characters in ASP.NET-MVC3

Question: On my site all text is served as UTF-8. Since nowadays every browser supports Unicode characters, I would like to use them as-is. The ASP.NET framework is very helpful by replacing any Unicode character with a numeric character reference, like &#225; (for á). For reference check: http://en.wikipedia.org/wiki/Unicode_and_HTML#HTML_document_characters Sure, this way the web page renders correctly in the oldest Netscape possible, but for example the Google Analytics ecommerce module has some trouble understanding
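A numeric character reference is the character's code point written as `&#N;`. The Python sketch below shows the mapping in both directions for á. It illustrates the notation only and says nothing about how ASP.NET produces or can be configured to skip these references.

```python
import html

text = "á"

# Encoding to ASCII with xmlcharrefreplace yields the numeric character reference.
as_reference = text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")

# html.unescape turns the reference back into the character.
round_trip = html.unescape(as_reference)

print(as_reference)  # &#225;
print(round_trip)    # á
```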

How to get PyCharm to display unicode data in its console?

Question: I have switched over to PyCharm and have had a blast using it. I code for projects that use languages other than English (i.e. Hebrew and Arabic) and need to debug encodings once in a while. For some reason, PyCharm will not display Unicode characters in its debug console. I have set the IDE encoding to UTF-8 but it did not help. Any ideas? Answer 1: You need to change the console font to one that contains the required Unicode glyphs: Answer 2: The accepted answer is no longer correct. Of the