latin1 | 易学教程

Converting character encoding within c++

Question: I have a website that allows users to enter usernames. The problem is that the C++ code assumes the browser encoding is Western European and converts the string received from the username text box into Unicode to compare it with the string stored in the database. With the right browser encoding set, the string úser is received as %FAser and converted properly to úser within the program. However, with the browser set to UTF-8, the string is received as %C3%BAser and then converted
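The two payloads in the question differ only in which encoding the browser applied before percent-encoding. Below is a minimal Python sketch of one way to accept both, assuming the server may try UTF-8 first and fall back to Latin-1. The function name `decode_form_value` is hypothetical, and the fallback is only a heuristic: a Latin-1 byte sequence that happens to be valid UTF-8 would be read as UTF-8.

```python
from urllib.parse import unquote_to_bytes


def decode_form_value(raw: str) -> str:
    """Percent-decode a form value, trying UTF-8 first and then Latin-1."""
    data = unquote_to_bytes(raw)  # '%FAser' -> b'\xfaser'
    try:
        # Bytes above 0x7F in Latin-1 text are rarely valid UTF-8 by accident.
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a character, so this cannot fail.
        return data.decode("latin-1")


from_latin1_browser = decode_form_value("%FAser")
from_utf8_browser = decode_form_value("%C3%BAser")
```

Both calls produce the same Unicode string, so the comparison against the stored name no longer depends on the browser setting.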

Python UTF-8 Latin-1 displays wrong character

Question: I'm writing a very small script that can convert Latin-1 characters into Unicode (I'm a complete beginner in Python). I tried a method like this: def latin1_to_unicode(character): uni = character.decode('latin-1').encode("utf-8") return uni It works fine for characters that are not specific to the Latin-1 set, but if I try the following example: print latin1_to_unicode('å') It returns Ã¥ instead of å. The same goes for other letters like æ and ø. Can anyone please explain why this is happening?
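The symptom in the question is what appears when the two UTF-8 bytes of å are displayed one byte per character as Latin-1. Below is a small Python 3 sketch that reproduces and reverses that misreading. The question itself uses Python 2, and the helper names here are hypothetical.

```python
def utf8_seen_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then misread each byte as one Latin-1 character."""
    return text.encode("utf-8").decode("latin-1")


def undo_misreading(garbled: str) -> str:
    """Recover the original bytes, then decode them as UTF-8."""
    return garbled.encode("latin-1").decode("utf-8")


garbled = utf8_seen_as_latin1("å")   # bytes C3 A5 shown as two characters
restored = undo_misreading(garbled)  # the single character å again
```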

Utf-8 characters displayed as ISO-8859-1

Question: I've got an issue with inserting/reading UTF-8 content from a DB. All the verifications I'm doing seem to indicate that the content in my DB should be UTF-8 encoded, but it seems to be latin encoded. The data are initially imported by a PHP script run from the CLI. Configuration: Zend Framework Version: 1.10.5; mysql-server-5.0: 5.0.51a-3ubuntu5.7; php5-mysql: 5.2.4-2ubuntu5.10; apache2: 2.2.8-1ubuntu0.16; libapache2-mod-php5: 5.2.4-2ubuntu5.10. Verifications: -mysql: mysql> SHOW VARIABLES

Strip down everything, except alphanumeric and European characters in PHP

Question: I am working on validating my commenting script, and I need to strip out all non-alphanumeric characters except those used in Western Europe. My plan is to remove all non-alphanumeric characters with a regex: preg_replace("/[^A-Za-z0-9 ]/", '', $string); But so far that also strips out all European characters and the £ sign, so "Café Rouge" becomes "Caf Rouge". How can I add an array of European characters to the above regex? The array is: £, €, á, à, â, ä, æ, ã, å, è, é, ê, ë, î, ï, í, ì, ô, ö, ò, ó, ø, õ, û, ü, ù,
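One way to extend the character class is to list the extra characters inside it. Below is a Python sketch of that idea (not PHP), using only the characters visible in the truncated list above. The names `EXTRA_CHARS` and `strip_disallowed` are hypothetical, and the input is assumed to use precomposed characters (NFC form).

```python
import re

# Only the characters visible in the question's (truncated) list.
EXTRA_CHARS = "£€áàâäæãåèéêëîïíìôöòóøõûüù"

# Negated class: anything that is not ASCII alphanumeric, a space,
# or one of the extra characters gets removed.
_DISALLOWED = re.compile("[^A-Za-z0-9 " + EXTRA_CHARS + "]")


def strip_disallowed(text: str) -> str:
    """Remove every character outside the allowed set."""
    return _DISALLOWED.sub("", text)


cleaned_name = strip_disallowed("Café Rouge!")
cleaned_price = strip_disallowed("£5 <b>")
```

Python 3 matches on Unicode code points by default, so no extra flag is needed here for the accented characters to be treated as single units.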

NodeJS decodeURIComponent not working properly

Question: When I tried to decode the string below in Node.js using decodeURIComponent: var decoded = decodeURI('Ulysses%20Guimar%C3%A3es%20-%20lado%20par'); console.log(decoded); I got Ulysses GuimarÃ£es - lado par instead of Avenida Ulysses Guimarães - lado par. But when I use the same code on the client side (browser), I get the right character 'ã'. Is there a way to convert from Ã£ to ã in a Node script? Answer 1: I cannot reproduce it in the 0.10 or 0.11 versions of Node. You can convert the first to the second using
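The garbled output in the question matches what results when the percent-decoded bytes are interpreted as Latin-1 rather than UTF-8. Below is a short Python sketch (not Node) that reproduces both results with the standard library's `urllib.parse.unquote`. It illustrates the symptom only and does not claim to identify where the misreading happened in the asker's setup.

```python
from urllib.parse import unquote

encoded = "Ulysses%20Guimar%C3%A3es%20-%20lado%20par"

# Default behaviour: the decoded bytes are interpreted as UTF-8.
as_utf8 = unquote(encoded)

# Same bytes interpreted as Latin-1: C3 A3 becomes two separate characters.
as_latin1 = unquote(encoded, encoding="latin-1")

print(as_utf8)
print(as_latin1)
```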

Convert QString into QByteArray with either UTF-8 or Latin1 encoding

Question: I would like to convert a QString into either a UTF-8 or a Latin-1 QByteArray, but currently I get everything as UTF-8. I am testing this with characters in the upper range of Latin-1, above 0x7f, where the German ü is a good example. If I do this: QString name("\u00fc"); // U+00FC = ü QByteArray utf8; utf8.append(name); qDebug() << "utf8" << name << utf8.toHex(); QByteArray latin1; latin1.append(name.toLatin1()); qDebug() << "Latin1" << name << latin1.toHex(); QTextCodec *codec =
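For reference, U+00FC occupies two bytes in UTF-8 and one byte in Latin-1, so the hex dumps of the two byte arrays should differ when the conversions work. Below is a Python sketch of the expected byte values, as an illustration of the encodings rather than Qt code.

```python
name = "\u00fc"  # U+00FC = ü

utf8_hex = name.encode("utf-8").hex()      # two bytes in UTF-8
latin1_hex = name.encode("latin-1").hex()  # one byte in Latin-1

print("utf8", utf8_hex)
print("latin1", latin1_hex)
```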

Python 3 chokes on CP-1252/ANSI reading

Question: I'm working on a series of parsers where I get a bunch of tracebacks from my unit tests like: File "c:\Python31\lib\encodings\cp1252.py", line 23, in decode return codecs.charmap_decode(input,self.errors,decoding_table)[0] UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 112: character maps to <undefined> The files are opened with open() with no extra arguments. Can I pass extra arguments to open() or use something in the codecs module to open these differently? This came
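In Python 3, `open()` accepts `encoding` and `errors` keyword arguments, which is one way to control how such files are decoded. Below is a minimal sketch. The helper `read_text` and the sample bytes are hypothetical, and the right encoding for the asker's files is not known from the excerpt.

```python
import os
import tempfile


def read_text(path, encoding="cp1252", errors="strict"):
    """Read a whole text file with an explicit encoding and error policy."""
    with open(path, encoding=encoding, errors=errors) as handle:
        return handle.read()


# Build a throwaway file containing byte 0x81, which cp1252 leaves undefined.
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as handle:
    handle.write(b"caf\xe9 \x81")

try:
    try:
        read_text(demo_path)  # strict cp1252 rejects 0x81
        strict_failed = False
    except UnicodeDecodeError:
        strict_failed = True

    # Latin-1 assigns a character to every byte, so it never raises.
    as_latin1 = read_text(demo_path, encoding="latin-1")

    # errors="replace" substitutes U+FFFD for undecodable bytes.
    replaced = read_text(demo_path, errors="replace")
finally:
    os.remove(demo_path)
```

Reading as Latin-1 never fails but may silently mislabel characters, while `errors="replace"` keeps cp1252 and marks the bad bytes visibly. Which trade-off fits depends on what the files really contain.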

Using .NET how to convert ISO 8859-1 encoded text files that contain Latin-1 accented characters to UTF-8

Question: I am being sent text files saved in ISO 8859-1 format that contain accented characters from the Latin-1 range (as well as normal ASCII a-z, etc.). How do I convert these files to UTF-8 using C# so that the single-byte accented characters in ISO 8859-1 become valid UTF-8 characters? I have tried to use a StreamReader with ASCIIEncoding, and then converting the ASCII string to UTF-8 by instantiating encoding ascii and encoding utf8 and then using Encoding.Convert(ascii, utf8, ascii.GetBytes(
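The underlying operation is to decode the bytes as ISO 8859-1 and re-encode the result as UTF-8. Decoding them as ASCII first cannot work, because ASCII has no characters above 0x7F. Below is a Python sketch of that operation (not the C# API); the function name `latin1_to_utf8` is hypothetical.

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Decode ISO 8859-1 bytes and re-encode the text as UTF-8."""
    return data.decode("iso-8859-1").encode("utf-8")


sample = b"caf\xe9"  # 'café' in ISO 8859-1: é is the single byte E9

converted = latin1_to_utf8(sample)  # é becomes the two bytes C3 A9

# Going through ASCII loses the accented character entirely.
lossy = sample.decode("ascii", errors="replace")
```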

mysql charset latin1 into utf-8 conversion issue

Question: My client's web app has a large database with millions of records. All tables use the latin1 encoding. When I fetch a text field that holds a large amount of data and mail that string, strange characters appear. For example, when I receive the email, spaces are converted into the character Â. Changing the DB encoding is not permitted. I tried the following PHP function, but with no result: $msg = mb_convert_encoding($msg, "UTF-8", "latin1"); Please help. Answer 1: I would check which encoding PHP thinks it is
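One possible cause, stated here purely as an assumption: the affected "spaces" are non-breaking spaces (U+00A0) and the text is already UTF-8 when it reaches the conversion, so converting from latin1 again double-encodes it. Below is a Python sketch of that effect (not PHP); whether it matches the asker's data cannot be confirmed from the excerpt.

```python
nbsp_text = "a\u00a0b"  # 'a', NO-BREAK SPACE, 'b'

# First (correct) encoding: the NBSP becomes the two bytes C2 A0.
utf8_bytes = nbsp_text.encode("utf-8")

# Treating those UTF-8 bytes as Latin-1 and converting again double-encodes:
# C2 is read as the character Â and A0 as a non-breaking space.
double_encoded = utf8_bytes.decode("latin-1").encode("utf-8")

# What a UTF-8 aware mail client would then display.
shown = double_encoded.decode("utf-8")
```

If this is the situation, checking whether the bytes are already valid UTF-8 before converting would be consistent with the advice in the answer above.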