character-encoding | 易学教程

Converting character encoding within c++

Question: I have a website that allows users to enter usernames. The problem is that the C++ code assumes the browser encoding is Western European and converts the string received from the username text box into Unicode to compare it with the string stored in the database. With the right browser encoding set, the string úser is received as %FAser and converted properly to úser within the program. However, with the browser encoding set to UTF-8, the string is received as %C3%BAser and then converted ...
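The two byte sequences in the question can be reproduced in a few lines. The sketch below is in Python rather than the asker's C++, and the helper name `decode_form_value` is hypothetical. It percent-decodes the value to raw bytes, tries UTF-8 first, and falls back to Latin-1 only if the bytes are not valid UTF-8. That fallback is an assumption about the site's legacy encoding, not something the excerpt confirms.

```python
from urllib.parse import unquote, unquote_to_bytes


def decode_form_value(raw: str) -> str:
    """Percent-decode raw, trying UTF-8 first and falling back to Latin-1."""
    data = unquote_to_bytes(raw)
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 is Western European.
        return data.decode("latin-1")


# The mismatch from the question: UTF-8 bytes read as Latin-1 give two characters.
print(unquote("%C3%BAser", encoding="latin-1"))

# Both submissions decode to the same username with the fallback strategy.
print(decode_form_value("%FAser"))
print(decode_form_value("%C3%BAser"))
```

Trying UTF-8 first works reasonably well because accented Latin-1 text is rarely valid UTF-8 by accident, but it remains a heuristic. Declaring the form's encoding explicitly is the more robust fix.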

SEO Canonical URL in Greek characters

Question: I have a URL that includes Greek letters: http://www.mydomanain.com/gr/τιτλος-σελιδας/20/ I am using $_SERVER['REQUEST_URI'] to insert the value into the canonical link in my page head, like this: <link rel="canonical" href="http://www.mydomanain.com<?php echo $_SERVER['REQUEST_URI']; ?>" /> The problem is that when I view the page source, the URL is displayed with characters like ...CE%B3%CE%B3%CE%B5%CE%BB... but when I click on it, the link is displayed as it should be. Will this cause any penalty ...
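The percent-encoded form and the readable form are two spellings of the same path. This short Python sketch (not the asker's PHP) uses the path from the question to show the round trip. It only illustrates the encoding and says nothing about how search engines treat either form.

```python
from urllib.parse import quote, unquote

path = "/gr/τιτλος-σελιδας/20/"

# quote() encodes each non-ASCII character as its UTF-8 bytes in %XX form;
# "/" and "-" are left untouched by default.
encoded = quote(path)
decoded = unquote(encoded)

print(encoded)
print(decoded == path)
```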

Reading proper unicode characters into a ReadStream in node.js

Question: Sometimes strange things happen in the world of coding, and I have no explanation at all. :) A text file I have contains the following lines: en …π 1 1 en Œ® 1 1 en Œ© 1 1 en –° 1 1 en —† 1 1 en “§ 1 1 en ◊° 2 2 en ·∏§anƒ´f 1 1 en ·π_ 1 1 en ˝mage:whiteshark-tgoss1.jpg 4 4 en ˝stanbul 114 114 My code is as follows: var fileReadStream = fs.createReadStream(fileName, {encoding: 'utf8'}); fileReadStream.on('data', function(data){ //do something with the data }); When I look at the data element, ...
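The excerpt stops before stating the actual problem, so the following is only an observation about the sample lines. Several of them look like UTF-8 bytes displayed with the Mac OS Roman code page. Under that assumption, the Python sketch below (not Node.js) reverses the mis-decoding for a few of the samples. The helper name `undo_mac_roman_mojibake` is hypothetical.

```python
def undo_mac_roman_mojibake(text: str) -> str:
    """Re-encode text as Mac OS Roman bytes, then decode those bytes as UTF-8."""
    return text.encode("mac_roman").decode("utf-8")


# Samples copied from the question's file listing.
for garbled in ["Œ®", "Œ©", "–°"]:
    print(garbled, "->", undo_mac_roman_mojibake(garbled))
```

If this reading is right, the file itself is valid UTF-8 and only the tool used to view it chose the wrong encoding.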

Dealing with char values over 127 in C

Question: I'm quite new to C programming, and I have some problems trying to assign a value over 127 (0x7F) in a char array. In my program, I work with generic binary data, and I have no problem printing a previously acquired byte stream (e.g. with fopen or fgets, then processed with some bitwise operations) as %c or %d. But if I try to print a character from its numerical value like this: printf("%c\n", 128); it just prints FFFD (the replacement character). Here is another example: char abc[] = ...
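The replacement character is what a UTF-8 terminal shows for a byte that is not valid UTF-8 on its own. As a rough illustration in Python rather than C, the sketch below looks at the single byte 0x80 from the question in several ways. That the asker's terminal uses UTF-8 is an assumption.

```python
value = 128
raw = bytes([value])  # the single byte 0x80

# The same bit pattern read as a signed or an unsigned 8-bit integer.
as_signed = int.from_bytes(raw, "big", signed=True)
as_unsigned = int.from_bytes(raw, "big", signed=False)

# A lone 0x80 is a continuation byte, so a UTF-8 decoder substitutes U+FFFD.
as_utf8 = raw.decode("utf-8", errors="replace")

# Encoding code point 128 as UTF-8 takes two bytes instead of one.
utf8_bytes = chr(value).encode("utf-8")

print(as_signed, as_unsigned, repr(as_utf8), utf8_bytes)
```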

Python 3 itertools.islice continue despite UnicodeDecodeError

Question: I have a Python 3 program that monitors a log file. The log includes, among other things, chat messages written by users. The log is created by a third-party application which I cannot change. Today a user wrote "텋��텋��" and it caused the program to crash with the following error: future: <Task finished coro=<updateConsoleLog() done, defined at /usr/local/src/bserver/logmonitor.py:48> exception=UnicodeDecodeError('utf-8',... say "\xed\xa0\xbd\xed\xb1\x8c"\r\n', 7623, 7624, 'invalid ...
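The bytes in the error message are a surrogate pair written out as two three-byte sequences, which strict UTF-8 rejects. The sketch below shows two possible ways to keep reading, assuming the log is otherwise UTF-8. The helper `first_lines` and the sample log bytes are made up for illustration, with an in-memory buffer standing in for the real file.

```python
import io
from itertools import islice


def first_lines(raw: bytes, count: int) -> list:
    """Return up to count decoded lines, replacing undecodable bytes with U+FFFD."""
    stream = io.TextIOWrapper(io.BytesIO(raw), encoding="utf-8", errors="replace")
    return list(islice(stream, count))


# Hypothetical log content containing the byte sequence from the error message.
log = b'ok line\nsay "\xed\xa0\xbd\xed\xb1\x8c"\r\nlast line\n'
lines = first_lines(log, 3)
print(lines)

# Alternative: let the surrogates through, then join the pair via UTF-16.
pair = b"\xed\xa0\xbd\xed\xb1\x8c".decode("utf-8", errors="surrogatepass")
emoji = pair.encode("utf-16", "surrogatepass").decode("utf-16")
print(emoji)
```

With a real file, the same `errors="replace"` argument can be passed to `open()`. The second approach recovers the intended character but needs the extra UTF-16 round trip before the text is safe to print or re-encode.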

Trailing equal signs (=) in emails

Question: I download messages from a Gmail account using POP3 and save them in a SQLite database for further processing: mailbox = poplib.POP3_SSL('pop.gmail.com', '995') mailbox.user(user) mailbox.pass_(password) msgnum = mailbox.stat()[0] for i in range(msgnum): msg = '\n'.join(mailbox.retr(i+1)[1]) save_message(msg, dbmgr) mailbox.quit() However, looking in the database, all lines but the last one of the message body (payload) have trailing equal signs. Do you know why this happens? Answer 1: Frederic's ...
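A trailing equal sign at the end of a line is how the quoted-printable transfer encoding marks a soft line break. The answer is cut off here, so it is an assumption that this is the cause in the asker's case. The sketch below decodes a made-up body with the standard `quopri` module to show the effect.

```python
import quopri

# Hypothetical quoted-printable body: "=" before the newline is a soft line
# break, and "=C3=A9" is the UTF-8 encoding of "é".
encoded = (
    b"This line is longer than the limit, so the encoder wraps it with a tra=\n"
    b"iling equal sign. caf=C3=A9\n"
)

decoded = quopri.decodestring(encoded).decode("utf-8")
print(decoded)
```

For whole messages, parsing with the `email` package and reading the decoded payload avoids handling the transfer encoding by hand.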

Python urllib.request.urlopen: AttributeError: 'bytes' object has no attribute 'data'

Question: I am using Python 3 and trying to connect to dstk. I am getting an error with the urllib package. I researched a lot on SO and could not find anything similar to this problem. api_url = self.api_base+'/street2coordinates' api_body = json.dumps(addresses) #api_url=api_url.encode("utf-8") #api_body=api_body.encode("utf-8") print(type(api_url)) response_string = six.moves.urllib.request.urlopen(api_url, api_body).read() response = json.loads(response_string) If I do not encode the api_url and api ...
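In Python 3, `urllib.request` expects the URL as `str` and the request body as `bytes`. A URL that is not a `str` is treated as a ready-made `Request` object, which is a likely source of the AttributeError in the title. The sketch below builds the request without sending it. The host name and the address list are placeholders, not the real dstk endpoint or data.

```python
import json
import urllib.request

# Placeholder values; the question's real api_base and addresses are not shown.
api_url = "http://example.invalid/street2coordinates"
addresses = ["1600 Example Street"]

# Keep the URL as str; encode only the body.
api_body = json.dumps(addresses).encode("utf-8")

request = urllib.request.Request(
    api_url,
    data=api_body,
    headers={"Content-Type": "application/json"},
)

# urllib.request.urlopen(request).read() would send it; not called here.
print(request.get_method(), type(request.full_url).__name__, type(request.data).__name__)
```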

MySQL - select first 10 bytes of a string

Question: Hello wise men & women, How would you select the first x bytes of a string? The use case: I'm optimizing product description texts for upload to Amazon, and Amazon measures field lengths in bytes of utf8 (not latin1 as I stated earlier), not in characters. MySQL, on the other hand, seems to operate on characters (e.g., the function left() is character-based, not byte-based). The difference (using English, French, Spanish & German) is roughly 10%, but it can vary widely. Some tests ...
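The gap between character count and byte count is easy to see outside the database. The sketch below is in Python rather than SQL. It truncates a string to a byte budget in UTF-8 without cutting a multi-byte character in half. The helper name `truncate_utf8` is hypothetical, and this is one possible approach, not the answer given in the original thread.

```python
def truncate_utf8(text: str, max_bytes: int) -> str:
    """Return the longest prefix of text whose UTF-8 encoding fits in max_bytes."""
    # Slicing the bytes may cut a character in half; errors="ignore" drops
    # the incomplete trailing sequence instead of raising.
    return text.encode("utf-8")[:max_bytes].decode("utf-8", errors="ignore")


sample = "Größe"
print(len(sample), len(sample.encode("utf-8")))  # characters vs. bytes
print(truncate_utf8(sample, 5))
```

Here a five-byte budget yields only three characters, because `ö` takes two bytes and the first byte of `ß` alone is discarded.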