encoding | 易学教程

UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 47: ordinal not in range(128)

Question: I am using Python 2.7 and MySQLdb 1.2.3. I tried everything I found on Stack Overflow and other forums to handle the encoding errors my script is throwing. My script reads data from all tables in a source MySQL DB, writes them to a Python StringIO.StringIO object, and then loads that data from the StringIO object into a Postgres database using the psycopg2 library's copy_from command. (The Postgres database is apparently UTF-8 encoded; I found this under Properties > Definition of the database in pgAdmin.) I
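The error in the title arises when a character outside the 7-bit range, such as U+2019, passes through the ASCII codec. A minimal sketch of the failure and of an explicit UTF-8 encode, written in Python 3 syntax rather than the question's Python 2.7; the helper names `can_encode_ascii` and `to_utf8_buffer` are hypothetical, and this is not taken from the thread's answers:

```python
import io


def can_encode_ascii(text):
    """Return True if every character in text fits in 7-bit ASCII."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


def to_utf8_buffer(rows):
    """Write rows (lists of str) as tab-separated lines into a UTF-8 byte buffer."""
    buf = io.BytesIO()
    for row in rows:
        line = "\t".join(row) + "\n"
        # Encode explicitly so no implicit ASCII conversion can happen later.
        buf.write(line.encode("utf-8"))
    buf.seek(0)
    return buf
```

Encoding each line explicitly at the point where text becomes bytes keeps the choice of codec visible, which is usually where this class of error is easiest to diagnose.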

Encode non ascii characters in C# .NET

Question: I want to add a custom header to the emails my application is sending out. The header name can only contain ASCII characters, but users could potentially enter UTF-8 characters for the value, so I have to base64-encode them. I also have to decode them back to UTF-8 to show them to the user in the UI. What's the best way to do this?

Answer 1: To convert from a .NET string to base 64, using UTF-8 as the underlying encoding: string base64 = Convert.ToBase64String(Encoding.UTF8.GetBytes(text)); And
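For comparison, the same round trip (text to UTF-8 bytes to base64 and back) can be sketched in Python. This is an analogue of the C# call quoted above, not part of the original answer, and the function names are hypothetical:

```python
import base64


def encode_header_value(text):
    """Encode text as UTF-8, then as an ASCII-only base64 string."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def decode_header_value(encoded):
    """Reverse encode_header_value: base64 to UTF-8 bytes to text."""
    return base64.b64decode(encoded).decode("utf-8")
```

The output of `encode_header_value` contains only base64 alphabet characters, so it is safe wherever ASCII is required.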

Difference Between ASCIIEncoding and Encoding

Question: I understand that Encoding can be used to initialize an object that performs any type of encoding: ASCII, Unicode, UTF-8, etc. It appears to me that these are sufficient for performing any kind of encoding, so what is the need for ASCIIEncoding?

Answer 1: The Encoding class, in addition to being the base class of all encoders, provides static property accessors to the named subclasses. Encoding.ASCII returns an instance of ASCIIEncoding which, in turn, subclasses Encoding and passes the codepage

Detect what Unicode glyphs exist?

Question: Is there a way in JavaScript, CSS, or other web technology to detect whether the system has a valid glyph for a certain Unicode character? For example, I would like to detect whether a certain character in a language shows up as a square box because the user doesn't have a font that covers those Unicode code points, or whether they will actually see the characters.

Source: https://stackoverflow.com/questions/13639734/detect-what-unicode-glyphs-exist

IMAP folder path encoding (IMAP UTF-7) for Python

Question: I would like to know whether any "official" function or library exists in Python for IMAP4 UTF-7 folder path encoding. From imapInstance.list() I get the following IMAP UTF-7 encoded path: '(\\HasNoChildren) "." "[Mails].Test&AOk-"', If I encode the name myself with (u"[Mails].Testé").encode('utf-7') I get '[Mails].Test+AOk-', which is UTF-7 but not IMAP UTF-7: Test+AOk- instead of Test&AOk-. I need an official function or library to get the IMAP UTF-7 encoded version.

Answer 1: The
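The difference between the two outputs above comes from the modified UTF-7 rules in RFC 3501, section 5.1.3: "&" is the shift character instead of "+", "," replaces "/" in the base64 alphabet, padding is dropped, and a literal "&" is written as "&-". A hand-rolled sketch of the encoding direction only, with a hypothetical function name; it is not the function the truncated answer refers to, and it has not been checked against a real server:

```python
import base64


def imap_utf7_encode(name):
    """Encode a mailbox name using IMAP modified UTF-7 (RFC 3501, 5.1.3)."""
    out = bytearray()
    pending = []  # run of characters that must go through base64

    def flush():
        if pending:
            raw = "".join(pending).encode("utf-16-be")
            # Modified base64: no '=' padding, ',' instead of '/'.
            b64 = base64.b64encode(raw).rstrip(b"=").replace(b"/", b",")
            out.extend(b"&" + b64 + b"-")
            pending.clear()

    for ch in name:
        if 0x20 <= ord(ch) <= 0x7E:
            flush()
            # Printable ASCII stands for itself, except '&' which becomes '&-'.
            out.extend(b"&-" if ch == "&" else ch.encode("ascii"))
        else:
            pending.append(ch)
    flush()
    return bytes(out)
```

Called on the folder name from the question, `imap_utf7_encode(u"[Mails].Testé")` yields the `Test&AOk-` form that the server returned.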

Why is stringr changing encoding when manipulating strings?

Question: There is a strange behavior in stringr that is really annoying me. Without any warning, stringr changes the encoding of some strings that contain non-ASCII characters, in my case ø, å, æ, é and some others. If you str_trim a character vector, the elements with non-ASCII letters are converted to a new Encoding.

    letter1 <- readline('Gimme an ASCII character!') # try q or a
    letter2 <- readline('Gimme an non-ASCII character!') # try ø or é
    Letters <- c(letter1, letter2)
    Encoding(Letters) #

Why would I use a Unicode Signature Byte-Order-Mark (BOM)?

Question: Are these obsolete? They seem like the worst idea ever: embed something in the contents of your file that no one can see but that affects the file's functionality. I don't understand why I would want one.

Answer 1: They're necessary in some cases, yes, because there are both little-endian and big-endian implementations of UTF-16. When reading an unknown UTF-16 file, how can you tell which of the two is used? The only solution is to place some kind of easily identifiable marker in the file, which can
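The marker described in the answer can be inspected directly. A small Python sketch, assuming the data is already known to be UTF-16 (the UTF-32 little-endian BOM begins with the same two bytes, so this check alone cannot tell the two apart); the function name is hypothetical:

```python
import codecs


def detect_utf16_byte_order(data):
    """Return 'little', 'big', or None based on a leading UTF-16 BOM."""
    if data.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE
        return "little"
    if data.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF
        return "big"
    return None  # no BOM: byte order must be known some other way
```

Python's generic `"utf-16"` codec performs the same check when decoding and removes the BOM from the result, so `data.decode("utf-16")` works for either byte order when a BOM is present.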

Python 2 and 3 csv reader

Question: I'm trying to use the csv module to read a UTF-8 CSV file, and I'm having trouble writing generic code for Python 2 and 3 because of encoding. Here is the original code in Python 2.7:

    with open(filename, 'rb') as csvfile:
        csv_reader = csv.reader(csvfile, quotechar='\"')
        langs = next(csv_reader)[1:]
        for row in csv_reader:
            pass

But when I run it with Python 3, it doesn't like the fact that I open the file without "encoding". I tried this: with codecs.open(filename, 'r', encoding='utf-8') as
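One way to handle the difference is to branch on the interpreter version when opening the file: Python 3's csv module expects a text stream (opened with `newline=""`), while Python 2's expects bytes. A sketch under that assumption, with hypothetical helper names; it is not taken from the thread's answers, and in Python 2 the returned cells would still be UTF-8 byte strings that need decoding:

```python
import csv
import io
import sys


def open_csv(filename):
    """Open a UTF-8 CSV file in the mode the running csv module expects."""
    if sys.version_info[0] >= 3:
        # Python 3: csv wants decoded text and handles newlines itself.
        return io.open(filename, "r", encoding="utf-8", newline="")
    # Python 2: csv wants raw bytes.
    return io.open(filename, "rb")


def read_rows(filename):
    """Return all rows of the CSV file as a list of lists."""
    with open_csv(filename) as csvfile:
        return list(csv.reader(csvfile, quotechar='"'))
```

With this helper, the header handling from the question becomes `langs = read_rows(filename)[0][1:]`.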

Why should I use &amp; instead of &?

Question: Why should I use &amp; instead of & when writing HTML for my site? Where can I find a list of other symbols that I should be encoding? (The bar / too, right?) What problems could I have if I paste the symbol as-is into the HTML? The thing is, I have a few affiliate links and I'm worried that if I write them with a bare &, in some cases, because of a specific browser or device, the information in the URL won't be passed correctly. Hopefully someone can clear this up for me.
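When pages are generated by a program, the escaping can be left to a library instead of being typed by hand. A short Python sketch using the standard library's `html.escape`, which replaces `&`, `<`, `>` and, by default, both quote characters; the function name and the example URL are hypothetical:

```python
import html


def link_tag(url, label):
    """Build an <a> element with the URL and label escaped for HTML."""
    return '<a href="{}">{}</a>'.format(html.escape(url), html.escape(label))
```

For example, `link_tag("https://example.com/?a=1&b=2", "Shop")` writes the separator as `&amp;` in the markup, and a browser parsing the attribute turns it back into a plain `&` before following the link.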

Convert UTF16LE file to UTF8 in Python?

Question: I have a big file in UTF-16LE encoding (with BOM). Is it possible to convert it to ordinary UTF-8 with Python? Something like:

    file_old = open('old.txt', mode='r', encoding='utf-16-le')
    file_new = open('new.txt', mode='w', encoding='utf-8')
    text = file_old.read()
    file_new.write(text.encode('utf-8'))

http://docs.python.org/release/2.3/lib/node126.html (-- utf_16_le UTF-16LE)

This is not working. I can't understand the "TypeError: must be str, not bytes" error. (Python 3)

Answer 1: You should not be encoding it. Let the stdlib
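The TypeError arises because a file opened in text mode with `encoding='utf-8'` accepts `str` and performs the encoding itself, so passing it the bytes from `text.encode('utf-8')` fails. A sketch of the conversion that lets the file objects do the work, copying in chunks so a large file is never held in memory at once. It assumes the source starts with a BOM, as the question states, and uses the `"utf-16"` codec so the BOM is consumed and not copied into the output; the function name and chunk size are hypothetical:

```python
def convert_utf16_to_utf8(src_path, dst_path, chunk_size=1 << 16):
    """Re-encode a BOM-prefixed UTF-16 text file as UTF-8, chunk by chunk."""
    # newline="" disables newline translation so line endings are preserved.
    with open(src_path, "r", encoding="utf-16", newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        while True:
            chunk = src.read(chunk_size)  # str, already decoded
            if not chunk:
                break
            dst.write(chunk)  # the file object encodes to UTF-8
```

Usage would be `convert_utf16_to_utf8('old.txt', 'new.txt')`.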