character-encoding | 易学教程

How do I fix charset problems in .gs script?

Question: I have a problem with charsets. I parsed a CSV file in google-app-engine and I'm posting it to a UiApp table. When I checked special characters such as áéíóú, they were not displayed correctly (a ?/square symbol appears instead). While setting up my code I also tried writing the imported string to a Google Docs document, and the result was the same. Any advice? I am looking for either a global charset definition for the code, or a string transformation that makes the characters appear the way I want (avoiding html &number
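As a rough illustration of how this kind of symptom can arise (sketched in Python, not Apps Script, and assuming the CSV was saved as Latin-1 but read as UTF-8, which the question does not confirm), decoding bytes with the wrong charset yields replacement characters, while decoding with the charset the file was actually written in recovers the text:

```python
# Assumption: the CSV bytes are Latin-1 (ISO-8859-1); the question does not say so.
csv_bytes = "áéíóú".encode("latin-1")

# Reading the bytes as UTF-8 fails; errors="replace" substitutes U+FFFD,
# which many fonts render as a question mark or a square.
wrong = csv_bytes.decode("utf-8", errors="replace")

# Decoding with the charset the file was written in restores the text.
right = csv_bytes.decode("latin-1")

print(repr(wrong))
print(right)
```

If the file really is in a legacy charset, the fix is usually to name that charset explicitly at the point where the bytes are turned into a string, rather than to transform the string afterwards.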

What can cause git to mess with character encoding?

Question: Edit: git does not mess with character encoding. I'm leaving this here to share knowledge and to help others avoid the same mistake. The context: My company uses an SVN repository. I'm using git-svn as a client to interact with this repository. All text files in the project are (and must be) encoded with the Windows default encoding (cp-....). I use git-extensions, and sometimes the command line, to drive git. What I did: During the last 3 days, I was working on a new feature, and I did a number

Remove all hex characters from string in Python

Question: Although there are similar questions, I can't seem to find a working solution for my case. I'm encountering some annoying hex chars in strings, e.g. '\xe2\x80\x9chttp://www.google.com\xe2\x80\x9d blah blah#%#@$^blah'. What I need is to remove these hex \xHH characters, and only them, in order to get the following result: 'http://www.google.com blah blah#%#@$^blah'. Decoding doesn't help: s.decode('utf8') # u'\u201chttp://www.google.com\u201d blah blah#%#@$^blah'. How can I achieve that? Answer 1:
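One possible approach, sketched here for Python 3 (the question's own snippet is Python 2), treats the \xHH sequences as UTF-8 bytes: decode them to text, then re-encode to ASCII while ignoring anything that does not fit. The helper name `strip_non_ascii` is made up for this example.

```python
def strip_non_ascii(raw: bytes) -> str:
    """Decode UTF-8 bytes and drop every character outside ASCII."""
    text = raw.decode("utf-8")
    # errors="ignore" silently discards characters ASCII cannot represent,
    # here the curly quotes U+201C and U+201D.
    return text.encode("ascii", errors="ignore").decode("ascii")


raw = b"\xe2\x80\x9chttp://www.google.com\xe2\x80\x9d blah blah#%#@$^blah"
print(strip_non_ascii(raw))  # http://www.google.com blah blah#%#@$^blah
```

Note that this drops all non-ASCII characters, including legitimate accented letters, so it only suits input where that loss is acceptable.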

Can the Encoding API decode a Stream/noncontinuous bytes?

Question: Usually we can get a string from a byte[] using something like var result = Encoding.UTF8.GetString(bytes); However, I have this problem: my input is an IEnumerable<byte[]> bytes (the implementation can be any structure of my choice). A character is not guaranteed to lie within a single byte[] (for example, a 2-byte UTF-8 character can have its first byte in bytes[1][length - 1] and its second byte in bytes[2][0]). Is there any way to decode them without merging or copying all the arrays together? UTF8 is main
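The usual technique for this is a stateful (incremental) decoder that remembers an incomplete byte sequence between calls. Below is a minimal sketch in Python rather than the question's C#, using the standard `codecs` incremental decoder; the function name `decode_chunks` and the sample chunks are invented for illustration. In .NET, the comparable stateful object is the `Decoder` returned by `Encoding.UTF8.GetDecoder()`.

```python
import codecs
from typing import Iterable, Iterator


def decode_chunks(chunks: Iterable[bytes], encoding: str = "utf-8") -> Iterator[str]:
    """Decode byte chunks one at a time without concatenating them."""
    decoder = codecs.getincrementaldecoder(encoding)()
    for chunk in chunks:
        # A trailing partial character is buffered inside the decoder
        # and completed by the next chunk.
        text = decoder.decode(chunk)
        if text:
            yield text
    # final=True raises UnicodeDecodeError if the input ends mid-character.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


# The two-byte character "é" (0xC3 0xA9) is split across the first two chunks.
chunks = [b"caf\xc3", b"\xa9 ol", b"\xc3\xa9"]
result = "".join(decode_chunks(chunks))
print(result)  # café olé
```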

Beautiful Soup default decode charset?

Question: I have a huge set of web pages with different encodings, and I am trying to parse them using Beautiful Soup. As I have noticed, BS detects the encoding using meta-charset or xml-encoding tags. But there are documents with no such tags, or with typos in the charset name, and BS fails on all of them. I suppose its default guess is utf-8, which is wrong. Luckily, all such pages (or nearly all of them) have the same encoding. Is there any way to set it as the default? I've also tried to grep charset and use iconv to
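One way to get a default encoding is to decode the bytes yourself before handing the text to the parser. The sketch below uses only the standard library and is not Beautiful Soup's own detection logic; `DEFAULT_ENCODING`, `pick_encoding`, and `decode_page` are hypothetical names, and `windows-1251` is a placeholder because the question does not say which encoding the pages share. Beautiful Soup's constructor also accepts a `from_encoding` argument for forcing an encoding on a given document.

```python
import codecs
import re

# Placeholder default; substitute the encoding the untagged pages actually use.
DEFAULT_ENCODING = "windows-1251"

# Matches charset=... in a <meta> tag or an XML-style declaration.
_CHARSET = re.compile(rb"charset\s*=\s*[\"']?\s*([A-Za-z0-9_.:-]+)", re.IGNORECASE)


def pick_encoding(raw: bytes, default: str = DEFAULT_ENCODING) -> str:
    """Return the declared charset if Python knows it, else the default."""
    match = _CHARSET.search(raw[:4096])
    if match:
        name = match.group(1).decode("ascii")
        try:
            return codecs.lookup(name).name
        except LookupError:
            pass  # misspelled or unknown charset name: fall through
    return default


def decode_page(raw: bytes, default: str = DEFAULT_ENCODING) -> str:
    """Decode a page, falling back to the default when no usable tag exists."""
    return raw.decode(pick_encoding(raw, default), errors="replace")


page_typo = b'<meta charset="uft-8"><p>\xef\xf0</p>'
print(pick_encoding(page_typo))  # windows-1251
```

The decoded string can then be passed to the parser directly, so the parser's own guess never comes into play.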

Strange results when converting from byte array to string

Question: I get strange results when converting a byte array to a string and then converting the string back to a byte array. Try this: byte[] b = new byte[1]; b[0] = 172; string s = Encoding.ASCII.GetString(b); byte[] b2 = Encoding.ASCII.GetBytes(s); MessageBox.Show(b2[0].ToString()); And the result for me is not 172, as I'd expect, but 63. Why does this happen? Answer 1: Why does it happen? Because ASCII only contains values up to 127. When faced with binary data which is invalid for the given encoding, Encoding
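The same lossy round trip can be reproduced in Python as a rough analogue of the C# snippet (the variable names are illustrative). Byte 172 is outside the ASCII range, so decoding substitutes a replacement character, and encoding that back to ASCII yields "?", whose code is 63. A single-byte encoding that maps all 256 byte values, such as Latin-1, round-trips without loss.

```python
original = bytes([172])

# 172 > 127, so it is not valid ASCII; errors="replace" substitutes U+FFFD.
text = original.decode("ascii", errors="replace")

# U+FFFD cannot be encoded as ASCII either, so it becomes "?" (code 63).
round_tripped = text.encode("ascii", errors="replace")
print(round_tripped[0])  # 63

# Latin-1 maps every byte value 0-255 to a character, so nothing is lost.
lossless = original.decode("latin-1").encode("latin-1")
print(lossless[0])  # 172
```

For arbitrary binary data, an encoding designed for bytes-as-text, such as Base64, is generally a safer choice than any character encoding.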