PHP SimpleXML asXML writes ANSI encoded file

人走茶凉 posted on 2020-01-15 02:37:46
Question: I am trying to write some content into an XML file, but I have problems with special characters. The content I'd like to write is submitted to the script via $_GET, so I assume it is properly decoded into UTF-8 content.

    $write = $_GET['content'];

It will be fed like this:

    file.php?content=s%F6per

In the PHP I do the following:

    $xml = simplexml_load_file('file.xml');
    $newentry = $xml->addChild('element', $write);
    $xml->asXML($xml_filename);

The XML file that is opened is UTF-8 encoded. When I
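The question is about PHP, but the percent-encoded value itself can be inspected in any language. Here is a minimal Python sketch, assuming the `%F6` in the query string is a single-byte ISO-8859-1 (Latin-1) encoding of "ö" (the helper name `is_valid_utf8` is made up for illustration). It suggests the submitted bytes are not UTF-8 in the first place, which would have to be fixed before the value is handed to an XML writer that expects UTF-8.

```python
from urllib.parse import quote, unquote_to_bytes

# Percent-decode the query value into raw bytes without guessing an encoding.
raw = unquote_to_bytes("s%F6per")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Assumption: the sender used ISO-8859-1, where 0xF6 is the letter o-umlaut.
as_text = raw.decode("latin-1")

# What the same value looks like when it is percent-encoded as UTF-8.
utf8_query = quote(as_text.encode("utf-8"))

print(raw, is_valid_utf8(raw), as_text, utf8_query)
```

If the assumption holds, the byte `0xF6` needs to be transcoded from Latin-1 to UTF-8 (two bytes, `%C3%B6`) before it is written into a UTF-8 document.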

Python: Write Unicode to CSV using UnicodeWriter

微笑、不失礼 posted on 2020-01-14 20:44:04
Question: The Python documentation has the following code example for writing Unicode to a CSV file. I think it mentions that this is the way to do it, since the csv module can't handle Unicode strings.

    class UnicodeWriter:
        """
        A CSV writer which will write rows to CSV file "f",
        which is encoded in the given encoding.
        """

        def __init__(self, f, dialect=csv.excel, encoding="utf-8", **kwds):
            # Redirect output to a queue
            self.queue = cStringIO.StringIO()
            self.writer = csv.writer(self.queue, dialect=dialect, **kwds)
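The `UnicodeWriter` recipe above targets Python 2 (`cStringIO`). For comparison, here is a minimal sketch of the same task in Python 3, where the `csv` module works on text and the encoding is chosen when the file is opened. The helper names `write_rows` and `read_rows` and the sample rows are made up for illustration.

```python
import csv
import os
import tempfile


def write_rows(path, rows, encoding="utf-8"):
    """Write rows of text to a CSV file in the given encoding."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", encoding=encoding, newline="") as f:
        csv.writer(f, dialect=csv.excel).writerows(rows)


def read_rows(path, encoding="utf-8"):
    """Read the rows back as lists of str."""
    with open(path, "r", encoding=encoding, newline="") as f:
        return list(csv.reader(f, dialect=csv.excel))


rows = [["name", "city"], ["José", "Zürich"], ["回答", "東京"]]
path = os.path.join(tempfile.mkdtemp(), "out.csv")
write_rows(path, rows)
round_tripped = read_rows(path)
print(round_tripped == rows)
```

No intermediate queue or re-encoding step is needed here because the file object does the encoding.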

Python: Write Unicode to CSV using UnicodeWriter

血红的双手。 posted on 2020-01-14 20:42:46
Question: The Python documentation has the following code example for writing Unicode to a CSV file. I think it mentions that this is the way to do it, since the csv module can't handle Unicode strings.

    class UnicodeWriter:
        """
        A CSV writer which will write rows to CSV file "f",
        which is encoded in the given encoding.
        """

        def __init__(self, f, dialect=csv.excel, encoding="utf-8", **kwds):
            # Redirect output to a queue
            self.queue = cStringIO.StringIO()
            self.writer = csv.writer(self.queue, dialect=dialect, **kwds)

Python: Write Unicode to CSV using UnicodeWriter

左心房为你撑大大i posted on 2020-01-14 20:42:15
Question: The Python documentation has the following code example for writing Unicode to a CSV file. I think it mentions that this is the way to do it, since the csv module can't handle Unicode strings.

    class UnicodeWriter:
        """
        A CSV writer which will write rows to CSV file "f",
        which is encoded in the given encoding.
        """

        def __init__(self, f, dialect=csv.excel, encoding="utf-8", **kwds):
            # Redirect output to a queue
            self.queue = cStringIO.StringIO()
            self.writer = csv.writer(self.queue, dialect=dialect, **kwds)

How to exclude U+2028 from line separators in Python when reading file?

一曲冷凌霜 posted on 2020-01-14 18:45:25
Question: I have a file in UTF-8, where some lines contain the U+2028 Line Separator character (http://www.fileformat.info/info/unicode/char/2028/index.htm). I don't want it to be treated as a line break when I read lines from the file. Is there a way to exclude it from the separators when I iterate over the file or use readlines()? (Besides reading the entire file into a string and then splitting by \n.) Thank you!

Answer 1: I can't duplicate this behaviour in Python 2.5, 2.6 or 3.0 on Mac OS X - U+2028 is
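To make the distinction concrete, here is a small Python 3 sketch (the sample text is made up). As far as the standard library behaves today, iterating over a text stream only breaks on `\n`, `\r` and `\r\n`, whereas `str.splitlines()` also treats U+2028 as a line boundary. Passing `newline="\n"` restricts the stream to `\n` only.

```python
import io

# One logical line containing U+2028, followed by a second line.
text = "first\u2028still first\nsecond\n"
data = text.encode("utf-8")

# Iterating a text stream: with newline="\n" only "\n" ends a line,
# so U+2028 stays inside the first line.
with io.TextIOWrapper(io.BytesIO(data), encoding="utf-8", newline="\n") as stream:
    file_lines = [line.rstrip("\n") for line in stream]

# str.splitlines() uses a wider set of boundaries that includes U+2028.
split_lines = text.splitlines()

print(file_lines)
print(split_lines)
```

The same `newline="\n"` argument can be passed to the built-in `open()` when reading from a real file.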

Converting a YAML response w/ binary data to UTF-8 in Ruby 1.8.7

白昼怎懂夜的黑 posted on 2020-01-14 14:43:52
Question: I'm pulling a response from an API and receiving: response: job: unit_count: "1" slug: Answers lc_tgt: ja body_tgt: !binary | 5Zue562U lc_src: en body_src: Answers job_id: "1948888" opstat: ok

That body_tgt value should be a couple of Japanese characters (回答), but they are being converted for safe shipping. I'm on 1.8.7, so I can't use force_encoding. Is there a way to unpack() them?

Answer 1: That appears to be a YAML document, not JSON, using YAML's binary data language (which in turn uses base64
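The `!binary` value can be checked by hand. This is a minimal Python sketch (the question itself concerns Ruby 1.8.7, where `String#unpack` with the `"m"` directive plays the same role), assuming the payload is base64 over UTF-8 bytes, as the answer suggests:

```python
import base64

# The scalar that follows "body_tgt: !binary |" in the response.
encoded = "5Zue562U"

# Step 1: undo the base64 transport encoding to get the raw bytes.
raw = base64.b64decode(encoded)

# Step 2: interpret those bytes as UTF-8 (an assumption about the API).
body_tgt = raw.decode("utf-8")

print(raw, body_tgt)
```

The six decoded bytes are two three-byte UTF-8 sequences, which matches the two characters the asker expected.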

Converting a YAML response w/ binary data to UTF-8 in Ruby 1.8.7

时间秒杀一切 posted on 2020-01-14 14:43:12
Question: I'm pulling a response from an API and receiving: response: job: unit_count: "1" slug: Answers lc_tgt: ja body_tgt: !binary | 5Zue562U lc_src: en body_src: Answers job_id: "1948888" opstat: ok

That body_tgt value should be a couple of Japanese characters (回答), but they are being converted for safe shipping. I'm on 1.8.7, so I can't use force_encoding. Is there a way to unpack() them?

Answer 1: That appears to be a YAML document, not JSON, using YAML's binary data language (which in turn uses base64

PHP and character encoding problem with Â character

我与影子孤独终老i posted on 2020-01-14 10:46:27
Question: I'm having a problem where PHP (5.2) cannot find the character 'Â' in a string, though it is clearly there. I realize the underlying problem has to do with character encoding, but unfortunately I have no control over the source content. I receive it as UTF-8, with those characters already in the string. I would simply like to remove it from the string. strpos(), str_replace(), preg_replace(), trim(), etc. cannot correctly identify it. My string is this: "Â Â Â A lot of couples throughout the
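One common cause of a stray "Â" is a UTF-8 non-breaking space (bytes `C2 A0`) being displayed as if it were Latin-1. The excerpt does not confirm that this is what happened here, so the Python sketch below is only an illustration under that assumption; `strip_nbsp` and the sample string are hypothetical.

```python
# A non-breaking space (U+00A0) encoded as UTF-8 is the two bytes C2 A0.
nbsp_utf8 = "\u00a0".encode("utf-8")

# Viewed as Latin-1, those two bytes appear as "Â" followed by U+00A0.
mis_decoded = nbsp_utf8.decode("latin-1")


def strip_nbsp(text: str) -> str:
    """Replace non-breaking spaces with plain spaces and trim (hypothetical helper)."""
    return text.replace("\u00a0", " ").strip()


# Assumed shape of the real data: leading non-breaking spaces, no actual "Â".
sample = "\u00a0\u00a0\u00a0A lot of couples"
cleaned = strip_nbsp(sample)

print(repr(mis_decoded), repr(cleaned))
```

Under this assumption, searching for the literal character "Â" fails because the string never contains it. The target to search for and remove would be U+00A0.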

Pandas convert object column to str - column contains unicode, float etc

佐手、 posted on 2020-01-14 10:46:08
Question: I have a pandas data frame where a column's type shows as object, but when I try to convert it to string,

    df['column'] = df['column'].astype('str')

a UnicodeEncodeError is thrown:

    *** UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-5: ordinal not in range(128)

My next approach was to handle the encoding part:

    df['column'] = filtered_df['column'].apply(lambda x: x.encode('utf-8').strip())

But that gives the following error:

    *** AttributeError: 'float' object has no attribute 'encode'
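The second error arises because the column mixes text with floats, and only text has an `encode` method. One possible approach, sketched here in plain Python 3 without pandas, is a per-value converter that branches on type. The name `to_text` and the choice to map missing values to an empty string are assumptions for illustration, not something the question specifies.

```python
import math


def to_text(value, encoding="utf-8"):
    """Convert one cell of a mixed-type column to stripped text (hypothetical helper)."""
    # Treat None and NaN as missing; mapping them to "" is an arbitrary choice.
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return ""
    # Raw bytes are decoded rather than passed through str(), which would give "b'...'".
    if isinstance(value, bytes):
        return value.decode(encoding).strip()
    # Text, floats, ints and anything else go through str().
    return str(value).strip()


mixed = ["caf\u00e9 ", b"na\xc3\xafve", 3.5, float("nan"), None]
converted = [to_text(v) for v in mixed]
print(converted)
```

With pandas, a function like this could be passed to `Series.apply`, for example `df['column'].apply(to_text)`.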

Cannot insert character '≤' in SQL Server 2008

烂漫一生 posted on 2020-01-14 10:26:07
Question: I have a SQL Server 2008 database and an nvarchar(256) field in a table. The crazy problem is that when I run this query:

    update ruds_values_short_text set value = '≤ asjdklasd' where rud_id=12202 and field_code='detection_limit'

and then

    select * from ruds_values_short_text where rud_id=12202 and field_code='detection_limit'

I get this result:

    12202 detection_limit = asjdklasd 11

You can see that the character ≤ has been transformed into =. It's an encoding-related problem, for sure; in fact, if