encoding | 易学教程

Python read from file and remove non-ascii characters

Question: I have the following program that reads a file word by word and writes each word to another file, but without the non-ASCII characters from the first file.

    import unicodedata
    import codecs

    infile = codecs.open('d.txt','r',encoding='utf-8',errors='ignore')
    outfile = codecs.open('d_parsed.txt','w',encoding='utf-8',errors='ignore')
    for line in infile.readlines():
        for word in line.split():
            outfile.write(word+" ")
        outfile.write("\n")
    infile.close()
    outfile.close()

The only problem that I am
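The excerpt above is cut off, so the asker's exact problem is not shown. One thing worth noting is that `errors='ignore'` on a UTF-8 codec only skips bytes that cannot be decoded or encoded; valid non-ASCII characters pass through unchanged. The following is a minimal sketch of one way to drop them, assuming the goal is pure 7-bit ASCII output. The helper names `strip_non_ascii` and `copy_ascii_only` are hypothetical and do not come from the question.

```python
def strip_non_ascii(text: str) -> str:
    """Drop every character outside the 7-bit ASCII range."""
    # Encoding to ASCII with errors="ignore" silently discards anything
    # that has no ASCII representation; decoding turns it back into str.
    return text.encode("ascii", errors="ignore").decode("ascii")


def copy_ascii_only(src_path: str, dst_path: str) -> None:
    """Copy src_path to dst_path word by word, keeping only ASCII characters."""
    with open(src_path, "r", encoding="utf-8", errors="ignore") as infile, \
         open(dst_path, "w", encoding="ascii") as outfile:
        for line in infile:
            words = (strip_non_ascii(word) for word in line.split())
            # Skip words that became empty after stripping.
            outfile.write(" ".join(w for w in words if w) + "\n")
```

Calling `copy_ascii_only('d.txt', 'd_parsed.txt')` would mirror the file names used in the question. Characters are removed rather than transliterated, so "café" becomes "caf".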

Redirect binary data from Process.StandardOutput causes corrupted data

Question: On top of this problem, I have another. I am trying to get binary data from an external process, but the data (an image) seems to be corrupted. The screenshot below shows the corruption: The left image was produced by running the program on the command line, the right one from code. My code so far:

    var process = new Process
    {
        StartInfo =
        {
            Arguments = string.Format(@"-display"),
            FileName = configuration.PathToExternalSift,
            RedirectStandardError = true,
            RedirectStandardInput = true,
            RedirectStandardOutput =
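The C# code above is truncated, so the following is not a fix for it. It is a small Python sketch of the general pitfall: binary output from a child process is damaged as soon as it passes through a text decoder. The child command here is hypothetical and simply emits all 256 byte values. In .NET the analogous approach would be to read raw bytes from `process.StandardOutput.BaseStream` instead of using the text reader.

```python
import subprocess
import sys

# Hypothetical child process: writes every byte value 0..255 to stdout.
CHILD_CODE = "import sys; sys.stdout.buffer.write(bytes(range(256)))"


def capture_binary_stdout() -> bytes:
    """Run the child and return its stdout as raw, undecoded bytes."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        stdout=subprocess.PIPE,  # no text=True or encoding=..., so bytes come back
        check=True,
    )
    return result.stdout


raw = capture_binary_stdout()

# Decoding arbitrary binary data as text and encoding it again is lossy:
# bytes that are not valid UTF-8 are replaced, so the result differs.
round_tripped = raw.decode("utf-8", errors="replace").encode("utf-8")
```

Here `raw` holds the child's output byte for byte, while `round_tripped` no longer matches it, which is the kind of corruption a text-mode read can introduce into an image.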

Python 3.4 hex to Japanese Characters

Question: I am currently writing a script to pull information off my site, which contains Japanese characters. So far my script pulls the data off the site. It is returned as a string: "\xe5\xb9\xb4\xe3\x81\xab\xe4\xb8\x80\xe5\xba\xa6\xe3\x81\xae\xe6\x99\xb4\xe3\x82\x8c\xe5\xa7\xbf" Using an online hex-to-text tool, I get: 年に一度の晴れ姿 I know this phrase is correct, but my question is how do I convert it in Python? When I run something like: name = "\xe5\xb9\xb4\xe3\x81\xab\xe4\xb8\x80\xe5
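In Python 3 a `str` literal such as the one above holds one code point per `\x` escape, so the UTF-8 bytes have effectively been read as Latin-1. A minimal sketch of one way to recover the text, assuming that is how the string came to exist: encode it back to bytes with Latin-1 (which maps code points 0 to 255 onto the same byte values) and decode those bytes as UTF-8. The helper name `fix_utf8_mojibake` is hypothetical.

```python
def fix_utf8_mojibake(text: str) -> str:
    """Reinterpret a str whose code points are really UTF-8 byte values."""
    # latin-1 maps code points 0..255 one-to-one onto bytes, so this step
    # recovers the original byte sequence without loss.
    raw_bytes = text.encode("latin-1")
    return raw_bytes.decode("utf-8")


# The string from the question.
name = "\xe5\xb9\xb4\xe3\x81\xab\xe4\xb8\x80\xe5\xba\xa6\xe3\x81\xae\xe6\x99\xb4\xe3\x82\x8c\xe5\xa7\xbf"
decoded = fix_utf8_mojibake(name)  # 年に一度の晴れ姿
```

If the data is available as `bytes` in the first place (for example the raw body of an HTTP response), calling `.decode("utf-8")` on it directly avoids the detour.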

Saving a Binary tree to a file [closed]

Question: Closed. This question is off-topic. It is not currently accepting answers. Closed 6 years ago. I have a non-balanced (not binary-search) binary tree. I need to encode (and later decode) it to a text file. How can I do it efficiently? I found this link, which talks about a similar (same) problem, but it is obvious for me

Answer 1: Please look at this on LeetCode. I like this solution because it's relatively
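The linked LeetCode solution is not reproduced in the excerpt, so the sketch below is not that solution. It shows one common technique for this task: a preorder traversal that writes a marker for every missing child, which is enough to rebuild an arbitrary (non-search, unbalanced) binary tree. The `Node` class, the `#` marker, and the assumption that values are strings without whitespace are all choices made for this illustration.

```python
from typing import Iterator, List, Optional

NULL_MARKER = "#"  # assumed never to occur as a node value


class Node:
    def __init__(self, value: str,
                 left: "Optional[Node]" = None,
                 right: "Optional[Node]" = None) -> None:
        self.value = value
        self.left = left
        self.right = right


def serialize(root: Optional[Node]) -> str:
    """Preorder traversal; every missing child is written as NULL_MARKER."""
    tokens: List[str] = []
    stack: List[Optional[Node]] = [root]
    while stack:
        node = stack.pop()
        if node is None:
            tokens.append(NULL_MARKER)
        else:
            tokens.append(node.value)
            # Push right first so the left subtree is emitted first.
            stack.append(node.right)
            stack.append(node.left)
    return " ".join(tokens)


def deserialize(text: str) -> Optional[Node]:
    """Rebuild the tree by consuming tokens in the same preorder."""
    tokens: Iterator[str] = iter(text.split())

    def build() -> Optional[Node]:
        token = next(tokens)
        if token == NULL_MARKER:
            return None
        node = Node(token)
        node.left = build()
        node.right = build()
        return node

    return build()
```

The output is a single line of text, so it can be written to and read from a file as is. Each node costs one token plus one marker per missing child. The recursive `deserialize` may hit Python's recursion limit on very deep trees.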

Encoding of AVMetadataItem

Question: I have an AVMetadataItem which has fields encoded in CP1251 (Cyrillic). After reading item.stringValue I get garbage: an incorrectly decoded string. I've tried converting that string to raw UTF-8 and then creating a new string using the CP1251 encoding, with no luck; the result is nil. I also tried taking item.dataValue, but it contains raw list data (starting with bplist...). Any ideas are much appreciated. Thanks in advance.

Answer 1: Swift 2.0 solution: let origTitleMeta: NSData = (<AVMetadataItem>

Foreign characters and LDAP. What encoding/charset does LDAP expect?

Question: I am parsing XML with simplexml_load_string() and using the data within it to update Active Directory (AD) objects via LDAP. Example XML (simplified):

    <?xml version="1.0" encoding="UTF-8"?>
    <users>
      <user>Bìlbö Bággįnš</user>
      <user>Gãńdåłf Thê Gręât</user>
      <user>Śām Wīšë</user>
    </users>

I first run an ldap_search() to find a single user and then proceed to change their attributes. Pumping the above values straight into AD, using LDAP, will result in some pretty mangled characters showing

PHP: Shorter/obscured encoding for a URL embedded in another URL?

Question: I'm writing myself a script which basically lets me send a URL and two integer dimensions in the query string of a single GET request. I'm using base64 to encode it, but it's pretty damn long and I'm concerned the URL may get too big. Does anyone know an alternative, shorter method of doing this? It needs to be decodable when received in a GET request, so md5/sha1 are not possible. Thanks for your time. Edit: Sorry, I should have explained better: OK, on our site we display screenshots of
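The question concerns PHP and its excerpt is cut off, so the following Python sketch only illustrates one possible direction: pack the two dimensions as binary integers, compress the payload when that actually makes it smaller, and use URL-safe base64 without padding. The function names, the one-byte flag, and the assumption that both dimensions fit in 16 bits are choices made for this example. The result is reversible but only obscured, not encrypted.

```python
import base64
import struct
import zlib
from typing import Tuple


def pack_request(url: str, width: int, height: int) -> str:
    """Encode a URL plus two dimensions (0..65535 each) as a URL-safe token."""
    payload = struct.pack(">HH", width, height) + url.encode("utf-8")
    compressed = zlib.compress(payload, 9)
    # Short inputs often grow under compression, so keep whichever form
    # is smaller and record the choice in a leading flag byte.
    if len(compressed) < len(payload):
        body = b"\x01" + compressed
    else:
        body = b"\x00" + payload
    return base64.urlsafe_b64encode(body).rstrip(b"=").decode("ascii")


def unpack_request(token: str) -> Tuple[str, int, int]:
    """Reverse pack_request and return (url, width, height)."""
    padded = token + "=" * (-len(token) % 4)  # restore stripped padding
    body = base64.urlsafe_b64decode(padded)
    payload = zlib.decompress(body[1:]) if body[0] == 1 else body[1:]
    width, height = struct.unpack(">HH", payload[:4])
    return payload[4:].decode("utf-8"), width, height
```

Base64 always expands its input by about a third, so the savings here come only from the binary integers and, for longer or repetitive URLs, from compression. When even that is too long, a server-side lookup table keyed by a short ID is the usual alternative.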

Assigning Arabic text to R variables

Question: R doesn't display Arabic text correctly. I get very weird output when I use Arabic. Here's a screenshot: The problem is that I want to create a wordcloud with Arabic text and I need to solve this problem first. R version: R 2.15.2 GUI 1.53 Leopard build 64-bit (6335). Here is more info:

    > options("encoding")
    $encoding
    [1] "native.enc"
    > Encoding("الله")
    [1] "unknown"

sessionInfo():

    > sessionInfo()
    R version 2.15.2 (2012-10-26)
    Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
    locale:
    [1] C/C

Incorrect encoding after redirecting `dir` output to a file

Question: I run this code in Windows cmd.exe in Europe and I use the local settings for my language, so I use diacritics in the names of directories. When I list the directory names, they are displayed correctly. Then I save them into a file, but when I open it in Notepad, the diacritics are not readable: for example, instead of Střední Čechy I have Stýednˇ ¬echy . What did I do wrong and how can I correct it?

    @echo off
    del directories.conf
    FOR /F "delims=!" %%R IN ('dir * /b /a:d /o:n') DO (
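The garbled sample in the question is consistent with text written in the console's OEM code page 852 and then read by Notepad as Windows-1250. That pairing is an assumption inferred from the sample, not something the truncated question confirms. The Python sketch below reproduces the garbling and reverses it.

```python
original = "Střední Čechy"

# Assumed write side: cmd.exe emits the name in OEM code page 852.
oem_bytes = original.encode("cp852")

# Assumed read side: Notepad interprets the same bytes as Windows-1250.
garbled = oem_bytes.decode("cp1250")

# Reversing the two steps recovers the original text.
repaired = garbled.encode("cp1250").decode("cp852")
```

Here `garbled` matches the "Stýednˇ ¬echy" reported in the question, and `repaired` equals the original name. On the Windows side, a commonly suggested remedy is to switch the console code page with `chcp` before redirecting the output, so that the file is written in the encoding the editor expects.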