encoding | 易学教程

Java unicode byte parsing

Question: I'm reading some data from a file as a stream of bytes, and I've encountered some Unicode strings that I'm not sure how best to handle. Each character uses two bytes, with only the first seeming to contain actual data. For example, the string 'trust' is stored in the file as: 0x74 0x00 (t) 0x72 0x00 (r) ... and so on. Normally I'd just use a regex to replace the zeros with nothing and therefore remove the whitespace. However, the spaces between words within the
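The byte pattern described above (a data byte followed by 0x00) is consistent with UTF-16 little-endian text, although the excerpt does not confirm the file's encoding. A minimal sketch, assuming that encoding and written in Python rather than the asker's Java; `decode_utf16le` is a hypothetical helper name:

```python
# Bytes for "trust" as described in the question: each character is
# followed by a 0x00 byte, which is what UTF-16 little-endian produces
# for ASCII-range characters.
raw = bytes([0x74, 0x00, 0x72, 0x00, 0x75, 0x00, 0x73, 0x00, 0x74, 0x00])


def decode_utf16le(data: bytes) -> str:
    """Decode two-byte-per-character data, assuming it is UTF-16LE."""
    return data.decode("utf-16-le")


text = decode_utf16le(raw)
print(text)
```

Decoding with the right charset keeps real spaces (0x20 0x00) intact, which stripping zero bytes with a regex cannot guarantee for characters outside the ASCII range. In Java the comparable call is `new String(bytes, StandardCharsets.UTF_16LE)`.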

PowerShell - Batch change files encoding To UTF-8

Question: I'm trying to do a dead simple thing: change the encoding of files from anything to UTF-8 without a BOM. I found several scripts that do this, and the only one that really worked for me is this one: https://superuser.com/questions/397890/convert-text-files-recursively-to-utf-8-in-powershell#answer-397915. It worked as expected, but I need the generated files without a BOM. So I tried to modify the script a little, adding the solution given to this question: Using PowerShell to write a file in UTF-8
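The goal above, rewriting a file as UTF-8 with no byte order mark, can be sketched as follows. This is an illustration in Python, not the PowerShell script the question links to; `convert_to_utf8_no_bom` and its `source_encoding` parameter are hypothetical, and the sketch assumes the source encoding is already known rather than detected.

```python
import tempfile
from pathlib import Path


def convert_to_utf8_no_bom(path: Path, source_encoding: str) -> None:
    """Rewrite one file in place as UTF-8 without a BOM."""
    # Read raw bytes and decode explicitly so line endings are left untouched.
    text = path.read_bytes().decode(source_encoding)
    # Drop a leading BOM character if the source decoding left one in place.
    if text.startswith("\ufeff"):
        text = text[1:]
    # Python's "utf-8" codec never emits a BOM ("utf-8-sig" would).
    path.write_bytes(text.encode("utf-8"))


# Demo on a throwaway file that starts out as UTF-16 with a BOM.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_bytes("caf\u00e9\r\n".encode("utf-16"))
    convert_to_utf8_no_bom(sample, "utf-16")
    result = sample.read_bytes()

print(result)
```

A batch version would loop over the files of interest and call the helper once per file.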

Character set that is not a superset of ASCII

Question: Is there a character set other than EBCDIC that is not a superset of 7-bit ASCII?

Answer 1: Yes. JIS X 0208 is not a superset of ASCII. Some versions of this standard include most of the ASCII characters, but not all of them. A related fact is that a file encoded with UTF-16 or UTF-32 is not byte-equivalent to an ASCII file of the same characters, but since those are not character sets, and since Unicode is certainly a superset of ASCII, they do not qualify as answers to your question.

Answer 2: There
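The byte-equivalence point in Answer 1 can be checked directly. A small sketch in Python; `cp037` is one EBCDIC code page shipped with Python's standard codecs and is chosen here only as an example:

```python
text = "ASCII"

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")       # identical to ASCII for these characters
utf16_bytes = text.encode("utf-16-le")  # two bytes per character
utf32_bytes = text.encode("utf-32-le")  # four bytes per character
ebcdic_bytes = text.encode("cp037")     # same length, different byte values

for name, data in [
    ("ascii", ascii_bytes),
    ("utf-8", utf8_bytes),
    ("utf-16-le", utf16_bytes),
    ("utf-32-le", utf32_bytes),
    ("cp037", ebcdic_bytes),
]:
    print(f"{name:10} {data.hex(' ')}")
```

Only the UTF-8 output matches the ASCII bytes. UTF-16 and UTF-32 encode the same characters with padding bytes, while the EBCDIC code page assigns different values to the letters.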

Handlebars triple-stash to avoid escaping html entities

Question: I use Handlebars, and if an escaped character such as ' is processed it is rendered on screen as ' . I know wrapping the variable in a triple-stash will prevent this. I processed the following string within a triple-stash as a quick test and it seemed fine: " <p>hello<p> wouldn't wouldn ' t". This rendered to screen exactly how I wanted it to. My question is: is it safe to simply wrap all variables in triple-stash, or will this have some unforeseen consequences I haven't considered? Thanks. Answer 1:
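Handlebars escapes values written with `{{ }}` and emits them unchanged with `{{{ }}}`. The difference can be approximated outside Handlebars. The sketch below uses Python's `html.escape` as a stand-in; it is not the Handlebars implementation, and `render_escaped` and `render_raw` are hypothetical names:

```python
import html


def render_escaped(value: str) -> str:
    """Roughly what an escaping template tag does: neutralise HTML metacharacters."""
    return html.escape(value, quote=True)


def render_raw(value: str) -> str:
    """Roughly what an unescaped tag does: emit the value unchanged."""
    return value


already_encoded = "wouldn&#39;t"                   # entity present in the data
untrusted_input = "<img src=x onerror=alert(1)>"   # markup supplied by a user

print(render_escaped(already_encoded))  # the ampersand is escaped a second time
print(render_raw(already_encoded))      # the browser sees the original entity
print(render_escaped(untrusted_input))  # shown as text
print(render_raw(untrusted_input))      # reaches the page as live markup
```

The last line is the risk with unescaped output everywhere: any value that can contain user-supplied markup is delivered to the browser as HTML.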

Why does HttpUtility.UrlEncode(HttpUtility.UrlDecode("%20")) return + instead of %20?

Question: I'm having a problem with a file download where the download is replacing all the spaces with underscores. Basically, the problem is here: Response.AddHeader("Content-Disposition", "attachment; filename=" + someFileName); If someFileName has a space in it, such as "check this out.txt", the user is prompted to download "check_this_out.txt". I figured the best option would be to UrlEncode the filename, so I tried HttpUtility.UrlEncode(someFileName); But it's
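The `+` versus `%20` difference in the title comes from two conventions for encoding a space: form-style encoding uses `+`, and plain percent-encoding uses `%20`. A minimal sketch of the two conventions using Python's `urllib.parse` as an analogue; it does not reproduce the .NET API:

```python
from urllib.parse import quote, quote_plus, unquote_plus

filename = "check this out.txt"

form_style = quote_plus(filename)  # spaces become "+"
percent_style = quote(filename)    # spaces become "%20"

# Round trip comparable to the one in the title: decode "%20", then
# re-encode with the form-style function.
round_trip = quote_plus(unquote_plus("%20"))

print(form_style)
print(percent_style)
print(round_trip)
```

Both forms decode back to a space in a form-aware decoder, but only the `%20` form is read as a space by consumers that treat `+` literally.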

Java App : Unable to read iso-8859-1 encoded file correctly

Question: I have a file which is encoded as ISO-8859-1 and contains characters such as ô. I am reading this file with Java code, something like:

    File in = new File("myfile.csv");
    InputStream fr = new FileInputStream(in);
    byte[] buffer = new byte[4096];
    while (true) {
        int byteCount = fr.read(buffer, 0, buffer.length);
        if (byteCount <= 0) {
            break;
        }
        String s = new String(buffer, 0, byteCount, "ISO-8859-1");
        System.out.println(s);
    }

However, the ô character is always garbled, usually printing as a ?. I
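The excerpt does not say where the `?` comes from. One way it can appear is on output rather than input: the bytes decode correctly, but the text is then written in a charset that has no ô. A sketch of that scenario in Python, offered as an assumption about the symptom and not a diagnosis of the asker's program:

```python
# "côté" as ISO-8859-1 bytes: ô is 0xF4 and é is 0xE9.
data = bytes([0x63, 0xF4, 0x74, 0xE9])

# Decoding with the file's real charset gives the expected text.
text = data.decode("iso-8859-1")

# Writing that text in a charset without these letters substitutes "?".
ascii_out = text.encode("ascii", errors="replace")

# Writing it as UTF-8 keeps the characters, using two bytes for each.
utf8_out = text.encode("utf-8")

print(text)
print(ascii_out)
print(utf8_out)
```

If the decoded string is correct in memory, the charset used by the console or output stream is the next thing to check.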

How to replace special characters with their equivalent (such as “á” for “a”) in C#?

Question: I need to get the Portuguese text content out of an Excel file and create an XML file that is going to be used by an application that doesn't support characters such as "ç", "á", "é", and others. I can't just remove the characters; I have to replace them with their equivalents ("c", "a", "e", for example). I assume there's a better way to do it than checking each character individually and replacing it with its counterpart. Any suggestions on how to do it? Answer 1: You could try something like var
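A common way to do this without a per-character lookup table is Unicode normalization: decompose each accented letter into a base letter plus combining marks, then drop the marks. The answer in the excerpt is cut off, so this is not necessarily what it proposed. A minimal sketch in Python rather than C#, with `strip_diacritics` as a hypothetical helper name:

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Replace accented letters with their base letters via NFD decomposition."""
    # NFD splits "ç" into "c" followed by a combining cedilla, and so on.
    decomposed = unicodedata.normalize("NFD", text)
    # combining() returns a non-zero class for combining marks; drop those.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_diacritics("ação é fácil"))
```

Letters that have no canonical decomposition, such as "ø" or "ß", pass through unchanged and would still need explicit mapping.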

UTF-8 encoding in Volley Requests

Question: In my Android app I am loading JSON data with a Volley JsonArrayRequest. I created the data myself and saved it with Sublime using UTF-8 encoding. When I get the response and fill my ListView, the texts are not displayed correctly (umlauts). This is what my request looks like: JsonArrayRequest request = new JsonArrayRequest(targetUrl, new Response.Listener<JSONArray>() { @Override public void onResponse(final JSONArray response) { try { fillList(response); } catch (JSONException e)
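Broken umlauts of this kind typically mean that UTF-8 bytes were decoded with a different charset somewhere between server and display. The excerpt does not identify where that happens in the asker's app, so the sketch below only reproduces the symptom in Python, assuming an ISO-8859-1 fallback on the receiving side:

```python
original = "Müller"

# What a server sends when the file is saved as UTF-8.
wire_bytes = original.encode("utf-8")

# What a client shows if it assumes ISO-8859-1 (assumed fallback charset).
misread = wire_bytes.decode("iso-8859-1")

# What the client shows when it decodes with the charset actually used.
correct = wire_bytes.decode("utf-8")

# A misread string can be repaired by reversing the wrong decoding step.
repaired = misread.encode("iso-8859-1").decode("utf-8")

print(misread)
print(correct)
print(repaired)
```

Declaring the charset in the response's Content-Type header, or decoding the raw response bytes as UTF-8 on the client, avoids the misread in the first place.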

How do you URL encode parameters in Erlang?

Question: I'm using httpc:request to post some data to a remote service. I have the post working, but the data in the body() of the post comes through as is, without any URL-encoding, which causes the post to fail when parsed by the remote service. Is there a function in Erlang that is similar to CGI.escape in Ruby for this purpose? Answer 1: You can find here the YAWS url_encode and url_decode routines. They are fairly straightforward, although comments indicate the encode is not 100% complete for all
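What the remote service expects here is a form-encoded body, where reserved characters inside each value are percent-encoded before the pairs are joined. A sketch of that transformation using Python's `urllib.parse`, standing in for the Erlang routines mentioned above; the parameter names and values are made up for illustration:

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical form fields containing characters that must be escaped.
params = {"name": "Joe & Ann", "q": "a=b c"}

# urlencode escapes each key and value, then joins the pairs with "&".
body = urlencode(params)

# Decoding on the receiving side recovers the original values.
decoded = parse_qs(body)

print(body)
print(decoded)
```

Without the escaping step, the `&` and `=` inside the values would be read as field separators by the receiving parser.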