character-encoding | 易学教程

Why do I have to use set_charset("utf8") even though everything is UTF-8 encoded? (MySQLi-PHP)

Question: My table's collation is utf8_general_ci. My pages are encoded as UTF-8 (without BOM), and the http-equiv meta tag on each page sets the character set to UTF-8. My data contains Turkish characters. When I output them, they are not displayed correctly, but after calling $db->set_charset("utf8"); they are. Why do I have to call $db->set_charset("utf8"); even though everything is UTF-8 encoded?

Answer 1: The data is stored as UTF-8 in MySQL, but PHP's client connection character set is not UTF-8. Which is why you
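The effect the answer describes can be imitated outside MySQL. The sketch below is only an illustration in Python, not PHP or MySQLi code. It assumes the connection defaults to latin1 (an assumption; the excerpt does not say which character set the connection used) and that characters with no equivalent in the connection character set are replaced by `?` during conversion. The helper name `send_over_connection` is hypothetical.

```python
def send_over_connection(text: str, connection_charset: str) -> bytes:
    """Imitate a server converting stored text to the connection charset.

    Hypothetical helper: characters the target charset cannot represent
    are replaced with '?'.
    """
    return text.encode(connection_charset, errors="replace")


stored = "ığş"  # Turkish letters that have no latin1 code point

as_latin1 = send_over_connection(stored, "latin-1")
as_utf8 = send_over_connection(stored, "utf-8")

print(as_latin1)                # the letters are lost during conversion
print(as_utf8.decode("utf-8"))  # the letters survive the round trip
```

Under these assumptions, storing and rendering in UTF-8 is not enough: the connection in between also has to carry UTF-8, which is what set_charset("utf8") requests.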

Adding encoded chars to the url breaks htaccess

Question: Here's my code:

    RewriteEngine on
    RewriteRule page/(.*) index.php?url=$1 [NC]

When I access page/http://google.com/ it works just fine. When I access page/http%3A%2F%2Fgoogle.com%2F the server reports a 404. Martti Laine

Answer 1: Apache returns a (somewhat non-intuitive) 404 when the request contains encoded slashes and AllowEncodedSlashes is not set to On. To confirm that this is the case, check your error log, which likely contains an entry like this: found %2f (encoded '/') in URI
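To see where the encoded slashes come from, the following minimal Python sketch percent-encodes the same target URL used in the question. It only illustrates the encoding step on the client side; it does not model Apache's behaviour.

```python
from urllib.parse import quote, unquote

target = "http://google.com/"

# safe="" forces every reserved character, including '/', to be escaped.
encoded = quote(target, safe="")
print(encoded)

# The escaped slash is the sequence the Apache log entry refers to.
print("%2F" in encoded)

# Decoding restores the original value.
print(unquote(encoded) == target)
```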

File encoded as UCS-2 Little Endian reports 2x too many lines to Java

Question: I was processing several txt files with a simple Java program, and the first step of my process is counting the lines of each file:

    int count = 0;
    br = new BufferedReader(new FileReader(myFile)); // myFile is the txt file in question
    while (br.readLine() != null) {
        count++;
    }

For one of my files, Java was counting exactly twice as many lines as there really were! This confused me greatly at first. I opened each file in Notepad++ and could see that the mis-counting file ended every line
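The excerpt is cut off before a cause is named, so the following is only one plausible mechanism, sketched in Python rather than Java. It assumes the file uses CR LF line endings and is read with a single-byte charset (FileReader uses the platform default). In UCS-2 LE every character is followed by a zero byte, so the CR and LF are no longer adjacent and a reader that accepts either one as a line break sees two breaks per line. The sample text and the helper `count_line_breaks` are made up for illustration.

```python
text = "one\r\ntwo\r\nthree\r\n"   # three lines with CR LF endings
raw = text.encode("utf-16-le")     # UCS-2 LE bytes for this text, no BOM


def count_line_breaks(decoded: str) -> int:
    """Count line breaks, treating CR, LF, or an adjacent CR LF pair
    as a single break."""
    normalized = decoded.replace("\r\n", "\n").replace("\r", "\n")
    return normalized.count("\n")


right = count_line_breaks(raw.decode("utf-16-le"))  # correct charset
wrong = count_line_breaks(raw.decode("latin-1"))    # single-byte charset

print(right, wrong)
```

If this is what happened, passing the charset explicitly when opening the file would avoid the doubled count.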

TextEncoder and TextDecoder not perfect inverses of each other

Question: I received this answer to my previous question about encoding strings. My hope in asking that question was to get a reversible way of converting between a string and its representation as an array of bytes, as in Python 3. I ran into a problem with one particular Uint8Array, though:

    var encoder = new TextEncoder();
    var decoder = new TextDecoder(encoder.encoding);
    var s = [248, 35, 45, 41, 178, 175, 190, 62, 134, 39];
    var t = Array.from(decoder.decode(encoder.encode(Uint8Array(s)));

I
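Since the question compares against Python 3, here is a short Python sketch of why a UTF-8 round trip cannot be reversible for arbitrary bytes. It uses the byte values from the question. Treating TextDecoder as equivalent to lenient UTF-8 decoding with replacement characters is an approximation made for illustration.

```python
data = bytes([248, 35, 45, 41, 178, 175, 190, 62, 134, 39])

# Lenient UTF-8 decoding substitutes U+FFFD for invalid bytes (248 is
# never valid in UTF-8), so the original values cannot be recovered.
lossy = data.decode("utf-8", errors="replace").encode("utf-8")

# Latin-1 maps each byte 0-255 to the code point with the same number,
# so this round trip is reversible for any byte string.
lossless = data.decode("latin-1").encode("latin-1")

print(lossy == data)     # False
print(lossless == data)  # True
```

The design point is that UTF-8 is a text encoding, not a byte container: only byte sequences that are valid UTF-8 survive decode followed by encode.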

Decoding Ebcdic

Question: I'm being passed data that is EBCDIC encoded. Something like: s = u'@@@@@@@@@@@@@@@@@@@ÂÖÉâÅ@ÉÄ' Attempting to .decode('cp500') is wrong, but what's the correct approach? If I copy the string into something like Notepad++ I can convert it from EBCDIC to ASCII, but I can't seem to find a viable approach in Python to achieve the same. For what it's worth, the correct result is: BOISE ID (plus or minus space padding). The information is being retrieved from a file of lines of JSON objects. That
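One approach that appears to fit the sample, assuming the EBCDIC bytes were at some point mis-decoded as Latin-1 so that each character's code point equals the original byte value: encode back to bytes with Latin-1, then decode those bytes as cp500. This is a sketch under that assumption, not a general recipe for every EBCDIC source.

```python
s = u'@@@@@@@@@@@@@@@@@@@ÂÖÉâÅ@ÉÄ'

# Step 1: recover the raw bytes. Latin-1 maps code points 0-255
# one-to-one onto byte values, so nothing is altered here.
raw = s.encode("latin-1")

# Step 2: interpret those bytes as EBCDIC (code page 500).
decoded = raw.decode("cp500")

print(repr(decoded))
```

In cp500 the byte 0x40 (shown as `@`) is a space, which accounts for the padding mentioned in the question.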

Strange gcc error: stray '\NNN' in program

Question: The following issue popped up in my open source library, and I can't figure out what's going on. Two of my users have (gcc) compiler errors that look like:

    /home/someone/Source/src/._regex.cpp:1:1: warning: null character(s) ignored
    /home/someone/Source/src/._regex.cpp:1: error: stray ‘\5’ in program
    /home/someone/Source/src/._regex.cpp:1: error: stray ‘\26’ in program
    /home/someone/Source/src/._regex.cpp:1: error: stray ‘\7’ in program
    /home/someone/Source/src/._regex.cpp:1:5: warning: null
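The excerpt ends before any answer, so the following is an assumption based only on the file name in the log: files whose names start with `._` are typically AppleDouble metadata files that macOS creates next to real files, and they contain binary data rather than source code. If a build script collects sources by wildcard, a filter like this hypothetical Python sketch would skip them. The file list is made up for illustration.

```python
from pathlib import PurePosixPath


def is_appledouble(path: str) -> bool:
    """Return True for names that look like AppleDouble metadata files."""
    return PurePosixPath(path).name.startswith("._")


# Hypothetical result of a wildcard search for *.cpp files.
candidates = ["src/regex.cpp", "src/._regex.cpp", "src/main.cpp"]

sources = [p for p in candidates if not is_appledouble(p)]
print(sources)
```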

Using UTF-8 characters in Java variable names

Question: I would like to know whether I can use characters (or strings) from my native language as Java variable names. I tested this with Myanmar Unicode as follows:

    public static void main(final String[] args) {
        String ဆဆဆ = "မောင်မောင်";
        System.out.println("ကောင်းသောနေ.ပါ " + ဆဆဆ);
    }

This code prints the expected message, 'ကောင်းသောနေ.ပါ မောင်မောင်'. But the code below, which uses another variable name (also a string in my native language):

    public static void main(final String[] args) {
        String တက်စတင်း =

UTF-8 encode URLs

Question: Info: I have a program that generates XML sitemaps for Google Webmaster Tools (among other things). GWT is giving me errors for some sitemaps because the URLs contain character sequences like ã¾, ã‹, ã€, etc. GWT says: We require your Sitemap file to be UTF-8 encoded (you can generally do this when you save the file). As with all XML files, any data values (including URLs) must use entity escape codes for the characters: &, ', ", <, >. The special characters are escaped in the XML
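The two requirements quoted above (URLs encoded from UTF-8, plus entity escapes for the five special characters) can be sketched in Python as follows. The excerpt does not say which language the sitemap generator uses, and the helper name `sitemap_loc` and the example.com URL are hypothetical.

```python
from urllib.parse import quote
from xml.sax.saxutils import escape


def sitemap_loc(url: str) -> str:
    """Prepare a URL for a sitemap <loc> element (hypothetical helper)."""
    # Percent-encode non-ASCII characters as UTF-8 bytes while keeping
    # the URL delimiters and any existing percent escapes intact.
    encoded = quote(url, safe=":/?&=#%")
    # Replace &, <, > and both quote characters with XML entities.
    return escape(encoded, {"'": "&apos;", '"': "&quot;"})


loc = sitemap_loc("http://example.com/ま?a=1&b=2")
print(loc)
```

The order matters in this sketch: percent-encoding first leaves only ASCII, and the entity escaping then touches only the characters XML reserves.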

ASP.NET MVC: how to set encoding when returning FileResult

Question: In my controller, I have the following to send HTML snippets stored in CSHTML files to the front end:

    public FileResult htmlSnippet(string fileName)
    {
        string contentType = "text/html";
        return new FilePathResult(fileName, contentType);
    }

The fileName looks like the following: /file/abc.cshtml What troubles me now is that these HTML snippet files contain Spanish characters, and they don't look right when they are displayed in pages. Thanks and regards.

Answer 1: First ensure that your file is UTF-8 encoded: