character-encoding | 易学教程

Alternative to mb_convert_encoding with HTML-ENTITIES charset

Question: I have the following code: mb_convert_encoding($string, 'HTML-ENTITIES', 'utf-8'); I need an alternative that does exactly the same thing but does not use any mb_* functions (the mbstring extension is not available in some environments). I thought that utf8_decode(htmlentities($string, ENT_COMPAT, 'utf-8')); would do exactly the same, but unfortunately it does not.

Answer 1: I played around a bit and found this very interesting. It seems like the second part also runs "htmlspecialchars". Must
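For comparison outside PHP, the underlying idea (leave ASCII untouched, turn every other character into an HTML character reference) can be sketched in Python. This is only an analogue, not a drop-in replacement: the helper name to_html_entities is hypothetical, and it emits numeric references only, whereas PHP's HTML-ENTITIES conversion may emit named entities for some characters.

```python
def to_html_entities(text: str) -> str:
    """Replace every non-ASCII character with a numeric character reference.

    Hypothetical helper for illustration; ASCII markup such as < and > is
    deliberately left unescaped.
    """
    # 'xmlcharrefreplace' substitutes &#NNN; for each character that the
    # ASCII codec cannot encode.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(to_html_entities("café <b>"))  # caf&#233; <b>
```

Note that the angle brackets pass through unchanged, which is the behaviour the htmlentities-based attempt does not share.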

SQL Server: set character set (not collation)

Question: How does one set the default character set for fields when creating tables in SQL Server? In MySQL one does this: CREATE TABLE tableName ( name VARCHAR(128) CHARACTER SET utf8 ) DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci; Note that I set the character set twice here. This is redundant; I included both ways just to demonstrate. I also set the collation to show that collation is something different. I am not asking about setting the collation. Most questions asking about

Inno Setup - Convert array of string to Unicode and back to ANSI

Question: I'm loading a Korean CP51949 (EUC-KR) encoded ANSI file into an array of strings (LoadStringsFromFile). Neither my system nor the intended end-user systems have CP51949 set as the legacy non-Unicode encoding. At the moment I have two problems with this: (1) Unless I run the application with Locale Emulator (which is just annoying, since the setup itself is in English only), the Korean text is displayed as gibberish. (2) Pos gives wrong results and StringChange fails completely unless I switch to String

Converting Unicode to Windows-1252 for vCards

Question: I am trying to write a program in C# that will split a vCard (VCF) file with multiple contacts into individual files, one per contact. I understand that the vCard needs to be saved as ANSI (1252) for most mobile phones to read it. However, if I open a VCF file using StreamReader and then write it back with StreamWriter (setting 1252 as the encoding), all special characters like å, æ and ø are written as ?. Surely ANSI (1252) supports these characters. How do I fix this
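One plausible cause of the ? characters (an assumption, since the excerpt does not show how the file is read) is that the input is decoded with the wrong encoding, so the damage happens before anything is written. The sketch below reproduces that effect in Python with made-up sample data; Windows-1252 itself encodes å, æ and ø without trouble.

```python
# Sample bytes as they would appear in a file saved as Windows-1252.
original = "å æ ø".encode("cp1252")  # b'\xe5 \xe6 \xf8'

# Wrong: treat the bytes as UTF-8 on the way in. Each invalid byte becomes
# U+FFFD, which Windows-1252 cannot represent, so it is written as '?'.
wrong = (
    original.decode("utf-8", errors="replace")
    .encode("cp1252", errors="replace")
)

# Right: decode with the encoding the file was actually saved in.
right = original.decode("cp1252").encode("cp1252")

print(wrong)  # b'? ? ?'
print(right == original)  # True
```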

Fixing encodings

Question: I have ended up with messed-up character encodings in one of our MySQL columns. Typically I have √© instead of é, √∂ instead of ö, √≠ instead of í, and so on. I'm fairly certain that someone here knows what happened and how to fix it. UPDATE: Based on bobince's answer, and since I had this data in a file, I did the following: #!/usr/bin/env python import codecs f = codecs.open('./file.csv', 'r', 'utf-8') f2 = codecs.open('./file-fixed.csv', 'w', 'utf-8') for line in f: f2.write(line.encode(
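The three symptoms listed are consistent with UTF-8 bytes having been read as Mac OS Roman: é is C3 A9 in UTF-8, and those two bytes are √ and © in Mac Roman. Assuming that is what happened here, a minimal Python 3 sketch of the reversal looks like this (the function name fix_mojibake is hypothetical, and this is not necessarily the fix the truncated script above applies):

```python
def fix_mojibake(text: str) -> str:
    """Undo 'UTF-8 bytes decoded as Mac Roman' damage.

    Assumes every character in the text came from that one wrong decode;
    already-correct text containing non-ASCII characters may raise or change.
    """
    # Re-encode to recover the original bytes, then decode them as UTF-8.
    return text.encode("mac_roman").decode("utf-8")


print(fix_mojibake("√© √∂ √≠"))  # é ö í
```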

Json_encode Charset problem

Question: When I use json_encode to encode my multilingual strings, it also changes special characters. What should I do to keep them the same? For example, <? echo json_encode(array('şüğçö')); returns something like ["\u015f\u00fc\u011f\u00e7\u00f6"], but I want ["şüğçö"].

Answer 1: Try this: <? echo json_encode(array('şüğçö'), JSON_UNESCAPED_UNICODE);

Answer 2: In JSON, any character in a string may be represented by a Unicode escape sequence. Thus "\u015f\u00fc\u011f\u00e7\u00f6" is semantically equal to "şüğçö".
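The point made in Answer 2 can be checked in any JSON library. As a small Python sketch (Python's ensure_ascii=False plays a role comparable to PHP's JSON_UNESCAPED_UNICODE; the variable names are illustrative):

```python
import json

words = ["şüğçö"]

escaped = json.dumps(words)                      # non-ASCII as \uXXXX escapes
literal = json.dumps(words, ensure_ascii=False)  # characters kept as-is

print(escaped)
print(literal)

# Both documents decode to exactly the same data.
print(json.loads(escaped) == json.loads(literal))  # True
```

The escaped form is safe to send over channels that are not 8-bit clean; the literal form is smaller and easier to read.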

Why does Sass prepend an incorrect @charset rule?

Question: I use sass --watch scss:css to have Sass automatically create CSS files (and put them in the /css directory) for each SCSS file (from my /scss directory). In my SCSS file I have this: .foo::before { content: "▶"; } When I test the web page in the browser, that "play" character is not displayed; instead I see a bunch of weird letters with carons and other accents. I inspected the generated CSS file and noticed this in the first line: @charset "CP852"; I then manually changed that to this:
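The symptom described, accented Central European letters in place of ▶, is what one would expect if UTF-8 bytes are interpreted as CP852. A short Python sketch of that mismatch, assuming the stylesheet is stored as UTF-8 (variable names are illustrative):

```python
css_text = '.foo::before { content: "▶"; }'

# The bytes on disk, assuming the file is saved as UTF-8.
utf8_bytes = css_text.encode("utf-8")

# What a consumer sees if it trusts an @charset "CP852" declaration.
misread = utf8_bytes.decode("cp852")
print(misread)

# The bytes themselves are intact: only the declared charset is wrong.
print(misread.encode("cp852").decode("utf-8") == css_text)  # True
```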

What is “ANSI as UTF-8” and how can I make fputcsv() generate UTF-8 w/BOM?

Question: I made a PHP script that generates CSV files that were previously generated by another process. The CSV files then have to be imported by yet another process. Importing the old CSV files works fine, but when importing the new CSV files there are issues with special characters. When I open the old CSVs with Notepad++, it says the encoding is UTF-8, and when I open the new CSVs, it says their encoding is 'ANSI as UTF-8'. What's the difference between the two? And how can I make
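Assuming the importer expects the UTF-8 byte order mark (the three bytes EF BB BF at the start of the file), the idea of writing it ahead of the CSV data can be sketched in Python, where the utf-8-sig codec emits the BOM automatically. The sample row and the in-memory buffer are illustrative stand-ins for a real file.

```python
import csv
import io

# An in-memory stand-in for a file opened in binary mode.
buffer = io.BytesIO()

# 'utf-8-sig' writes the BOM once, before the first encoded character.
# newline="" lets the csv module control line endings itself.
text_stream = io.TextIOWrapper(buffer, encoding="utf-8-sig", newline="")
csv.writer(text_stream).writerow(["name", "Zoë"])
text_stream.flush()

data = buffer.getvalue()
print(data[:3])  # b'\xef\xbb\xbf'
```

In a language without such a codec, the equivalent step is to write those three bytes to the file handle before the first row.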

How do I replace accents (German) in .NET

Question: I need to replace accented characters in a string with their English equivalents, for example ä = ae, ö = oe, Ö = Oe, ü = ue. I know how to strip them from a string, but I don't know how to replace them. Please let me know if you have any suggestions. I am coding in C#.

Answer 1: If you need to use this on larger strings, multiple calls to Replace() can get inefficient pretty quickly. You may be better off rebuilding your string character by character: var map = new Dictionary<char, string>() { { 'ä', "ae" }, { 'ö',
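The single-pass, map-based approach from Answer 1 can be sketched in Python as follows. The entries for ä, ö, Ö and ü come from the question; Ä, Ü and ß are added here as assumptions, and the names UMLAUT_MAP and replace_umlauts are hypothetical.

```python
# One translation table, applied in a single pass over the string.
UMLAUT_MAP = str.maketrans({
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ö": "Oe",
    # Assumed additions, not listed in the question:
    "Ä": "Ae", "Ü": "Ue", "ß": "ss",
})


def replace_umlauts(text: str) -> str:
    """Return text with German umlauts replaced by ASCII digraphs."""
    return text.translate(UMLAUT_MAP)


print(replace_umlauts("Öl für Käse"))  # Oel fuer Kaese
```

Like the dictionary in the C# answer, the table is built once and reused, so the cost per string is one pass regardless of how many replacements are defined.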