utf-8

PHP file_exists with accent returns false

Submitted by 蓝咒 on 2019-12-30 18:06:59
Question: I have two folders, Folder and Folderé. The second one cannot be found by PHP. Here is my test:

    <?php
    $dir = 'D:\wamp\www\test\data\Folder';
    var_dump(file_exists($dir)); // true
    $dir = 'D:\wamp\www\test\data\Folderé';
    var_dump(file_exists($dir)); // false
    ?>

How can I fix it?

Answer 1: This works like a charm:

    <?php
    $dir = 'D:\wamp\www\test\data\Folderé';
    var_dump(file_exists(utf8_decode($dir)));

Source: https://stackoverflow.com/questions/19200750/php-file-exists-with-accent-returns-false
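
The fix works because the PHP source stores 'é' as UTF-8 bytes, while the Windows ANSI filesystem API used here expects a single-byte (Latin-1-style) name; utf8_decode() converts UTF-8 text to ISO-8859-1 bytes. A minimal Python sketch of that same byte-level conversion, using the folder name from the question:

```python
# What PHP's utf8_decode() does: turn UTF-8 text into ISO-8859-1 bytes.
name = 'Folderé'                      # in the PHP source this is the UTF-8 bytes b'Folder\xc3\xa9'
utf8_bytes = name.encode('utf-8')
latin1_bytes = name.encode('latin-1')  # single-byte form the ANSI API expects
print(utf8_bytes)    # b'Folder\xc3\xa9'
print(latin1_bytes)  # b'Folder\xe9'
```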

Python can not open UTF-8 encoded text file

Submitted by 我怕爱的太早我们不能终老 on 2019-12-30 14:16:14
Question: I have a .py script which contains the following code to open a specific text file (which was generated by Exchange PowerShell):

    import codecs

    with codecs.open("C:\\Temp\\myfile.txt", encoding="utf_8", mode="r", errors="replace") as myfile:
        content = myfile.readlines()  # here we convert the lines to a list
    print(content)

However, I also tried utf-16-be and utf-16-le (and standard ASCII, obviously), but the file output still looks like this (this is just part of it): ['��\r', '\x00\n', '\x00D\x00o\x00m\x00a\x00i\x00n\x00
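
The sample output (a two-byte replacement sequence '��' followed by '\x00'-interleaved ASCII) is the classic signature of a UTF-16-LE file with a byte-order mark being decoded as UTF-8. A sketch of the usual remedy, detecting the BOM and decoding accordingly; the byte string is hypothetical, shaped like the question's output:

```python
import codecs

raw = b'\xff\xfeD\x00o\x00m\x00a\x00i\x00n\x00'  # hypothetical start of the PowerShell file
if raw.startswith(codecs.BOM_UTF16_LE) or raw.startswith(codecs.BOM_UTF16_BE):
    text = raw.decode('utf-16')   # the 'utf-16' codec honors and strips the BOM
else:
    text = raw.decode('utf-8')
print(text)  # Domain
```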

R JSON UTF-8 parsing

Submitted by 社会主义新天地 on 2019-12-30 13:56:46
Question: I have an issue when trying to parse a JSON file in the Russian alphabet in R. The file looks like this:

    [{"text": "Валера!", "type": "status"},
     {"text": "когда выйдет", "type": "status"},
     {"text": "КАК ДЕЛА?!)", "type": "status"}]

and it is saved in UTF-8 encoding. I tried the libraries rjson, RJSONIO and jsonlite to parse it, but it doesn't work:

    library(jsonlite)
    allFiles <- fromJSON(txt="ru_json_example_short.txt")

gives me the error: Error in feed_push_parser(buf) : lexical error: invalid char in
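
A "lexical error: invalid char" from jsonlite's parser usually means it received bytes that are not valid UTF-8, e.g. because the file was read under the OS default, non-UTF-8 locale rather than explicitly as UTF-8. The same principle, sketched in Python rather than R: decode explicitly as UTF-8, then parse (the JSON text is taken from the question):

```python
import json

# Text assumed to have been read from ru_json_example_short.txt with encoding='utf-8'.
text = '[{"text": "Валера!", "type": "status"}, {"text": "когда выйдет", "type": "status"}]'
data = json.loads(text)
print(data[0]['text'])  # Валера!
```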

Call a program via shell_exec with utf-8 text input

Submitted by 馋奶兔 on 2019-12-30 12:16:11
Question: Prerequisites: hunspell and php5. Test code from bash:

    user@host ~/ $ echo 'sagadījās' | hunspell -d lv_LV,en_US
    Hunspell 1.2.14
    + sagadīties

works properly. Test code (test.php):

    $encoding = "lv_LV.utf-8";
    setlocale(LC_CTYPE, $encoding); // test
    putenv('LANG='.$encoding);      // and another test
    $raw_response = shell_exec("LANG=$encoding; echo 'sagadījās' | hunspell -d lv_LV,en_US");
    echo $raw_response;

returns

    Hunspell 1.2.14
    & sagad 5 0: tagad, sagad?ties, sagaudo, sagand?, sagar?o * *
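
The '?' characters in hunspell's output suggest the child process is not running under a UTF-8 locale, so multi-byte characters get mangled on the way in or out. A hedged Python sketch of the general technique: set an explicit UTF-8 LANG in the child's environment and decode its output as UTF-8. The locale name is the one from the question and may not exist on every system; `echo` stands in for the hunspell pipeline:

```python
import os
import subprocess

env = dict(os.environ, LANG='lv_LV.utf-8')   # locale name taken from the question
result = subprocess.run(
    ['echo', 'sagadījās'],                   # stand-in for the hunspell pipeline
    capture_output=True, env=env, encoding='utf-8',
)
print(result.stdout.strip())  # sagadījās
```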

Does C# have something like PHP's mb_convert_encoding()?

Submitted by 非 Y 不嫁゛ on 2019-12-30 11:55:22
Question: Is there a way in C# that I can convert Unicode strings into ASCII + HTML entities, and then back again? See, in PHP, I can do it like so:

    <?php
    // RUN ME AT COMMAND LINE
    $sUnicode = '<b>Jöhan Strauß</b>';
    echo "UNICODE: $sUnicode\n";
    $sASCII = mb_convert_encoding($sUnicode, 'HTML-ENTITIES', 'UTF-8');
    echo "ASCII: $sASCII\n";
    $sUnicode = mb_convert_encoding($sASCII, 'UTF-8', 'HTML-ENTITIES');
    echo "UNICODE (TRANSLATED BACK): $sUnicode\n";

Background: I need this to work in C# .NET 2 because we
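
For comparison, Python can round-trip the same way: the 'xmlcharrefreplace' error handler emits numeric character references for anything outside ASCII, and html.unescape() reverses them. A sketch using the name from the question:

```python
import html

unicode_str = 'Jöhan Strauß'
# Encode to pure ASCII, replacing non-ASCII characters with numeric entities.
ascii_str = unicode_str.encode('ascii', 'xmlcharrefreplace').decode('ascii')
print(ascii_str)  # J&#246;han Strau&#223;
# html.unescape() converts the entities back to Unicode.
roundtrip = html.unescape(ascii_str)
print(roundtrip == unicode_str)  # True
```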

Data too long for column error with national characters

Submitted by ╄→гoц情女王★ on 2019-12-30 11:09:44
Question: I have to port some DBs to a stand-alone MySQL (version 5.0.18) running on Windows 7 64-bit, and I hit a problem I am stuck with. If I try to insert any national/Unicode character into a varchar, I get the error:

    ERROR 1406 (22001): Data too long for column 'nam' at row 1

Here is an MCVE SQL script:

    SET NAMES utf8;
    DROP TABLE IF EXISTS `tab`;
    CREATE TABLE `tab` (`ix` INT default 0, `nam` VARCHAR(1024) default '') DEFAULT CHARSET=utf8;
    INSERT INTO `tab` VALUES (1,'motorček');
    INSERT INTO `tab` VALUES (2,
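
One thing worth ruling out with ERROR 1406 is a client/column character-set mismatch: a national character that is one character in the source text becomes several bytes once encoded, and if the server interprets those bytes under the wrong charset the effective length changes. A small Python illustration of the byte expansion for the question's test value:

```python
value = 'motorček'
print(len(value))                      # 8 characters
print(len(value.encode('utf-8')))      # 9 bytes -- 'č' needs two bytes in UTF-8
print(len(value.encode('utf-16-le')))  # 16 bytes -- two bytes per character here
```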

Java: Detect non-displayable chars for a given Character Encoding

Submitted by 浪尽此生 on 2019-12-30 10:53:31
Question: I'm currently working on an application to validate and parse CSV files. The CSV files have to be encoded in UTF-8, although sometimes we get files in a wrong encoding. The CSV files most likely contain special characters of the German alphabet (Ä, Ö, Ü, ß), as most of the texts within the CSV file are in German. For the validator part, I need to make sure the file is UTF-8 encoded. As long as there are no special characters present, there is most likely no problem with
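
The usual strict-validation approach (in Java, a CharsetDecoder configured with CodingErrorAction.REPORT) can be sketched in Python: attempt a strict UTF-8 decode and treat any decoding error as "not UTF-8". The German umlauts are exactly the bytes that expose a Latin-1 file:

```python
def looks_like_utf8(data: bytes) -> bool:
    """Strict check: True only if every byte sequence is valid UTF-8."""
    try:
        data.decode('utf-8')   # errors='strict' is the default
        return True
    except UnicodeDecodeError:
        return False

print(looks_like_utf8('ÄÖÜß'.encode('utf-8')))    # True
print(looks_like_utf8('ÄÖÜß'.encode('latin-1')))  # False: 0xC4 0xD6 ... is not valid UTF-8
```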