utf-8

PHP file_exists with accent returns false

Submitted by 蓝咒 on 2019-12-30 18:06:59
Question: I have two folders, Folder and Folderé. The second one cannot be found by PHP. Here is my test:

    <?php
    $dir = 'D:\wamp\www\test\data\Folder';
    var_dump(file_exists($dir)); // true
    $dir = 'D:\wamp\www\test\data\Folderé';
    var_dump(file_exists($dir)); // false
    ?>

How can I fix it?

Answer 1: This works like a charm:

    <?php
    $dir = 'D:\wamp\www\test\data\Folderé';
    var_dump(file_exists(utf8_decode($dir)));

Source: https://stackoverflow.com/questions/19200750/php-file-exists-with-accent-returns-false
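
The fix works because the PHP source stores 'é' as UTF-8 bytes, while the Windows ANSI filesystem API used here expects a single-byte (Latin-1-style) name; utf8_decode() converts UTF-8 text to ISO-8859-1 bytes. A minimal Python sketch of that same byte-level conversion, using the folder name from the question:

```python
# What PHP's utf8_decode() does: turn UTF-8 text into ISO-8859-1 bytes.
name = 'Folderé'                      # in the PHP source this is the UTF-8 bytes b'Folder\xc3\xa9'
utf8_bytes = name.encode('utf-8')
latin1_bytes = name.encode('latin-1')  # single-byte form the ANSI API expects
print(utf8_bytes)    # b'Folder\xc3\xa9'
print(latin1_bytes)  # b'Folder\xe9'
```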

Python can not open UTF-8 encoded text file

Submitted by 我怕爱的太早我们不能终老 on 2019-12-30 14:16:14
Question: I have a .py script which contains the following code to open a specific text file (which was generated by Exchange PowerShell):

    import codecs

    with codecs.open("C:\\Temp\\myfile.txt", encoding="utf_8", mode="r", errors="replace") as myfile:
        content = myfile.readlines()  # here we convert the lines to a list
    print(content)

However, I also tried utf-16-be and utf-16-le (and standard ASCII, obviously), but the file output still looks like this (this is just part of it): ['��\r', '\x00\n', '\x00D\x00o\x00m\x00a\x00i\x00n\x00
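
The sample output (a two-byte replacement sequence '��' followed by '\x00'-interleaved ASCII) is the classic signature of a UTF-16-LE file with a byte-order mark being decoded as UTF-8. A sketch of the usual remedy, detecting the BOM and decoding accordingly; the byte string is hypothetical, shaped like the question's output:

```python
import codecs

raw = b'\xff\xfeD\x00o\x00m\x00a\x00i\x00n\x00'  # hypothetical start of the PowerShell file
if raw.startswith(codecs.BOM_UTF16_LE) or raw.startswith(codecs.BOM_UTF16_BE):
    text = raw.decode('utf-16')   # the 'utf-16' codec honors and strips the BOM
else:
    text = raw.decode('utf-8')
print(text)  # Domain
```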

R JSON UTF-8 parsing

Submitted by 社会主义新天地 on 2019-12-30 13:56:46
Question: I have an issue when trying to parse a JSON file in the Russian alphabet in R. The file looks like this:

    [{"text": "Валера!", "type": "status"},
     {"text": "когда выйдет", "type": "status"},
     {"text": "КАК ДЕЛА?!)", "type": "status"}]

and it is saved in UTF-8 encoding. I tried the libraries rjson, RJSONIO and jsonlite to parse it, but it doesn't work:

    library(jsonlite)
    allFiles <- fromJSON(txt="ru_json_example_short.txt")

gives me the error: Error in feed_push_parser(buf) : lexical error: invalid char in
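
A "lexical error: invalid char" from jsonlite's parser usually means it received bytes that are not valid UTF-8, e.g. because the file was read under the OS default, non-UTF-8 locale rather than explicitly as UTF-8. The same principle, sketched in Python rather than R: decode explicitly as UTF-8, then parse (the JSON text is taken from the question):

```python
import json

# Text assumed to have been read from ru_json_example_short.txt with encoding='utf-8'.
text = '[{"text": "Валера!", "type": "status"}, {"text": "когда выйдет", "type": "status"}]'
data = json.loads(text)
print(data[0]['text'])  # Валера!
```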

Call a program via shell_exec with utf-8 text input

Submitted by 馋奶兔 on 2019-12-30 12:16:11
Question: Prerequisites: hunspell and php5. Test code from bash:

    user@host ~/ $ echo 'sagadījās' | hunspell -d lv_LV,en_US
    Hunspell 1.2.14
    + sagadīties

works properly. Test code (test.php):

    $encoding = "lv_LV.utf-8";
    setlocale(LC_CTYPE, $encoding); // test
    putenv('LANG='.$encoding);      // and another test
    $raw_response = shell_exec("LANG=$encoding; echo 'sagadījās' | hunspell -d lv_LV,en_US");
    echo $raw_response;

returns

    Hunspell 1.2.14
    & sagad 5 0: tagad, sagad?ties, sagaudo, sagand?, sagar?o * *
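
The '?' characters in hunspell's output suggest the child process is not running under a UTF-8 locale, so multi-byte characters get mangled on the way in or out. A hedged Python sketch of the general technique: set an explicit UTF-8 LANG in the child's environment and decode its output as UTF-8. The locale name is the one from the question and may not exist on every system; `echo` stands in for the hunspell pipeline:

```python
import os
import subprocess

env = dict(os.environ, LANG='lv_LV.utf-8')   # locale name taken from the question
result = subprocess.run(
    ['echo', 'sagadījās'],                   # stand-in for the hunspell pipeline
    capture_output=True, env=env, encoding='utf-8',
)
print(result.stdout.strip())  # sagadījās
```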

Does C# have something like PHP's mb_convert_encoding()?

Submitted by 非 Y 不嫁゛ on 2019-12-30 11:55:22
Question: Is there a way in C# that I can convert Unicode strings into ASCII + HTML entities, and then back again? See, in PHP, I can do it like so:

    <?php
    // RUN ME AT COMMAND LINE
    $sUnicode = '<b>Jöhan Strauß</b>';
    echo "UNICODE: $sUnicode\n";
    $sASCII = mb_convert_encoding($sUnicode, 'HTML-ENTITIES', 'UTF-8');
    echo "ASCII: $sASCII\n";
    $sUnicode = mb_convert_encoding($sASCII, 'UTF-8', 'HTML-ENTITIES');
    echo "UNICODE (TRANSLATED BACK): $sUnicode\n";

Background: I need this to work in C# .NET 2 because we
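
For comparison, Python can round-trip the same way: the 'xmlcharrefreplace' error handler emits numeric character references for anything outside ASCII, and html.unescape() reverses them. A sketch using the name from the question:

```python
import html

unicode_str = 'Jöhan Strauß'
# Encode to pure ASCII, replacing non-ASCII characters with numeric entities.
ascii_str = unicode_str.encode('ascii', 'xmlcharrefreplace').decode('ascii')
print(ascii_str)  # J&#246;han Strau&#223;
# html.unescape() converts the entities back to Unicode.
roundtrip = html.unescape(ascii_str)
print(roundtrip == unicode_str)  # True
```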

Data too long for column error with national characters

Submitted by ╄→гoц情女王★ on 2019-12-30 11:09:44
Question: I have to port some DBs to a stand-alone MySQL (version 5.0.18) running on Windows 7 64-bit, and I hit a problem I am stuck with. If I try to insert any national/Unicode character into a varchar, I get the error:

    ERROR 1406 (22001): Data too long for column 'nam' at row 1

Here is an MCVE SQL script:

    SET NAMES utf8;
    DROP TABLE IF EXISTS `tab`;
    CREATE TABLE `tab` (`ix` INT default 0, `nam` VARCHAR(1024) default '') DEFAULT CHARSET=utf8;
    INSERT INTO `tab` VALUES (1,'motorček');
    INSERT INTO `tab` VALUES (2,
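
One thing worth ruling out with ERROR 1406 is a client/column character-set mismatch: a national character that is one character in the source text becomes several bytes once encoded, and if the server interprets those bytes under the wrong charset the effective length changes. A small Python illustration of the byte expansion for the question's test value:

```python
value = 'motorček'
print(len(value))                      # 8 characters
print(len(value.encode('utf-8')))      # 9 bytes -- 'č' needs two bytes in UTF-8
print(len(value.encode('utf-16-le')))  # 16 bytes -- two bytes per character here
```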

Java: Detect non-displayable chars for a given Character Encoding

Submitted by 浪尽此生 on 2019-12-30 10:53:31
Question: I'm currently working on an application to validate and parse CSV files. The CSV files have to be encoded in UTF-8, although sometimes we get files in a wrong encoding. The CSV files most likely contain special characters of the German alphabet (Ä, Ö, Ü, ß), as most of the texts within the CSV file are in German. For the validator part, I need to make sure the file is UTF-8 encoded. As long as there are no special characters present, there is most likely no problem with
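
The usual strict-validation approach (in Java, a CharsetDecoder configured with CodingErrorAction.REPORT) can be sketched in Python: attempt a strict UTF-8 decode and treat any decoding error as "not UTF-8". The German umlauts are exactly the bytes that expose a Latin-1 file:

```python
def looks_like_utf8(data: bytes) -> bool:
    """Strict check: True only if every byte sequence is valid UTF-8."""
    try:
        data.decode('utf-8')   # errors='strict' is the default
        return True
    except UnicodeDecodeError:
        return False

print(looks_like_utf8('ÄÖÜß'.encode('utf-8')))    # True
print(looks_like_utf8('ÄÖÜß'.encode('latin-1')))  # False: 0xC4 0xD6 ... is not valid UTF-8
```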