utf-8 | 易学教程

How to change the preferred encoding in Sublime Text 3 for MacOS

Question: I want to change the preferred encoding from US-ASCII to UTF-8 in Sublime Text 3 on Yosemite. The preferred encoding in bash is set to UTF-8, so when Python is run in the terminal: import locale; print(locale.getpreferredencoding()) the output is: UTF-8. When the same code is run in Sublime Text, the output is US-ASCII. Setting "encoding": "UTF-8" or "env": {"PYTHONIOENCODING": "utf-8"} in the build system for Python 3 has not helped. How can the setting be changed permanently so that I don…
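To compare the two environments, it may help to print everything that influences the encoding from inside both the terminal and the editor's build system. The sketch below is only a diagnostic, not a fix; the helper name `describe_encodings` is hypothetical, and the idea that a GUI-launched editor does not inherit the shell's locale variables is an assumption, since the excerpt is cut off before any answer.

```python
import locale
import os
import sys


def describe_encodings():
    """Collect the encoding-related settings visible to this Python process."""
    return {
        # Encoding Python normally derives from the locale (LANG, LC_ALL, LC_CTYPE).
        "preferred": locale.getpreferredencoding(),
        # Encoding of standard output; can be None if stdout was replaced.
        "stdout": getattr(sys.stdout, "encoding", None),
        # Variables as inherited from the parent process, if they are set at all.
        "LANG": os.environ.get("LANG"),
        "LC_ALL": os.environ.get("LC_ALL"),
        "PYTHONIOENCODING": os.environ.get("PYTHONIOENCODING"),
    }


if __name__ == "__main__":
    for name, value in describe_encodings().items():
        print(f"{name}: {value!r}")
```

Running the same script in both places and comparing the output shows which variable differs between the shell and the build system.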

PHP include html page charset problem

Question: After querying a MySQL database using the code below, I have generated an HTML file: $myFile = "page.htm"; $fh = fopen($myFile, 'w') or die("can't open file"); fwrite($fh, $row['text']); fclose($fh); In the MySQL database the text uses the utf8_general_ci collation. I need to include it in a PHP web page as follows: <?include('page.htm');?> bearing in mind that the PHP page declares the utf8 charset in its header: <meta http-equiv="content-type" content="text/html; charset=utf8" /> Now if I write on the…

How to parse UTF-8 characters in Excel files using POI

Question: I have been using POI to parse XLS and XLSX files successfully. However, I am unable to correctly extract special characters, such as UTF-8 encoded Chinese or Japanese characters, from an Excel spreadsheet. I have figured out how to extract data from a UTF-8 encoded CSV or tab-delimited file, but have had no luck with the Excel file. Can anyone help? (Edit: code snippet from comments) HSSFSheet sheet = workbook.getSheet(worksheet); HSSFEvaluationWorkbook ewb = HSSFEvaluationWorkbook.create…

MySQL decode Unicode to UTF-8 function

Question: I want to decode Unicode escape sequences to UTF-8 when inserting into a table. Here's what I have: ('\u0645\u064e\u062b\u0652\u0646\u064e\u0649 \u00a0\u062c \u0645\u064e\u062b\u064e\u0627\u0646\u064d') I want these values to be converted to UTF-8, for example: INSERT INTO `nouns`(`NOUNID`, `WORDID`, `SINGULAR`, `PLURAL`) VALUES (781, 3188, '\u0646\u064e\u062c\u0652\u0645', ('\u0646\u064e\u062c\u0652\u0645')) I am migrating my H2 database to MySQL, and I got this when scripting my H2 database: INSERT INTO…
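One way to handle such values is to decode the `\uXXXX` escapes in the export script before it reaches MySQL, rather than inside MySQL itself. The following Python sketch shows that alternative; the helper name `decode_unicode_escapes` is hypothetical, and it assumes the input contains only ASCII characters plus `\uXXXX` escapes, as in the values quoted above.

```python
def decode_unicode_escapes(text: str) -> str:
    """Turn literal \\uXXXX escape sequences into the characters they name."""
    # Assumption: the input is pure ASCII, so encoding it as ASCII is lossless.
    return text.encode("ascii").decode("unicode_escape")


if __name__ == "__main__":
    raw = r"\u0646\u064e\u062c\u0652\u0645"  # value taken from the question
    decoded = decode_unicode_escapes(raw)
    print(decoded)                  # the Arabic word as real characters
    print(decoded.encode("utf-8"))  # the UTF-8 bytes to send to the database
```

The decoded string can then be passed to the database driver as a normal parameter over a UTF-8 connection.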

Can there be 2 different UTF-8 encodings for the same character?

Question: I'm writing an application that needs to transcode its input from UTF-8 to ISO-8859-1 (Latin-1). All works fine, except that I sometimes get strange encodings for some umlaut characters. For example, the Latin-1 E with two dots (0xEB) usually arrives as UTF-8 0xC3 0xAB, but sometimes also as 0xC3 0x83 0xC2 0xAB. This has happened a number of times with different sources. Noting that the first and last bytes match what I expect, could there be an encoding rule that my library doesn't know about? Answer 1: $…
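The four-byte sequence in the question is what results when text that is already UTF-8 is misread as Latin-1 and encoded to UTF-8 a second time. The Python sketch below reproduces the bytes and reverses one layer of that double encoding; the function names are hypothetical and the answer in the excerpt is cut off, so this illustrates the effect rather than quoting the original answer.

```python
def double_encode(text: str) -> bytes:
    """Simulate the bug: UTF-8 bytes misread as Latin-1, then encoded again."""
    return text.encode("utf-8").decode("latin-1").encode("utf-8")


def undo_double_encoding(data: bytes) -> bytes:
    """Strip one extra layer of UTF-8 encoding from double-encoded bytes."""
    return data.decode("utf-8").encode("latin-1")


if __name__ == "__main__":
    correct = "ë".encode("utf-8")
    broken = double_encode("ë")
    print(correct.hex(" "))                       # c3 ab
    print(broken.hex(" "))                        # c3 83 c2 ab
    print(undo_double_encoding(broken).hex(" "))  # c3 ab
```

Each byte of the correct sequence (0xC3, 0xAB) is treated as a Latin-1 character and expanded into two bytes, which is why the first and last bytes still match.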

mb_convert_encoding for russian in php

Question: How do I convert Russian characters to UTF-8 in PHP using mb_convert_encoding or any other method? Answer 1: Did you try the following? Not sure if it works, though. mb_convert_encoding($str, 'UTF-8', 'auto'); Answer 2: $file = 'images/да так 1.jpg';//this is in UTF-8, needs to be system encoding (Russian) $new_filename = mb_convert_encoding($file, "Windows-1251", "utf-8");//turn utf-8 to system encoding Windows-1251 (Russian) Now your Russian files should open. Your Russian characters in PHP are already…

RegEx with extended latin alphabet (ä ö ü è ß)

Question: I want to do some basic string testing in Node.js. Assume I have a form where users enter their name and I want to check whether it's just rubbish or a real name. Happily (or sadly for my check), I get users from all around the world, which means that their names contain non-English characters such as ä ö ü ß é. I used to use /[A-Za-z -]{2,}/ but this doesn't match names like "Jan Buschtöns". Do I have to manually add every possible non-English but Latin character to my RegEx for it to work? I don't want…
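The question is about Node.js, but the underlying idea of matching "any letter" instead of listing characters by hand can be sketched in Python, whose `re` module treats `\w` as Unicode-aware for `str` patterns. The names `NAME_PATTERN` and `looks_like_name` are hypothetical, and the rule used (letter groups separated by a single space or hyphen, at least two characters) is an assumption loosely based on the pattern in the question.

```python
import re
import unicodedata

# [^\W\d_] means "a word character that is neither a digit nor an underscore",
# which for str patterns in Python 3 covers letters such as ä, ö, ü, è and ß.
NAME_PATTERN = re.compile(r"[^\W\d_]+(?:[ -][^\W\d_]+)*")


def looks_like_name(text: str) -> bool:
    """Return True if text is letter groups separated by a space or a hyphen."""
    # NFC composes "o" + combining diaeresis into the single code point "ö".
    normalized = unicodedata.normalize("NFC", text.strip())
    return len(normalized) >= 2 and NAME_PATTERN.fullmatch(normalized) is not None


if __name__ == "__main__":
    for candidate in ["Jan Buschtöns", "Anne-Marie", "x1", "A"]:
        print(candidate, looks_like_name(candidate))
```

A check like this only filters obvious noise; it still rejects valid names containing apostrophes or letters that have no precomposed form, so it should stay permissive.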

Converting “normal” std::string to utf-8

Question: Let's see if I can explain this without too many factual errors... I'm writing a string class and I want it to use UTF-8 (stored in a std::string) as its internal storage. I want it to be able to take both "normal" std::string and std::wstring as input and output. Working with std::wstring is not a problem: I can use std::codecvt_utf8<wchar_t> to convert both from and to std::wstring. However, after extensive googling and searching on SO, I have yet to find a way to convert between a "normal…

Sending arabic characters in URL

Question: I have this Arabic sentence: نايتيد أمامه عشرة أيام فقط لكي يقرر مستقبل برباتوف في النادي It must be sent in the URL. I tried this approach: $url = 'http://example.com/?q='.urlencode('نايتيد أمامه عشرة أيام فقط لكي يقرر مستقبل برباتوف في النادي'); The result of that encoding is: %D9%86%D8%A7%D9%8A%D8%AA%D9%8A%D8%AF+%D8%A3%D9%85%D8%A7%D9%85%D9%87+%D8%B9%D8%B4%D8%B1%D8%A9+%D8%A3%D9%8A%D8%A7%D9%85+%D9%81%D9%82%D8%B7+%D9%84%D9%83%D9%8A+%D9%8A%D9%82%D8%B1%D8%B1+%D9%85%D8%B3%D8%AA%D9%82%D8%A8%D9
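The percent-encoded output shown in the question is the expected form: each Arabic character becomes its UTF-8 bytes written as `%XX`, and each space becomes `+`. The Python sketch below reproduces the start of that output with `urllib.parse.quote_plus` and decodes it again; it illustrates the encoding only and is not the asker's PHP code, and the helper names are hypothetical.

```python
from urllib.parse import quote_plus, unquote_plus


def encode_query_value(text: str) -> str:
    """Percent-encode text as UTF-8 for use in a URL query string."""
    return quote_plus(text, encoding="utf-8")


def decode_query_value(value: str) -> str:
    """Reverse encode_query_value on the receiving side."""
    return unquote_plus(value, encoding="utf-8")


if __name__ == "__main__":
    phrase = "نايتيد أمامه"  # first two words of the sentence in the question
    encoded = encode_query_value(phrase)
    print("http://example.com/?q=" + encoded)
    print(decode_query_value(encoded) == phrase)
```

As long as the receiving side decodes the query value as UTF-8, the original sentence is recovered unchanged.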