encoding | 易学教程

Detect the encoding of a text file using C#

Question: I have a set of Markdown files to pass to a Jekyll project and need to find the encoding of each one, i.e. UTF-8 with BOM, UTF-8 without BOM, or ANSI, using a program or an API. Given the location of the files, the program should list them, read them, and report the encoding of each. Is there any code or API for this? I have already tried sr.CurrentEncoding on a StreamReader, as mentioned in "Effective way to find any file's Encoding", but the result differs from the result from a
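The three cases named in the question can be told apart by looking at the raw bytes. The sketch below is in Python rather than C#, and it is only a heuristic: the function names, the labels, and the `*.md` pattern are assumptions made for illustration, and a file that is not valid UTF-8 is simply reported as "ANSI" without identifying the code page.

```python
import codecs
from pathlib import Path

# BOM signatures. UTF-32 LE is checked before UTF-16 LE because
# its BOM (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32 LE with BOM"),
    (codecs.BOM_UTF32_BE, "UTF-32 BE with BOM"),
    (codecs.BOM_UTF8, "UTF-8 with BOM"),
    (codecs.BOM_UTF16_LE, "UTF-16 LE with BOM"),
    (codecs.BOM_UTF16_BE, "UTF-16 BE with BOM"),
]


def guess_encoding(data: bytes) -> str:
    """Return a best-effort label for the encoding of raw file bytes."""
    for bom, label in _BOMS:
        if data.startswith(bom):
            return label
    if data.isascii():
        # Pure ASCII is byte-identical in UTF-8 and in ANSI code pages.
        return "ASCII (valid as UTF-8 or ANSI)"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8, so most likely a single-byte (ANSI) code page.
        return "ANSI (not valid UTF-8; code page unknown)"
    return "UTF-8 without BOM"


def report(folder: str, pattern: str = "*.md") -> dict:
    """Map each matching file name in `folder` to its guessed encoding."""
    return {
        path.name: guess_encoding(path.read_bytes())
        for path in sorted(Path(folder).glob(pattern))
    }
```

Calling `report("path/to/posts")` (a placeholder path) returns a dictionary of file names and labels. Checking the bytes directly avoids relying on what a reader object reports after it has already chosen a decoder.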

Response.Write - filename encoding wrong in Internet Explorer

Question: I use the following code to send files from my server to the client: Response.AppendHeader("content-disposition", "attachment; filename=" + FileName); Response.ContentType = MimeType; Response.WriteFile(PathToFile); Response.End(); This works fine. The problem is that when I download files with Internet Explorer, special characters such as the Danish æ, ø and å are interpreted incorrectly, so a file named 'Test æ ø å file.txt' downloads as 'Test Ã¦_Ã¸_Ã¥ file.txt'. I've tried adding Byte Order
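One standards-based way to carry a non-ASCII file name in this header is the `filename*` parameter from RFC 6266, which percent-encodes the UTF-8 bytes of the name. The Python sketch below only builds the header value; the function name is hypothetical, and whether the Internet Explorer version in the question honors `filename*` cannot be confirmed from the excerpt.

```python
from urllib.parse import quote


def content_disposition(filename: str) -> str:
    """Build an attachment Content-Disposition value for a possibly non-ASCII name."""
    # ASCII-only fallback for clients that ignore filename*:
    # anything outside printable ASCII, plus '"' and '\', becomes "_".
    fallback = "".join(
        ch if " " <= ch <= "~" and ch not in '"\\' else "_" for ch in filename
    )
    # Percent-encode the UTF-8 bytes of the real name (quote() uses UTF-8 by default).
    encoded = quote(filename, safe="")
    return f"attachment; filename=\"{fallback}\"; filename*=UTF-8''{encoded}"


header_value = content_disposition("Test æ ø å file.txt")
```

The plain `filename` parameter stays in the header as a fallback, so clients that do not understand `filename*` still receive a usable, if lossy, name.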

Conversion from string to wstring is causing ú to lose encoding

Question: The variable filepath, which is a string, contains the value Música. I have the following code: wstring fp(filepath.length(), L' '); copy(filepath.begin(), filepath.end(), fp.begin()); fp then contains the value M?sica. How do I convert filepath to fp without losing the encoding for the ú character? Answer 1: Use the function MultiByteToWideChar. Sample code: std::string toStdString(const std::wstring& s, UINT32 codePage) { unsigned int bufferSize = (unsigned int)s.length()+1; char* pBuffer = new
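The principle behind the answer is that narrow-string bytes have to be decoded with the code page they were written in, not copied element by element into wider characters. The short Python sketch below illustrates that principle only and is not a port of the C++ code; it assumes the narrow string is Windows-1252, which the excerpt does not state.

```python
# Assumption: the narrow string holds Windows-1252 bytes (one byte per character).
raw = "Música".encode("cp1252")

# Decoding with the matching code page recovers every character.
correct = raw.decode("cp1252")

# Decoding with a mismatched codec cannot interpret the byte for ú (0xFA),
# so it is replaced by U+FFFD, analogous to the "?" seen in the question.
mismatched = raw.decode("utf-8", errors="replace")
```

This is why the conversion function in the answer takes a code page argument: the bytes alone do not say which characters they represent.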

UTF-8 support issue to Java Swing? [duplicate]

Question: Possible duplicate: how to implement UTF-8 format in Swing application? In a Swing application I have a send button, a text area and a text field. When I press the send button, the text from the text field should be sent to the text area. It works fine in English, but not in the local language... package package1; import java.awt.*; import java.awt.event.*; import java.io.UnsupportedEncodingException; import javax.swing.BorderFactory;

How to use Stanford LexParser for Chinese text?

Question: I can't seem to get the correct input encoding for Stanford NLP's LexParser. How do I use the Stanford LexParser for Chinese text? I've done the following to download the tool: $ wget http://nlp.stanford.edu/software/stanford-parser-full-2015-04-20.zip $ unzip stanford-parser-full-2015-04-20.zip $ cd stanford-parser-full-2015-04-20/ And my input text is in UTF-8: $ echo "应有尽有的丰富选择定将为您的旅程增添无数的赏心乐事。" > input.txt $ echo "应有尽有#VV 的#DEC 丰富#JJ 选择#NN 定#VV 将#AD 为#P 您#PN 的#DEG 旅程#NN 增添

Python insert UTF8 string into SQLite

Question: I know there are similar questions, but the answers differ and are somewhat confusing. I have this string: titulo = "Así Habló Zaratustra (Cómic)" When I try to insert it into the SQLite database I get the error: sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings. I've tried a couple of things without
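The error message recommends switching to Unicode strings. A minimal sketch of that approach follows, written for Python 3, where every `str` is already Unicode; the table name `libros` and the function name are hypothetical, and an in-memory database stands in for the real one.

```python
import sqlite3


def roundtrip(title: str) -> str:
    """Insert a Unicode title into a throwaway table and read it back."""
    conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
    try:
        conn.execute("CREATE TABLE libros (titulo TEXT)")
        # A parameterized query hands the Unicode string to SQLite unchanged.
        conn.execute("INSERT INTO libros (titulo) VALUES (?)", (title,))
        conn.commit()
        (stored,) = conn.execute("SELECT titulo FROM libros").fetchone()
        return stored
    finally:
        conn.close()


titulo = "Así Habló Zaratustra (Cómic)"
stored_title = roundtrip(titulo)
```

On Python 2, where this error message originates, the equivalent step would be to pass a `unicode` object (for example a `u"..."` literal) rather than a byte string.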

How to normalize unicode encoding for iso-8859-15 conversion in python?

Question: I want to convert a Unicode string to iso-8859-15. These strings include the u"\u2019" character (RIGHT SINGLE QUOTATION MARK, see http://www.fileformat.info/info/unicode/char/2019/index.htm), which is not part of the iso-8859-15 character set. In Python, how can I normalize the Unicode characters so that they match the iso-8859-15 encoding? I have looked at the unicodedata module without success. I managed to do the job with s.replace(u"\u2019", "'").encode('iso-8859-15') but I would like to find
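The `s.replace` call in the question can be generalized with a translation table. The sketch below is one possible approach, not a standard recipe: the mapping is a hand-picked assumption covering a few common typographic characters, and anything else that iso-8859-15 cannot represent is replaced by `?`. U+2019 has no Unicode decomposition, which is why `unicodedata.normalize` leaves it unchanged.

```python
# Hand-picked fallbacks for typographic characters missing from iso-8859-15.
# This table is an illustrative assumption; extend it for your own data.
PUNCTUATION_FALLBACKS = {
    0x2018: "'",    # LEFT SINGLE QUOTATION MARK
    0x2019: "'",    # RIGHT SINGLE QUOTATION MARK
    0x201C: '"',    # LEFT DOUBLE QUOTATION MARK
    0x201D: '"',    # RIGHT DOUBLE QUOTATION MARK
    0x2013: "-",    # EN DASH
    0x2014: "-",    # EM DASH
    0x2026: "...",  # HORIZONTAL ELLIPSIS
}


def to_latin9(text: str) -> bytes:
    """Map known punctuation to ASCII, then encode as iso-8859-15."""
    # errors="replace" turns any remaining unencodable character into b"?".
    return text.translate(PUNCTUATION_FALLBACKS).encode("iso-8859-15", errors="replace")


encoded = to_latin9("l\u2019été coûte 5€")
```

Characters that iso-8859-15 does support, such as é, û and the euro sign, pass through the table untouched and are encoded normally.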

MySQL German accent-insensitive search in full-text searches

Question: Let's take an example hotels table: CREATE TABLE `hotels` ( `HotelNo` varchar(4) character set latin1 NOT NULL default '0000', `Hotel` varchar(80) character set latin1 NOT NULL default '', `City` varchar(100) character set latin1 default NULL, `CityFR` varchar(100) character set latin1 default NULL, `Region` varchar(50) character set latin1 default NULL, `RegionFR` varchar(100) character set latin1 default NULL, `Country` varchar(50) character set latin1 default NULL, `CountryFR` varchar(50)

Rails UTF-8 response

Question: I've got a Rails 3.2 app running on Ruby 1.9.3 that returns JSON data stored in a MongoDB database. The data seems to be stored correctly in MongoDB, e.g. (look at the name attribute): { "_id" : ObjectId("4f986cbe4c8086fdc9000002"), "created_at" : ISODate("2012-04-25T21:31:45.474Z"), "updated_at" : ISODate("2012-04-26T22:07:23.901Z"), "creator_id" : ObjectId("4f6b4d3c4c80864381000001"), "updater_id" : null, "name" : "Trädgår'n", "sort" : "tradgarn", "address" : "Nya Allén 11", "coordinates" : [