character-encoding

JSF 2.0 request.getParameter returns a string with wrong encoding

戏子无情 posted on 2020-01-02 07:58:49
Question: I'm writing an application in JSF 2.0 that supports many languages, among them ones with special characters. I use String value = request.getParameter("name") with the POST method, the page encoding is set to UTF-8, and the app is deployed on Apache Tomcat 6, which has the connector set to UTF-8 in the server.xml file: <Connector URIEncoding="utf-8" connectionTimeout="20000" port="8088" protocol="HTTP/1.1" redirectPort="8443"/> Yet I get strange results like ä for example in place of
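
The garbling quoted above has the shape of UTF-8 bytes being decoded as ISO-8859-1. The following is a minimal sketch of that effect, written in Python rather than the question's Java; that this is the cause in the asker's setup is an assumption, since the snippet is cut off before any answer. As background, Tomcat's URIEncoding attribute governs how the request URI is decoded, not a POST body.

```python
# Hypothetical illustration: the UTF-8 bytes of "ä" read back as ISO-8859-1.
original = "\u00e4"                            # the character "ä"
utf8_bytes = original.encode("utf-8")          # two bytes: 0xC3 0xA4
garbled = utf8_bytes.decode("iso-8859-1")      # two characters instead of one

# Printing via ascii() keeps the output independent of the console encoding.
print(ascii(original), "->", ascii(garbled))

# Reversing the wrong decode recovers the original text.
repaired = garbled.encode("iso-8859-1").decode("utf-8")
print(repaired == original)
```

The round trip only works while no bytes were lost along the way, so it is a diagnostic aid rather than a fix for the server configuration.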

Character encoding in Excel spreadsheet (and what Java charset to use to decode it)

百般思念 posted on 2020-01-02 06:00:34
Question: I am using the JExcel library to read Excel spreadsheets. Each cell on the spreadsheet may contain localization strings in any of something like 44 languages (English, Portuguese, French, Chinese, etc). Today I don't tell the API anything regarding the encoding it's supposed to use. It's handling the Chinese OK, but it always screws up Portuguese and German. Somehow the default encoding (MacRoman on my dev box, UTF-8 on production) is failing to properly interpret the strings it pulls out of the

How can I detect Japanese text in a Java string?

放肆的年华 posted on 2020-01-02 05:21:50
Question: I need to be able to detect Japanese characters in a Java string. Currently I'm getting the UnicodeBlock and checking to see if it's equal to Character.UnicodeBlock.KATAKANA or Character.UnicodeBlock.HALFWIDTH_AND_FULLWIDTH_FORMS, but I'm not 100% sure that's going to cover everything. Any suggestions? Answer 1: I use the following Java method. It might not completely address your requirement, though. /** * Returns if a character is one of Chinese-Japanese-Korean characters. *

How can I print a euro (€) symbol in Python?

喜欢而已 posted on 2020-01-02 05:04:58
Question: I'm teaching myself Python using the command-line interpreter (v3.5 for Windows). All I want to do is output some text that includes the euro (€) symbol, which I understand to be code 80h (128 dec). #! # -*- coding: utf-8 -*- mytext = 'Please pay \x8035.' print(mytext) It falls over on the last line: UnicodeEncodeError: 'charmap' codec can't encode character '\x80' in position 11: character maps to <undefined> I've done lots of googling (re encodings etc) and I've a rough idea why the print
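
A minimal sketch of the distinction the error message points at: in a Python 3 string, '\x80' is the code point U+0080, while 80h is the euro sign only as a byte in Windows-1252. The variable name same_text is introduced here for illustration; mytext comes from the question.

```python
# In a Python 3 str, "\x80" is the control character U+0080, not the euro sign.
# The byte 0x80 means the euro sign only when decoded as Windows-1252.
assert b"\x80".decode("cp1252") == "\u20ac"
assert "\x80" != "\u20ac"

# Either spelling below puts the actual euro sign (U+20AC) into the string.
mytext = "Please pay \u20ac35."
same_text = "Please pay \N{EURO SIGN}35."
assert mytext == same_text

# Encoding explicitly shows the bytes each target encoding needs.
print(mytext.encode("utf-8"))     # the euro sign takes three bytes here
print(mytext.encode("cp1252"))    # the euro sign is the single byte 0x80 here
```

Whether print(mytext) itself succeeds still depends on the encoding of the console it writes to, which this sketch does not change.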

Behavior of using \Z vs \z as Scanner delimiter

孤者浪人 posted on 2020-01-02 04:30:09
Question: [Edit] I found the answer, but I can't answer the question due to restrictions on new users. Either way, this is a known bug in Java. http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8028387 I'm trying to read a file into a string in Java 6 on 64-bit Ubuntu. Java is giving me the very strange result that with "\\Z" it reads the entire file, but with "\\z" it reads the entire string up to 1024 characters. I've read the Java 6 API for all the classes and I am at a loss. Description of \Z and

Zend Studio for Eclipse - Switch character encoding for all files in a project

丶灬走出姿态 posted on 2020-01-02 03:00:29
Question: I'm using Zend Studio for Eclipse on Mac, and it seems to keep setting all files to have an encoding of 'Mac Roman'. This becomes problematic when I save the files, as they all need to be UTF-8. I know how to change the encoding to UTF-8 on a file-by-file basis, but I was wondering if I could set this project-wide? Answer 1: Eclipse-wide: Window->Preferences->Appearance->Workspace Project-wide: Right-click on Project->Properties File-wide: Right-click on File->Properties Answer 2: On my Eclipse for PHP

Changing the “locale preferred encoding”

一个人想着一个人 posted on 2020-01-02 02:55:30
Question: [Using Python 3.2] If I don't provide the encoding argument to open, the file is opened using locale.getpreferredencoding(). So for example, on my Windows machine, any time I use open('abc.txt'), it would be decoded using cp1252. I would like to switch all my input files to utf-8. Obviously, I can add encoding = 'utf-8' to all my open function calls. Or, better, encoding = MY_PROJECT_DEFAULT_ENCODING, where the constant is defined at the global level somewhere. But I was wondering if there
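
One way to get the project-wide constant the question describes without repeating the argument at every call site is a small wrapper around open. This is only a sketch: open_text is a hypothetical helper name, and the temporary file exists just to make the example self-contained.

```python
import functools
import locale
import os
import tempfile

# Project-level constant, named as in the question.
MY_PROJECT_DEFAULT_ENCODING = "utf-8"

# Hypothetical helper: open() with the project's encoding pre-filled.
# It assumes text mode; binary mode does not accept an encoding argument.
open_text = functools.partial(open, encoding=MY_PROJECT_DEFAULT_ENCODING)

# What a plain open() call would fall back to on this machine.
print(locale.getpreferredencoding(False))

# Round-trip a non-ASCII string through a temporary file.
path = os.path.join(tempfile.mkdtemp(), "abc.txt")
with open_text(path, "w") as f:
    f.write("na\u00efve \u20ac")
with open_text(path) as f:
    content = f.read()
```

A caller can still override the default for a single file by passing encoding explicitly, for example open_text(path, encoding="cp1252").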

How can I know whether my String contains diacritics?

孤人 posted on 2020-01-02 02:23:11
Question: For example, text = Československá obchodní banka; the text string contains diacritics like Č, á, etc. I want to write a function to which I will pass this string "Československá obchodní banka", and the function will return true if the string contains diacritics, else false. I have to handle diacritics and strings which contain characters that don't fall in the A-Z or a-z range separately. 1) If String contains diacritics then I have to do some XXXXXX on it. 2) If String contains character other than A-Z or
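
A minimal sketch of one common approach to the first case, written in Python for illustration (the snippet does not show the asker's own code): decompose the text with Unicode normalization form NFD and look for combining marks. The function name contains_diacritics is hypothetical.

```python
import unicodedata


def contains_diacritics(text):
    """Return True if any character decomposes into a base plus a combining mark."""
    # NFD splits "Č" into "C" followed by U+030C COMBINING CARON, and so on.
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() returns a non-zero class for combining marks.
    return any(unicodedata.combining(ch) != 0 for ch in decomposed)


has_marks = contains_diacritics("Československá obchodní banka")
plain = contains_diacritics("banka")
print(has_marks, plain)
```

Letters such as "ø" or "ł" have no canonical decomposition, so this check reports False for them; they would need to be handled by the separate non-A-Z rule the question mentions.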

What is the native narrow string encoding on Windows?

删除回忆录丶 posted on 2020-01-02 02:12:07
Question: The Subversion API has a number of functions for converting from "natively-encoded" strings to strings that are encoded in UTF-8. My question is: what is this native encoding on Windows? Does it depend on locale? Answer 1: "Natively encoded" strings are strings written in whatever code page the user is using. That is, they are numbers that are translated to the appropriate glyphs based on the correct code page, assuming the file was saved that way and not as a UTF-8 file. This is a candidate

Python HTMLParser: UnicodeDecodeError

有些话、适合烂在心里 posted on 2020-01-02 00:55:19
Question: I'm using HTMLParser to parse pages I pull down with urllib, and am coming across UnicodeDecodeError exceptions when passing some to HTMLParser. I tried using chardet to detect the encodings and to convert to ascii or utf-8 (the docs don't seem to say what it should be). Lossiness is acceptable, but while the decode/encode lines work just fine, I always get the error after self.feed(). The information is there if I just print it out. from HTMLParser import HTMLParser import urllib import
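
A minimal sketch of decoding the downloaded bytes into text before calling feed(), so the parser never has to decode anything itself. It uses Python 3's html.parser rather than the Python 2 HTMLParser module in the question, and the class name TextCollector and the sample bytes are hypothetical stand-ins for the asker's parser and a fetched page.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical parser that collects the text between tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


# Stand-in for bytes fetched from the network. 0xE9 is "é" in ISO-8859-1
# and is not valid UTF-8 on its own.
raw = b"<p>caf\xe9</p>"

# Decode before feeding. errors="replace" never raises; each undecodable
# byte becomes U+FFFD, which fits the remark that lossiness is acceptable.
text = raw.decode("utf-8", errors="replace")

parser = TextCollector()
parser.feed(text)
parser.close()
collected = "".join(parser.chunks)
```

If the real encoding is known, for example from the Content-Type header, passing it to decode() instead of "utf-8" avoids the replacement characters.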