character-encoding

Why are results of path.toString() failing to show all characters on Linux but ok on windows

谁说胖子不能爱 submitted on 2019-12-24 00:51:08
Question: In my Java code I use a FileVisitor to traverse a filesystem and create a structure of Paths, which is later converted to a JSON object for rendering in HTML. On Windows it runs fine, even against a Linux filesystem; on Linux, against the same (now local) filesystem, special characters are not rendered properly when toString() is called on a path, i.e. Windows debug output CreateFolderTree:createJsonData:SEVERE: AddingNode(1):Duarte Lôbo- Requiem and html displays ok as

Recoding data.frame object from latin1 to utf-8

纵然是瞬间 submitted on 2019-12-24 00:47:13
Question: I work on Windows 7 (my system: "LC_COLLATE=French_France.1252) with data containing accents. My data are encoded in ANSI, which lets me view them correctly in the tabs of RStudio. My problem: when I create a googleVis page (encoding utf-8), the accented characters are not displayed correctly. What I expected: I am looking to convert my latin1 data.frames to utf-8 with R just before creating googleVis pages. I have no idea how. The stringi package seems to work only with raw data. fr <-

java print unicode characters to bash shell (mac OsX)

拥有回忆 submitted on 2019-12-24 00:24:29
Question: I have this code in Java 1.6: System.out.println("\u00b2"); but in bash on OS X 10.6 I get question marks instead of the Unicode characters. What I actually want is to print characters 176, 177 and 178 of the extended ASCII table (see http://www.asciitable.com/) to create some art in the bash terminal. Any idea? Thanks. Answer 1: The following code works for me in UTF-8 enabled Terminal.app on Mac OS X 10.6.7: # code taken from: # "Print Unicode characters to the Terminal with Java", # http://hints
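A note on the code points involved, sketched in Python rather than Java. Assuming the "extended ASCII" table the asker links to is IBM code page 437 (an assumption, since the page is not quoted here), bytes 176 to 178 are the shade blocks U+2591 to U+2593, whereas \u00b2 is the superscript two. The variable names are illustrative only.

```python
# Bytes 176, 177, 178 interpreted as code page 437 ("extended ASCII" on DOS).
shade_bytes = bytes([176, 177, 178])
shades = shade_bytes.decode("cp437")

# The Unicode code points to print are therefore U+2591..U+2593,
# not U+00B0..U+00B2.
code_points = [hex(ord(ch)) for ch in shades]

print(shades, code_points)
```

Whether the glyphs show up still depends on the terminal and the output encoding both being UTF-8 capable.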

How to check if a char lies between a certain unicode range…?

痞子三分冷 submitted on 2019-12-23 23:54:36
Question: I want to check whether a particular char I get from the text field lies within a particular hex range of the Unicode character set. For example, if I enter a capital C then I would specify the range 41-5a. I want to do this for the Russian alphabet, but I can't figure it out. I can get the last char entered using: unichar lastEnteredChar= [[textField.text stringByReplacingCharactersInRange:range withString:string] characterAtIndex:[[textField.text stringByReplacingCharactersInRange:range withString:string] length]
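The range check itself can be sketched as follows, in Python rather than Objective-C. It assumes "Russian alphabet" means the 33 modern letters: U+0410 to U+044F (А to я) plus Ё (U+0401) and ё (U+0451), which sit outside that contiguous run. The function name is hypothetical.

```python
def is_russian_letter(ch: str) -> bool:
    """Return True if ch is one of the 33 modern Russian letters (either case)."""
    code = ord(ch)
    # U+0410..U+044F covers А..Я and а..я; Ё and ё are encoded separately.
    return 0x0410 <= code <= 0x044F or code in (0x0401, 0x0451)


print(is_russian_letter("Ж"), is_russian_letter("C"))
```

The same comparison works on a unichar value, since all of these code points fit in a single UTF-16 unit.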

Why does the string “¿” get translated to “Â¿” when calling .getBytes()

浪尽此生 submitted on 2019-12-23 23:27:11
Question: When writing the string "¿" out using System.out.println(new String("¿".getBytes("UTF-8"))); Â¿ is written instead of just ¿. Why? And how do we fix it? Answer 1: You don't have to use UTF-16 to solve this: new String("¿".getBytes("UTF-8"), "UTF-8"); works just fine. As long as the encoding given to the getBytes() method is the same as the encoding you pass to the String constructor, you should be fine! Answer 2: You need to specify the Charset in the String constructor (see the API docs). Answer 3: Try:
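The mismatch described in Answer 1 can be reproduced in a few lines. This is a sketch in Python, not the asker's Java; it assumes the platform default charset on the asker's machine was a single-byte Latin charset such as ISO-8859-1, which is what new String(bytes) without a charset argument would then use.

```python
text = "\u00bf"                    # the character ¿
utf8_bytes = text.encode("utf-8")  # two bytes: 0xC2 0xBF

# Decoding with a different, single-byte charset turns each byte into
# its own character, which produces the stray "Â".
wrong = utf8_bytes.decode("iso-8859-1")

# Decoding with the same charset used for encoding restores the original.
right = utf8_bytes.decode("utf-8")

print(repr(wrong), repr(right))
```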

dompdf special character showing question mark?

流过昼夜 submitted on 2019-12-23 22:03:26
Question: I have used dompdf 0.5.1 to generate PDF files, but special characters are not displayed properly. For example, . It is showing something like – “ in the generated PDF file. I used UTF-8 encoding, like <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />, in the HTML page that is rendered by dompdf. I also decoded the HTML before sending it to dompdf, like $dompdf->load_html(utf8_decode($html)); . But I get ? marks instead of the above characters. How do I
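One likely source of the ? marks: PHP's utf8_decode() converts UTF-8 to ISO-8859-1 and substitutes ? for any character that has no ISO-8859-1 equivalent, and the en dash and curly quotes are among those. The snippet below is a Python analogy of that behaviour, not dompdf code, and says nothing about how dompdf 0.5.1 handles fonts.

```python
# En dash (U+2013) and left double quote (U+201C) do not exist in ISO-8859-1.
text = "\u2013 \u201c"

# errors="replace" mimics the "?" substitution done by a lossy conversion.
lossy = text.encode("iso-8859-1", errors="replace")

# An accented Latin letter survives, because ISO-8859-1 does contain it.
kept = "\u00e9".encode("iso-8859-1", errors="replace")

print(lossy, kept)
```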

Scala - Converting from ISO-8859-1 to UTF-8 gives foreign character strangeness

房东的猫 submitted on 2019-12-23 20:34:06
Question: Here's my problem: I have an InputStream that I've converted to a byte array, but I don't know the character set of the InputStream at runtime. My original thought was to do everything in UTF-8, but I see strange issues with streams that are encoded as ISO-8859-1 and contain foreign characters. (Those crazy Swedes.) Here's the code in question: IOUtils.toString(inputstream, "utf-8") // Fails on iso8859-1 foreign characters To simulate this, I have: new String("\u00F6") // Returns ö as expected,
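Why the UTF-8 decode fails on such input, plus one common workaround, sketched in Python rather than Scala. The fallback is a heuristic and an assumption of this sketch, not something the question proposes: it treats anything that is not valid UTF-8 as ISO-8859-1, which never fails but can silently pick the wrong charset. The function name is hypothetical.

```python
def decode_with_fallback(data: bytes) -> str:
    """Try strict UTF-8 first; fall back to ISO-8859-1 if the bytes are invalid."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Every byte value is valid ISO-8859-1, so this branch cannot raise.
        return data.decode("iso-8859-1")


latin1_bytes = "\u00f6".encode("iso-8859-1")  # ö as the single byte 0xF6

# A lone 0xF6 is not valid UTF-8, so a lenient decoder substitutes U+FFFD.
garbled = latin1_bytes.decode("utf-8", errors="replace")

print(repr(garbled), decode_with_fallback(latin1_bytes))
```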

Get source code with Chinese characters PHP

故事扮演 submitted on 2019-12-23 20:22:27
Question: Well, I give up. I've tried everything I could think of to retrieve data from a target website whose content is in a simplified Chinese encoding (charset=GB2312). I've been using the simple_html_parser as always, but it doesn't seem to return the Chinese characters; in fact, all I get are some weird question marks embedded inside a rhomboid shape. ("�������ѯ�ؼ��֣�" Like so) Declaring the encoding for the php file didn't do anything except get rid of some unwanted
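The rhomboid question marks are U+FFFD replacement characters, which appear when GB2312 bytes are interpreted as UTF-8. The following Python sketch (not PHP, and using a sample string chosen for illustration rather than text from the target site) shows the effect and the fix of decoding with the charset the page declares.

```python
sample = "查询"                    # illustrative text, not from the target site
gb_bytes = sample.encode("gb2312")

# Interpreting GB2312 bytes as UTF-8 yields replacement characters.
garbled = gb_bytes.decode("utf-8", errors="replace")

# Decoding with the declared charset recovers the text, which can then be
# re-encoded as UTF-8 for the rest of the pipeline.
recovered = gb_bytes.decode("gb2312")
as_utf8 = recovered.encode("utf-8")

print(repr(garbled), recovered)
```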

Convert ISO/Windows charsets to UTF-8 in Javascript

折月煮酒 submitted on 2019-12-23 19:39:54
Question: I'm developing a Firefox plugin, and I fetch web pages to do some analysis for the user. The problem is that when I fetch (via XMLHttpRequest) pages that are not UTF-8 encoded, the string I see is garbled. For example, Hebrew pages with windows-1255 or Chinese pages with gb2312. I have already tried the following: var uDecoder=Components.classes["@mozilla.org/intl/scriptableunicodeconverter"].getService(Components.interfaces.nsIScriptableUnicodeConverter); uDecoder.charset="windows-1255"; alert( xhr
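The general approach is to keep the response as raw bytes and decode them with the charset the server declares, instead of a hard-coded one. Below is a language-neutral sketch in Python, not Firefox extension code; the helper names are hypothetical, and it assumes the charset is available in the Content-Type header (pages that only declare it in a meta tag would need extra handling).

```python
from email.message import Message


def charset_from_content_type(header: str, default: str = "utf-8") -> str:
    """Extract the charset parameter from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = header
    return msg.get_content_charset() or default


def decode_body(raw: bytes, content_type: str) -> str:
    """Decode raw response bytes using the charset named in the header."""
    return raw.decode(charset_from_content_type(content_type))


body = "שלום".encode("windows-1255")
print(decode_body(body, "text/html; charset=windows-1255"))
```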

Can't get a degree symbol into raw_input

百般思念 submitted on 2019-12-23 19:22:49
Question: The problem in my code looks something like this: #!/usr/bin/python # -*- coding: UTF-8 -*- deg = u'°' print deg print '40%s N, 100%s W' % (deg, deg) codelim = raw_input('40%s N, 100%s W)? ' % (deg, deg)) I'm trying to generate a raw_input prompt for delimiter characters inside a latitude/longitude string, and the prompt should include an example of such a string. print deg and print '40%s N, 100%s W' % (deg, deg) both work fine -- they return "°" and "40° N, 100° W" respectively -- but the