utf-8

Python UnicodeDecodeError: 'utf8' codec can't decode byte… unexpected code byte

天涯浪子 submitted on 2020-01-15 18:52:17
Question: A Python newbie's journey to build his first web app (app link: http://contractpy.appspot.com/ - it's just an experimental app). Following the advice of a Stack Overflow user, I started using a template system, Jinja2 (I'm using Python 2.6), but now I'm stuck with this error: 2012-06-17 11:44:39 Running command: "['C:\\Python26\\pythonw.exe', 'C:\\Program Files (x86)\\Google\\google_appengine\\dev_appserver.py', '--admin_console_server=', '--port=8084', 'C:\\Users\\CG\\Documents\\udacity\
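
The excerpt is cut off before the traceback, so the actual cause is not visible here. As a rough illustration of how this class of error typically arises, here is a minimal sketch in Python 3 (not the asker's Python 2.6 setup), assuming the offending input holds Latin-1 bytes that are being decoded as UTF-8. The sample text and the helper name `try_decode` are hypothetical.

```python
def try_decode(raw, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


# Hypothetical sample: "é" saved as Latin-1 is the single byte 0xE9.
raw = "caf\u00e9".encode("latin-1")

# 0xE9 announces a multi-byte UTF-8 sequence that never completes, so this fails.
as_utf8 = try_decode(raw, "utf-8")

# Decoding with the encoding the bytes were actually written in succeeds.
as_latin1 = try_decode(raw, "latin-1")

print(as_utf8, as_latin1)
```

If this is the situation, the usual remedy is to decode with the encoding the data was really saved in, or to re-save the source or template file as UTF-8.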

Android - Characters such as å,ä,ö do not render correctly in WebView

青春壹個敷衍的年華 submitted on 2020-01-15 12:02:53
Question: I am using the following code to render my WebView in Android: webview.loadDataWithBaseURL(null, "Subject: "+ getSubject() +" Content: "+ getContent() , "text/html" , "UTF-8", ""); The subject and content that I receive from the server are UTF encoded and show wrongly as Ã¥,ä,ö in the log and on screen. However, in the iOS web view they show up correctly as å,ä,ö. How do I get them to show as å,ä,ö in Android as well? Answer 1: Make sure the content you receive uses a tag like this in <head>: <meta
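
The "Ã¥" pattern is what UTF-8 bytes look like when something decodes them as Latin-1. A minimal sketch of that effect, written in Python rather than Android Java and assuming a Latin-1 decode happens somewhere between the server and the view (the excerpt does not confirm where):

```python
original = "å, ä, ö"

# The server sends UTF-8: each of these letters becomes two bytes.
utf8_bytes = original.encode("utf-8")

# Reading those bytes as Latin-1 turns every byte into its own character.
garbled = utf8_bytes.decode("latin-1")

# As long as no bytes were lost, the mistake is reversible.
repaired = garbled.encode("latin-1").decode("utf-8")

print(garbled)
print(repaired)
```

Seeing two characters per accented letter, with the first one always "Ã", is a strong hint that the data itself is fine and only the declared or assumed charset is wrong.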

Storing string datasets in hdf5 with unicode

别等时光非礼了梦想. submitted on 2020-01-15 10:14:35
Question: I am trying to store variable string expressions from a file which contains special characters, like ø, æ, and å. Here is my code: import h5py as h5 file = h5.File('deleteme.hdf5','a') dt = h5.special_dtype(vlen=str) dset = file.create_dataset("text",(1,),dtype=dt) dset.attrs[str(1)] = "some text with ø, æ, å" However, the text is not stored properly. The stored data contains the text: "some text with \37777777703\37777777670, \37777777703\37777777646,\37777777703\37777777645" How can I store
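
One possible reading of that dump, not confirmed by the excerpt: the stored bytes are valid UTF-8, and the viewing tool prints each non-ASCII byte as a signed 8-bit value widened to 32 bits and shown in octal. The sketch below uses only the standard library (no h5py) and a hypothetical helper, `sign_extended_octal`, to reproduce the escapes under that assumption.

```python
def sign_extended_octal(text):
    """Render UTF-8 bytes the way a tool might if it treats each byte as a
    signed 8-bit value, widens it to 32 bits, and prints it in octal."""
    out = []
    for byte in text.encode("utf-8"):
        if byte < 0x80:
            # Plain ASCII bytes are shown as-is.
            out.append(chr(byte))
        else:
            signed = byte - 256  # reinterpret 0x80..0xFF as a negative number
            out.append("\\" + format(signed & 0xFFFFFFFF, "o"))
    return "".join(out)


print(sign_extended_octal("ø"))
print(sign_extended_octal("æ"))
print(sign_extended_octal("å"))
```

Under this assumption the three letters map to exactly the escape pairs quoted in the question, which would mean the text is stored correctly and only the display is misleading.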

Is there a way to select all the contents of a node?

我的未来我决定 submitted on 2020-01-15 09:20:51
Question: Is there a way to select all the contents of a node in Nokogiri? <root> <element>this is <hi>the content</hi> of my æøå element</element> </root> The result of getting the content of /root/element should be: this is <hi>the content</hi> of my æøå element Edit: It seems like the solution is simply to use myElement.inner_html(). The problem I had was in fact that I was relying on an old version of libxml2, which escaped all the special characters. Answer 1: Nokogiri.parse('<root><element>this is

converting UTF-8 string to ASCII in pure LUA

大兔子大兔子 submitted on 2020-01-15 09:11:53
Question: I have a question about sending and receiving data with special chars (German umlauts). When I send the string "Café Zeezicht" with the code below, the string is okay on the server side. But how can I receive and decode incoming data containing the same chars? Right now it looks like "Caf? Zeezicht". I am searching for a pure Lua function, because I have no ability to load libraries. ------------------------------------------------------------ -- Function for converting ASCII to
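
The "Caf? Zeezicht" result is what a lossy conversion to ASCII produces. The sketch below shows the effect in Python, not Lua, so it is not a drop-in answer to the question; the function names are hypothetical, and it assumes the "é" arrives as the single code point U+00E9.

```python
import unicodedata


def to_ascii_lossy(text):
    """Replace every non-ASCII character with '?', as a naive converter would."""
    return text.encode("ascii", "replace").decode("ascii")


def to_ascii_stripped(text):
    """Split accented letters into base letter plus accent, then drop the accents."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


name = "Caf\u00e9 Zeezicht"
print(to_ascii_lossy(name))
print(to_ascii_stripped(name))
```

The second function keeps the text readable, but it still discards information, so it only suits cases where plain ASCII is a hard requirement.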

How to distinguish en-dash from hyphen in Notepad++?

倖福魔咒の submitted on 2020-01-15 08:26:48
Question: Using Notepad++, I'm not able to distinguish an en-dash ( – ) from the standard hyphen ( - ). Surprisingly, both are rendered as the latter. Is there a way to visually distinguish them or, even better, to automatically replace every en-dash with a hyphen when opening or saving the text file? Using a RegEx to manually search and replace is not an option because I could forget to do it. I've installed Notepad++'s TextFX plugin, which I know is good at handling these things, but I'm stuck.
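
Even when the two characters look identical in an editor, they are different code points. A minimal sketch of telling them apart and normalizing them with a script run outside Notepad++ (this is not a Notepad++ or TextFX feature, and the helper names are hypothetical):

```python
import unicodedata

EN_DASH = "\u2013"
HYPHEN_MINUS = "-"


def describe(ch):
    """Return the code point and official Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def replace_en_dashes(text):
    """Replace every en-dash with the ASCII hyphen-minus."""
    return text.replace(EN_DASH, HYPHEN_MINUS)


print(describe(EN_DASH))
print(describe(HYPHEN_MINUS))
print(replace_en_dashes("pages 10\u201320"))
```

A script like this could be run as a pre-save or pre-commit step so the replacement does not depend on remembering a manual search.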

How to distinguish en-dash from hyphen in Notepad++?

不问归期 submitted on 2020-01-15 08:25:00
Question: Using Notepad++, I'm not able to distinguish an en-dash ( – ) from the standard hyphen ( - ). Surprisingly, both are rendered as the latter. Is there a way to visually distinguish them or, even better, to automatically replace every en-dash with a hyphen when opening or saving the text file? Using a RegEx to manually search and replace is not an option because I could forget to do it. I've installed Notepad++'s TextFX plugin, which I know is good at handling these things, but I'm stuck.

Mangling of French unicode when webscraping with rvest

只谈情不闲聊 submitted on 2020-01-15 06:40:11
Question: I'm looking at scraping a French website using the rvest package. library(rvest) url <- "https://www.vins-bourgogne.fr/nos-vins-nos-terroirs/tous-les-bourgognes/toutes-les-appellations-de-bourgogne-a-votre-portee,2378,9172.html?&args=Y29tcF9pZD0xMzg2JmFjdGlvbj12aWV3RnVsbExpc3RlJmlkPSZ8" s <- read_html(url) s %>% html_nodes('#resultatListeAppellation .lien') %>% html_text() I expect to see: Aloxe-Corton (Appellation Village, VIGNOBLE DE LA CÔTE DE BEAUNE) Auxey-Duresses (Appellation Village,

Reading in Unicode Emoji correctly into R

蓝咒 submitted on 2020-01-15 05:01:48
Question: I have a set of comments from Facebook (pulled via a system like Sprinkr) that contain both text and emojis, and I'm trying to run a variety of analyses on them in R, but I'm running into difficulty ingesting the emoji characters correctly. For example: I have a .csv (encoded in UTF-8) that will have a message line containing something like this: "IS THIS CORRECT!?!?! Please say it isn't true!!! Our family only eats the original Reeses Peanut Butter Cups💚💚💚" I then ingest it into R in the

utf-8 data retrieve from database

此生再无相见时 submitted on 2020-01-15 03:52:13
Question: I have utf8_general_ci in the database and inserted data in the Hebrew language. Now, when I retrieve the data, it is displayed as a string like ??????. The database connection looks like this: function __construct($strHost='', $strDB='', $strUser='', $strPass='') { try{ if($strHost != ''){$this->strHost = $strHost;} if($strDB != ''){$this->strDB = $strDB;} if($strUser != ''){$this->strUser = $strUser;} if($strPass != ''){$this->strPass = $strPass;} $this->objDB = new PDO("mysql:host=".$this->strHost.";port=3306