non-ascii-characters

Accented characters in MySQL table

筅森魡賤 posted on 2019-11-26 22:22:13
Question: I have some texts in French (containing accented characters such as "é") stored in a MySQL table whose collation is utf8_unicode_ci (both the table and the columns), and I want to output them on an HTML5 page. The HTML page charset is UTF-8 (<meta charset="utf-8" />) and the PHP files themselves are encoded as "UTF-8 without BOM" (I use Notepad++ on Windows). I use PHP5 to query the database and generate the HTML. However, on the output page, the special characters (such as "é") appear

Removing non-ASCII characters from data files

送分小仙女□ posted on 2019-11-26 19:42:54
I've got a bunch of CSV files that I'm reading into R and including in a package/data folder in .rdata format. Unfortunately, the non-ASCII characters in the data fail the check. The tools package has two functions to check for non-ASCII characters (showNonASCII and showNonASCIIfile), but I can't seem to locate one to remove or clean them. Before I explore other UNIX tools, it would be great to do this all in R so I can maintain a complete workflow from raw data to final product. Are there any existing packages or functions to help me get rid of the non-ASCII characters? To simply remove the non

Convert Hi-Ansi chars to Ascii equivalent (é -> e)

北城以北 posted on 2019-11-26 19:29:46
Question: Is there a routine available in Delphi 2007 to convert the characters in the high range of the ANSI table (>127) to their equivalents in pure ASCII (<=127) according to a locale (codepage)? I know some characters cannot be translated well, but most can, especially in the 192-255 range: À → A, à → a, Ë → E, ë → e, Ç → C, ç → c, – (en dash) → - (hyphen, which can be trickier), — (em dash) → - (hyphen). Answer 1: WideCharToMultiByte does best-fit mapping for any characters that aren't supported by the specified
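As an illustration of the mapping the question asks for, here is a minimal sketch in Python rather than Delphi. It does not use WideCharToMultiByte. Instead it decomposes each character with Unicode normalization and drops the combining marks. The function name `to_ascii` and the `PUNCTUATION_MAP` table are hypothetical and exist only for this example.

```python
import unicodedata

# Hypothetical table for characters that have no accent to strip
# and therefore need an explicit replacement.
PUNCTUATION_MAP = {
    "\u2013": "-",  # en dash
    "\u2014": "-",  # em dash
}


def to_ascii(text):
    """Return an ASCII approximation of text (e.g. 'é' becomes 'e')."""
    # Replace the explicitly mapped punctuation first.
    text = text.translate(str.maketrans(PUNCTUATION_MAP))
    # NFKD splits 'é' into 'e' followed by a combining acute accent.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Anything still outside ASCII has no simple equivalent and is dropped.
    return stripped.encode("ascii", "ignore").decode("ascii")


print(to_ascii("ÀàËëÇç"))
print(to_ascii("1990\u20132000"))
```

Characters without a decomposition (for example "ß" or "ø") are silently dropped by this sketch, so a real conversion would likely need a larger explicit table.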

Finding the Values of the Arrow Keys in Python: Why are they triples?

和自甴很熟 posted on 2019-11-26 16:11:18
I am trying to find the values that my local system assigns to the arrow keys, specifically in Python. I am using the following script to do this:

    import sys, tty, termios

    class _Getch:
        def __call__(self):
            fd = sys.stdin.fileno()
            old_settings = termios.tcgetattr(fd)
            try:
                tty.setraw(sys.stdin.fileno())
                ch = sys.stdin.read(1)
            finally:
                termios.tcsetattr(fd, termios.TCSADRAIN, old_settings)
            return ch

    def get():
        inkey = _Getch()
        while(1):
            k = inkey()
            if k != '': break
        print 'you pressed', ord(k)

    def main():
        for i in range(0, 25):
            get()

    if __name__ == '__main__':
        main()

Then I ran the script, and hit UP DOWN
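The "triples" in the title come from the fact that, on terminals using ANSI/VT100-style escape sequences, each arrow key is sent as three characters: ESC (27), "[" (91), and a final letter. The following sketch, written for Python 3, groups such sequences back into key names. The names `ARROW_KEYS` and `decode_keys` are hypothetical, and the mapping assumes a terminal in normal cursor mode, since other modes and terminals can send different sequences.

```python
# Assumed mapping for ANSI/VT100-style terminals in normal cursor mode.
ARROW_KEYS = {
    "\x1b[A": "UP",
    "\x1b[B": "DOWN",
    "\x1b[C": "RIGHT",
    "\x1b[D": "LEFT",
}


def decode_keys(data):
    """Split raw terminal input into one entry per key press."""
    keys = []
    i = 0
    while i < len(data):
        chunk = data[i:i + 3]
        if chunk in ARROW_KEYS:
            # A three-character escape sequence counts as a single key.
            keys.append(ARROW_KEYS[chunk])
            i += 3
        else:
            keys.append(data[i])
            i += 1
    return keys


print([ord(ch) for ch in "\x1b[A"])
print(decode_keys("\x1b[A\x1b[Bq"))
```

Reading one character at a time, as the script in the question does, therefore reports three values per arrow key.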

How to fetch a non-ascii url with Python urlopen?

谁说胖子不能爱 posted on 2019-11-26 16:09:54
I need to fetch data from a URL with non-ASCII characters, but urllib2.urlopen refuses to open the resource and raises: UnicodeEncodeError: 'ascii' codec can't encode character u'\u0131' in position 26: ordinal not in range(128). I know the URL is not standards compliant, but I have no chance to change it. What is the way to access a resource pointed to by a URL containing non-ASCII characters using Python? Edit: In other words, can urlopen open a URL like http://example.org/Ñöñ-ÅŞÇİİ/, and if so, how? Strictly speaking, URIs can't contain non-ASCII characters; what you have there is an IRI. To convert an IRI
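One common way to turn an IRI into a plain-ASCII URI is to IDNA-encode the host name and percent-encode the remaining parts as UTF-8. The sketch below does that with Python 3's urllib.parse, whereas the question uses Python 2's urllib2. The helper name `iri_to_uri` is hypothetical, the sketch drops any user name or password in the URL, and it assumes the server expects UTF-8 in the path.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def iri_to_uri(iri):
    """Return an ASCII-only URI for the given IRI (sketch, UTF-8 assumed)."""
    parts = urlsplit(iri)
    # Host names are encoded with IDNA rather than percent-encoding.
    host = parts.hostname.encode("idna").decode("ascii") if parts.hostname else ""
    # Note: user name and password, if present, are dropped in this sketch.
    netloc = host if parts.port is None else "%s:%d" % (host, parts.port)
    # Keep '%' safe so already-encoded sequences are not encoded twice.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%")
    fragment = quote(parts.fragment, safe="%")
    return urlunsplit((parts.scheme, netloc, path, query, fragment))


print(iri_to_uri("http://example.org/Ñöñ-ÅŞÇİİ/"))
```

The result can then be passed to urllib.request.urlopen. Whether the server accepts UTF-8 percent-encoding depends on the server, so this is an assumption to verify against the actual site.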

How to ignore acute accent in a javascript regex match?

荒凉一梦 posted on 2019-11-26 14:48:31
Question: I need to match a word like 'César' with a regex like /^cesar/i. Is there an option like /i to configure the regex so that it ignores acute accents? Or is the only solution to use a regex like /^césar/i? Answer 1: The standard ECMAScript regex isn't ready for Unicode (see http://blog.stevenlevithan.com/archives/javascript-regex-and-unicode). So you have to use an external regex library. I used this one (with the Unicode plugin) in the past: http://xregexp.com/ In your case, you may

isalpha() giving an assertion

╄→尐↘猪︶ㄣ posted on 2019-11-26 12:47:48
Question: I have C code in which I am using the standard library function isalpha() from ctype.h. This is on Visual Studio 2010 on Windows. In the code below, if char c is '£', the isalpha call triggers an assertion, as shown in the snapshot below:

    char c = '£';
    if (isalpha(c)) {
        printf("character %c is alphabetic\n", c);
    } else {
        printf("character %c is NOT alphabetic\n", c);
    }

I can see that this might be because 8-bit ASCII does not have this character. So how do I handle such non-ASCII characters outside

matching unicode characters in python regular expressions

半城伤御伤魂 posted on 2019-11-26 09:39:32
Question: I have read through the other questions on Stack Overflow, but I am still no closer. Sorry if this is already answered, but I didn't get anything proposed there to work.

    >>> import re
    >>> m = re.match(r'^/by_tag/(?P<tag>\w+)/(?P<filename>(\w|[.,!#%{}()@])+)$', '/by_tag/xmas/xmas1.jpg')
    >>> print m.groupdict()
    {'tag': 'xmas', 'filename': 'xmas1.jpg'}

All is well. Then I try something with Norwegian characters in it (or something more Unicode-like):

    >>> m = re.match(r'^/by_tag/(?P<tag
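For comparison, here is a sketch of how the same pattern behaves in Python 3, where `\w` matches Unicode word characters for str patterns by default. In Python 2, which the session above uses, the usual approach is to pass the re.UNICODE flag and work with unicode strings. The path containing Norwegian letters is a hypothetical example, not the one from the truncated question.

```python
import re

# Same pattern as in the question. re.UNICODE is already the default for
# str patterns in Python 3 and is spelled out here only for clarity.
PATTERN = re.compile(
    r'^/by_tag/(?P<tag>\w+)/(?P<filename>(\w|[.,!#%{}()@])+)$',
    re.UNICODE,
)

# Hypothetical path with Norwegian letters.
match = PATTERN.match('/by_tag/påske/øyfjell.jpg')
print(match.groupdict())

# With re.ASCII, \w is limited to [a-zA-Z0-9_], so the same path fails.
ascii_only = re.compile(PATTERN.pattern, re.ASCII)
print(ascii_only.match('/by_tag/påske/øyfjell.jpg'))
```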

SyntaxError of Non-ASCII character [duplicate]

て烟熏妆下的殇ゞ posted on 2019-11-26 09:20:00
Question: This question already has answers here: Correct way to define Python source code encoding (6 answers); SyntaxError: Non-ASCII character '\xa3' in file when function returns '£' (5 answers). Closed 3 years ago. I am trying to parse XML which contains some non-ASCII characters. The code looks like this:

    from lxml import etree
    from lxml import objectify
    content = u'<?xml version="1.0" encoding="utf-8"?><div>Order date : 05/08/2013 12:24:28</div>'
    mail.replace('\xa0',' ')
    xml =

Check If the string contains accented characters in SQL?

泪湿孤枕 posted on 2019-11-26 08:36:54
Question: I want to perform one task if the input string contains any accented characters, and another task otherwise, in SQL. Is there any way to check this condition in SQL? E.g.:

    @myString1 = 'àéêöhello!'
    IF (@myString1 contains any accented characters) Task1 ELSE Task2

Answer 1: SQL Fiddle: http://sqlfiddle.com/#!6/9eecb7d/1607

    declare @a nvarchar(32) = 'àéêöhello!'
    declare @b nvarchar(32) = 'aeeohello!'
    select case when (cast(@a as varchar(32)) collate SQL_Latin1_General_Cp1251_CS_AS) = @a then 0 else 1 end
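The SQL answer detects accents by casting to a code page that cannot represent them and comparing the result with the original. Outside the database, the same yes/no check can be sketched in Python with a different technique: decompose the string and look for combining marks. The function name `has_accented_chars` is hypothetical, and the two sample strings are the ones from the answer.

```python
import unicodedata


def has_accented_chars(text):
    """Return True if any character decomposes into a base plus a combining mark."""
    decomposed = unicodedata.normalize("NFD", text)
    return any(unicodedata.combining(ch) for ch in decomposed)


for sample in ("àéêöhello!", "aeeohello!"):
    # Plays the role of "Task1" and "Task2" from the question.
    print(sample, "accented" if has_accented_chars(sample) else "plain")
```

This treats only characters with a canonical decomposition as accented, so letters such as "ø" or "ß" are reported as plain.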