non-ascii-characters

Removing non-ASCII characters from data files

喜夏-厌秋 submitted on 2019-11-26 06:28:57
Question: I've got a bunch of CSV files that I'm reading into R and including in a package/data folder in .rdata format. Unfortunately, the non-ASCII characters in the data fail the check. The tools package has two functions to check for non-ASCII characters (showNonASCII and showNonASCIIfile), but I can't seem to locate one to remove or clean them. Before I explore other UNIX tools, it would be great to do this all in R so I can maintain a complete workflow from raw data to final product. Are there

How to fetch a non-ascii url with Python urlopen?

纵饮孤独 submitted on 2019-11-26 04:43:08
Question: I need to fetch data from a URL with non-ASCII characters, but urllib2.urlopen refuses to open the resource and raises: UnicodeEncodeError: 'ascii' codec can't encode character u'\u0131' in position 26: ordinal not in range(128). I know the URL is not standards compliant, but I have no chance to change it. What is the way to access a resource pointed to by a URL containing non-ASCII characters using Python? Edit: In other words, can urlopen open a URL like http://example.org/Ñöñ-ÅŞÇİİ/, and if so, how?
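
One common way to approach this is to convert the URL to an ASCII-only form before handing it to urlopen. The sketch below is not taken from the original thread. It uses Python 3's urllib.parse rather than the urllib2 module named in the question, and the helper name to_ascii_url is hypothetical. It assumes the server expects the path and query as percent-encoded UTF-8, and that the URL carries no user:password part, which this sketch would drop.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def to_ascii_url(url):
    """Return an ASCII-only version of url (hypothetical helper).

    Assumption: the server expects UTF-8 percent-encoding for the
    path, query, and fragment. Any userinfo in the URL is dropped.
    """
    parts = urlsplit(url)

    # Host names are encoded with IDNA rather than percent-encoding.
    host = parts.hostname.encode("idna").decode("ascii") if parts.hostname else ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"

    # "%" stays in the safe set so already-encoded sequences are not
    # encoded a second time.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%+")
    fragment = quote(parts.fragment, safe="%")

    return urlunsplit((parts.scheme, host, path, query, fragment))
```

The result can then be passed to urllib.request.urlopen. With this helper, a character such as Ñ in the path becomes %C3%91, while a URL that is already plain ASCII comes back unchanged. If the server expects a different encoding (for example Latin-1 or a Turkish code page), quote accepts an encoding argument that could be used instead.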