Remove accented characters from string - Python


小蘑菇 2021-01-28 10:09

I get some data from a web page and read it like this in Python:

origional_doc = urllib2.urlopen(url).read()

Sometimes this url has characters su

2 answers

既然无缘 (OP)

2021-01-28 10:41
Using re.sub, you can remove every character that falls in a given hexadecimal range. Here the range is \x80-\xFF, i.e. everything above 7-bit ASCII:
```
>>> import re
>>> re.sub('[\x80-\xFF]','','é and ä and ect')
' and  and ect'
```
You can also do the inverse and remove anything that's NOT in the basic 128 ASCII characters:
```
>>> re.sub('[^\x00-\x7F]','','é and ä and ect')
' and  and ect'
```
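For anyone on Python 3, here is a rough sketch of the same idea wrapped in a function. The names `strip_non_ascii` and `fold_accents` are my own, not from any library. The second function is a different technique from the regex above (Unicode decomposition with the standard `unicodedata` module), included only as a possible option if you want to keep the base letter instead of dropping the character entirely. It assumes the page has already been decoded to `str`.

```python
import re
import unicodedata

# Matches any character outside the basic 128-character ASCII range.
NON_ASCII = re.compile(r'[^\x00-\x7F]')

def strip_non_ascii(text):
    """Drop every non-ASCII character (same effect as the regex above)."""
    return NON_ASCII.sub('', text)

def fold_accents(text):
    """Alternative, not from the answer: keep the base letter, drop the accent."""
    # NFKD splits 'é' into 'e' plus a combining accent mark.
    decomposed = unicodedata.normalize('NFKD', text)
    # Combining marks have a non-zero combining class; filter them out.
    return ''.join(ch for ch in decomposed if not unicodedata.combining(ch))

print(repr(strip_non_ascii('é and ä and ect')))  # ' and  and ect'
print(repr(fold_accents('é and ä and ect')))     # 'e and a and ect'
```

Note that `fold_accents` only handles characters that decompose into a base letter plus combining marks, so letters such as 'ø' or 'ß' pass through unchanged.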