How to remove non-ASCII characters in Python

梦如初夏 2020-12-22 01:00

This is my code:

#!C:/Python27/python
# -*- coding: utf-8 -*-
import requests
from bs4 import BeautifulSoup
import urllib2
import sys
import urlparse
import          


        
3 answers
  •  暗喜 (original poster)
    2020-12-22 01:52

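    Try normalizing the string to NFKD, which splits each accented letter into a base letter plus a combining mark, and then encode it to ASCII while ignoring errors, so that anything without an ASCII equivalent is dropped.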

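    # -*- coding: utf-8 -*-
    # Python 2 code, matching the interpreter used in the question.
    from unicodedata import normalize

    string = 'úäô§'

    # In Python 2 a plain str literal holds bytes; decode it to unicode first.
    if isinstance(string, str):
        string = string.decode('utf-8')

    # NFKD splits accented letters into base letter + combining mark;
    # encoding to ASCII with 'ignore' then drops every non-ASCII code point.
    print normalize('NFKD', string).encode('ASCII', 'ignore')
    # Output: uao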
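For readers on Python 3, the same normalize-then-encode idea might look like the sketch below. It assumes Python 3, where `str` is already Unicode and no `decode` step is needed; the helper name `strip_non_ascii` is hypothetical and not part of the original answer.

```python
import unicodedata


def strip_non_ascii(text):
    """Fold accented letters to ASCII and drop any other non-ASCII characters."""
    # NFKD decomposes e.g. 'ú' into 'u' followed by a combining acute accent.
    decomposed = unicodedata.normalize('NFKD', text)
    # Encoding with errors='ignore' discards every code point outside ASCII,
    # including the combining marks and symbols such as '§'.
    return decomposed.encode('ascii', 'ignore').decode('ascii')


if __name__ == '__main__':
    print(strip_non_ascii('úäô§'))  # prints: uao
```

Note that characters with no Unicode decomposition, such as 'ß' or 'ø', are removed entirely rather than transliterated, so this approach suits cleanup better than faithful romanization.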
    
