I have a string of HTML stored in a database. Unfortunately, it contains characters such as ®. I want to replace these characters with their HTML equivalents, either in the DB it
This code snippet may help you.
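#!/usr/bin/env python
# -*- coding: UTF-8 -*-

def removeNonAscii(string):
    # Expects a byte string (str in Python 2, bytes in Python 3).
    # Builds the set of bytes 0x80-0xFF and deletes them from the input.
    nonascii = bytearray(range(0x80, 0x100))
    return string.translate(None, nonascii)

# string_to_remove_nonascii stands for your own byte string.
nonascii_removed_string = removeNonAscii(string_to_remove_nonascii)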
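The encoding declaration on the second line matters if the source file itself contains non-ASCII characters. Note that this function deletes non-ASCII bytes outright; it does not replace them with HTML entities.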
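If the goal is to keep the characters as HTML entities rather than drop them, here is a minimal Python 3 sketch, assuming the HTML has already been decoded to a text string. The function names `to_numeric_entities` and `to_named_entities` are made up for illustration; the `xmlcharrefreplace` error handler and `html.entities.codepoint2name` are part of the standard library.

```python
from html.entities import codepoint2name

def to_numeric_entities(text):
    # Hypothetical helper: encode to ASCII, letting the codec turn each
    # non-ASCII character into a numeric reference such as &#174;.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")

def to_named_entities(text):
    # Hypothetical helper: prefer a named entity such as &reg; when one
    # exists, and fall back to a numeric reference otherwise.
    parts = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            parts.append(ch)  # plain ASCII, including markup, is untouched
        elif cp in codepoint2name:
            parts.append("&%s;" % codepoint2name[cp])
        else:
            parts.append("&#%d;" % cp)
    return "".join(parts)

print(to_numeric_entities("Acme\u00ae <b>brand</b>"))
print(to_named_entities("Acme\u00ae <b>brand</b>"))
```

Both helpers leave existing tags and ASCII text alone, so they can be applied to a whole HTML fragment. Whether to run this once over the database or each time the string is read depends on how the data is used elsewhere.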