I'm having a bit of a struggle with Unicode file names on OS X and Python. I am trying to use filenames as input for a regular expression later in the code, but the encoding used
Mac OS X uses a special kind of decomposed UTF-8 (a variant of Unicode normalization form NFD) to store filenames. If you need to, e.g., read in filenames and write them to a "normal" UTF-8 file, you must normalize them:
import unicodedata
filename = unicodedata.normalize('NFC', unicode(filename, 'utf-8')).encode('utf-8')
from here: https://web.archive.org/web/20120423075412/http://boodebr.org/main/python/all-about-python-and-unicode
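The one-liner above is Python 2 (it relies on the `unicode` built-in). For what it's worth, here is a rough Python 3 sketch of the same idea applied to the regex case. The sample name `cafe\u0301.txt`, the pattern, and the helper names `nfc` and `normalized_listing` are made up for illustration and are not from the linked article; it assumes filenames arrive as `str`, which is what `os.listdir` returns when given a `str` path.

```python
import os
import re
import unicodedata


def nfc(name):
    """Return a str filename in composed (NFC) form."""
    return unicodedata.normalize('NFC', name)


def normalized_listing(path='.'):
    """List a directory with every name converted to NFC (hypothetical helper)."""
    return [nfc(name) for name in os.listdir(path)]


# A decomposed name: "e" followed by U+0301 COMBINING ACUTE ACCENT.
decomposed = 'cafe\u0301.txt'
composed = nfc(decomposed)

# The pattern is written with the precomposed character U+00E9.
pattern = re.compile('caf\u00e9')

# The decomposed name has one more code point than the composed one,
# so the pattern only matches after normalization.
print(len(decomposed), len(composed))
print(pattern.search(decomposed) is None)
print(pattern.search(composed) is not None)
```

The point is to normalize both sides to the same form before matching. NFC is used here because the pattern is typed in composed form; normalizing the pattern and the filenames to NFD instead should work just as well.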