I have a script that parses the filenames of TV episodes (show.name.s01e02.avi for example), grabs the episode name (from the www.thetvdb.com API) and automatically renames
\X seems to be available as a generic word-character in some languages, it allows you to match a single character disregarding of how many bytes it takes up. Might be useful.