Question
I have a bunch of music files on an NTFS partition mounted on Linux whose filenames contain Unicode characters. I'm having trouble writing a script to rename the files so that all of the filenames use only ASCII characters. I think the iconv command should work, but I'm having trouble escaping the characters for the 'mv' command.
EDIT: It doesn't matter if there isn't a direct transliteration for the Unicode characters. I guess I'll just replace those with a "?" character.
Answer 1:
I don't think iconv has any character replacement facilities. This in Python might help:
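#!/usr/bin/env python3
import sys

def unistrip(s):
    # Accept raw bytes as well as text; the bytes are assumed to be UTF-8.
    if isinstance(s, bytes):
        s = s.decode('utf-8')
    chars = []
    for ch in s:
        # Anything outside the 7-bit ASCII range becomes '?'.
        if ord(ch) > 0x7f:
            chars.append('?')
        else:
            chars.append(ch)
    return ''.join(chars)

if __name__ == '__main__':
    print(unistrip(sys.argv[1]))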
Then call as:
$ ./unistrip.py "yikes_𝄞_oh_look_a_file_火"
yikes_?_oh_look_a_file_?
Also:
$ mv "yikes_𝄞_oh_look_a_file_火" "$(./unistrip.py "yikes_𝄞_oh_look_a_file_火")"
You might test it a bit first.
For large move operations, generating a list of mv commands (i.e., writing code that writes a script) is advisable, as you can look over the move commands before executing them.
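The "write code to write a script" idea could look roughly like the sketch below. It is not part of the original answer: the helper names `ascii_name` and `mv_commands` are made up for illustration, it assumes Python 3 and that the filenames decode cleanly, and it only prints commands rather than running them.

```python
import os
import shlex
import sys


def ascii_name(name):
    """Replace every non-ASCII character in name with '?'."""
    return ''.join(ch if ord(ch) < 0x80 else '?' for ch in name)


def mv_commands(directory):
    """Yield one shell-quoted mv command per file whose name would change."""
    for name in sorted(os.listdir(directory)):
        new_name = ascii_name(name)
        if new_name != name:
            src = os.path.join(directory, name)
            dst = os.path.join(directory, new_name)
            # Assumption: your mv accepts -n (do not overwrite) and -- (end of
            # options), as GNU and BSD mv do. Drop them if yours does not.
            yield 'mv -n -- {} {}'.format(shlex.quote(src), shlex.quote(dst))


if __name__ == '__main__':
    target = sys.argv[1] if len(sys.argv) > 1 else '.'
    for command in mv_commands(target):
        print(command)
```

Redirect the output to a file, read it over, and only then run it with sh. Note that two different names can collapse to the same ASCII name, which is why the sketch avoids overwriting.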
Answer 2:
Sometimes a filename cannot be typed or passed to mv reliably in a shell; in that case you can refer to the file by its inode number instead.
To get the inode of a file:
$ ls -il
Output will be something like this:
13377799 -rw-r--r-- 1 draco draco 11809 Apr 25 01:39 some_filename.ext
9340462 -rw-r--r-- 1 draco draco 81648 Apr 23 02:27 some_strange_filename.ext
9340480 -rw-r--r-- 1 draco draco 4717 Apr 23 03:54 yikes_𝄞_oh_look_a_file_火
Then use find to locate the file by its inode and pass its name to the Python script from the first answer (by Thanatos), which prints the ASCII-only name:
$ find . -inum 9340480 -exec ./unistrip.py {} \;
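The inode lookup can also be done from Python, so the awkward name never has to be typed or quoted in a shell. This is only a rough sketch, assuming Python 3: `rename_by_inode` is a hypothetical helper, and unlike find it looks in a single directory without recursing.

```python
import os


def rename_by_inode(directory, inode, new_name):
    """Rename the entry in directory whose inode number equals inode.

    Returns the new path, or None if no entry has that inode.
    Caution: on POSIX, os.rename silently replaces an existing destination.
    """
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.inode() == inode:
                dst = os.path.join(directory, new_name)
                os.rename(entry.path, dst)
                return dst
    return None
```

With the listing above, the call would be something like `rename_by_inode('.', 9340480, 'yikes_?_oh_look_a_file_?')`. If several hard links share the inode, only the first one found is renamed.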
You could also use the above command with iconv in a shell.
Hope this helps someone out, and excuse any mistakes (this is my first answer).
Answer 3:
convmv is a good Perl script to convert file name encodings. But it can't handle characters that aren't in the destination encoding.
You can change any character not in ASCII to '?' using the rename utility distributed with Perl:
rename 's/[^ -~]/?/g' *
Unfortunately, this replaces each multi-byte character with multiple '?'s. Depending on the Unicode encoding in use and the characters involved, changing the regex may help, e.g.
rename 's/[^ -~]{2}/?/g' *
for 2-byte characters.
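The repeated '?'s appear because the regex matches bytes rather than characters. For comparison, here is a small Python 3 sketch (not from the original answer) that shows both results for one sample name. It assumes the name is UTF-8 encoded; `per_char` and `per_byte` are just illustrative variable names.

```python
name = 'a_火_𝄞'

# Per character: each code point outside ASCII becomes a single '?'.
per_char = name.encode('ascii', errors='replace').decode('ascii')

# Per byte: each non-ASCII byte of the UTF-8 encoding becomes a '?', which is
# roughly what a byte-oriented regex such as [^ -~] does to these characters.
per_byte = ''.join(chr(b) if b < 0x80 else '?' for b in name.encode('utf-8'))

print(per_char)  # a_?_?
print(per_byte)  # a_???_????
```

Here 火 takes three bytes and 𝄞 takes four in UTF-8, so a fixed {2} quantifier would not collapse either of them to a single '?'.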
Source: https://stackoverflow.com/questions/3011569/how-do-i-convert-filenames-from-unicode-to-ascii