What is the fastest, easiest tool or method to convert text files between character sets?
Specifically, I need to convert from UTF-8 to ISO-8859-15 and vice versa.
The following command detects the character encoding of every matching text file automatically (file makes a heuristic guess) and converts each file to UTF-8 in place:
$ find . -type f -iname '*.txt' -exec sh -c 'iconv -f "$(file -bi "$1" | sed -e "s/.*[ ]charset=//")" -t utf-8 -o converted "$1" && mv converted "$1"' -- {} \;
To perform these steps, -exec starts a subshell (sh) that runs the one-liner given to the -c flag. The -- occupies the $0 position, so the filename that find substitutes for {} arrives as the positional argument "$1". In between, the UTF-8 output is written to a temporary file named converted, which then replaces the original.
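For readability, the same steps can be written as a small shell function. This is only a sketch, not a drop-in replacement: the name to_utf8 is made up for illustration, the temporary file comes from mktemp instead of a fixed name, and the check that skips files reported as binary is an added safeguard that the one-liner does not have. It also assumes a file command whose -i flag prints the MIME type and charset, as on most Linux systems.

```shell
# Sketch: convert one file to UTF-8 in place (to_utf8 is a hypothetical name).
to_utf8() {
    src=$1
    # "text/plain; charset=iso-8859-1" -> "iso-8859-1"
    charset=$(file -bi "$src" | sed -e 's/.*[ ]charset=//')
    # Added safeguard (not in the one-liner): skip what file cannot classify as text.
    case $charset in
        binary|unknown-8bit)
            echo "skipping $src ($charset)" >&2
            return 0
            ;;
    esac
    tmp=$(mktemp) || return 1
    if iconv -f "$charset" -t utf-8 "$src" > "$tmp"; then
        # Overwrite the contents rather than mv, so the original permissions are kept.
        cat "$tmp" > "$src"
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

It could then be called in a loop, for example: for f in ./*.txt; do to_utf8 "$f"; done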
Here, file -bi means:
-b, --brief
Do not prepend filenames to output lines (brief mode).
-i, --mime
Causes the file command to output mime type strings rather than the more traditional human readable ones. Thus it may say for example text/plain; charset=us-ascii rather than ASCII text.
The sed command strips everything up to and including charset=, leaving only us-ascii, which is the form iconv expects.
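To see what the sed expression does in isolation, it can be run on a sample string. The variable names mime and charset below are just for illustration, and the sample string is the one from the man page excerpt, not live output.

```shell
# Sample MIME string of the kind file -bi prints (hard-coded for illustration).
mime='text/plain; charset=us-ascii'
# Delete everything up to and including " charset=", keeping only the charset name.
charset=$(printf '%s\n' "$mime" | sed -e 's/.*[ ]charset=//')
echo "$charset"
```

The result is the bare charset name, which can be handed straight to iconv -f.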
The find command is very useful for such file management automation.
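For the other direction asked about in the question (UTF-8 to ISO-8859-15), the source encoding is already known, so the detection step can be dropped. A minimal sketch, assuming every matching file really is valid UTF-8; the function name utf8_to_latin9 and the use of mktemp are illustrative choices, not part of the command above.

```shell
# Sketch: convert all *.txt files under a directory from UTF-8 to ISO-8859-15.
# utf8_to_latin9 is a hypothetical name; the source encoding is assumed, not detected.
utf8_to_latin9() {
    dir=$1
    find "$dir" -type f -iname '*.txt' -exec sh -c '
        for f in "$@"; do
            tmp=$(mktemp) || exit 1
            if iconv -f UTF-8 -t ISO-8859-15 "$f" > "$tmp"; then
                # Replace the contents only if the conversion succeeded.
                cat "$tmp" > "$f"
            else
                echo "could not convert $f" >&2
            fi
            rm -f "$tmp"
        done
    ' sh {} +
}
```

Calling utf8_to_latin9 . converts everything below the current directory. Swapping the -f and -t arguments gives the ISO-8859-15 to UTF-8 direction. Files that iconv cannot convert (for instance, because they contain characters that ISO-8859-15 lacks) are left untouched and reported on stderr.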