I have a program I use to create a batch file. My problem is that the program's output is UTF-8, so as soon as any diacritical marks like é, à, ö, Ä are in my batch file it fai
You can get many GNU command-line utilities, including iconv, from the GnuWin32 project:
C:\> iconv.exe -f UTF-8 -t WINDOWS-1252 input.bat > output.bat
You have stated you don't want to rely on the script host, but there is no native batch command that can do what you want. You are going to have to use something beyond pure batch. The script host is native to Windows, so I should think it would not be a problem.
The following UTF8toANSI.vbs script converts UTF-8 (with or without BOM) into ISO-8859-1 (basically the same as code page 1252, apart from the 0x80-0x9F range). It is adapted from VB6/VbScsript change file / write file with encoding to ansii.
Option Explicit
Private Const adReadAll = -1
Private Const adSaveCreateOverWrite = 2
Private Const adTypeBinary = 1
Private Const adTypeText = 2
Private Const adWriteChar = 0
Private Sub UTF8toANSI(ByVal UTF8FName, ByVal ANSIFName)
    Dim strText
    With CreateObject("ADODB.Stream")
        ' Load the raw bytes, then reinterpret them as UTF-8 text.
        .Open
        .Type = adTypeBinary
        .LoadFromFile UTF8FName
        .Type = adTypeText
        .Charset = "utf-8"
        strText = .ReadText(adReadAll)
        ' Empty the stream, then write the text back out as ISO-8859-1.
        .Position = 0
        .SetEOS
        .Charset = "iso-8859-1"
        .WriteText strText, adWriteChar
        .SaveToFile ANSIFName, adSaveCreateOverWrite
        .Close
    End With
End Sub
UTF8toANSI WScript.Arguments(0), WScript.Arguments(1)
The VBS script would need to be in your current directory or your path.
A batch script to convert and run your UTF-8 encoded script could look something like this:
@echo off
UTF8toANSI "utf8.bat" "ansi.bat"
ansi.bat
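To make concrete what the script above does to the bytes, here is a minimal Python sketch of the same conversion. Python is not part of the original answer, and the function name `utf8_to_ansi` is hypothetical. It assumes the target is code page 1252 and that every character in the file exists in that code page.

```python
import codecs


def utf8_to_ansi(data: bytes, target: str = "cp1252") -> bytes:
    """Convert UTF-8 bytes (with or without a BOM) to a single-byte code page.

    Hypothetical helper for illustration; not taken from the original answer.
    """
    # "utf-8-sig" decodes UTF-8 and drops a leading BOM if one is present.
    text = data.decode("utf-8-sig")
    # Raises UnicodeEncodeError if a character has no slot in the target code page.
    return text.encode(target)


if __name__ == "__main__":
    sample = codecs.BOM_UTF8 + "echo é à ö Ä\r\n".encode("utf-8")
    print(utf8_to_ansi(sample))  # b'echo \xe9 \xe0 \xf6 \xc4\r\n'
```

Each accented letter shrinks from two bytes to one, and the BOM disappears. Passing `target="iso-8859-1"` instead makes the same call fail for `€`, which code page 1252 defines (at 0x80) but ISO-8859-1 does not.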
Original answer: the approach below works for UTF-16 with BOM, but not for UTF-8.
The output of internal commands is automatically converted to ANSI if output is piped or redirected to a file.
chcp 1252
type "utf_file.bat" >"ansi_file.bat"
The process can go in reverse if CMD is started with the /U option, but unfortunately the Unicode header bytes (the BOM) will be missing. Of course, that is a non-issue for your situation.
In Unix I would use the "iconv" tool for converting between encodings:
iconv --from-code UTF-8 --to-code iso-8859-1 -c inputfile > outputfile
It seems a build for Windows is available at http://gnuwin32.sourceforge.net/packages/libiconv.htm
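Note that the `-c` flag in the command above tells iconv to silently discard characters that cannot be converted, rather than stopping with an error. A rough Python analogue is the `errors="ignore"` handler. This is only an illustration of the effect, not how iconv is implemented, and `to_latin1` is a hypothetical name.

```python
def to_latin1(text: str, drop_unconvertible: bool = False) -> bytes:
    """Encode text as ISO-8859-1, optionally dropping unconvertible characters.

    Hypothetical helper; drop_unconvertible=True loosely mimics `iconv -c`.
    """
    errors = "ignore" if drop_unconvertible else "strict"
    return text.encode("iso-8859-1", errors=errors)


line = "echo Preis: 5 € für Käse"
# The euro sign has no ISO-8859-1 equivalent, so it vanishes from the output.
print(to_latin1(line, drop_unconvertible=True))  # b'echo Preis: 5  f\xfcr K\xe4se'
```

For a batch file this means a dropped character can quietly change a file name or an echoed string, so it may be worth running the conversion once without `-c` to see whether anything is actually lost.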