How to write UTF-8 data to an XML file using RandomAccessFile?

Posted by 笑着哭i on 2021-01-28 05:30:36

Question


When I try to write UTF-8 data to a file, the non-ASCII characters end up as garbage in the file. The code is as follows:

public static boolean saveToFile(StringBuffer buffer,
                                 String fileName,
                                 ArrayList exceptionList,
                                 String className)
{
    log.debug("In saveToFile for file [" + fileName + "]");

    RandomAccessFile raf = null;
    File file = new File(fileName);
    File backupFile = new File(fileName + "_bck");

    try
    {
        if (file.exists())
        {
            if (backupFile.exists())
            {
                backupFile.delete();
            }
            file.renameTo(backupFile);
        }
        raf = new RandomAccessFile(file, "rw");
        raf.writeBytes(buffer.toString());
        raf.close();

The output of buffer.toString() is:

<?xml version="1.0" encoding="UTF-8"?>
<ivr>
<version>1.1</version>
<templateName>αβγδεζη

The data in the file, however, is:

<?xml version="1.0" encoding="UTF-8"?>
<ivr>
<version>1.1</version>
<templateName>▒▒▒▒▒▒▒</templateName>

How can I make sure that the data in the file itself is UTF-8?


Answer 1:


I'm not surprised you get garbage:

 raf.writeBytes(buffer.toString())

The documentation for RandomAccessFile.writeBytes(String) says (emphasis added):

Writes the string to the file as a sequence of bytes. Each character in the string is written out, in sequence, by discarding its high eight bits.

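That operation produces a correctly encoded file only when every character is ASCII (U+0000 to U+007F), because those are the only characters whose UTF-8 encoding is a single byte equal to their low eight bits. For anything else, such as the Greek letters in your template name, it won't. That writeBytes() method is a foolish design by the Java developers. You need to correctly encode your text as bytes in UTF-8, and then write those bytes.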
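To see the truncation concretely, here is a minimal sketch that writes a single Greek alpha (U+03B1) with writeBytes() and reads the raw bytes back. The class name WriteBytesDemo and the helper bytesWrittenByWriteBytes are made up for this illustration; only the RandomAccessFile, File and Files calls are standard library API.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

class WriteBytesDemo {
    // Hypothetical helper: writes the text with writeBytes() into a temp file
    // and returns the raw bytes that actually ended up on disk.
    static byte[] bytesWrittenByWriteBytes(String text) throws IOException {
        File tmp = File.createTempFile("writeBytes-demo", ".bin");
        try {
            try (RandomAccessFile raf = new RandomAccessFile(tmp, "rw")) {
                raf.writeBytes(text); // keeps only the low eight bits of each char
            }
            return Files.readAllBytes(tmp.toPath());
        } finally {
            tmp.delete();
        }
    }

    public static void main(String[] args) throws IOException {
        String alpha = "\u03B1"; // Greek small letter alpha

        // writeBytes(): one byte, the low half of 0x03B1
        for (byte b : bytesWrittenByWriteBytes(alpha)) {
            System.out.printf("writeBytes: %02X%n", b & 0xFF); // B1
        }

        // Proper UTF-8 encoding: two bytes
        for (byte b : alpha.getBytes(StandardCharsets.UTF_8)) {
            System.out.printf("UTF-8:      %02X%n", b & 0xFF); // CE, then B1
        }
    }
}
```

The single byte 0xB1 is not a valid UTF-8 sequence on its own, which is why a viewer expecting UTF-8 shows a replacement glyph for each letter.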

Do you really need to operate on the file as a random access file? If not, just write it with a Writer wrapping an OutputStream, for example an OutputStreamWriter constructed with StandardCharsets.UTF_8.
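If random access is not needed, the Writer approach could look roughly like this. It is only a sketch: the class name Utf8WriterDemo, the method saveAsUtf8 and its parameters are invented for the example, and the backup-file handling from the question is left out.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class Utf8WriterDemo {
    // Hypothetical helper: writes the content to the target file as UTF-8.
    // FileOutputStream truncates an existing file, and the OutputStreamWriter
    // encodes every character with the charset given explicitly.
    static void saveAsUtf8(CharSequence content, File target) throws IOException {
        try (Writer writer = new OutputStreamWriter(
                new FileOutputStream(target), StandardCharsets.UTF_8)) {
            writer.append(content);
        }
    }

    public static void main(String[] args) throws IOException {
        // \u03B1\u03B2\u03B3 are the Greek letters alpha, beta, gamma
        StringBuffer buffer = new StringBuffer(
                "<templateName>\u03B1\u03B2\u03B3</templateName>");
        saveAsUtf8(buffer, new File("template.xml")); // file name is an example
    }
}
```

Passing the charset explicitly matters here: without it, the writer would fall back to the platform default encoding, which may or may not be UTF-8.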

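You could use Charset.encode(String) to produce a ByteBuffer holding the encoded bytes, then write those bytes to the file. Write only the bytes between the buffer's position and its limit, because the backing array returned by array() may be larger than the encoded content:

 ByteBuffer encoded = StandardCharsets.UTF_8.encode(buffer.toString());
 raf.write(encoded.array(), encoded.arrayOffset() + encoded.position(), encoded.remaining());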
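If you do want to stay with RandomAccessFile, a simpler variant is to let String.getBytes(Charset) do the encoding, since it returns an array of exactly the right length. The sketch below assumes that approach; the class name RafUtf8Demo and the method saveWithRandomAccessFile are hypothetical, and the setLength(0) call is an addition for the case where the target file already exists.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

class RafUtf8Demo {
    // Hypothetical helper: encodes the content as UTF-8 first,
    // then writes the resulting bytes through RandomAccessFile.
    static void saveWithRandomAccessFile(CharSequence content, File target)
            throws IOException {
        byte[] utf8 = content.toString().getBytes(StandardCharsets.UTF_8);
        try (RandomAccessFile raf = new RandomAccessFile(target, "rw")) {
            // Mode "rw" does not truncate an existing file, so drop old content
            // explicitly; otherwise a shorter write leaves stale bytes at the end.
            raf.setLength(0);
            raf.write(utf8);
        }
    }

    public static void main(String[] args) throws IOException {
        // \u03B1\u03B2\u03B3 are the Greek letters alpha, beta, gamma
        StringBuffer buffer = new StringBuffer(
                "<templateName>\u03B1\u03B2\u03B3</templateName>");
        saveWithRandomAccessFile(buffer, new File("template.xml")); // example name
    }
}
```

In the question's code the old file is renamed away before the write, so the truncation step is mostly a safeguard for the case where renameTo() fails and returns false.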



Answer 2:


The Javadoc for RandomAccessFile says this about writeBytes():

Writes the string to the file as a sequence of bytes. Each character in the string is written out, in sequence, by discarding its high eight bits. The write starts at the current position of the file pointer.

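Assuming that discarding parts of your String isn't what you want, you could use writeUTF():

Writes a string to the file using modified UTF-8 encoding in a machine-independent manner.

Be aware that writeUTF() first writes a two-byte length prefix, and that modified UTF-8 differs from standard UTF-8 for U+0000 and for characters outside the Basic Multilingual Plane. The result is meant to be read back with readUTF(), not parsed as a plain UTF-8 XML file.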
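Before relying on writeUTF() for an XML file, it is worth checking what it actually puts on disk. The sketch below writes a single Greek alpha and returns the raw bytes; the class name WriteUtfDemo and the helper bytesWrittenByWriteUTF are invented for the illustration.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;

class WriteUtfDemo {
    // Hypothetical helper: writes the text with writeUTF() into a temp file
    // and returns the raw bytes that ended up on disk.
    static byte[] bytesWrittenByWriteUTF(String text) throws IOException {
        File tmp = File.createTempFile("writeUTF-demo", ".bin");
        try {
            try (RandomAccessFile raf = new RandomAccessFile(tmp, "rw")) {
                raf.writeUTF(text); // two-byte length, then modified UTF-8
            }
            return Files.readAllBytes(tmp.toPath());
        } finally {
            tmp.delete();
        }
    }

    public static void main(String[] args) throws IOException {
        StringBuilder hex = new StringBuilder();
        for (byte b : bytesWrittenByWriteUTF("\u03B1")) { // Greek small alpha
            hex.append(String.format("%02X ", b & 0xFF));
        }
        System.out.println(hex.toString().trim()); // 00 02 CE B1
    }
}
```

The leading 00 02 is the length of the encoded data in bytes. An XML parser would see those two bytes before the XML declaration, so for a file that other tools must read as UTF-8 text, encoding the string yourself and calling write(byte[]) is the safer route.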



Source: https://stackoverflow.com/questions/24932750/how-to-write-utf8-data-to-xml-file-using-randomaccessfile