Converting UTF-8 to ISO-8859-1 in Java


挽巷 2020-12-08 11:45

I am reading an XML document (UTF-8) and ultimately displaying the content on a Web page that uses ISO-8859-1. As expected, a few characters are not displayed correctly.
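To make the problem concrete, here is a minimal sketch of how one might check whether a string survives conversion to ISO-8859-1. The class name `Latin1Check` and the method `fitsLatin1` are hypothetical names chosen for illustration; the charset calls themselves are standard JDK API.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

public final class Latin1Check {
    private Latin1Check() {
    }

    // Returns true when every character of the text can be encoded in ISO-8859-1.
    public static boolean fitsLatin1(CharSequence text) {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder();
        return encoder.canEncode(text);
    }

    public static void main(String[] args) {
        System.out.println(fitsLatin1("caf\u00e9")); // true: U+00E9 exists in ISO-8859-1
        System.out.println(fitsLatin1("\u20ac5"));   // false: U+20AC (euro sign) does not

        // String.getBytes substitutes '?' for characters the charset cannot represent.
        byte[] bytes = "\u20ac".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(bytes.length + " " + (char) bytes[0]); // 1 ?
    }
}
```

Characters for which the check fails are the ones that need another representation on the page, for example a numeric character reference.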

4 answers

没有蜡笔的小新 (original poster)

2020-12-08 12:27

With Java 8, McDowell's answer can be simplified like this (while preserving correct handling of surrogate pairs):

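import java.io.IOException;
import java.util.PrimitiveIterator;

public final class HtmlEncoder {
    private HtmlEncoder() {
    }

    // Copies Basic Latin (ASCII) characters as-is and writes every other
    // code point as a hexadecimal numeric character reference.
    public static <T extends Appendable> T escapeNonLatin(CharSequence sequence,
                                                          T out) throws IOException {
        // codePoints() combines surrogate pairs into a single code point.
        for (PrimitiveIterator.OfInt iterator = sequence.codePoints().iterator(); iterator.hasNext(); ) {
            int codePoint = iterator.nextInt();
            if (Character.UnicodeBlock.of(codePoint) == Character.UnicodeBlock.BASIC_LATIN) {
                out.append((char) codePoint);
            } else {
                out.append("&#x");
                out.append(Integer.toHexString(codePoint));
                out.append(";");
            }
        }
        return out;
    }
}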
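As a usage sketch, the method can be wrapped so that it returns a `String`. The class `HtmlEncoderDemo` and the wrapper `escape` are hypothetical names added for illustration, and the encoder method is repeated here only so that the example compiles on its own.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.PrimitiveIterator;

public final class HtmlEncoderDemo {
    private HtmlEncoderDemo() {
    }

    // Same logic as escapeNonLatin in the answer, repeated to keep this sketch self-contained.
    public static <T extends Appendable> T escapeNonLatin(CharSequence sequence,
                                                          T out) throws IOException {
        for (PrimitiveIterator.OfInt it = sequence.codePoints().iterator(); it.hasNext(); ) {
            int codePoint = it.nextInt();
            if (Character.UnicodeBlock.of(codePoint) == Character.UnicodeBlock.BASIC_LATIN) {
                out.append((char) codePoint);
            } else {
                out.append("&#x").append(Integer.toHexString(codePoint)).append(";");
            }
        }
        return out;
    }

    // Hypothetical convenience wrapper: StringBuilder never throws IOException,
    // so the checked exception is rethrown unchecked just to satisfy the compiler.
    public static String escape(CharSequence sequence) {
        try {
            return escapeNonLatin(sequence, new StringBuilder()).toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Input: "café €5" followed by U+1F600, which is stored as a surrogate pair.
        System.out.println(escape("caf\u00e9 \u20ac5 \uD83D\uDE00"));
        // prints: caf&#xe9; &#x20ac;5 &#x1f600;
    }
}
```

Note that `<`, `>` and `&` belong to the Basic Latin block, so this method passes them through unchanged. Text that may contain markup characters would still need ordinary HTML escaping.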
