Print string literal unicode as the actual character

自闭症患者 2020-12-30 03:44

In my Java application I have been passed in a string that looks like this:

"\u00a5123"

When I print that string to the console, I get the same string back instead of the character the escape stands for.

4 answers
  • 2020-12-30 04:15

    I wrote a little program:

    public static void main(String[] args) {
        System.out.println("\u00a5123");
    }
    

    Its output:

    ¥123

    i.e. it prints exactly what you stated in your post, so I suspect something else is going on. What version of Java are you using?

    edit:

    In response to your clarification, there are a couple of different techniques. The most straightforward is to look for a "\u" followed by four hexadecimal digits, extract that piece, and replace it with the character that has that hex code (using the Character class). This of course assumes the string will not have a \u in front of it.

    I am not aware of any built-in facility that parses a String as though it were an escaped Java string literal.
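
The search-and-replace technique described above can be sketched with java.util.regex. This is only a minimal sketch: the class name UnicodeUnescaper and the method unescape are hypothetical, and it assumes every escape is a backslash, the letter u, and exactly four hex digits, with no handling for an escaped backslash in front of it.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UnicodeUnescaper {
    // A literal backslash, the letter u, then exactly four hex digits.
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    // Hypothetical helper: replaces each escape with the character it names.
    static String unescape(String input) {
        Matcher m = ESCAPE.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char decoded = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement stops '$' and '\' in the decoded text
            // from being interpreted as replacement syntax.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(decoded)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // The doubled backslash keeps the compiler from decoding the escape,
        // so at run time the string holds the escape text itself.
        System.out.println(unescape("\\u00a5123"));
    }
}
```

Text that contains no escapes passes through unchanged, so the helper can be applied to mixed input such as the string in the question.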

  • 2020-12-30 04:26

    As has been mentioned before, these strings will have to be parsed to get the desired result.

    1. Tokenize the string using \u as the separator. For example: \u63A5\u53D7 => { "63A5", "53D7" }

    2. Process these strings as follows:

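      String hex = "63A5";
      // Parse the four hex digits as a base-16 number.
      int intValue = Integer.parseInt(hex, 16);
      // Cast to char to get the character with that code.
      System.out.println((char)intValue);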
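
Putting the two steps together might look like the sketch below. The class name EscapeTokenizer and the method decode are hypothetical, and the sketch assumes the input consists only of escapes with exactly four hex digits each. A string with trailing plain text, such as the one in the question, would need the extra characters handled separately.

```java
import java.util.regex.Pattern;

final class EscapeTokenizer {
    // Hypothetical helper: splits on the backslash-u separator and decodes each token.
    static String decode(String escaped) {
        StringBuilder out = new StringBuilder();
        // Pattern.quote makes split treat the separator as literal text, not regex syntax.
        for (String hex : escaped.split(Pattern.quote("\\u"))) {
            if (hex.isEmpty()) {
                continue; // the piece before the first separator is empty
            }
            out.append((char) Integer.parseInt(hex, 16));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Decodes the two escapes from the example in step 1.
        System.out.println(decode("\\u63A5\\u53D7"));
    }
}
```

Note that split returns an empty first element when the string starts with the separator, which is why the loop skips empty tokens.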
      
  • 2020-12-30 04:26

    You could replace the above with this:

    System.out.println((char)0x63A5);
    

    Here is code to print all of the box-drawing Unicode characters.

    public static void printBox()
    {
        // U+2500 through U+257F is the Box Drawing block.
        for (int i = 0x2500; i <= 0x257F; i++)
        {
            System.out.printf("0x%x : %c\n", i, (char) i);
        }
    }
    
  • 2020-12-30 04:31

    You're probably going to have to write a parser for these, unless you can find one in a third-party library. There is nothing in the JDK to parse these for you; I know because I fairly recently had an idea to use this kind of escape as a way to smuggle Unicode through a Latin-1-only database. (I ended up doing something else, by the way.)

    I will tell you that java.util.Properties escapes and unescapes Unicode characters in this manner when reading and writing files (since the stream-based load and store methods use ISO 8859-1, anything outside that encoding has to be escaped). The methods it uses for this are private, so you can't call them, but you could use the JDK source code to inspire your solution.
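
One way to borrow that behaviour without touching the private methods may be to go through the public Properties.load(Reader) method, feeding it the text as the value of a throwaway key. This is a sketch under assumptions rather than a recommended approach: the class name PropertiesUnescaper and the key name "value" are hypothetical, and the input is assumed to be a single line.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class PropertiesUnescaper {
    // Hypothetical helper: lets Properties.load decode the escapes in the text.
    static String unescape(String escaped) {
        Properties props = new Properties();
        try {
            // Wrap the text as the value of a throwaway key so load() parses it.
            props.load(new StringReader("value=" + escaped));
        } catch (IOException e) {
            // Reading from a StringReader should not fail; rethrow unchecked just in case.
            throw new UncheckedIOException(e);
        }
        return props.getProperty("value");
    }

    public static void main(String[] args) {
        // The doubled backslash keeps the escape text intact until run time.
        System.out.println(unescape("\\u00a5123"));
    }
}
```

The properties format brings its own rules along: other backslash sequences are interpreted too, leading whitespace in the value is dropped, a line break ends the value, and a malformed escape makes load throw IllegalArgumentException. That makes this a fit only for input you know to be simple.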
