How to unescape a Java string literal in Java?

Backend · Unresolved · 11 answers · 2004 views
庸人自扰 2020-11-22 01:35

I'm processing some Java source code using Java. I'm extracting the string literals and feeding them to a function taking a String. The problem is that I need to pass the

11 answers
  •  说谎 (original poster)
    2020-11-22 02:37

    If you are reading Unicode-escaped characters from a file, you will have a tough time, because the escape sequence is read literally: the backslash arrives as an ordinary character instead of introducing an escape:

    my_file.txt

    Blah blah...
    Column delimiter=;
    Word delimiter=\u0020 #This is just unicode for whitespace
    
    .. more stuff
    

    Here, when you read line 3 from the file, the string will contain:

    "Word delimiter=\u0020 #This is just unicode for whitespace"
    

    and the char[] in the string will show:

    {...., '=', '\\', 'u', '0', '0', '2', '0', ' ', '#', 'T', 'h', ...}
    

    Apache Commons' StringEscapeUtils.unescapeXml() will not unescape this for you (I tried it); that method targets XML entities, not Java-style \uXXXX escapes. You'll have to do it manually as described here.

    So the six-character substring "\u0020" should become the single char '\u0020' (a space).
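
A minimal sketch of that manual step, assuming only \uXXXX sequences need decoding. The class and method names (`UnicodeUnescaper`, `unescapeUnicode`) are made up for illustration. It does not treat a doubled backslash specially, so `\\u0020` in the input would still be decoded:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class UnicodeUnescaper {
    // A literal backslash, one or more 'u', then exactly four hex digits.
    private static final Pattern UNICODE_ESCAPE =
            Pattern.compile("\\\\u+([0-9a-fA-F]{4})");

    static String unescapeUnicode(String input) {
        Matcher m = UNICODE_ESCAPE.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Parse the four hex digits into a single UTF-16 code unit.
            char decoded = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement keeps a decoded '\' or '$' from being
            // interpreted as replacement syntax.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(decoded)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String raw = "\\u0020";              // six chars, as read from the file
        String decoded = unescapeUnicode(raw);
        System.out.println(raw.length() + " -> " + decoded.length()); // prints "6 -> 1"
    }
}
```

Surrogate pairs written as two consecutive escapes come out as two chars in sequence, which is what a Java String expects.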

    But if you pass this raw "\u0020" as the delimiter to split(), e.g. "... ..... ..".split(wordDelimiterReadFromFile), it works directly. split() treats its argument as a regex, and the regex engine understands \uXXXX escapes itself, so the still-escaped string read from the file is exactly what the pattern needs.
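
To illustrate the difference, here is a small sketch. The class and method names are hypothetical, and the delimiter is hard-coded to stand in for the value read from the file. The raw six-character string works as a regex but not as a literal search term:

```java
import java.util.Arrays;

// Hypothetical demo class, not from the original answer.
class SplitWithEscapedDelimiter {
    static String[] splitWords(String text, String delimiterFromFile) {
        // String.split compiles its argument as a regular expression, and
        // java.util.regex.Pattern interprets \uXXXX on its own.
        return text.split(delimiterFromFile);
    }

    public static void main(String[] args) {
        // Java source "\\u0020" is the six chars \ u 0 0 2 0, as in the file.
        String delimiter = "\\u0020";
        String text = "alpha beta gamma";

        System.out.println(Arrays.toString(splitWords(text, delimiter)));
        // Literal (non-regex) APIs see six ordinary characters, so no match.
        System.out.println(text.contains(delimiter));
    }
}
```

For any API that takes the delimiter literally (`contains`, `indexOf`, `replace`), the string has to be unescaped first.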
