Illegal character - CTRL-CHAR

野性不改 2020-12-10 12:03

I am getting the following exception from web services:

com.ctc.wstx.exc.WstxUnexpectedCharException: Illegal character ((CTRL-CHAR, code 15))

6 answers
  • 2020-12-10 12:43

    If you have control characters in your text data then you need to solve that problem at its source.

    The most likely causes are incorrect communication encodings (usually between database and app) or not sanitising user input.
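
    To locate the problem at its source, it can help to find where the control characters sit in a value before it is serialized. A minimal sketch, assuming plain Java strings; the class and method names are hypothetical, and tab, LF and CR are treated as acceptable whitespace:

    ```java
    import java.util.ArrayList;
    import java.util.List;

    public class ControlCharFinder {
        // Returns the indexes of ISO control characters, ignoring tab, LF and CR.
        static List<Integer> findControlChars(String text) {
            List<Integer> positions = new ArrayList<>();
            for (int i = 0; i < text.length(); i++) {
                char c = text.charAt(i);
                boolean allowedWhitespace = c == '\t' || c == '\n' || c == '\r';
                if (Character.isISOControl(c) && !allowedWhitespace) {
                    positions.add(i);
                }
            }
            return positions;
        }

        public static void main(String[] args) {
            String sample = "New\u000FSfn"; // hypothetical value containing code 15
            System.out.println(findControlChars(sample)); // prints [3]
        }
    }
    ```

    Logging these positions for values read from the database or from user input should show which layer introduces the character.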

  • 2020-12-10 12:54

    I hit the same problem when I was passing null values for some of the parameters. Passing empty or placeholder values instead made the error go away.

  • 2020-12-10 12:58

    I'm a bit confused by @ssedano's answer. It seems to me he's trying to match all control characters in the ASCII table from 0x00 to 0x1F, except for 0x0A (new line) and 0x0D (carriage return), plus 0x7F (DEL). In that case, wouldn't the regex be

    replaceAll("[\\x00-\\x09\\x0B\\x0C\\x0E-\\x1F\\x7F]", "")
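
    A small runnable sketch of that pattern, for anyone who wants to try it. The class and method names are hypothetical; only the regex comes from this answer:

    ```java
    public class ControlCharStripper {
        // C0 controls except LF (0x0A) and CR (0x0D), plus DEL (0x7F).
        private static final String CONTROL_CHARS =
                "[\\x00-\\x09\\x0B\\x0C\\x0E-\\x1F\\x7F]";

        static String strip(String text) {
            return text.replaceAll(CONTROL_CHARS, "");
        }

        public static void main(String[] args) {
            String cleaned = strip("New\u000FSfn\r\n");
            System.out.println(cleaned.equals("NewSfn\r\n")); // prints true
        }
    }
    ```

    Note that the range \\x00-\\x09 also removes the tab character (0x09), which XML 1.0 does allow, so narrow it to \\x00-\\x08 if tabs need to survive.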
    
  • 2020-12-10 12:59

    This error is being thrown by the Woodstox XML parser. The source code from the InputBootstrapper class looks like this:

    protected void reportUnexpectedChar(int i, String msg)
        throws WstxException
    {
        char c = (char) i;
        String excMsg;
    
        // WTF? JDK thinks null char is just fine as?!
        if (Character.isISOControl(c)) {
            excMsg = "Unexpected character (CTRL-CHAR, code "+i+")"+msg;
        } else {
            excMsg = "Unexpected character '"+c+"' (code "+i+")"+msg;
        }
        Location loc = getLocation();
        throw new WstxUnexpectedCharException(excMsg, loc, c);
    }
    

    Amusing comment aside, Woodstox is performing some additional validation on top of the JDK parser, and is rejecting ASCII character 15 (0x0F) as invalid.

    As to why that character is there, we can't tell you that, it's in your data. Similarly, we can't tell you if removing that character will break anything, since again, it's your data. You can only establish that for yourself.
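
    For reference, the XML 1.0 specification itself excludes code 15 from its set of legal characters. The sketch below encodes the spec's Char production as a plain check; it is not Woodstox's own code, and the class and method names are hypothetical:

    ```java
    public class Xml10Chars {
        // Char production from the XML 1.0 specification:
        // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
        static boolean isValidXml10(int codePoint) {
            return codePoint == 0x9 || codePoint == 0xA || codePoint == 0xD
                    || (codePoint >= 0x20 && codePoint <= 0xD7FF)
                    || (codePoint >= 0xE000 && codePoint <= 0xFFFD)
                    || (codePoint >= 0x10000 && codePoint <= 0x10FFFF);
        }

        public static void main(String[] args) {
            System.out.println(isValidXml10(15));   // prints false
            System.out.println(isValidXml10('\n')); // prints true
        }
    }
    ```

    A check like this can be run over outgoing text to see which code points an XML 1.0 parser is entitled to reject.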

  • 2020-12-10 13:04

    I would do what OrangeDog suggests. But if you want to solve it in your code, try:

    replaceAll("[\\x00-\\x09\\x11\\x12\\x14-\\x1F\\x7F]", "")

    \\x12 is the char.

  • 2020-12-10 13:04

    Thanks, everyone, for your input. I am sharing my solution in case it helps others. The requirement was not to strip the control character: it has to stay as it is in the DB, and when the web service sends it across the network, the client must still receive it. So I implemented it as follows:

    1. Encode the strings using URLEncoder in the web service code.
    2. Decode them on the client side using URLDecoder.

    Sample code and output are below.
    Sample code:

    // needs java.net.URLEncoder and java.net.URLDecoder
    String s = "New\u000FSfn"; // \u000F is the control character (code 15)
    System.out.println(s);
    System.out.println(URLEncoder.encode(s, "UTF-8"));
    System.out.println(URLDecoder.decode(URLEncoder.encode(s, "UTF-8"), "UTF-8"));
    

    Output:

    NewSfn
    New%0FSfn
    NewSfn
    

    So the client will receive the control characters.

    EDIT: Stack Exchange is not showing the control character in the output above. NewSfn is really New(CONTROL CHAR)Sfn.
