Velocity - Correct Regex to remove control characters?

Submitted by 江枫思渺然 on 2019-12-12 22:34:52

Question


I'm trying to remove undesirable characters from a string in Velocity (newlines are ok, but not things like EM and CAN ASCII control characters).

#set($cleanScreen = $cleanScreen.replaceAll("\p{Cc}", ""))

Throws:

org.apache.velocity.exception.ParseErrorException: Lexical error: org.apache.velocity.runtime.parser.TokenMgrError: Lexical error at line 13, column 82.  Encountered: "p" (112), after : "\"\\"
    at org.apache.velocity.Template.process(Template.java:137)
    at org.apache.velocity.runtime.resource.ResourceManagerImpl.loadResource(ResourceManagerImpl.java:415)
    at org.apache.velocity.runtime.resource.ResourceManagerImpl.getResource(ResourceManagerImpl.java:335)
    at org.apache.velocity.runtime.RuntimeInstance.getTemplate(RuntimeInstance.java:1102)
    at org.apache.velocity.runtime.RuntimeInstance.getTemplate(RuntimeInstance.java:1077)
    at org.apache.velocity.runtime.RuntimeSingleton.getTemplate(RuntimeSingleton.java:303)
    at org.apache.velocity.app.Velocity.getTemplate(Velocity.java:503)

and

#set($cleanScreen = $cleanScreen.replaceAll("[[:cntrl:]]", ""))

This one doesn't throw an exception; instead, it matches the characters c, n, t, r, l and removes them from the string.

and...

#set($cleanScreen = $cleanScreen.replaceAll("\\p{Cntrl}", ""))

Throws:

java.util.regex.PatternSyntaxException: Illegal repetition near index 2
\\p{Cntrl}
  ^
    at java.util.regex.Pattern.error(Unknown Source)
    at java.util.regex.Pattern.closure(Unknown Source)
    at java.util.regex.Pattern.sequence(Unknown Source)
    at java.util.regex.Pattern.expr(Unknown Source)
    at java.util.regex.Pattern.compile(Unknown Source)
    at java.util.regex.Pattern.<init>(Unknown Source)
    at java.util.regex.Pattern.compile(Unknown Source)
    at java.lang.String.replaceAll(Unknown Source)
    at sun.reflect.GeneratedMethodAccessor168.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.apache.velocity.util.introspection.UberspectImpl$VelMethodImpl.invoke(UberspectImpl.java:295)
    at org.apache.velocity.runtime.parser.node.ASTMethod.execute(ASTMethod.java:245)

I've tried several regexes (many seem to work in Java, but not in VTL). My key issue seems to be how escaping differs between Java and Velocity.

Can anyone help? I only have access to the VTL, not the Java class.


Answer 1:


I can't comment on the actual regexp.

On the velocity side however, I find that...

#set($cleanScreen = $cleanScreen.replaceAll("\p{Cc}", ""))
#set($cleanScreen = $cleanScreen.replaceAll("[[:cntrl:]]", ""))

...these two are correct as they are. I have a small VTL shell into which I copy-pasted your VTL code. Are you really getting these errors with the first two expressions? How about using single quotes, as in '\p{Cc}'?

#set($cleanScreen = $cleanScreen.replaceAll("\\p{Cntrl}", ""))

The '\\p' gets you into trouble.
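To see what `String.replaceAll` does with each of the three patterns once it reaches Java, here is a minimal Java sketch. It assumes the pattern text arrives in Java exactly as written in the template, which may not hold for every Velocity version or quoting style. The class name `ControlCharPatterns`, its method names, and the sample string are made up for illustration. Because the asker wants to keep newlines, the sketch also includes a variant that excludes `\r` and `\n`, since `\p{Cc}` on its own matches those too.

```java
import java.util.regex.PatternSyntaxException;

public class ControlCharPatterns {
    // Hypothetical sample: EM (U+0019) and CAN (U+0018) embedded in ordinary text.
    static final String SAMPLE = "con\u0019trol\u0018 text";

    // Regex text \p{Cc}: the Unicode "Control" category (this includes \r and \n).
    static String stripCc(String s) {
        return s.replaceAll("\\p{Cc}", "");
    }

    // Regex text [\p{Cc}&&[^\r\n]]: control characters except CR and LF,
    // using java.util.regex character-class intersection.
    static String stripCcKeepNewlines(String s) {
        return s.replaceAll("[\\p{Cc}&&[^\\r\\n]]", "");
    }

    // Regex text [[:cntrl:]]: java.util.regex has no POSIX bracket classes, so this
    // is an ordinary character class matching ':', 'c', 'n', 't', 'r' or 'l'.
    static String stripPosixStyle(String s) {
        return s.replaceAll("[[:cntrl:]]", "");
    }

    // Regex text \\p{Cntrl} (two backslashes): an escaped backslash, a literal 'p',
    // then '{Cntrl}' read as a malformed repetition, hence PatternSyntaxException.
    static boolean doubleBackslashFails(String s) {
        try {
            s.replaceAll("\\\\p{Cntrl}", "");
            return false;
        } catch (PatternSyntaxException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(stripCc(SAMPLE));
        System.out.println(stripCcKeepNewlines("line1\nline2\u0018"));
        System.out.println(stripPosixStyle(SAMPLE));
        System.out.println(doubleBackslashFails(SAMPLE));
    }
}
```

In a template, the corresponding call would presumably be `$cleanScreen.replaceAll('[\p{Cc}&&[^\r\n]]', '')` with single quotes; whether the double-quoted form also parses appears to depend on the Velocity version in use.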

On a side note, you can use http://velocity.apache.org/tools/devel/generic/EscapeTool.html for all your escaping needs.




Answer 2:


Those Velocity parser exceptions might come from the double-quote characters. I had a similar problem in VTL when calling String.replaceAll with a regular expression containing a capturing group, like so:

#set( $Jira_links = $Jira_tickets.replaceAll("(CT-\d+)", "http://jira.site.com/browse/$1") )

Throws:

org.apache.velocity.exception.ParseErrorException: Lexical error: org.apache.velocity.runtime.parser.TokenMgrError: Lexical error at line 2, column 58. Encountered: "d" (100), after : "\" (CT-\"

Changing it into single quotes worked:

#set( $Jira_links = $Jira_tickets.replaceAll('(CT-\d+)', 'http://jira.site.com/browse/$1') )
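For reference, the single-quoted VTL call above should be equivalent to the following plain Java, assuming Velocity hands both single-quoted strings to `String.replaceAll` unchanged. The class name `JiraLinks`, the method name `linkify`, and the sample ticket list are hypothetical. In Java source the backslash in `\d` has to be doubled, which is not needed inside the single-quoted VTL string.

```java
public class JiraLinks {
    // Replaces every ticket id such as CT-123 with its browse URL.
    // "$1" in the replacement refers to the first capturing group.
    static String linkify(String tickets) {
        return tickets.replaceAll("(CT-\\d+)", "http://jira.site.com/browse/$1");
    }

    public static void main(String[] args) {
        // prints "http://jira.site.com/browse/CT-12, http://jira.site.com/browse/CT-345"
        System.out.println(linkify("CT-12, CT-345"));
    }
}
```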


Source: https://stackoverflow.com/questions/8255689/velocity-correct-regex-to-remove-control-characters
