In Saxon 9 HE (Java XML parser), word boundaries (\\b) in regular expressions are not recognized

Posted by 蓝咒 on 2019-12-01 01:19:18

The regular expression dialect used in XSD and XPath does not recognize \b (either as a word boundary or as a backspace). I think the reason for excluding it was probably a misplaced anxiety that word boundaries are language/culture dependent. That's illogical, since the dialect does support \w (a word character), and a word boundary can be defined simply as a position between a character that matches \w and one that doesn't. Alternatively, the XSD team may have been worried about the ambiguities that arise with zero-length matches, which are a notorious source of bugs and make it very hard to specify rigorously what a regular expression does.
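To make the "\w next to a non-\w" definition concrete, here is a minimal sketch using java.util.regex rather than Saxon's XPath regex engine. The class name and the sample patterns are illustrative only. The lookaround form of the boundary is Java syntax and is not available in XPath regular expressions. The `(^|\W)cat($|\W)` pattern, on the other hand, uses only anchors, groups and \W, so something along those lines should be usable as a workaround in XPath's matches(), though unlike \b it also consumes the neighbouring character.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordBoundarySketch {
    // A position with a \w character on exactly one side (Java-only lookarounds).
    static final Pattern EMULATED =
            Pattern.compile("(?<=\\w)(?!\\w)|(?<!\\w)(?=\\w)");

    // Java's built-in word boundary, for comparison.
    static final Pattern BUILT_IN = Pattern.compile("\\b");

    // Whole-word match written without \b or lookarounds.
    static final Pattern WHOLE_WORD_CAT = Pattern.compile("(^|\\W)cat($|\\W)");

    // Collect the start offset of every (zero-length) match in the input.
    static List<Integer> positions(Pattern p, String input) {
        List<Integer> out = new ArrayList<>();
        Matcher m = p.matcher(input);
        while (m.find()) {
            out.add(m.start());
        }
        return out;
    }

    // True if "cat" occurs as a whole word somewhere in the input.
    static boolean containsCat(String input) {
        return WHOLE_WORD_CAT.matcher(input).find();
    }

    public static void main(String[] args) {
        String s = "the cat sat";
        System.out.println(positions(EMULATED, s));  // [0, 3, 4, 7, 8, 11]
        System.out.println(positions(BUILT_IN, s));  // [0, 3, 4, 7, 8, 11]
        System.out.println(containsCat(s));             // true
        System.out.println(containsCat("concatenate")); // false
    }
}
```

For plain ASCII input the emulated and built-in boundaries agree. With non-ASCII text, Java's own \w and \b have not always used the same notion of a word character, so the two lists may differ there.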

So it's not a Saxon limitation, it's a limitation written into the XPath specification.

If you're not too concerned about standards conformance, Saxon allows you to put "!" at the end of the "flags" argument to indicate that your regular expression is a Java regular expression rather than an XPath regular expression, in which case \b is handled by the Java regex engine.
