Question
I'm trying to validate a restricted string using a regular expression ...
<xs:simpleType name="myStringType">
  <xs:restriction base="xs:string">
    <xs:pattern value="^urn:mystuff:v1:(ABC\.(?!Acme).\S+\.\S+\.a\d+\.v\d+|ABC\.Acme\.\S+\.a\d+\.\d+\.\d+)$"/>
  </xs:restriction>
</xs:simpleType>
As you can see, the regular expression I'm trying to use is:
^urn:mystuff:v1:(ABC\.(?!Acme).\S+\.\S+\.a\d+\.v\d+|ABC\.Acme\.\S+\.a\d+\.\d+\.\d+)$
I would like the following to validate:
urn:mystuff:v1:ABC.Test.MyData.a1.v1
urn:mystuff:v1:ABC.Acme.MyData.a1.0.1
But I would like the following to fail:
urn:mystuff:v1:ABC.Acme.MyData.a1.v1
This appears to work fine in an online regex tester but when I use Oxygen XML Editor I get the following error.
Pattern value '^urn:mystuff:v1:(ABC\.(?!Acme).\S+\.\S+\.a\d+\.v\d+|ABC\.Acme\.\S+\.a\d+\.\d+\.\d+)$' is not a valid regular expression. The reported error was: 'This expression is not supported in the current option setting.'.
This post suggests that lookaheads and lookbehinds are not supported in XSD regex, but that question relates to number patterns, so a brute-force approach is taken in the example. This is possible because there is only a very limited set of possibilities.
How does one deal with this when the value to be disallowed is a specific string?
Answer 1:
Addendum: Note that this solution plants a pseudo-assertion at a fixed location in the string.
For an example of an assertion that has to span the entire string, see the question "XML schema restriction pattern for not allowing specific string".
Edit: As pointed out in a comment, use (..) instead of (?:..) if that is the only supported construct. Changed!
The sequence (?!Acme)\S+\.
can be replaced with this longer alternation:
([^A]\S*|A([^c.]\S*)?|Ac([^m.]\S*)?|Acm([^e.]\S*)?)\.
which is bigger but should cover all cases, and makes the full regex:
urn:mystuff:v1:(ABC\.([^A]\S*|A([^c.]\S*)?|Ac([^m.]\S*)?|Acm([^e.]\S*)?)\.\S+\.a\d+\.v\d+|ABC\.Acme\.\S+\.a\d+\.\d+\.\d+)
https://regex101.com/r/qXv9HU/2
Expanded
urn:mystuff:v1:
(                             # (1 start)
    ABC \.
    (                         # (2 start)
        [^A] \S*
      | A
        ( [^c.] \S* )?        # (3)
      | Ac
        ( [^m.] \S* )?        # (4)
      | Acm
        ( [^e.] \S* )?        # (5)
    )                         # (2 end)
    \.
    \S+ \. a \d+ \. v \d+
  |
    ABC \. Acme \. \S+ \. a \d+ \. \d+ \. \d+
)                             # (1 end)
Answer 2:
XSD has its own definition of what it accepts in a regular expression, and it is rather more restrictive than many other regular expression dialects. I think the intention of the designers was to use a "common subset" of popular regex dialects so that it could be easily implemented on any platform. You are using constructs like (?! ... ) and (?: ... ) that aren't defined in this subset. So does the answer from @x15, unfortunately.
Telling you why your attempt isn't working is easy; finding an alternative that does work is harder. I would go for the easy option, which is to use an XSD 1.1 assertion like test="matches($value, XX) or matches($value, YY) and not(matches($value, ZZ))". A solution using pure XSD 1.0 might be possible, but I can't immediately see it.
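The assertion suggested above could be sketched roughly as follows. This is only an illustration under several assumptions: it needs an XSD 1.1 processor, the three regexes are guesses at what XX, YY and ZZ might be (derived from the sample URNs in the question), and the parentheses are added to make the grouping explicit, since "and" binds more tightly than "or" in XPath. Unlike xs:pattern, matches() is not implicitly anchored, so ^ and $ are written out here.

```xml
<!-- Sketch only: assumes the xs prefix is bound to the XML Schema namespace
     and that the validator supports XSD 1.1 assertions. -->
<xs:simpleType name="myStringType">
  <xs:restriction base="xs:string">
    <!-- XX: the Acme form (numeric version), always allowed.
         YY: the general "v"-versioned form.
         ZZ: anything under ABC.Acme., excluded from YY via not(). -->
    <xs:assertion test="matches($value, '^urn:mystuff:v1:ABC\.Acme\.\S+\.a\d+\.\d+\.\d+$')
                        or (matches($value, '^urn:mystuff:v1:ABC\.\S+\.\S+\.a\d+\.v\d+$')
                            and not(matches($value, '^urn:mystuff:v1:ABC\.Acme\.')))"/>
  </xs:restriction>
</xs:simpleType>
```

With these assumed expressions, urn:mystuff:v1:ABC.Test.MyData.a1.v1 and urn:mystuff:v1:ABC.Acme.MyData.a1.0.1 satisfy the test, while urn:mystuff:v1:ABC.Acme.MyData.a1.v1 does not, because it matches both YY and ZZ.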
Answer 3:
The simplest way would be to exploit this rule in the XML Schema specification:
If multiple <pattern> element information items appear as children of a <simpleType>, the values should be combined as if they appeared in a single regular expression as separate branches. Note: It is a consequence of the schema representation constraint Multiple patterns (§4.3.4.3) and of the rules for restriction that pattern facets specified on the same step in a type derivation are ORed together, while pattern facets specified on different steps of a type derivation are ANDed together.
Instead of trying to match both allowed patterns with a single regex, specify two separate pattern facets. That would also extend more naturally if a third or fourth URN pattern is required.
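A minimal sketch of what two separate facets might look like is shown below. One caveat: ORing facets only adds alternatives, so the first facet still has to exclude Acme on its own. This sketch assumes the lookahead-free alternation from Answer 1 is reused for that purpose, and it drops ^ and $ because XSD patterns are implicitly anchored to the whole value.

```xml
<!-- Sketch only: assumes the xs prefix is bound to the XML Schema namespace. -->
<xs:simpleType name="myStringType">
  <xs:restriction base="xs:string">
    <!-- Facet 1: "v"-versioned form whose second segment does not start with Acme. -->
    <xs:pattern value="urn:mystuff:v1:ABC\.([^A]\S*|A([^c.]\S*)?|Ac([^m.]\S*)?|Acm([^e.]\S*)?)\.\S+\.a\d+\.v\d+"/>
    <!-- Facet 2: the Acme form with a numeric version. -->
    <xs:pattern value="urn:mystuff:v1:ABC\.Acme\.\S+\.a\d+\.\d+\.\d+"/>
  </xs:restriction>
</xs:simpleType>
```

Because both facets sit on the same derivation step, a value is accepted if it matches either one, and a further URN shape could be added as one more xs:pattern line.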
Source: https://stackoverflow.com/questions/59336944/not-allowing-a-specific-string-in-an-xsd-regular-expression