I need a regex or a function in PHP that will validate that a string is a valid XML element name.
From w3schools:
XML elements must follow these
The expression below should match valid Unicode element names, except for the name xml itself. Names that start or end with xml are still allowed. This passes @toscho's äøñ test. The one thing I could not work out a regex for was extenders. The XML element name spec says:
[4] NameChar ::= Letter | Digit | '.' | '-' | '_' | ':' | CombiningChar | Extender
[5] Name ::= (Letter | '_' | ':') (NameChar)*
But there's no clear definition of a Unicode category or class that contains extenders.
^[\p{L}_:][\p{N}\p{L}\p{Mc}.\-|:]*((?
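For comparison, here is a minimal sketch of how productions [4] and [5] quoted above might be wrapped in a PHP function. It is not the expression above, only an approximation under some assumptions: the function name isValidElementName is made up for illustration, \p{L}, \p{Nd} and \p{M} stand in for the spec's Letter, Digit and CombiningChar classes (they are close but not identical), and the extenders are listed by code point from the spec's Extender production as I read it, so check that list against the spec before relying on it. Rejecting xml in any letter case is also my assumption.

```php
<?php
// Sketch only: approximates the XML 1.0 Name production with Unicode
// property classes. isValidElementName() is a hypothetical helper name.
function isValidElementName(string $name): bool
{
    // Reject the bare name "xml" (any letter case is an assumption here).
    // Names that merely start or end with "xml" are still allowed.
    if (strcasecmp($name, 'xml') === 0) {
        return false;
    }

    // Extender code points, written out because no single Unicode
    // category covers them (assumed to match the spec's Extender list).
    $extenders = '\x{00B7}\x{02D0}\x{02D1}\x{0387}\x{0640}\x{0E46}\x{0EC6}'
               . '\x{3005}\x{3031}-\x{3035}\x{309D}-\x{309E}\x{30FC}-\x{30FE}';

    // First character: Letter | '_' | ':'
    // Remaining characters: Letter | Digit | '.' | '-' | '_' | ':'
    //                       | CombiningChar | Extender
    // The /u flag makes PCRE treat pattern and subject as UTF-8;
    // \z anchors at the very end, so a trailing newline is rejected.
    $pattern = '/^[\p{L}_:][\p{L}\p{Nd}\p{M}._:\-' . $extenders . ']*\z/u';

    return preg_match($pattern, $name) === 1;
}
```

With this sketch, isValidElementName('äøñ') and isValidElementName('xmlfoo') should both return true, while 'xml', '1abc' and 'a b' should return false. Note that preg_match returns false for a subject that is not valid UTF-8 when /u is set, which the strict === 1 comparison treats as invalid.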