What are the (full) valid / allowed charset characters for CSS identifiers id and class?
Is there a regular expression tha
For anyone looking for something a little more turn-key, the full expression from @BalusC's answer, with every macro substituted in, is:
/-?([_a-z]|[\240-\377]|([0-9a-f]{1,6}(\r\n|[ \t\r\n\f])?|[^\r\n\f0-9a-f]))([_a-z0-9-]|[\240-\377]|([0-9a-f]{1,6}(\r\n|[ \t\r\n\f])?|[^\r\n\f0-9a-f]))*/
And using DEFINE, which I find a little more readable:
/(?(DEFINE)
    (?P<h>        [0-9a-f]                             )
    (?P<unicode>  (?&h){1,6}(\r\n|[ \t\r\n\f])?        )
    (?P<escape>   ((?&unicode)|[^\r\n\f0-9a-f])*       )
    (?P<nonascii> [\240-\377]                          )
    (?P<nmchar>   ([_a-z0-9-]|(?&nonascii)|(?&escape)) )
    (?P<nmstart>  ([_a-z]|(?&nonascii)|(?&escape))     )
    (?P<ident>    -?(?&nmstart)(?&nmchar)*             )
) (?:
    (?&ident)
)/x
Incidentally, the original regular expression (and @human's contribution) had a few rogue escape characters that allow [ in the name.
Also, note that the raw regex, without DEFINE, runs about 2x as fast as the DEFINE expression: it takes only ~23 steps to identify a single Unicode character, while the latter takes ~40.
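Python's built-in re module supports neither (?(DEFINE)...) nor (?&name) calls, so the readable layout of the DEFINE version can only be approximated there by composing named string fragments. The following is a minimal sketch rather than a definitive implementation. It assumes the escapes keep the leading backslash that the CSS 2.1 lexical macros give them (the expressions above omit it), so a bare [ is rejected while an escaped \[ is accepted. The names IDENT_RE and is_css_ident are hypothetical, chosen for illustration only.

```python
import re

# Fragments named after the CSS 2.1 lexical macros.
# Assumption: `unicode` and `escape` start with a literal backslash.
H = r"[0-9a-f]"
NONASCII = r"[\xa0-\xff]"  # same range as octal \240-\377
UNICODE = rf"\\{H}{{1,6}}(?:\r\n|[ \t\r\n\f])?"
ESCAPE = rf"(?:{UNICODE}|\\[^\r\n\f0-9a-f])"
NMSTART = rf"(?:[_a-z]|{NONASCII}|{ESCAPE})"
NMCHAR = rf"(?:[_a-z0-9-]|{NONASCII}|{ESCAPE})"
IDENT = rf"-?{NMSTART}{NMCHAR}*"

# Hypothetical names; IGNORECASE lets [a-z] and [0-9a-f] cover upper case too.
IDENT_RE = re.compile(IDENT, re.IGNORECASE)


def is_css_ident(name: str) -> bool:
    """Return True if the whole string matches the composed ident pattern."""
    return IDENT_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["my-class", "-foo", "1abc", "a[b", r"a\[b"]:
        print(repr(candidate), is_css_ident(candidate))
```

Using fullmatch rather than search matters here: the pattern itself is unanchored, so a plain search would succeed on any string that merely contains a valid identifier somewhere inside it.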